[Devel] https://bugs.openvz.org/browse/OVZ-6834 CUDA in container

Submitted by Andrey Ryabinin on Dec. 13, 2016, 2:57 p.m.

Details

Message ID 48d10f94-42f0-b499-99d6-1007a43bcb33@virtuozzo.com
State New
Series "https://bugs.openvz.org/browse/OVZ-6834 CUDA in container"

Commit Message

Andrey Ryabinin Dec. 13, 2016, 2:57 p.m.
On 12/12/2016 03:58 PM, Thomas Hoberg wrote:
> Hi Andrey,
> 
> I'm very sorry to contact you directly, but I've run out of options to help myself.
> 
> I am trying to get CUDA programs to run inside OpenVZ containers (they already run in Docker containers on the host) and my problem is that the NVIDIA runtime library looks at files in the /proc directory at startup, which are suppressed in OpenVZ containers.
> 
> You have implemented a fix to make /proc/modules visible (thank you!), but immediately afterwards the runtime wants to see the contents of '/proc/driver/nvidia/params', and potentially more files inside that directory.
> 
> I've tried to find and fix the visibility myself, but I can't find where you implemented the /proc/modules patch.
> 
> The public git repository doesn't yet contain the patch (for easy comparison) and while I've downloaded the patched kernel from your build factory (https://download.openvz.org/virtuozzo/factory/x86_64/os/Packages/v/) and looked through all kernel sources which I thought could possibly contain the patch, it has eluded me.
> 

You can find all patches in devel@openvz.org mailing list archives: https://lists.openvz.org/pipermail/devel/2016-November/069624.html

> So could you either include a patch to make /proc/driver visible or help me find the patch for /proc/modules so I can try myself?
> 

Access to proc directories works slightly differently. We show a directory in the container iff its sticky bit is set.
You can't set the sticky bit via chmod (that is forbidden for proc entries in the OpenVZ kernel, I don't know why),
but you can change the source like this:

diff --git a/fs/proc/root.c b/fs/proc/root.c
index 88be7c2..2a0bd71 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -185,7 +185,7 @@  void __init proc_root_init(void)
 	proc_mkdir_mode("sysvipc", S_ISVTX | S_IRUGO | S_IXUGO, NULL);
 #endif
 	proc_mkdir_mode("fs", S_ISVTX | S_IRUGO | S_IXUGO, NULL);
-	proc_mkdir("driver", NULL);
+	proc_mkdir_mode("driver", S_ISVTX, NULL);
 	/* somewhere for the nfsd filesystem to be mounted */
 	proc_mkdir_mode("fs/nfsd", S_ISVTX | S_IRUGO | S_IXUGO, NULL);
 #if defined(CONFIG_SUN_OPENPROMFS) || defined(CONFIG_SUN_OPENPROMFS_MODULE)

Comments

Andrey Ryabinin Dec. 13, 2016, 4:29 p.m.
On 12/13/2016 05:57 PM, Andrey Ryabinin wrote:

> Access to proc directories works slightly differently. We show a directory in the container iff its sticky bit is set.
> You can't set the sticky bit via chmod (that is forbidden for proc entries in the OpenVZ kernel, I don't know why),
> but you can change the source like this:
> 
> diff --git a/fs/proc/root.c b/fs/proc/root.c
> index 88be7c2..2a0bd71 100644
> --- a/fs/proc/root.c
> +++ b/fs/proc/root.c
> @@ -185,7 +185,7 @@ void __init proc_root_init(void)
>  	proc_mkdir_mode("sysvipc", S_ISVTX | S_IRUGO | S_IXUGO, NULL);
>  #endif
>  	proc_mkdir_mode("fs", S_ISVTX | S_IRUGO | S_IXUGO, NULL);
> -	proc_mkdir("driver", NULL);
> +	proc_mkdir_mode("driver", S_ISVTX, NULL);

Err, of course this should be: proc_mkdir_mode("driver", S_ISVTX | S_IRUGO | S_IXUGO, NULL);

Thomas Hoberg Dec. 14, 2016, 6:15 p.m.
Hi Andrey,

thanks a lot for your feedback!

Ok, now I'm beginning to understand what's going on here...

OpenVZ is "hijacking" the sticky bit on /proc dir entries to decide
whether each entry should be replicated into containers...

...while CGROUP, Docker, LXC etc. just plain copy everything (except the 
PID etc. stuff) and don't care.

I was looking for the code implementing the differentiation, perhaps a
table of some kind, implementing a specification of which files in
/proc could be considered "dangerous" for a container to see, perhaps
even doing the functional equivalent of a PID space translation
depending on the content of these files, and so on.

And I was starting to be afraid that perhaps the entire /proc file
hierarchy would be set up by Docker using some kind of /proc or CGROUP
API at spin-up (or the crazy CGROUP dirtree stuff Ubuntu does), but then
the PID and UID translation was top level etc...

And of course nothing like that could be found, because you use the 
sticky bit!

And it also kind of explains why I see /proc/driver as a file, not a directory.

Now I'm pretty confident I can hack it and see if that makes it work,
but of course I'd actually need it to work for the entire subtree below
/proc/driver, which would be all over.

And then we still have the issue of wanting to control what containers 
should "see".... or not in such a way that it can become part of the 
mainline OpenVZ kernel...

But first the real-world check, to see if I can finally get CUDA to fly 
in an OpenVZ container.

Thank you soo much!

Kind regards, Thomas

Thomas Hoberg Dec. 14, 2016, 6:29 p.m.
Just one correction: Please forget about /proc/driver appearing as a
file, a mistake I keep repeating, sorry!
