[Devel] ve: allow to do anything from init_user_ns

Submitted by Andrei Vagin on July 24, 2017, 8:55 p.m.

Details

Message ID 1500929755-18946-1-git-send-email-avagin@openvz.org
State New
Series "ve: allow to do anything from init_user_ns"
Headers show

Commit Message

Andrei Vagin July 24, 2017, 8:55 p.m.
From: Andrei Vagin <avagin@virtuozzo.com>

current_user_ns_initial() is used to restrict operations,
which are allowed in a ve initial userns, but aren't allowed
in sub-user-namespaces. But now this function doesn't take
into account init_user_ns. init user_ns is a root item in the
hierarchy of user namespaces, so it is actually initiall
for ve-s. The upstream kernel allow to do anything from
init_user_ns, and we don't want to change this behaviour.

https://jira.sw.ru/browse/PSBM-58574

Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
---
 kernel/ve/ve.c | 3 +++
 1 file changed, 3 insertions(+)

Patch hide | download patch | download mbox

diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
index 0533d79..e95b3f3 100644
--- a/kernel/ve/ve.c
+++ b/kernel/ve/ve.c
@@ -241,6 +241,9 @@  bool current_user_ns_initial(void)
 	struct ve_struct *ve = get_exec_env();
 	bool ret = false;
 
+	if (current_user_ns() == &init_user_ns)
+		return true;
+
 	rcu_read_lock();
 	if (ve->ve_ns && ve->init_cred->user_ns == current_user_ns())
 		ret = true;

Comments

Andrey Vagin July 24, 2017, 8:59 p.m.
On Mon, Jul 24, 2017 at 11:55:55PM +0300, Andrei Vagin wrote:
> From: Andrei Vagin <avagin@virtuozzo.com>
> 
> current_user_ns_initial() is used to restrict operations,
> which are allowed in a ve initial userns, but aren't allowed
> in sub-user-namespaces. But now this function doesn't take
> into account init_user_ns. init user_ns is a root item in the
> hierarchy of user namespaces, so it is actually initiall
> for ve-s. The upstream kernel allow to do anything from
> init_user_ns, and we don't want to change this behaviour.
> 
> https://jira.sw.ru/browse/PSBM-58574

The previous link is wrong, here is a right one:
https://jira.sw.ru/browse/PSBM-68157

> 
> Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
> Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
> ---
>  kernel/ve/ve.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
> index 0533d79..e95b3f3 100644
> --- a/kernel/ve/ve.c
> +++ b/kernel/ve/ve.c
> @@ -241,6 +241,9 @@ bool current_user_ns_initial(void)
>  	struct ve_struct *ve = get_exec_env();
>  	bool ret = false;
>  
> +	if (current_user_ns() == &init_user_ns)
> +		return true;
> +
>  	rcu_read_lock();
>  	if (ve->ve_ns && ve->init_cred->user_ns == current_user_ns())
>  		ret = true;
> -- 
> 1.8.3.1
>
Stanislav Kinsburskiy July 25, 2017, 6:39 a.m.
I like it.

Acked-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>


24.07.2017 23:55, Andrei Vagin пишет:
> From: Andrei Vagin <avagin@virtuozzo.com>
> 
> current_user_ns_initial() is used to restrict operations,
> which are allowed in a ve initial userns, but aren't allowed
> in sub-user-namespaces. But now this function doesn't take
> into account init_user_ns. init user_ns is a root item in the
> hierarchy of user namespaces, so it is actually initiall
> for ve-s. The upstream kernel allow to do anything from
> init_user_ns, and we don't want to change this behaviour.
> 
> https://jira.sw.ru/browse/PSBM-58574
> 
> Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
> Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
> ---
>  kernel/ve/ve.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
> index 0533d79..e95b3f3 100644
> --- a/kernel/ve/ve.c
> +++ b/kernel/ve/ve.c
> @@ -241,6 +241,9 @@ bool current_user_ns_initial(void)
>  	struct ve_struct *ve = get_exec_env();
>  	bool ret = false;
>  
> +	if (current_user_ns() == &init_user_ns)
> +		return true;
> +
>  	rcu_read_lock();
>  	if (ve->ve_ns && ve->init_cred->user_ns == current_user_ns())
>  		ret = true;
>
Konstantin Khorenko July 25, 2017, 11:46 a.m.
The only possible problem i can see here in the future:
imagine we implement n:m mapping for user namespaces of vz Containers,
after that each fs superblock will contain a link to user_ns in which it was created
(in order to get user_ns id mapping).

Thus in case someone enters from host to Container mount namespace and mount something,
most probably it won't be accessible inside a Container due to different uid/gid mappings for file owners.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 07/24/2017 11:55 PM, Andrei Vagin wrote:
> From: Andrei Vagin <avagin@virtuozzo.com>
>
> current_user_ns_initial() is used to restrict operations,
> which are allowed in a ve initial userns, but aren't allowed
> in sub-user-namespaces. But now this function doesn't take
> into account init_user_ns. init user_ns is a root item in the
> hierarchy of user namespaces, so it is actually initiall
> for ve-s. The upstream kernel allow to do anything from
> init_user_ns, and we don't want to change this behaviour.
>
> https://jira.sw.ru/browse/PSBM-58574
>
> Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
> Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
> ---
>  kernel/ve/ve.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
> index 0533d79..e95b3f3 100644
> --- a/kernel/ve/ve.c
> +++ b/kernel/ve/ve.c
> @@ -241,6 +241,9 @@ bool current_user_ns_initial(void)
>  	struct ve_struct *ve = get_exec_env();
>  	bool ret = false;
>
> +	if (current_user_ns() == &init_user_ns)
> +		return true;
> +
>  	rcu_read_lock();
>  	if (ve->ve_ns && ve->init_cred->user_ns == current_user_ns())
>  		ret = true;
>
Stanislav Kinsburskiy July 25, 2017, 12:20 p.m.
25.07.2017 14:46, Konstantin Khorenko пишет:
> The only possible problem i can see here in the future:
> imagine we implement n:m mapping for user namespaces of vz Containers,
> after that each fs superblock will contain a link to user_ns in which it was created
> (in order to get user_ns id mapping).
> 
> Thus in case someone enters from host to Container mount namespace and mount something,
> most probably it won't be accessible inside a Container due to different uid/gid mappings for file owners.
> 

Whom it might be?

> -- 
> Best regards,
> 
> Konstantin Khorenko,
> Virtuozzo Linux Kernel Team
> 
> On 07/24/2017 11:55 PM, Andrei Vagin wrote:
>> From: Andrei Vagin <avagin@virtuozzo.com>
>>
>> current_user_ns_initial() is used to restrict operations,
>> which are allowed in a ve initial userns, but aren't allowed
>> in sub-user-namespaces. But now this function doesn't take
>> into account init_user_ns. init user_ns is a root item in the
>> hierarchy of user namespaces, so it is actually initiall
>> for ve-s. The upstream kernel allow to do anything from
>> init_user_ns, and we don't want to change this behaviour.
>>
>> https://jira.sw.ru/browse/PSBM-58574
>>
>> Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
>> Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
>> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
>> ---
>>  kernel/ve/ve.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
>> index 0533d79..e95b3f3 100644
>> --- a/kernel/ve/ve.c
>> +++ b/kernel/ve/ve.c
>> @@ -241,6 +241,9 @@ bool current_user_ns_initial(void)
>>      struct ve_struct *ve = get_exec_env();
>>      bool ret = false;
>>
>> +    if (current_user_ns() == &init_user_ns)
>> +        return true;
>> +
>>      rcu_read_lock();
>>      if (ve->ve_ns && ve->init_cred->user_ns == current_user_ns())
>>          ret = true;
>>
Konstantin Khorenko July 25, 2017, 12:23 p.m.
On 07/25/2017 03:20 PM, Stanislav Kinsburskiy wrote:
>
>
> 25.07.2017 14:46, Konstantin Khorenko пишет:
>> The only possible problem i can see here in the future:
>> imagine we implement n:m mapping for user namespaces of vz Containers,
>> after that each fs superblock will contain a link to user_ns in which it was created
>> (in order to get user_ns id mapping).
>>
>> Thus in case someone enters from host to Container mount namespace and mount something,
>> most probably it won't be accessible inside a Container due to different uid/gid mappings for file owners.
>>
>
> Whom it might be?

Administrator, support. A Human.

>
>> --
>> Best regards,
>>
>> Konstantin Khorenko,
>> Virtuozzo Linux Kernel Team
>>
>> On 07/24/2017 11:55 PM, Andrei Vagin wrote:
>>> From: Andrei Vagin <avagin@virtuozzo.com>
>>>
>>> current_user_ns_initial() is used to restrict operations,
>>> which are allowed in a ve initial userns, but aren't allowed
>>> in sub-user-namespaces. But now this function doesn't take
>>> into account init_user_ns. init user_ns is a root item in the
>>> hierarchy of user namespaces, so it is actually initiall
>>> for ve-s. The upstream kernel allow to do anything from
>>> init_user_ns, and we don't want to change this behaviour.
>>>
>>> https://jira.sw.ru/browse/PSBM-58574
>>>
>>> Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
>>> Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
>>> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
>>> ---
>>>  kernel/ve/ve.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
>>> index 0533d79..e95b3f3 100644
>>> --- a/kernel/ve/ve.c
>>> +++ b/kernel/ve/ve.c
>>> @@ -241,6 +241,9 @@ bool current_user_ns_initial(void)
>>>      struct ve_struct *ve = get_exec_env();
>>>      bool ret = false;
>>>
>>> +    if (current_user_ns() == &init_user_ns)
>>> +        return true;
>>> +
>>>      rcu_read_lock();
>>>      if (ve->ve_ns && ve->init_cred->user_ns == current_user_ns())
>>>          ret = true;
>>>
> .
>
Stanislav Kinsburskiy July 25, 2017, 12:28 p.m.
25.07.2017 15:23, Konstantin Khorenko пишет:
> On 07/25/2017 03:20 PM, Stanislav Kinsburskiy wrote:
>>
>>
>> 25.07.2017 14:46, Konstantin Khorenko пишет:
>>> The only possible problem i can see here in the future:
>>> imagine we implement n:m mapping for user namespaces of vz Containers,
>>> after that each fs superblock will contain a link to user_ns in which it was created
>>> (in order to get user_ns id mapping).
>>>
>>> Thus in case someone enters from host to Container mount namespace and mount something,
>>> most probably it won't be accessible inside a Container due to different uid/gid mappings for file owners.
>>>
>>
>> Whom it might be?
> 
> Administrator, support. A Human.
> 

Ah, ok. Then who cares. Simply "Don't do it".

>>
>>> -- 
>>> Best regards,
>>>
>>> Konstantin Khorenko,
>>> Virtuozzo Linux Kernel Team
>>> 
>>> On 07/24/2017 11:55 PM, Andrei Vagin wrote:
>>>> From: Andrei Vagin <avagin@virtuozzo.com>
>>>>
>>>> current_user_ns_initial() is used to restrict operations,
>>>> which are allowed in a ve initial userns, but aren't allowed
>>>> in sub-user-namespaces. But now this function doesn't take
>>>> into account init_user_ns. init user_ns is a root item in the
>>>> hierarchy of user namespaces, so it is actually initiall
>>>> for ve-s. The upstream kernel allow to do anything from
>>>> init_user_ns, and we don't want to change this behaviour.
>>>>
>>>> https://jira.sw.ru/browse/PSBM-58574
>>>>
>>>> Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
>>>> Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
>>>> Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
>>>> ---
>>>>  kernel/ve/ve.c | 3 +++
>>>>  1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
>>>> index 0533d79..e95b3f3 100644
>>>> --- a/kernel/ve/ve.c
>>>> +++ b/kernel/ve/ve.c
>>>> @@ -241,6 +241,9 @@ bool current_user_ns_initial(void)
>>>>      struct ve_struct *ve = get_exec_env();
>>>>      bool ret = false;
>>>>
>>>> +    if (current_user_ns() == &init_user_ns)
>>>> +        return true;
>>>> +
>>>>      rcu_read_lock();
>>>>      if (ve->ve_ns && ve->init_cred->user_ns == current_user_ns())
>>>>          ret = true;
>>>>
>> .
>>
Andrey Vagin July 25, 2017, 4:58 p.m.
On Tue, Jul 25, 2017 at 02:46:31PM +0300, Konstantin Khorenko wrote:
> The only possible problem i can see here in the future:
> imagine we implement n:m mapping for user namespaces of vz Containers,
> after that each fs superblock will contain a link to user_ns in which it was created
> (in order to get user_ns id mapping).
> 
> Thus in case someone enters from host to Container mount namespace and mount something,
> most probably it won't be accessible inside a Container due to different uid/gid mappings for file owners.

Maybe you have to get userns from a mount namespace. Each mount
namespace has a link to a userns where it was created.

> 
> --
> Best regards,
> 
> Konstantin Khorenko,
> Virtuozzo Linux Kernel Team
> 
> On 07/24/2017 11:55 PM, Andrei Vagin wrote:
> > From: Andrei Vagin <avagin@virtuozzo.com>
> > 
> > current_user_ns_initial() is used to restrict operations,
> > which are allowed in a ve initial userns, but aren't allowed
> > in sub-user-namespaces. But now this function doesn't take
> > into account init_user_ns. init user_ns is a root item in the
> > hierarchy of user namespaces, so it is actually initiall
> > for ve-s. The upstream kernel allow to do anything from
> > init_user_ns, and we don't want to change this behaviour.
> > 
> > https://jira.sw.ru/browse/PSBM-58574
> > 
> > Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
> > Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
> > Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
> > ---
> >  kernel/ve/ve.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
> > index 0533d79..e95b3f3 100644
> > --- a/kernel/ve/ve.c
> > +++ b/kernel/ve/ve.c
> > @@ -241,6 +241,9 @@ bool current_user_ns_initial(void)
> >  	struct ve_struct *ve = get_exec_env();
> >  	bool ret = false;
> > 
> > +	if (current_user_ns() == &init_user_ns)
> > +		return true;
> > +
> >  	rcu_read_lock();
> >  	if (ve->ve_ns && ve->init_cred->user_ns == current_user_ns())
> >  		ret = true;
> >