ioctl_tty.2: add TIOCGPTPEER documentation

Submitted by Aleksa Sarai on June 9, 2017, 5:01 p.m.

Details

Message ID 20170609170147.32311-1-asarai@suse.de
State New
Series "ioctl_tty.2: add TIOCGPTPEER documentation"
Headers show

Commit Message

Aleksa Sarai June 9, 2017, 5:01 p.m.
The feature this patch references has currently only been accepted into
tty-testing, but Greg told me to kick this down to man-pages. As a
result, I can't reference upstream commit id's because the code isn't in
Linus' tree yet -- should I resend this once it lands in tty-next or
Linus' tree?

Also obviously the release version is a bit of a lie.

8<-----------------------------------------------------------------------

This is an ioctl(2) recently added by myself, to allow for container
runtimes and other programs that interact with (potentially hostile)
Linux namespaces to safely create {master,slave} pseudoterminal pairs
without needing to open potentially unsafe /dev/pts/... filenames that
may be malicious mountpoints or similar in an untrusted namespace
(avoiding the endless issues with ptsname(3) and similar approaches).

Cc: <containers@lists.linux-foundation.org>
Signed-off-by: Aleksa Sarai <asarai@suse.de>
---
 man2/ioctl_tty.2 | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Patch hide | download patch | download mbox

diff --git a/man2/ioctl_tty.2 b/man2/ioctl_tty.2
index d280beacf..61e147d99 100644
--- a/man2/ioctl_tty.2
+++ b/man2/ioctl_tty.2
@@ -380,6 +380,21 @@  Place the current lock state of the pseudoterminal slave device
 in the location pointed to by
 .IR argp
 (since Linux 3.8).
+.TP
+.BI "TIOCGPTPEER	int " flags
+Opens and returns a new file handle to the pseudoterminal slave
+device with the given
+.BR open (2)-style
+.IR flags ,
+regardless of whether the path is accessible through the calling process's
+mount namespaces.
+
+Security-conscious programs interacting with namespaces may wish to use this
+over
+.BR open (2)
+with the path provided by
+.BR ptsname (3),
+and similar library methods that have insecure APIs (since Linux 4.13).
 .PP
 The BSD ioctls
 .BR TIOCSTOP ,

Comments

Greg Kroah-Hartman June 9, 2017, 6:10 p.m.
On Sat, Jun 10, 2017 at 03:01:47AM +1000, Aleksa Sarai wrote:
> The feature this patch references has currently only been accepted into
> tty-testing, but Greg told me to kick this down to man-pages. As a
> result, I can't reference upstream commit id's because the code isn't in
> Linus' tree yet -- should I resend this once it lands in tty-next or
> Linus' tree?

It's now merged to tty-next, and will show up in the next linux-next
release, and is planned to be merged into 4.13-rc1 when that happens.

thanks,

greg k-h
Michael Kerrisk (man-pages) Aug. 15, 2017, 7:27 p.m.
On 06/09/2017 07:01 PM, Aleksa Sarai wrote:
> The feature this patch references has currently only been accepted into
> tty-testing, but Greg told me to kick this down to man-pages. As a
> result, I can't reference upstream commit id's because the code isn't in
> Linus' tree yet -- should I resend this once it lands in tty-next or
> Linus' tree?
> 
> Also obviously the release version is a bit of a lie.

Hello Aleksa,

I've applied this patch, and then tweaked the wording a little. Could
you please check the following text:

       TIOCGPTPEER    int flags
              (since Linux 4.13) Given  a  file  descriptor  in  fd  that
              refers  to  a  pseudoterminal  master, open (with the given
              open(2)-style flags) and return a new file descriptor  that
              refers to the peer pseudoterminal slave device.  This oper‐
              ation can be performed regardless of whether  the  pathname
              of  the  slave  device  is  accessible  through the calling
              process's mount namespaces.

              Security-conscious programs interacting with namespaces may
              wish  to  use  this  operation rather than open(2) with the
              pathname returned by ptsname(3), and similar library  func‐
              tions that have insecure APIs.

I also have a question on the last sentence: what are the "similar library
functions that have insecure APIs"? It's not clear to me what you are 
referring to here.

Cheers,

Michael

> 
> 8<-----------------------------------------------------------------------
> 
> This is an ioctl(2) recently added by myself, to allow for container
> runtimes and other programs that interact with (potentially hostile)
> Linux namespaces to safely create {master,slave} pseudoterminal pairs
> without needing to open potentially unsafe /dev/pts/... filenames that
> may be malicious mountpoints or similar in an untrusted namespace
> (avoiding the endless issues with ptsname(3) and similar approaches).
> 
> Cc: <containers@lists.linux-foundation.org>
> Signed-off-by: Aleksa Sarai <asarai@suse.de>
> ---
>  man2/ioctl_tty.2 | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/man2/ioctl_tty.2 b/man2/ioctl_tty.2
> index d280beacf..61e147d99 100644
> --- a/man2/ioctl_tty.2
> +++ b/man2/ioctl_tty.2
> @@ -380,6 +380,21 @@ Place the current lock state of the pseudoterminal slave device
>  in the location pointed to by
>  .IR argp
>  (since Linux 3.8).
> +.TP
> +.BI "TIOCGPTPEER	int " flags
> +Opens and returns a new file handle to the pseudoterminal slave
> +device with the given
> +.BR open (2)-style
> +.IR flags ,
> +regardless of whether the path is accessible through the calling process's
> +mount namespaces.
> +
> +Security-conscious programs interacting with namespaces may wish to use this
> +over
> +.BR open (2)
> +with the path provided by
> +.BR ptsname (3),
> +and similar library methods that have insecure APIs (since Linux 4.13).
>  .PP
>  The BSD ioctls
>  .BR TIOCSTOP ,
>
Aleksa Sarai Aug. 16, 2017, 4:43 a.m.
> I've applied this patch, and then tweaked the wording a little. Could
> you please check the following text:
> 
>         TIOCGPTPEER    int flags
>                (since Linux 4.13) Given  a  file  descriptor  in  fd  that
>                refers  to  a  pseudoterminal  master, open (with the given
>                open(2)-style flags) and return a new file descriptor  that
>                refers to the peer pseudoterminal slave device.  This oper‐
>                ation can be performed regardless of whether  the  pathname
>                of  the  slave  device  is  accessible  through the calling
>                process's mount namespaces.
> 
>                Security-conscious programs interacting with namespaces may
>                wish  to  use  this  operation rather than open(2) with the
>                pathname returned by ptsname(3), and similar library  func‐
>                tions that have insecure APIs.

Yup, that sounds good.

> I also have a question on the last sentence: what are the "similar library
> functions that have insecure APIs"? It's not clear to me what you are
> referring to here.

There are a few posix_-style functions provided by glibc that are just 
wrappers around the open+ptsname combo that I mention earlier in the 
sentence (and thus are vulnerable to the same issue). But if you feel 
it's confusing you can feel free to drop it.

Thanks.
Eric W. Biederman Aug. 16, 2017, 4:43 p.m.
"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> On 06/09/2017 07:01 PM, Aleksa Sarai wrote:
>> The feature this patch references has currently only been accepted into
>> tty-testing, but Greg told me to kick this down to man-pages. As a
>> result, I can't reference upstream commit id's because the code isn't in
>> Linus' tree yet -- should I resend this once it lands in tty-next or
>> Linus' tree?
>> 
>> Also obviously the release version is a bit of a lie.
>
> Hello Aleksa,
>
> I've applied this patch, and then tweaked the wording a little. Could
> you please check the following text:
>
>        TIOCGPTPEER    int flags
>               (since Linux 4.13) Given  a  file  descriptor  in  fd  that
>               refers  to  a  pseudoterminal  master, open (with the given
>               open(2)-style flags) and return a new file descriptor  that
>               refers to the peer pseudoterminal slave device.  This oper‐
>               ation can be performed regardless of whether  the  pathname
>               of  the  slave  device  is  accessible  through the calling
>               process's mount namespaces.
>
>               Security-conscious programs interacting with namespaces may
>               wish  to  use  this  operation rather than open(2) with the
>               pathname returned by ptsname(3), and similar library  func‐
>               tions that have insecure APIs.
>
> I also have a question on the last sentence: what are the "similar library
> functions that have insecure APIs"? It's not clear to me what you are 
> referring to here.

A couple of things to note on the bigger picture.

The glibc library on all distributions has been changed to not have a
setuid binary pt_chown, that uses ptsname.  This was the primary fix
for the security issue.

The behavior of opening /dev/ptmx has been changed to perform a path
lookup relative to the location of /dev/ptmx of  ./pts/ptmx and open
it it is a devpts filesystem and to fail otherwise.    This further
makes it hard to confuse userspace this way as /dev/ptmx always
corresponds to /dev/pts/ptmx.  Even in chroots and in other mount
namespaces.

Both of these changes largely makes glibc's use of these features
secure.  /dev/ptmx always corresponds to /dev/pts and there no readily
available suid root applications too fool.

That makes TIOCGPTPEER a very nice addition, but not something people
have to scramble to use to ensure their system is secure.  As a hostile
environment now has to work very hard to confuse the existing mechanisms.

>> This is an ioctl(2) recently added by myself, to allow for container
>> runtimes and other programs that interact with (potentially hostile)
>> Linux namespaces to safely create {master,slave} pseudoterminal pairs
>> without needing to open potentially unsafe /dev/pts/... filenames that
>> may be malicious mountpoints or similar in an untrusted namespace
>> (avoiding the endless issues with ptsname(3) and similar approaches).
>> 
>> Cc: <containers@lists.linux-foundation.org>
>> Signed-off-by: Aleksa Sarai <asarai@suse.de>

Eric
Aleksa Sarai Aug. 16, 2017, 4:54 p.m.
> A couple of things to note on the bigger picture.
> 
> The glibc library on all distributions has been changed to not have a
> setuid binary pt_chown, that uses ptsname.  This was the primary fix
> for the security issue.
> 
> The behavior of opening /dev/ptmx has been changed to perform a path
> lookup relative to the location of /dev/ptmx of  ./pts/ptmx and open
> it it is a devpts filesystem and to fail otherwise.    This further
> makes it hard to confuse userspace this way as /dev/ptmx always
> corresponds to /dev/pts/ptmx.  Even in chroots and in other mount
> namespaces.

I have a feeling that there might be a way to trick glibc if you use 
FUSE, but I haven't actually tried to create a PoC for it. Fair point 
though.

> That makes TIOCGPTPEER a very nice addition, but not something people
> have to scramble to use to ensure their system is secure.  As a hostile
> environment now has to work very hard to confuse the existing mechanisms.

There are usecases where you simply need TIOCGPTPEER, and no other 
userspace alternative will do, but maybe if we modified the paragraph to 
read (as suggested):

   Security-conscious programs interacting with namespaces may
   wish  to  use  this  operation rather than open(2) with the
   pathname returned by ptsname(3).

This would clarify that there are usecases where you need this 
particular feature, without saying causing people to panic over 
inaccurate claims of glibc being broken. Does that sound better?
Eric W. Biederman Aug. 16, 2017, 5:14 p.m.
Aleksa Sarai <asarai@suse.de> writes:

>> A couple of things to note on the bigger picture.
>>
>> The glibc library on all distributions has been changed to not have a
>> setuid binary pt_chown, that uses ptsname.  This was the primary fix
>> for the security issue.
>>
>> The behavior of opening /dev/ptmx has been changed to perform a path
>> lookup relative to the location of /dev/ptmx of  ./pts/ptmx and open
>> it it is a devpts filesystem and to fail otherwise.    This further
>> makes it hard to confuse userspace this way as /dev/ptmx always
>> corresponds to /dev/pts/ptmx.  Even in chroots and in other mount
>> namespaces.
>
> I have a feeling that there might be a way to trick glibc if you use
> FUSE, but I haven't actually tried to create a PoC for it. Fair point
> though.

To trick glibc fuse would have to be mounted somewhere on /dev.

>> That makes TIOCGPTPEER a very nice addition, but not something people
>> have to scramble to use to ensure their system is secure.  As a hostile
>> environment now has to work very hard to confuse the existing mechanisms.
>
> There are usecases where you simply need TIOCGPTPEER, and no other
> userspace alternative will do, but maybe if we modified the paragraph
> to read (as suggested):
>
>   Security-conscious programs interacting with namespaces may
>   wish  to  use  this  operation rather than open(2) with the
>   pathname returned by ptsname(3).
>
> This would clarify that there are usecases where you need this
> particular feature, without saying causing people to panic over
> inaccurate claims of glibc being broken. Does that sound better?

I think your original words sounded fine.  I would even go for new
programs may want to use the new ioctl as it fundamentally less racy
and more of what is actually trying to be implemented with the userspace
pieces.

I just wanted to point out that TIOCGPTPEER while being the interface
that it would have been nice had we had since the beginning (and would
have avoided all of the problems) is actually not something we need to
scramble and use it is just a very nice to have.  As the immediate
issues have been fixed in other ways.    It was not clear to me from the
other discussions if you and Michael Kerrisk were aware of the
mitigations that had been made to address the security issue.

The change to the behavior of /dev/ptmx may need to be documented
somewhere.   I am not certain if anything has been documented since
devpts has started allowing multiple mounts.

Eric
Michael Kerrisk (man-pages) Nov. 20, 2017, 10:20 a.m.
On 08/16/2017 07:14 PM, Eric W. Biederman wrote:
> Aleksa Sarai <asarai@suse.de> writes:
> 
>>> A couple of things to note on the bigger picture.
>>>
>>> The glibc library on all distributions has been changed to not have a
>>> setuid binary pt_chown, that uses ptsname.  This was the primary fix
>>> for the security issue.
>>>
>>> The behavior of opening /dev/ptmx has been changed to perform a path
>>> lookup relative to the location of /dev/ptmx of  ./pts/ptmx and open
>>> it it is a devpts filesystem and to fail otherwise.    This further
>>> makes it hard to confuse userspace this way as /dev/ptmx always
>>> corresponds to /dev/pts/ptmx.  Even in chroots and in other mount
>>> namespaces.
>>
>> I have a feeling that there might be a way to trick glibc if you use
>> FUSE, but I haven't actually tried to create a PoC for it. Fair point
>> though.
> 
> To trick glibc fuse would have to be mounted somewhere on /dev.
> 
>>> That makes TIOCGPTPEER a very nice addition, but not something people
>>> have to scramble to use to ensure their system is secure.  As a hostile
>>> environment now has to work very hard to confuse the existing mechanisms.
>>
>> There are usecases where you simply need TIOCGPTPEER, and no other
>> userspace alternative will do, but maybe if we modified the paragraph
>> to read (as suggested):
>>
>>   Security-conscious programs interacting with namespaces may
>>   wish  to  use  this  operation rather than open(2) with the
>>   pathname returned by ptsname(3).
>>
>> This would clarify that there are usecases where you need this
>> particular feature, without saying causing people to panic over
>> inaccurate claims of glibc being broken. Does that sound better?
> 
> I think your original words sounded fine.  I would even go for new
> programs may want to use the new ioctl as it fundamentally less racy
> and more of what is actually trying to be implemented with the userspace
> pieces.
> 
> I just wanted to point out that TIOCGPTPEER while being the interface
> that it would have been nice had we had since the beginning (and would
> have avoided all of the problems) is actually not something we need to
> scramble and use it is just a very nice to have.  As the immediate
> issues have been fixed in other ways.    It was not clear to me from the
> other discussions if you and Michael Kerrisk were aware of the
> mitigations that had been made to address the security issue.

So, my takeaway is that we leave this text:

              Security-conscious programs interacting with namespaces may
              wish  to  use  this  operation rather than open(2) with the
              pathname returned by ptsname(3), and similar library  func‐
              tions that have insecure APIs.  (For example, confusion can
              occur in some cases using ptsname(3) with a pathname  where
              a  devpts  filesystem has been mounted in a different mount
              namespace.)

as is?

> The change to the behavior of /dev/ptmx may need to be documented
> somewhere.   I am not certain if anything has been documented since
> devpts has started allowing multiple mounts.

Eric can you say more about this. When did this change occur?
What is the model: mount devpts once in each mount namespace?

Cheers,

Michael
Christian Brauner Nov. 20, 2017, 12:15 p.m.
On Mon, Nov 20, 2017 at 11:20:13AM +0100, Michael Kerrisk (man-pages) wrote:
> On 08/16/2017 07:14 PM, Eric W. Biederman wrote:
> > Aleksa Sarai <asarai@suse.de> writes:
> > 
> >>> A couple of things to note on the bigger picture.
> >>>
> >>> The glibc library on all distributions has been changed to not have a
> >>> setuid binary pt_chown, that uses ptsname.  This was the primary fix
> >>> for the security issue.
> >>>
> >>> The behavior of opening /dev/ptmx has been changed to perform a path
> >>> lookup relative to the location of /dev/ptmx of  ./pts/ptmx and open
> >>> it it is a devpts filesystem and to fail otherwise.    This further
> >>> makes it hard to confuse userspace this way as /dev/ptmx always
> >>> corresponds to /dev/pts/ptmx.  Even in chroots and in other mount
> >>> namespaces.
> >>
> >> I have a feeling that there might be a way to trick glibc if you use
> >> FUSE, but I haven't actually tried to create a PoC for it. Fair point
> >> though.
> > 
> > To trick glibc fuse would have to be mounted somewhere on /dev.
> > 
> >>> That makes TIOCGPTPEER a very nice addition, but not something people
> >>> have to scramble to use to ensure their system is secure.  As a hostile
> >>> environment now has to work very hard to confuse the existing mechanisms.
> >>
> >> There are usecases where you simply need TIOCGPTPEER, and no other
> >> userspace alternative will do, but maybe if we modified the paragraph
> >> to read (as suggested):
> >>
> >>   Security-conscious programs interacting with namespaces may
> >>   wish  to  use  this  operation rather than open(2) with the
> >>   pathname returned by ptsname(3).
> >>
> >> This would clarify that there are usecases where you need this
> >> particular feature, without saying causing people to panic over
> >> inaccurate claims of glibc being broken. Does that sound better?
> > 
> > I think your original words sounded fine.  I would even go for new
> > programs may want to use the new ioctl as it fundamentally less racy
> > and more of what is actually trying to be implemented with the userspace
> > pieces.
> > 
> > I just wanted to point out that TIOCGPTPEER while being the interface
> > that it would have been nice had we had since the beginning (and would
> > have avoided all of the problems) is actually not something we need to
> > scramble and use it is just a very nice to have.  As the immediate
> > issues have been fixed in other ways.    It was not clear to me from the
> > other discussions if you and Michael Kerrisk were aware of the
> > mitigations that had been made to address the security issue.
> 
> So, my takeaway is that we leave this text:
> 
>               Security-conscious programs interacting with namespaces may
>               wish  to  use  this  operation rather than open(2) with the
>               pathname returned by ptsname(3), and similar library  func‐
>               tions that have insecure APIs.  (For example, confusion can
>               occur in some cases using ptsname(3) with a pathname  where
>               a  devpts  filesystem has been mounted in a different mount
>               namespace.)

This sounds fine to me. I probably should point out that I wrote a patch for
glibc to use TIOCGPTPEER based slave fd allocation when supported and only use
the insecure way as a fallback. I've pushed this to glibc master on 8 October.
That means it is still in our current glibc development cycle and will thus be
available in glibc 2.27. The relevant commit hash is
645ac9aaf89e3311949828546df6334322f48933 and the link to the diff
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=645ac9aaf89e3311949828546df6334322f48933;hp=98e0742024d4c13c08a6076b3d119c250e7d0118

At least for us at glibc this should mean that *most* insecure
path-based-operations have been eliminated since other path-based bits in these
codepaths are basically no-ops by now. That being said glibc needs to be
backward compatible and so will (see comment above) execute path-based fallback
code if the new ioctl() fails. So if you really^2 care about using TIOCGPTPEER
you should do the ioctl() dance yourself.

(Fwiw, I'm not directly involved with Musl I've written a patch for them as well
but Rich had some reservations and I haven't been able to pick that patch back
up yet.)

Christian

> 
> as is?
> 
> > The change to the behavior of /dev/ptmx may need to be documented
> > somewhere.   I am not certain if anything has been documented since
> > devpts has started allowing multiple mounts.
> 
> Eric can you say more about this. When did this change occur?
> What is the model: mount devpts once in each mount namespace?
> 
> Cheers,
> 
> Michael
> 
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
> _______________________________________________
> Containers mailing list
> Containers@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/containers
Eric W. Biederman Nov. 20, 2017, 5:06 p.m.
"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:

> On 08/16/2017 07:14 PM, Eric W. Biederman wrote:
>> Aleksa Sarai <asarai@suse.de> writes:
>> 
>>>> A couple of things to note on the bigger picture.
>>>>
>>>> The glibc library on all distributions has been changed to not have a
>>>> setuid binary pt_chown, that uses ptsname.  This was the primary fix
>>>> for the security issue.
>>>>
>>>> The behavior of opening /dev/ptmx has been changed to perform a path
>>>> lookup relative to the location of /dev/ptmx of  ./pts/ptmx and open
>>>> it it is a devpts filesystem and to fail otherwise.    This further
>>>> makes it hard to confuse userspace this way as /dev/ptmx always
>>>> corresponds to /dev/pts/ptmx.  Even in chroots and in other mount
>>>> namespaces.
>>>
>>> I have a feeling that there might be a way to trick glibc if you use
>>> FUSE, but I haven't actually tried to create a PoC for it. Fair point
>>> though.
>> 
>> To trick glibc fuse would have to be mounted somewhere on /dev.
>> 
>>>> That makes TIOCGPTPEER a very nice addition, but not something people
>>>> have to scramble to use to ensure their system is secure.  As a hostile
>>>> environment now has to work very hard to confuse the existing mechanisms.
>>>
>>> There are usecases where you simply need TIOCGPTPEER, and no other
>>> userspace alternative will do, but maybe if we modified the paragraph
>>> to read (as suggested):
>>>
>>>   Security-conscious programs interacting with namespaces may
>>>   wish  to  use  this  operation rather than open(2) with the
>>>   pathname returned by ptsname(3).
>>>
>>> This would clarify that there are usecases where you need this
>>> particular feature, without saying causing people to panic over
>>> inaccurate claims of glibc being broken. Does that sound better?
>> 
>> I think your original words sounded fine.  I would even go for new
>> programs may want to use the new ioctl as it fundamentally less racy
>> and more of what is actually trying to be implemented with the userspace
>> pieces.
>> 
>> I just wanted to point out that TIOCGPTPEER while being the interface
>> that it would have been nice had we had since the beginning (and would
>> have avoided all of the problems) is actually not something we need to
>> scramble and use it is just a very nice to have.  As the immediate
>> issues have been fixed in other ways.    It was not clear to me from the
>> other discussions if you and Michael Kerrisk were aware of the
>> mitigations that had been made to address the security issue.
>
> So, my takeaway is that we leave this text:
>
>               Security-conscious programs interacting with namespaces may
>               wish  to  use  this  operation rather than open(2) with the
>               pathname returned by ptsname(3), and similar library  func‐
>               tions that have insecure APIs.  (For example, confusion can
>               occur in some cases using ptsname(3) with a pathname  where
>               a  devpts  filesystem has been mounted in a different mount
>               namespace.)
>
> as is?
>
>> The change to the behavior of /dev/ptmx may need to be documented
>> somewhere.   I am not certain if anything has been documented since
>> devpts has started allowing multiple mounts.
>
> Eric can you say more about this. When did this change occur?
> What is the model: mount devpts once in each mount namespace?


Let me replay things as best I can remember them.

Once upon a time pty's were ordinary dev entries, and posix_openpt
scoured around and found a pair of master/slave devices that were not
used and return those.  With a suid helper (pt_chown) grantpt changes the
permissions on the slave side.


This was in the days before udev and hotplug support in the kernel
and so had the disadvantage of having requiring someone to call mknod
in /dev for all possible ptys to be used before a pty could be created.

Following a posix draft /dev/ptmx and the devpts filesystem (expected to
be mounted at /dev/pts) were added.  The opening of /dev/ptmx creates a
slave entry in the devpts filesystem.  The devpts filesystem options
"uid", "gid", and "mode" existed since the beginning (2.1.93 ish) to
ensure that the newly created slave tty has the correct mode.

At which point the setuid granpt helper (pt_chown) should have begun the
process of retirement.

The weird quirk with devpts is that if it was mounted a secound
time you would get the same filesytem you mounted the first time, but
would have the opportunity to change the mount options.

As some uncareful scripts would mount devpts in a chroot without the
proper mount options in some distributions the pt_chown binary was kept
on not just for backwards compability but to keep the system working
after these weird scripts ran.  I believe one of them is
xen-image-builder.

Meanwhile in the land of containers and checkpoint/restart we wanted the
ability to migrate open ptys.  To do that we felt it was necessary to
preserve the major/minor numbers of the open ptys, so the migrated ptys
would have the same numbers.  Which resulted in devpts gaining the
"newinstance" mount option and a ptmx device node in 2.6.29.

To create a new slave in a mount of devpts mounted with "newinstance" it
was necessary to open /dev/pts/ptmx which in practice was bind mounted
or symlinked to /dev/ptmx to create a new pty.  This solved things
for containers.

Lurking in the background was setuid helper for grantpt (pt_chown) which
would not go away because of silly chrooted scripts that do things
wrong.

The addition of "newinstance" and user namespaces made it possible for
an mischievous person to mount devpts without privilege and open a pty
and pass that back to glibc and ask it to call granpt on it.  At which
point glibc would be confused and ask pt_chown to chown the same
numbered pty in the local instance of devpts.  As the attacker could
completely control the pty number this had the potential for a lot of
mischief.

At the end of the day this wound up with 3 fixes.
- Distributions stopped shipping pt_chown.
- The devpts "newinstance" mount option was made to always apply in 4.7,
  ensuring the mount options of devpts would not affect other mounts,
  and thus removing the need for pt_chown.
- The TIOCGPTPEER ioctl was added which if used carefully makes it
  impossible to confused glibc.

To make "newinstance" always apply to devpts without breaking any
existing distribution was interesting.  The problem was the /dev/ptmx
device node, which was fine when there was only one instance of devpts.

When the "newinstance" mount option was added a specific devpts instance
was mounted kernel internal and was used for /dev/ptmx for backwards
compatiblity.  Which worked at the time.   With "newinstance" always
creating a new instance of devpts that definition did not work.

We examined just having /dev/ptmx point to the first mount of devpts but
that fails on distributions like CentOS6 that mount devpts in the initrd
and then again from fstab.  We examined having devtmpfs create ptmx as a
symlink to pts/ptmx, but that fails on distributions like slackware that
don't mount devtmpfs.  What did work was to have /dev/ptmx perform a
relative path based lookup of "pts" in the same directory as the "ptmx"
device node.  That works for the weird distros like CentOS6 and the
weird chroot images that call mknod /dev/ptmx and mount devpts
themselves.  Plus it works for all of the normal cases.

Eric