forked from torvalds/linux
devpts: Make each mount of devpts an independent filesystem.

The /dev/ptmx device node is changed to look up the directory entry "pts"
in the same directory as the /dev/ptmx device node was opened in. If
there is a "pts" entry and that entry is a devpts filesystem /dev/ptmx
uses that filesystem. Otherwise the open of /dev/ptmx fails.

The DEVPTS_MULTIPLE_INSTANCES configuration option is removed, so that
userspace can now safely depend on each mount of devpts creating a new
instance of the filesystem.

Each mount of devpts is now a separate and equal filesystem.

Reserved ttys are now available to all instances of devpts where the
mounter is in the initial mount namespace.

A new vfs helper path_pts is introduced that finds a directory entry
named "pts" in the directory of the passed in path, and changes the
passed in path to point to it. The helper path_pts uses a function
path_parent_directory that was factored out of follow_dotdot.

In the implementation of devpts:
 - devpts_mnt is killed as it is no longer meaningful if all mounts of
   devpts are equal.
 - pts_sb_from_inode is replaced by just inode->i_sb as all cached
   inodes in the tty layer are now from the devpts filesystem.
 - devpts_add_ref is rolled into the new function devpts_ptmx. And the
   unnecessary inode hold is removed.
 - devpts_del_ref is renamed devpts_release and reduced to just a
   deactivate_super.
 - The newinstance mount option continues to be accepted but is now
   ignored.

In devpts_fs.h definitions for when !CONFIG_UNIX98_PTYS are removed as
they are never used.

Documentation/filesystems/devices.txt is updated to describe the current
situation.

This has been verified to work properly on openwrt-15.05, centos5,
centos6, centos7, debian-6.0.2, debian-7.9, debian-8.2, ubuntu-14.04.3,
ubuntu-15.10, fedora23, magia-5, mint-17.3, opensuse-42.1,
slackware-14.1, gentoo-20151225 (13.0?), archlinux-2015-12-01.

With the caveat that on centos6 and on slackware-14.1 that there wind up
being two instances of the devpts filesystem mounted on /dev/pts, the
lower copy does not end up getting used.

Signed-off-by: "Eric W. Biederman" <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Peter Hurley <[email protected]>
Cc: Peter Anvin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Serge Hallyn <[email protected]>
Cc: Willy Tarreau <[email protected]>
Cc: Aurelien Jarno <[email protected]>
Cc: One Thousand Gnomes <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Florian Weimer <[email protected]>
Cc: Konstantin Khlebnikov <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
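
The lookup rule in the commit message (a ptmx device node uses the "pts" entry that sits next to it) can be approximated from userspace. The sketch below is illustrative only: `pts_dir_for` and `has_devpts_sibling` are hypothetical helper names, not part of the kernel or any tool, and the filesystem check assumes GNU coreutils `stat`, which is expected to report the type as `devpts`. It models a real device node path, not a symlink to pts/ptmx.

```shell
#!/bin/sh
# Hypothetical helper: print the directory that a ptmx device node at
# path $1 would be paired with, i.e. the sibling entry named "pts".
pts_dir_for() {
    dir=$(dirname -- "$1")
    case $dir in
        /) printf '/pts\n' ;;          # avoid printing "//pts" for a node in /
        *) printf '%s/pts\n' "$dir" ;;
    esac
}

# Hypothetical helper: succeed only if that sibling "pts" directory exists
# and is a devpts filesystem (assumes GNU stat prints "devpts" for %T).
has_devpts_sibling() {
    pts=$(pts_dir_for "$1")
    [ -d "$pts" ] && [ "$(stat -f -c %T "$pts" 2>/dev/null)" = "devpts" ]
}
```

For example, `has_devpts_sibling /dev/ptmx && echo usable` would be one way to predict whether opening that node is expected to succeed; the kernel's own check is done on the opened path, so this remains an approximation.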
Showing 7 changed files with 126 additions and 296 deletions.
@@ -1,141 +1,26 @@
+Each mount of the devpts filesystem is now distinct such that ptys
+and their indices allocated in one mount are independent from ptys
+and their indices in all other mounts.
 
-To support containers, we now allow multiple instances of devpts filesystem,
-such that indices of ptys allocated in one instance are independent of indices
-allocated in other instances of devpts.
+All mounts of the devpts filesystem now create a /dev/pts/ptmx node
+with permissions 0000.
 
-To preserve backward compatibility, this support for multiple instances is
-enabled only if:
+To retain backwards compatibility a ptmx device node (aka any node
+created with "mknod name c 5 2") when opened will look for an instance
+of devpts under the name "pts" in the same directory as the ptmx device
+node.
 
-    - CONFIG_DEVPTS_MULTIPLE_INSTANCES=y, and
-    - '-o newinstance' mount option is specified while mounting devpts
-
-IOW, devpts now supports both single-instance and multi-instance semantics.
-
-If CONFIG_DEVPTS_MULTIPLE_INSTANCES=n, there is no change in behavior and
-this referred to as the "legacy" mode. In this mode, the new mount options
-(-o newinstance and -o ptmxmode) will be ignored with a 'bogus option' message
-on console.
-
-If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and devpts is mounted without the
-'newinstance' option (as in current start-up scripts) the new mount binds
-to the initial kernel mount of devpts. This mode is referred to as the
-'single-instance' mode and the current, single-instance semantics are
-preserved, i.e PTYs are common across the system.
-
-The only difference between this single-instance mode and the legacy mode
-is the presence of new, '/dev/pts/ptmx' node with permissions 0000, which
-can safely be ignored.
-
-If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and 'newinstance' option is specified,
-the mount is considered to be in the multi-instance mode and a new instance
-of the devpts fs is created. Any ptys created in this instance are independent
-of ptys in other instances of devpts. Like in the single-instance mode, the
-/dev/pts/ptmx node is present. To effectively use the multi-instance mode,
-open of /dev/ptmx must be a redirected to '/dev/pts/ptmx' using a symlink or
-bind-mount.
-
-Eg: A container startup script could do the following:
-
-    $ chmod 0666 /dev/pts/ptmx
-    $ rm /dev/ptmx
-    $ ln -s pts/ptmx /dev/ptmx
-    $ ns_exec -cm /bin/bash
-
-    # We are now in new container
-
-    $ umount /dev/pts
-    $ mount -t devpts -o newinstance lxcpts /dev/pts
-    $ sshd -p 1234
-
-where 'ns_exec -cm /bin/bash' calls clone() with CLONE_NEWNS flag and execs
-/bin/bash in the child process. A pty created by the sshd is not visible in
-the original mount of /dev/pts.
+As an option instead of placing a /dev/ptmx device node at /dev/ptmx
+it is possible to place a symlink to /dev/pts/ptmx at /dev/ptmx or
+to bind mount /dev/pts/ptmx to /dev/ptmx. If you opt for using
+the devpts filesystem in this manner devpts should be mounted with
+the ptmxmode=0666, or chmod 0666 /dev/pts/ptmx should be called.
 
 Total count of pty pairs in all instances is limited by sysctls:
 kernel.pty.max = 4096      - global limit
-kernel.pty.reserve = 1024  - reserve for initial instance
+kernel.pty.reserve = 1024  - reserved for filesystems mounted from the initial mount namespace
 kernel.pty.nr              - current count of ptys
 
 Per-instance limit could be set by adding mount option "max=<count>".
 This feature was added in kernel 3.4 together with sysctl kernel.pty.reserve.
 In kernels older than 3.4 sysctl kernel.pty.max works as per-instance limit.
-
-User-space changes
-------------------
-
-In multi-instance mode (i.e '-o newinstance' mount option is specified at least
-once), following user-space issues should be noted.
-
-1. If -o newinstance mount option is never used, /dev/pts/ptmx can be ignored
-   and no change is needed to system-startup scripts.
-
-2. To effectively use multi-instance mode (i.e -o newinstance is specified)
-   administrators or startup scripts should "redirect" open of /dev/ptmx to
-   /dev/pts/ptmx using either a bind mount or symlink.
-
-    $ mount -t devpts -o newinstance devpts /dev/pts
-
-   followed by either
-
-    $ rm /dev/ptmx
-    $ ln -s pts/ptmx /dev/ptmx
-    $ chmod 666 /dev/pts/ptmx
-   or
-    $ mount -o bind /dev/pts/ptmx /dev/ptmx
-
-3. The '/dev/ptmx -> pts/ptmx' symlink is the preferred method since it
-   enables better error-reporting and treats both single-instance and
-   multi-instance mounts similarly.
-
-   But this method requires that system-startup scripts set the mode of
-   /dev/pts/ptmx correctly (default mode is 0000). The scripts can set the
-   mode by, either
-
-    - adding ptmxmode mount option to devpts entry in /etc/fstab, or
-    - using 'chmod 0666 /dev/pts/ptmx'
-
-4. If multi-instance mode mount is needed for containers, but the system
-   startup scripts have not yet been updated, container-startup scripts
-   should bind mount /dev/ptmx to /dev/pts/ptmx to avoid breaking single-
-   instance mounts.
-
-   Or, in general, container-startup scripts should use:
-
-    mount -t devpts -o newinstance -o ptmxmode=0666 devpts /dev/pts
-    if [ ! -L /dev/ptmx ]; then
-        mount -o bind /dev/pts/ptmx /dev/ptmx
-    fi
-
-   When all devpts mounts are multi-instance, /dev/ptmx can permanently be
-   a symlink to pts/ptmx and the bind mount can be ignored.
-
-5. A multi-instance mount that is not accompanied by the /dev/ptmx to
-   /dev/pts/ptmx redirection would result in an unusable/unreachable pty.
-
-    mount -t devpts -o newinstance lxcpts /dev/pts
-
-   immediately followed by:
-
-    open("/dev/ptmx")
-
-   would create a pty, say /dev/pts/7, in the initial kernel mount.
-   But /dev/pts/7 would be invisible in the new mount.
-
-6. The permissions for /dev/pts/ptmx node should be specified when mounting
-   /dev/pts, using the '-o ptmxmode=%o' mount option (default is 0000).
-
-    mount -t devpts -o newinstance -o ptmxmode=0644 devpts /dev/pts
-
-   The permissions can be later be changed as usual with 'chmod'.
-
-    chmod 666 /dev/pts/ptmx
-
-7. A mount of devpts without the 'newinstance' option results in binding to
-   initial kernel mount. This behavior while preserving legacy semantics,
-   does not provide strict isolation in a container environment. i.e by
-   mounting devpts without the 'newinstance' option, a container could
-   get visibility into the 'host' or root container's devpts.
-
-   To workaround this and have strict isolation, all mounts of devpts,
-   including the mount in the root container, should use the newinstance
-   option.
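
The documentation text added by this change describes two things that a short script can illustrate: giving a directory tree its own devpts instance with a usable ptmx symlink, and reading the pty sysctl limits. The sketch below is a minimal illustration, not the kernel's or any distribution's actual setup. `setup_devpts` and `pty_headroom` are hypothetical names, and the dry-run argument exists only so the commands can be inspected without root. The "other mounts" figure reflects one reading of the new kernel.pty.reserve wording (mounts made outside the initial mount namespace cannot use the reserved ptys) and ignores any per-instance "max=<count>" option.

```shell
#!/bin/sh
# Hypothetical helper: mount a fresh devpts instance under $1/dev/pts and
# point $1/dev/ptmx at it with a relative symlink.
# Pass "echo" as the second argument to print the commands instead of
# running them (running them for real needs root).
setup_devpts() {
    root=$1
    run=${2:-}
    $run mkdir -p "$root/dev/pts" &&
    $run mount -t devpts -o ptmxmode=0666 devpts "$root/dev/pts" &&
    $run ln -sf pts/ptmx "$root/dev/ptmx"
}

# Hypothetical helper: report how many more ptys could be allocated,
# based on the sysctl files in $1 (defaults to /proc/sys/kernel/pty).
pty_headroom() {
    base=${1:-/proc/sys/kernel/pty}
    max=$(cat "$base/max") || return 1
    reserve=$(cat "$base/reserve") || return 1
    nr=$(cat "$base/nr") || return 1
    # Mounts made from the initial mount namespace may use the reserve.
    echo "initial-namespace mounts: $((max - nr))"
    # Assumption: other mounts stop at max - reserve.
    echo "other mounts: $((max - reserve - nr))"
}
```

Running `setup_devpts /srv/container echo` prints the three commands for review, while `pty_headroom` with no argument reads the live values from /proc. Because ptmxmode=0666 is given at mount time, no separate chmod of pts/ptmx is needed in this sketch.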