Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20201026' into staging

virtiofsd pull 2020-10-26

Misono
   Set default log level to info
   Explicit build option for virtiofsd

Me
   xattr name mapping

Stefan
  Alternative chroot sandbox method

Max
  Submount mechanism

Signed-off-by: Dr. David Alan Gilbert <[email protected]>

# gpg: Signature made Mon 26 Oct 2020 18:41:36 GMT
# gpg:                using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <[email protected]>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert-gitlab/tags/pull-virtiofs-20201026:
  tests/acceptance: Add virtiofs_submounts.py
  tests/acceptance/boot_linux: Accept SSH pubkey
  virtiofsd: Announce sub-mount points
  virtiofsd: Store every lo_inode's parent_dev
  virtiofsd: Add fuse_reply_attr_with_flags()
  virtiofsd: Add attr_flags to fuse_entry_param
  virtiofsd: Announce FUSE_ATTR_FLAGS
  linux/fuse.h: Pull in from Linux
  tools/virtiofsd: xattr name mappings: Simple 'map'
  tools/virtiofsd: xattr name mapping examples
  tools/virtiofsd: xattr name mappings: Map server xattr names
  tools/virtiofsd: xattr name mappings: Map client xattr names
  tools/virtiofsd: xattr name mappings: Add option
  virtiofsd: add container-friendly -o sandbox=chroot option
  virtiofsd: passthrough_ll: set FUSE_LOG_INFO as default log_level
  configure: add option for virtiofsd

Signed-off-by: Peter Maydell <[email protected]>
pm215 committed Oct 27, 2020
2 parents 4a74626 + c93a656 commit 725ca33
Showing 17 changed files with 1,528 additions and 40 deletions.
8 changes: 7 additions & 1 deletion configure
@@ -302,6 +302,7 @@ fdt="auto"
netmap="no"
sdl="auto"
sdl_image="auto"
virtiofsd="auto"
virtfs=""
libudev="auto"
mpath="auto"
@@ -999,6 +1000,10 @@ for opt do
;;
--enable-libudev) libudev="enabled"
;;
--disable-virtiofsd) virtiofsd="disabled"
;;
--enable-virtiofsd) virtiofsd="enabled"
;;
--disable-mpath) mpath="disabled"
;;
--enable-mpath) mpath="enabled"
@@ -1758,6 +1763,7 @@ disabled with --disable-FEATURE, default is enabled if available:
vnc-png PNG compression for VNC server
cocoa Cocoa UI (Mac OS X only)
virtfs VirtFS
virtiofsd build virtiofs daemon (virtiofsd)
libudev Use libudev to enumerate host devices
mpath Multipath persistent reservation passthrough
xen xen backend driver support
@@ -6972,7 +6978,7 @@ NINJA=$ninja $meson setup \
-Dxen=$xen -Dxen_pci_passthrough=$xen_pci_passthrough -Dtcg=$tcg \
-Dcocoa=$cocoa -Dmpath=$mpath -Dsdl=$sdl -Dsdl_image=$sdl_image \
-Dvnc=$vnc -Dvnc_sasl=$vnc_sasl -Dvnc_jpeg=$vnc_jpeg -Dvnc_png=$vnc_png \
-Dgettext=$gettext -Dxkbcommon=$xkbcommon -Du2f=$u2f \
-Dgettext=$gettext -Dxkbcommon=$xkbcommon -Du2f=$u2f -Dvirtiofsd=$virtiofsd \
-Dcapstone=$capstone -Dslirp=$slirp -Dfdt=$fdt \
-Diconv=$iconv -Dcurses=$curses -Dlibudev=$libudev\
-Ddocs=$docs -Dsphinx_build=$sphinx_build -Dinstall_blobs=$blobs \
193 changes: 186 additions & 7 deletions docs/tools/virtiofsd.rst
@@ -17,13 +17,24 @@ This program is designed to work with QEMU's ``--device vhost-user-fs-pci``
but should work with any virtual machine monitor (VMM) that supports
vhost-user. See the Examples section below.

This program must be run as the root user. Upon startup the program will
switch into a new file system namespace with the shared directory tree as its
root. This prevents "file system escapes" due to symlinks and other file
system objects that might lead to files outside the shared directory. The
program also sandboxes itself using seccomp(2) to prevent ptrace(2) and other
vectors that could allow an attacker to compromise the system after gaining
control of the virtiofsd process.
This program must be run as the root user. The program drops privileges where
possible during startup although it must be able to create and access files
with any uid/gid:

* The ability to invoke syscalls is limited using seccomp(2).
* Linux capabilities(7) are dropped.

In "namespace" sandbox mode the program switches into a new file system
namespace and invokes pivot_root(2) to make the shared directory tree its root.
A new pid and net namespace is also created to isolate the process.

In "chroot" sandbox mode the program invokes chroot(2) to make the shared
directory tree its root. This mode is intended for container environments where
the container runtime has already set up the namespaces and the program does
not have permission to create namespaces itself.

Both sandbox modes prevent "file system escapes" due to symlinks and other file
system objects that might lead to files outside the shared directory.

Options
-------
@@ -69,6 +80,13 @@ Options
* readdirplus|no_readdirplus -
Enable/disable readdirplus. The default is ``readdirplus``.

* sandbox=namespace|chroot -
Sandbox mode:
- namespace: Create mount, pid, and net namespaces and pivot_root(2) into
the shared directory.
- chroot: chroot(2) into shared directory (use in containers).
The default is "namespace".

* source=PATH -
Share host directory tree located at PATH. This option is required.

@@ -109,6 +127,167 @@ Options
timeout. ``always`` sets a long cache lifetime at the expense of coherency.
The default is ``auto``.

xattr-mapping
-------------

By default the names of xattr's used by the client are passed through to the server
file system. This can be a problem where either those xattr names are used
by something on the server (e.g. selinux client/server confusion) or if the
virtiofsd is running in a container with restricted privileges where it cannot
access some attributes.

A mapping of xattr names can be made using -o xattrmap=mapping where the ``mapping``
string consists of a series of rules.

The first matching rule terminates the mapping.
The set of rules must include a terminating rule to match any remaining attributes
at the end.

Each rule consists of a number of fields separated with a separator that is the
first non-white space character in the rule. This separator must then be used
for the whole rule.
White space may be added before and after each rule.

Using ':' as the separator a rule is of the form:

``:type:scope:key:prepend:``

**scope** is:

- 'client' - match 'key' against a xattr name from the client for
setxattr/getxattr/removexattr
- 'server' - match 'prepend' against a xattr name from the server
for listxattr
- 'all' - can be used to make a single rule where both the server
and client matches are triggered.

**type** is one of:

- 'prefix' - is designed to prepend and strip a prefix; the modified
attributes then being passed on to the client/server.

- 'ok' - Causes the rule set to be terminated when a match is found
while allowing matching xattr's through unchanged.
It is intended both as a way of explicitly terminating
the list of rules, and to allow some xattr's to skip following rules.

- 'bad' - If a client tries to use a name matching 'key' it's
denied using EPERM; when the server passes an attribute
name matching 'prepend' it's hidden. In many ways its use is very like
'ok' as either an explicit terminator or for special handling of certain
patterns.

**key** is a string tested as a prefix on an attribute name originating
on the client. It may be empty in which case a 'client' rule
will always match on client names.

**prepend** is a string tested as a prefix on an attribute name originating
on the server, and used as a new prefix. It may be empty
in which case a 'server' rule will always match on all names from
the server.

e.g.:

``:prefix:client:trusted.:user.virtiofs.:``

will match 'trusted.' attributes in client calls and prefix them before
passing them to the server.

``:prefix:server::user.virtiofs.:``

will strip 'user.virtiofs.' from all server replies.

``:prefix:all:trusted.:user.virtiofs.:``

combines the previous two cases into a single rule.

``:ok:client:user.::``

will allow get/set xattr for 'user.' xattr's and ignore
following rules.

``:ok:server::security.:``

will pass 'security.' xattr's in listxattr from the server
and ignore following rules.

``:ok:all:::``

will terminate the rule search passing any remaining attributes
in both directions.

``:bad:server::security.:``

would hide 'security.' xattr's in listxattr from the server.
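
The rule format and the first-match behaviour described above can be modelled with a short Python sketch. This is an illustrative model of the documented semantics, not virtiofsd's implementation; the names ``Rule``, ``parse_rules``, ``map_client_name`` and ``map_server_name`` are hypothetical, the 'map' type is not handled, and treating a name that matches no rule as denied (client) or hidden (server) is an assumption.

```python
from collections import namedtuple

# Hypothetical in-memory form of one ':type:scope:key:prepend:' rule.
Rule = namedtuple("Rule", "type scope key prepend")


def parse_rules(mapping):
    """Split a mapping string into rules.

    Each rule starts with its own separator character (the first
    non-whitespace character), followed by four separator-terminated fields.
    """
    rules = []
    pos = 0
    while True:
        # White space is allowed before and after each rule.
        while pos < len(mapping) and mapping[pos].isspace():
            pos += 1
        if pos >= len(mapping):
            return rules
        sep = mapping[pos]
        pos += 1
        fields = []
        for _ in range(4):
            end = mapping.find(sep, pos)
            if end == -1:
                raise ValueError("unterminated rule field")
            fields.append(mapping[pos:end])
            pos = end + 1
        rule = Rule(*fields)
        if rule.type not in ("prefix", "ok", "bad"):
            raise ValueError("unsupported rule type: %r" % rule.type)
        if rule.scope not in ("client", "server", "all"):
            raise ValueError("unknown scope: %r" % rule.scope)
        rules.append(rule)


def map_client_name(rules, name):
    """Map a client xattr name (setxattr/getxattr/removexattr)."""
    for rule in rules:
        if rule.scope in ("client", "all") and name.startswith(rule.key):
            if rule.type == "bad":
                raise PermissionError(name)  # documented as EPERM
            if rule.type == "ok":
                return name
            return rule.prepend + name  # 'prefix': prepend the new prefix
    raise PermissionError(name)  # assumption: no matching rule means denied


def map_server_name(rules, name):
    """Map a server xattr name (listxattr); None means hidden."""
    for rule in rules:
        if rule.scope in ("server", "all") and name.startswith(rule.prepend):
            if rule.type == "bad":
                return None
            if rule.type == "ok":
                return name
            return name[len(rule.prepend):]  # 'prefix': strip the prefix
    return None  # assumption: no matching rule means hidden
```

With the rule set ``:prefix:all::user.virtiofs.::bad:all:::``, this model maps the client name ``trusted.foo`` to ``user.virtiofs.trusted.foo`` and hides an unprefixed server name such as ``security.selinux``.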

A simpler 'map' type provides a shorter syntax for the common case:

``:map:key:prepend:``

The 'map' type adds a number of separate rules to add **prepend** as a prefix
to the matched **key** (or all attributes if **key** is empty).
There may be at most one 'map' rule and it must be the last rule in the set.

xattr-mapping Examples
----------------------

1) Prefix all attributes with 'user.virtiofs.'

::

-o xattrmap=":prefix:all::user.virtiofs.::bad:all:::"


This uses two rules, using : as the field separator;
the first rule prefixes and strips 'user.virtiofs.',
the second rule hides any non-prefixed attributes that
the host set.

This is equivalent to the 'map' rule:

::

-o xattrmap=":map::user.virtiofs.:"

2) Prefix 'trusted.' attributes, allow others through

::

"/prefix/all/trusted./user.virtiofs./
/bad/server//trusted./
/bad/client/user.virtiofs.//
/ok/all///"


Here there are four rules, using / as the field
separator, and also demonstrating that new lines can
be included between rules.
The first rule is the prefixing of 'trusted.' and
stripping of 'user.virtiofs.'.
The second rule hides unprefixed 'trusted.' attributes
on the host.
The third rule stops a guest from explicitly setting
the 'user.virtiofs.' path directly.
Finally, the fourth rule lets all remaining attributes
through.

This is equivalent to the 'map' rule:

::

-o xattrmap="/map/trusted./user.virtiofs./"

3) Hide 'security.' attributes, and allow everything else

::

"/bad/all/security./security./
/ok/all///"

The first rule combines what could be separate client and server
rules into a single 'all' rule, matching 'security.' in either
client arguments or lists returned from the host. This stops
the client seeing any 'security.' attributes on the server and
stops it setting any.
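
Examples 1 and 2 each state an equivalent 'map' rule. The sketch below generalises those two stated equivalences into a small Python helper that expands a 'map' rule into the longer rule set. The function name ``expand_map`` is hypothetical, and the assumption that virtiofsd expands every 'map' rule in exactly this way goes beyond what the two examples show.

```python
def expand_map(key, prepend, sep=":"):
    """Expand ':map:key:prepend:' into the equivalent explicit rules.

    Follows the two equivalences given in the examples: an empty key
    yields two rules, a non-empty key yields four.
    """
    def rule(rule_type, scope, rule_key, rule_prepend):
        # One rule is the separator, four fields, each followed by the separator.
        return sep + sep.join([rule_type, scope, rule_key, rule_prepend]) + sep

    if not key:
        return "".join([
            rule("prefix", "all", "", prepend),  # prefix and strip everything
            rule("bad", "all", "", ""),          # hide non-prefixed host attributes
        ])
    return "".join([
        rule("prefix", "all", key, prepend),  # prefix 'key', strip 'prepend'
        rule("bad", "server", "", key),       # hide unprefixed 'key' names on the host
        rule("bad", "client", prepend, ""),   # stop the guest using 'prepend' directly
        rule("ok", "all", "", ""),            # let everything else through
    ])
```

For instance, ``expand_map("", "user.virtiofs.")`` yields the two-rule string from example 1, and ``expand_map("trusted.", "user.virtiofs.", sep="/")`` yields the four rules of example 2 on a single line.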

Examples
--------

11 changes: 10 additions & 1 deletion include/standard-headers/linux/fuse.h
@@ -227,7 +227,7 @@ struct fuse_attr {
uint32_t gid;
uint32_t rdev;
uint32_t blksize;
uint32_t padding;
uint32_t flags;
};

struct fuse_kstatfs {
@@ -310,6 +310,7 @@ struct fuse_file_lock {
* FUSE_NO_OPENDIR_SUPPORT: kernel supports zero-message opendir
* FUSE_EXPLICIT_INVAL_DATA: only invalidate cached pages on explicit request
* FUSE_MAP_ALIGNMENT: map_alignment field is valid
* FUSE_ATTR_FLAGS: fuse_attr.flags is present and valid
*/
#define FUSE_ASYNC_READ (1 << 0)
#define FUSE_POSIX_LOCKS (1 << 1)
@@ -338,6 +339,7 @@ struct fuse_file_lock {
#define FUSE_NO_OPENDIR_SUPPORT (1 << 24)
#define FUSE_EXPLICIT_INVAL_DATA (1 << 25)
#define FUSE_MAP_ALIGNMENT (1 << 26)
#define FUSE_ATTR_FLAGS (1 << 27)

/**
* CUSE INIT request/reply flags
@@ -413,6 +415,13 @@ struct fuse_file_lock {
*/
#define FUSE_FSYNC_FDATASYNC (1 << 0)

/**
* fuse_attr flags
*
* FUSE_ATTR_SUBMOUNT: File/directory is a submount point
*/
#define FUSE_ATTR_SUBMOUNT (1 << 0)

enum fuse_opcode {
FUSE_LOOKUP = 1,
FUSE_FORGET = 2, /* no reply */
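
The two new bits in this header work together: per the comments above, ``fuse_attr.flags`` is only present and valid once ``FUSE_ATTR_FLAGS`` has been negotiated, and ``FUSE_ATTR_SUBMOUNT`` marks a file or directory as a submount point. A minimal Python sketch of that check follows; the constants are copied from the header, while the helper name ``is_submount`` and its parameters are hypothetical and not part of any FUSE API.

```python
# Values copied from include/standard-headers/linux/fuse.h in this diff.
FUSE_ATTR_FLAGS = 1 << 27    # INIT flag: fuse_attr.flags is present and valid
FUSE_ATTR_SUBMOUNT = 1 << 0  # fuse_attr flag: entry is a submount point


def is_submount(init_flags, attr_flags):
    """Return True if an entry should be treated as a submount point.

    init_flags: the INIT flags in effect for the session (hypothetical parameter).
    attr_flags: the value of fuse_attr.flags for the entry.
    """
    if not init_flags & FUSE_ATTR_FLAGS:
        # Without FUSE_ATTR_FLAGS the field is still padding; ignore it.
        return False
    return bool(attr_flags & FUSE_ATTR_SUBMOUNT)
```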
1 change: 1 addition & 0 deletions meson.build
@@ -2045,6 +2045,7 @@ summary_info += {'Audio drivers': config_host['CONFIG_AUDIO_DRIVERS']}
summary_info += {'Block whitelist (rw)': config_host['CONFIG_BDRV_RW_WHITELIST']}
summary_info += {'Block whitelist (ro)': config_host['CONFIG_BDRV_RO_WHITELIST']}
summary_info += {'VirtFS support': config_host.has_key('CONFIG_VIRTFS')}
summary_info += {'build virtiofs daemon': have_virtiofsd}
summary_info += {'Multipath support': mpathpersist.found()}
summary_info += {'VNC support': vnc.found()}
if vnc.found()
2 changes: 2 additions & 0 deletions meson_options.txt
@@ -62,6 +62,8 @@ option('vnc_sasl', type : 'feature', value : 'auto',
description: 'SASL authentication for VNC server')
option('xkbcommon', type : 'feature', value : 'auto',
description: 'xkbcommon support')
option('virtiofsd', type: 'feature', value: 'auto',
description: 'build virtiofs daemon (virtiofsd)')

option('capstone', type: 'combo', value: 'auto',
choices: ['disabled', 'enabled', 'auto', 'system', 'internal'],
13 changes: 7 additions & 6 deletions tests/acceptance/boot_linux.py
@@ -57,7 +57,7 @@ def download_boot(self):
self.cancel('Failed to download/prepare boot image')
return boot.path

def download_cloudinit(self):
def download_cloudinit(self, ssh_pubkey=None):
self.log.info('Preparing cloudinit image')
try:
cloudinit_iso = os.path.join(self.workdir, 'cloudinit.iso')
@@ -67,7 +67,8 @@ def download_cloudinit(self):
password='password',
# QEMU's hard coded usermode router address
phone_home_host='10.0.2.2',
phone_home_port=self.phone_home_port)
phone_home_port=self.phone_home_port,
authorized_key=ssh_pubkey)
except Exception:
self.cancel('Failed to prepared cloudinit image')
return cloudinit_iso
@@ -80,19 +81,19 @@ class BootLinux(BootLinuxBase):
timeout = 900
chksum = None

def setUp(self):
def setUp(self, ssh_pubkey=None):
super(BootLinux, self).setUp()
self.vm.add_args('-smp', '2')
self.vm.add_args('-m', '1024')
self.prepare_boot()
self.prepare_cloudinit()
self.prepare_cloudinit(ssh_pubkey)

def prepare_boot(self):
path = self.download_boot()
self.vm.add_args('-drive', 'file=%s' % path)

def prepare_cloudinit(self):
cloudinit_iso = self.download_cloudinit()
def prepare_cloudinit(self, ssh_pubkey=None):
cloudinit_iso = self.download_cloudinit(ssh_pubkey)
self.vm.add_args('-drive', 'file=%s,format=raw' % cloudinit_iso)

def launch_and_wait(self):
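
The change above threads an optional ``ssh_pubkey`` from ``setUp()`` through ``prepare_cloudinit()`` to ``download_cloudinit()``, where it is passed on as ``authorized_key``. The sketch below shows how a subclass might use that. ``FakeBootLinux`` is a hypothetical stand-in that only mirrors the new signatures so the example runs without the test framework; it is not the real ``BootLinux`` class, and the key string is a placeholder.

```python
class FakeBootLinux:
    """Hypothetical stand-in mirroring the signatures added in this diff."""

    def setUp(self, ssh_pubkey=None):
        self.prepare_cloudinit(ssh_pubkey)

    def prepare_cloudinit(self, ssh_pubkey=None):
        self.cloudinit_iso = self.download_cloudinit(ssh_pubkey)

    def download_cloudinit(self, ssh_pubkey=None):
        # The real method forwards this value as authorized_key=ssh_pubkey;
        # here it is only recorded so the flow can be inspected.
        self.authorized_key = ssh_pubkey
        return "cloudinit.iso"


class SshEnabledTest(FakeBootLinux):
    """A test that wants the guest to accept a given SSH public key."""

    def setUp(self):
        pubkey = "ssh-ed25519 PLACEHOLDER-KEY test@example"  # placeholder value
        super().setUp(pubkey)
```

Because ``ssh_pubkey`` defaults to ``None`` at every level, existing subclasses that call ``setUp()`` without arguments keep their previous behaviour.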