Skip to content

Commit

Permalink
Merge branch 'syscalls-next' of git://git.kernel.org/pub/scm/linux/ke…
Browse files Browse the repository at this point in the history
…rnel/git/brodo/linux

Pull removal of in-kernel calls to syscalls from Dominik Brodowski:
 "System calls are interaction points between userspace and the kernel.
  Therefore, system call functions such as sys_xyzzy() or
  compat_sys_xyzzy() should only be called from userspace via the
  syscall table, but not from elsewhere in the kernel.

  At least on 64-bit x86, it will likely be a hard requirement from
  v4.17 onwards to not call system call functions in the kernel: It is
  better to use use a different calling convention for system calls
  there, where struct pt_regs is decoded on-the-fly in a syscall wrapper
  which then hands processing over to the actual syscall function. This
  means that only those parameters which are actually needed for a
  specific syscall are passed on during syscall entry, instead of
  filling in six CPU registers with random user space content all the
  time (which may cause serious trouble down the call chain). Those
  x86-specific patches will be pushed through the x86 tree in the near
  future.

  Moreover, rules on how data may be accessed may differ between kernel
  data and user data. This is another reason why calling sys_xyzzy() is
  generally a bad idea, and -- at most -- acceptable in arch-specific
  code.

  This patchset removes all in-kernel calls to syscall functions in the
  kernel with the exception of arch/. On top of this, it cleans up the
  three places where many syscalls are referenced or prototyped, namely
  kernel/sys_ni.c, include/linux/syscalls.h and include/linux/compat.h"

* 'syscalls-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux: (109 commits)
  bpf: whitelist all syscalls for error injection
  kernel/sys_ni: remove {sys_,sys_compat} from cond_syscall definitions
  kernel/sys_ni: sort cond_syscall() entries
  syscalls/x86: auto-create compat_sys_*() prototypes
  syscalls: sort syscall prototypes in include/linux/compat.h
  net: remove compat_sys_*() prototypes from net/compat.h
  syscalls: sort syscall prototypes in include/linux/syscalls.h
  kexec: move sys_kexec_load() prototype to syscalls.h
  x86/sigreturn: use SYSCALL_DEFINE0
  x86: fix sys_sigreturn() return type to be long, not unsigned long
  x86/ioport: add ksys_ioperm() helper; remove in-kernel calls to sys_ioperm()
  mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead()
  mm: add ksys_mmap_pgoff() helper; remove in-kernel calls to sys_mmap_pgoff()
  mm: add ksys_fadvise64_64() helper; remove in-kernel call to sys_fadvise64_64()
  fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate()
  fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls
  fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate()
  fs: add ksys_sync_file_range helper(); remove in-kernel calls to syscall
  kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid()
  kernel: add ksys_unshare() helper; remove in-kernel calls to sys_unshare()
  ...
  • Loading branch information
torvalds committed Apr 3, 2018
2 parents 2103596 + c9a2119 commit 642e7fd
Show file tree
Hide file tree
Showing 105 changed files with 3,129 additions and 1,868 deletions.
34 changes: 33 additions & 1 deletion Documentation/process/adding-syscalls.rst
Original file line number Diff line number Diff line change
Expand Up @@ -222,7 +222,7 @@ your new syscall number may get adjusted to resolve conflicts.
The file ``kernel/sys_ni.c`` provides a fallback stub implementation of each
system call, returning ``-ENOSYS``. Add your new system call here too::

cond_syscall(sys_xyzzy);
COND_SYSCALL(xyzzy);

Your new kernel functionality, and the system call that controls it, should
normally be optional, so add a ``CONFIG`` option (typically to
Expand Down Expand Up @@ -487,6 +487,38 @@ patchset, for the convenience of reviewers.
The man page should be cc'ed to [email protected]
For more details, see https://www.kernel.org/doc/man-pages/patches.html


Do not call System Calls in the Kernel
--------------------------------------

System calls are, as stated above, interaction points between userspace and
the kernel. Therefore, system call functions such as ``sys_xyzzy()`` or
``compat_sys_xyzzy()`` should only be called from userspace via the syscall
table, but not from elsewhere in the kernel. If the syscall functionality is
useful to be used within the kernel, needs to be shared between an old and a
new syscall, or needs to be shared between a syscall and its compatibility
variant, it should be implemented by means of a "helper" function (such as
``kern_xyzzy()``). This kernel function may then be called within the
syscall stub (``sys_xyzzy()``), the compatibility syscall stub
(``compat_sys_xyzzy()``), and/or other kernel code.

At least on 64-bit x86, it will be a hard requirement from v4.17 onwards to not
call system call functions in the kernel. It uses a different calling
convention for system calls where ``struct pt_regs`` is decoded on-the-fly in a
syscall wrapper which then hands processing over to the actual syscall function.
This means that only those parameters which are actually needed for a specific
syscall are passed on during syscall entry, instead of filling in six CPU
registers with random user space content all the time (which may cause serious
trouble down the call chain).

Moreover, rules on how data may be accessed may differ between kernel data and
user data. This is another reason why calling ``sys_xyzzy()`` is generally a
bad idea.

Exceptions to this rule are only allowed in architecture-specific overrides,
architecture-specific compatibility wrappers, or other code in arch/.


References and Sources
----------------------

Expand Down
2 changes: 1 addition & 1 deletion arch/alpha/kernel/osf_sys.c
Original file line number Diff line number Diff line change
Expand Up @@ -189,7 +189,7 @@ SYSCALL_DEFINE6(osf_mmap, unsigned long, addr, unsigned long, len,
goto out;
if (off & ~PAGE_MASK)
goto out;
ret = sys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
ret = ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
out:
return ret;
}
Expand Down
2 changes: 1 addition & 1 deletion arch/arm/kernel/sys_arm.c
Original file line number Diff line number Diff line change
Expand Up @@ -35,5 +35,5 @@
asmlinkage long sys_arm_fadvise64_64(int fd, int advice,
loff_t offset, loff_t len)
{
return sys_fadvise64_64(fd, offset, len, advice);
return ksys_fadvise64_64(fd, offset, len, advice);
}
2 changes: 1 addition & 1 deletion arch/arm64/kernel/sys.c
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ asmlinkage long sys_mmap(unsigned long addr, unsigned long len,
if (offset_in_page(off) != 0)
return -EINVAL;

return sys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
}

SYSCALL_DEFINE1(arm64_personality, unsigned int, personality)
Expand Down
4 changes: 2 additions & 2 deletions arch/ia64/kernel/sys_ia64.c
Original file line number Diff line number Diff line change
Expand Up @@ -139,7 +139,7 @@ int ia64_mmap_check(unsigned long addr, unsigned long len,
asmlinkage unsigned long
sys_mmap2 (unsigned long addr, unsigned long len, int prot, int flags, int fd, long pgoff)
{
addr = sys_mmap_pgoff(addr, len, prot, flags, fd, pgoff);
addr = ksys_mmap_pgoff(addr, len, prot, flags, fd, pgoff);
if (!IS_ERR((void *) addr))
force_successful_syscall_return();
return addr;
Expand All @@ -151,7 +151,7 @@ sys_mmap (unsigned long addr, unsigned long len, int prot, int flags, int fd, lo
if (offset_in_page(off) != 0)
return -EINVAL;

addr = sys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
addr = ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
if (!IS_ERR((void *) addr))
force_successful_syscall_return();
return addr;
Expand Down
2 changes: 1 addition & 1 deletion arch/m68k/kernel/sys_m68k.c
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ asmlinkage long sys_mmap2(unsigned long addr, unsigned long len,
* so we need to shift the argument down by 1; m68k mmap64(3)
* (in libc) expects the last argument of mmap2 in 4Kb units.
*/
return sys_mmap_pgoff(addr, len, prot, flags, fd, pgoff);
return ksys_mmap_pgoff(addr, len, prot, flags, fd, pgoff);
}

/* Convert virtual (user) address VADDR to physical address PADDR */
Expand Down
6 changes: 3 additions & 3 deletions arch/microblaze/kernel/sys_microblaze.c
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
if (pgoff & ~PAGE_MASK)
return -EINVAL;

return sys_mmap_pgoff(addr, len, prot, flags, fd, pgoff >> PAGE_SHIFT);
return ksys_mmap_pgoff(addr, len, prot, flags, fd, pgoff >> PAGE_SHIFT);
}

SYSCALL_DEFINE6(mmap2, unsigned long, addr, unsigned long, len,
Expand All @@ -50,6 +50,6 @@ SYSCALL_DEFINE6(mmap2, unsigned long, addr, unsigned long, len,
if (pgoff & (~PAGE_MASK >> 12))
return -EINVAL;

return sys_mmap_pgoff(addr, len, prot, flags, fd,
pgoff >> (PAGE_SHIFT - 12));
return ksys_mmap_pgoff(addr, len, prot, flags, fd,
pgoff >> (PAGE_SHIFT - 12));
}
22 changes: 11 additions & 11 deletions arch/mips/kernel/linux32.c
Original file line number Diff line number Diff line change
Expand Up @@ -67,8 +67,8 @@ SYSCALL_DEFINE6(32_mmap2, unsigned long, addr, unsigned long, len,
{
if (pgoff & (~PAGE_MASK >> 12))
return -EINVAL;
return sys_mmap_pgoff(addr, len, prot, flags, fd,
pgoff >> (PAGE_SHIFT-12));
return ksys_mmap_pgoff(addr, len, prot, flags, fd,
pgoff >> (PAGE_SHIFT-12));
}

#define RLIM_INFINITY32 0x7fffffff
Expand All @@ -82,13 +82,13 @@ struct rlimit32 {
SYSCALL_DEFINE4(32_truncate64, const char __user *, path,
unsigned long, __dummy, unsigned long, a2, unsigned long, a3)
{
return sys_truncate(path, merge_64(a2, a3));
return ksys_truncate(path, merge_64(a2, a3));
}

SYSCALL_DEFINE4(32_ftruncate64, unsigned long, fd, unsigned long, __dummy,
unsigned long, a2, unsigned long, a3)
{
return sys_ftruncate(fd, merge_64(a2, a3));
return ksys_ftruncate(fd, merge_64(a2, a3));
}

SYSCALL_DEFINE5(32_llseek, unsigned int, fd, unsigned int, offset_high,
Expand All @@ -105,13 +105,13 @@ SYSCALL_DEFINE5(32_llseek, unsigned int, fd, unsigned int, offset_high,
SYSCALL_DEFINE6(32_pread, unsigned long, fd, char __user *, buf, size_t, count,
unsigned long, unused, unsigned long, a4, unsigned long, a5)
{
return sys_pread64(fd, buf, count, merge_64(a4, a5));
return ksys_pread64(fd, buf, count, merge_64(a4, a5));
}

SYSCALL_DEFINE6(32_pwrite, unsigned int, fd, const char __user *, buf,
size_t, count, u32, unused, u64, a4, u64, a5)
{
return sys_pwrite64(fd, buf, count, merge_64(a4, a5));
return ksys_pwrite64(fd, buf, count, merge_64(a4, a5));
}

SYSCALL_DEFINE1(32_personality, unsigned long, personality)
Expand All @@ -131,15 +131,15 @@ SYSCALL_DEFINE1(32_personality, unsigned long, personality)
asmlinkage ssize_t sys32_readahead(int fd, u32 pad0, u64 a2, u64 a3,
size_t count)
{
return sys_readahead(fd, merge_64(a2, a3), count);
return ksys_readahead(fd, merge_64(a2, a3), count);
}

asmlinkage long sys32_sync_file_range(int fd, int __pad,
unsigned long a2, unsigned long a3,
unsigned long a4, unsigned long a5,
int flags)
{
return sys_sync_file_range(fd,
return ksys_sync_file_range(fd,
merge_64(a2, a3), merge_64(a4, a5),
flags);
}
Expand All @@ -149,14 +149,14 @@ asmlinkage long sys32_fadvise64_64(int fd, int __pad,
unsigned long a4, unsigned long a5,
int flags)
{
return sys_fadvise64_64(fd,
return ksys_fadvise64_64(fd,
merge_64(a2, a3), merge_64(a4, a5),
flags);
}

asmlinkage long sys32_fallocate(int fd, int mode, unsigned offset_a2,
unsigned offset_a3, unsigned len_a4, unsigned len_a5)
{
return sys_fallocate(fd, mode, merge_64(offset_a2, offset_a3),
merge_64(len_a4, len_a5));
return ksys_fallocate(fd, mode, merge_64(offset_a2, offset_a3),
merge_64(len_a4, len_a5));
}
6 changes: 4 additions & 2 deletions arch/mips/kernel/syscall.c
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,8 @@ SYSCALL_DEFINE6(mips_mmap, unsigned long, addr, unsigned long, len,
{
if (offset & ~PAGE_MASK)
return -EINVAL;
return sys_mmap_pgoff(addr, len, prot, flags, fd, offset >> PAGE_SHIFT);
return ksys_mmap_pgoff(addr, len, prot, flags, fd,
offset >> PAGE_SHIFT);
}

SYSCALL_DEFINE6(mips_mmap2, unsigned long, addr, unsigned long, len,
Expand All @@ -73,7 +74,8 @@ SYSCALL_DEFINE6(mips_mmap2, unsigned long, addr, unsigned long, len,
if (pgoff & (~PAGE_MASK >> 12))
return -EINVAL;

return sys_mmap_pgoff(addr, len, prot, flags, fd, pgoff >> (PAGE_SHIFT-12));
return ksys_mmap_pgoff(addr, len, prot, flags, fd,
pgoff >> (PAGE_SHIFT - 12));
}

save_static_function(sys_fork);
Expand Down
30 changes: 15 additions & 15 deletions arch/parisc/kernel/sys_parisc.c
Original file line number Diff line number Diff line change
Expand Up @@ -270,16 +270,16 @@ asmlinkage unsigned long sys_mmap2(unsigned long addr, unsigned long len,
{
/* Make sure the shift for mmap2 is constant (12), no matter what PAGE_SIZE
we have. */
return sys_mmap_pgoff(addr, len, prot, flags, fd,
pgoff >> (PAGE_SHIFT - 12));
return ksys_mmap_pgoff(addr, len, prot, flags, fd,
pgoff >> (PAGE_SHIFT - 12));
}

asmlinkage unsigned long sys_mmap(unsigned long addr, unsigned long len,
unsigned long prot, unsigned long flags, unsigned long fd,
unsigned long offset)
{
if (!(offset & ~PAGE_MASK)) {
return sys_mmap_pgoff(addr, len, prot, flags, fd,
return ksys_mmap_pgoff(addr, len, prot, flags, fd,
offset >> PAGE_SHIFT);
} else {
return -EINVAL;
Expand All @@ -292,24 +292,24 @@ asmlinkage unsigned long sys_mmap(unsigned long addr, unsigned long len,
asmlinkage long parisc_truncate64(const char __user * path,
unsigned int high, unsigned int low)
{
return sys_truncate(path, (long)high << 32 | low);
return ksys_truncate(path, (long)high << 32 | low);
}

asmlinkage long parisc_ftruncate64(unsigned int fd,
unsigned int high, unsigned int low)
{
return sys_ftruncate(fd, (long)high << 32 | low);
return ksys_ftruncate(fd, (long)high << 32 | low);
}

/* stubs for the benefit of the syscall_table since truncate64 and truncate
* are identical on LP64 */
asmlinkage long sys_truncate64(const char __user * path, unsigned long length)
{
return sys_truncate(path, length);
return ksys_truncate(path, length);
}
asmlinkage long sys_ftruncate64(unsigned int fd, unsigned long length)
{
return sys_ftruncate(fd, length);
return ksys_ftruncate(fd, length);
}
asmlinkage long sys_fcntl64(unsigned int fd, unsigned int cmd, unsigned long arg)
{
Expand All @@ -320,7 +320,7 @@ asmlinkage long sys_fcntl64(unsigned int fd, unsigned int cmd, unsigned long arg
asmlinkage long parisc_truncate64(const char __user * path,
unsigned int high, unsigned int low)
{
return sys_truncate64(path, (loff_t)high << 32 | low);
return ksys_truncate(path, (loff_t)high << 32 | low);
}

asmlinkage long parisc_ftruncate64(unsigned int fd,
Expand All @@ -333,42 +333,42 @@ asmlinkage long parisc_ftruncate64(unsigned int fd,
asmlinkage ssize_t parisc_pread64(unsigned int fd, char __user *buf, size_t count,
unsigned int high, unsigned int low)
{
return sys_pread64(fd, buf, count, (loff_t)high << 32 | low);
return ksys_pread64(fd, buf, count, (loff_t)high << 32 | low);
}

asmlinkage ssize_t parisc_pwrite64(unsigned int fd, const char __user *buf,
size_t count, unsigned int high, unsigned int low)
{
return sys_pwrite64(fd, buf, count, (loff_t)high << 32 | low);
return ksys_pwrite64(fd, buf, count, (loff_t)high << 32 | low);
}

asmlinkage ssize_t parisc_readahead(int fd, unsigned int high, unsigned int low,
size_t count)
{
return sys_readahead(fd, (loff_t)high << 32 | low, count);
return ksys_readahead(fd, (loff_t)high << 32 | low, count);
}

asmlinkage long parisc_fadvise64_64(int fd,
unsigned int high_off, unsigned int low_off,
unsigned int high_len, unsigned int low_len, int advice)
{
return sys_fadvise64_64(fd, (loff_t)high_off << 32 | low_off,
return ksys_fadvise64_64(fd, (loff_t)high_off << 32 | low_off,
(loff_t)high_len << 32 | low_len, advice);
}

asmlinkage long parisc_sync_file_range(int fd,
u32 hi_off, u32 lo_off, u32 hi_nbytes, u32 lo_nbytes,
unsigned int flags)
{
return sys_sync_file_range(fd, (loff_t)hi_off << 32 | lo_off,
return ksys_sync_file_range(fd, (loff_t)hi_off << 32 | lo_off,
(loff_t)hi_nbytes << 32 | lo_nbytes, flags);
}

asmlinkage long parisc_fallocate(int fd, int mode, u32 offhi, u32 offlo,
u32 lenhi, u32 lenlo)
{
return sys_fallocate(fd, mode, ((u64)offhi << 32) | offlo,
((u64)lenhi << 32) | lenlo);
return ksys_fallocate(fd, mode, ((u64)offhi << 32) | offlo,
((u64)lenhi << 32) | lenlo);
}

long parisc_personality(unsigned long personality)
Expand Down
18 changes: 9 additions & 9 deletions arch/powerpc/kernel/sys_ppc32.c
Original file line number Diff line number Diff line change
Expand Up @@ -77,44 +77,44 @@ unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count,
u32 reg6, u32 poshi, u32 poslo)
{
return sys_pread64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
return ksys_pread64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
}

compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count,
u32 reg6, u32 poshi, u32 poslo)
{
return sys_pwrite64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
return ksys_pwrite64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
}

compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offhi, u32 offlo, u32 count)
{
return sys_readahead(fd, ((loff_t)offhi << 32) | offlo, count);
return ksys_readahead(fd, ((loff_t)offhi << 32) | offlo, count);
}

asmlinkage int compat_sys_truncate64(const char __user * path, u32 reg4,
unsigned long high, unsigned long low)
{
return sys_truncate(path, (high << 32) | low);
return ksys_truncate(path, (high << 32) | low);
}

asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offhi, u32 offlo,
u32 lenhi, u32 lenlo)
{
return sys_fallocate(fd, mode, ((loff_t)offhi << 32) | offlo,
return ksys_fallocate(fd, mode, ((loff_t)offhi << 32) | offlo,
((loff_t)lenhi << 32) | lenlo);
}

asmlinkage int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long high,
unsigned long low)
{
return sys_ftruncate(fd, (high << 32) | low);
return ksys_ftruncate(fd, (high << 32) | low);
}

long ppc32_fadvise64(int fd, u32 unused, u32 offset_high, u32 offset_low,
size_t len, int advice)
{
return sys_fadvise64(fd, (u64)offset_high << 32 | offset_low, len,
advice);
return ksys_fadvise64_64(fd, (u64)offset_high << 32 | offset_low, len,
advice);
}

asmlinkage long compat_sys_sync_file_range2(int fd, unsigned int flags,
Expand All @@ -124,5 +124,5 @@ asmlinkage long compat_sys_sync_file_range2(int fd, unsigned int flags,
loff_t offset = ((loff_t)offset_hi << 32) | offset_lo;
loff_t nbytes = ((loff_t)nbytes_hi << 32) | nbytes_lo;

return sys_sync_file_range(fd, offset, nbytes, flags);
return ksys_sync_file_range(fd, offset, nbytes, flags);
}
Loading

0 comments on commit 642e7fd

Please sign in to comment.