Container detection fails on cgroup v2 devices #1592

Closed
Silvanoc opened this issue Feb 24, 2021 · 12 comments · Fixed by #1686 or #1996

@Silvanoc
Contributor

Silvanoc commented Feb 24, 2021

Actual behavior
Kaniko fails to detect that it's running in a Docker container when run on a system using cgroup v2.

Error message:

kaniko should only be run inside of a container, run with the --force flag if you are sure you want to continue

This issue can be reproduced on a Debian Testing/Bullseye (default using cgroup v2) installation and on an ArchLinux installation using cgroup v2.

This issue originates from the way genuinetools/bpfd/proc detects whether it is running in a Docker container. On Debian Stable/Buster (cgroup v1 by default) and on Arch Linux using cgroup v1, the issue does not occur.

Expected behavior
Kaniko detects the container runtime properly and doesn't report any issue.

To Reproduce
A system with cgroup v2 is needed to reproduce the issue.

Steps to reproduce the behavior (lines starting with $ are to be run on the host, lines starting with # are to be run in the container):

  1. $ docker run --rm -ti --entrypoint "" gcr.io/kaniko-project/executor:v1.5.1-debug sh
  2. # echo "FROM alpine:latest" > Dockerfile
  3. # /kaniko/executor --dockerfile Dockerfile --destination /dev/null
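Before following the steps above, it can help to confirm that the host really is on pure cgroup v2. The following is a minimal Go sketch, not part of Kaniko: it assumes the usual mount point /sys/fs/cgroup and relies on the cgroup.controllers interface file, which exists only in a cgroup v2 hierarchy. The helper name isCgroupV2 is made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isCgroupV2 reports whether mountPoint looks like the root of a cgroup v2
// unified hierarchy. The cgroup.controllers file is specific to cgroup v2,
// so on a cgroup v1 or hybrid layout it is absent at this path.
func isCgroupV2(mountPoint string) bool {
	_, err := os.Stat(filepath.Join(mountPoint, "cgroup.controllers"))
	return err == nil
}

func main() {
	// /sys/fs/cgroup is the conventional mount point (an assumption here).
	fmt.Println("pure cgroup v2:", isCgroupV2("/sys/fs/cgroup"))
}
```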

Additional Information

  • Dockerfile
    Independent of Dockerfile, tested with different ones.
  • Build Context
    Independent of context (tested on contexts without copying any file)
  • Kaniko Image (fully qualified with digest)
    gcr.io/kaniko-project/executor:v1.5.1-debug, sha256:e00dfdd4a44097867c8ef671e5a7f3e31d94bd09406dbdfba8a13a63fc6b8060

Triage Notes for the Maintainers

Description | Yes/No
Please check if this is a new feature you are proposing | No new feature
Please check if the build works in docker but not in kaniko | Cannot be tested in Docker
Please check if this error is seen when you use --cache flag | No difference
Please check if your dockerfile is a multistage dockerfile | Independent from Dockerfile

Silvanoc changed the title from "Container detection fails on Docker 20.10.3" to "Container detection fails on cgroup v2 devices" on Feb 26, 2021

@thediveo

thediveo commented Mar 2, 2021

This detection issue seems to come from an upstream dependency, namely https://github.com/genuinetools/bpfd/blob/master/proc/proc.go. GetContainerID(...) tries to glean a container identifier from cgroup path information, and the way cgroup paths appear differs between the multiple v1 hierarchies and the unified v2 hierarchy.

@Silvanoc
Contributor Author

Silvanoc commented Mar 2, 2021

Exactly, the question is how this detection can get fixed. Either in the dependency upstream or in Kaniko itself. Does a process running in a Docker container on a system with cgroup v2 have a way to detect that it's running in a container?

Podman, as an example that works, explicitly declares it by assigning the environment variable container=podman. Perhaps some similar declaration in Docker is needed.
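As a rough illustration of the declaration-based approach described above, here is a minimal Go sketch that reads the `container` environment variable. It is not Kaniko's code; the helper name containerFromEnv and the injected lookup function are assumptions made so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// containerFromEnv returns the engine name declared through the "container"
// environment variable (for example "podman"), if any. The lookup function
// is injected so callers can pass os.LookupEnv or a fake.
func containerFromEnv(lookup func(string) (string, bool)) (string, bool) {
	v, ok := lookup("container")
	if !ok || v == "" {
		return "", false
	}
	return v, true
}

func main() {
	if engine, ok := containerFromEnv(os.LookupEnv); ok {
		fmt.Println("container engine declared via environment:", engine)
	} else {
		fmt.Println("no container declaration in the environment")
	}
}
```

This only works when the engine sets the variable, which, per the comment above, stock Docker does not.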

@thediveo

thediveo commented Mar 2, 2021

A closer look at a container's /proc/self/cgroup shows only 0::/, indicating that Docker, when on pure (not hybrid) cgroups v2, also activates cgroup namespaces. This of course does what it is intended for: avoiding leaking host cgroup information into containers, so there is no chance to glean any container engine (runtime) information from the cgroup hierarchy.
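To make the failure mode concrete, here is a minimal Go sketch of path-based detection. It is a simplified stand-in, not the actual GetContainerID implementation from genuinetools/bpfd: the function name runtimeFromCgroup and the cgroup v1 sample lines (including the container ID) are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// runtimeFromCgroup scans text in /proc/self/cgroup format and returns
// "docker" if any cgroup path mentions docker, or "" if nothing matches.
func runtimeFromCgroup(content string) string {
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		// Each line has the form "hierarchy-ID:controller-list:cgroup-path".
		parts := strings.SplitN(scanner.Text(), ":", 3)
		if len(parts) != 3 {
			continue
		}
		if strings.Contains(parts[2], "docker") {
			return "docker"
		}
	}
	return ""
}

func main() {
	// Illustrative cgroup v1 content; the container ID is invented.
	v1 := "12:memory:/docker/0123456789abcdef\n11:cpu,cpuacct:/docker/0123456789abcdef\n"
	// What a container reports on pure cgroup v2 with a cgroup namespace.
	v2 := "0::/\n"

	fmt.Printf("cgroup v1 sample: %q\n", runtimeFromCgroup(v1))
	fmt.Printf("cgroup v2 sample: %q\n", runtimeFromCgroup(v2))
}
```

With the v1 sample the path carries the engine name and the function returns "docker"; with 0::/ there is nothing left to match, so it returns the empty string.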

In consequence, there doesn't seem to be any chance to fix the detection upstream, at least not for stock default Docker configurations.

So there's probably only --force left, I'm afraid.

@Silvanoc
Contributor Author

Silvanoc commented Mar 2, 2021

Currently I don't see any other option, but I might be missing something...

@thediveo

thediveo commented Mar 7, 2021

There actually is one sign that might be useful for container detection, albeit without specifics as to which engine is used: when using cgroups v2 with namespace boundaries, the cgroup2 "root" has a cgroup.freeze interface file, whereas the real cgroup2 filesystem-originating root (in the initial cgroup namespace) never has a cgroup.freeze interface file. This makes it possible to differentiate between the cgroup v2 unified hierarchy root in the initial cgroup namespace and a cgroup-namespace delegated cgroup appearing as the cgroup root inside "something boxed" -- this doesn't need to be a container, just something cgroup-namespace'd.

Would that be sufficient for Kaniko?

Nota bene: looking at the cgroup namespace itself doesn't help much, unless the container was started with "host:pid", so there is no chance of telling whether this is the initial cgroup namespace or not.
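A minimal Go sketch of the check proposed in this comment follows. It is an illustration only, not something Kaniko ships. It assumes the cgroup v2 hierarchy is mounted at /sys/fs/cgroup and that the interface file in question is the kernel's cgroup.freeze, which exists only in non-root cgroups. The helper name looksDelegated is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// looksDelegated reports whether the visible cgroup v2 "root" at cgroupRoot
// carries a cgroup.freeze file. The true hierarchy root has no such file,
// so its presence suggests a cgroup-namespaced (boxed) environment.
func looksDelegated(cgroupRoot string) bool {
	_, err := os.Stat(filepath.Join(cgroupRoot, "cgroup.freeze"))
	return err == nil
}

func main() {
	// /sys/fs/cgroup is the conventional mount point (an assumption here).
	if looksDelegated("/sys/fs/cgroup") {
		fmt.Println("cgroup root is delegated: probably inside something cgroup-namespace'd")
	} else {
		fmt.Println("no cgroup.freeze at the cgroup root: initial namespace, cgroup v1, or not mounted")
	}
}
```

As noted above, a positive result only says "cgroup-namespace'd", not "container", and a negative result is ambiguous on cgroup v1 hosts.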

See also:

archlinux-github pushed a commit to archlinux/archlinux-docker that referenced this issue Apr 17, 2021
@klausenbusk

klausenbusk commented Apr 17, 2021

I had a look at how systemd does it, the main code is here. Among other things, it checks if specific files exist on the filesystem.
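The file-existence part of that approach can be sketched in Go as below. This is a simplified illustration, not a port of systemd's logic, which performs additional checks. The marker paths /run/.containerenv (written by Podman) and /.dockerenv (Docker) are the ones assumed here, and the helper name findContainerMarker is made up.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// markerFiles lists marker paths, relative to the filesystem root, whose
// presence hints at a container environment (assumed set, not exhaustive).
var markerFiles = []string{
	"run/.containerenv", // created by Podman at runtime
	".dockerenv",        // associated with Docker
}

// findContainerMarker returns the first marker file found under root.
// Passing the root explicitly keeps the function testable.
func findContainerMarker(root string) (string, bool) {
	for _, rel := range markerFiles {
		p := filepath.Join(root, rel)
		if _, err := os.Stat(p); err == nil {
			return p, true
		}
	}
	return "", false
}

func main() {
	if p, ok := findContainerMarker("/"); ok {
		fmt.Println("container marker found:", p)
	} else {
		fmt.Println("no container marker found")
	}
}
```

A check like this is a heuristic: it misses engines that write no marker file, so an override such as --force remains useful.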

wULLSnpAXbWZGYDYyhWTKKspEQoaYxXyhoisqHf added a commit to wULLSnpAXbWZGYDYyhWTKKspEQoaYxXyhoisqHf/docker-fedora-hugo that referenced this issue Apr 21, 2021
This is a temporary workaround that enables kaniko to run on cgroup v2
enabled hosts. Due to an upstream issue, kaniko fails to detect that it
is indeed being run from a container.
Solution introduced here is to force kaniko to run regardless.

ref: GoogleContainerTools/kaniko#1592

[skip ci]
@ejose19
Contributor

ejose19 commented May 7, 2021

> I had a look at how systemd does it, the main code is here. Among other things, it checks if specific files exist on the filesystem.

I confirm that the systemd container detection worked with cgroup v2 on an Arch Linux host. So having genuinetools use this method (proposed in genuinetools/bpfd#16) should solve the issue.

@ejose19
Contributor

ejose19 commented May 7, 2021

I've pushed a PR upstream, genuinetools/bpfd#19. However, I noticed that the package's last activity was in 2019, so there's a chance it won't be merged soon. @tejal29 should I backport the fix to https://github.com/GoogleContainerTools/kaniko/blob/master/vendor/github.com/genuinetools/bpfd/proc/proc.go as well, to fix this on our side?

@Silvanoc
Contributor Author

@klausenbusk @ejose19 the systemd approach is a "good weather" approach that surely covers over 90% of all use-cases, but that percentage will slowly shrink...

IMO, as long as Kaniko provides the --force flag to cover the use-cases not covered by this "good weather" approach, the systemd approach should suffice.

Additional information

The file /.dockerenv is in fact a poor hint of a container environment, since it's a file that's part of the container image, not of the container environment! It's a buildtime hint for runtime, bad idea... You could extract the rootfs of a container image and use it with chroot (therefore not being containerized) and the file /.dockerenv would be part of the extracted rootfs. Or you could run with Docker a container image not built with Docker and therefore missing the file /.dockerenv.

The Podman approach is much better, since it's the container runtime providing runtime information (therefore the file appears in the temporary directory /run).

Container images built with buildah don't provide the file /.dockerenv, and with Docker Hub throttling pulls you can expect more people to pull images from less Docker-bound registries (like Quay) in the future.

@pschichtel

Is a release with this already planned?

@wULLSnpAXbWZGYDYyhWTKKspEQoaYxXyhoisqHf

> Is a release with this already planned?

I'd also be interested, although in the meantime I believe you could get away with using the latest stable-ish build from executor's GCP index.
Personally, I've found the sha256:6ecc43ae139ad8cfa11604b592aaedddcabff8cef469eda303f1fb5afe5e3034 ref reliable and have been using it for just over a month, although it'd be nice to see a release with it.
Perhaps the team plans to merge a couple more features first.

@Silvanoc
Contributor Author

@pschichtel @wULLSnpAXbWZGYDYyhWTKKspEQoaYxXyhoisqHf there is now an issue requesting a new release; you can react to it to raise its visibility: #1740

hkube-ci pushed a commit to kube-HPC/hkube that referenced this issue Mar 27, 2022