As I always say: a VM makes an OS believe that it has the machine to itself; a container makes a process believe that it has the OS to itself.
I think the linuxcontainers.org people would disagree. As the table is trying to communicate, in contrast to e.g. Docker, this is not about application containerization.
The table is comparing all three types: VMs, system containers, and application containers. Incus supports application containers. It's a relatively recent addition.
I can't find great docs for it, but it's in last year's release notes: https://linuxcontainers.org/incus/news/2024_07_12_05_07.html
For those unfamiliar, Incus is a 'replacement' for LXD. https://linuxcontainers.org/incus
That's literally the opposite of what this documentation is explaining. System containers exist. You can run the entire userspace of an OS (including systemd) in a container.
And a system container makes an OS (or OS userland) believe that it has the kernel to itself.
I'll have to remember that one!
VMs also don't always require hardware virtualization - Alibaba's PVM https://lkml.org/lkml/2024/2/26/1263 didn't get upstreamed, but, theoretically the MMU is all you need for complete isolation. This kind of idea is also how VM software worked before VT-x was introduced. And of course QEMU has the TCG which works with no kernel support at all.
I think you could also add Xen to that list. IIRC, the old Xen PV mode was purely paravirtualized without using any hardware extensions.
Yes, Xen was big on paravirtualisation but started supporting the other kinds pretty soon, too. (At least they were supported around 2009-2012, when I was working on XenServer.)
I think things are swinging back the other way if I have understood the more recent PVHv2 stuff correctly.
In my experience TCG (or any method that doesn't require root / admin power) is pretty slow. But I'd be happy to be wrong about that, for an odd project I have
It depends a bit on your workload. For a pure computation workload without much I/O, TCG etc. doesn't need to be slow.
It also depends on the architectures. x86 on ARM is tough to do efficiently because of the memory model differences. One of the keys to Rosetta 2 being so good was being able to make the underlying ARM processor obey the x86 memory model (even though it was still executing ARM instructions).
Incus is really nice. It manages to provide a rather container-like experience for VMs. Having the ability to grab a shell on or copy files to/from a VM with the ease of using Docker is a great quality of life improvement. This requires an agent running in the VM but it's already included in the images from the project repo.
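The workflow looks roughly like this (instance and image names are illustrative; `incus exec` and `incus file` on VMs rely on the agent mentioned above):

```shell
incus launch images:debian/12 web01 --vm        # create and start a VM
incus exec web01 -- bash                        # interactive shell, docker-exec style
incus file push ./app.conf web01/etc/app.conf   # copy a file into the VM
incus file pull web01/var/log/syslog ./syslog   # copy a file out
```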
Can someone explain how a system container is more secure than an application container, if that is indeed the case?
It mostly isn't. Almost all Linux container escapes only require the ability to make system calls to the shared kernel from processes inside the container. The system container doesn't really restrict this ability. It also increases surface area to compromise the container before attacking the host system, since there's now a bunch of extra software running inside the container.
If privilege isolation is a priority but you want to use containers, gVisor and Firecracker are way ahead of anything else. The Linux kernel API has proved to be very hard to secure, and not for lack of trying.
In the context of Incus, they are the same.
Incus and LXC internally use umoci to manipulate the OCI tarball to conform to how LXC runs containers.
See:
- https://umo.ci/
- https://github.com/lxc/lxc/blob/lxc-4.0.2/templates/lxc-oci....
It's not really.
Any shared resource between containers or the kernel itself is an attack surface.
Both options have a very wide attack surface - the kernel api.
Nothing really beats virtualization in security, the surface shrinks to pretty much just the virtualization bits in the kernel and some user space bits in the VMM.
Complexity is generally the enemy of security, because securing a system requires understanding it. If system containers let you build a system that is more understandable, has fewer moving parts, and is more observable and more easily manageable, that's a security argument.
It generally is more secure just because the system container virtualization system is "more complete", so it's harder to get out from under it.
My understanding is that with Incus (the OP link) it's the same virtualization system, so there is no real difference, security-wise, between the two.
The question then becomes can they get out from under the virtualization and can they get access to other machines, containers, etc.
Docker's virtualization system has been very weak security-wise. So a system container would be more secure than Docker's virtualization system.
The article is pretty useless at explaining the difference, I agree. It makes claims about Docker that aren't true (e.g. single container) while making inadequate reference to the OS features likely involved in making "system containers" what they are (SECCOMP, capabilities, network namespaces, nftables).
As an engineer this page has a real "trust me bro" feel to it. Maybe fine as a marketing and product positioning thing, but not interesting for HN.
This has been one personal pet peeve with the documentation surrounding Incus.
As a stack, Incus has been exceptional, it has largely replaced Proxmox and Podman Quadlets for me. For context, I homelab so I cannot generalize my claim to SMB or enterprise.
But the documentation has been very end-user oriented. Information on specifics like seccomp, as you mentioned, is only discoverable with the search bar, which leads to various disparate locations; and that isn't taking into account that some of the more nitty-gritty information isn't on the Incus portion of linuxcontainers.org at all; see the LXC Security page, for example: https://linuxcontainers.org/lxc/security/
IMO it's not good that the kernel's interfaces keep spawning endless userland "middleware" projects.
I still want Capsicum to give me sane defaults, so that the incentive for sandbox security theater goes away.
Seems mostly off topic to the article. I think system containers should be implemented in user space. They are not about security theatre but about getting a sandboxed environment which feels like a real/virtual machine but is lighter weight. Very useful e.g. when I want to emulate a whole cluster of Linux machines. And for those needs security is nice but not key.
It is application containers which maybe should be replaced by better kernel security, not system containers.
This sounds very similar to BootC except that BootC is immutable
What is this? Docker containers can host more than one process/service/app. And why is some product called “Incus” using “linuxcontainers.org” as a domain name?
According to their Github page, they _are_ linuxcontainers (in a way), and Incus is Apache licensed:
Incus, which is named after the Cumulonimbus incus or anvil cloud started as a community fork of Canonical's LXD following Canonical's takeover of the LXD project from the Linux Containers community.
The project was then adopted by the Linux Containers community, taking back the spot left empty by LXD's departure.
Incus is a true open source community project, free of any CLA and remains released under the Apache 2.0 license. It's maintained by the same team of developers that first created LXD.
LXD users wishing to migrate to Incus can easily do so through a migration tool called lxd-to-incus.
https://github.com/lxc/incus
Supported by Colima, too: https://github.com/abiosoft/colima/blob/main/README.md#incus
Linux Containers, or LXC, came before Docker and OCI standardization.
As the others have mentioned, Incus is the community fork led by former members of the LXD team.
Very early versions of Docker even used LXC before they replaced it with libcontainer.
Are (self hosting) people putting multiple services like Django app, Postgres, Redis etc into a single container/lightweight VM instead of using Docker Compose with single-purpose containers?
You don’t have to, as you can create a single Postgres instance for your services.
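For instance, a single Compose-managed Postgres can back several services (image and service names here are illustrative):

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # illustrative only
  web:
    image: example/django-app      # hypothetical app image
    depends_on: [db]
  worker:
    image: example/worker          # hypothetical worker image
    depends_on: [db]
```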
I prefer Incus, because you can’t do ad hoc patching with Docker. Instead you have to rebuild the images, and that becomes a hassle quickly in a homelab setting. Incus has a VM feel while keeping Docker's management UX.
Incus is the truly open source version of LXC/LXD. It is stable and incredible. I manage dozens of machines and want for nothing, and most importantly, pay nothing for that luxury.
It's a bad sign that the first table on the page is full of errors.
"Can only host Linux" -- Windows Containers are a thing too: https://learn.microsoft.com/en-us/virtualization/windowscont...
"Can host a single app" -- not true either. It's just bad practice to host multiple apps in a single container, but it's definitely possible.
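As a minimal sketch of that (discouraged) pattern: a container entrypoint just has to supervise more than one child process. The placeholder `sleep` commands below stand in for real services such as Redis plus an app server:

```python
import subprocess

def run_all(commands):
    """Start every command, wait on the primary (first) one,
    then tear the side services down -- roughly what a
    multi-service container entrypoint does as PID 1."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    code = procs[0].wait()      # block on the main app
    for p in procs[1:]:         # stop the remaining services
        p.terminate()
        p.wait()
    return code

# Placeholders; a real entrypoint might run ["redis-server"]
# and ["python", "app.py"] instead.
exit_code = run_all([["sleep", "0.1"], ["sleep", "30"]])
print("primary exited with", exit_code)
```

Tools like supervisord or s6 package up exactly this pattern for containers that really do need multiple processes.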
IMHO it's not very nice to use the generic-sounding "linuxcontainers.org" domain exclusively for LXC-related content there.
On Incus/LXD it's true that containers can only be Linux.
Not sure about the one-app thing, but that's the general design of those as well, I suppose.
Which just validates my point that a generic-sounding domain is the wrong place to host content that even within the Linux ecosystem is a relatively minor player.
Not only is this project website older than Docker, early versions of Docker literally used LXC as the backend, which was supported in Docker for the first two years of its life.
The Docker folks could have done their work under this umbrella and (maybe for good reasons) chose not to. For later container runtimes, idk the story.
But this project/community definitely laid the groundwork for all of those later Linux container runtimes.
LXC is used really frequently in the home space (Jellyfin/Plex, for instance). A lot of Proxmox use cases as well, and Proxmox is growing in popularity extremely rapidly.
I really wish I could just run regular docker or oci containers in Proxmox.
Which is small in the scope of things when Docker Desktop and containerd are both used at far larger scales.
LXC far predates docker regardless of size or impact. It's not disingenuous if you were literally the foundation docker was able to package into a shiny accessible tool.
I’m not sure I follow. Are you suggesting OP has an incorrect apex domain name?
It’s like selling Pepsi exclusively on soda.org.
Only if Pepsi had always been called Soda Co and was older than Coca Cola.
For that analogy to hold, Pepsi would have also invented sodas.
Like that matters to consumers? Regardless of who invented sodas, the market has changed and people connect more brands to the kind of drink now, so equating Pepsi to Soda is factually incorrect.
Don’t give them any ideas!!!
Not only that, containers predate Linux implementations, I was using HP-UX Vaults in 1999.
linux containers, be it an lxd container, or a containerd/dockerd one, only run on linux hosts.
windows containers, only run on windows hosts.
when you run a linux container on a windows host, you're actually running a linux container inside of a linux vm on top of a windows host.
containers share the host operating system's kernel. it is impossible for a linux container (which is just a linux process) to execute and share the windows kernel. the reverse is true, a windows container (which is just a process) cannot execute and share the linux kernel
the article is correct, linux containers can only execute on a linux host
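The shared-kernel point is easy to see from inside any process; this small Python sketch reports the host kernel whether or not it runs in a container:

```python
import os

# A container is just a process: namespaces change what it can see
# (PIDs, mounts, network interfaces), not which kernel it runs on.
# Inside a Linux container this still reports the *host* kernel.
info = os.uname()
print(info.sysname)   # "Linux" on any Linux host, containerized or not
print(info.release)   # host kernel release, shared by every container on it
```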
Very cool...
In my experience it has gotta be Docker. For these reasons:
1. I said so
2. I'm the boss
3. Goto 1.