About Containers and VMs

(linuxcontainers.org)

78 points | by Bogdanp 3 days ago

51 comments

  • hliyan 8 hours ago

    As I always say: a VM makes an OS believe that it has the machine to itself; a container makes a process believe that it has the OS to itself.
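
    One way to see the container half of that from the inside: on Linux, a container's private view is built from kernel namespaces, and each process exposes its namespace memberships under /proc/self/ns. A minimal Python sketch (the function name list_namespaces is made up for illustration; it returns an empty dict where that directory does not exist):

```python
import os


def list_namespaces(ns_dir="/proc/self/ns"):
    """Return {namespace_name: identifier} for the calling process.

    On Linux each entry in /proc/self/ns is a symlink whose target looks
    like 'pid:[4026531836]'. Two processes that share a namespace see the
    same identifier. Returns an empty dict where the directory is absent.
    """
    namespaces = {}
    if not os.path.isdir(ns_dir):
        return namespaces
    for name in sorted(os.listdir(ns_dir)):
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Not a symlink or not readable; skip it.
            continue
    return namespaces


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:18} {ident}")
```

    Run on the host and then inside a container, the identifiers should differ for whichever namespaces the runtime unshared.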

    • fulafel 7 hours ago

      I think the linuxcontainers.org people would disagree. As the table is trying to communicate, in contrast to e.g. Docker, this is not about application containerization.

    • falcojr 6 hours ago

      That's literally the opposite of what this documentation is explaining. System containers exist. You can run the entire userspace of an OS (including systemd) in a container.

    • abhinavk 5 hours ago

      And a system container makes an OS (or OS userland) believe that it has the kernel to itself.

    • weikju 7 hours ago

      I'll have to remember that one!

  • mappu 7 hours ago

    VMs also don't always require hardware virtualization - Alibaba's PVM https://lkml.org/lkml/2024/2/26/1263 didn't get upstreamed, but, theoretically the MMU is all you need for complete isolation. This kind of idea is also how VM software worked before VT-x was introduced. And of course QEMU has the TCG which works with no kernel support at all.
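
    To make the TCG fallback concrete: QEMU picks its accelerator with the -accel option, so a launcher can use KVM when /dev/kvm is usable and drop to TCG otherwise. A rough Python sketch, assuming an x86_64 guest and a qcow2 disk image; the helper names and the image path are hypothetical, and it only builds the command line rather than running it:

```python
import os


def pick_qemu_accel(kvm_device="/dev/kvm"):
    """Return 'kvm' if the KVM device is readable and writable, else 'tcg'."""
    if os.access(kvm_device, os.R_OK | os.W_OK):
        return "kvm"
    return "tcg"


def qemu_command(disk_image, memory_mb=1024, kvm_device="/dev/kvm"):
    """Build (but do not run) a qemu-system-x86_64 argument list."""
    accel = pick_qemu_accel(kvm_device)
    return [
        "qemu-system-x86_64",
        "-accel", accel,           # hardware-assisted KVM or software-only TCG
        "-m", str(memory_mb),      # guest RAM in MiB
        "-drive", f"file={disk_image},format=qcow2",
        "-nographic",
    ]


if __name__ == "__main__":
    # "disk.qcow2" is a placeholder path, not a real image.
    print(" ".join(qemu_command("disk.qcow2")))
```

    The TCG path needs no kernel support or special privileges, which is the trade-off being discussed: it works anywhere but translates guest code in software.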

    • SirGiggles 6 hours ago

      I think you could also add Xen to that list. IIRC, the old Xen PV mode was purely paravirtualized without using any hardware extensions.

      • eru 6 hours ago

        Yes, Xen was big on paravirtualisation but started supporting the other kinds pretty soon, too. (At least they were supported around 2009-2012, when I was working on XenServer.)

        • SirGiggles 6 hours ago

          I think things are swinging back the other way if I have understood the more recent PVHv2 stuff correctly.

    • 01HNNWZ0MV43FF 6 hours ago

      In my experience TCG (or any method that doesn't require root / admin power) is pretty slow. But I'd be happy to be wrong about that, for an odd project I have

      • eru 6 hours ago

        It depends a bit on your workload. If you have a pure computation workload, without much IO, TCG etc doesn't need to be slow.

        • johncolanduoni 5 hours ago

          It also depends on the architectures. x86 on ARM is tough to do efficiently because of the memory model differences. One of the keys to Rosetta 2 being so good was being able to make the underlying ARM processor obey the x86 memory model (even though it was still executing ARM instructions).

  • scottyeager 3 hours ago

    Incus is really nice. It manages to provide a rather container-like experience for VMs. Having the ability to grab a shell on or copy files to/from a VM with the ease of using Docker is a great quality of life improvement. This requires an agent running in the VM but it's already included in the images from the project repo.
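
    For anyone who hasn't tried it, the workflow looks roughly like this. A small Python sketch that only builds the incus command lines (launch with --vm, exec, file push/pull) without running them; the helper names, the instance name vm1 and the images:debian/12 alias are placeholders of mine, not anything from the article:

```python
def incus_launch_vm(image, instance):
    """Command to create and start a virtual machine from an image."""
    return ["incus", "launch", image, instance, "--vm"]


def incus_shell(instance, shell="bash"):
    """Command to open a shell in the instance (needs the in-guest agent for VMs)."""
    return ["incus", "exec", instance, "--", shell]


def incus_push(instance, local_path, remote_path):
    """Command to copy a local file into the instance."""
    target = f"{instance}/{remote_path.lstrip('/')}"
    return ["incus", "file", "push", local_path, target]


def incus_pull(instance, remote_path, local_path):
    """Command to copy a file out of the instance."""
    source = f"{instance}/{remote_path.lstrip('/')}"
    return ["incus", "file", "pull", source, local_path]


if __name__ == "__main__":
    # Print the commands instead of executing them; pass a list to
    # subprocess.run(...) if you want to run one for real.
    for cmd in (
        incus_launch_vm("images:debian/12", "vm1"),
        incus_shell("vm1"),
        incus_push("vm1", "notes.txt", "/root/notes.txt"),
        incus_pull("vm1", "/etc/hostname", "."),
    ):
        print(" ".join(cmd))
```

    The same exec and file commands work for containers; only the launch line changes by dropping --vm.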

  • reilly3000 7 hours ago

    Can someone explain how a system container is more secure than an application container, if that is indeed the case?

    • johncolanduoni 5 hours ago

      It mostly isn't. Almost all Linux container escapes only require the ability to make system calls to the shared kernel from processes inside the container. The system container doesn't really restrict this ability. It also increases surface area to compromise the container before attacking the host system, since there's now a bunch of extra software running inside the container.

      If privilege isolation is a priority but you want to use containers, gVisor and Firecracker are way ahead of anything else. The Linux kernel API has proved to be very hard to secure, and not for lack of trying.

    • SirGiggles 6 hours ago

      In the context of Incus, they are the same.

      Incus and LXC internally use umoci to manipulate the OCI tarball to conform to how LXC runs containers.

      See:
      - https://umo.ci/
      - https://github.com/lxc/lxc/blob/lxc-4.0.2/templates/lxc-oci....

    • cakealert 6 hours ago

      It's not really.

      Any shared resource between containers or the kernel itself is an attack surface.

      Both options have a very wide attack surface - the kernel api.

      Nothing really beats virtualization in security, the surface shrinks to pretty much just the virtualization bits in the kernel and some user space bits in the VMM.

    • fulafel 5 hours ago

      Complexity is generally the enemy of security, because securing a system requires understanding it. If system containers let you build a system that is more understandable, more observable, more easily manageable, and has fewer moving parts, that is a security argument.

    • zie 6 hours ago

      It generally is more secure just because the system container virtualization system is "more complete", so it's harder to get out from under it.

      My understanding is that with Incus (the OP link) it's the same virtualization system, so there is no real difference, security-wise, between the two.

      The question then becomes can they get out from under the virtualization and can they get access to other machines, containers, etc.

      Docker's virtualization system has been very weak security wise. So a system container would be more secure than docker's virtualization system.

    • thundergolfer 6 hours ago

      The article is pretty useless at explaining the difference, I agree. It makes claims about Docker that aren't true (e.g. single container) while making inadequate reference to the OS features likely involved in making "system containers" what they are (SECCOMP, capabilities, network namespaces, nftables).
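
      For what it's worth, a process can inspect some of that confinement itself: on Linux, /proc/self/status carries Seccomp and CapEff fields. A minimal Python sketch of parsing them (function names are my own; the sample text in the usage note is illustrative, not captured from a real container):

```python
SECCOMP_MODES = {"0": "disabled", "1": "strict", "2": "filter"}
CAP_SYS_ADMIN = 21  # capability bit number from linux/capability.h


def parse_confinement(status_text):
    """Extract the seccomp mode and effective capability mask.

    status_text is the content of /proc/<pid>/status. Missing fields
    yield 'unknown' for seccomp and None for the capability mask.
    """
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    cap_eff = fields.get("CapEff")
    return {
        "seccomp": SECCOMP_MODES.get(fields.get("Seccomp"), "unknown"),
        "cap_eff": int(cap_eff, 16) if cap_eff else None,
    }


def has_capability(cap_eff, bit):
    """True if the given capability bit is set in the effective mask."""
    return cap_eff is not None and bool((cap_eff >> bit) & 1)


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as f:
            info = parse_confinement(f.read())
        print(info, "CAP_SYS_ADMIN:", has_capability(info["cap_eff"], CAP_SYS_ADMIN))
    except OSError:
        print("no /proc/self/status on this system")
```

      Feeding it a sample such as "CapEff:\t00000000a80425fb\nSeccomp:\t2\n" reports a seccomp filter and a mask without CAP_SYS_ADMIN. Comparing the output between the host, an application container and a system container is the kind of detail the page could have shown.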

      As an engineer this page has a real "trust me bro" feel to it. Maybe fine as a marketing and product positioning thing, but not interesting for HN.

      • SirGiggles 6 hours ago

        This has been one personal pet peeve with the documentation surrounding Incus.

        As a stack, Incus has been exceptional, it has largely replaced Proxmox and Podman Quadlets for me. For context, I homelab so I cannot generalize my claim to SMB or enterprise.

        But the documentation has been very end-user oriented. Information about specifics like seccomp, as you mentioned, is only discoverable through the search bar, which leads to various disparate locations. That also doesn't take into account that some of the more nitty-gritty information isn't on the Incus portion of linuxcontainers.org; see the LXC Security page, for example: https://linuxcontainers.org/lxc/security/

  • Ericson2314 6 hours ago

    IMO it's not good that the kernel interfaces keep on spawning endless userland "middleware" projects.

    I still want capsicum to give me sane defaults, so the incentive for sandbox security theater goes away.

    • jeltz 6 hours ago

      Seems mostly off topic to the article. I think system containers should be implemented in user space. They are not about security theatre but about getting a sandboxed environment which feels like a real/virtual machine but is lighter weight. Very useful e.g. when I want to emulate a whole cluster of Linux machines. And for those needs security is nice but not key.

      It is application containers which maybe should be replaced by better kernel security, not system containers.

  • kottapar 5 hours ago

    This sounds very similar to BootC except that BootC is immutable

  • skywhopper 8 hours ago

    What is this? Docker containers can host more than one process/service/app. And why is some product called “Incus” using “linuxcontainers.org” as a domain name?

    • paulhart 8 hours ago

      According to their Github page, they _are_ linuxcontainers (in a way), and Incus is Apache licensed:

      Incus, which is named after the Cumulonimbus incus or anvil cloud started as a community fork of Canonical's LXD following Canonical's takeover of the LXD project from the Linux Containers community.

      The project was then adopted by the Linux Containers community, taking back the spot left empty by LXD's departure.

      Incus is a true open source community project, free of any CLA and remains released under the Apache 2.0 license. It's maintained by the same team of developers that first created LXD.

      LXD users wishing to migrate to Incus can easily do so through a migration tool called lxd-to-incus.

      https://github.com/lxc/incus

    • SirGiggles 6 hours ago

      Linux Containers, or LXC, came before Docker and OCI standardization.

      As the others have mentioned, Incus is the community fork led by former members of the LXD team.

      • antod 19 minutes ago

        Very early versions of Docker even used LXC before they replaced it with libcontainer.

    • aitchnyu 3 hours ago

      Are (self hosting) people putting multiple services like Django app, Postgres, Redis etc into a single container/lightweight VM instead of using Docker Compose with single-purpose containers?

      • skydhash 25 minutes ago

        You don’t have to, as you can create a single Postgres instance for your services.

        I prefer Incus, because you can’t do ad hoc patching with Docker. Instead you have to rebuild the images, and that becomes a hassle quickly in a homelab setting. Incus has a VM feel with a Docker-like management UX.

    • xrd 7 hours ago

      incus is the truly open source version of lxc/lxd. It is stable and incredible. I manage dozens of machines and want for nothing, and most importantly, pay nothing for that luxury.

  • jiggawatts 9 hours ago

    It's a bad sign that the first table on the page is full of errors.

    "Can only host Linux" -- Windows Containers are a thing too: https://learn.microsoft.com/en-us/virtualization/windowscont...

    "Can host a single app" -- not true either. It's just bad practice to host multiple apps in a single container, but it's definitely possible.

    IMHO it's not very nice to use the generic-sounding "linuxcontainers.org" domain exclusively for LXC-related content there.

    • weikju 8 hours ago

      On Incus/LXD it is true that containers can only be Linux.

      Not sure about the one app thing, but that’s the general design of those as well, I suppose.

      • jiggawatts 8 hours ago

        Which just validates my point that a generic-sounding domain is the wrong place to host content that even within the Linux ecosystem is a relatively minor player.

        • pxc 5 hours ago

          Not only is this project website older than Docker, early versions of Docker literally used LXC as the backend, which was supported in Docker for the first two years of its life.

          The Docker folks could have done their work under this umbrella and (maybe for good reasons) chose not to. For later container runtimes, idk the story.

          But this project/community definitely laid the groundwork for all of those later Linux container runtimes.

        • chucky_z 8 hours ago

          LXC is used really frequently in the home space (Jellyfin/Plex, for instance). There are a lot of Proxmox use cases as well, and Proxmox is growing in popularity extremely rapidly.

          • esseph 6 hours ago

            I really wish I could just run regular docker or oci containers in Proxmox.

          • jiggawatts 8 hours ago

            Which is small in the scope of things when Docker Desktop and containerd are both used at far larger scales.

        • TrueDuality 8 hours ago

          LXC far predates docker regardless of size or impact. It's not disingenuous if you were literally the foundation docker was able to package into a shiny accessible tool.

        • cyberge99 8 hours ago

          I’m not sure I follow. Are you suggesting OP has an incorrect apex domain name?

          • 9dev 8 hours ago

            It’s like selling Pepsi exclusively on soda.org.

            • jeltz 5 hours ago

              Only if Pepsi had always been called Soda Co and was older than Coca Cola.

            • Kudos 5 hours ago

              For that analogy to hold, Pepsi would have also invented sodas.

              • 9dev 3 hours ago

                Like that matters to consumers? Regardless of who invented sodas, the market has changed and people connect more brands to the kind of drink now, so equating Pepsi to Soda is factually incorrect.

            • weikju 8 hours ago

              Don’t give them any ideas!!!

    • pjmlp 3 hours ago

      Not only that, containers predate Linux implementations, I was using HP-UX Vaults in 1999.

    • wutwutwat 8 hours ago

      linux containers, be it a lxd container, or a containerd/dockerd one, only run on linux hosts.

      windows containers, only run on windows hosts.

      when you run a linux container on a windows host, you're actually running a linux container inside of a linux vm on top of a windows host.

      containers share the host operating system's kernel. it is impossible for a linux container (which is just a linux process) to execute and share the windows kernel. the reverse is also true: a windows container (which is just a process) cannot execute and share the linux kernel

      the article is correct, linux containers can only execute on a linux host
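
      a quick way to see this on a linux box: the kernel release reported inside a container matches the host's, while /etc/os-release describes whatever userland the image ships. a minimal python sketch (helper names are mine; the os-release parsing is simplified and ignores shell escapes):

```python
import platform


def kernel_identity():
    """Return the kernel name and release as seen by this process."""
    u = platform.uname()
    return {"system": u.system, "release": u.release}


def parse_os_release(text):
    """Parse KEY=value lines from os-release content into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip("\"'")
    return info


if __name__ == "__main__":
    print("kernel:", kernel_identity())
    try:
        with open("/etc/os-release") as f:
            print("userland:", parse_os_release(f.read()).get("PRETTY_NAME"))
    except OSError:
        print("userland: no /etc/os-release here")
```

      run it on the host and again inside a container with a different distro: the userland line changes, the kernel line should not.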

  • worik 8 hours ago

    Very cool...

    In my experience it has gotta be Docker. For these reasons:

    1. I said so

    2. I'm the boss

    3. Goto 1.