Could lockfiles just be SBOMs?

(nesbitt.io)

63 points | by zdw 18 hours ago

54 comments

Lvl999Noob 16 hours ago
Personally, I would prefer that the package managers keep their own lockfiles with all their metadata. A CI process (using the package manager itself) can create the SBOM for every commit in a standardized environment. We get all the same benefits without losing anything (the package managers can keep their own formats and metadata, and leave anything the SBOM does not need out of it).
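A minimal sketch of the kind of transformation such a CI step performs, assuming an npm `package-lock.json` (lockfileVersion 2 or 3, with a `packages` map) as input and a CycloneDX-shaped JSON document as output. The function names `sri_to_hash` and `lockfile_to_sbom` are made up for illustration, the purl handling is simplistic, and a real pipeline would more likely call a dedicated SBOM generator than hand-roll this:

```python
import base64
from urllib.parse import quote

# Map SRI algorithm prefixes to the hash names CycloneDX uses.
_ALG_NAMES = {"sha1": "SHA-1", "sha256": "SHA-256", "sha384": "SHA-384", "sha512": "SHA-512"}


def sri_to_hash(integrity: str) -> dict:
    """Turn an SRI string such as "sha512-<base64>" into an {alg, content} entry."""
    first = integrity.split()[0]  # an integrity field may list several hashes
    algo, _, b64_digest = first.partition("-")
    return {"alg": _ALG_NAMES[algo], "content": base64.b64decode(b64_digest).hex()}


def lockfile_to_sbom(lock: dict, include_dev: bool = False) -> dict:
    """Build a minimal CycloneDX-shaped dict from a parsed package-lock.json."""
    components = []
    for path, meta in lock.get("packages", {}).items():
        # The "" key is the root project itself; links may carry no version.
        if path == "" or "version" not in meta:
            continue
        # Hypothetical policy: development-only packages are left out by default.
        if meta.get("dev") and not include_dev:
            continue
        name = meta.get("name") or path.split("node_modules/")[-1]
        component = {
            "type": "library",
            "name": name,
            "version": meta["version"],
            # Simplified purl: percent-encodes the "@" of scoped packages.
            "purl": f"pkg:npm/{quote(name, safe='/')}@{meta['version']}",
        }
        if "integrity" in meta:
            component["hashes"] = [sri_to_hash(meta["integrity"])]
        components.append(component)
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": components,
    }
```

In CI this would be fed `json.load(open("package-lock.json"))` and the result written out as a build artifact, so the lockfile stays in whatever shape the package manager prefers.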
- ozim 13 hours ago
  Second that. It is trivial to add an SBOM generator to your pipeline; it is not trivial to make all kinds of package managers switch, and each format is used by a different audience.
  - Jnr 26 minutes ago
    I do exactly that in my container build pipelines and it is great. And then CI uploads those SBOMs to Dependency Track.
    Depending on the language, scanning just the container is not enough; you for sure want to scan the lockfiles for the full dependency list before it is compiled/packed/minified and becomes invisible to trivy/syft.
    - Lvl999Noob 18 minutes ago
      You are building everything in CI from scratch so theoretically, it should be completely possible to not need to scan lockfiles and get all the data from their respective sources (OS, runtime, dynamic libs, static deps, codegen tools, build time deps, etc)
  - zvr 12 hours ago
    Exactly.
    To understand what an impossible task this is, there is no need to think about different ecosystems (PyPI vs NPM vs Cargo vs ...). Even in the case of different Linux distributions, the package managers are so different that expecting them to support the same formats is a lost cause.
endorphine 16 hours ago
From https://en.wikipedia.org/wiki/Software_supply_chain:
> A software bill of materials (SBOM) declares the inventory of components used to build a software artifact, including any open source and proprietary software components. It is the software analogue to the traditional manufacturing BOM, which is used as part of supply chain management.
- stuaxo 9 hours ago
  Still not fully helpful. The article could have included some links or a box out.
matharmin 11 hours ago
SBOM may contain similar info to lockfiles, but the purposes are entirely different.
Lockfiles tell the package manager what to install. An SBOM tells the user what your _built_ project contains. In some cases it could be the same, but in most cases it's not.
It's more complicated than just annotating which dependencies are development versus production dependencies. You may be installing dependencies, but not actually use them in the build (for example optional transitive dependencies). Some build tools can detect this and omit them from the SBOM, but you can't omit these from your lockfile.
Fundamentally, lockfiles are an input to your development setup process, while an SBOM is an output of the build process.
Now, there is still an argument that you can use the same _format_ for both. But there are no significant advantages to that: the SBOM is more verbose, does not diff well, and will result in worse performance.
- sunnyday_002 9 hours ago
  So the lockfile is a superset, but never a subset?
  So it basically is an SBOM then but just sometimes has extra dependencies?
  - matharmin 9 hours ago
    Superset of dependencies, but often a subset of info per dependency.
    - sunnyday_002 9 hours ago
      Ah okay! I know Rust has the transitive dependencies; I did not think/realise that other languages might not. Good point!
Ferret7446 14 hours ago
No because SBOMs are a hot mess and not standardized at all. They're "standardized" in the same sense as HL7 (ask someone in the healthcare industry, make sure to have some sedatives on hand first). A comprehensive SBOM for something like Chromium is many dozens of MBs compressed (I forget exactly, but it's patently ridiculous). Also SBOMs should be build artifacts, so them (also) being build inputs is problematic.
- zvr 12 hours ago
  The format is standardized, to the highest level possible: ISO/IEC 5962:2021 defines SPDX v2.2.1. The actual standard text is available for free at the ISO website (and other places, like spdx.org).
  The newer version, SPDX v3.0, will become ISO/IEC 5962:2026, and work is already underway for further versions.
  What is not standardized at all is the integration of processes for producing/consuming/maintaining SBOMs in the software development world.
  - Ferret7446 10 hours ago
    Oh sure, the format is standardized. The semantics aren't, however, in any practical sense. What happens when you vendor/patch/fork a dependency? What happens to vulnerabilities that are in code paths not used by your software, or only under certain flags?
    HTML is standardized too, how many documents do you think use the p or i tags properly? Heck, how many documents do you think are HTML5 compliant, even ignoring the semantics?
    (And even if it were, it is still much too bulky of a tool to replace lockfiles. Having to add a kilobyte to your file every time a bunch of new vulnerabilities get reported in your deps recursively sounds like a great addition to your commit history.)
    - zvr 10 hours ago
      > What happens when you vendor/patch/fork a dependency?
      You change the supplier property (and most probably the version). This is how you distinguish between OpenSSL 3.1.4 from the OpenSSL project and OpenSSL 3.5.4-1~deb13u1 from the Debian project.
      > What happens to vulnerabilities that are in code paths not used by your software, or only under certain flags?
      You record this information in the SBOM, using structures like "this software has this vulnerability reported, but it's not affected by it in this case" (see, for example, VexNotAffectedVulnAssessmentRelationship in SPDXv3).
      I completely agree that its purpose is not to replace lockfiles.
- larusso 13 hours ago
  This year I had to create SBOM files for our Unity projects. Of course there is nothing. For those who don’t know: UPM (Unity Package Manager) is a way to easily install packages in Unity. As a side note, for whatever reason they decided to build on top of npm, not NuGet, for the package infrastructure and metadata format. Anyway: most packages we use are simply wrapper packages for other packages, like a wrapper for a .NET library. There is no clear dependency tree, but based on the package ID I’m able to see them. So I wrote the SBOM files manually with an SBOM library and added pedigree statements pointing to the original NuGet package being wrapped. The idea was that if the NuGet package has a security issue, the UPM package also gets flagged. I showed that to one of the security engineers of the software we use. The answer was: cool, but that is not a standard. There is also no official package specification for UPM (I also made that up as part of the purl). So yes, SBOM is a standard with a huge array of ways to declare said information. And it seems most companies consuming the files don’t build general parsers but expect specific formats for X.
- mrweasel 11 hours ago
  This might not be part of HL7, but I recall working on software for a healthcare product, and simply having a list of components was not enough. Each component had to be accompanied by a risk assessment. It's a really clever way of keeping your dependency count low.
  - Yossarrian22 6 hours ago
    How does that work for high complexity dependencies like compression or cryptography? If HL7 wouldn’t catch xzutils is it really adding anything?
- isodev 12 hours ago
  Oh dear, HL7, I may be suffering from a form of PTSD… my therapist has heard about this “standard” at length.
  But I think SBOMs are better structured. I also feel that if package managers refocus their efforts on that, the standard and its implementations can be evolved. It’s the whole perk of using standards. I think it would be a good thing.
brookman64k 11 hours ago
In some ecosystems like Rust/Cargo the lock file can list a superset of the dependencies that actually make it into the final executable. Crates may conditionally include or exclude dependencies based on enabled features selected by the parent crate, or on the compilation target itself. As a result, the SBOM is effectively a build artifact, and its contents can legitimately vary across platforms.
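To make the superset point concrete, here is a rough sketch, not Cargo's actual resolver. It assumes each dependency edge carries an optional condition label (the `feature:` and `target:` strings, like the function name `built_components`, are invented for illustration; `Cargo.lock` itself does not record them) and walks only the edges that are active for a given build:

```python
from collections import deque


def built_components(edges: dict, root: str, enabled: set) -> set:
    """Return the packages reachable from `root` over edges active in this build.

    `edges` maps a package to a list of (dependency, condition) pairs, where
    `condition` is None for unconditional edges or a label that must be in
    `enabled` for the edge to count.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        pkg = queue.popleft()
        for dep, condition in edges.get(pkg, []):
            if condition is not None and condition not in enabled:
                continue  # locked, but not part of this particular build
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen - {root}


# Hypothetical dependency graph: everything here would appear in the lockfile.
edges = {
    "app": [("serde", None), ("winapi", "target:windows"), ("openssl", "feature:tls")],
    "openssl": [("openssl-sys", None)],
}

linux_build = built_components(edges, "app", {"target:linux"})
windows_tls_build = built_components(edges, "app", {"target:windows", "feature:tls"})
```

Both builds share one lockfile, yet `linux_build` contains only `serde` while `windows_tls_build` contains all four packages, which is why the SBOM has to be produced per build rather than read off the lockfile.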
woodruffw 17 hours ago
This is a great summary, although I think I'm more bearish on SBOMs than Andrew is: my experience integrating them so far (in both pip-audit and uv) has been that there's much more malleability at the representation level than the presence of a standard might imply, and that consumers have adapted (a la Postel) to this reality by being very permissive with the kinds of broken stuff they permit when ingesting third-party SBOMs.
(Case in point: pip-audit's CycloneDX emission was subtly incorrect for years, and nobody noticed[1].)
[1]: https://github.com/pypa/pip-audit/pull/981
notepad0x90 14 hours ago
Wouldn't lock files require running the thing? People need to be able to verify SBOM without doing that. It's the kind of thing you check against a large fleet of devices. If someone has software installed on their laptop but hasn't run it in a year, you need to be able to measure SBOM for that.
SBOM is too similar to things like authenticode and package signing for it to be some unique solution. We're too used to how things have always been done. Too stuck in the "monkey see, monkey do" mindset. How about this: any piece of software, under any execution environment, should not only have an SBOM declaration but also cryptographic authentication of all of its components, including any static data files.
This should be a standardized mechanism. Everyone is doing their own thing and it's creating lots of insecurity and chaos. Why can't I answer all security-related questions about the software I'm running on any device or OS using the same protocol?
Everyone would consider it absurd if we used a different TLS when talking to an Apache server than when talking to a Windows server or any of the alternatives.
SBOM, code signing (originator of the code), capability declarations, access requirements (camera, mic, etc...) are not things that are unique to an OS or platform. And for the details that are, those are data values that should be different, not the entire method of verification.
I wonder what it would take to enact this, I'd imagine some sort of regulatory push? But we don't even have a good cross-platform and standardized way of doing this for anyone to enforce it to begin with.
- perbu 14 hours ago
  If you want to verify the installed package, the package should provide checksums you can verify. AFAIK, the SBOM is there to document the build, not the install.
  - notepad0x90 13 hours ago
    The checksum just tells you what the hash is, nothing more. Supply chain attacks aren't always against the main executable either. With authenticode, the "catalog" can be signed. You're even more opposite of OP than I (OP proposes lockfiles which are at runtime).
    It shouldn't be for "just" any state of the software. We should be able to verify SBOM and take actions at any point. At build time, it is only useful for the developer, I don't get why SBOM is relevant at all. I think you mean at deployment time (when someone installs it - they check SBOM). What I'm saying is, when you fetch the software (download, package manager, app store, curl|sh), when you "install" it, when you run it, and when it is dormant and unused. At all of those times, SBOM should be checkable. Hashes are useless unless you want people to collect hashes for every executable constantly, including things like software updates.
    The problem is, people are looking at it only from their own perspective. People interested in audits and compliance don't care about runtime policy enforcement. People worried about software supply chain compromises care more about immediate auditability of their environment and the ability to take action.
    The recent Shai-Hulud node worm is a good example. Even the best sources were telling people to check specific files at specific locations. There was just one post I found on github issues where someone was suggesting checking the node package cache. Ideally, we would be able to allow-list even js files based on real-time SBOM driven policies. We should be able to easily say "if the software version is published by $developer between dates $start and $end it is disallowed".
    - baobun 12 hours ago
      I still don't see how lockfiles can't be SBOM.
      They contain, for each dependency, the name, version, (derivable) URL and integrity checksum, plus of course the inter-dependency relationships.
      This can all be verified at any point in the lifecycle without running any of the code, provided a network connection and/or the module cache. What's missing?
      > With authenticode, the "catalog" can be signed
      You could trivially sign any lockfile, though I've never seen it. I think it could be neat and it might have a chance to catch on if there was more support in tooling for it. The NPM registry does support ECDSA package sigs, but I guess signatures for this use should be distributed on other channels, given how much of an antipattern uploading lockfiles to the registry is considered in the npm community, and that's an uphill battle. In the context of SBOMs I guess there's already a slot for it?
      - notepad0x90 5 hours ago
        I don't think you've addressed the requirement of having to execute the software, that was my main objection.
        Another matter is that most software I know of doesn't even use lock files. Furthermore, there are lots and lots of software that would need to be updated to support your scheme, but updating them just isn't practical. It would have to be relegated to the type of software that gets regularly updated and its authors care about this stuff. I mean, we can't even get proper software authors to host a security.txt on their website reliably. It needs to work for "old" software, and "new" software would need to spend time and effort implementing this scheme. How can we get people that won't even sign their executable to sign a lock file and participate in the verification process?
  - zvr 12 hours ago
    Ah, but there are actually different types of SBOMs, that describe the software in different parts of its lifecycle. It's a completely different outcome to record the software when looking at its source, at what is being distributed, or at what is being installed, for example.
    At some point we realized that we were talking past each other, since everyone was using "SBOM" to describe different contents and use cases.
    The consensus was expressed around 3 years ago, and published in https://www.cisa.gov/sites/default/files/2023-04/sbom-types-...
    - notepad0x90 5 hours ago
      I haven't had a chance to read that, but do you think it would be impractical to have the different types of SBOMs declared in a standardized format? My impression is that no matter what, authenticity needs to be established, so it will always fall under "cryptographic verification of information about software", it is the standardization of that which I have an issue with.
onion2k 13 hours ago
Isn't one fairly major problem with using lockfiles that there could be packages in the lockfile that aren't used in the application? If I run "npm i package" that doesn't tell you whether or not 'package' is actually used in the app.
For most things that unused dependency is just annoying but if your government has mandated that you use a specific package for something (e.g. cryptography) the lockfile isn't enough to give you confidence that the app is actually doing that. You'll still need to audit the application code.
- pacificpendant 11 hours ago
  You’re right that SBOMs cannot be used to attest that a library is correctly used. I’m not sure if that’s a common use-case of SBOMs though. I normally see people wanting SBOMs for security transparency (customer can see if you’re maintaining your dependencies), vulnerability management (customer can know what vulnerabilities lurk in the dependencies) and license compliance (they can know you didn’t use any dependencies with licenses that cause commercial issues).
  Related to your point though is that just because a dependency is vulnerable doesn’t mean the software using it is affected too. It might not use the functionality that’s vulnerable. Which means a supplier needs to share their assessment of each dependency vulnerability.
pjmlp 10 hours ago
> Every package manager has its own lockfile format. Gemfile.lock, package-lock.json, yarn.lock, Cargo.lock, poetry.lock, composer.lock, go.sum. They all record roughly the same information: which packages were installed, at what versions, with what checksums, from where.
Nope, the Java and .NET ecosystems don't use them.
- homebrewer 10 hours ago
  One can easily opt in with modern dotnet.
  https://devblogs.microsoft.com/dotnet/enable-repeatable-pack...
  - pjmlp 10 hours ago
    I know, however as you point out, it isn't used by default.
perbu 13 hours ago
Software I built will have the following ingredients:
source from git
~30 Go packages
~150 npm packages
a three-layered Docker image
ozim 13 hours ago
Typical software developer fallacy: well, it looks the same, so we can abstract and merge the concepts.
Well, NO: lock file and SBOM formats are used for different purposes and are meant to be consumed by different audiences. They will evolve at different speeds and in different ways. Ideally an SBOM format should not evolve, while a package lock should be able to change on a whim of the package manager developers.
SBOMs are meant to be shared with third parties, while lock files are not. That some tooling started ingesting lock files is an accident: people didn’t know better, or couldn’t explain to their customers why they should do an SBOM, so they did the first, easiest thing.
stuaxo 10 hours ago
There's a great rule for UK Gov websites that acronyms must be defined on first use.
What on earth is an SBOM?
- zvr 9 hours ago
  Software Bill Of Materials (moving to System Bill Of Materials), as lots of comments here explain.
  What is a "UK" ? ;-)
zingar 17 hours ago
I’m hearing the SBOM term for the first time from that article and the linked Wikipedia page. For the ignorant like me: what is it that SBOM is used for that lockfiles aren’t? Everything in the article is something that I’m used to seeing automated scanners using lockfiles for.
Is it just that the two are used by different communities? What is the SBOM community?
- zvr 12 hours ago
  Think of the SBOM as a "table of contents" for the software you are receiving. Another metaphor that has been used is the "nutrition label" that you get on all packaged food.
  So, it's a list of the "software components" that are inside a piece of software. And then you add metadata about each of these components: what's its name? its version? its hash? Up to now we're in lockfile territory.
  But you want more information: what is the license? who supplied it? what is the security status? does it have known CVEs? are they relevant?
  And then you go to special cases, like "AI" software: oh, it's a model? how was it trained? on which data? Or like software that has to be certified, to be used when safety is important.
  An SBOM is capable of providing all this information. Take a look at the different parts that SPDX provides, and it's an ever expanding area.
- edoceo 17 hours ago
  In many cases the lock files are for one part of the stack, like npm and composer and $other_lang thing. An SBOM is when all are together and version-pinned. (I've oversimplified.)
  Edit: for my domain we have Alpine, Debian, PHP, JS, Go in the stack. So our BOM has all that (and dependencies). It's a big list. Some are just the necessary base (Alpine, Debian), but some are core stack and others are edge (a dependency on a Python lib when we're mostly Rust, or something).
  Mirror/vendor all these things for supply-chain integrity (it's what they tell me).
- Khaine 15 hours ago
  SBOMs are a solution intended to help solve a couple of problems:
  1) help identify and remediate software that has been built with vulnerable packages (think log4j).
  2) help protect against supply chain compromise as the SBOM contains hashes that allow packages to be verified
  - pacificpendant 11 hours ago
    https://www.ntia.gov/sites/default/files/publications/sbom_m...
    Depending on who you ask, an SBOM might not need a hash. NTIA only recommends a hash.
  - ozim 13 hours ago
    You forgot the important one: SBOMs are created with sharing in mind, with third parties like your customers. Lock files are not.
    - Khaine 10 hours ago
      That's an important point. You can't tell if the software you use is vulnerable to something like log4j without the vendor telling you, or doing lots of manual investigation.
      SBOMs are supposed to help with software composition analysis. Basically, you as an enterprise have an inventory of what software you use, and their SBOMs (i.e. dependencies). I can then use this to automatically check which software is impacted by severe vulnerabilities when they are announced.
- Tomte 13 hours ago
  Software licensing information is the big use case where SPDX originated from.
  In CycloneDX you can also express things like attestations/certifications, possibly down to the code review level (although I think nobody does that).
- LoganDark 17 hours ago
  > what is it that SBOM is used for that lockfiles aren’t?
  Compliance. The article mentions "the EU’s Cyber Resilience Act will push vendors toward providing SBOMs", and having package managers generate SBOMs directly would certainly be convenient for that.
    - jlubawy 15 hours ago
    The FDA also requires SBOMs as of a few years ago for medical device software.
- 17 hours ago
  [deleted]
phendrenad2 16 hours ago
> the security world has been pushing CycloneDX and SPDX
> CycloneDX supports JSON, XML, and YAML
And SPDX is JSON.
Are there any other examples of government-mandated non-human-readable file formats? I feel like bureaucracies have a natural tendency to water down requirements such as this and instead focus on getting wet signatures on pen-and-paper.
- Tomte 12 hours ago
  Or tag-value, which is actually preferred by many practitioners. Nesting is implicit in that format, but SBOMs should be mostly flat, anyway.
  Unfortunately, T-V has been dropped in SPDX 3.0.
  - zvr 12 hours ago
    It was dropped exactly because it was flat and it was becoming completely unmanageable.
    SPDX v3 is based on a graph model that can represent hierarchies natively. It can then be serialized in a file, for example, in JSON format.
    - Tomte 11 hours ago
      But it was the best format for manually creating an SBOM.
      Most SBOM use cases don’t need the ability to put your detailed software architecture in the SBOM.
firloop 18 hours ago
Another drawback could be that package manager lockfile schemas are optimized for performance[0]. I wouldn't appreciate seeing slower install times by default - especially if the lockfile could be converted with other tooling.
[0]: https://bun.com/blog/behind-the-scenes-of-bun-install#optimi...
voidUpdate 12 hours ago
https://xkcd.com/927/