Making PyPI's test suite 81% faster – The Trail of Bits Blog

(blog.trailofbits.com)

78 points | by rbanffy 4 days ago

22 comments

  • cocoflunchy 6 hours ago

    I don't understand why pytest's collection is so slow.

    On our test suite (a big Django app) it takes about 15s to collect tests. So much so that we added a util that uses ripgrep to find the file containing the test and pass it as an argument to pytest when using `pytest -k <testname>`.
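
    A wrapper like that might look like this (a sketch; the `pt` name and the `tests/` directory are assumptions, and `grep -rl` is a slower drop-in where ripgrep's `rg -l` isn't available):

```shell
# pt: run a single named test without paying the full collection cost.
# Find the file that defines the test, then hand pytest only that file.
pt() {
    file=$(grep -rl "def $1" tests/ | head -n1)   # with ripgrep: rg -l "def $1" tests/
    if [ -z "$file" ]; then
        echo "no test named $1 found" >&2
        return 1
    fi
    pytest "$file" -k "$1"
}
```

    Called as `pt test_user_login`, this makes pytest collect only the one file instead of the whole tree.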

    • Galanwe 37 minutes ago

      From my experience speeding up pytest with Django:

      - Creating and migrating the test DB is slow. There is no shame in storing and committing a premigrated sqlite test DB generated upon release; it's often small and will save time for everyone.

      - Stash old migrations that nobody uses anymore.

      - Use python -X importtime and paste the result in an online viewer. Sometimes moving heavy imports to functions instead of the global scope will make individual tests slower, but collection will be faster.

      - Use pytest-xdist

      - Disable transactions / rollback on readonly tests. Ideally you want most of your non-inserting tests to work on the migrated/preloaded features in your sqlite DB.

      We can go into more detail if you want, but the premigrated DB + xdist alone allowed me to speed up tests on a huge project from 30m to 1m.
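
      The `-X importtime` flag mentioned above is built into CPython and easy to try; a minimal sketch (`json` stands in for whatever heavy module your conftest pulls in, and the log file name is arbitrary):

```shell
# Each stderr line has the form "import time: <self us> | <cumulative us> | <module>".
python3 -X importtime -c "import json" 2> importtime.log
# Show the biggest cumulative offenders first (field 2 is cumulative microseconds):
sort -t '|' -k2 -rn importtime.log | head
```

      Tools like tuna can render the same log as a flame graph, which is likely the kind of "online viewer" meant above.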

      • imp0cat 2 minutes ago

        Is there a way to use pytest-xdist and still keep the regular output?

    • boxed 3 hours ago

      I've done some work on making pytest faster, and it's mostly a case of death by a thousand paper cuts. I wrote hammett as an experimental benchmark to compare to.

    • kinow 5 hours ago

      In their case I think they were not specifying any test path, which would cause pytest to search for tests in multiple directories.

      Another thing that can slow down pytest collection and bootstrap is how fixtures are loaded, so reducing the number or scope of fixtures may help too.

    • piokoch 5 hours ago

      Ehhh, those pesky Python people, complaining and complaining; an average Spring Boot application takes 15s to even start checking whether the code compiled ;)

      • thom 5 hours ago

        Lest we start to malign the JVM as a whole, my Clojure test suite, which includes functional tests running headless browsers against a full app hitting real Postgres databases, runs end to end in 20s.

        • ffsm8 2 hours ago

          The Spring tests are generally quicker than the equivalent Python tests, so IME the JVM is mostly to blame.

          How much time actually goes by after you click "run test" (or run the equivalent CLI command) until the test finishes running?

          Every JVM project I've ever worked on (none of which were Clojure, admittedly) has taken at least 10-15s before the pre-phases finished and the actual test setup began.

          • thom 2 hours ago

            If I completely clear all cached packages, maybe, but I never do that locally or in CI/CD, and that's true of Python too (though no doubt uv is faster than Maven). Clojure/JVM startup time is less than half a second; obviously that's still infinitely more than Python or a systems language, but tolerable to me. First test runs after about 2s? And day to day these things run instantly because they're already loaded in a REPL/IPython. Maybe it's unfair to compare an interpreted language to a compiled one: building an uberjar would add 10 seconds, but I'd never do that during development, which is part of the selling point I guess. Either way, I don't think JVM startup time is really a massive issue in 2025, and I feel like whatever ecosystem you're in, you can always attack these slow test suites and improve your quality of life.

  • NeutralForest 3 hours ago

    Pretty good article. It's really a challenge to properly isolate DB operations during testing, so having a different instance per worker is nice. I remember trying to use different schemas (not instances), but I had a hard time isolating roles as well.

  • bgwalter 4 hours ago

    I get that pytest has features that unittest does not, but how is scanning for test files in a directory considered appropriate for what is called a high security application in the article?

    For high security applications the test suite should be boring and straightforward. pytest is full of magic, which makes it so slow.

    Python in general has become so complex, informally specified and bug ridden that it only survives because of AI while silencing critics in their bubble.

    The complexity includes PSF development processes, which lead to:

    https://www.schneier.com/blog/archives/2024/08/leaked-github...

    • williamdclt 3 hours ago

      > it only survives because of AI

      I don't disagree that it's "complex, informally specified" (idk about bug ridden or silencing critics), but it's just silly to say it only survives because of AI. It was a top-used language for web development, data science and all sorts of scientific analysis before AI got big, and those fields haven't gone away: I don't expect Python lost much ground there, if any.

      • bgwalter 3 hours ago

        Dropbox moved parts from Python to Go as early as 2014. Google fired its Python team last year, and I hear it does not use Python for new code. Instagram is kept afloat by gigantic hacks.

        The scientific ecosystem was always there, but relied on heavy marketing to academics, who (sadly) in turn indoctrinate new students to use Python as a first language.

        I did forget about sysadmin use cases in Linux distributions, but those could easily be replaced by even Perl, as leaner BSD distributions already show.

        • guappa 31 minutes ago

          You'd be right if go wasn't an awful language designed by someone who clearly failed their compiler class at university.

  • throwme_123 4 hours ago

    Is Trail of Bits transitioning out of "crypto"?

    Imho, they are one of the best auditors out there for smart contracts. It wouldn't be surprising to see some of these talented teams find bigger markets.

  • ustad 6 hours ago

    The article uses pytest - does anyone have similar tips for Python's builtin unittest?

    • masklinn 6 hours ago

      The sys.monitoring and import optimisation suggestions apply as-is.

      If you use standard unittest discovery the third item might apply as well, though probably not to the same degree.

      I don’t think unittest has any support for distribution so the xdist stuff is a no.

      On the other hand, you could use unittest as the API with pytest as your test runner. Then you can also use xdist, and eventually migrate to the pytest test API because it's so much better.
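
      This mixed mode works because pytest collects plain `unittest.TestCase` classes unchanged; a minimal sketch (the class and test names here are invented):

```python
import unittest

# A stock unittest test: pytest will collect and run this class as-is,
# so a suite can adopt pytest (and pytest-xdist) as the runner first
# and migrate individual tests to plain pytest functions later.
class TestMath(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(2 + 2, 4)
```

      Saved as e.g. `test_math.py`, both `python -m unittest test_math.py` and `pytest test_math.py` (or `pytest -n auto` with xdist) run the same class.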

      • kinow 5 hours ago

        I wasn't familiar with this sys.monitoring option for coverage. Going to give it a try in my test suite. At the moment, with Docker testcontainers, a GH Actions test matrix for multiple Python versions, and unit + regression + integration tests, it takes about 3-5 minutes.

    • anticodon 3 hours ago

      I profiled a huge legacy test collection using cProfile and found lots of low-hanging fruit. For example, some tests were creating a 4000x3000 Pillow image in memory just to test how image-saving code works (checking that the filename and extension are correct). And hundreds of tests created this huge image for every test (in the setUp method) because of unittest's reliance on inheritance. Reducing the image size to 10x5 made the test suite faster by something like 5-7% (it was a long time ago, so I don't remember the exact statistics).

      So, I'd run the tests under cProfile first.
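
      A self-contained sketch of that approach, using only the standard library (`make_image` and `run_suite` are invented stand-ins for the real fixtures and suite):

```python
import cProfile
import io
import pstats

def make_image(width, height):
    # Stand-in for an expensive setUp step, e.g. building a 4000x3000 image;
    # here it just allocates a width*height byte buffer.
    return bytearray(width * height)

def run_suite():
    # Simulate hundreds of tests each rebuilding the heavy fixture.
    for _ in range(200):
        make_image(400, 300)

profiler = cProfile.Profile()
profiler.enable()
run_suite()
profiler.disable()

# Sort by cumulative time so expensive setup helpers surface at the top.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

      Against a real suite you could run something like `python -m cProfile -o prof.out -m pytest tests/` and inspect `prof.out` with `pstats` the same way.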

      • dmurray an hour ago

        But the changes in TFA were of the order of a 75% improvement, from "dumb" changes that were agnostic to the details of the tests being run.

        Saying you got a 5-7% improvement from a single change, discovered using the profiler, that took understanding of the test suite and the domain to establish it was OK, and that actually changed the functionality under test - that's all an argument for doing exactly the opposite of what you recommend.