19 comments

  • sarchertech 2 hours ago

    This is just the natural solution that falls out if you want to change a schema with no downtime. I always just called it “dual writing”.
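
    Roughly, every write path hits both the old and the new shape until you cut over. A minimal sketch of that dual write (table and column names are made up, sqlite just to keep it self-contained):

        import sqlite3

        # Hypothetical split of users.name into first_name / last_name.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("CREATE TABLE users_v2 (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)")

        def save_user(user_id, name):
            # Dual write: the old table stays the source of truth, but every
            # write also lands in the new shape, inside one transaction.
            first, _, last = name.partition(" ")
            with conn:
                conn.execute("INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
                             (user_id, name))
                conn.execute("INSERT OR REPLACE INTO users_v2 (id, first_name, last_name) VALUES (?, ?, ?)",
                             (user_id, first, last))

        save_user(1, "Ada Lovelace")  # reads keep using `users` until the backfill and cutover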

  • leetrout 3 hours ago

    I use this example when I give talks and teach DevOps trainings.

    I call it the migration sandwich. (Nothing to do with the cube rule).

    A piece of bread isn't a sandwich and a single migration in a tool like alembic isn't a "sandwich" either. You have a couple layers of bread with one or several layers of toppings and it's not a sandwich until it's all done.
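
    Roughly, the layers map to separate Alembic revisions. A sketch, assuming Postgres and a made-up name -> first_name/last_name split (each upgrade would live in its own revision file):

        from alembic import op
        import sqlalchemy as sa

        def upgrade_expand():        # bottom slice: additive only, old column untouched
            op.add_column("users", sa.Column("first_name", sa.Text(), nullable=True))
            op.add_column("users", sa.Column("last_name", sa.Text(), nullable=True))

        def upgrade_backfill():      # the filling: copy data while both shapes coexist
            op.execute(
                "UPDATE users SET first_name = split_part(name, ' ', 1), "
                "last_name = split_part(name, ' ', 2) WHERE first_name IS NULL"
            )

        def upgrade_contract():      # top slice: only after every caller reads the new columns
            op.drop_column("users", "name")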

    People get a laugh out of the "idiot sandwich meme" and we always have a good conversation about what gnarly migrations people have seen or done (72+ hours of runtime, splitting to dozens or more tables and then reconstructing things, splitting things out to safely be worked on in the expanded state for weeks, etc).

    I had never heard it called "expand and contract" before reading this article a few years ago.

    What does everyone else call these?

    • tczMUFlmoNk 2 minutes ago

      I have usually heard it called "A–AB–B migrations". As in, you support version A, then you support both version A and version B, then you support just version B.

      The rest of the sequencing details follow from this idea.
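
      In application code the phases end up looking something like this sketch (the flag and column names are just for illustration):

          # A -> AB -> B, driven by a deploy-time flag.
          PHASE = "AB"

          def write_name(row: dict, name: str) -> None:
              if PHASE in ("A", "AB"):
                  row["name"] = name                                   # version A shape
              if PHASE in ("AB", "B"):
                  first, _, last = name.partition(" ")
                  row["first_name"], row["last_name"] = first, last    # version B shape

          def read_name(row: dict) -> str:
              if PHASE == "A" or "first_name" not in row:              # not yet backfilled
                  return row["name"]
              return f"{row['first_name']} {row['last_name']}".strip()

          row = {}
          write_name(row, "Ada Lovelace")
          print(read_name(row))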

    • mcdonje 41 minutes ago

      This is the same pattern as versioning, but with an extremely short sunset for the old version.

    • hobs 2 hours ago

      To me it's just a blue-green type deployment for schemas. You have an old thing and a new thing, you split and merge as traffic replays to the new thing and shows that it's viable and not breaking, and you swap over as you can.

  • davedx 37 minutes ago

    I use Prisma on almost all my Node.js projects these days, and I wish that part of schema migrations were also automated by Prisma. But last I checked, it doesn't even rename columns properly.

    I feel like maybe they should invest more R&D in their migrations technology? The ORM is pretty great.

  • isuckatcoding 40 minutes ago

    Ok hear me out. What if this whole process was statefully managed for you as an add on to your database?

    Like you essentially define the steps in a Temporal-like workflow and then it does all the work of expanding, verifying, and contracting.
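
    Something in that direction can be sketched in plain code; in a real Temporal-style engine each step would be a durable, retryable activity. All names below are made up, and sqlite is only there to keep the sketch self-contained:

        import sqlite3

        def expand(conn):
            conn.execute("ALTER TABLE users ADD COLUMN first_name TEXT")

        def backfill(conn):
            conn.execute("UPDATE users SET first_name = name WHERE first_name IS NULL")

        def verify(conn):
            missing = conn.execute("SELECT count(*) FROM users WHERE first_name IS NULL").fetchone()[0]
            if missing:
                raise RuntimeError("backfill incomplete; refusing to contract")

        def contract(conn):
            conn.execute("ALTER TABLE users DROP COLUMN name")   # DROP COLUMN needs sqlite >= 3.35

        def run_workflow(conn):
            for step in (expand, backfill, verify, contract):
                step(conn)        # a real engine would checkpoint and retry between steps
                conn.commit()

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO users (name) VALUES ('Ada Lovelace')")
        run_workflow(conn)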

    • Fripplebubby 20 minutes ago

      I'm hearing you out, but how does this handle the part that is client behavior rather than database behavior? If there were some kind of SDK that actually captured the interface here (that is, that the client needs to be compatible with both versions of the schema at once for a while) and pushed that back to the client, that could be interesting: a way to declare that column "name" and columns "first name" and "last name" are conceptually part of the same thing, and that the client code paths must provide handling for both at once.
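
      Even a dumb shim in client code gets at the idea: one accessor that tolerates both shapes while the migration is in flight (purely illustrative names):

          def full_name(row: dict) -> str:
              # New split-column shape, if present; otherwise fall back to the old column.
              if "first_name" in row or "last_name" in row:
                  return f"{row.get('first_name', '')} {row.get('last_name', '')}".strip()
              return row["name"]

          print(full_name({"name": "Ada Lovelace"}))
          print(full_name({"first_name": "Ada", "last_name": "Lovelace"}))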

    • nightpool 23 minutes ago

      On the Rails side, Gitlab has an extensive set of helpers for this that a lot of Rails projects have adopted—I would love to see them pulled out into a Gem or adopted into Rails core proper: https://gitlab.com/gitlab-org/gitlab-foss/blob/master/lib/gi...

  • krystofee an hour ago

    Is there any easy way to implement this pattern in AWS RDS deployments where we need to deploy multiple times a day and need it to be done in a few minutes?

    • numbsafari an hour ago

      In my experience, this process typically spans multiple deploys. I would say the key insight I have taken away from decades of applying this approach is that data migrations need to be done in an __eventually consistent__ manner, rather than as an all-or-nothing, stop-the-world, global transaction or transformation.

      Indeed, this pattern in particular is extremely useful in environments where you are trying to make changes to one part of a system while multiple deploys are happening across the entire system, or where you are dealing with a change that requires a large number of clients to be updated and you don't have direct control of those clients, or they operate in a loosely connected fashion.

      So, regardless of whether AWS RDS is your underlying database technology, plan to break these steps up into individual deployment steps. I have, in fact, done this with systems deployed over AWS RDS, but also with systems deployed to on-prem SQL Server and Oracle, to NoSQL systems (this is especially helpful in those environments), to IoT and mobile systems, to data warehouse and analysis pipelines, and on and on.
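
      For what it's worth, the eventually consistent piece usually reduces to a small, idempotent, batched backfill job along these lines (sketch only; table and column names are made up, sqlite just for self-containedness):

          import sqlite3
          import time

          def backfill_batch(conn, batch_size=1000):
              # Idempotent: only touches rows that haven't been migrated yet,
              # so it is safe to stop and restart between deploys.
              rows = conn.execute(
                  "SELECT id, name FROM users WHERE first_name IS NULL LIMIT ?",
                  (batch_size,),
              ).fetchall()
              for user_id, name in rows:
                  first, _, last = name.partition(" ")
                  conn.execute(
                      "UPDATE users SET first_name = ?, last_name = ? WHERE id = ?",
                      (first, last, user_id),
                  )
              conn.commit()
              return len(rows)

          def run(conn):
              while backfill_batch(conn):
                  time.sleep(0.1)   # throttle so the live workload isn't starved

          conn = sqlite3.connect(":memory:")
          conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, first_name TEXT, last_name TEXT)")
          conn.execute("INSERT INTO users (name) VALUES ('Ada Lovelace')")
          run(conn)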

  • maffyoo 2 hours ago

    Expand Contract from Fowler's bliki

    https://martinfowler.com/bliki/ParallelChange.html

    • layer8 an hour ago

      Expand the interface contract and then contract the interface contract? ;)

  • fuzzy2 3 hours ago

    I’m confused. I thought Expand and Contract was about mutating an existing schema, adding columns and tables, not creating a full replacement schema. But maybe I misunderstood?

    What’s in the article, I know as the Strangler Fig Pattern.

    • jerriep 2 hours ago

      What you describe is what they describe as well:

      > For column-level changes, this often means adding new columns to a table that have the characteristics you want while leaving the current columns as-is.

      I think what makes it confusing is that their diagrams depict a completely separate schema, but what they describe is really just altering the existing schema.

    • DeathArrow 2 hours ago

      > What’s in the article, I know as the Strangler Fig Pattern.

      Strangler fig pattern is mostly concerned with migrating from an old piece of software to a new one, for example from a monolith to microservices. But I guess you can also apply it to database schemas.

  • AtlasBarfed 19 minutes ago

    1) double write essentially

    2) migration involves the problem of mixing a migration write with an actual live, in-flight mutation. Cassandra would solve this with additional per-cell write-time tracking or a migrated-vs-new mutation flag

    3) and then you have deletes. So you'll need a tombstone mechanism, because if a live delete of a cell value is overwritten by a migrated value, then data that was deleted comes back to life
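
    To make (3) concrete, here's a toy last-write-wins merge with tombstones (made-up structures, just to show why a migrated value must not resurrect a delete):

        from dataclasses import dataclass
        from typing import Optional

        @dataclass
        class Cell:
            value: Optional[str]   # None plays the role of a tombstone (deletion marker)
            write_time: int        # per-cell write timestamp, as Cassandra tracks

        def merge(live: Cell, migrated: Cell) -> Cell:
            # Last write wins: a newer tombstone beats an older migrated value,
            # so deleted data stays deleted instead of coming back to life.
            return live if live.write_time >= migrated.write_time else migrated

        deleted = Cell(value=None, write_time=200)        # user deleted the cell at t=200
        copied  = Cell(value="old data", write_time=100)  # migration copied the t=100 value
        assert merge(deleted, copied).value is None       # the delete wins; no resurrection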

  • skywhopper 2 hours ago

    This is the model we used at the SaaS I worked for a decade ago. It worked great to allow for smooth, zero-downtime upgrades across a fleet of thousands of DB servers serving tens of thousands of app servers and millions of active users.