Maybe I can't understand what TFA is describing, but from what I know a patch is usually tied to a specific commit, so a very specific point of time in the upstream lifetime. It does not make sense to have it lingering longer than that. Even in the case when you want to maintain a set of patches (package building,...) you usually revise it with every new version of the software. In this case, the intent is much more important than the how (which quickly becomes history).
The point is to maintain your set (perhaps stack) of patches as a set of patches on top of upstream for the long term. Yes, you will probably have to revise them as upstream changes, but this will let you maintain their identity as you do so. Is that something you will find useful? Maybe, maybe not.
You’re thinking a patch is text, but you should think of it as a logical change. Unless the logic becomes part of upstream, the patch is not tied to a specific point in “time”. There’s a cost to it, as you have to constantly rebase. This is the case with any non-vanilla distribution (e.g. Linux), although it’s also at a package level, so you do this both for each package as well as across every package. For well-written code there’s reasonably low coupling, so it’s less work to maintain.
Yes, I don't quite get it. When I need to maintain a fork, I just add an extra remote to git. Then I fetch upstream (what I call my remote) and rebase my changes against whatever branch I'm following. At any point in time I can generate a patch file that works for whatever version I have rebased against.
Seems easy enough, I read the article multiple times and I don't get why what they are describing is needed.
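For anyone following along, the fetch/rebase/export cycle described above could be sketched roughly like this. It is only an illustration: the remote name `upstream`, the `patches` output directory, the example URL and the helper name `fork_sync_commands` are all assumptions, and the function only builds the git command lines rather than running them.

```python
import shlex
from typing import List


def fork_sync_commands(upstream_url: str, branch: str,
                       patch_dir: str = "patches") -> List[List[str]]:
    """Return the git invocations for one sync cycle (nothing is executed here)."""
    target = f"upstream/{branch}"
    return [
        # Only needed the first time; the remote name "upstream" is an assumption.
        ["git", "remote", "add", "upstream", upstream_url],
        # Pull in the latest upstream history.
        ["git", "fetch", "upstream"],
        # Replay the local commits on top of the branch being followed.
        ["git", "rebase", target],
        # Export one patch file per local commit that is not in upstream.
        ["git", "format-patch", "-o", patch_dir, target],
    ]


if __name__ == "__main__":
    # Hypothetical repository URL, used purely for illustration.
    for cmd in fork_sync_commands("https://example.com/project.git", "main"):
        print(shlex.join(cmd))
```

Note that the rebase step rewrites the local branch, which is the part the replies below take issue with.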
(Author here.)
The difference is that git rebasing is a destructive operation: you lose track of the old version when you do it. (Yes, there's technically the reflog... but it's much less friendly to browse, and there's no way to share it across a team.)
Maybe that's an okay tradeoff for something you use by yourself, but it gets completely untenable when you're multiple people maintaining it together, because constantly rebasing branches completely breaks Git's collaboration model.
I worked at a place that was allergic to contributing patches upstream. We maintained a lot of internal forks for things and had no problem collaborating.
You don't need to push the rebased branch to the same branch on your remote, if that's an issue (although I don't see how it is).
Maybe this is a case of "Dropbox is just rsync", but I feel like just learning git and using it is easier than learning a new tool.
We do this for some of the components that are shared between Servo and Firefox. Firefox is upstream, and on the Servo side we have automated and manual syncing. The automated syncing mirrors the upstream `main` branch to our `upstream` without changes daily. The manual syncing rebases our changes on top of a new upstream version through a manual rebase process. This happens monthly and each sync is pushed to a new branch to maintain history.
Between monthly syncs we push our own changes to our latest monthly branch (which also get manually sent upstream when we get a chance).
I see — you’re doing more than “here’s a few patches to keep working across revisions”, you’re doing separate-path feature work on a different, actively-developed project.
To me that sounds like not a great idea, but if you must do it, I could see some usefulness to this.
Yeah. For reference, this is a typical patchset for the project that motivated it.[0] Some of the patches are "routine" dependency upgrades, some of them are bugfix backports, some of them are original work that we were planning to upstream but hadn't got around to yet. Some are worth keeping when upgrading to a new upstream version, some aren't.
I agree that it's not ideal, but... there are always tradeoffs to manage.
[0]: https://github.com/stackabletech/docker-images/tree/e30798ac...
Agreed. If you want your change and don’t want to bother the maintainers with a patch they are unlikely to accept, or can’t because it’s proprietary: fork the repo (at whatever tag makes sense), then periodically sync with the latest code for that version.
The likelihood of conflicts is minimal, and often if you see conflicts it’s a good indication your issue may have been resolved. Or if not, you can see if it’s still needed, or how to adjust it.
(Author here.)
> fork the repo (at whatever tag makes sense), then periodically sync with the latest code for that version.
Yeah, this is the workflow that Lappverk is trying to enable.
The problem is that neither of Git's collaboration models works well for this problem. Rebasing breaks collaboration (and history for the patchset itself), and merging quickly loses track of individual patches. Lappverk is an attempt to provide a safer way to collaborate over the rebase workflow.
A patch just encapsulates what was added and removed in a particular change, it doesn’t care about any commits.
For example, wine-staging (run by the Wine developers themselves) hosts patches for the Wine project, and they revise / rebase them with each Wine version, which is often not a trivial task. I don't see how you can avoid that really. But Wine staging itself is a git repository that holds patches (and their history), if that helps, and they can indeed stay there for years.
Same happens with patches that Debian applies on top of fixed versions of packages. They are stored in Debian's Salsa git.
You may want to have a look at Quilt. It doesn't solve the problem the author described, but it may help you once you accept there is no easy solution in sight.
Quilt is automation for the "bag of patches" model. I used it once when I needed to upgrade the internal bag of patches at $big_corp so as to apply them to a newer version of $public_app. It was predictably complex but somehow still manageable.
If you squint a bit then the [bag of patches] + [automated application in order] is a poor man's Git. If you keep this in a git repo then you're basically versioning repos (poor man's ones) in a repo. It almost sounds like the solution to the author's problem :)
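To make the "bag of patches applied in order" idea a bit more concrete, here is a toy sketch. It is not how Quilt is implemented: real patches are unified diffs, whereas this model treats a patch as a single expected snippet (`old`) and its replacement (`new`). The names `Patch` and `apply_series` are made up for the illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Patch:
    name: str
    old: str  # text the patch expects to find (its "context")
    new: str  # text that replaces it


def apply_series(source: str, series: List[Patch]) -> Tuple[str, List[str]]:
    """Apply patches in order; return the result and the names that did not apply."""
    rejected = []
    for patch in series:
        # A patch applies only if its expected text occurs exactly once.
        if source.count(patch.old) != 1:
            rejected.append(patch.name)
            continue
        source = source.replace(patch.old, patch.new)
    return source, rejected


if __name__ == "__main__":
    series = [
        Patch("no-delay", "timeout = 5\n", "timeout = 0\n"),
        Patch("more-retries", "retries = 1\n", "retries = 3\n"),
    ]
    # Hypothetical upstream file contents at two versions.
    upstream_v1 = "timeout = 5\nretries = 1\n"
    upstream_v2 = "timeout = 0\nretries = 1\n"  # upstream adopted the first change
    print(apply_series(upstream_v1, series))
    print(apply_series(upstream_v2, series))
```

In this sketch, a patch that no longer applies to a newer upstream is reported rather than forced, which is the point where someone has to decide whether it was merged upstream, needs adjusting, or can be dropped.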
Many times I've just patched the binary even if source is available, because trying to reproduce the binary you currently have, with only the changes you want and everything else the same, can be an even more difficult exercise than simply changing a string or constant.
Lol I remember doing this when I was younger with the `man` command to remove a 5 second exit delay for the browser output.
Especially if you make a habit of patching the binary instead of rebuilding from source! ;)
I once wrote a small C++ wrapper for POSIX dlfcn.h. Someone sent a pull request that would have turned it into a Windows-only library.
Like... Intentionally, or because they unthinkingly did something non-portable?
whenever i rebase longstanding commits in my fork, i keep the previous branch by appending the date to its name.
reading the readme didn't make it clear to me how this app would make my life any easier (also considering the added complexity of a new tool).
don't get me wrong, it's a PITA... but how would it hurt less using this tool?
i rarely, if ever, need to look at the history of this.
The process described reminded me of "pristine source" and RPM spec files that take the upstream pristine source and patch it during the build process. Maintaining that is always a little bit of a headache if you don't do it regularly, especially having to maintain (generate and apply) a separate set of patch files for the changes and express/apply the patches in the spec file. This looks to make light work of that.
Honestly I found a better strategy: name branches after the fork point and the date you started the fork. So you’d have main-2025-03-07 for a fork of main started 03-07, and another, main-2025-05-08, for a rebase. The patch set above that is just what you carry. I’m not sure maintaining them as literal patches is that helpful vs just keeping them as explicit patches to apply in git. Maybe this is the right strategy once your fork gets complicated, but at that point you should be hard forking rather than soft forking IMO.
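A small sketch of that naming convention, in case it helps. The helper names `fork_branch_name` and `latest_fork_branch` are invented for the example; only the `main-2025-03-07` pattern comes from the comment above.

```python
from datetime import date
from typing import Iterable, Optional


def fork_branch_name(base: str, started: date) -> str:
    """Build '<base>-YYYY-MM-DD' from the fork point and the start date."""
    return f"{base}-{started.isoformat()}"


def latest_fork_branch(base: str, branches: Iterable[str]) -> Optional[str]:
    """Pick the most recently dated '<base>-YYYY-MM-DD' branch, ignoring other names."""
    prefix = f"{base}-"
    dated = []
    for name in branches:
        if not name.startswith(prefix):
            continue
        try:
            started = date.fromisoformat(name[len(prefix):])
        except ValueError:
            continue  # suffix is not a date, so not one of our fork branches
        dated.append((started, name))
    return max(dated)[1] if dated else None


if __name__ == "__main__":
    print(fork_branch_name("main", date(2025, 3, 7)))
    print(latest_fork_branch("main", ["main", "main-2025-03-07", "main-2025-05-08"]))
```

Because the old dated branches stay around, earlier rebases remain reachable without digging through the reflog.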
Modifying source code like this is one method. For web software, bookmarklets are another great way to do that.
I’m a big fan of Greasemonkey scripts for this, although these days I prefer Violentmonkey because it has several capabilities that the OG doesn’t.
This is supercool. One of my constant problems with self-hosting is that I often need to modify just a couple of files here and there, but then I'm stuck with a forked repo or a dirty working copy.
I'm going to try to make a frontend UI for it.
Are you talking about personal or professional self-host? Why are you constantly patching software you self-host? Not enough configurability? Using software not made for self-host? Holding it wrong? I ask because it seems...strange that you have these issues so often.