JJ's conflict handling seems nice. Can't say conflicts are a big enough problem for me to call it a "major feature", but it seems like a nice improvement.
I'm not sure how I feel about JJ's "working copy commit". One of the great things about Meta's VCS is that all commits are automatically backed up into the cloud. Which seems incompatible with the JJ model? Not sure. I think the D in DVCS is *wildly* overrated. 99.999% of projects are de facto centralized.
I'm #TeamMonoRepo 100% of the way. My background is gamedev and Perforce, an industry that still uses Perforce because Git is poopy poop poop. The Git integration I want to see is the ability to easily sync a monorepo subfolder with an external GitHub repo. Syncing commits for internal projects that are open sourced requires a big ugly custom set of tooling. And I'd kind of like a way to do an "inner fork" within a monorepo, if that makes sense.
If you're interested, here's a pair of blog posts I wrote that have at least some of my thoughts on source control.

https://www.forrestthewoods.com/blog/dependencies-belong-in-...

https://www.forrestthewoods.com/blog/using-zig-to-commit-too...
> JJ's conflict handling seems nice. Can't say conflicts are a big enough problem for me to call it a "major feature", but it seems like a nice improvement.
The advantages people first think of when they hear about jj's conflict handling are usually that you can collaborate on conflicts and that you can leave conflicts for later. What's less obvious [1] is that being able to store conflicts in commits means that we can always rebase descendants, so there are no states like what Mercurial (and Sapling, I think) call "obsolete" and "orphan". There is also no "interrupted rebase" state when you're resolving conflicts.
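To give a rough intuition in code, here's a toy Rust sketch of storing a conflict as data rather than as textual markers. All the types are invented for illustration, and it assumes a simple two-sided conflict; jj's actual representation is more general:

```rust
// Toy model of a "first-class conflict" stored as data in a commit,
// loosely inspired by jj's idea of keeping the sides being added and
// the bases being removed, instead of baking marker text into files.
// Types here are invented for illustration, not jj-lib's real ones.

enum FileState {
    /// Ordinary resolved contents.
    Resolved(String),
    /// An unresolved merge. This sketch assumes exactly two sides
    /// added and one base removed.
    Conflicted { adds: Vec<String>, removes: Vec<String> },
}

/// Materialize conflict markers only at checkout time; the commit
/// itself keeps the structured form, so descendants can always be
/// rebased and resolution can happen whenever you like.
fn materialize(state: &FileState) -> String {
    match state {
        FileState::Resolved(text) => text.clone(),
        FileState::Conflicted { adds, removes } => format!(
            "<<<<<<< side 1\n{}\n||||||| base\n{}\n=======\n{}\n>>>>>>> side 2\n",
            adds[0], removes[0], adds[1]
        ),
    }
}

fn main() {
    let conflict = FileState::Conflicted {
        adds: vec!["left()".into(), "right()".into()],
        removes: vec!["base()".into()],
    };
    print!("{}", materialize(&conflict));
}
```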
Those properties simplify things for the user. They also simplify a lot for developers. An example is how I spent about 2 weeks on an `hg amend --into` command for amending the changes in the working copy into an ancestor. I then implemented the same thing in jj in under an hour. Much of the complexity in hg stemmed from handling the interrupted states that conflicts create. (Other complexity hg has that jj doesn't: dealing with a dirty working copy, dealing with concurrent operations, and simply the complexity of the APIs for creating new commits in memory.)
[1] IIRC, it took me about a year after I added support for "first-class conflicts" before I figured out that it meant we should simply always rebase descendants. Jujutsu had orphans and a `jj evolve` command before then.
Conflict handling is much more useful whenever you actually have conflicts arising regularly. :) For working on JJ itself, this is super useful because tons of code is still being written by 2-3 people, continuously. In other words, the velocity of newly written code is still very high, internal APIs change and break PRs, etc. I think how often you handle conflicts in a project comes down to a bunch of factors. But JJ really does make conflict handling not 1 but like 3-5 times easier, IMO. So you only really need to do it once to be wow-ed. (One of my previous coworkers said something like, and I quote, "This is fucking awesome.")
JJ has an abstract notion of a working copy and even an abstract notion of a backend that stores commits. jj is a set of Rust crates, and these are interfaces, so you can implement them with whatever backend you want. For example, the working copy can be described in terms of the filesystem (POSIX copy/rename/move files around) or in terms of a virtual filesystem (RPC to a server that tells you to make path x/y/z look like it did at version 123).
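As a rough sketch of what "the backend is an interface" means in practice. Names and signatures below are hypothetical, not jj-lib's actual traits:

```rust
// Hypothetical sketch of a pluggable commit-storage interface,
// loosely inspired by jj's design. All names are invented for
// illustration; see the jj-lib crate for the real traits.

use std::collections::HashMap;

type CommitId = String;

#[derive(Clone)]
struct Commit {
    parents: Vec<CommitId>,
    tree_id: String, // points at a snapshot of the file tree
    description: String,
}

/// Anything that can store and retrieve commits can back a repo:
/// local files, a Git object store, or an RPC client for a server.
trait CommitBackend {
    fn read_commit(&self, id: &CommitId) -> Option<Commit>;
    fn write_commit(&mut self, id: CommitId, commit: Commit);
}

/// Trivial in-memory backend; a "cloud commits" backend would
/// instead issue RPCs to a central server here.
struct InMemoryBackend {
    commits: HashMap<CommitId, Commit>,
}

impl CommitBackend for InMemoryBackend {
    fn read_commit(&self, id: &CommitId) -> Option<Commit> {
        self.commits.get(id).cloned()
    }
    fn write_commit(&mut self, id: CommitId, commit: Commit) {
        self.commits.insert(id, commit);
    }
}

fn main() {
    let mut backend = InMemoryBackend { commits: HashMap::new() };
    backend.write_commit(
        "abc123".into(),
        Commit { parents: vec![], tree_id: "tree1".into(), description: "initial".into() },
    );
    assert!(backend.read_commit(&"abc123".to_string()).is_some());
}
```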
You can absolutely have "cloud commits" like Mononoke/Sapling does, and that is a central feature desired at Google too, where Martin works. Skip to the end of this talk from Git Merge 2024 a few weeks ago, and you can see Martin talk about their version of `jj` used at Google, that interacts with Piper/CitC: https://www.youtube.com/watch?v=LV0JzI8IcCY
My understanding is that the design works something like this (no promises, but we may approximate something like it in JJ itself): A central server (Piper) stores all the data. A pair of local daemons runs on your machine: a proxy daemon that talks to the server on your behalf (used to reduce RPC latency for e.g. commits and writes, and to upload asynchronously) and a virtual filesystem daemon (CitC). The virtual filesystem daemon manages your working copy. It "mounts" the repository at version 123 (the "baseline" version) onto a filesystem directory by talking to the server, then tracks changes to that directory as you update files, sort of like OverlayFS does for Docker.
When you make a commit, you tell the VFS layer to snapshot the changes between your working copy and the baseline, uploading them to the server. Then you tell the backend that you've created version 124 out of those files/blobs. Version 124 is now your new baseline. Rinse and repeat to create versions 125, 126, etc.
The TL;DR on that is the VFS layer manages the working copy. The server/proxy manage interaction with a backend.
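To make that loop concrete, here's a sketch in Rust. Every type and call below is a made-up stand-in for the VFS daemon and proxy RPCs, not any real Piper/CitC API:

```rust
// Illustrative sketch of the baseline/snapshot loop described above.
// Every type and function here is a hypothetical stand-in for the
// VFS daemon and proxy RPCs.

struct Vfs {
    baseline: u64, // e.g. version 123, mounted by the VFS daemon
}

struct ChangeSet {
    files: Vec<(String, Vec<u8>)>, // path -> new contents
}

impl Vfs {
    /// Diff the working directory against the mounted baseline.
    fn snapshot(&self) -> ChangeSet {
        ChangeSet {
            files: vec![("x/y/z".into(), b"new contents".to_vec())],
        }
    }
}

struct Proxy;

impl Proxy {
    /// Upload the changed blobs and ask the server to record a new
    /// version built from them; the server hands back its number.
    fn commit(&self, baseline: u64, changes: &ChangeSet) -> u64 {
        let _ = changes; // would be an asynchronous upload in reality
        baseline + 1 // e.g. 123 -> 124
    }
}

fn main() {
    let mut vfs = Vfs { baseline: 123 };
    let proxy = Proxy;

    // Snapshot working-copy changes, upload them, get version 124
    // back, and remount it as the new baseline. Rinse and repeat.
    let changes = vfs.snapshot();
    vfs.baseline = proxy.commit(vfs.baseline, &changes);
    println!("new baseline: version {}", vfs.baseline);
}
```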
OK, monorepo export of subprojects. For exporting monorepo subfolders, I'm actively exploring and thinking about "filtering" tools that can be applied to a commit graph to produce a new commit graph. Mononoke at Meta can do something like this. Check out Josh, a tool that does it for native Git repositories: https://josh-project.github.io/josh/reference/filters.html
The idea: say you have commits A -> B -> C and you want to export a repository that contains only the project located under path/to/oss/thing. You basically sweep the commit graph and retain only commits that touch path/to/oss/thing. If a commit does not touch this path, remove it from the graph and "stitch" the graph back together. If a commit does touch this path, keep it.
If a commit touches that path as well as paths that don't match, create a new commit that only touches files under that path. In other words, you might derive a new commit graph A' -> C', where B was thrown away because it only touched private code, and A' and C' are "filtered" versions of the originals that retain a subset of their changes. The secret is that this algorithm can be fully deterministic, assuming the input commit graph is immutable and cannot change.
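Here's a minimal sketch of that sweep for a linear chain of commits, assuming a single path-prefix filter. Real filter tools like Josh handle arbitrary DAGs, renames, merges, and more:

```rust
// Sketch of a deterministic path filter over a linear commit chain,
// the simplest case of the graph sweep described above. Data model
// is invented for illustration.

struct Commit {
    id: &'static str,
    files: Vec<&'static str>, // paths touched by this commit
}

/// Keep only commits touching `prefix`, rewriting each survivor to a
/// filtered commit that retains just the matching paths. Because the
/// input is immutable and the rule is pure, the output is deterministic.
fn filter_chain(chain: &[Commit], prefix: &str) -> Vec<Commit> {
    chain
        .iter()
        .filter_map(|c| {
            let kept: Vec<&'static str> = c
                .files
                .iter()
                .copied()
                .filter(|p| p.starts_with(prefix))
                .collect();
            if kept.is_empty() {
                None // drop B-style commits; the chain "stitches" itself
            } else {
                Some(Commit { id: c.id, files: kept }) // A', C', ...
            }
        })
        .collect()
}

fn main() {
    let chain = vec![
        Commit { id: "A", files: vec!["path/to/oss/thing/lib.rs", "private/notes.md"] },
        Commit { id: "B", files: vec!["private/secret.rs"] },
        Commit { id: "C", files: vec!["path/to/oss/thing/main.rs"] },
    ];
    // Produces A' -> C'; B is dropped because it touched no exported path.
    for c in filter_chain(&chain, "path/to/oss/thing") {
        println!("{}': {:?}", c.id, c.files);
    }
}
```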
This idea is kind of like applying a MATERIALIZED VIEW to a database. In fact, it could be a kind of continuous materialized view, where the derived graph is computed incrementally and automatically. But how do you express the filters? A filter is really a way of expressing a functional map between graphs. So should filters only be bijective? What if they're merely injective? Etc. I haven't fully figured this out.
So anyway, I've never really worked in gamedev or huge monorepo orgs before, but these are all problems I've acutely felt and that Git never really solved. So I can say that yes, we are at least thinking about these things and more.
> whenever you actually have conflicts arising regularly.
I mean, I work at Meta. We have lots of people working on stuff at the same time. :) Araxis Merge auto-merges quite nicely. Text conflicts have just never been a meaningful issue for me in my 15+ year career.
Now binary conflicts are a source of pain! Those unfortunately require Perforce-style file locking once you hit a certain team size.
> I've never really worked in gamedev or huge monorepo orgs before, but these are all problems I've acutely felt and that Git never really solved. So I can say that yes, we are at least thinking about these things and more.
Good to hear! Yeah, the workflow for subproject syncing isn't super obvious. It definitely can be done but will require a lot of thinking and a few iterations.