The most difficult part about code deletion is practicing the Chesterton's Fence principle:
> In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”

(https://wiki.lesswrong.com/wiki/Chesterton%27s_Fence)
While this tool certainly does the job of proposing code deletions, that's the easier part. The harder part is knowing why the code exists in the first place, which is necessary to know whether it's truly a good idea to remove it. Google, smartly, is leaving that part up to a human (for now).
"This code has been dead for six months" is a very good heuristic that the code is not relevant. I do occasionally reject the sensenmann CLs, but only very very rarely. This isn't weird code that nobody knows why it exists but it is currently doing something. This is code that cannot execute.
Code that only triggers on a yearly holiday, disaster alerts, leap years, or the like would go unused for longer periods and would likely be very problematic if removed. Unless by dead code you mean unreachable code, in which case it shouldn't exist in the first place and I agree it should be removed.
Dead code is shit that has a build rule which no other build rule in the entire repo depends on. Or private functions that have no callers, in an application that isn't built with any kind of reflection capability.
Or code that sits behind `if (false)`, after a `System.exit`, or past a null pointer dereference. Finding those kinds of things is one of the best parts of modern compilers; it's nice that they've extended that to the whole ecosystem.
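Real compilers do this far more thoroughly, but the idea of flagging statements that can never execute can be sketched in a few lines. This is a toy Python pass (using the stdlib `ast` module, with a hypothetical `sample` snippet) that catches code after an unconditional `return`/`raise` and code inside `if False:` blocks:

```python
import ast

def find_dead_statements(source: str) -> list[int]:
    """Return line numbers of statements that can never execute:
    code after an unconditional `return`/`raise`, or inside `if False:`."""
    dead = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Statements following a `return` or `raise` in the same block
        # are unreachable.
        body = getattr(node, "body", None)
        if isinstance(body, list):
            seen_exit = False
            for stmt in body:
                if seen_exit:
                    dead.append(stmt.lineno)
                if isinstance(stmt, (ast.Return, ast.Raise)):
                    seen_exit = True
        # The body of a literal `if False:` can never run.
        if (isinstance(node, ast.If)
                and isinstance(node.test, ast.Constant)
                and node.test.value is False):
            dead.extend(stmt.lineno for stmt in node.body)
    return sorted(dead)

sample = """
def f():
    return 1
    print("never runs")

if False:
    print("also never runs")
"""
print(find_dead_statements(sample))  # [4, 7]
```

A production compiler additionally tracks calls that never return (like `System.exit`) and dataflow facts (like a guaranteed null dereference), which this sketch ignores.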
Yes, the nice thing about blaze/bazel + sensenmann is that you can very accurately say "this code was not built into a binary that has run in the past 6 months".
Sometimes you still want it (e.g. python scripts that are used every once in a while for ad-hoc things and might go months between uses), but usually the right thing to do is productionize stuff like that slightly more (and also test it semi-regularly to make sure it hasn't broken).
Nah, there's stuff that scans the entire repo regularly for all kinds of interesting purposes, and that's ignoring the fact that `atime` isn't available, or a source of truth, in piper.
Conceptually, I believe this could be wrong in both directions: because build artifacts are heavily cached, you can build a transitive dependency of some file without actually reading the file (and potentially keep doing so for a relatively long period, though I don't think that happens in practice), and tooling will regularly look through large swaths of files that aren't necessarily run.
Because piper[1] isn't a normal filesystem. It's often accessed through a FUSE-based API, so it appears like a filesystem to some users some of the time, but it can also be accessed over an RPC API. So the concept of "atime" isn't really a fit, because you access the filesystem from a view based on your workspace (think of the git commit hash being part of the filepath), and the "real" underlying file isn't necessarily on your machine.
So under a reasonable definition of atime, the atime of most files is never, because on any given arbitrary commit, you don't access/build everything.
> you cache an entire file instead of just reading it
You don't cache the file, you cache the file's outputs, keyed by a hash of the file (or of all the files which are deps of a particular output). With bazel, you have a shared build artifact cache that can securely and reliably be shared across every user at a company with tens of thousands of engineers. If I build some target `:foo`, which corresponds to `foo.o`, generated from `foo.c`, but someone built `:foo` five minutes ago, then as long as `foo.c` hasn't changed (and you can check that without reading the file, because the fancy filesystem stores a hash of the file alongside the actual file), I won't actually read `foo.c` or go through the motions of building `:foo`. I can just pull `foo.o` directly from the object cache and use it in any dependencies, which means I can build my output without ever invoking a compiler, which is really cheap and fast.
You could argue that upon reading the hash you should update the atime, but that's extremely expensive since now you have to do writes instead of idempotent reads a bunch of times a second.
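The content-hash-keyed cache described above can be sketched as a toy model (the `ArtifactCache` class and "object code" payload are invented for illustration; real bazel caching is far more elaborate):

```python
import hashlib

class ArtifactCache:
    """Toy content-addressed build cache: outputs are keyed by a hash of
    their input files, so an unchanged input never triggers a rebuild
    (or even a read, if the filesystem already stores the file's hash)."""

    def __init__(self):
        self._store = {}        # hash of inputs -> cached output
        self.compile_calls = 0  # counts actual "compiler" invocations

    def _key(self, sources: dict[str, bytes]) -> str:
        h = hashlib.sha256()
        for name in sorted(sources):
            h.update(name.encode())
            h.update(sources[name])
        return h.hexdigest()

    def build(self, sources: dict[str, bytes]) -> bytes:
        key = self._key(sources)
        if key not in self._store:
            self.compile_calls += 1  # cache miss: actually "compile"
            self._store[key] = (b"object code for "
                                + b",".join(n.encode() for n in sorted(sources)))
        return self._store[key]

cache = ArtifactCache()
srcs = {"foo.c": b"int main() { return 0; }"}
out1 = cache.build(srcs)    # miss: invokes the "compiler"
out2 = cache.build(srcs)    # hit: no compiler invocation at all
print(cache.compile_calls)  # 1
```

The key property is exactly the one the comment describes: the second build never touches a compiler, and in a real system it wouldn't even read `foo.c`, only its stored hash.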
I mean, that's functionally what they do, just more accurately: it isn't read time but the time the code was last built into a binary, since that information is accessible, and since the object cache isn't filesystem-like and doesn't have an "atime".
You'd also still need to map back from a cached output like `foo.o` to every transitive dependency of it, of which there could be thousands. That can be done, but it requires invoking the build system, so it's nothing like just checking a timestamp on a file.
Nah, like conceptually they are both writing serialized bytes and like indexing them so they can conceptually be found easily and like read back. This is probably why you didn't like explain why they are different.
Different industries I guess. The new financial year comes around every 12mo. Good luck explaining to the accountants that you deleted their end-of-year reconciliation reports because they didn’t run them every 6mo.
I would expect that those binaries would be built and deployed constantly, even if they just woke up once a day to check whether it was mid-April or whenever the financial year ended. I would also expect that they would be integrated with other reporting that happened on a daily, weekly, monthly, and quarterly basis.
Do you really have an entire application binary that is only deployed and run once a year?
> Do you really have an entire application binary that is only deployed and run once a year?
I have ones which are run when the need arises. It could be a year, it could be ten. But I know they work and I don't need to touch them, nor would I want to delete them.
At Google there’s not really such a thing as code that never changes. Your dependencies will shift beneath you, even if it’s just the compiler itself. More likely though the libraries you use have been updated, a robot or two have come through and applied automated fixes etc etc.
So if we’re using it once a year, (we do have tools like that, maybe not strictly annually but certainly rarely) then you do indeed want it to be automatically built on some kind of recurring basis so it’s ready to go when you need it. And the tooling makes it (relatively) easy to have that happen.
If your once-a-year binary has any dependencies in common with others, then you need to at the very least be rebuilding and re-running the tests when those shared dependencies change, or else you'll find at the end of the year that it doesn't build, let alone work. And the compiler and stdlib count as shared dependencies.
The fact that you can so very straightforwardly declare that code which hasn't been used in 6 months is irrelevant is extremely revealing about what leads to the deplorable state of "modern" software development.
Google's response to Chesterton's Fence is: "if you liked it, then put a test on it".
I used to update the internal version of numpy at Google, and if people asked me to roll back after I made my update (having fixed all the test failures I could detect) and they didn't have a test, well, that's their problem. The one situation where that rule wouldn't apply is if I somehow managed to break production and we needed an emergency rollback.
I shed a tear when some of my old, unused code was autodeleted at Google, but nowadays my attitude is: HEAD of your version control should only contain things which are absolutely necessary from a functional selection perspective.
At Google, because everything is in the mono repo and nearly everything around it is in some level of flux, if you don’t have a test for your code, it gets broken pretty regularly.
Most teams are pretty understanding when they break you on some untested code, and will work with you to fix it, but the very first step in working together is for you to write a test that shows the issue.
The resultant culture is heavily biased toward writing tests before something bad happens to you.
I don’t think you understand Sensenmann fully based on this post. At Google basically everything in use has a Bazel-like build target. This means the code base is effectively a directed “forest”/tree-like data structure with recognizable sources and sinks. If you can trace through the tree and find included-but-not-used code by analyzing build targets, you can safely delete it. There are even systems (though not covering everything) that sample binaries’ function usage, which you could double-check against.
> why the code exists in the first place
If the code is unreachable, it’s at best a “possibly will be used in the future” and most likely simply something that was used but not deleted when its last use was removed (or a YAGNI liability).
If you can find a piece of code included in build targets but unreachable in all of them, it’s typically safe to delete. And it’s not done without permission generally, automation will send the change to a team member to double check it’s ok to delete/nobody is going to start using it soon.
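The build-graph reachability analysis described above can be sketched as a simple traversal: start from the root targets that actually ship, follow dependency edges, and anything never reached is a deletion candidate. The target names and `deps` map below are hypothetical:

```python
from collections import deque

def unreachable_targets(deps: dict[str, list[str]],
                        roots: list[str]) -> set[str]:
    """Toy build-graph analysis: BFS from the root targets (binaries that
    actually ship) and report every target that is never reached."""
    reached = set(roots)
    queue = deque(roots)
    while queue:
        for dep in deps.get(queue.popleft(), []):
            if dep not in reached:
                reached.add(dep)
                queue.append(dep)
    return set(deps) - reached

# Hypothetical repo: //app:main ships; //old:report has no dependents.
deps = {
    "//app:main": ["//lib:core"],
    "//lib:core": [],
    "//old:report": ["//lib:core"],
}
print(unreachable_targets(deps, roots=["//app:main"]))  # {'//old:report'}
```

Note that `//lib:core` survives even though `//old:report` uses it, because it is also reachable from a shipping binary; only the abandoned target itself is flagged.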
Yep. You would almost always know before the deletion is even committed or the change proposed whether it would cause a build or test breakage. And if the automated safety checks and manual review didn’t catch such a breakage, you’d easily rollback. The deleted code is always “still there”, you have to jump through a ton of justified hoops to do a true deletion, which Sensenmann won’t be doing.
It should also be clear that this article is about "deleting" code from an active project, not about "deleting" it entirely from the version control system. Thus, any code "deleted" through the described process could still easily be restored if necessary.
From reading the comments, I get the impression that there is a much deeper misunderstanding going on. This seems to be not so much (or not at all?) about identifying rarely run code paths, but about build units that have been abandoned on the dependent side.
So if there's a fancy onLocusSwarmAttack() that has never been run outside a test in a module to help applications deal with various kinds of datacenter outages called TenPlagues, it won't be in the crosshairs of Sensenmann. But if one day that module gets a successor (perhaps BeiWeltuntergangScheibeEinschlagen if it's made by the Zurich team?) and all code is supposed to eventually migrate to the new one, Sensenmann will tell you that TenPlagues has finally left the building when (if) it has happened.
Again, just my impression from the article and the discussions here (many seem to think it's about onLocusSwarmAttack()), probably resulting from a linguistic barrier between the language used at Google to talk about their monorepo and more conventional English. I guess they have done some quite deliberate shifts to help people "think monorepo"?
As a counter to Chesterton's Fence: sometimes the fastest way to understand what something does is to remove it and see what complains. You might get only 1 complainer for every 10 fences you take down. Putting that one fence back up takes much longer than taking it down, but the time saved from removing the other 9 unnecessary ones makes it a net win. And this time you can add documentation to the rebuilt fence.
A further counterpoint: if you follow the Fence proponents' logic to its conclusion, you can never remove any code, which is clearly an absurd situation.
I think the real logical flaw is that Fencers (as I will now call them) put the blame on the person who removes an apparently useless fence. But they're wrong. The real blame lies with the person who built the apparently useless fence and didn't put a sign on it explaining why it shouldn't be removed.
> A further counterpoint: if you follow the Fence proponents' logic to its conclusion, you can never remove any code, which is clearly an absurd situation.
No, that would only be the case if one would never understand any code. Chesterton's Fence consists of two parts ("understanding some code" as a precondition to "removing some code"), and leaving one or the other part out makes it some other thing than what Chesterton's Fence means.
> The real blame lies with the person who built the apparently useless fence and didn't put a sign on it explaining why it shouldn't be removed.
Chesterton's Fence is not about blame, or the past in general - it is about how to deal with things that are in the present. (Although I agree that the original fence-builder should have left a note or two!)
The principle says you can’t remove it until you understand why it was there. It’s more about doing due diligence.
I follow the principle when I remove code, and it’s a reason why good code comments are important. “Oh yeah, this was written for [x] which is no longer a thing, we can remove it now”
And not putting up a sign explaining why it's a necessary fence is a bad decision. Avoiding removing all unlabeled fences because they might, maybe, potentially be useful is also likely a bad decision if taken to its conclusion.
Chesterton's Fence is a reminder to actually get up, walk over to the fence and skim the multiple labels and notes the builders left there - because the big problem is usually someone looking at a thing that came before them, and assuming they understand what it was for, without bothering to actually check it.
The article is about the removal of _dead_ code. So, not a "fence across the road" - it's a fence that was moved to the side of the road, already cleared. The question is just whether to dismantle the fence or keep it there just in case.
+1. And, it's in version control forever. It's not as if it entirely disappears. Like one of the sibling comments mentioned, I only rarely reject Sensenmann CLs.
That's worth explaining: it's automated code deletion, but the owner of the code (a committer to that directory hierarchy) must approve it, so it's rare there's ever a false deletion.
You raise a good point, and I would answer it with agree and disagree:
Agree: Yes, you are correct; merely observing that a code path was never executed in the last 6 months is not the same as understanding why the code path was created in the first place. There is the quite real possibility of an infrequent event that appears just once every two years or so (of course, this should also be documented somewhere!).
Disagree: Pragmatically, we do have an answer if the code path was not executed after 6 months in production and test: with a very high probability, the code path was created either by mistake (human factor) or intentionally, for some behavior that is no longer expected of our software. To continue the fence metaphor with Sensenmann: after 6 months we know two things about the fence. 1) It plays no role in keeping out the stuff we want out (that was all done by other fences, which had contact with an animal at least once). 2) It might have been built to keep out flying elephants or whatever, but no such being was observed in the last 6 months (at least the fence made no contact with it, which it then should have!) and probably went away.
That said, having a human in the loop is probably a good idea.