Things You Should Never Do, Part I (joelonsoftware.com)
81 points by server102 on Feb 23, 2012 | hide | past | favorite | 40 comments


If you really want to do a rewrite, first:

- Write comprehensive test cases and run your current product against them. Have these also benchmark the code if it's performance-critical.

- During the rewrite, slavishly adhere to the test cases. If an undesirable difference between old and new is discovered, write another test case.

- If possible, break up the original product into smaller modular sub-projects you can rewrite independently.

Most problems with rewrites come from starting from scratch - slowly replacing a code base from the inside is a much better way to do it.
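The parent's first step is essentially a golden-master (characterization) test: record the old system's answers, then hold the rewrite to them. A minimal sketch in Python, where `legacy_price` and `new_price` are hypothetical stand-ins for whatever routine is being rewritten:

```python
import json

def legacy_price(quantity, unit_cost):
    """Stand-in for the legacy routine being rewritten (hypothetical)."""
    return round(quantity * unit_cost * 1.08, 2)  # e.g. price plus 8% tax

# Step 1: record the old system's answers as golden-master cases.
CASES = [(1, 9.99), (3, 2.50), (100, 0.10)]
GOLDEN = {json.dumps(case): legacy_price(*case) for case in CASES}

def new_price(quantity, unit_cost):
    """The rewrite; it must reproduce the golden answers exactly."""
    return round(quantity * unit_cost * 1.08, 2)

def check_rewrite():
    """Return every case where the rewrite disagrees with the recording."""
    mismatches = []
    for key, expected in GOLDEN.items():
        args = json.loads(key)
        got = new_price(*args)
        if got != expected:
            mismatches.append((args, expected, got))
    return mismatches

assert check_rewrite() == []
```

In a real migration you would serialize the golden cases to disk from a run of the old binary, and wrap the performance-critical cases in a timer so regressions show up alongside behavioral differences.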


> If possible, break up the original product into smaller modular sub-projects you can rewrite independently.

This.

Many projects can be done this way. In particular, the process of figuring out the set that can be replaced at any given time can aid in the refactoring.


You should, however, be careful with the test cases -- if you want to change the internal structure, unit tests can lock you into the same interface, and unless your codebase is already well modularized, that will make architectural changes difficult.


Or come up with a plan to gradually replace key areas with the new system. I have a real-world example of what Joel describes: I came up with a plan to do exactly that, it was rejected, a total rewrite was done instead, and it was a disaster. The company has never really recovered.


If you rewrite your code base to do the same things as the old code base was doing, yes, you're doing it wrong. If you rewrite your code base to do different things, you might very well be very right. (Depending on whether the market is more interested in those new things than the old things).


I agree. I actually just threw out a bunch of code and started with a fresh base and a much cleaner OOP structure, and I could not be happier with my results so far. Things I was struggling with before are now super simple because everything is properly segmented. The whole process only took two days, as many of my classes could be reused and I just needed to fix how they were interacting. Then again, I was just correcting my own quick-and-dirty code; I don't know if I would have done the same with a production application, where it would be safer to work on both in parallel until the new one was in good enough shape to cleanly replace the first.

An example of a rewrite gone bad is Final Cut Pro X. Apple really learned a lesson with that fiasco.


I don't think Joel is arguing against throwing out old code here and there, especially if it only takes two days to replace it. I think most devs would view that as an inevitable step on large projects -- especially on projects heading into uncharted waters, where the chances of getting things right the first time are almost nil.


This is risky territory for a whole set of reasons. Those different things you're adding might change the way customers use your product, and more often than not, their opinion of how useful the changes are will differ from yours. Often, products lose features in such a rewrite. Those features may seem insignificant to you, but they may mean the world to existing users.


Which is why in our case we're keeping both the old and the new product around.


We did precisely the same thing at our company about 18 months ago. Our original codebase just couldn't scale. It was a great MVP platform, and we learned a great deal from it, but it simply couldn't handle load, sophisticated billing options, or any reasonable expansion of the product functionality.

That said, we left the new and old systems running in parallel (all new users were directed to the new platform) for over a year before we even informed users we were shutting it off, then gave them another six months to migrate. And we never purged the old data; it's in cold storage in case someone ever comes looking for it.

Rewriting the codebase isn't something you should take lightly, but "never" is a pretty damn strong word.


Then it seems like you're not really rewriting as Joel defines it, you're launching a new product (it does new things, and it doesn't replace the old one). I guess it then becomes a question of effort spent in cannibalizing your own product vs. the income the new version brings, which is a very different equation for Basecamp and the products Joel talks about.


If everyone followed this advice, most of my professional life today would consist of tweaking hundreds of bandaids for PostNuke modules and Classic ASP pages done in Visual Basic.

Technologies change. More importantly, requirements and visions change as well. Blindly throwing money at an existing system without considering alternatives is a really, really bad idea. And while you can argue that it's possible to refactor anything into anything else, you can't argue that it will always be cost-effective.


Any suggestions for what to do in this situation?

I was unfortunate enough to work at a place with a 10+ year old code base that had been through 3 or 4 software architects over the years, each with their own vision. The code had been refactored many times already. It consisted of a buttload of singletons that were all initialized in seemingly random places. It was impossible to start or shut down the app deterministically. It was massively multithreaded (in most cases for no good reason). And it had a horrible implementation of a shared cache (because many instances of the app ran across many computers that needed to share state).

It had no unit tests. Whenever you changed anything, it would break something entirely different. To me it was an example of everything going wrong... many times.

The best part was that it had to run for weeks at a time because it was a critical application (for fire, police, and military services). It didn't. The operators were instructed to reboot all the machines every day, but it wouldn't start up properly every time, so they might have to do it multiple times. I did my best for a year to clean up the critical parts of it. I was contracting there, but by the end they had offered me a full-time position.

I finally cracked when the original "software architect" came back after being fired from his new position. He was annoyed that his code had been changed over the years, so he started putting back in place everything he had done before he left. On top of that, we had an angry complaint from one of the customers about a hostage situation that could have turned out very badly because of this crappy system.

We tried to reason with the software manager, suggesting that we assign one or two people to start from scratch and carry over code that could be reused. It would take some time to reach the same feature set as the old application, but it had become so difficult to add new features to the old one that it needed to be done.


If you're dealing with a complex system that needs a lot of work, I would recommend trying to break it down into subsystems and handle the subsystems one at a time.

The original article does assume that you're dealing with a system that more-or-less works. If your application flatly doesn't do the right thing, then a big rewrite from scratch may really be the best choice - there's nothing to save.

But spaghetti code doesn't just fall from the sky; it occurs because of politics, stubbornness, and bad processes. Over the 10+ years it took to make the code base, is it really the case that nobody but you noticed the problem? It's more likely that there was a lot of pressure going in the opposite direction, and the other maintainers didn't know what to do either.

Dealing with subsystems helps you handle both problems. Management might not be willing to let you rewrite the whole thing, but if you said, "Let's just fix the boot-up process for the Cyclotron 4000 resource. Nothing else changes, just the Cyclotron boot." you might be able to get permission. In a badly-maintained project, it's hard to replace all the instances of one service - that's what makes it 'badly-maintained' - but it's still easier than dealing with the whole system in one go. And, of course, instead of 'fixing' the Cyclotron you're actually rewriting it with a new, non-wacko Cyclotron service.

Then you go back to your manager and say, "It was rough, but the Cyclotron 4000 no longer blocks the start-up. Let's get the next thing on the list." Not only do you have a slightly better project, you also have better credibility with management, which makes it more likely you'll be listened to when you say a certain technical measure is necessary. Next, fix the subsystem that talks to the Cyclotron - and so on. Pick the right time to introduce tests, code review, and all the rest.
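The subsystem-at-a-time approach described above is often called the strangler pattern: put a thin facade in front of each subsystem so callers never know which implementation is live, then flip them over one by one. A hypothetical sketch (the `Cyclotron` names just echo the comment's example):

```python
class OldCyclotronBoot:
    """The wacko legacy boot path, kept until the new one is trusted."""
    def start(self):
        return "old-boot"

class NewCyclotronBoot:
    """The rewritten, deterministic boot path."""
    def start(self):
        return "new-boot"

class CyclotronFacade:
    """Callers depend only on this; flipping the flag swaps the backend."""
    def __init__(self, use_new=False):
        self._impl = NewCyclotronBoot() if use_new else OldCyclotronBoot()

    def start(self):
        return self._impl.start()

# Flip one subsystem at a time; everything else keeps calling the facade.
assert CyclotronFacade(use_new=False).start() == "old-boot"
assert CyclotronFacade(use_new=True).start() == "new-boot"
```

The design point is that the flag (or config entry) is the only thing that changes per subsystem, so each flip is small, reviewable, and reversible.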

Remember that just as you had the experience of working with the terrible code base, your managers had the experience of working with the previous 3 or 4 software architects who "had their own vision" and delivered a product that doesn't start up reliably - I don't think it's surprising that there was no longer the political will to assign people to refactoring or rewriting tasks. Bad architecture uses up the political will needed to approve good architecture, because it makes all "architecture" tasks look bad. You need to regard your reputation as a finite, under-supplied resource, just like your time and budget, and plan how to get more of it.

From your use of the past tense, it looks like you're no longer in that situation (good for you!)... but that would be my advice if you see a similar situation in the future. I've used this plan in my own career to rewrite a (much smaller, only moderately troubled) project piece by piece over the course of a year.


Good response! Another part of the problem was that the company preferred contractors over full-time staff, and it had very high turnover because of this. Many contractors had already replaced subsystems with their own versions over time - there had already been many implementations of the Cyclotron subsystem :). To be honest, I probably ended up being one of them. Working there was too stressful, and the rewards for trying to achieve more were not recognised.

The place had a reputation for hiring highly motivated engineers and burning them out, just to replace them with others. When I left, they hired a very talented guy I had worked with for a couple of months. He left recently, and the cycle begins again!


You could probably replace all those singletons with a sane DI system in a week; why couldn't you?


It would have been possible, but probably not in a week. The code base was written using Qt, and half the singletons would be lazily created at seemingly random places through hundreds of signal/slot calls (sometimes through the event queue if the call came from another thread).

The singletons were just one of many problems. I remember there was a "database.cpp" file which handled all access to the SQL database. It was over 10k lines of code and had hundreds of structs to represent all the tables in the system. The person responsible for it ensured his job security by being the only one who worked with that source code.
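For reference, the DI change suggested above is mostly mechanical once you find the call sites: the hidden lazy-singleton lookup becomes a constructor parameter. A toy Python sketch (the real codebase was Qt/C++, and all names here are made up):

```python
class Database:
    """Hypothetical dependency, standing in for the 10k-line database.cpp."""
    def query(self, sql):
        return f"rows for {sql}"

# Before: a hidden global, created lazily at first use.
_db = None

def get_database():
    global _db
    if _db is None:
        _db = Database()
    return _db

class ReportServiceBefore:
    def run(self):
        # Hidden dependency: creation time and order are implicit.
        return get_database().query("SELECT 1")

# After: the dependency is passed in, so startup order is explicit
# and tests can substitute a fake.
class ReportServiceAfter:
    def __init__(self, db):
        self._db = db

    def run(self):
        return self._db.query("SELECT 1")

assert ReportServiceAfter(Database()).run() == "rows for SELECT 1"
```

The hard part, as the comment notes, isn't the pattern itself but untangling the lazy creation scattered across signal/slot calls and threads before you can even see all the call sites.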


Wow.

This makes me think that Java is better for this kind of big "enterprise" application, not because it's faster or more enterprise-y or some such, but because it's more limited, and therefore fewer things can go wrong.

I worked in two banks, developing web banking in one and middleware services in the other, and while there were some strange things (what's with banks and XML, really?), there was nothing that terrible there.

But then I'm pretty sure that someone will share their Java horror story.


I could write a book about how badly everything was set up :). This place loved to abuse XML: it had > 2000 XML files configuring the system. Objects in code were "generic" and instantiated based on XML configuration. You could inherit configuration from base XML files. It was basically impossible to determine where any given piece of the system was configured.

As bad as everything was, though, it was a fantastic learning experience for me.


While I mostly agree with this, I think over time you can see some reasons for a rewrite. Two of the bug fixes used as examples may no longer be relevant given hardware progress: "Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle."

Were the low-memory issues due to the limitations of the hardware at the time, and are they no longer relevant? Do many computers still have floppy disks, and would you still want to support them?

I think re-writes should be avoided unless you have a full understanding of the reasons for all the one-off patches. You should also have an idea of what you will lose/gain by getting rid of them.


Given the trend towards bundling and a changing internet population (away from savvier early adopters and towards people more inclined to stick with the defaults), Netscape was always going to have a really tough time competing. Microsoft changed the game in browser distribution, and competing effectively meant that your product had to succeed in several areas; Netscape's succeeded in none of them. A string of inept management and product decisions didn't help either.

I would interpret Joel's point as: Not shipping something for years and then shipping unstable crap is a bad idea. In this case, it ruined people's already shaky confidence in Netscape. Rewriting to improve code isn't always a bad thing - in this case it did end up paying off, just too late to benefit Netscape. Firefox probably wouldn't have happened if the switch of codebases hadn't been made. Netscape being out of the picture by this time was a good thing - since Netscape was institutionally incapable of shipping quality software it would just have screwed it up.


LOL @ "One project I worked on actually had a data type called a FuckedString."

On another note, things like Eclipse's refactoring tools have exponentially increased my ability to keep my codebase clean and easy to read. Renaming variables to stay consistent etc is extremely easy using these tools and I have to wonder why people still struggle with issues like this.


I don't think this advice fits well with agile approaches to building software.

I've spent about three years (on and off) building a Django application for a team where I work and, over time, new functionality has been bolted on as and when the needs of the team changed. We've also gone through one fairly major model change that seriously wrecked the sense of symmetry in the v1.0 I delivered all those years ago.

A bit later this year I will be making a whole lot more changes to the application and I've decided that rather than going from a v1.6 to a v1.7 this is a great time to throw away the majority of the code and re-build from the ground up so that the v2.0 is properly architected to meet the needs that exist today.

It's going to be a lot of work but I think it's worth it in terms of making new features easier to add and reducing the maintenance burden associated with a structure that has become overburdened with technical debt.


Agile means different things to different people, but to me "the big rewrite" sounds pretty un-agile. You end up making a lot of architecture decisions all at once and go a long time between working releases.

If your business needs have completely changed then it might be worth it, but at that point it's no longer "the big rewrite" but instead just writing a new application. (that isn't required to fulfill the requirements of the old app)


Reading poorly written English is hard. When the author can't keep their focus and weave together the structure of several disparate arguments into a cohesive essay, it can be very difficult to understand their intentions. It can be near impossible to follow them to their conclusion and decide what it is they're getting at if they muddy the waters with poor writing.

The same is true of software. Well written software is a joy to read and expresses an idea clearly. I love reading code like this. I love writing it even more.

If we want to write code that is easy to understand we should spend more time reading code than simply writing it all the time. Practice makes perfect but introspection reveals the path to self improvement. Absorbing the good ideas of others and filtering out the bad is part of being a good writer and is especially important to the programmer. Read more code!


The problem is that everyone has a different idea of what's nice to read. Some people are more comfortable with an algorithm in a for loop, while others would prefer to write it recursively. Who's to say which is objectively nicer to read?


> "Who's to say which is objectively nicer to read?"

The one that's written clearly. An ugly, obfuscated loop is worse than clearly-written and commented recursion and vice-versa.

If the most pressing readability concern in your code is whether an existing loop would be more easily-read as recursion, you're already done. The code is clean enough. Move on.


Relativism has to be relativised. Some logic is better expressed as a loop; other logic fits better as recursion. A guideline is to use the simplest correct approach, which means, for example, that if the complexity of the result is similar, the loop is preferable.


This is an instance of the Fence Fallacy:

http://minx.cc/?post=320257


> then they made it again in rewriting Quattro Pro from scratch and astonishing people with how few features it had.

This reminds me of Final Cut Pro X. From an abstract code-base standpoint, it was much improved. Many people on the nets (who probably never used FCP) tried to cover for it with excuses like "it's a 1.0 product" or "just don't upgrade", but from a user standpoint, the rewrite left behind enough features that it became unusable to many people (myself included) who owned, used, and loved the previous version.

As developers, we need to do projects for ourselves (like code cleanups and rearchitecting), but we need to understand the user impact first.


I may not understand the topic very deeply, but IMHO the FSF and Torvalds started a (very) successful movement that is all about contributing and sharing code. Many years later, it is providing us with code of very high quality, used in a number of complex and critical environments. I don't think FLOSS developers tend to think that others' code is a mess... but there is no mention of open source in the article.


Remember, this article was written about 12 years ago.

That said, there are times when tossing out the whole thing and starting over is absolutely the right thing to do. You can identify these apps by the 8x5 sticky index cards describing defects filling a 9'x15' wall, by application instability being a feature, and by the "Big Ball of Mud" anti-pattern definition using the app as its case study.


I understand his point with gigantic messy software, but for a lot of software you can actually learn from the problems of the first version and do it better the second time.

It is very situation-dependent whether the messiness is a result of bug fixes that will have to be reapplied, or of the system having been built when no one had a clue how to build it.


"Don't rewrite" seems like pretty good advice.

Can you make a case for "write a new product that will (eventually) replace the old one?" After all, you know a lot more now from having written the first one. Write one to throw away, and all that.

Or if you write an X, are you forever discouraged from ever producing another, better, more relevant X?


This article was brilliant, the kind that seems obvious once you read it. It made me a fan of Joel's ever since.

My unfortunately long experience with maintaining code has taught me how true it is.

I would add that most often the maligned code is the only existing documentation of a company's workings.


Isn't 37signals rewriting Basecamp (currently referred to as "Basecamp Next")?

Perhaps in some cases, rewriting your software can be a competitive advantage.


To be fair, in this case it's the same people doing the rewrite as did the original version, so I guess the theory is that the hard-won knowledge has not been lost and can be integrated into the new code.



The second-system effect refers to the tendency of small, elegant, and successful systems to have elephantine, feature-laden monstrosities as their successors. The term was first used by Fred Brooks in his classic The Mythical Man-Month.[1] It described the jump from a set of simple operating systems on the IBM 700/7000 series to OS/360 on the 360 series.

---

FWIW I hate the posting of a link without describing the content in some way.


Talk about reuse: the article is from 2000.



