Why must it be either-or? The web is another medium for the journalism. There's no reason to assume performance was of greater importance than our core mission. The point of that slide is to explain why this redesign had a performance goal at all. Perhaps the slides don't explain this point well, but I think you're reading too much into it.
Exactly. To compound the issue, older content may not fit into our current schema, so there are other data issues that need to be solved. Adding hand-coded HTML to pages was quite common 10 years ago, so parsing that into the structures we work with today isn't always straightforward and is difficult to automate.
As a disclaimer, this falls under the "Back then" section of the talk, which is an overview of how things used to work. We no longer do things this way.
Publishing a page means running content from a CMS through a templating system. However, the time spent executing templates isn't the only factor in the duration of a "publish". A publish also includes the compilation step the slides refer to (which actually included a preprocessor step as well) and delegating to a service that copies the resulting page to disk and ensures the write succeeds in all data centers. For data consistency and system monitoring, we essentially treat that entire process as an atomic action and wait for all parts to finish. Additionally, since "publishing" is a core process for us, we avoid doing massive publishes that might put at risk the systems involved in successfully publishing current articles. So increasing the number of publishes for the sake of pushing code is considered too risky. Yes, there are ways of mitigating that risk, but dealing with this legacy problem once and for all is a better path forward than scaling up this solution.
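For anyone who wants the shape of that in code, here is a minimal sketch in Python. It is not our actual system: the function names, the data-center list, and the in-memory dict standing in for disk are all hypothetical, and the preprocess and compile steps are trivial stand-ins. It only illustrates why a "publish" takes longer than template execution alone, since it succeeds only once every data center has confirmed the write.

```python
from concurrent.futures import ThreadPoolExecutor
from string import Template

# Hypothetical data-center names; the real topology is not described here.
DATA_CENTERS = ["dc-east", "dc-west"]


def render(template: str, content: dict) -> str:
    """Templating step: merge CMS content into a page template."""
    return Template(template).substitute(content)


def preprocess(page: str) -> str:
    """Stand-in preprocessor step (here it only trims whitespace)."""
    return page.strip()


def compile_page(page: str) -> str:
    """Stand-in compilation step (here it only appends a newline)."""
    return page + "\n"


def write_to_data_center(dc: str, path: str, page: str, storage: dict) -> bool:
    """Stand-in for the copy service; 'storage' simulates each data center's disk."""
    storage.setdefault(dc, {})[path] = page
    # Read back what was stored to confirm the write succeeded.
    return storage[dc].get(path) == page


def publish(path: str, template: str, content: dict, storage: dict) -> bool:
    """Run every step and report success only if all data centers confirm."""
    page = compile_page(preprocess(render(template, content)))
    with ThreadPoolExecutor() as pool:
        # Writes run in parallel, but we wait for all of them before returning.
        results = list(
            pool.map(
                lambda dc: write_to_data_center(dc, path, page, storage),
                DATA_CENTERS,
            )
        )
    return all(results)


storage = {}
ok = publish(
    "/2014/story.html",
    "  <h1>$headline</h1>  ",
    {"headline": "Hello"},
    storage,
)
```

Even in this toy version, the slowest data center sets the duration of the whole publish, which is the cost that adds up when you republish a large archive.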
Thank you for this detailed explanation. I assumed there was some extenuating complexity to the process, and was simply wondering what it was that prevented the original publishing system from being adapted to be faster.
I'd first like to say that this is the deck from a presentation at Velocity NY last week. As with most talks, separating the slides from the presenter can make the context difficult to interpret. I did make an effort to have my slides provide useful information without me presenting them, but I acknowledge that I may not have done enough in that regard. I also received feedback from people present that there were too many bullet points and my font was too small. Can't please everyone, I guess. But if you have a link to what you consider the "perfect" slide deck where unambiguous context is maintained without video of the talk, I'd love to study it in order to improve.
Other replies will be directed at the specific comment thread.