Even when I worked at Canonical I never installed the latest LTS until at least its first point release in the summer. Maybe I'm part of the problem: if more people followed my lead, the initial release would get buggier and buggier.
Oh don't worry, no offense taken. It's an interesting problem: dogfooding is definitely a good way to catch problems, but if your release is unstable enough to break machines, you can end up with a wide swath of the company being unproductive. The stability of the overall release wasn't one of my core responsibilities, so naturally I gave more priority to the things that were; I needed a release that worked so I could get my job done.
There might be some cultural solutions to that. For example, if the company expects that employees are dogfooding, perhaps testing out a beta in a specific timeframe, it could be culturally expected that devs might have broken machines during that timeframe. If that was taken into account in both the schedule and the support paths, it would make things a bit easier. Honestly, though, even if that were the case, it would still be a hard sell for me. Quite simply: I don't like failing at my duties. I would need to be convinced that this was one of my duties, and I'm not sure how the company would pull that off. And that's ignoring other very practical concerns, such as the fact that a fair number of Canonical engineers only have one work machine. Having that machine down, depending on the definition of "down," can make it difficult to get support in the first place.
Why wouldn't you just run your stable work on a VM? Drop it onto an unstable host to play around, and if it breaks, your work environment is backed up, stable, and unimpeded. There are many creative ways to work and dogfood.
Perhaps I'm misunderstanding your question, but is it really dogfooding if every meaningful thing I'm doing is within a stable VM?
In general, a VM for my work doesn't change anything. It doesn't magically keep anything backed up: I still need to commit/push regularly, and so on. If the host explodes, I'm still down, even if what I'm doing is in a VM. I could immediately reinstall an older release and get set back up, but that isn't really dogfooding for an OS. I would need to report a bug and actually get the issue resolved, which generally means running in a broken state for a while to get there.
Beyond that, I've used a VM for my day-to-day work in the past, and honestly I found it maddening. I couldn't fully utilize my computer without starving the host of resources, and there were a thousand other papercuts relating to hardware access and general instability. I try to avoid that development story these days.
As you can see, my concerns here have nothing to do with losing code or data. They relate to lost productivity and not satisfying my primary duties.
For complex software, testing all the corners in real life is impossible.
You'd get some value out of people running it on VMs, some people running it on the same old hardware, some people running it on cutting-edge hardware, some people running it for extended periods and doing dist-upgrades, some people constantly reinstalling it (even within a VM).
There are so many places software can fail, and things that aren't exercised often seem especially prone to failure: for example, a website where logging in works but creating a user fails, and nobody notices because all the employees already have accounts.
I think you're misinterpreting my response. Of course there's value in testing operating systems within VMs. In fact, a solid chunk (if not most) of the Ubuntu install-base is VMs, believe it or not. I wasn't talking about that at all, but rather responding to the idea that, if I used a VM as my development platform, my problems would go away. In reality, in that context, it really changes nothing.
Surely it would be no problem to give developers a second machine for the testing period, so they could work with that one every day, and if it breaks, the stable one is just one git checkout away...

But no, I've actually never seen this. Apparently it's better to lose people who leave over their employer messing up their work hardware than to spend 3k every 2-3 years on a second machine.
Extending the metaphor, this behavior would be like mainly feeding your dog store-bought food and only occasionally putting down a bowl of home-made: it might eat it sometimes, but you wouldn't really know.
Yes, that might actually help, but I can imagine the cost/benefit analysis being difficult to pull off: the costs are very real, but the benefits are less tangible. To be clear, I don't know of anyone who left over a bad upgrade. I think a lot of folks just ran like I did, hopping from LTS to LTS after the point release (or only once their LTS was nearing EOL).
There certainly used to be a strong push to have internal people use the product a lot more during the development cycle. There was also a real desire to make the devel version actually usable. That fell by the wayside, sadly.
Having your developer workstation break while you have a backlog full of stuff to do would absolutely make you less motivated to run the development release, especially if you're not on the desktop team.
I suspect it was a lot easier to feel like that was one's duty when the company was smaller and you were closer to every piece of it. It was also probably easier to get issues resolved when you knew exactly who to talk to. Maintaining that culture as you grow is probably quite the challenge!