They do say that they adjust for "patient characteristics", although they don't go into much detail. It's also important to note that physicians who see a large volume of patients don't have a higher mortality rate among their patients. It looks like a real effect, maybe from "staying behind" or maybe simply from not being at the "top of their game".
It's difficult to say for certain what's going on. I'm sure the adjustments are very crude compared to what case assignment decisions are actually based on, for example, so I'm sure their adjustments wouldn't really account very well for patient illness severity or other subtleties along those lines.
Another possibility is that younger staff are more likely to be questioned about things. "Have you thought about X?" causes them to rethink something and make a revision. If older staff are just assumed to know what they're doing, they might be questioned less.
The fact that differences weren't present among physicians with a large volume also makes me wonder if this is just a fishing expedition that wouldn't replicate. Not to cast aspersions on the authors; just to say that if you slice up any dataset enough you can find something.
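To put a rough number on the "slice it enough" worry, here is a minimal simulation sketch. It is not based on the study's data or methods. It assumes there is no true effect anywhere, that each subgroup gets its own independent test at a 0.05 threshold, and that p-values are uniform under the null; the function names and subgroup counts are made up for illustration.

```python
import random


def chance_of_spurious_finding(n_subgroups, n_studies=2000, alpha=0.05, seed=0):
    """Simulate studies with NO true effect and return the share in which
    at least one subgroup still comes out 'significant'.

    Assumption: subgroup tests are independent, so each null p-value is an
    independent draw from a uniform distribution on [0, 1).
    """
    rng = random.Random(seed)
    studies_with_hit = 0
    for _ in range(n_studies):
        p_values = [rng.random() for _ in range(n_subgroups)]
        if min(p_values) < alpha:
            studies_with_hit += 1
    return studies_with_hit / n_studies


def expected_chance(n_subgroups, alpha=0.05):
    """Closed-form counterpart under the same independence assumption."""
    return 1 - (1 - alpha) ** n_subgroups


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        simulated = chance_of_spurious_finding(k)
        print(f"{k:>2} subgroups: simulated {simulated:.2f}, expected {expected_chance(k):.2f}")
```

Under these assumptions the closed form gives about a 40% chance of at least one spurious "finding" with 10 subgroups and about 64% with 20. Real subgroups overlap and are correlated, so the actual figure for any given paper would differ, but the direction of the problem is the same.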