Ironically, there is a straightforward answer: arc distance. What angle does the font (or pixel, or whatever you're trying to scale) subtend from the position of your eye? This always works. It's sometimes hard to define for some devices (e.g. TVs, which might be used at widely varying distances), but even then the world has come up with standard conventions (e.g. assume you're 10' from your 40" TV, assume your phone is about 12" away, assume the controls in your car are at a 30" arm length...).
And, of course, it's not implemented anywhere. So we all suffer with "dpi" and "px".
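To make the arc-distance idea concrete, here is a minimal Python sketch. It assumes the usual geometry (an object of a given size, viewed head-on, subtends 2·atan(size / (2·distance))) and uses the conventional viewing distances mentioned above. The function names and the `VIEWING_DISTANCE_IN` table are made up for illustration and are not any real API.

```python
import math

# Conventional viewing distances from the comment above, in inches.
# (Hypothetical table; 10' is written as 120 inches.)
VIEWING_DISTANCE_IN = {"tv": 120.0, "phone": 12.0, "car": 30.0}


def subtended_angle_deg(size, distance):
    """Angle (degrees) subtended at the eye by an object of `size`
    seen head-on from `distance`. Both must use the same unit."""
    return math.degrees(2 * math.atan(size / (2 * distance)))


def size_for_angle(angle_deg, distance):
    """Inverse of the above: physical size needed at `distance`
    to subtend `angle_deg` degrees."""
    return 2 * distance * math.tan(math.radians(angle_deg) / 2)


def glyph_height_in(angle_deg, device):
    """Physical height (inches) a glyph needs on `device` so that it
    subtends `angle_deg` at that device's conventional distance."""
    return size_for_angle(angle_deg, VIEWING_DISTANCE_IN[device])


if __name__ == "__main__":
    angle = 0.3  # an arbitrary example angle in degrees, not a standard
    for device in VIEWING_DISTANCE_IN:
        print(device, round(glyph_height_in(angle, device), 3), "in")
```

For a fixed angle, the required physical size grows linearly with viewing distance, so the same text comes out ten times taller on the TV at 10' than on the phone at 12".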
That's fun. Obviously that's not how it works: 1px is 1 physical pixel everywhere I've ever seen. Think of how much would break if something decided to do this "right" ...