>Scientific logic is proving things by losslessly compressing statements to their axioms. Commonsense logic uses lossy compression, which makes it less accurate in edge cases, but also less brittle, more efficient, further reaching and more stable in most real-world situations.
Knowledge includes insight into the "why" part of the mechanism: why does the protein behave in this way? This can lead to generalizations which go beyond answering different questions of the same sort ("what about this protein, then?") to questions of a different form whose answers are underpinned by the mechanism, for example "how does that structure evolve over time?" This is closely related to the ability to make analogies using the knowledge: "if proteins react that way within their own molecule, then when they meet another molecule they should react this way." Also, the knowledge only becomes knowledge when it sits in a framework that "can know", which is to say that the thing using it can handle different questions and can decide to create an analogy using other knowledge. For AlphaFold2 that framework is DeepMind, but of course I don't know enough to tell whether they and it can know things about proteins in the way I described, or whether they "just" have a compressed form of the solution space. I suspect the latter.
Being able to extrapolate beyond mere variations of the training data.
EDIT: A simpler example might be helpful. We could, for example, train a network to recognize and predict orbital trajectories. Feed it either raw images or processed position-and-magnitude readings, and it outputs predicted future observations. One could ask, "does it really understand orbital mechanics, or is it merely finding an efficient compression of the solution space?"
But this question can be reduced in such a way as to be made empirical, by presenting the network with a challenge that requires real understanding to solve. For example, show it observations of an interstellar visitor on a hyperbolic trajectory. ALL of its training data consisted of observations of objects in elliptical orbits exhibiting periodic motion. If it is simply matching observations to its training data, it will be unable to conceive that the interstellar visitor is not also on a periodic trajectory. But if, on the other hand, it really understood what it was seeing, then it would understand (as Kepler and Newton did) that elliptical motion requires velocities bounded by an upper limit, and that if that speed is exceeded the object will follow a hyperbolic path away from the system, never to return. It might not arrive at these notions analytically the way a human would, but an equivalent generalized model of planetary motion must be encoded in the network if it is to give accurate answers to questions posed so far outside of its training data.
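A toy version of this test can even be run directly. The sketch below (numpy only; the particular orbit, basis size, and hyperbolic curve are arbitrary choices of mine, not anything a real system would use) fits a model that can only express periodic motion to elliptical-orbit observations, then asks it about a hyperbolic escape. Because everything in its hypothesis class is bounded and periodic, it nails the training regime and fails badly on the visitor:

```python
import numpy as np

# Training data: one coordinate of an elliptical (periodic) orbit.
t_train = np.linspace(0, 4 * np.pi, 400)
x_train = 1.5 * np.cos(t_train)  # hypothetical ellipse, semi-major axis 1.5

# A "model" that can only represent periodic motion: a small Fourier basis.
def fourier_features(t, n_harmonics=5):
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols += [np.cos(k * t), np.sin(k * t)]
    return np.stack(cols, axis=1)

w, *_ = np.linalg.lstsq(fourier_features(t_train), x_train, rcond=None)

def predict(t):
    return fourier_features(t) @ w

# In-distribution: the fit is essentially exact.
in_dist_err = np.max(np.abs(predict(t_train) - x_train))

# Out-of-distribution: a hyperbolic escape, unbounded in x. Every prediction
# the model can make is bounded, so the error grows without limit.
t_test = np.linspace(0, 4 * np.pi, 400)
x_hyper = 1.5 * np.cosh(0.5 * t_test)
ood_err = np.max(np.abs(predict(t_test) - x_hyper))

print(f"in-distribution max error: {in_dist_err:.2e}")  # tiny
print(f"hyperbolic max error:      {ood_err:.1f}")      # hundreds
```

A real network is not literally a Fourier fit, of course; the point is only that a hypothesis class shaped entirely by periodic training data cannot represent the escape, however well it interpolates.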
How you translate this into AlphaFold I'm not so certain, as I lack the domain knowledge. But a practical ramification would be the application of AlphaFold to novel protein engineering. If AlphaFold lacks "real understanding", then its quality will deteriorate when it is presented with protein sequences further and further removed from its training data, which presumably consists only of naturally evolved biological proteins. Artificial design is not as constrained as Darwinian evolution, so de novo engineered proteins are more likely to diverge from AlphaFold's training data. But if AlphaFold has an actual, generalized understanding of the problem domain, then it should remain accurate for these use cases.
I mean, perhaps I am not entirely sure myself. I imagine the solution space to this problem is some very complicated object, let's say an algebraic variety/manifold/configuration space, but obviously of low enough dimension that it can be picked out fairly nicely from some huge ambient space.
For example, specific points on this object are folded proteins. I suppose the question is then how well this gets encoded: does the network know about "properties" of this surface, or is it more like a rough point cloud, where you have sampled enough and it does some crude interpolation? But maybe that interpolation does not respect the relevant properties of the object. Maybe there are conservation laws, symmetry properties, etc. which are actually important, and by not respecting them you have just produced garbage.
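A minimal illustration of that failure mode, using unit vectors as a stand-in for the solution manifold (the norm-1 constraint plays the role of a conservation law here; slerp is just one structure-respecting alternative, my choice for the example):

```python
import numpy as np

# Two valid states on the "solution manifold": unit vectors. The constraint
# norm == 1 stands in for a conservation law or symmetry property.
p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])

# Crude point-cloud interpolation: the midpoint falls off the manifold.
mid = 0.5 * (p + q)
print(np.linalg.norm(mid))  # ~0.707, constraint violated -> "garbage"

# Interpolation that respects the structure (slerp along the sphere):
theta = np.arccos(np.clip(p @ q, -1.0, 1.0))
slerp_mid = np.sin(0.5 * theta) * (p + q) / np.sin(theta)
print(np.linalg.norm(slerp_mid))  # 1.0, still on the manifold
```

Both midpoints are "between" the samples; only one of them is a physically valid state.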
So I think it is important to know what kind of problem you are dealing with. Imagine a long-timescale n-body problem with lots of sensitivity to initial conditions. Maybe in a video game it doesn't matter if there is something non-physical about what it produces, as long as it looks good enough.
Maybe this interpolation is practical for its purpose.
But I think we should still be careful and question what kind of problem it is being applied to. Maybe it's more of a complex-versus-merely-complicated question.
> does it know about "properties" of this surface, or is it more like a rough kind of point cloud because you have sampled enough and then it does some crude interpolation
Say that there existed some high-level property such as "conservation of energy". A "knowledge system" which learns about that property would be able to answer any questions related to it after reducing to a "conservation of energy" problem. Is the same true for NNs? The way folks talk about them, they sound like they can compress dynamically, and would therefore be able to learn and apply new high-level properties.
Also, do NNs have "rounding errors"? We have confidently learned that energy is conserved, but would NNs which never had that rule directly encoded understand conservation as "exactly zero", or "zero with probability almost 1", or "almost zero"?
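A loose analogy from numerical integration (integrators, not NNs, but it makes the "exactly zero" versus "almost zero" distinction concrete): explicit Euler has no conservation structure built in and its energy error grows without bound, while symplectic Euler, whose update respects the underlying geometry, keeps the error bounded near zero without ever encoding "energy is conserved" as an explicit rule:

```python
# Harmonic oscillator, H = (q^2 + p^2) / 2; the exact energy is 0.5 forever.
def energy(q, p):
    return 0.5 * (q * q + p * p)

dt, steps = 0.01, 10_000
q_e, p_e = 1.0, 0.0  # explicit Euler: no structure preserved
q_s, p_s = 1.0, 0.0  # symplectic Euler: structure baked into the update

for _ in range(steps):
    q_e, p_e = q_e + dt * p_e, p_e - dt * q_e
    p_s = p_s - dt * q_s  # update p first...
    q_s = q_s + dt * p_s  # ...then q with the new p (this ordering matters)

euler_drift = energy(q_e, p_e) - 0.5
symplectic_drift = energy(q_s, p_s) - 0.5
print(f"Euler energy drift:      {euler_drift:.3f}")      # grows steadily
print(f"symplectic energy drift: {symplectic_drift:.5f}")  # stays tiny
```

The symplectic version is "zero with a bounded wobble" rather than "exactly zero", which is perhaps the closest classical analogue to what a trained network could hope to achieve.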
I think it is fine if it is "effective". Really, most of our physics is effective, i.e. valid at a certain length scale. Fluid mechanics is very good, but it does not describe anything in terms of quark interactions. Quantum field theories are also mostly effective. So as long as it describes protein dynamics at some effective length scale, that is fine. Obviously it does not know anything about quarks/electrons/etc.