Not really. The only real example given in the article is when you hook someone up to an fMRI machine, collect data about how the brain looks when it sees a certain image, and then have a computational statistics program (NOT an artificial brain, in any sense) do some number crunching and output the most likely thing it's looking at based on things you specifically trained it to recognize beforehand. We learn precisely nothing from this, no medical or computer science advances are made from it, and it doesn't remotely support the title of the article.
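To be concrete, the pipeline being described amounts to something like this toy caricature (entirely fake data; a trivial nearest-centroid classifier standing in for the real statistics package): train on labelled scans, then report whichever trained category a new scan is closest to.

    # Toy caricature of the decoding described above: made-up "voxel" data,
    # a nearest-centroid classifier, nothing to do with any real fMRI pipeline.
    import random

    random.seed(0)
    N_VOXELS = 50
    prototypes = {"face": [random.random() for _ in range(N_VOXELS)],
                  "house": [random.random() for _ in range(N_VOXELS)]}

    def fake_scan(prototype, noise=0.3):
        # A made-up "brain response": the category prototype plus noise.
        return [v + random.gauss(0, noise) for v in prototype]

    # "Training": average a handful of scans per category into a centroid.
    centroids = {label: [sum(col) / 10 for col in
                         zip(*[fake_scan(proto) for _ in range(10)])]
                 for label, proto in prototypes.items()}

    def decode(scan):
        # Output the most likely trained category for a new scan.
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
        return min(centroids, key=lambda label: dist(scan, centroids[label]))

    print(decode(fake_scan(prototypes["house"])))  # 'house', almost always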
I think the Economist article is exactly right - that despite the massive differences between ANNs and the brain, ANNs are indeed highly suggestive of how some aspects of the brain appear to work.
People can criticize the shortcomings of GPT-4, but it's hard to argue that it isn't at least capable of some level of reasoning (or something functionally equivalent, if you object to that word!). It's not yet clear exactly how a Transformer works other than at the mechanical level of the model architecture (vs the LLM "running on" the architecture), but we are at least starting to glean some knowledge of how the trained model is operating...
It seems that pairs of attention heads in consecutive layers act in coordination as "induction heads", which in one case perform a kind of analogical(?) A'B' => AB match-and-copy operation. The induction head causes a context token A to be matched (via the attention key-query mechanism) with an earlier token A', whose following token B' then causes a related token B to be copied into the residual stream as the prediction for the position following A.
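Stripped of the attention machinery, the behaviour amounts to a match-and-copy like the toy sketch below (real induction heads do this softly, matching on embedding similarity rather than exact token identity, which is where the analogical A'~A, B~B' part comes in):

    # Toy illustration of the induction-head pattern: find an earlier
    # occurrence of the current token and predict whatever followed it.
    # No attention weights here, just the match-and-copy behaviour.
    def induction_predict(tokens):
        current = tokens[-1]                      # the context token A
        for i in range(len(tokens) - 2, 0, -1):   # scan back for an earlier A'
            if tokens[i - 1] == current:
                return tokens[i]                  # copy the token B' that followed it
        return None

    print(induction_predict(["the", "cat", "sat", "on", "the"]))  # -> 'cat'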
This seems a very basic type of operation, and no doubt there's a lot more interpretability research to be done, but given the resulting reasoning/cognitive power (even in the absence of any working memory or looping!), it seems we don't need to go looking for overly complex, exotic mechanisms to begin to understand how the cortex may be operating. It's easy to imagine how this same type of embedded key matching might work in the cortex, perhaps with cortical columns acting as complex pattern matchers. Perhaps the brain's well-known ~7-item working memory corresponds to a "context" of sorts that is updated in the same way that induction heads update the residual stream.
Anything I've written here about correspondence between transformer and cortex is of course massive speculation, but the point is that the ANN's operation does indeed start to suggest how the brain, operating on similar sparse/embedded representations, may be working.
It probably doesn't use backpropagation of gradients. Instead, the cortex appears to be a prediction engine that uses error feedback (perceptual reality vs prediction) to minimize prediction errors in a conceptually similar way. If every "layer" (cortical patch) is doing its own prediction and receiving its own feedback, then you don't need any error propagation from one layer to the next.
Sure, we don't know the exact details (Geoff Hinton spent much of his career trying to answer this question), but at the big picture level it does seem clear that the cortex is a prediction engine that minimizes prediction errors by feedback, and most likely does so in a localized way. Exactly how these prediction updates work is unknown.
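As a toy caricature of that idea (not a claim about actual cortical learning rules), each "layer" below predicts its own input from its own state and updates its own weights from its own local error; nothing is propagated between layers:

    # Each "layer" predicts its input from its own latent state and minimizes
    # its own local prediction error -- no error signal crosses layers.
    # Purely illustrative; not a model of cortex.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=8)                # the "sensory" input to be predicted
    layers = [{"W": 0.1 * rng.normal(size=(8, 4)), "z": rng.normal(size=4)},
              {"W": 0.1 * rng.normal(size=(4, 2)), "z": rng.normal(size=2)}]

    lr = 0.05
    for _ in range(200):
        target = x
        for layer in layers:
            pred = layer["W"] @ layer["z"]                 # this layer's prediction
            err = target - pred                            # its local prediction error
            layer["z"] += lr * layer["W"].T @ err          # settle the latent state
            layer["W"] += lr * np.outer(err, layer["z"])   # local weight update
            target = layer["z"]                            # next layer predicts this state

    resid = x - layers[0]["W"] @ layers[0]["z"]
    print(round(float(resid @ resid), 4))                  # layer 0's remaining error (small after settling)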
Could you expand a bit on how you think simulated annealing could work?
Hmm... I have no model for what actually happens. It's just that simulated annealing appears quite often in the biological world, we've never found any coordination/communication channel that could create something more "intelligent" (AFAIK, there are just a few neighborhood-to-global analog signals available), and there isn't a multiplicity of attempts interacting with each other (again, there is no channel for that) that would lead to a parallel evolutionary strategy.
Also, when we learn a mechanical task, our errors tend to cool down around our successes, instead of following a direction.
But again, I have no idea what the actual mechanism is. It's more that all the alternatives seem to have been eliminated.
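To be concrete about the loop I have in mind, it's just the textbook one: random tweaks around the current attempt, occasionally accepting worse ones, with the acceptance probability cooling over time so that errors settle around successes rather than following a gradient. A toy sketch on a made-up one-dimensional "motor task":

    # Textbook simulated annealing on a made-up 1-D "task error" curve:
    # jitter around the current attempt, keep improvements, sometimes keep
    # worse attempts too, less and less often as the temperature cools.
    import math, random

    random.seed(1)

    def task_error(x):                 # pretend error of a motor command x
        return (x - 3.0) ** 2 + 0.3 * math.sin(8 * x)

    x, temperature = 0.0, 2.0
    for _ in range(2000):
        candidate = x + random.gauss(0, 0.2)
        delta = task_error(candidate) - task_error(x)
        if delta < 0 or random.random() < math.exp(-delta / temperature):
            x = candidate
        temperature *= 0.998           # cooling: exploration shrinks over time

    print(round(x, 2), round(task_error(x), 3))   # ends up near the low-error region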
This is just false. Outside of the visual cortex there isn't any evidence that brains work anything like GPT-4 or neural nets. Producing the same outputs isn't evidence of anything.
At best you have just stated a hypothesis of how the brain might work, not any actual evidence supporting it.
The response is a bit knee-jerky and emotional. A lot of people find it upsetting that human thinking can be approximated by AI.
The demands for proof stem from this emotional response and a misunderstanding of how science actually works. There is plenty of evidence, and there are theories. Most of that evidence is empirical: it suggests that GPT-4 can do some interesting things and that artificial neurons cluster and fire in ways similar to those in real brains. That's a theory backed by some of this evidence. And it is of course not completely accidental, because that sort of was the intention of those who constructed the neural networks. You could say that neural networks apparently work as intended.
The scientific way to dismiss that theory would be proving it wrong. That's what science does: gather evidence and facts, come up with theories that explain them, and then try to find evidence that counters those theories and explanations. And then you replace them with better ones. Falsifying theories is how science moves forward. You don't prove them right; you fail to prove them wrong. Insisting something is false without doing that is very unscientific.
Human brains are more than just neurons of course. The article actually calls that out. There's a lot of chemistry in our brains that directly controls what it does. That's why people enjoy taking certain drugs; those literally change the way our brain operates. Coffee is a drug that many people find useful. Learning especially is associated with endorphins. You get a little endorphin rush when you figure something out. Some people like this so much that they become scientists.
Artificial neural networks don't really model any of that. But nobody is saying that they are the same; just that they do similar things in similar ways, as can be observed via experiments and the use of MRI scanners. We don't really understand why that is, but the similarity is easily observed, and a valid theory is that an ANN captures enough of the complexity of a brain to be able to do interesting things. Which is of course backed up by plenty of empirical evidence in the form of people having used GPT-4. People seem to struggle to articulate what exactly is missing that would prove that theory wrong. Lots of people want it to be wrong; not a lot are coming up with better theories. But of course some scientists are working on that, and the prospect of them figuring this out is what truly scares some people.
> The scientific way to dismiss that theory would be proving it wrong.
That's completely the opposite of what is supposed to happen: You are supposed to provide evidence that it is right, not assume it is right until proven wrong.
> just that they do similar things in similar ways
Completely baseless. Neural nets can be trained on all kinds of phenomena, the weather for example. No one says that the weather works like a neural net. For the same reason there is no reason to say (without specific evidence) that the brain works like a neural net just because it produces the same output.
> People seem to struggle to articulate what exactly is missing that would prove that theory wrong.
Again there is no evidence that it is right. When there is no evidence that it is right there is no burden to show that it is wrong.
Sure - you can't just look at how an ANN works and assume that's how the brain does that too, but the ANN's operation can act as inspiration to suggest or confirm the way the brain might be doing something.
It seems neuroscientists are good at discovering low-level detail but, in general, perhaps not so good (the visual cortex being somewhat of an exception) at putting the pieces together to suggest high-level operations. ANNs seem complementary: while their low-level details are little like the brain, the connectionist architectures can be comparable, and we do know their top-down operation (even though interpretability is an issue, more for some ANNs than others). If we assume that the cortex is doing some type of prediction-error minimization, then it's likely to have found solutions similar to an ANN's in cases where the problem and connectivity are similar.
Again this is just a hypothesis and nothing has come from it despite the fact that it is very old. There isn't any good reason for it to be true and no one has done anything to show that it is.
So why do you believe that the visual cortex works similarly to an ANN, but the rest of the cortex doesn't ?!
Of course we know that CNNs and the visual cortex both learn similar low-level oriented-line feature detectors, etc, so there is evidence of them operating in a similar fashion.
The thing is, the entire cortex is a very regular structure with the same 6-layer architecture across all areas and the same thalamo-cortical loop connectivity. So it's very unlikely that one area of cortex is operating in a fashion much different from any other area. If you're willing to accept that the visual cortex has close parallels to a CNN in terms of how it operates, then that is highly suggestive that the abstraction of a CNN/ANN does indeed capture the essence of at least some aspects of cortical operation.
Nobody is saying that the brain and ANNs are exactly the same, but there seems to be close enough correspondence to at least some types of ANN architecture that lessons learned from ANNs can inspire understanding of the brain.
Another example - before seq-2-seq neural net language translation, would you have believed that a variable-length sentence could have its meaning captured by a fixed-size sentence embedding? Intuition might have suggested that some complex, structured, variable-length representation would be needed, but now we know that's not the case, and it helps us understand how similar representations are likely also used in the brain.
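If the fixed-size part sounds surprising, here's a deliberately dumb sketch of the shape of the idea: hash each token to a random vector and average, and any-length sentence becomes a same-sized vector. (Nothing like a trained seq2seq encoder - it only captures token overlap - but it shows how a fixed-size vector can still separate related from unrelated sentences.)

    # Toy fixed-size "sentence embedding": hashed random token vectors,
    # mean-pooled. Any sentence length in, a 64-dim vector out.
    import hashlib
    import numpy as np

    DIM = 64

    def token_vec(tok):
        seed = int(hashlib.md5(tok.encode()).hexdigest(), 16) % (2 ** 32)
        return np.random.default_rng(seed).normal(size=DIM)

    def sentence_embedding(sentence):
        return np.mean([token_vec(t) for t in sentence.lower().split()], axis=0)

    a = sentence_embedding("the cat sat on the mat")
    b = sentence_embedding("yesterday afternoon the cat sat on the mat")
    c = sentence_embedding("stock prices fell sharply")
    cos = lambda u, v: float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    print(a.shape, round(cos(a, b), 2), round(cos(a, c), 2))  # b lands much closer to a than c does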
> So why do you believe that the visual cortex works similarly to an ANN, but the rest of the cortex doesn't ?!
Because there have been actual experiments on cats (unethical on humans, but cats have visual systems similar to ours), so we know some things about how the visual system works. There is nothing equivalent for the rest of the mind.
> Nobody is saying that the brain and ANNs are exactly the same, but there seems to be close enough correspondence to at least some types of ANN architecture that lessons learned from ANNs can inspire understanding of the brain.
Again this is a hypothesis, the burden is on the person making that claim to provide evidence.
> Another example - [...]
This is a much worse example, because we specifically know that LLMs don't work the way the human language faculty does: they can "learn" languages that are impossible for humans.
You completely ignored the point about the uniform nature of the cortex. It doesn't make sense to say "well, ok, maybe this bit works like an ANN, but the rest works differently".
> we specifically know that LLMs don't work the way the human language faculty does
No we don't, because we still don't know how human language works. People are still debating Chomsky's innatism. Again, LLM processing of language offers interesting clues - not just embeddings, but also the fact that language can be learnt by a very simple architecture (the transformer) with zero support for "universal grammar" or any such priors.
You seem to be missing the spirit of this whole conversation and the original linked article - it's not about claiming/hypothesizing that the brain works the same as any and all ANN architectures or principles, but rather that one can draw inspiration from ANNs to develop theories and intuitions (which still need to be tested) for how the brain may be working.
> You completely ignored the point about the uniform nature of the cortex. It doesn't make sense to say "well, ok, maybe this bit works like an ANN, but the rest works differently".
The "uniform nature of the cortex" implies far too much. We know that dramatically different processes happen in different parts of the brain. That needs to be explained and it's where all the interesting questions are. You might as well say every part of the body is made of cells so they all work the same. That might be true at some level of abstraction but it's completely useless.
> No we don't, because we still don't know how human language works.
We do know some ways it doesn't work: namely, like neural nets, because it can be shown that neural nets can learn languages that humans cannot.
> People are still debating Chomsky's innatism. Again, LLM processing of language offers interesting clues - not just embeddings, but also the fact that language can be learnt by a very simple architecture (the transformer) with zero support for "universal grammar" or any such priors.
This tells you nothing about the language faculty just as building a neural net to predict the weather tells you nothing about how the weather operates.
> You seem to be missing the spirit of this whole conversation and the original linked article - it's not about claiming/hypothesizing that the brain works the same as any and all ANN architectures or principles, but rather that one can draw inspiration from ANNs to develop theories and intuitions (which still need to be tested) for how the brain may be working.
People have been making claims that connectionist models will lead to some insight for decades and there isn't much to show for it. No one cares where someone derives their inspiration for their hypotheses, come back when you have some result to show.
> The "uniform nature of the cortex" implies far too much. We know that dramatically different processes happen in different parts of the brain. That needs to be explained and it's where all the interesting questions are.
Different areas of the cortex are far less specialized than you seem to think. We process language via vision (sign language) just as readily as via hearing. We process vision just as readily via touch (experimental transducer array) as via sight.
> No one cares where someone derives their inspiration
> Different areas of the cortex are far less specialized than you seem to think. We process language via vision (sign language) just as readily as via hearing. We process vision just as readily via touch (experimental transducer array) as via sight.
This is just false, you don't know what you are talking about. It's been known for some time that sign language is processed in the same parts of the brain as spoken language.
I'm telling you you are wrong and don't know anything.
You don't even understand the link you posted which says the exact opposite of what you are claiming: That sign language uses Broca's area (the part of the brain believed to be connected to language) just as spoken and written language do.
Isn’t the hypothesis the point though? If you have existential evidence of a logical phenomenon occurring in a machine, then it’s worth directing your study to look for a similar (though not necessarily identical) mechanism in organic matter, because logic is independent of the anatomy and physiology of the machine/organism performing it. Besides, the propagation of 1’s and 0’s in computers has a very strong parallel with, say, how the brain transmits action potentials to skeletal muscle.
> Outside of the visual cortex there isn't any evidence that brains work anything like GPT-4 or neural nets.
Are you sure? If you had to pick a number what would you pick?
I have never built a useful neural network myself so I can't speak confidently, but I've read enough McCulloch & Pitts, Cybernetics, Minsky, et cetera, to know that there's a direct connection between a multitude of lab experiments done in the early 1900s to quantify the behavior of neurons in various animals and the principal ideas of ANNs that were developed from those findings.
Of course, there are hundreds, if not thousands or more, of "components" in the brain, and I'm not sure how many of those components we have digital analogs for. But GPT-4 makes me think we've crossed the 10% threshold.
Human speech processing seems similar to how LLMs work. There are brain injuries (Korsakoff's syndrome) that cause confabulation, which is like ChatGPT hallucinating answers to questions.
OK you got me there I guess, but even learning that we've learned nothing is learning in itself. Otherwise the act of compressing wouldn't give any more information.
Knowing that, say, some file contains a sequence of zeros you can run through RLE is information. That part is usually not part of the information content calculation, but without it you can't do compression because the file could just be 100% entropy. Ergo it's useful information.
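A toy run-length encoder makes the point: the knowledge "this data is long runs of the same symbol" is exactly what the encoding exploits.

    # Toy run-length encoding: works only because we know the data is runs.
    def rle_encode(data):
        out, i = [], 0
        while i < len(data):
            j = i
            while j < len(data) and data[j] == data[i]:
                j += 1
            out.append((data[i], j - i))   # (symbol, run length)
            i = j
        return out

    print(rle_encode("0000000011100"))     # [('0', 8), ('1', 3), ('0', 2)]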
That was sort of my original point anyway, knowing you don't know something makes it a known unknown instead of an unknown unknown. Or in this case a known known.
Oversimplifying for brevity (and there is definitely more nuance to this), this is basically the modeling approach:
1. Have a biological brain do a task, record neuronal data + task performance
2. Copy some of those biological features and implement in an ANN
3. Tune the many free parameters in the ANN on task performance
4. Show that the bio-inspired ANN performs better than SOTA and/or shows "signatures" that are more brain-like.
The major criticisms of Yamins' (and similar) groups are that correlation != causation, that correlation != understanding, or that it is tautological (bio-inspired ANNs will of course look more biological). I'm not sure how seriously this work is taken vs. true first-principles theory.
Yes indeed. Attempts to simulate human neurons have shown that a single neuron can only be reproduced by a relatively large ANN consisting of several layers. This tells us that human neurons are more computationally complex and capable than those of other animals, and orders of magnitude more complex than the neurons in ANNs.
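A toy version of why a single weighted-sum-and-threshold unit can't match this (loosely inspired by the dendritic XOR findings in human cortical neurons; absolutely not a model of any real cell): give each of two dendritic branches its own threshold and let the soma fire when exactly one branch is active, and the single "neuron" computes XOR, which no one-layer linear-threshold unit can reproduce.

    # A "neuron" with two nonlinear dendritic branches computes XOR of its
    # inputs -- something a single weighted sum + threshold provably cannot do.
    # Toy caricature only.
    def branchy_neuron(a, b):
        branch1 = 1 if a > 0.5 else 0               # each branch has its own nonlinearity
        branch2 = 1 if b > 0.5 else 0
        return 1 if branch1 + branch2 == 1 else 0   # soma fires on exactly one active branch

    for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(a, b, "->", branchy_neuron(a, b))     # XOR truth table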
For me the most interesting parallel is from (I think) GANs, and other generative AIs. This is similar to the idea in psychology that we are really doing a lot of projection with some correction based on sensory input - as opposed to actually perceiving everything around us.
Also, real synapses are one of the most abundant features of real brains and are the direct inspiration for NN weights. I'm not sure the artificial brains help understand real ones, but they do seem to validate some ideas we have about real ones.
Called the "bare infinitive". Use your favorite search engine, many sites will explain it better than I can.
Short: "Help is a verb that can be used with or without to and with or without an object before the infinitive. When we use it without an infinitive it sometimes sounds more informal." (from Bare infinitive - Learning English | BBC World Service)
Scientific metaphors have been useful for science in all fields. It doesn't mean they are accurate or anything; they just help you think about a thing in a better way.
There's a very fundamental problem in a lot of sciences: given some phenomena, find the patterns that compress them without knowing all the patterns a priori.
It's like a wavelet decomposition where your wavelets are updated as new data comes in.
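A toy version of "learn the basis while the data streams in" (a crude stand-in for online dictionary learning; nothing to do with real adaptive wavelets): keep a few atoms and nudge the best-matching one toward each new sample, and the hidden patterns get recovered without ever being known a priori.

    # Keep a few "atoms" and update the best-matching one toward each new
    # streaming sample. Crude stand-in for online dictionary learning.
    import numpy as np

    rng = np.random.default_rng(0)
    true_patterns = rng.normal(size=(3, 16))     # hidden structure in the stream

    def next_sample(t):
        return true_patterns[t % 3] + rng.normal(0, 0.1, size=16)

    atoms = np.array([next_sample(t) for t in range(3)])   # seed atoms from the first samples

    for t in range(3, 3000):
        x = next_sample(t)
        cos = atoms @ x / (np.linalg.norm(atoms, axis=1) * np.linalg.norm(x))
        k = int(np.argmax(cos))                  # best-matching atom
        atoms[k] += 0.05 * (x - atoms[k])        # nudge it toward the sample

    # Each learned atom now lines up with one of the hidden patterns.
    match = np.abs(np.corrcoef(np.vstack([atoms, true_patterns]))[:3, 3:])
    print(np.round(match.max(axis=1), 2))        # values near 1.0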
The fundamental problem is that you keep coming up with random bullshit to post on HN and you keep inviting people to your discord basement to discuss it.