I recently started reading Jeff Hawkins’s On Intelligence, and it reminded me of how computer folk like hierarchies. Hawkins’s premise is that the neocortex of the human brain learns with hierarchies. Hierarchies are used for memory and for prediction. I really like this book. However, I think the description of the neocortex in terms of hierarchies is misleading. He writes:
What do I mean by a nested or hierarchical structure? Think about music. Notes are combined to form intervals. Intervals are combined to form melodic phrases. Phrases are combined to form melodies or songs. Songs are combined into albums. Think about written language. Letters are combined to form syllables. Syllables are combined to form words. Words are combined to form clauses and sentences.
I agree that there is hierarchical structure, but it is in the individual occurrences of those notes, words, and phrases, not in the general classes of them. I think it is unlikely that the brain separately learns the letter “h” in the word hotel and the word home. Instead, a single part of the brain learns the concept for h, and it feeds into both the concept for hotel and the concept for home. This structure would have multiple levels, just as a hierarchy has, and it would represent small features at the bottom and large objects at the top, just like a hierarchy, but each node would not feed only (or primarily) into one parent node.
Hawkins in fact writes that lower levels are used by multiple high level objects:
In a complementary bit of efficiency, representations of simple objects at the bottom of the hierarchy can be reused over and over for different high-level sequences. For instance, we don’t have to learn one set of words for the Gettysburg Address and a completely different set for Martin Luther King’s “I Have a Dream” speech, even though the two orations contain some of the same words.
and he writes that many connections come from outside the main hierarchical pathway:
On close inspection, we see that at least 90 percent of the synapses on cells within each column come from places outside the column itself. Some connections arrive from neighboring columns. Others come from halfway across the brain.
So Hawkins does see something other than a hierarchy. I thought perhaps he was using the term “hierarchy” in a sense different from the usual one, but the diagrams in the book reinforce the tree structure: every node has a parent; every parent has multiple children.
How this “sharing” occurs, I think, is a key element of generalizing to learn abstractions. It’s not just an efficiency trick. I think it is likely that the brain first learns specific instances (which form a hierarchy), and then generalizes them into classes of objects. The specific instances might be forgotten later (perhaps this is the role of sleep). Once generalized into classes, the relationships no longer form a hierarchy. Instead it’s probably a partially acyclic graph structure, where there are cycles within each layer, but no cycles between layers. I’m really just guessing here though.
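Here is a rough sketch of what I mean by "cycles within each layer, but no cycles between layers". Everything in it is an assumption made up for illustration: the `layer` assignment, the `edges`, and the two checks. In particular, I am treating "no cycles between layers" as "every edge that crosses layers points from a lower layer to a higher one", which is one simple way to guarantee it.

```python
# Hypothetical toy graph: node -> layer index (0 = smallest features).
layer = {"h": 0, "o": 0, "hotel": 1, "home": 1}

# Directed edges; the names and links are invented for illustration.
edges = [
    ("h", "hotel"), ("h", "home"), ("o", "hotel"), ("o", "home"),
    ("hotel", "home"), ("home", "hotel"),  # a cycle inside layer 1
]

def only_upward_between_layers(edges, layer):
    """True if no edge points from a higher layer down to a lower one."""
    return all(layer[a] <= layer[b] for a, b in edges)

def has_within_layer_cycle(edges, layer):
    """Depth-first search for a cycle using only same-layer edges."""
    same = {}
    for a, b in edges:
        if layer[a] == layer[b]:
            same.setdefault(a, []).append(b)
    state = {}  # 1 = on the current search path, 2 = fully explored

    def visit(node):
        state[node] = 1
        for nxt in same.get(node, []):
            if state.get(nxt) == 1:
                return True  # reached a node already on the path
            if nxt not in state and visit(nxt):
                return True
        state[node] = 2
        return False

    return any(node not in state and visit(node) for node in list(same))

print(only_upward_between_layers(edges, layer))  # True
print(has_within_layer_cycle(edges, layer))      # True
```

Adding a downward edge such as `("hotel", "h")` would make the first check fail, because it would close a loop that passes through two layers.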
I’m really enjoying this book. Viewing the brain as a prediction machine explains many things I’ve wondered about. For example, take a look at the blurry word on this page, and then take a look at it in context. In context, you can tell what the word is. That makes sense if the brain is predicting things.
My only complaint about this book is that its diagrams and its use of the word “hierarchy” suggest a tree structure, but I think it would be more realistic to look at it as a graph. Yes, graphs are messy, and trees are beautiful, but just because we computer scientists like playing with trees doesn’t mean they’re what the brain actually uses.