What economic growth and
statistical semantics tell us
about the structure of the world
A Working Paper • August 24, 2020
William L. Benzon
Abstract: The metaphysical structure of the world, as opposed to its physical
structure, resides in the relationship between our cognitive capacities and the
world itself. Because the world itself is “lumpy”, rather than “smooth” (as
developed herein, but akin to “simple” vs. “complex”), it is learnable and hence
livable. Machine learning AI engines, such as GPT-3, are able to approximate the
semantic structure of language, to the extent that that structure can be modeled
in a high-dimensional space. That structure ultimately depends on the fact that
the world is lumpy. It is the lumpiness that is captured in the statistics. Similarly, I
argue, the American economy has entered a period of stagnation because the
world is lumpy. In such a world good “ideas” become more and more difficult to
find. Stagnation then reflects the increasing costs the learning required to develop
economically useful ideas.
CONTENTS
Wending our way in a complex world.............................................................................................. 2
World, mind, and learnability: On the metaphysical structure of the cosmos ................................. 5
Stagnation, Redux: Like diamonds, good ideas are not evenly distributed ................................... 10
The complex universe: Further reading ......................................................................................... 18
1301 Washington St., Apt. 311
Hoboken, NJ 07030
646.599.3232
bbenzon@mindspring.com
This work is licensed under a Creative Commons Attribution-Share Alike 4.0 Unported License.
Wending our way in a complex world
This paper is based on two very different posts that I’ve written in the last month. One of them
takes the statistical semantics of AI engines like GPT-3 as its starting point: “World, mind, and
learnability: On the metaphysical structure of the cosmos” (revised considerably for this paper).
The other is about economic growth and stagnation: “Stagnation, Redux: Like diamonds, good
ideas are not evenly distributed”.
These two very different pieces nonetheless share both substance and method. Methodologically,
both argue that the situation we observe is intelligible if we assume that the world is structured in
a certain way. Their core substance is about that structure: the world must be “lumpy” – a notion
I discuss on pages 5 ff. Because the world is lumpy we can learn about it, live in it, talk about it,
and write about it. By contrast, a “smooth” world would be unintelligible and hence unlivable.
The exhaustive statistical analysis of a large body of text is, in effect, able to recover that structural
lumpiness as reflected in language and use it to produce new texts.
However, because the world is lumpy, we begin by learning and benefiting from things close to
hand. When those resources have been exhausted we and must travel deeper into the world,
expending more and more effort to extract economic benefit. Our current economic stagnation
reflects the increasing cost of learning more about the world. A smooth world would no doubt be
more convenient, for there would be economic benefit at every turn, either that or economic
disaster. That is, if the world were smooth, we wouldn’t be here.
This talk of smoothness and lumpiness is, I know, very abstract and “featureless”. But then,
what could a resonance between such disparate phenomena as statistical semantics and economic
stagnation be but abstract? I’ll provide more substance later in this paper, some diagrams, and
some arguments. But first I want to suggest that what I’ve been calling lumpiness is what Ilya
Prigogine and many others have called complexity.
Our complex world
Some years ago David Hays and I wondered why natural selection leads to complexity.1 We
argued that, over the long run, natural selection favors organisms with increased ability to process
information, and that ability yields benefits in a complex universe. But what did we mean by that,
a complex universe? Here is what we said:
It is easy enough to assert that the universe is essentially complex, but what does that
assertion mean? Biology is certainly accustomed to complexity. Biomolecules consist of
many atoms arranged in complex configurations; organisms consist of complex
arrangements of cells and tissues; ecosystems have complex pathways of dependency
between organisms. These things, and more, are the complexity with which biology
must deal. And yet such general examples have the wrong “feel;” they don't focus one's
attention on what is essential. To use a metaphor, the complexity we have in mind is a
complexity in the very fabric of the universe. That garments of complex design can be
made of that fabric is interesting, but one can also make complex garments from simple
fabrics. It is complexity in the fabric which we find essential.
We take as our touchstone the work of Ilya Prigogine, who won the Nobel prize for
demonstrating that order can arise by accident (Prigogine and Stengers 1984; Prigogine
1 The abstract and full reference for our article is on p. 18.
1980; Nicolis and Prigogine 1977). He showed that when certain kinds of
thermodynamic systems get far from equilibrium order can arise spontaneously. These
systems include, but are not limited to, living systems. In general, so-called dissipative
systems are such that small fluctuations can be amplified to the point where they change
the behavior of the system. These systems have very large numbers of parts and the
spontaneous order they exhibit arises on the macroscopic temporal and spatial scales of
the whole system rather than on the microscopic temporal and spatial scales of its very
many component parts. Further, since these processes are irreversible, it follows that
time is not simply an empty vessel in which things just happen. The passage of time,
rather, is intrinsic to physical process.
We live in a world in which “evolutionary processes leading to diversification and
increasing complexity” are intrinsic to the inanimate as well as the animate world
(Nicolis and Prigogine 1977: 1; see also Prigogine and Stengers 1984: 297-298). That
this complexity is a complexity inherent in the fabric of the universe is indicated in a
passage where Prigogine (1980: xv) asserts “that living systems are far-from-equilibrium
objects separated by instabilities from the world of equilibrium and that living organisms
are necessarily ‘large,’ macroscopic objects requiring a coherent state of matter in order
to produce the complex biomolecules that make the perpetuation of life possible.” Here
Prigogine asserts that organisms are macroscopic objects, implicitly contrasting them
with microscopic objects.
That is to say, the world has two regimes, the macroscopic and the microscopic, and they differ in
more than size:
Prigogine has noted that the twentieth century introduction of physical constants such
as the speed of light and Planck's constant has given an absolute magnitude to physical
events (Prigogine and Stengers 1984: 217-218). If the world were entirely Newtonian,
then a velocity of 400,000 meters per second would be essentially the same as a velocity
of 200,000 meters per second. That is not the universe in which we live. Similarly, a
Newtonian atom would be a miniature solar system; but a real atom is quite different
from a miniature solar system.
Physical scale makes a difference. The physical laws which apply at the atomic scale,
and smaller, are not the same as those which apply to relatively large objects. That the
pattern of physical law should change with scale, that is a complexity inherent in the
fabric of the universe, that is a complexity which does not exist in a Newtonian universe.
At the molecular level life is subject to the quantum mechanical laws of the micro-universe. But multi-celled organisms are large enough that, considered as homogeneous
physical bodies, they exist in the macroscopic world of Newtonian mechanics. Life thus
straddles a complexity which inheres in the very structure of the universe.
That, then, is at the heart of what I mean when I talk of the world as being lumpy rather than
smooth. The world is complex, not in the sense that it has many parts intricately arranged, but
that, over the long run, the interplay of physical laws tends toward proliferation and diversity of
objects. Long term evolutionary dynamics are inherent in the structure of physical law.
With that as background, we are just about ready to plunge into the argument. But I want to
address one last issue.
What does this have to do with metaphysics?
Metaphysics is a branch of philosophy, and this paper does not look like philosophy as it is
currently taught in colleges and universities, whether in the Anglo-American analytic tradition or
the more discursive and literary Continental tradition. Those are specialized academic disciplines.
This paper is something else. Speculative, yes, but speculation in search of broad understanding.
Here and there a variety of thinkers have suggested that there is a lot of work around and
about that’s philosophical in an older, more synthetic sense of the word. There was a time – not all
that long ago – when philosophers tried to synthesize human knowledge over a wide range of
topics and make sense of the world as a whole. That enterprise was quite different from academic
philosophy of whatever kind, which is much narrower in scope.
Eric Schliesser, for example, has recently argued for synthetic philosophy:
By ‘synthetic philosophy’ I mean a style of philosophy that brings together insights,
knowledge, and arguments from the special sciences with the aim to offer a coherent
account of complex systems and connect these to a wider culture or other philosophical
projects (or both). Synthetic philosophy may, in turn, generate new research in the
special sciences, a new science connected to the framework adopted in the synthetic
philosophy, or new projects in philosophy. So, one useful way to conceive of synthetic
philosophy is to discern in it the construction of a scientific image that may influence the
development of the special sciences, philosophy, public policy, or the manifest image.2
He is reviewing two books, From Bacteria to Bach and Back: The Evolution of Minds, by Daniel Dennett,
and Other Minds: The Octopus and The Evolution of Intelligent Life, by Peter Godfrey-Smith. Both men are
analytic philosophers who have published a variety of technical papers. But neither of those books
is a work of technical analytic philosophy.
Rather, they are works of synthetic philosophy that bring together work from a wide variety of
intellectually specialized sources, many of them in the various sciences, into a single intellectual
framework. I think that many works written by academic specialists for a general audience are
thus works of synthetic philosophy. Some other examples that come to mind: Jared Diamond,
Guns, Germs, and Steel: The Fates of Human Societies; Steven Pinker, How the Mind Works; Steven
Mithen, The Singing Neanderthals: The Origins of Music, Language, Mind and Body, and my own,
Beethoven’s Anvil: Music in Mind and Culture.
My purpose in this working paper isn’t that grand. I am not proposing a synthesis that embraces
both the economics of growth and computational semantics. Rather, I am arguing that,
considered at the proper level of abstraction, both of those disciplines encounter similar issues. It’s
not that I think that these disciplines can be put to work solving one another’s problems but rather
that both give us parallel and congruent insights into the nature of the world. Ultimately they are
coping with the same thing, how we live in a complex world.
2 Eric Schliesser, Synthetic Philosophy, Biology & Philosophy, Vol. 34: 19, 2019, https://doi.org/10.1007/s10539-019-9673-3.
World, mind, and learnability: On the metaphysical
structure of the cosmos
Statistical methods have been very successful in constructing AI engines in the last two decades or
so, with GPT-3³ as a current example that is receiving a great deal of attention. The engine learns
the statistical structure of a large corpus of texts and then, on the basis of that model, is able to
perform natural language tasks, such as translation from one language to another, or, in the case
of GPT-3, composing texts of some length that convincingly mimic human performance (among
many other tasks). The question I’m trying to answer is this: What is it about language such that a
sophisticated statistical model is able to generate convincing texts?
I don’t have an answer to that question, but I wrote a working paper in which I sketched out a
framework in which someone (with more mathematical skills than I have) could formulate an
answer to that question.4 Let us start obliquely, with metaphysics and learnability, and only then
move to language and texts.
Metaphysics and learnability
When I talk of the metaphysical structure of the world I talk of something that inheres in the
relationship between us, homo sapiens sapiens, and the world around us. It is because of this
metaphysical structure that the world is intelligible to us. What is the world that it is perceptible,
that we can move around in it in a coherent fashion? Whatever it is, since humans did not arise de
novo, that metaphysical structure must necessarily extend through the animal kingdom5 and, who
knows, plants as well.
Imagine, in contrast, that the world consisted entirely of elliptically shaped objects. Some are
perfectly circular, others only nearly circular. Still others seem almost flattened into lines. And we
have everything in between. In this world things beneficial to us are a random selection from the
full population of possible elliptical beings, and the same with things dangerous to us. Thus there
are no simple and obvious perceptual cues that separate good things from bad things. A very good
elliptical being may differ from a very bad being in a very minor way, difficult to detect.
Call that a smooth world (see Figure 1A below), where smoothness is understood to be
metaphysical in character rather than physical. A smooth world would be all but impossible to
live in because the things in it are poorly distinguishable. Ours is a lumpy world (see Figure 1B
below), but in a metaphysical sense rather than a physical sense. Things have a wide variety of
distinctly different (metaphysical) shapes and come in many different sizes.6
Imagine a high dimensional space in which objects are arranged according to their metaphysical
“shape” or “conformation”. In a smooth world all the objects are concentrated in a single
compact region in the space because they are pretty much alike. In a lumpy world objects are
distributed in clumps throughout the space. Each clump consists of objects having similar
metaphysical conformations but the clumps are relatively far apart because objects belonging to
3 Tom B. Brown, Benjamin Mann, Nick Ryder, et al., Language Models are Few-Shot Learners, arXiv:2005.14165v4 [cs.CL], 5 June 2020, p. 8 (https://arxiv.org/abs/2005.14165v4).
4 GPT-3: Waterloo or Rubicon? Here be Dragons, full reference and abstract on p. 19.
5 For a discussion of the ‘fit’ between the mental capacities of animals and the world, see “Principles and
Development of Natural Intelligence”, abstract and complete reference on p. 18.
6 “Smooth” and “lumpy” correspond to “simple” and “complex” as I have used those terms earlier (pp. 2
ff.). I prefer the more informal terms because they are more suggestive.
the same clump resemble one another more than any of them resembles an object in a different
clump.7
Figure 1A: Smooth World
Figure 1B: Lumpy World
The objects in the smooth world are differentiated along two (physical) dimensions, call them
height and width. We can easily imagine a third dimension, orientation, giving us a cube of
ellipsoids. The lumpy world has five clusters of objects. The objects within each cluster can be
differentiated along two or three, perhaps four, dimensions for each cluster. But how do we
differentiate among the clusters? What dimensions do we need for that?
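The contrast between the two worlds can be sketched numerically. Here is a minimal simulation; the clump counts, spreads, and dimensions are arbitrary choices of mine, not anything derived from the figures. The point is that in a lumpy world the average distance from an object to the center of its own clump is small relative to the distances between clumps, and it is that gap that makes the clusters learnable:

```python
import numpy as np

rng = np.random.default_rng(0)

def smooth_world(n=500):
    # One broad blob: objects vary continuously, with no natural boundaries.
    return rng.normal(0.0, 1.0, size=(n, 2)), np.zeros(n, dtype=int)

def lumpy_world(n_per=100, k=5):
    # k well-separated clumps of mutually similar objects.
    centers = rng.uniform(-10, 10, size=(k, 2))
    pts = np.vstack([c + rng.normal(0, 0.5, size=(n_per, 2)) for c in centers])
    return pts, np.repeat(np.arange(k), n_per)

def within_between_ratio(pts, labels):
    # Mean distance to one's own clump centroid, relative to the
    # mean distance between clump centroids.
    cents = np.array([pts[labels == c].mean(axis=0) for c in np.unique(labels)])
    within = np.mean(np.linalg.norm(pts - cents[labels], axis=1))
    if len(cents) < 2:
        return float("inf")  # a single blob has no between-clump structure
    d = np.linalg.norm(cents[:, None] - cents[None, :], axis=-1)
    between = d[np.triu_indices(len(cents), k=1)].mean()
    return within / between

pts, labels = lumpy_world()
ratio = within_between_ratio(pts, labels)
# In a lumpy world this ratio is small: clumps are tight relative to
# their spacing, so the clusters can be told apart.
```

In the smooth world the ratio is undefined: there is only one blob, and nothing for a learner to get a grip on.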
Here I am using the notion of dimension in the sense that Peter Gärdenfors uses it in his
geometric semantics.8 He talks of semantic spaces in which objects are differentiated from one
another along a small number of dimensions and applies this idea, not just to physical properties,
such as color, but to objects, actions, propositions, and grammatical categories. It is a very rich
and general conception.
So, we live in a lumpy world. Yes, there are cases where small differences are critical. But they
don’t dominate. Our world is intelligible. Plants are distinctly different from animals, tigers from
mice, oaks from petunias, rocks and water are not at all alike, and so on. It is thus possible to
construct a conceptual system capable of navigating in the external world so as to preserve and
even enhance the integrity of the internal milieu. Gärdenfors’ conceptual spaces capture the
variety in a lumpy world in a way that nervous systems can compute over it.
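Gärdenfors’ picture can be illustrated with a toy computation. In the sketch below the two quality dimensions (“size”, “mobility”) and the prototype coordinates are entirely made up for illustration; the point is only the mechanism, classification by nearest prototype, which partitions a quality space into convex (Voronoi) regions, one region per concept:

```python
import numpy as np

# A toy "conceptual space" with two made-up quality dimensions,
# size and mobility, each scaled 0..1. Concepts are prototype points.
prototypes = {
    "mouse":   np.array([0.1, 0.8]),
    "tiger":   np.array([0.7, 0.9]),
    "oak":     np.array([0.8, 0.0]),
    "petunia": np.array([0.1, 0.0]),
}

def classify(obj):
    # Assign an object (a point in the quality space) to the concept
    # whose prototype lies closest to it.
    return min(prototypes, key=lambda c: np.linalg.norm(obj - prototypes[c]))

small_mobile = np.array([0.15, 0.75])  # small and mobile: near "mouse"
large_rooted = np.array([0.75, 0.05])  # large and immobile: near "oak"
```

Nearest-prototype classification yields a Voronoi tessellation of the space, which is one reason Gärdenfors argues that natural concepts correspond to convex regions. In a lumpy world the prototypes sit far apart, so the regions are easy to tell apart.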
The world’s learnability is not something that can be characterized in a reductionist fashion.
Whatever it is that makes, for example, a trail in the forest learnable, cannot be characterized
simply in terms of the chemical structure of things in the forest, or their mechanical properties.
Learnability can only be characterized in relation to something that is learning. In particular, for
us, a central aspect of the world’s metaphysical structure lies in the correspondence between
language and the world.
Figures 2 through 7 (pp. 13 to 16) may also be taken as illustrating the difference between a
smooth and a lumpy world. Figures 2, 3, and 4 show a smooth world while Figures 5 and 6 show
a lumpy world. Note that Figure 5 shows a definite point of origin (for the evolution of
humankind). The spaces in those figures are, of course, to be construed metaphysically rather
than physically.
8 Peter Gärdenfors, Conceptual Spaces: The Geometry of Thought, MIT Press, 2000; The Geometry of
Meaning: Semantics Based on Conceptual Spaces, MIT Press, 2014.
“This is most peculiar,” you might remark. “Tell me again, how does this metaphysical structure
of the world differ from the world’s physical structure?” I will say, provisionally, that it is a matter
of intension rather than extension. Extensionally the physical and the metaphysical are one and the
same. But intensionally, they are different. We think about them in different terms. We ask
different things of them. They have different conceptual affordances. The physical world is
meaningless; it is simply there. It is in the metaphysical world that we seek meaning.
A brief dialogic interlude
Does this make sense, philosophically?
How would I know?
I get it, you’re just making this up.
Right.
Hmmmm… How does this relate to that object-oriented ontology stuff you
were so interested in a couple of years ago? 9
Interesting question. Why don’t you think about it and get back to me.
I mean, that metaphysical structure you’re talking about, it seems
almost like a complex multidimensional tissue binding the world
together. It has a whiff of a Latourian actor-network about it.
Hmmm… Set that aside for awhile. I want to go somewhere else.
Still on GPT-3, eh?
You got it.
Text as the interplay of the world and the mind
Text reflects this learnable, this metaphysical, structure, albeit at some remove:
9 See, for example, William Benzon, Living with Abundance in a Pluralist Cosmos: Some Metaphysical Sketches, Working Paper, January 2013, 87 pp., https://www.academia.edu/4066568/Living_with_Abundance_in_a_Pluralist_Cosmos_Some_Metaphysical_Sketches.
AI learning engines are modeling structure inherent in the text. That structure exists because it
was written by people engaged with and making sense of the world as they wrote. If they had
been spinning out language in a random fashion there would be no structure in the text for the
engine to learn, no matter how much such text it traversed. The structure thus learned is not,
however, explicit in the statistical model created by the engine, as it would be in the operations of
a “classical” propositional/symbolic model.10
There are two things in play: 1) the fact that the text is learnable, and 2) that it is learnable by a
statistical process. How are these two related?
If we already had a rich and robust explicit symbolic model in computable form, then we wouldn’t
need statistical learning at all. We could just run the symbolic model over the corpus and encode
the result. But why do even that? If we can read the corpus with the propositional model in a
simulation of human reading, then there’s no need to encode it at all. Just have the computer read
whatever aspect of the corpus is needed at the time.
We can thus say that statistical learning is a substitute for the lack of a usable propositional model. The
statistical model does work, but at the expense of explicitness.
10 Both artificial intelligence (AI) and computational linguistics began with the construction of painstakingly hand-crafted symbolic models of mental processes. This continued well into the 1980s, but was then supplanted by statistical techniques, which have achieved greater success in the last two or three decades. But research has continued on symbolic systems. For some remarks comparing the achievements of the two approaches see Martin Kay, A Life of Language, Computational Linguistics, Volume 31, Issue 4, December 2005, pp. 425-438, http://web.stanford.edu/~mjkay/LifeOfLanguage.pdf. I discuss this in GPT-3: Waterloo or Rubicon? Here be Dragons, full citation and abstract on p. 19.
But why does the statistical model work at all? That’s the question. It’s not enough to say, because
the world itself is learnable. That’s true for the propositional model as well. Both work because
the world is learnable.
Description, narration, computation, and time
But… Humans don’t learn the world with a statistical model. We learn it through a propositional
engine floating over an analogue or quasi-analogue engine with statistical properties.11 And it is
the propositional engine that allows us to produce language. A corpus is a product of the linguistic
activity of a propositional engine, not a statistical model, acting on the world.
Description is one such basic action; narration is another. Analysis and explanation are perhaps
more sophisticated and depend on (logically) prior description and narration. Note that this
process of rendering thought into language is inherently and necessarily temporal. The order in which
signifiers are placed into the speech stream depends in some way, not necessarily obvious, on the
relations among the correlative signifieds in an abstract semantic or cognitive space (think of Figure
1B above). Those relations are a first order approximation to the lumpiness of the world. Relative
positions and distances between signifiers in the speech stream reflect distances between and
relative orientations among correlative signifieds in that semantic space. We thus have systematic
relationships between positions and distances of signifiers in the speech stream, on the one hand, and positions and
distances of signifieds in semantic space. It is those systematic relationships that allow statistical analysis of the
speech stream to reconstruct semantic space. The statistical analysis is thus a second order approximation
to the lumpiness of the world.
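A crude illustration of how distributional statistics over a speech stream can recover some of that semantic structure (the corpus below is tiny and artificial, and a system like GPT-3 uses a far richer model than raw co-occurrence counts):

```python
import numpy as np
from collections import defaultdict

# A tiny artificial corpus. Word order reflects which signifieds
# belong together in the (toy) world being described.
corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the oak shaded the petunia",
    "the petunia grew near the oak",
    "the mouse fled the dog",
]

# Count co-occurrences of words within a +/-2 word window of one another.
counts = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - 2), min(len(words), i + 3)):
            if i != j:
                counts[w][words[j]] += 1

vocab = sorted(counts)
M = np.array([[counts[w][v] for v in vocab] for w in vocab], dtype=float)

def cosine(a, b):
    # Similarity of two words' co-occurrence vectors.
    i, j = vocab.index(a), vocab.index(b)
    return M[i] @ M[j] / (np.linalg.norm(M[i]) * np.linalg.norm(M[j]))
```

Even at this scale, words whose signifieds are close in semantic space (two animals) end up with more similar co-occurrence vectors than words whose signifieds are distant (an animal and a tree): the statistics of the string recover a shadow of the space behind it.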
For investigation: How is the multidimensional structure of semantic space encoded in strings,
and how much text is needed to encode a given granularity of multidimensional structure?
Let me repeat, time is not extrinsic to this process. Time is intrinsic and constitutive of computation.
Speaking involves computation, as does the statistical analysis of the speech stream.
The propositional engine learns the world via Gärdenfors’ dimensions (p. 6), and whatever else,
Powers’ stack for example. Those dimensions are implicit in the resulting propositional model and
so become projected onto the speech stream via syntax, pragmatics, and discourse structure. The
language engine is then able to extract (a simulacrum of) those dimensions through statistical
learning. Those dimensions are expressed in the parameter weights of the model. THAT’s what
makes the knowledge so ‘frozen’. One has to cue it with actual speech.
The whole language model thus functions as associative memory.12 You present it with an input
cue; it then associates from that cue, each emitted string ‘feeding back’ into the memory
bank. The implicit structure of that memory bank, encoded in a meshwork of
billions of interconnected neurons, thus follows the (metaphysical) structure of the world, and is
under constant adjustment and revision through interaction with the world. Statistical language
models then give us a simulacrum of the mind’s memory model, which is, in turn, a simulacrum
of the world.
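The associative-memory picture can be sketched in a few lines. The stored items and their feature vectors below are invented for illustration; the point is the mechanism: a cue retrieves the best-matching stored item, and what is retrieved can be fed back in as the next cue:

```python
import numpy as np

# A minimal content-addressable (associative) memory. The items and
# their feature vectors are made up for illustration.
memory = {
    "dawn chorus":    np.array([1.0, 0.9, 0.1, 0.0]),
    "traffic noise":  np.array([0.1, 0.2, 1.0, 0.9]),
    "string quartet": np.array([0.9, 1.0, 0.2, 0.1]),
}

def recall(cue):
    # Return the key of the stored vector most similar to the cue (cosine).
    def sim(v):
        return (cue @ v) / (np.linalg.norm(cue) * np.linalg.norm(v))
    return max(memory, key=lambda k: sim(memory[k]))

cue = np.array([1.0, 0.8, 0.0, 0.0])  # a partial, noisy version of one item
first = recall(cue)
# Feedback: the retrieved item's vector becomes the next cue, as emitted
# text feeds back into a language model's context.
second = recall(memory[first])
```

The memory is ‘frozen’ in exactly the sense above: nothing happens until a cue arrives, and what comes out is determined jointly by the cue and the stored structure.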
11 William Powers, Behavior: The Control of Perception, Aldine, 1973. A decade later David Hays integrated Powers’ model into his cognitive network model: David G. Hays, Cognitive Structures, HRAF Press, 1981.
12 The idea that the brain implements associative memory in a holographic fashion was championed by Karl Pribram in the 1970s and 1980s. David Hays and I drew on that work in an article on metaphor: William Benzon and David Hays, Metaphor, Recognition, and Neural Process, The American Journal of Semiotics, Vol. 5, No. 1 (1987), 59-80, https://www.academia.edu/238608/Metaphor_Recognition_and_Neural_Process.
Stagnation, Redux: Like diamonds, good ideas are
not evenly distributed
Tyler Cowen recently posted a conversation with Nicholas Bloom, a Stanford economist
interested in economic growth.13 That prompted me to revise a working paper I’d written last
year, Stagnation and Beyond: Economic growth and the cost of knowledge in a complex world,14 which is based
on a paper Bloom had published along with three colleagues, Are Ideas Getting Harder to Find?15
I took two of their case studies, Moore’s law and drug discovery, examined them in an informal
manner and concluded that (much of) the increasing difficulty of discovering new ideas can be
attributed to the cost of finding out more about the world.
At the time I wrote that paper I was unaware of Paul Romer’s 1992 article, Two Strategies for
Economic Development.16 He develops his argument with a toy model that is similar in spirit to
the idea I’d advanced in my working paper. The objective of this post is to update that idea in
view of Romer’s model by deploying some toys of my own.
First I set the stage with some passages from Cowen’s conversation with Bloom. Then I introduce
Romer, after which I offer my elaborations, beginning with some simple diagrams through
which we can visualize the relationship between our conceptual systems (maps) and the world
itself (territory). I conclude by anchoring that story in time and space, making it one that is, in
principle, about the evolution of human society over time.
Bloom’s conversation with Cowen
Bloom sets the stage:
The big picture — just to make sure everyone’s on the same page — is, if you look in
the US, productivity growth . . . In fact, I could go back a lot further. It’s interesting —
you go much further, and you think of European and North American history. In the
UK that has better data, there was very, very little productivity growth until the
Industrial Revolution. Literally, from the time the Romans left in whatever, roughly 100
AD, until 1750, technological progress was very slow.
Sure, the British were more advanced at that point, but not dramatically. The estimates
were like 0.1 percent a year, so very low. Then the Industrial Revolution starts, and it
starts to speed up and speed up and speed up. And technological progress, in terms of
productivity growth, peaks in the 1950s at something like 3 to 4 percent a year, and
then it’s been falling ever since.
Then you ask that rate of fall — it’s 5 percent, roughly. It would have fallen if we held
inputs constant. The one thing that’s been offsetting that fall in the rate of progress is
13 Tyler Cowen, Nicholas Bloom on Management, Productivity, and Scientific Progress, https://medium.com/conversations-with-tyler/nicholas-bloom-tyler-cowen-productivity-economicsb5714b05fc2b.
14 For the abstract and full citation, go to page 19.
15 Nicholas Bloom, Charles I. Jones, John Van Reenen, and Michael Webb, Are Ideas Getting Harder to
Find? American Economic Review 2020, 110(4), https://doi.org/10.1257/aer.20180338.
16 Paul M. Romer, Two Strategies for Economic Development: Using Ideas and Producing Ideas, The World
Bank Economic Review, Volume 6, Issue suppl_1, 1 December 1992, pp. 63–91,
https://doi.org/10.1093/wber/6.suppl_1.63, or here: sci-hub.tw/10.1093/wber/6.suppl_1.63.
we’ve put more and more resources into it. Again, if you think of the US, the number of
research universities has exploded, the number of firms having research labs.
Thomas Edison, for example, was the first lab about 100 years ago, but post–World
War II, most large American companies have been pushing huge amounts of cash into
R&D. But despite all of that increase in inputs, actually, productivity growth has been
slowing over the last 50 years. That’s the sense in which it’s harder and harder to find
new ideas. We’re putting more inputs into labs, but actually productivity growth is
falling.
Cowen responds by saying, “Let’s say paperwork for researchers is increasing, bureaucratization
is increasing. How do we get that to be negative 5 percent a year as an effect?” A bit later he’ll
offer:
Doesn’t the explanation have to be that scientific efforts used to be devoted to public
goods much more, and now they’re being devoted to private goods? That’s the only
explanation that’s consistent with rising wages for science but a declining social output
from her research, her scientific productivity.
In his response to Tyler, Bloom makes two suggestions that are consistent with my hypothesis:
Why is it happening at the aggregate level? I think there are three reasons going on.
One is actually come back to Ben Jones, who had an important paper, which is called, I
believe, “[Death of the] Renaissance Man.”17 This came out 15 years ago or something.
The idea was, it takes longer and longer for us to train.
Just in economics — when I first started in economics, it was standard to do a four-year
PhD. It’s now a six-year PhD, plus many of the PhD students have done a pre-doc, so
they’ve done an extra two years. We’re taking three or four years longer just to get to the
research frontier. There’s so much more knowledge before us, it just takes longer to
train up. That’s one story.
A second story I’ve heard is, research is getting more complicated. I remember I sat
down with a former CEO of SRI, Stanford Research Institute, which is a big research
lab out here that’s done many things. For example, Siri came out of SRI. He said,
“Increasingly it’s interdisciplinary teams now.”
His third suggestion repeats one of Cowen’s:
Then finally, as you say, I suspect regulation costs, various other factors are making it
harder to undertake research. A lot of that’s probably good. I’d have to look at
individual regulations. Health and safety, for example, is probably a good idea, but in
the same way, that is almost certainly making it more expensive to run labs…
I am most interested in pursuing the notion that ideas are getting harder to find because that’s just
how the world is. Even assuming that I am correct, one must still show that that factor is stronger
17 Benjamin F. Jones, The Burden of Knowledge and the "Death of the Renaissance Man": Is Innovation Getting Harder? The Review of Economic Studies, Volume 76, Issue 1, January 2009, Pages 283–317, https://doi.org/10.1111/j.1467-937X.2008.00531.x
than the one’s Cowen favors and, for that matter, other factors one might suggest. That’s a
different kind of argument and one I won’t pursue here.
Romer’s model
Let's start with Paul Romer's toy model from 1992. Romer sets it up with a metaphor that is
standard in economics, that of the factory (p. 67):
One of the great successes of neoclassical economics has been the elaboration and
extension of the metaphor of the factory that is invoked by a production function. To be
explicit about this image, recall the child's toy called the Play-Doh Fun Factory. To
operate the Fun Factory, a child puts Play-Doh (a form of modeling compound) into the
back of the toy and pushes on a plunger that applies pressure. The Play-Doh is extruded
through an opening in the front of the toy. Depending on the particular die used to
form the opening, out come solid Play-Doh rods, Play-Doh I-beams, or lengths of hollow
Play-Doh pipe.
We use the Fun Factory model or something just like it to describe how capital (the Fun
Factory) and labor (the child's strength) change the characteristics of goods, converting
them from less valuable forms (lumps of modeling compound) into more valuable forms
(lengths of pipe).
But Romer isn't interested in the production of physical goods. He's interested in the production of
ideas. The child’s chemistry set proves useful (p. 68):
Another child's toy is a chemistry set. For this discussion, the set can be represented as a
collection of N jars, each containing a different chemical element. From the child's point
of view, the excitement of this toy comes from trying to find some combination of the
underlying chemicals that, when mixed together and heated, does something more
impressive than change colors (explode, for example). In a set with N jars, there are 2^N – 1
different mixtures of K elements, where K varies between 1 and N. (There are many
more mixtures if we take account of the proportions in which ingredients can be mixed
and the different pressures and temperatures that can be used during mixing.)
As N grows, what computer scientists refer to as the curse of dimensionality
sets in. The number of possible mixtures grows exponentially with N, the dimension of
this system. For a modestly large chemistry set, the number of possible mixtures is far
too large for the toy manufacturer to have directly verified that no mixture is explosive.
If N is equal to 100, there are about 10^30 different mixtures that an adventurous child
could conceivably put in a test tube and hold over a flame. If every living person on
earth (about 5 billion) had tried a different mixture each second since the universe
began (no more than 20 billion years ago), we would still have tested less than 1 percent
of all the possible combinations. […]
Continuing on, Romer observes (pp. 68-69):
The potential for continued economic growth comes from the vast search space that we
can explore. The curse of dimensionality is, for economic purposes, a remarkable
blessing. To appreciate the potential for discovery, one need only consider the
possibility that an extremely small fraction of the large number of possible mixtures may
be valuable.
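Romer's back-of-the-envelope arithmetic is easy to check. The sketch below (my own illustration in Python, not anything from Romer's paper) counts the nonempty mixtures for N = 100 and compares that with one trial per person per second, using Romer's own assumptions of 5 billion people and a universe no more than 20 billion years old:

```python
# Check Romer's combinatorics: a set of N jars yields 2**N - 1 nonempty
# mixtures (every subset of the jars except the empty one).
N = 100
mixtures = 2**N - 1                        # roughly 1.27e30 mixtures

# One trial per person per second, on Romer's assumptions:
# about 5 billion people, a universe at most 20 billion years old.
seconds_per_year = 365.25 * 24 * 3600
trials = 5e9 * 20e9 * seconds_per_year

fraction = trials / mixtures
print(f"mixtures: {mixtures:.3e}")         # ~1.268e+30
print(f"fraction tested: {fraction:.2%}")  # well under 1 percent
```

With these numbers the fraction comes to about a quarter of one percent, consistent with Romer's "less than 1 percent" claim.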
What interests me is that this vast search space of possibilities is mostly empty of useful ideas,
combinations of chemical elements in this case. Moreover, I want to distinguish between the space
itself and the strategy we have for searching that space. That strategy is based on our theory or
theories about that space, or domain as I sometimes like to call it. I want to examine the
relationship between the world, on the one hand, and our ideas of it on the other. To do so I must
enter the den of the metaphysician, to borrow a phrase from Warren McCulloch.
On the relationship between the world and our ideas of it
The background space in the following simple diagram represents the space of potential chemical
combinations (in this case) while the black dots indicate the useful ones. Think of this as a smooth
world as earlier defined (p. 5).
Figure 2: Useful ideas in the
field of all possible ideas.
Notice that some areas are less sparsely populated than others. Now let us superimpose a mental
map on it – the orange grid is that map:
Figure 3: Our theory of the
domain (in orange).
A smooth world is easily mapped. We can now see that while, yes, some regions of the territory
are less populated than others, on the whole the mental map – our theory of the domain – is
highly congruent with the territory. Most “bins” in our map contain a useful idea, though it isn’t
in the same place in each bin. If we search each bin, then, we will very likely find a useful idea.
Some searches will take longer than others, but most searches will be rewarded.
Now consider the next diagram. The shaded area represents the territory that we have explored:
Figure 4: Territory we
have explored (gray).
There are still a good many useful ideas that we have not yet discovered. Many of them are in
bins immediately adjacent to ones where we have come up with a useful idea. In fact, I count
more empty bins (nine) in territory we’ve searched than in territory we haven’t (five).
However, our economy appears to be stagnating. If the relationship between our ideas and the
world is as I have depicted it in Figures 2 and 3, then we should be coming up with new ideas
fairly regularly. The effort of discovery should be roughly the same for each idea.18 It isn’t. Rather,
the effort is increasing. Perhaps the fit between our conceptual grid and the world itself isn’t so
regular as those diagrams suggest.
Consider this rather different diagram, which depicts an underlying world that is lumpy (p. 5):
18 Since we don't search for ideas one at a time, but rather in research programs aimed at discovering as many ideas as possible, I suppose we should be talking about the marginal effort of discovery for the next idea, or some such thing. We can set that nicety aside for the informal purposes of this argument.
Figure 5: Well-explored
territory (gray).
Lumpy worlds are not so easily mapped as smooth ones. Most of the useful ideas have been found
– that is, they are in territory we have explored – and most of the territory we haven’t explored is
empty of useful ideas. There are still good ideas out there, but they aren't close to us; they're
distant. We’re going to have to expend more effort to find them. That expenditure shows up as a
decline in productivity. Our economy has entered a period of stagnation.
That, I believe, is the situation we are in.
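The contrast between the smooth world of Figure 3 and the lumpy world of Figure 5 can be made concrete with a toy calculation (my own construction, purely illustrative): lay the bins out on a line, search them outward from an origin, and count the bins examined between successive discoveries. In the smooth world that marginal effort is constant; in the lumpy world it is small while we mine the nearby cluster and then jumps.

```python
def marginal_effort(world):
    """Bins examined between successive discoveries, searching
    outward from bin 0 (i.e., effort per idea found)."""
    efforts, last = [], -1
    for pos in world:               # positions of useful ideas, in order
        efforts.append(pos - last)  # bins crossed since the last find
        last = pos
    return efforts

# "Smooth" world: twelve useful ideas spread evenly over 48 bins.
smooth = list(range(3, 48, 4))

# "Lumpy" world: the same twelve ideas, but ten cluster at the origin
# and the last two sit far out in mostly empty territory.
lumpy = list(range(10)) + [30, 47]

print("smooth:", marginal_effort(smooth))  # [4, 4, 4, ..., 4]
print("lumpy: ", marginal_effort(lumpy))   # [1, 1, ..., 1, 21, 17]
```

Both worlds hold the same twelve ideas in the same 48 bins, and the average effort per idea is identical (4 bins in each); what differs is the trajectory. The searcher in the lumpy world enjoys a long run of cheap discoveries and then hits a wall, which is just the stagnation profile described above.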
History and evolution: What does this mean, ideas are “close to us”?
Those diagrams are all well and good. But they imply the existence of a transcendental point of view
that simply isn't available to us. That is, it is easy enough for us to examine the diagram and see
1) the distribution of ideas in the territory, as in Figure 2,
2) the conceptual grid we lay over the territory, as in Figures 3, 4, and 5, and
3) how the two compare in each case.
In the real world, however, we don’t have access to that point of view – a problem that bothered
Immanuel Kant, among others.
However, we, the human race, have been living in this real world for a long time. And over that
long time our ideas about the world have changed rather considerably, often in ways that improve
our well-being. I suggest that it is the very fact of change that gives credence to those simple diagrams. We
change our ideas and, in consequence, can construct better garments, more secure dwellings,
have access to a greater variety of foodstuffs, can travel further and with greater safety, and so
forth. This process isn't so simple as that sentence implies; often enough we blunder into the
unexpected, and that forces us to change our ideas, and so forth. But it will do for my present
purposes.
So, let us go back in time some three million years or so. Our ancestors are very clever apes living
somewhere in, I believe, East Africa, but my argument doesn’t depend on just where or even on
the idea that our ancestors come from one single region. Wherever they are, they have certain
capacities of perception and movement, capacities that have evolved over millions of years and so
are reasonably well suited to their physical environment.
Let us stipulate that those things, of whatever kind, to which we have easy access (as a function of
our perceptual, cognitive, and motor capacities) are close to us – perhaps physically close, but
mostly close in a more abstract, more metaphysical sense, if you will. By extending ourselves,
though, we can reach things that are somewhat distant from us, perhaps even quite distant. At
some point, for example, our pre-human ancestors discovered how to fabricate stone tools of
various kinds, arrow and spear heads, knives, and hand axes. This involves knowledge of
fabrication techniques and knowledge of how to use the resulting tools, in hunting, food
preparation, and so forth. It also involves knowledge of where the best raw stone is located, which
may not be where people lived. These various kinds of knowledge increased our ancestors' control
over their world and their lives, but they had costs associated with them. It took time to figure
these things out, to reduce them to routine, and to pass those routines on to others.
Consider another diagram:
Figure 6: Point of origin.
This is the same as Figure 4, but with the addition of a green spot. That green spot is where those
ancestors of ours were located in the world. It is, if you will, an anchor tying us to a particular
historical trajectory. That’s how evolutionary processes are, tied to specific sequences of events in
time and space.
Note, however, that I don't mean only physical location (East Africa). What should we call it,
this relationship between our capacities (at any given moment) and the world-in-itself? For the
moment I’m content to think of it as our metaphysical position in the world – as I’ve already
indicated, where our physical location in the world is but one aspect of this metaphysical position.
The point I am making is that, in a world constituted as ours is, the effort expended to
acquire an idea is an effective measure of that idea’s (metaphysical) distance from
us. Correlatively, the effort required to acquire a second idea, Fred, after having acquired a first
idea, Amanda, is a measure of the distance between Fred and Amanda. I am, of course, only
interested in the effort directly related to uncovering the idea. The effort required, e.g., to fill out
forms and jump through bureaucratic hoops is extraneous and should not be taken into account
in thinking about this issue, even if it is annoyingly intrusive in point of actual fact.
One final remark. Cultural growth does not seem uniform over the long term.19 In my working
paper, Stagnation and Beyond, I lay out some ideas that David Hays and I have worked on,20 a
19 See Stagnation and Beyond, pp. 11–13.
theory of cultural evolution over the longue durée – see pp. 3-4, 15-28 in Stagnation and Beyond for
their relevance to stagnation. The following diagram illustrates that long-term trajectory:
Figure 7: Four cultural ranks, so far.
The story we tell is one of cultural paradigms existing at four levels of sophistication, which we
call ranks. In the terminology of current evolutionary biology, these ranks represent major
transitions in cultural life. Rank 1 cognitive architectures emerged when the first humans appeared
on the savannas of Africa speaking language as we currently know it. Those architectures
structured the lives of primitive societies that emerged perhaps 50,000 to 100,000 years ago.
Around 5,000 to 10,000 years ago Rank 2 architectures emerged in relatively large stable human
societies with people subsisting on systematic agriculture, living in walled cities and reading
written texts. Rank 3 architectures first emerged in Europe during the Renaissance and gave
European cultures the capacity to dominate, and in a sense to create, world history over the last 500
years. The late 19th century saw the emergence of Rank 4 architecture.
My suspicion, then, about the current stagnation is that we’re at a plateau, marking time before
ascending another growth curve. To make that ascent we need new conceptual frameworks,
frameworks which will, in effect, bring us closer to a realm of new ideas currently invisible and
inaccessible to us.
Our story is not over. We are preparing a new beginning.
20 "The Evolution of Cognition" is our basic paper (abstract and full citation, p. 18). Hays has written a history of technology from this point of view: David Hays, The Evolution of Technology Through Four Cognitive Ranks (1993), http://asweknowit.ca/evcult/Tech/FRONT.shtml. See also posts at New Savanna under the labels "tech evol", https://new-savanna.blogspot.com/search/label/tech%20evol, and "cultural ranks", https://new-savanna.blogspot.com/search/label/cultural%20ranks.
The complex universe: Further reading
I have listed five papers that give substance to the arguments in this working paper. The first three
are papers that David Hays and I published in the formal literature. The last two are informal
working papers of my own.
Complex structure in a complex universe
William L. Benzon and David G. Hays, A Note on Why Natural Selection Leads to Complexity,
Journal of Social and Biological Structures 13: 33-40, 1990,
https://www.academia.edu/8488872/A_Note_on_Why_Natural_Selection_Leads_to_Complexi
ty.
Abstract: While science has accepted biological evolution through natural selection, there is no
generally agreed explanation for why evolution leads to ever more complex organisms. Evolution
yields organismic complexity because the universe is, in its very fabric, inherently complex, as
suggested by Ilya Prigogine's work on dissipative structures. Because the universe is complex,
increments in organismic complexity yield survival benefits: (1) more efficient extraction of energy
and matter, (2) more flexible response to vicissitudes, (3) more effective search. J.J. Gibson's
ecological psychology provides a clue to the advantages of sophisticated information processing
while the lore of computational theory suggests that a complex computer is needed efficiently to
perform complex computations (i.e. sophisticated information processing).
Natural intelligence
William L. Benzon and David G. Hays, Principles and Development of Natural Intelligence,
Journal of Social and Biological Structures 11, 293 - 322, 1988,
https://www.academia.edu/235116/Principles_and_Development_of_Natural_Intelligence.
Abstract: The phenomena of natural intelligence can be grouped into five classes, and a specific
principle of information processing, implemented in neural tissue, produces each class of
phenomena. (1) The modal principle subserves feeling and is implemented in the reticular
formation. (2) The diagonalization principle subserves coherence and is the basic principle,
implemented in neocortex. (3) Action is subserved by the decision principle, which involves
interlinked positive and negative feedback loops, and resides in modally differentiated cortex. (4)
The problem of finitization resolves into a figural principle, implemented in secondary cortical
areas; figurality resolves the conflict between propositional and Gestalt accounts of mental
representations. (5) Finally, the phenomena of analysis reflect the action of the indexing principle,
which is implemented through the neural mechanisms of language.
These principles have an intrinsic ordering (as given above) such that implementation of each
principle presupposes the prior implementation of its predecessor. This ordering is preserved in
phylogeny: (1) mode, vertebrates; (2) diagonalization, reptiles; (3) decision, mammals; (4) figural,
primates; (5) indexing, Homo sapiens sapiens. The same ordering appears in human ontogeny and
corresponds to Piaget's stages of intellectual development, and to stages of language acquisition.
Cognitive architecture in cultural evolution
William L. Benzon and David G. Hays, The Evolution of Cognition, Journal of Social and Biological
Structures 13(4): 297-320, 1990, https://www.academia.edu/243486/The_Evolution_of_Cognition.
Abstract: With cultural evolution new processes of thought appear. Abstraction is universal, but
rationalization first appeared in ancient Greece, theorization in Renaissance Italy, and model
building in twentieth-century Europe. These processes employ the methods of metaphor,
metalingual definition, algorithm, and control, respectively. The intellectual and practical
achievements of populations guided by the several processes and exploiting the different
mechanisms differ so greatly as to warrant separation into cultural ranks. The fourth rank is not
completely formed, while regions of the world and parts of every population continue to operate
by the processes of earlier ranks.
Economic stagnation is to be expected
William Benzon, Stagnation and Beyond: Economic growth and the cost of knowledge in a complex world,
Working Paper, Version 2, August 2, 2019,
https://www.academia.edu/39927897/Stagnation_and_Beyond_Economic_growth_and_the_co
st_of_knowledge_in_a_complex_world.
Abstract: What economists have identified as stagnation over the last few decades can also be
interpreted as the cost of continuing successful engagement with a complex world that is not set
up to serve human interests. Two arguments: 1) The core argument holds that elasticity (ß) in the
production function for economic growth is best interpreted as a function of the interaction
between the economic entity (firm, industry, the economy as a whole) and particular aspects of the
larger world: physical scale in the case of semi-conductor development, biological organization in
the case of drug discovery. 2) A larger argument interprets current stagnation as the shoulder of a
growth curve in the evolution of culture through a succession of fundamental stages in underlying
cognitive architecture. New stages develop over old through a process of reflective abstraction
(Piaget) in which the mechanisms of earlier stages become objects for manipulation and
deployment for the emerging stage.
Statistical semantics
William Benzon, GPT-3: Waterloo or Rubicon? Here be Dragons, Working Paper, Version 2, August
20, 2020, https://www.academia.edu/43787279/GPT_3_Waterloo_or_Rubicon_Here_be_Dragons;
https://ssrn.com/abstract=3667608; ResearchGate:
https://www.researchgate.net/publication/343444766_GPT3_Waterloo_or_Rubicon_Here_be_Dragons.
Abstract: GPT-3 is an AI engine that generates text in response to a prompt given to it by a
human user. It does not understand the language that it produces, at least not as philosophers
understand such things. And yet its output is in many cases astonishingly like human language.
How is this possible? Think of the mind as a high-dimensional space of signifieds, that is, meaning-bearing elements. Correlatively, text consists of one-dimensional strings of signifiers, that is,
linguistic forms. GPT-3 creates a language model by examining the distances and ordering of
signifiers in a collection of text strings and computes over them so as to reverse engineer the
trajectories texts take through that space. Peter Gärdenfors’ semantic geometry provides a way of
thinking about the dimensionality of mental space and the multiplicity of phenomena in the
world, about how mind mirrors the world. Yet artificial systems are limited by the fact that they
do not have a sensorimotor system that has evolved over millions of years. They do have inherent
limits.