Oh gosh, it has been a long time, hasn’t it? My deepest apologies, sports fans. You know how life is, always getting in the way. Perhaps this will spur a production in verbal output but it’s just as likely that it’ll be a once-per-year salvo. Don’t get used to anything nice, my mother always told me.
Anyway, this too-prolix production has been made possible by a friend soliciting my input on the following article. That shit is long, so take a good thirty minutes out of your day if you plan to read it all, and then take another thirty to read this response, which I know you’re going to do because you love me that much.
I’ll save you some of the trouble by putting my thesis front and center so you can decide whether or not you want to continue reading or leave an angry comment: I think the linked piece is premised on some really flimsy assumptions and glosses over some serious problems, both empirical and logical, in its desire to attain its final destination. This is, sadly, par for the course in popular writing about AI; even very clever people often write very stupid things on this topic. There’s a lot of magical thinking going on in this particular corner of the Internet; much of it, I think, can be accounted for by a desire to believe in a bright new future about to dawn, coupled with a complete lack of consequences for being wrong about your predictions. That said, let’s get to the meat.
There are three basic problems with Tim Urban’s piece, and I’ll try and tackle all three of them. The first is that it relies throughout on entirely speculative and unjustified projections generated by noted “futurist” (here I would say, rather, charlatan, or perhaps huckster) Ray Kurzweil; these projections are the purest fantasy premised on selective interpretations of sparsely available data and once their validity is undermined, the rest of the thesis collapses pretty quickly. The second problem is that Urban repeatedly makes wild leaps of logic and inference to arrive at his favored result. Third, Urban repeatedly mischaracterizes or misunderstands the state of the science, and at one point even proposes a known mathematical and physical impossibility. There’s a sequel to Urban’s piece too, but I’ve only got it in me to tackle this one.
Conjectures and Refutations
Let me start with what’s easily the most objectionable part of Urban’s piece: the charts. Now, I realize that the charts are meant to be illustrative rather than precise scientific depictions of reality, but for all that they are still misleading. Let’s set aside for the moment the inherent difficulty of defining what exactly constitutes “human progress” and note that we don’t really have a good way of determining where we stand on that little plot even granting that such a plot could be made. Urban hints at this problem with his second “chart” (I guess I should really refer to them as “graphics” since they are not really charts in any meaningful sense), but then the problem basically disappears in favor of a fairly absurd thought experiment involving a time traveler from the 1750s. My general stance is that in all but the most circumscribed of cases, thought experiments are thoroughly useless, and I’d say that holds here. We just don’t know how a hypothetical time traveler retrieved from 250 years ago would react to modern society, and any extrapolation based on that idea should be suspect from the get-go. Yes, the technological changes from 1750 to today are quite extreme, perhaps more extreme than the changes from 1500 to 1750, to use Urban’s timeline. But they’re actually not so extreme that they’d be incomprehensible to an educated person from that time. For example, to boil our communication technology down to the basics, the Internet, cell phones, etc. are just uses of electrical signals to communicate information. Once you explain the encoding process at a high level to someone familiar with the basics of electricity (say, Ben Franklin), you’re not that far off from explicating the principles on which the whole thing is based, the rest being details. Consider further that in 1750 we are a scant 75 years away from Michael Faraday, and 100 years away from James Clerk Maxwell, the latter of whom would understand immediately what you’re talking about.
We can play this game with other advances of modern science, all of which had some precursors in the 1750s (combustion engines, the germ theory of disease, etc.). Our hypothetical educated time traveler might not be terribly shocked to learn that we’ve done a good job of reducing mortality through immunizations, or that we’ve achieved heavier-than-air flight. However surprised they might be, I doubt it would be to the extent that they would die. The whole “Die Progress Unit” is, again, a tongue-in-cheek construct from Urban, meant to be illustrative, but rhetorically it functions to cloak all kinds of assumptions about how people would or would not react. It disguises a serious conceptual and empirical problem (just how do we define and measure things like “rates of progress”?) behind a very glib imaginary scenario that is both not meant to be taken seriously and made to function as justification for the line of thinking that Urban pursues later in the piece.
The idea that ties this first part together is Kurzweil’s “Law of Accelerating Returns.” Those who know me won’t be surprised to learn that I don’t think much of Kurzweil or his laws. I think Kurzweil is one part competent engineer and nine parts charlatan, and that most of his ideas are garbage amplified by money. The “Law” of accelerating returns isn’t any such thing, certainly not in the simplistic way presented in Urban’s piece, and relying on it as if it were some sort of proven theorem is a terrible mistake. A full explanation of the problems with the Kurzweilian thesis will have to wait for another time, but I’ll sketch one of the biggest objections below. Arguendo I will grant an assumption that in my view is mostly unjustified, which is that the y-axis on those graphics can even be constructed in a meaningful way.
A very basic problem with accelerating returns is that it very much depends on what angle you look at it from. To give a concrete example, if you were a particle physicist in the 1950s, you could pretty much fall ass-backwards into a Nobel Prize if you managed to scrape together enough equipment to build yourself a modest accelerator capable of finding another meson. But then a funny thing happened, which is that every incremental advance over the gathering of low-hanging fruit consumed disproportionately more energy. Unsurprisingly, the marginal returns on increased energy diminished greatly; the current most powerful accelerator in the world (the LHC at CERN) has beam energies that I believe will max out at somewhere around 7 TeV, give or take a few GeV. That’s one order of magnitude more powerful than the second-most powerful accelerator (the RHIC at Brookhaven), and it’s not unrealistic to believe that the discovery of any substantial new physics will require an accelerator another order of magnitude more powerful. In other words, the easy stuff is relatively easy and the hard stuff is disproportionately hard. Of course this doesn’t mean that all technologies necessarily follow this pattern, but note that what we’re running up against here is not a technological limit per se, but rather a fundamental physical limit: the increased energy scale just is where the good stuff lies. Likewise, there exist other real physical limits on the kind of stuff we can do. You can only make transistors so small until quantum effects kick in; you can only consume so much energy before thermodynamics dictates that you must cook yourself.
The astute reader will note that this pattern matches quite well (at least, phenomenologically speaking) the logistic S-curve that Urban draws in one of his graphics. But what’s really happening there? What Urban has done is to simply connect a bunch of S-curves and overlay them on an exponential, declaring (via Kurzweil) that this is how technology advances. But does technology really advance this way? I can’t find any concrete argument that it does, just a lot of hand-waving about plateaus and explosions. What’s more, the implicit assumption lurking in the construction of this plot is that when one technology plays itself out, we will somehow be able to jump ship to another method. There is historical precedent for this assumption, especially in the energy sector: we started off by burning wood, and now we’re generating energy (at least potentially) from nuclear reactions and sunlight. All very nice, until you realize that the methods of energy generation that are practical to achieve on Earth are likely completely played out. We have fission, fusion, and solar, and that’s about it for the new stuff. Not because we aren’t sufficiently “clever” but because underlying energy generation is a series of real physical processes that we don’t get to choose. There may not be another accessible S-curve that we can jump to.
Maybe other areas of science behave in this way and maybe they don’t; it’s hard to know for sure. But admitting ignorance in the face of incomplete data is a virtue, not a sin; we can’t be justified in assuming that we’ll be able to go on indefinitely appending S-curves to each other. At best, even if the S-curve is “real,” what we’re actually dealing with is an entire landscape of such curves, arranged in ways we don’t really understand. As such, predictions about the rate of technological increase are based on very little beyond extrapolating various conveniently-arranged plots; it’s just that instead of extrapolating linearly, Kurzweil (and Urban following after him) does so exponentially. Well, you can draw lines through any set of data that you like, but it doesn’t mean you actually understand anything about that data unless you understand the nature of the processes that give rise to it.
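To make the point about drawing lines through data a little more concrete, here is a toy sketch in Python. Everything in it is my own assumption for illustration: the logistic parameters (`ceiling`, `rate`, `midpoint`), the observation window, and the helper names are made up and correspond to no real technology or dataset. It fits an exponential to the early part of a single S-curve and then extrapolates, which is roughly the move being criticized above.

```python
import math

def logistic(t, ceiling=1000.0, rate=0.5, midpoint=20.0):
    """Toy S-curve: looks exponential early on, then saturates at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))

def fit_exponential(ts, ys):
    """Least-squares fit of log(y) = a + b*t; returns (a, b)."""
    n = len(ts)
    logs = [math.log(y) for y in ys]
    t_mean = sum(ts) / n
    log_mean = sum(logs) / n
    slope_num = sum((t - t_mean) * (l - log_mean) for t, l in zip(ts, logs))
    slope_den = sum((t - t_mean) ** 2 for t in ts)
    b = slope_num / slope_den
    a = log_mean - b * t_mean
    return a, b

# Pretend we only get to observe the curve well before its midpoint (t = 0..10).
early_ts = list(range(11))
early_ys = [logistic(t) for t in early_ts]
a, b = fit_exponential(early_ts, early_ys)

def extrapolate(t):
    """The fitted exponential, evaluated wherever we like."""
    return math.exp(a + b * t)

# Worst relative error of the exponential fit inside the observed window.
worst_fit_error = max(
    abs(extrapolate(t) - y) / y for t, y in zip(early_ts, early_ys)
)
# Ratio of the extrapolated value to the true (saturated) value at t = 40.
overshoot_at_40 = extrapolate(40) / logistic(40)

print(f"worst relative error inside the window: {worst_fit_error:.4f}")
print(f"extrapolation / truth at t = 40: {overshoot_at_40:.0f}")
```

In this toy, the exponential fits the observed window almost perfectly and is still wrong by several orders of magnitude once the curve saturates. That proves nothing about any actual technology, of course; it only shows that a good fit to the early data tells you very little about which curve you are on.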
You can look at the just-so story of the S-curve and the exponential (also the title of a children’s book I’m working on) as a story about strategy and metastrategy. In other words, each S-curve technology is a strategy, and the metastrategy is that when one strategy fails we develop another to take its place. But of course this itself assumes that the metastrategy will remain valid indefinitely; what if it doesn’t? Hitting an upper or lower physical limit is an example of a real barrier that is likely not circumventable through “paradigm shifts” because there’s a real universe that dictates what is and isn’t possible. Kurzweil prefers to ignore things like this because they throw his very confident pronouncements into doubt, but if we’re actually trying to formulate at least a toy scientific theory of progress, we can’t discount these scenarios.
1. p → q;
3. therefore, q
Since Kurzweil’s conjectures (I won’t dignify them with the word “theory”) don’t actually generate any useful predictions, it’s impossible to test them in any real sense of the word. I hope I’ve done enough work above to persuade the reader that these projections are nothing more than fantasy predicated on the fallacious notion that the metastrategy of moving to new technologies is going to work forever. As though it weren’t already bad enough to rely on these projections as if they were proven facts, Urban repeatedly mangles logic in his desire to get where he’s going. For example, at one point, he writes:
So while nahhhhh might feel right as you read this post, it’s probably actually [sic] wrong. The fact is, if we’re being truly logical and expecting historical patterns to continue, we should conclude that much, much, much more should change in the coming decades than we intuitively expect.
It’s hard to see why the skeptics are the ones who are “probably actually wrong” and not Urban and Kurzweil. If we’re being “truly logical” then, I’d argue, we aren’t making unjustified assumptions about what the future will look like based on extrapolating current non-linear trends, especially when we know that some of those extrapolations run up against basic thermodynamics.
That self-assured gem comes just after Urban commits an even grosser offense against reason. This:
And yes, no one in the past has not died. But no one flew airplanes before airplanes were invented either.
is not an argument. In the words of Wolfgang Pauli, it isn’t even wrong. This is a sequence of words that means literally nothing and no sensible conclusion can be drawn from it. To write this and to reason from such premises is to do violence to the very notion of logic that you’re trying to defend.
The entire series contains these kinds of logical gaps that are basically filled in by wishful thinking. Scales, trends, and entities are repeatedly postulated, then without any particular justification or reasoning various attributes are assigned to them. We don’t have the faintest idea of what an artificial general intelligence or super-intelligence might look like, but Urban (via Kurzweil) repeatedly gives it whatever form will make his article most sensational. If for some reason the argument requires an entity capable of things incomprehensible to human thought, that capability is magicked in wherever necessary.
The State of the Art
Urban’s taxonomy of “AI” is likewise flawed. There are not, actually, three kinds of AI; depending on how you define it, there may not even be one kind of AI. What we really have at the moment are a number of specialized algorithms that operate on relatively narrowly specified domains. Whether or not that represents any kind of “intelligence” is a debatable question; pace John McCarthy, it’s not clear that any system thus far realized in computational algorithms has any intelligence whatsoever. AGI is, of course, the ostensible goal of AI research generally speaking, but beyond general characteristics such as those outlined by Allen Newell, it’s hard to say what an AGI would actually look like. Personally, I suspect that it’s the sort of thing we’d recognize when we saw it, Turing-test-like, but pinning down any formal criteria for what AGI might be has so far been effectively impossible. Whether something like the ASI that Urban describes can even plausibly exist is of course the very thing in doubt; it will not surprise you, if you have not read all the way through part 2, that having postulated ASI in part 1, Urban immediately goes on to attribute various characteristics to it as though he, or anyone else, could possibly know what those characteristics might be.
I want to jump ahead for a moment and highlight one spectacularly dumb thing that Urban says at the end of his piece that I think really puts the whole thing in perspective:
If our meager brains were able to invent wifi, then something 100 or 1,000 or 1 billion times smarter than we are should have no problem controlling the positioning of each and every atom in the world in any way it likes, at any time—everything we consider magic, every power we imagine a supreme God to have will be as mundane an activity for the ASI as flipping on a light switch is for us.
This scenario is impossible. Not only does it violate everything we know about uncertainty principles, but it also effectively implies a being with infinite computational power; this is because even if atoms were classical particles, controlling the position of every atom logically entails running forward in time a simulation of the trajectories of those atoms to infinite precision, a feat that is impossible in a finite universe. Not only that, but the slightest error in initial conditions will accumulate exponentially (here, the exponential stuff is actually mathematically valid), so that e.g. improving your forecast horizon by a factor of 10 requires a factor of 100 increase in computational power and so on.
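If you want to see the error-accumulation problem for yourself, here is a minimal Python sketch. To be loud about the assumptions: it uses the logistic map x → 4x(1 − x), a standard textbook chaotic system, as a stand-in. It is emphatically not a model of atoms, and the starting point, the tolerance, and the function names are all made up for illustration.

```python
def logistic_map(x):
    """One step of the chaotic map x -> 4x(1 - x)."""
    return 4.0 * x * (1.0 - x)

def forecast_horizon(initial_error, start=0.3, tolerance=0.1, max_steps=1000):
    """Count steps until two nearby trajectories disagree by more than `tolerance`."""
    truth = start
    forecast = start + initial_error  # our slightly-wrong measurement
    for step in range(1, max_steps + 1):
        truth = logistic_map(truth)
        forecast = logistic_map(forecast)
        if abs(truth - forecast) > tolerance:
            return step
    return max_steps  # never diverged within the step budget

# Each row improves the initial measurement a thousandfold.
for error in (1e-3, 1e-6, 1e-9, 1e-12):
    steps = forecast_horizon(error)
    print(f"initial error {error:.0e}: forecast useful for {steps} steps")
```

Errors in this map roughly double per step on average, so the back-of-envelope expectation is that every thousandfold improvement in the initial measurement buys only about ten more steps of useful forecast. In this toy, at least, the horizon grows with the logarithm of your precision, which is a pretty bad deal if your plan involves tracking every atom in the world.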
This might seem like an awfully serious takedown of an exaggerated rhetorical point, but it’s important because it demonstrates how little Urban knows, or worries, about the actual science at stake. For example, he routinely conflates raw computational power with the capabilities of actual mammalian brains:
So the world’s $1,000 computers are now beating the mouse brain and they’re at about a thousandth of human level.
But of course this is nonsense. We are not “beating” the mouse brain in any substantive sense, we merely have machines that do a number of calculations per second that is comparable to a number that we imagine the mouse brain is also doing. About the best we’ve been able to do is to mock up a network of virtual point neurons that kind of resembles a slice of the mouse brain, maybe, if you squint from far away, and run it for a few seconds. Which is a pretty impressive technical achievement, but saying that we’ve “beaten the mouse brain” is wildly misleading. “Affordable, widespread AGI-caliber hardware in ten years,” is positively fantastical even under the most favorable Moore’s Law assumptions.
Of course, even with that kind of hardware, AGI is not guaranteed; it takes architecture as much as computational power to get to intelligence. Urban recognizes this, but his proposed “solutions” to this problem again betray a misunderstanding of both the state of science and our capabilities. For example, his “emulate the brain” solution is basically bog-standard connectionism. Not that connectionism is bad or hasn’t produced some pretty interesting results, but neuroscientists have known for a long time now that the integrate-and-fire point neuron of connectionist models is a very, very, very loose abstraction that doesn’t come close to capturing all the complexities of what happens in the brain. As this paper on “the neuron doctrine” (PDF) makes clear, the actual biology of neural interaction is fiendishly complicated, and the simple “fire together-wire together” formalism is a grossly inadequate (if also usefully tractable) simplification. Likewise, the “whole brain simulation” story fails to take into account real biological complexities of faithfully simulating neuronal interactions. Urban links to an article which claims that whole-brain emulation of C. elegans has been achieved, but while the work done by the OpenWorm folks is certainly impressive, it’s still a deeply simplified model. It’s hard from the video to gauge how closely the robot-worm’s behavior matches the real worm’s behavior; it’s likely that, at least, it exhibits some types of behaviors that the worm also exhibits, but I doubt that even its creators would claim ecological validity for their model. At the very best, it’s a proof of principle regarding how one might go about doing something like this in the future, and keep in mind that this is a 300-neuron creature whose connectome is entirely known.
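For a sense of just how spare the integrate-and-fire point neuron is, here is a minimal sketch of the leaky variant in Python. The parameter values (`tau`, `v_rest`, `v_threshold`, `v_reset`, `resistance`) are illustrative numbers I picked to make it spike; they are not taken from the paper above, from OpenWorm, or from any particular model.

```python
def simulate_lif(input_current, steps=1000, dt=0.1, tau=10.0,
                 v_rest=-65.0, v_threshold=-50.0, v_reset=-70.0,
                 resistance=10.0):
    """Leaky integrate-and-fire point neuron, integrated with forward Euler.

    Returns the list of spike times (in the same time units as `dt`).
    """
    v = v_rest
    spike_times = []
    for step in range(steps):
        # Voltage leaks back toward rest while being pushed up by the input.
        dv = (-(v - v_rest) + resistance * input_current) / tau
        v += dv * dt
        if v >= v_threshold:
            # "Fire": record the spike and reset. That is the entire model.
            spike_times.append(step * dt)
            v = v_reset
    return spike_times

weak = simulate_lif(input_current=1.0)    # settles below threshold
strong = simulate_lif(input_current=2.0)  # crosses threshold repeatedly
print(f"weak input: {len(weak)} spikes, strong input: {len(strong)} spikes")
```

One state variable, one threshold, one reset. There are no dendrites, no neurotransmitters, no glia, and no geometry, which is exactly why it is tractable and exactly why it is such a loose abstraction.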
Nor are genetic algorithms likely to do the trick. Overall, the track record of genetic algorithms in actually producing useful results is decidedly mixed. In a recent talk I went to, Christos Papadimitriou, a pretty smart guy, flat out claimed that “genetic algorithms don’t work.” (PDF, page 18). I do not possess sufficient expertise to judge the truth of this statement, but I think the probability that genetic algorithms will provide the solution is low. It does not help that we “know” what we’re aiming for; in truth we have no idea what we’re optimizing for, and our end-goal is something of the “we know it when we see it” variety, which isn’t something that lends itself terribly well to a targeted search. Evolution, unlike humans, optimized for certain sorts of local fitness maxima (to put it very, very simply), and wound up producing something that couldn’t possibly have been targeted for in such an explicit way.
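To see why the lack of a target matters, consider a bare-bones genetic algorithm in Python. This is a sketch under my own assumptions: the toy problem (maximize the number of 1s in a bit string), the population size, the mutation rate, and the function names are all invented for illustration and have nothing to do with Papadimitriou’s talk or Urban’s piece.

```python
import random

def fitness(bits):
    """The explicit target: count of 1s. Everything the search 'wants' lives here."""
    return sum(bits)

def evolve(length=20, population_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Toy genetic algorithm; returns (best initial fitness, best final fitness)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    initial_best = max(fitness(ind) for ind in population)
    for _ in range(generations):
        # Selection: keep the fitter half unchanged.
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)       # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            children.append(child)
        population = survivors + children
    final_best = max(fitness(ind) for ind in population)
    return initial_best, final_best

initial_best, final_best = evolve()
print(f"best fitness went from {initial_best} to {final_best} (max 20)")
```

The loop itself is trivial; all of the work is done by `fitness`. For counting 1s that function is a single line. For “general intelligence” nobody knows how to write it, and “we’ll know it when we see it” is not something you can hand to a sort.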
All of this is to say that knowing the connectome and having some computing power at your disposal is a necessary but not sufficient condition for replicating even simple organismic functionality. Understanding how to go from even a complete map of the human brain to a model of how that brain produces intelligence is not a simple mapping, nor is it just a matter of how many gigaflops you can execute. You have to have the right theory or your computational power isn’t worth that much. A huge problem that one hits on when speaking with actual neuroscientists is that there’s really a dearth of theoretical machinery out there that even begins to accurately represent intelligence as a whole, and it isn’t for lack of trying.
The concluding discussion of what an AI might look like in relation to humans is hopelessly muddled. We barely have any coherent notion of how to quantify existing human intelligence, much less a possible artificial one. There’s no particular reason to think that intelligence follows some kind of linear scale, or that “170,000 times more intelligent than a human” is any sort of meaningful statement, rather than a number thrown out into the conversation without any context.
The problem with the entire conversation surrounding AI is that it’s almost entirely divorced from the realities of both neuroscience and computer science. The boosterism that emanates from places like the Singularity Institute and from people like Kurzweil and his epigones is hardly anything more than science fiction. Their projections are mostly obtained by drawing straight lines through some judiciously-selected data, and their conjectures about what may or may not be possible are mostly based on wishful thinking. It’s disappointing that Urban’s three weeks of research have produced a piece that reads like an SI press release, rather than any sort of sophisticated discussion of either the current state of the AI field or the tendentious and faulty logic driving the hype.
None of this is to say that we should be pessimists about the possibility of artificial intelligence. As a materialist, I don’t believe that humans are somehow imbued with any special metaphysical status that is barred to machines. I hold out hope that some day we will, through diligent research into the structure of existing brains, human and otherwise, unravel the mystery of intelligence. But holding out hope is one thing; selling it as a foregone conclusion is quite another. Concocting bizarre stories about superintelligent machines capable of manipulating individual atoms through, apparently, the sheer power of will, is just fabulism. Perhaps no more succinct and accurate summary of this attitude has ever been formulated than that written by John Campbell of Pictures for Sad Children fame:
it’s flying car bullshit: surely the world will conform to our speculative fiction, surely we’re the ones who will get to live in the future. it gives spiritual significance to technology developed primarily for entertainment or warfare, and gives nerds something to obsess over that isn’t the crushing vacuousness of their lives
Maybe that’s a bit ungenerous, but I find that it’s largely true. Obsession about AI futures is not even a first world problem as much as a problem for a world that has never existed and might never exist. It’s like worrying about how you’ll interact with the aliens that you’re going to find on the other side of the wormhole before you even know how to get out of the solar system without it taking decades.