Monday, April 25, 2011

Turing paper

This is a paper I wrote for Phil. 306, Computer Ethics, at SSU. It is the first in a series of essays I plan to post on this blog.

Alan Turing, Daniel Dennett, & IBM/Watson
       The first part of Alan Turing’s paper raises the question of whether a digital computer could, in theory, imitate human thought well enough to fool a real human. Turing believed it could. The real point of this “imitation game,” however, is to explore the hypothetical possibilities of Artificial Intelligence. A reading of Daniel Dennett brings some input from neuroscience and consciousness theory to the discussion. Videos about IBM’s Watson computer offer a concrete example of a machine programmed along the lines of Turing’s thought experiment, to be tested by playing the game of Jeopardy!. This paper tries to assess IBM/Watson’s potential as it relates to Turing’s paper.
       The bulk of Turing’s paper is concerned with addressing specific arguments against his hypothesis. I am taking the position that his hypothesis is valid, but I want to explore how well he defends it, specifically concerning the following issues:
       Even if a computer can successfully imitate a human, it does not count as “thinking” unless it can also be shown that humans can “think”; otherwise one could argue that humans are only meat puppets anyway, and a machine imitating one is no big deal. That brings into the discussion the questions: What is Intelligence? What is Consciousness? What is Free Will? And how do all these concepts relate to each other? This is where Daniel Dennett offers some input.
       Then, if we can safely assume that humans are not just meat puppets, does Turing adequately explain how a digital computer (being a discrete-state machine) can imitate a non-discrete-state human?
       Turing’s first two issues, the Theological Objection and the “Heads in the Sand” Objection, are easy to dismiss. His third is The Mathematical Objection, based on Gödel’s theorem that any sufficiently powerful, logically consistent formal system will necessarily contain statements that can neither be proved nor disproved within that system. Turing dismisses this objection as irrelevant, since the same limitation applies to the human intellect. I agree.
       Turing’s fourth issue is The Argument from Consciousness. This is the most interesting argument and, as he notes, the ones that follow are just variations on it. But his ninth issue, The Argument from Extrasensory Perception, is, to this writer’s mind, a complete waste of time. Turing says it is a strong argument, and even claims that the statistical evidence for telepathy is “overwhelming.” I don’t know what kind of statistics they were doing in his time, but the evidence for ESP would not come anywhere near statistical significance by today’s standards. It is considered pseudoscience by most scientists. Maybe Turing threw those paragraphs in there for comic relief.
       Turing’s strongest defense of his hypothesis comes in the last five pages, under the heading Learning Machines. To discuss learning, we have to go back and include the questions raised about consciousness, and the question about the difference between discrete-state and non-discrete-state computing.
       I began this paper by discussing Turing’s points in the order that he presented them, but now I have “decided” to rearrange them the way I “want” to. Turing did the same thing when he wrote, “Let us return for a moment to Lady Lovelace’s objection, which stated that the machine can only do what we tell it to do.” Can a machine “choose” to digress from its program? (The reader does not have to answer rhetorical questions; the answer is “yes.”) The digression is called a “subroutine” and the “choice” is made by whichever “state” the computer finds itself in. I am not going to get into a nit-picking language quarrel about whether modules and subroutines are not just part of the greater algorithm that computers use for their book of rules to follow, because it can then be argued that humans make their “choices” the same way. The real question is whether the difference between a discrete-state computer and a non-discrete-state human mind is an important one. In this same context, Turing goes on to use the analogy of a neutron bombarding an atomic pile of subcritical mass, compared to a neutron bombarding an atomic pile of supercritical mass. The neutron is analogous to an idea that sets off a chain-reaction of other ideas in humans, but not in computers. And then Daniel Dennett says that evolution showed us that consciousness was created, not by a higher intelligence with a teleological algorithm, but by thousands of discrete mini-module bio-electro-chemical algorithms that formed together by chance over a huge expanse of time, and that there is no “Ego” that centrally controls our nervous system, and I say that Lord Byron’s daughter was a brilliant mathematician who wrote computer programs for Charles Babbage’s machine a hundred years before Turing broke the Nazi Enigma code, and I am trying to get all this information down on this paper as fast as I can because I am
running past the word count requirements and I want to complete this algorithm by the time it is due.
       Now, how can a digital computer successfully imitate the previous paragraph, which I just wrote, with scattered, chain-reaction ideas presented in no logical order? That was another trick question; if a human can think like that, a human can write a program like that. (It’s called “spaghetti code.”) The choices of input, output, and subroutines are not part of the problem. The real problem is that I inserted the feeling of want in there. Does a computer want to finish the algorithm, or does it just do it because it has to? If a computer is instructed by a specific algorithm to write a 300-word paper on a given subject, can it “choose” to write a longer one? Again, to paraphrase Lady Lovelace, it is up to the human programmer to make the input and output choices for the computer. Can a human programmer just ignore the poor computer’s wants and feelings? Yes, but first he or she has to program those wants and feelings in there, and this writer does not know anything about how to write a program like that.
       I wrote, in a previous digression from whatever subroutine I was in at the time, that Turing’s strongest arguments are under the heading of Learning Machines. He makes the brilliant suggestion that, instead of trying to program computers to imitate adult humans, we should program computers to imitate children who can learn to be adults. Brilliant! Program computers to learn! Now we are on the right track if our real goal is to explore the possibility of Artificial Intelligence. We should be writing computer programs that, once written, can rewrite themselves after that. We humans rewrite ourselves all the time, so why can’t we write programs for computers to do the same?
       Turing writes that, “...this (learning) process will be more expeditious than evolution. The survival of the fittest is a slow method for measuring advantages. The experimenter, by the exercise of intelligence, should be able to speed it up. Equally important is the fact that he is not restricted to random mutations. If he can trace a cause for some weakness he can probably think of the kind of mutation which will improve it.”
       Turing has given us a brilliant idea here, but his own field is mathematics and computer science. Human psychology does not seem to be his strong point, although he does show a basic understanding of the feedback relationship between genetics and environment. He also apparently has some knowledge of operant conditioning, and he understands how punishment and rewards relate to learning. But here is where Turing seems to be way out of his element. He writes, “The machine has to be so constructed that events which shortly preceded the occurrence of a punishment signal are unlikely to be repeated, whereas a reward signal increased the probability of repetition of the events which led up to it. These definitions do not presuppose any feelings on the part of the machine…(my italics).”
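       To see how little machinery Turing’s mechanism actually requires, here is a rough sketch in Python of what he seems to describe; every name and number in it is my own invention, not anything from Turing or IBM. Actions followed by a reward signal become more probable, actions followed by a punishment signal become less probable, and no feelings are presupposed anywhere.

    import random

    # A minimal sketch of Turing's reward/punishment idea: actions that
    # preceded a reward become more likely to be repeated, actions that
    # preceded a punishment become less likely. No "feelings" anywhere.

    class TuringPupil:
        def __init__(self, actions):
            # Start with equal weight for every possible action.
            self.weights = {action: 1.0 for action in actions}
            self.last_action = None

        def act(self):
            # Choose an action with probability proportional to its weight.
            actions = list(self.weights)
            weights = [self.weights[a] for a in actions]
            self.last_action = random.choices(actions, weights=weights)[0]
            return self.last_action

        def reward(self):
            # "...a reward signal increased the probability of repetition..."
            if self.last_action is not None:
                self.weights[self.last_action] *= 1.5

        def punish(self):
            # "...events which shortly preceded the occurrence of a
            # punishment signal are unlikely to be repeated."
            if self.last_action is not None:
                self.weights[self.last_action] *= 0.5

    # The teacher (a human programmer) decides which signal to send.
    pupil = TuringPupil(["say_please", "grab_the_toy"])
    for _ in range(20):
        if pupil.act() == "say_please":
            pupil.reward()
        else:
            pupil.punish()

       Notice that the “meaning” of the reward lives entirely in the teacher’s loop, not in the machine, which is exactly the point I want to press next.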
       The difference between a reward and a punishment is one of value judgment. Values are not like mathematically logical operations; there is no objective “right” or “wrong” answer that a discrete-state machine could understand. Reward signals and punishment signals would have no meaning to a computer that did not have emotions. Why would a computer care whether the programmer thought one signal meant “bad boy” or another signal meant “good girl?”
       I have an extremely limited knowledge of modern computer programming languages, so I will have to take the stance of a theoretical philosopher and engage only in thought experiments. Whenever I hear people talk about how emotions are illogical, or emotions get in the way of clear thinking, or some other variation on the perceived conflict between emotions and logic, I always have to ask, “would humans be as intelligent as we are if, like Spock on Star Trek, we were only logical and not emotional?” I think not.
       I argue that emotions are no different in kind from what is commonly referred to as “logic.” I argue that emotions are logical; they just follow a different algorithm than the traditional “thinking” kind of logic, with different starting premises and different goals. Emotions are the algorithms that contain all the value-judgment subroutines. In a “good” state, the subroutine switches to one module; in a “bad” state it switches to another. I will further argue that without emotions, humans cannot learn, except by rote, in which case computers really could imitate humans, because humans would only be meat puppets. True learning, in the higher, human sense, means growing and improving, and this cannot be done without value judgments. I hypothesize that the human nervous system has more than one algorithm running at the same time, in parallel, and the emotional logic interfaces with the computational logic during various subroutines, creating a self-conscious gestalt that can make value judgments about what it wants to learn and how it wants to learn it. I further hypothesize that a computer could be programmed with separate but parallel algorithms, one with a purely mathematical component that makes “yes” and “no” judgments, and another with a value-judgment component that makes “good” and “bad” judgments, with various interfaces where one of the four combinations of these two “states” directs the subroutines. The four combinations would be yes/good, yes/bad, no/good, and no/bad. A mathematical model of how operant conditioning works could produce a learning algorithm. This might easily be done using binary math.
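       To make the hypothesis a little more concrete, here is a speculative sketch, not a working model of a mind: two evaluators run in parallel, one computational (yes/no) and one evaluative (good/bad), and their four combined states select which subroutine runs next. Every function and state name below is my own invention, and the whole thing really is just binary math.

    # A speculative sketch of the two-parallel-algorithms hypothesis:
    # one evaluator answers the computational question (yes/no), the
    # other makes the value judgment (good/bad), and the four combined
    # states direct the subroutines.

    def computational_logic(answer_is_correct: bool) -> int:
        # The traditional "thinking" logic: 1 for yes, 0 for no.
        return 1 if answer_is_correct else 0

    def emotional_logic(outcome_is_desirable: bool) -> int:
        # The value-judgment logic: 1 for good, 0 for bad.
        return 1 if outcome_is_desirable else 0

    def keep_doing_this():      return "repeat and reinforce"
    def enjoy_the_accident():   return "explore the surprise"
    def try_a_new_approach():   return "revise the method"
    def avoid_this_entirely():  return "mark as a mistake to avoid"

    # The four combinations of the two parallel states.
    SUBROUTINES = {
        (1, 1): keep_doing_this,      # yes/good
        (0, 1): enjoy_the_accident,   # no/good
        (1, 0): try_a_new_approach,   # yes/bad
        (0, 0): avoid_this_entirely,  # no/bad
    }

    def gestalt_step(answer_is_correct: bool, outcome_is_desirable: bool) -> str:
        state = (computational_logic(answer_is_correct),
                 emotional_logic(outcome_is_desirable))
        return SUBROUTINES[state]()

    print(gestalt_step(True, False))   # yes/bad -> "revise the method"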
       If my hypothesis is tried and does not work, maybe someone with more knowledge of programming than I have could look into the possibility of non-discrete-state programmable computers. Now that I have gotten that rant out of my system, I will try to examine IBM/Watson.
       Could IBM/Watson pass the Turing test? Yes, if we are using the “imitation game” criterion. I think that chess-playing computers already have done so, although I am sure that any grand master chess champion who lost to a computer would disagree. Machines imitate humans better when they make mistakes and lose. Watson is more sophisticated than a chess-playing computer. A chess-playing computer can only imitate a chess-playing human in a situation where both machine and human are thinking in logical algorithms. As the engineers in the IBM video pointed out, the game of Jeopardy entails open-ended questions stated in natural language. Also, making a mistake does not necessarily count as imitating a human. The IBM video demonstrated the difference between mistakes of fact and mistakes of form. If a computer merely gives the wrong answer it could be imitating a human, but if the computer cannot even understand the question, as earlier versions of Watson could not, its non-humanness is given away. The newest version of Watson has about a thousand parallel CPUs designed to understand open-ended questions and natural language, as well as a vast database of information on the question categories.
       Is this the realization of Turing’s dream? No, it is still only capable of imitating the people who programmed it. It does have the capacity to learn how to make better choices about the probabilities of potential answers being right or wrong, based on previous experience. This is a great advancement over all previous computers, but even this advantage is limited to the databases that the programmers have given it and the algorithms that the programmers have written for it. It does not know how to write new algorithms of its own, based on emotional and esthetic value judgments, as real humans do.
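       As I understand the IBM videos, that limited learning amounts to ranking candidate answers by an estimated probability of being correct and only answering when the best candidate clears some confidence threshold. Here is a toy sketch of that selection step; the candidates, scores, and threshold are all made up for illustration.

    # A toy sketch of the kind of choice Watson appears to make: rank
    # candidate answers by estimated confidence and answer only when the
    # best one clears a threshold. The candidates and numbers are invented.

    def choose_answer(candidates, threshold=0.5):
        # candidates: dict mapping a candidate answer to an estimated
        # probability of being correct (produced, in the real Watson,
        # by many parallel scoring algorithms).
        best_answer, best_confidence = max(candidates.items(), key=lambda kv: kv[1])
        if best_confidence >= threshold:
            return best_answer, best_confidence
        return None, best_confidence   # stay silent rather than guess

    scores = {"Who is Alan Turing?": 0.87,
              "Who is Charles Babbage?": 0.09,
              "Who is Lady Lovelace?": 0.04}
    print(choose_answer(scores))

       The numbers can be tuned by experience, but every scoring algorithm behind them was still written by the programmers.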
       What might be done with a machine like Watson? The IBM programmers have already figured out that Watson needed thousands of CPUs running parallel algorithms, and some of those algorithms can make limited value judgments based on objective mathematical and statistical logic, but I do not believe that Watson can make subjective value judgments about emotions or esthetics, and I do not believe that Watson can understand the “meaning” of its correct answers. Since Watson’s “learning” is done by “rote,” it cannot teach itself to grow and progress. If some of those parallel CPUs were also running on good/bad code instead of just yes/no code, and if a “starting state” algorithm could be designed, as in Turing’s “Learning Child” thought experiment, maybe Watson could teach itself to “think” like an adult human. But even then, it would also have to be rigged up with light sensors, heat sensors, and sound sensors that register not just wavelengths, temperatures, or decibels, but value judgments about those states, like too hot, too bright, or not loud enough. While computers are currently able to run diagnostics on their hardware, Watson should also be able to run diagnostics on all of its software modules, and then rewrite the algorithms that result in mistakes, based on value judgments about what it wants to accomplish. The “Learning Child” starting state should include an algorithm that defines its wants in extremely general, non-specific terms, not like “I want to be a cowboy when I grow up,” but “I want to be successful in life,” and then the learning programs should be written with the goal of learning what “Life” is, and what “successful” means. After Watson has been set up, its self-learning program would generate questions to ask real people through audio-visual interfaces. These real people would no longer be “programming” Watson because they would not be making the input and output decisions for Watson. Watson would be generating its own, original output that requests original input based on good/bad algorithms created by its starting state of wants. The starting state would be analogous to human heredity, and the output/input feedback loop would be analogous to human operant conditioning based on experience with the environment.
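       Purely as a thought experiment, the feedback loop I am imagining might look something like the sketch below. Every name, signal, and “want” in it is hypothetical; nothing here is Watson’s real architecture, and the hard parts (understanding the answers, judging them, rewriting the modules) are waved away in single lines.

    # A thought-experiment sketch of the "Learning Child" loop described
    # above: a very general starting "want," a question generated by the
    # machine itself, an answer supplied by a real person, and a good/bad
    # judgment that decides what is kept. Everything here is hypothetical.

    STARTING_WANTS = ["understand what 'Life' is",
                      "understand what 'successful' means"]

    def generate_question(want, lessons):
        # The output originates with the machine, not with a programmer.
        return (f"I want to {want}. So far I believe: "
                f"{lessons if lessons else 'nothing yet'}. What am I missing?")

    def value_judgment(answer):
        # A stand-in good/bad judgment: did the answer give me anything at all?
        return "good" if answer.strip() else "bad"

    def learning_child_loop(ask_a_real_person, rounds=4):
        lessons = []
        for i in range(rounds):
            want = STARTING_WANTS[i % len(STARTING_WANTS)]
            question = generate_question(want, lessons)
            answer = ask_a_real_person(question)        # the environment
            if value_judgment(answer) == "good":
                lessons.append(answer)                  # crude operant conditioning
        return lessons

    # For demonstration, a canned "real person" stands in for the interface:
    print(learning_child_loop(lambda q: "Successful means growing on your own terms."))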
       Some people like to argue that machines cannot have emotions or make esthetic value judgments but, as Daniel Dennett has pointed out, human judgments are not that much different from mechanical ones, being just bio-chemical-electrical algorithms. I am not a reductionist myself, but I do believe that reductionists offer valuable empirical data for us non-meat puppet types to use as input for our gestalt-generating programs.