It makes a certain inevitable sense that two of our topics, AI and IQ tests, would collide. Do we have anything left but an epistemological trainwreck?
If an “Artificial Intelligence” is attempting to simulate an intelligence, and an IQ test is supposed to measure intelligence, then why not give an AI an IQ test?
A battle appears to be shaping up over whose AI is “smarter”, which is a fascinating problem because typically an AI is going to be good at one specific task (identifying thumbprints, say) and absolutely incapable at others. The long-sought “Artificial General Intelligence” would be something as good as a human across a variety of tasks: natural language query processing, recall, and enough creativity to structure a response in passable English.
IBM’s Watson AI beat Brad Rutter at Jeopardy, and Brad Rutter has taken IQ tests: [lanc]
“We had him tested before he went into first grade and his IQ was so high that they could not even chart it,” said Joann Jupin, Rutter’s kindergarten teacher at Bucher Elementary School in Manheim Township.
Being the suspicious sort, I am starting to wonder if the AI researchers and IQ testers are trying to dodge this question: if A is better at a task than B, and that task is part of an IQ test, can we say that A is “smarter” than B? I suspect the answer is, “it’s more complicated than that” which just invites the question I was asking before: “then what good is your test?”
Of course we have another problem: William Stern [wik] came up with the idea of factoring “mental age” into “intelligence test” results, by dividing mental age by chronological age – hence Intelligence Quotient. Should we count Watson as being about 10 years old? That doesn’t make any sense. Can we de-normalize the IQ by multiplying by age to get the original intelligence test score back, then compare that with Watson’s score? My initial reaction is that Watson would be unfairly disadvantaged because it would have to parse the questions on the IQ test – but so does a human.
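For what it’s worth, the ratio-IQ arithmetic above, and the “de-normalize” step, are trivial to write down. This is just an illustrative sketch of the old mental-age quotient; the ages and scores are invented, and modern IQ tests don’t actually use this formula anymore:

```python
# The original "ratio IQ": mental age divided by chronological age, times 100.
# Also the reverse step the text describes: multiply the IQ by age to get
# back a raw mental-age score. All numbers here are made up for illustration.

def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Stern-style ratio IQ: (mental age / chronological age) * 100."""
    return 100.0 * mental_age / chronological_age

def mental_age_from_iq(iq: float, chronological_age: float) -> float:
    """De-normalize: recover the mental-age score from a ratio IQ."""
    return iq * chronological_age / 100.0

print(ratio_iq(12, 10))             # a 10-year-old testing at mental age 12 -> 120.0
print(mental_age_from_iq(120, 10))  # -> 12.0
```

The problem the text points at is visible right in the function signature: the formula has no meaningful `chronological_age` input for Watson, so the quotient is undefined for a machine.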
Has the team at IBM tried training Watson on a mountain of IQ tests?
Turing’s famous test for machine intelligence ought to break down once AI gets to the point where it can fool people – then what? It’s a binary yes/no result – what if an AI is able to fool people tremendously well because it’s better at many of the things we consider “intelligence” than the testers are? It wouldn’t matter – it’d only have to fool them enough.
Let’s set aside that I am extremely skeptical that IQ tests measure intelligence, but if they do, why wouldn’t they work for AI? We’re willing to measure humans against AI on specific other tests, like chess, or Go, or Dota2 – if playing any of those games well is something that involves intelligence, then why not? In 2015 that’s exactly what happened: [techx]
Results: It scored a WPPSI-III VIQ that is average for a four-year-old child, but below average for 5 to 7 year-olds
“We found that the WPPSI-III VIQ psychometric test gives a WPPSI-III VIQ to ConceptNet 4 that is equivalent to that of an average four-year old. The performance of the system fell when compared to older children, and it compared poorly to seven year olds.”
The article then goes on to say what you’re probably expecting, if you’ve been following this issue: the AI didn’t perform as well as it probably could have due to natural language parsing errors.
I think that the Turing Test should be ignored, frankly – let’s measure AIs against IQ tests. It’d be a win/win: either it would show how bad IQ tests are, or it would (eventually) shut all the MENSA types up. Or maybe both.
In a recent article on Slate, [slate] the author explains it neatly:
Measuring human intelligence is already a pretty controversial and complicated process, not the least because there’s no stringent definition for what intelligence even is. So it goes without saying that trying to measure machine intelligence is another iteration of an already flawed process. But whereas applying human intelligence to a scale is perhaps unnecessarily reductionist, measuring machine intelligence is a fraught necessity. A.I. are designed with specific tasks and services in mind, so in order to say, “This iteration is more effective than another,” you need a framework that makes that comparison quantifiable.
To that end, researchers from China have just developed what is ostensibly a new kind of IQ test for A.I. systems and human beings alike. It’s not the first time scientists have attempted to peg an IQ number to A.I. (historically those programs barely test better than an average toddler). But the Chinese researchers, in a new preprint paper, say they’ve developed a unique standard for assessing IQ in different A.I. agents. They used it on a variety of different A.I. assistant services last year and found that Google Assistant was among the most intelligent programs currently available, while Apple’s Siri ranked last.
I don’t get that – since most AIs are tested against some kind of objective success/fail criteria, don’t success percentages constitute an adequate metric? For general-purpose AI, why not use IQ tests?
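The success/fail metric I have in mind is about as simple as metrics get. A minimal sketch, with invented data – for a task-specific AI, the fraction of trials it gets right is already a usable comparison, no IQ scale required:

```python
# Objective success/fail scoring: each trial either passes or fails,
# and the metric is just the fraction of passes. The trial data below
# is invented purely for illustration.

def success_rate(results: list[bool]) -> float:
    """Fraction of trials the system got right."""
    return sum(results) / len(results)

# e.g. two thumbprint-matching systems run on the same four prints:
agent_a = [True, True, False, True]
agent_b = [True, False, False, True]

print(success_rate(agent_a))  # -> 0.75
print(success_rate(agent_b))  # -> 0.5
```

That works fine for one task; the trouble starts when you want a single number spanning many unrelated tasks – which is exactly the hole an “IQ for AI” is supposed to fill.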
IBM’s marketing team, if you’re reading this: jump on this opportunity and train Watson to score 150 on IQ tests then publicly challenge Donald Trump to compare IQ scores. It’d be Yuge.
Fooling people tremendously well: I once imagined that fooling another person is a problem of intellect and creativity – that someone smarter would find it easier to fool someone less smart. Of course, I don’t know what “smart” is. Deception requires creativity, a good memory, and the ability to think fast. If the Turing test is a problem defined as “fool some people into thinking you are a human,” then anything able to pass it might have to be smarter than the humans testing it. Whatever “smart” is.
Definition of Intelligence: “Intelligence is the aggregate or global capacity of the individual to act purposefully, to think rationally and to deal effectively with his environment (Wechsler, 1944).” [wikipedia] So part of intelligence is dealing effectively with one’s environment? OK – does that mean if I suffer an accident that blinds me, I have just become less intelligent? Worse, by Wechsler’s definition, I don’t see how we can call Watson anything but “intelligent.”