There is yet another discussion of intelligence raging across the internet just now, sparked by Sam Harris’ interview of Charles Murray and a Vox article critical of that interview. (h/t to PZ) I have been critical of the uses of IQ testing for quite some time now, dating back to 8th grade or so. There is nothing per se wrong with intelligence testing. Nor is it inherently bad to make use of intelligence testing. As part of a job application where one is being asked to perform particular tasks in a particular environment, it’s entirely conceivable that a particular intelligence test or set of such tests might well predict success in that job. However, for many if not the vast majority of public policy purposes, IQ and other intelligence testing will function badly, misleadingly, or both. This is even more true if we make assumptions about how much of a particular test result is due to intraracial genetic factors (factors shared within one race, but not between people of different races).
Rather than making this even more lengthy, I’d like to focus on just one substantive criticism and save other discussions for other posts. One metaphor used by Harris in his published e-mail exchange with Ezra Klein of Vox is that of height:
Height is highly heritable, but you can surely stunt a person’s (or a whole population’s) growth through malnutrition. So, merely seeing a group of short people, one can’t be sure to what degree environment determined their height. And yet it remains a fact that if a person doesn’t have the genes to be 7 feet tall, he won’t be. It is also utterly uncontroversial to say that while there are many ways to prevent a person from reaching his full intellectual height, if he doesn’t have the genes to be the next Alan Turing, he won’t be that either.
People respond sympathetically to this argument by Harris, but it is unfortunately misleading. The argument that Murray’s conclusions are both wrong-headed and harmful comes not from the fact that Murray assumes that some portion of intelligence is genetic, but rather from his assumption that some portion of the observed difference in mean IQ from group to group is indicative of different racial population genetics.
Whether some person’s IQ test partly reflects a genetic component is a very different question than whether that person’s results can be combined with others of a particular racial grouping to demonstrate difference in racial population genetics related to intelligence.
Imagine, if you will, a hypothetical IQ test administrator who administers tests only in Ojibwe.
- Suppose this test administrator is asked to test the IQ of a monolingual English speaker.
- The administrator returns a result stating that the subject was unable to successfully follow instructions or answer a single question.
- Would it be reasonable to conclude that the subject deserves to be rated as low as the test’s accuracy allows (probably 20-55 for most IQ tests)?
- If the identical twin of the monolingual English speaker also failed to answer any questions, would a genetic assay of the twins allow us to say anything productive about the genetic contributions to intelligence?
- Since mean IQ is purposely defined as 100, and if (as Murray and Harris assert) 50% to 80% of intelligence is genetically heritable, then does that mean that these twins’ scores demonstrate a minimum 23 point genetic disadvantage on IQ tests for their race?
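The arithmetic behind that last question can be sketched in a few lines. This is only an illustration of the naive attribution the question is poking at, not a valid method: the function name `implied_genetic_gap` is made up for this sketch, and the inputs are just the figures already mentioned above (a floor score somewhere in the 20-55 range, a mean of 100, and a 50% to 80% heritable share).

```python
MEAN_IQ = 100  # IQ scales are normed so that the mean is 100


def implied_genetic_gap(score, heritable_share):
    """Naive attribution (hypothetical helper): blame a fixed share of the
    shortfall from the mean on genes. This is the faulty reasoning being
    questioned, not a real estimation method."""
    return (MEAN_IQ - score) * heritable_share


floor_scores = (20, 55)   # lowest reportable scores, per the range above
shares = (0.5, 0.8)       # the 50% to 80% heritability claim

gaps = {(s, h): implied_genetic_gap(s, h) for s in floor_scores for h in shares}
smallest = min(gaps.values())
print(smallest)  # the most generous floor with the lowest share: 45 * 0.5
```

The smallest figure, 22.5 points, is where the "minimum 23 point" number comes from, and it is produced entirely by the language of the test.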
In truth, genetically comparing the most successful subjects of our hypothetical administrator to the twins might genuinely appear to show that indigenous/First Nations peoples have the most intelligence-promoting genetic factors. In any group you can find average differences in the presence of some allele or other when compared to the genetics of some other group. Given the vast differences in average IQ we would expect to be found by this administrator, and how the ability to speak Ojibwe is passed down through families in a manner that highly correlates with how genes are passed through families, it is almost inevitable that some genes more common in certain indigenous/First Nations families would correlate with being scored more highly by this administrator.
But the lesson we should take from this is not that the hypothetical administrator has the power to reveal how much of intelligence (or even how much of an IQ score) is determined by racial population genetics. The lesson we should take is that it is possible to create a test that appears culture neutral, but that actually generates scores highly dependent on certain shared cultural traits. Language is only the most obvious of these (and because it is so obvious, testing of someone is not generally considered valid unless performed in the first and/or native language of a subject).
The worst part of this lesson for Murray’s position, however, is that my reductio ad absurdum doesn’t actually communicate the real problem here. With monolingual speakers and total non-responsivity to the administrator’s instructions, it appears that I’m talking about a binary trait of intelligence tests: the test is either invalidated entirely by cultural effects or it is entirely valid despite cultural effects. In truth, most cultural traits with the capacity to impact an IQ test’s results will affect the final score only marginally. Further, while bilingual adults above a certain age may test better on certain IQ tests, it’s not at all certain that we can expect the same effect for, say, US citizen children that speak Spanish, Tagalog or Cantonese at home but English in school. The lesser amount of practice in a single language may put such a child at a disadvantage in either of the child’s native languages until after reaching a certain age where language proficiency is no longer developing as rapidly.
To return to the Harris metaphor, while it is easy to say that a 7′ tall person must have a distinct genotype that enables growing to 7′, when comparing someone 5’6″ to someone 5’7″, how do we determine what the genetic contribution to the height difference is if one person had the genetic potential to grow to 5’9″ and the other 6’1″, but both suffered a degree of malnourishment during growth? Does it matter whether the 5’6″ tall individual had the genetic potential to grow to 5’9″ or if that shorter individual was the one with the genetic potential to grow to 6’1″? Say, for a moment, that the taller individual was actually more malnourished than the shorter, but the taller individual had the shorter maximum genetic potential for growth (i.e. 5’9″). This could easily be the case if the taller individual had a mutation that did not increase maximum height, but made height growth less sensitive to malnutrition. Now how do we characterize the genetic contribution to height in each of these persons?
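For anyone who likes to see the numbers, here is a toy version of that scenario. Everything in it is an assumption made up for illustration: the additive model (inches lost = sensitivity × malnutrition), the `Person` class, and the specific sensitivity and malnutrition values. Only the heights come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Person:
    potential_in: float   # genetic maximum height, in inches
    sensitivity: float    # inches lost per unit of malnutrition (assumed, partly genetic)
    malnutrition: float   # arbitrary units of deprivation during growth (assumed)

    def attained(self) -> float:
        # Toy additive model: real growth is certainly not this simple.
        return self.potential_in - self.sensitivity * self.malnutrition


# Potential 5'9" (69 in), more malnourished, but far less sensitive to it.
taller = Person(potential_in=69, sensitivity=0.5, malnutrition=4)
# Potential 6'1" (73 in), less malnourished, but much more sensitive to it.
shorter = Person(potential_in=73, sensitivity=3.5, malnutrition=2)

print(taller.attained(), shorter.attained())  # 67.0 (5'7") and 66.0 (5'6")
```

In this sketch both the potential and the sensitivity are "genetic", and they point in opposite directions, so the one-inch observed difference tells you nothing about whose genes are "for" height without also knowing the environment.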
To complete the metaphor: If these are racial averages instead of individuals, how do we characterize the racial genetic contribution to the racial average? The answer is that we may not be able to do so, and if we try then we may run up against methodological choices that create conflict between our own choices and the measurement choices of others. That hypothetical gene that makes one group more likely to grow to a higher percentage of maximum height for a given individual’s genetic potential? Is that actually a “height” gene? Does it become a gene for height if malnutrition is ubiquitous? What if malnutrition is only expected for 70% of children? 50%? 20%? 1%?
We can think about cultural differences again in a moment, but for right now let’s build on the height-but-for-malnutrition metaphor used by Harris. We know that Black citizens of the US are exposed to higher environmental lead levels in childhood. We know that such exposure is negatively correlated with IQ. Hypothetically, if Black US populations have much higher rates of a genetic variation that leads to higher sensitivity to lead poisoning, you would see Black identical twins with a higher correlation on IQ tests than fraternal twins, even when all other genetic factors for IQ happened to be the same, because only some fraternal twin pairs would share this vulnerability but all identical twin pairs would share it. Does this IQ-but-for-lead poisoning model tell us more about Black genetic factors for intelligence, or more about Black environmental factors for intelligence?
Back to culture. We know that for an IQ test to accurately measure what we hope we’re measuring when we seek to discover intelligence, we have to eliminate certain cultural factors. However, we can’t eliminate all of them. We don’t want to do so. If we’re trying to rate one’s ability to solve problems, it may be that a cultural emphasis on study and education positively impacts the ability to solve problems. For most purposes (say, hiring for a particular job) we won’t want to actually downgrade one of two people who score the same on the test because one comes from a cultural group that values education. However the person got good at solving the problem, we don’t care, right? So then we’re back to arguing about what cultural factors need to be eliminated as a source of bias and what cultural factors are accepted as simply a few of many valid paths to success on the exam.
Poverty, too, is a confound. We know that skipping a meal before an IQ test can lead to a lower result. We know that people living in poverty are much more likely to skip meals. Since poverty is racialized, effects like missing a meal on a test day are also racialized. Interestingly, racial differences in IQ increase from pre-school age to high school age. For young students some schools have subsidized or free breakfasts and lunches, but for older students free breakfasts are generally not available. In addition to increased test-day effects, there is also an impeded ability to learn over the course of an educational career when many breakfasts are missed, even if a student manages to have breakfast on a test day. How much, precisely, do these racialized effects contribute to the gap in racial mean IQs?
The hard work to answer questions about poverty effects and cultural effects (including defining which cultural effects are acceptable to measure and for what purposes) is still ongoing, and as such IQ tests cannot currently credibly assert that they’ve eliminated culturally distinct sources of unacceptable bias or the effects of poverty. Since cultural groups are very frequently intraracial groups, these cultural differences will appear in analysis to be sub-racial differences that can add together with the differences inherent to other racial subgroups to become an apparent racial difference. With poverty racialized, you add another source of environmental effects to the measurements.
When we combine known confounds such as racial differences in lead exposure with unknown but potentially very real confounds such as hypothetical genetic sensitivity to lead (and there are so many of these “unknown but potentially real” genetic factors that some will exist even while most, under scrutiny, will turn out not to exist) and add in a huge number of other cultural confounds and then top all that variability off with methodological and definitional disagreements about bias, we arrive at the frustrating answer that these sources of error sum to larger than the observed racial differences in IQ. Since most of the factors of which we’re aware appear to show undermeasurement of disadvantaged people more than overmeasurement of the disadvantaged or undermeasurement of the advantaged, any apparent racial gap is likely to be in large part eliminated by measures that properly assess underlying intelligence. Combined with the fact that some genetic factors may be of the type we discussed with our hypothetical gene for extra sensitivity to lead poisoning, any remaining difference in observed racial mean IQ cannot be positively asserted to be a difference in genes for intelligence (unless resistance to lead poisoning is truly to be categorized as a “gene for intelligence”).
Finally, imagine that we get truly “environment neutral” IQ results as a combination of better tests, better test administration, uniformly excellent education, and the elimination of the effects of environmental racism that lead neighborhoods of color to be more toxic, and of racism and sexism generally, so that we don’t see racism- or sexism-determined differences in children’s heroes and intended adult careers (which can have the effect of focusing a child on things other than the types of education that positively affect IQ tests) or racism- or sexism-determined stereotype threat effects. Everything we know tells us that those effects are large enough to fully close the gap between racial and gender averages, but that they will be unlikely to precisely close the gap. There will likely always be some small difference between racial and gender groups.
At that point, we might be able to say that there is a genetically controlled difference in IQ test results, but if the difference is a single point on a scale where that equals 1/15th of a standard deviation, what is the significance of that result? As a mean, the difference could easily be 1/30th or 1/150th of a standard deviation. Yes, Harris and Murray are correct to say that there will almost certainly be some difference, but will it be meaningful? What will that meaning be?
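To put a rough number on how small such gaps would be, here is a minimal sketch. It assumes (and these are simplifying assumptions, not findings) that scores in both groups are normally distributed with the same standard deviation of 15, and it asks how often a randomly chosen person from the higher-mean group would outscore a randomly chosen person from the other group. The function names are made up for this illustration.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_higher(gap_points, sd=15.0):
    """Chance that a random person from the higher-mean group outscores a
    random person from the other group. Assumes two normal distributions
    with the same SD; the difference of two draws then has SD sd * sqrt(2)."""
    return normal_cdf(gap_points / (sd * math.sqrt(2.0)))


# Gaps of 1/15th, 1/30th and 1/150th of a standard deviation.
for gap in (1.0, 0.5, 0.1):
    print(f"{gap} point gap: {prob_higher(gap):.4f}")
```

Under those assumptions a one-point gap moves that chance from an even 50% to a bit under 52%, and the smaller gaps are closer still to a coin flip.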
Murray especially, but apparently also Harris, want to say that since we know that we must eventually find a difference in mean racial IQ, we can go ahead and begin to ask questions about the meaning even before the exact quantity of difference is known. Worse, Murray and, apparently, Harris want to talk about the meaning of the genetic contribution to the differences in mean racial IQ. But at this point, and let me stress this,
given the fact that possible sources of error, methodological differences and deficiencies, and unacceptable cultural bias sum to more than the difference between mean racial IQ for whites and Blacks in the US, we don’t even know which racial group will prove to have the best genes for intelligence.
Going back to the hypothetical twins example above, Murray is asserting that since our current testing shows a multi-point deficit between Black mean IQ and white mean IQ in the US, then there is a multi-point genetic disadvantage for Black folk on IQ tests equal to 50% to 80% of the total disadvantage. But we cannot actually know this. We do not actually know this.
Right now we desperately need to address poverty and environmental racism, including but not limited to eliminating malnutrition and exposures to toxic levels of environmental lead. We desperately need to improve our schools generally and to create a uniform minimum standard that disproportionately raises outcomes in our least performing schools. In 20 years I’d like to see our worst public schools deliver results that would place them a standard deviation above the mean today. During those 20 years, intelligence researchers should be encouraged to investigate many cultural effects on IQ tests and do the hard work of determining whether those effects constitute confounds in measuring intelligence or whether they constitute positive contributions to the core of what an intelligence test actually should measure.
At that point, we can revisit what we know to see if we’ve actually reached a point where we have a measurement of genetic contribution to differences in racial mean IQs. Until then, we’re ill served by giving attention to Murray and those who support him in prematurely discussing the meaning of a genetic contribution we’re not yet able to measure.