main = print(“Hello World”)

I’ve been wanting to blog here for years, but I always wound up being crushed by schoolwork or distracted by personal life. Eventually I got sick of perpetually putting it off, and forced myself to apply. I’d figure out a way to make it work.

And, as you can see, I’m now blogging here!

And up to my eyeballs in schoolwork.

And with more demands on my free time than ever before.

But! I have a plan.

See, the nice thing about being a slightly-paranoid Computer Scientist is that you tend to keep a low profile. My previous blogging isn’t well known, and the rest of my back catalog ranges from “seen by five people” to “never been shared publicly.” I can easily pad this space with old material until I can come up for air. This is especially perfect, because while my contemporary writing is all about the replication crisis and angrily shouting at fools, my older work was more about atheist apologetics. I have a decently-sized book that I gave up on writing, all about the subject, and it led me to a set of arguments that I haven’t seen anyone else develop. That is book-worthy, but there’s no harm in workshopping it until I can properly put fingers to keyboard.

In the meantime, I should also get cracking at a comment policy. Years of lurking in comment threads have left me with… opinions on the matter. That’s for a future post, though.

I suppose some of you are wondering about the name. Funny, despite the whole “wanting to blog” thing I’ve never been able to decide on a proper blog name. I’ve held on to a catchy subtitle for years (“/dev/random, unless I make a hash of it”), but a title? No clue, no idea, nothing ever came to mind. Forced to come up with one at long last, I did what came naturally.

> while :; do echo `egrep 'te$' /usr/share/dict/words | perl -e 'rand($.)<1 and ($line=$_)while<>;print$line'` \
     `perl -e 'rand($.)<1 and ($line=$_)while<>;print$line' /usr/share/dict/words` ; done | less

xanthosiderite koa
Brooklynite lull
adeste reclamatory
bipunctate abevacuation
disrelate seewee
Epirote Cobden
hemisaprophyte parcel-guilty
camote danda
catastate Westphalian
ingurgitate ephelis
sommite soilures
inseminate rabies
pianoforte stabbed
preconstitute tanistry
Bonaparte intermodification
decapitate philohellenian
Marette Sharona
swinecote prefictional
miaskite Egbert
subprofessorate eosphorite
protectorate soogan
portmanmote morosities
indicolite saiyids
Marguerite hoidening
repromulgate pandemoniacal
barytocelestite alloxy
umbraculate Post-devonian
desecate white-rumped
landgate twice-canvassed
killinite pyrogallate
cycadophyte Englishable
lautarite buffoons
bipunctate tar
merocerite pencels
echelette Borak
odorate overcultivated
Parbate Perrins
amphodelite lethalize
hesperidate Lemosi
zonociliate implosively
Jacquette reimbushment
tricussate Reisinger
alunite high-hatty
archeocyte unimpatiently
montroydite roband
orcanette panstereorama
julienite unorchestrated
fulminurate pro-Sweden
Bathinette Piraeus
cassate unfeigning
lowigite dolos
lyddite intersomnial
delate hepatised
alienigenate perscribe
emporte zoroastra
hemimorphite off-put
hypoantimonate ambrosia
nonconfederate hotfoot
exonerate nonfuturition
reprobate spreadsheet

The algorithm hath spoken!
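For anyone who doesn't read Perl golf: the `rand($.)<1 and ($line=$_)` idiom keeps line number n with probability 1/n, which works out to a uniform pick over the whole file without ever knowing its length. Here is a rough Python sketch of the same idea. The function names and the tiny word list are my own stand-ins for illustration, not part of the original one-liner.

```python
import random

def pick_random_line(lines, rng=random):
    """Return one item chosen uniformly at random from an iterable of unknown length."""
    chosen = None
    for count, line in enumerate(lines, start=1):
        # Keep the current line with probability 1/count; the first line is always kept.
        if rng.random() * count < 1:
            chosen = line
    return chosen

def random_blog_name(words, rng=random):
    """Pair a random word ending in 'te' with a random word of any kind."""
    first = pick_random_line((w for w in words if w.endswith("te")), rng)
    second = pick_random_line(words, rng)
    return f"{first} {second}"

if __name__ == "__main__":
    # Hypothetical stand-in for /usr/share/dict/words.
    sample_words = ["reprobate", "spreadsheet", "exonerate", "hotfoot", "ambrosia"]
    print(random_blog_name(sample_words))
```

Only one line is held in memory at a time, which is why the trick suits a pipe.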

Steven Pinker, Crank

At least he doesn’t start out that way.

The Second Law of Thermodynamics states that in an isolated system (one that is not taking in energy), entropy never decreases. … Closed systems inexorably become less structured, less organized, less able to accomplish interesting and useful outcomes, until they slide into an equilibrium of gray, tepid, homogeneous monotony and stay there.

For a non-physicist, it’s a decent formulation. It needs more of a description of entropy, though. In computer science, we think of it as how much information is or could be packed into a space. If I have a typical six-sided die, I can send you a message by giving it to you in a specific configuration. If I just ask you to look at a specific side, there are only six unique states to send a message with; if I also ask you to look at the orientation of the other sides, I can bump that up to twenty-four. I can’t send any more information unless I increase the number of states, or get to send multiple dice or the same die multiple times. Compression is just transforming a low-entropy encoding into a high-entropy one, saving some time or space.
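To put rough numbers on the die example, here is a small sketch. It assumes every configuration is equally likely, so the information carried is just the base-2 logarithm of the number of states. The function name is mine, made up for illustration.

```python
import math

def bits_for_states(n_states: int) -> float:
    """Maximum information, in bits, carried by one of n equally likely states."""
    return math.log2(n_states)

top_face_only = bits_for_states(6)            # which of the six faces is up
face_and_rotation = bits_for_states(6 * 4)    # face up, times four rotations of the sides
two_dice = bits_for_states(24 ** 2)           # sending two dice doubles the bits

if __name__ == "__main__":
    print(f"top face only:       {top_face_only:.3f} bits")
    print(f"face plus rotation:  {face_and_rotation:.3f} bits")
    print(f"two oriented dice:   {two_dice:.3f} bits")
```

Going from six states to twenty-four buys exactly two extra bits, and a second die doubles the total. That is the "increase the number of states, or send more dice" point in arithmetic form.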

The physics version is closely related: how many ways can I shuffle the microscopic details of a system while preserving the macroscopic ones? If you’re looking at something small like a computer circuit, the answer is “not many.” The finely-ordered detail can’t be tweaked very much, and still result in a functional circuit. In contrast, the air above the circuit can be mixed up quite a bit and yet still look and act the same. Should a microscopic fluctuation happen, it’ll be far more harmful to the circuit than the air, so when they do inevitably happen the result is a gradual breaking up of the circuit. Its molecules will be slowly stripped off and brought into equilibrium with the air surrounding it, which also changes but less so.

Still with me? Good, because Pinker starts to drift off…

The Second Law of Thermodynamics is acknowledged in everyday life, in sayings such as “Ashes to ashes,” “Things fall apart,” “Rust never sleeps,” “Shit happens,” “You can’t unscramble an egg,” “What can go wrong will go wrong,” and (from the Texas lawmaker Sam Rayburn), “Any jackass can kick down a barn, but it takes a carpenter to build one.”

That’s not really the Second Law, though. Pinker himself acknowledges that it only applies to closed systems, but anyone who’s looked up at the Sun can attest that the Earth isn’t one. This comes up all the time in Creationist circles:

There is a mathematical correlation between entropy increase and an increase in disorder. The overall entropy of an isolated system can never decrease. However, the entropy of some parts of the system can spontaneously decrease at the expense of an even greater increase of other parts of the system. When heat flows spontaneously from a hot part of a system to a colder part of the system, the entropy of the hot area spontaneously decreases!

It’s bad enough that Pinker invokes a creationist-level understanding of physics, but he actually manages to make them look intelligent with:

To start with, the Second Law implies that misfortune may be no one’s fault. … Not only does the universe not care about our desires, but in the natural course of events it will appear to thwart them, because there are so many more ways for things to go wrong than to go right. Houses burn down, ships sink, battles are lost for the want of a horseshoe nail.

There is no “wrong” ordering of molecules in the air or a computer chip, only orderings that aren’t what human beings want. “Misfortune” is a human construct superimposed on the universe, to model the goal we strive for. It has no place in a physics classroom, and is completely unrelated to thermodynamics.

Poverty, too, needs no explanation. In a world governed by entropy and evolution, it is the default state of humankind. Matter does not just arrange itself into shelter or clothing, and living things do everything they can not to become our food. What needs to be explained is wealth. Yet most discussions of poverty consist of arguments about whom to blame for it.

Poverty is the inability to fulfill our basic needs. Is Pinker saying that, by default, human beings are incapable of meeting their basic needs, like food and shelter? Then he is effectively arguing we should have gone extinct and been replaced by a species which has no problems meeting its basic needs, like spiders or bacteria or ants. This of course ignores that economies are not closed systems, as the Sun helpfully dumps energy on us. Innovation increases efficiency and therefore entropy, which means that people who can’t gather their needs efficiently given what they have are living in a low entropic state.

But I thought entropy only increased over time, according to the Second Law? By Pinker’s own logic, poverty should not be the default but the past, a state that we evolved out of!

More generally, an underappreciation of the Second Law lures people into seeing every unsolved social problem as a sign that their country is being driven off a cliff.

Ooooh, I get it. This essay is just an excuse for Pinker to whine about progressives who want to improve other people’s lives. He thought he could hide his complaints behind science, to make them look more digestible to himself and others, but in reality just demonstrated he understands physics worse than most creationists.

What a crank. And sadly, that seems to be the norm in Evolutionary Psychology.

No, that is not a Sokal hoax; that is a legitimate paper published by two leading Evolutionary Psychologists! There must be something about the field that breeds smug ignorance…

Replication Isn’t Enough

I bang on about statistical power because low power indirectly raises the odds of a false positive. In brief, it forces you to do more tests to reach a statistical conclusion, stuffing the file drawer and thus making published results appear more certain than they are. In detail, see John Borghi or Ioannidis (2005). In comic, see Maki Naro.

The concept of statistical power has been known since 1928, the wasteful consequences of low power since 1962, and yet there’s no sign that scientists are upping their power levels. This is a representative result:

Our results indicate that the average statistical power of studies in the field of neuroscience is probably no more than between ~8% and ~31%, on the basis of evidence from diverse subfields within neuroscience. If the low average power we observed across these studies is typical of the neuroscience literature as a whole, this has profound implications for the field. A major implication is that the likelihood that any nominally significant finding actually reflects a true effect is small.

Button, Katherine S., et al. “Power failure: why small sample size undermines the reliability of neuroscience.” Nature Reviews Neuroscience 14.5 (2013): 365-376.
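One standard way to see why low power makes a "significant" finding less trustworthy is the positive predictive value: the share of significant results that reflect a true effect. The sketch below uses the textbook formula with no bias term. The 10% prior and the function name are assumptions of mine, picked only for illustration, and are not figures from Button et al.

```python
def positive_predictive_value(power: float, alpha: float, prior: float) -> float:
    """Fraction of significant results that are true effects.

    power: chance of detecting an effect that is really there
    alpha: false-positive rate when there is no effect
    prior: assumed fraction of tested hypotheses that are actually true
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)

if __name__ == "__main__":
    # Assumed for illustration: one tested hypothesis in ten is true, alpha = 0.05.
    for power in (0.80, 0.31, 0.08):
        ppv = positive_predictive_value(power, alpha=0.05, prior=0.10)
        print(f"power {power:.2f} -> PPV {ppv:.2f}")
```

Holding everything else fixed, dropping power from 80% into the range quoted above pushes the PPV well under one half. The exact figures depend entirely on the assumed prior.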

The most obvious consequence of low power is a failure to replicate. If you rarely try to replicate studies, you’ll be blissfully unaware of the problem; once you take replications seriously, though, you’ll suddenly find yourself in a “replication crisis.”

You’d think this would result in calls for increased statistical power, with the occasional call for a switch in methodology to a system that automatically incorporates power. But it’s also led to calls for more replications.

As a condition of receiving their PhD from any accredited institution, graduate students in psychology should be required to conduct, write up, and submit for publication a high-quality replication attempt of at least one key finding from the literature, focusing on the area of their doctoral research.
Everett, Jim AC, and Brian D. Earp. “A tragedy of the (academic) commons: interpreting the replication crisis in psychology as a social dilemma for early-career researchers.” Frontiers in psychology 6 (2015).

Much has been made of preregistration, publication of null results, and Bayesian statistics as important changes to how we do business. But my view is that there is relatively little value in appending these modifications to a scientific practice that is still about one-off findings; and applying them mechanistically to a more careful, cumulative practice is likely to be more of a hindrance than a help. So what do we do? …

Cumulative study sets with internal replication.

If I had to advocate for a single change to practice, this would be it.

There’s an intuitive logic to this: currently fewer than one in a hundred papers are replications of prior work, so there’s plenty of room for expansion; many key figures like Ronald Fisher and Jerzy Neyman have emphasized the necessity of replications; it doesn’t require any modification of technique; and the “replication crisis” is primarily about replications. It sounds like an easy, feel-good solution to the problem.

But then I read this paper:

Smaldino, Paul E., and Richard McElreath. “The Natural Selection of Bad Science.” arXiv preprint arXiv:1605.09511 (2016).

It starts off with a meta-analysis of meta-analyses of power, and comes to the same conclusion as above.

We collected all papers that contained reviews of statistical power from published papers in the social, behavioural and biological sciences, and found 19 studies from 16 papers published between 1992 and 2014. … We focus on the statistical power to detect small effects of the order d=0.2, the kind most commonly found in social science research. …. Statistical power is quite low, with a mean of only 0.24, meaning that tests will fail to detect small effects when present three times out of four. More importantly, statistical power shows no sign of increase over six decades …. The data are far from a complete picture of any given field or of the social and behavioural sciences more generally, but they help explain why false discoveries appear to be common. Indeed, our methods may overestimate statistical power because we draw only on published results, which were by necessity sufficiently powered to pass through peer review, usually by detecting a non-null effect.
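To get a feel for what it takes to detect an effect of d=0.2, here is a back-of-the-envelope power calculation. It assumes a two-sided, two-sample comparison with equal group sizes and uses the normal approximation, ignoring the negligible far tail, so treat the numbers as ballpark. The function name and the sample sizes are mine, not Smaldino and McElreath's.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample test (normal approximation)."""
    standard_normal = NormalDist()
    critical_z = standard_normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected z-score of the observed difference when the true effect is effect_size.
    expected_z = effect_size * sqrt(n_per_group / 2.0)
    return standard_normal.cdf(expected_z - critical_z)

if __name__ == "__main__":
    for n in (20, 50, 100, 393):
        print(f"n = {n:3d} per group -> power ~ {approx_power(0.2, n):.2f}")
```

Under these assumptions, a few dozen participants per group leaves you far short of the conventional 80% target for a small effect; reaching it takes a few hundred per group.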

Rather than leave it at that, though, the researchers decided to simulate the pursuit of science. They set up various “labs” that exerted different levels of effort to maintain methodological rigor, killed off labs that didn’t publish much and replaced them with mutations of labs that published more, and set the simulation spinning.

We ran simulations in which power was held constant but in which effort could evolve (μw=0, μe=0.01). Here selection favoured labs who put in less effort towards ensuring quality work, which increased publication rates at the cost of more false discoveries … . When the focus is on the production of novel results and negative findings are difficult to publish, institutional incentives for publication quantity select for the continued degradation of scientific practices.

That’s not surprising. But then they started tinkering with replication rates. To begin with, replications were done 1% of the time, were guaranteed to be published, and having one of your results fail to replicate would exact a terrible toll.

We found that the mean rate of replication evolved slowly but steadily to around 0.08. Replication was weakly selected for, because although publication of a replication was worth only half as much as publication of a novel result, it was also guaranteed to be published. On the other hand, allowing replication to evolve could not stave off the evolution of low effort, because low effort increased the false-positive rate to such high levels that novel hypotheses became more likely than not to yield positive results … . As such, increasing one’s replication rate became less lucrative than reducing effort and pursuing novel hypotheses.

So it was time for extreme measures: force the replication rate to high levels, to the point that 50% of all studies were replications. All that happened was that it took longer for the overall methodological effort to drop and false positives to bloom.

Replication is not sufficient to curb the natural selection of bad science because the top performing labs will always be those who are able to cut corners. Replication allows those labs with poor methods to be penalized, but unless all published studies are replicated several times (an ideal but implausible scenario), some labs will avoid being caught. In a system such as modern science, with finite career opportunities and high network connectivity, the marginal return for being in the top tier of publications may be orders of magnitude higher than an otherwise respectable publication record.

Replication isn’t enough. The field of science needs to incorporate more radical reforms that encourage high methodological rigor and greater power.

Steven Pinker and his Portable Goalposts

PZ Myers seems to have pissed off quite a few people, this time for taking Steven Pinker to task. His take is worth reading in full, but I’d like to add another angle. In the original interview, there’s a very telling passage:

Belluz: But as you mentioned, there’s been an uptick in war deaths driven by the staggeringly violent ongoing conflict in Syria. Does that not affect your thesis?

Pinker: No, it doesn’t affect the thesis because the rate of death in war is about 1.4 per 100,000 per year. That’s higher than it was at the low point in 2010. But it’s still a fraction of what it was in earlier years.

See the problem here? Pinker’s hypothesis is that over the span of centuries, violence will decrease. The recent spike in deaths may be the start of a reversal that proves Pinker wrong. But because his hypothesis covers such a wide timespan, we’re going to need fifty or more years’ worth of data to challenge it.

Veritasium on the Reproducibility Crisis

It’s a great summary, going into much more depth than most. I really like how Muller brought out a concrete example of publication bias, and found an example of p-hacking in a branch of science that’s usually resistant to it, physics.

But I’m not completely happy with it. Some of this comes from being a Bayesian fanboi who didn’t hear the topic mentioned, but Muller also uses a weird turn of phrase at the end. Muller argues that, as bad as the flaws in science may be, think of how much worse they are in all our other systems of learning about the world.

Slight problem: there are no other systems. Even “I feel it’s true” is based on an evidential claim, evaluated for plausibility against other competing hypotheses. The weighting procedure may be hopelessly skewed, but so too are p-values and the publication process.

Muller could have strengthened his point by bringing up an example, yet did not. We’re left taking his word that science isn’t the sole methodology we have for exploring the world, and that those alternate methodologies aren’t as rigorous. Meanwhile, he explicitly points out that a small fraction of “landmark cancer trials” could be replicated; this implies that cancer treatments, and by extension the well-being of millions of cancer patients, are being harmed by poor methodology in science. Even if you disagree with my assertion that all epistemologies are scientific in some fashion, it’s tough to find a counter-example that affects 40% of us and will kill a quarter.

My hope doesn’t come from a blind assurance that other methodologies are worse than science; it comes from the news that scientists have recognized the flaws in their trade, and are working to correct them. To be fair to Muller, he’d probably agree.

Ignorance and Social Justice

“Why are you a feminist?”

Because it lets me sleep at night. Think about it: let’s say it’s true that over half the human population is burdened with a systematic disadvantage compared to the rest. Having learned of that, can you honestly shrug your shoulders and ignore the problem? I certainly can’t, so I’ll do what little I can to correct this injustice.

You may not agree, which is fine. But the corollary of this view is that you cannot be opposed to feminism without also misunderstanding it. This sets up a prediction we can test: people who oppose feminism and other forms of social justice must be ignorant of it, must invoke straw-people, and must be resistant to learning or understanding it, if my stance has some truth to it.

The evidence suggests it has more than a little.

For instance, after offering to debate Martin Hughes, TJ Kirk cowardly backed out. Stephanie Zvan has an excellent blog post up pointing out that this is a common theme: people opposed to social justice aren’t keen on actually debating the subject.

That’s the real function of “You don’t want to debate” in this context. It isn’t to get you to debate. It’s there to say there’s something wrong with you. That’s why the offer disappears once you drag the argument into the reality of terms and conditions and making sure no one profits from the debate. It wasn’t real to begin with.

To do well in a debate, you really have to know the other side in depth. If you do that homework, though, you might learn the other side’s arguments are correct. So if you are hoping to sleep well at night, you don’t debate. I popped into the comment section to point out an exception to this:

Some of the hardcore haters would disagree, and say they’re perfectly fine with a debate. They have a very peculiar definition of “debate” in mind, though, where both sides shout slogans into the night without critically appraising their merits. It’s an extension of what I’ve called the “treadmill of lies:” By endlessly cycling from myth to lie, they avoid having to consider any one in detail and thus can convince themselves they’re just a bunch of skeptical satirists.

When this actually happens during a debate, we call it a “Gish Gallop.” This technique is a big problem with traditional, in-person debates, and I don’t think it’s a coincidence that TJ Kirk was pushing for this format instead of a more leisurely exchange of blog posts. He knew he had nothing but slogans against Hughes’ arguments, and he knew those wouldn’t convince anyone but those already convinced. Unless there was some sort of reward involved, like cash or a raised profile, there was no point in “debating” Hughes.

I go into a little more detail on the treadmill here. But as luck would have it, this data point was followed by yet another. Possibly in response to the controversy kicked off by Kirk, a number of atheist YouTubers joined with him to fire back a challenge: “QUESTIONS WHITE MEN HAVE FOR SJWs!”

Others in the atheo/skeptic community have been responding back, in between bouts of muffled laughter and obvious eyerolls. I’ll add my two cents at some point, but for now I’d like to point out a common theme in the questions.

3. Do you want women to be equal or do you want women to be a protected class? You can’t have both.

Protected class: “A group of people with a common characteristic who are legally protected from employment discrimination on the basis of that characteristic. Protected classes are created by both federal and state law. … Federal protected classes include: Race. Color. Religion or creed. National origin or ancestry. Sex.”

4. What are you afraid will happen when you leave your “safe space”?

A Safe Space is a place where anyone can relax and be able to fully express, without fear of being made to feel uncomfortable, unwelcome, or unsafe on account of biological sex, race/ethnicity, sexual orientation, gender identity or expression, cultural background, religious affiliation, age, or physical or mental ability.

5. How can you possibly justify the idea that it’s somehow racist to disagree with black lives matter?

When we say Black Lives Matter, we are broadening the conversation around state violence to include all of the ways in which Black people are intentionally left powerless at the hands of the state. We are talking about the ways in which Black lives are deprived of our basic human rights and dignity.

6. Are you aware the present is not the past? Are you familiar with the concept of linear time? Because you seem incredibly comfortable traveling back through time by talking about how bad things were for women, or black people, or whomever. And then by using some form of SJW magic, you then claim or imply that those problems in the past exist today. Are you aware that this trick that you’re doing is not working? Why do you think that would work?

Results: In the United States, an estimated 19.3% of women and 1.7% of men have been raped during their lifetimes; an estimated 1.6% of women reported that they were raped in the 12 months preceding the survey. The case count for men reporting rape in the preceding 12 months was too small to produce a statistically reliable prevalence estimate. An estimated 43.9% of women and 23.4% of men experienced other forms of sexual violence during their lifetimes, including being made to penetrate, sexual coercion, unwanted sexual contact, and noncontact unwanted sexual experiences. The percentages of women and men who experienced these other forms of sexual violence victimization in the 12 months preceding the survey were an estimated 5.5% and 5.1%, respectively.

8. Did you know there are 13% more women in college right now than men? So if the whole goal of feminism is “equality,” shouldn’t we have some men-only scholarships in order to equal everything out?

The strength of this unconscious bias is quite astonishing – even for a relatively objective measure such as promptness, students rated a “female” professor 3.55 out of 5 and a “male” professor 4.35, despite the fact that they handed work back at the same time.

The implications are serious. In the competitive world of academia, student evaluations are often used as a tool in the process of hiring and promotion. That the evaluations may be biased against female professors is particularly problematic in light of existing gender imbalance, particularly at the highest echelons of academia. According to the American Association of University Professors, in 2012, 62% of men in academia in the US were tenured compared to only 44% of women, while women were far more likely to be in non-tenure track positions than men (32% of women in academia compared to just 19% of men).

When there are answers to the questions those YouTubers fired off, it only takes a few minutes of Googling to get an answer. Want scientific studies? They’ve been done by the hundreds, on nearly all the topics pushed by “social justice warriors.” Decades of research have been done, untold thousands of words have been spilled, and yet these people opposed to social justice are completely ignorant of it all. Had they put in the time to educate themselves, like some others have, they’d become social justice warriors too.

But as Zvan would have predicted, some of those questions aren’t actually questions.

7. Why do you think that you can spend your entire life in a state of perpetual emotional immaturity? Do you actually imagine that you’ll be able to stretch out your adolescence for your entire existence?

10. What do you hope to gain by bringing back racial segregation?

12. Why do you think every cis white male is born racist?

14. Would you rather be right, or popular? It seems like your primary objective is to score social points and get public validation.

These questions were never meant to be answered; they’re just empty talking points that form the treadmill’s belt. They’re meant to protect you from educating yourself, from breaking the wall of ignorance.

Because you might not sleep well, once you find out what’s on the other side.

A Computer Scientist Reads EvoPsych, Part 4

[Part 3]

The programs comprising the human mind were designed by natural selection to solve the adaptive problems regularly faced by our hunter-gatherer ancestors—problems such as finding a mate, cooperating with others, hunting, gathering, protecting children, navigating, avoiding predators, avoiding exploitation, and so on. Knowing this allows evolutionary psychologists to approach the study of the mind like an engineer. You start by carefully specifying an adaptive information processing problem; then you do a task analysis of that problem. A task analysis consists of identifying what properties a program would have to have to solve that problem well. This approach allows you to generate hypotheses about the structure of the programs that comprise the mind, which can then be tested.[1]

Let’s try this approach. My task will be to calculate the inverse square root of a number, a common one in computer graphics. The “inverse” part implies I’ll have to do a division at some point, and the “square root” implies either raising something to a power, finding the logarithm of the input, or invoking some sort of function that’ll return the square root. So I should expect a program which contains an inverse square root function to have something like:

#include <math.h>

float InverseSquareRoot( float x ) {
     return 1.0 / sqrt(x);
}

So you could imagine my shock if I peered into a program and found this instead:

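float FastInvSqrt( float x ) {
    long i;                              /* assumed to be 32 bits wide, as on the original targets */
    float x2, y;
    x2 = x * 0.5;

    i = * ( long * ) &x;                 /* reinterpret the float's bits as an integer */
    i = 0x5f3759df - ( i >> 1 );         /* initial guess from the magic constant */
    y = * ( float * ) &i;                /* reinterpret the bits as a float again */

    y = y * ( 1.5 - ( x2 * y * y ) );    /* one step of Newton's Method */

    return y;
}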

Something like that snippet was in Quake III’s software renderer. It uses one step of Newton’s Method to find the zero of an equation derived from the input value, seeded by a guess that takes advantage of the structure of floating point numbers. It also breaks every one of the predictions my analysis made, not even including a division.
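If you'd like to poke at the trick without a C compiler, here is a rough Python sketch of the same steps. It uses the standard `struct` module to reinterpret the bits of a 32-bit float; the function name is mine. Python does the Newton step in double precision, so the results will differ very slightly from the C version, and it only makes sense for positive, normal inputs.

```python
import math
import struct

def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x using the bit-level trick."""
    half_x = x * 0.5
    # Reinterpret the 32-bit float's bits as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Magic constant minus half the bits gives a surprisingly good first guess.
    bits = (0x5F3759DF - (bits >> 1)) & 0xFFFFFFFF
    guess = struct.unpack("<f", struct.pack("<I", bits))[0]
    # One step of Newton's Method refines the guess; note there is no division.
    return guess * (1.5 - half_x * guess * guess)

if __name__ == "__main__":
    for value in (0.25, 1.0, 2.0, 10.0, 100.0):
        exact = 1.0 / math.sqrt(value)
        approx = fast_inv_sqrt(value)
        print(f"x = {value:6.2f}  exact = {exact:.6f}  approx = {approx:.6f}")
```

The approximation lands within a fraction of a percent of the straightforward answer, which is why the two are so easy to mistake for one another from the outside.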

The task analysis failed for a simple reason: nearly every problem has more than one approach to it. If we’re not aware of every alternative, our analysis can’t take all of them into account and we’ll probably be led astray. We’d expect convolutions to be slow for large kernels unless we were aware of the Fourier transform, we’d think it was impossible to keep concurrent operations from mucking up memory unless we knew we had hardware-level atomic operations, and if we thought of sorting purely in terms of comparing one value to another we’d miss out on the fastest sorting algorithm out there, Radix sort.

Radix sort doesn’t get implemented very often because it either requires a tonne of memory, or the overhead of doing a census makes it useless on small lists. To put that more generally, the context of execution matters more than the requirements of the task during implementation. The simplistic approach of Tooby and Cosmides does not take that into account.
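For the curious, here is a minimal sketch of a least-significant-digit Radix sort. It is only one of several variants, it is restricted to non-negative integers, and the function name and the 8-bit digit size are my own choices. Notice that no two values are ever compared with each other; each pass does a census of one digit, and the bucket table is where the memory overhead comes from.

```python
def radix_sort(values, bits_per_pass=8):
    """Sort non-negative integers without comparing them to one another."""
    result = list(values)
    if not result:
        return result
    if min(result) < 0:
        raise ValueError("this sketch only handles non-negative integers")

    radix = 1 << bits_per_pass
    mask = radix - 1
    largest = max(result)
    shift = 0
    while (largest >> shift) > 0:
        # Census: count how many values have each digit at this position.
        counts = [0] * radix
        for value in result:
            counts[(value >> shift) & mask] += 1
        # Turn the counts into the starting index of each digit's bucket.
        total = 0
        for digit in range(radix):
            counts[digit], total = total, total + counts[digit]
        # Place every value into its bucket, preserving the existing order.
        placed = [0] * len(result)
        for value in result:
            digit = (value >> shift) & mask
            placed[counts[digit]] = value
            counts[digit] += 1
        result = placed
        shift += bits_per_pass
    return result

if __name__ == "__main__":
    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

With 8-bit digits, every pass allocates and walks a 256-entry table plus a full copy of the list, whether you are sorting eight numbers or eight million. On a short list that fixed cost swamps any savings.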

We can throw them a lifeline, mind you. I formed a hypothesis about computing inverse square roots, refuted it, and now I’m wiser for it. Isn’t that still a net win for the process? Notice a key difference, though: we only became wiser because we could look at the source code. If FastInvSqrt() was instead a black box, the only way I could refute my analysis would be to propose the exact way the algorithm worked and then demonstrate it consistently predicted the outputs much better. If I didn’t know the techniques used in FastInvSqrt() were possible, I’d never be able to refute it.

On the contrary, I might falsely conclude I was right. After all, the outputs of my analysis and FastInvSqrt() are very similar, so I could easily wave away the differences as due to a buggy square root function or a flaw in the division routine. This is especially dangerous with evolutionary algorithms, as Dr. Adrian Thompson figured out in an earlier installment, because the odds of us knowing every possible trick are slim.

In sum, this analysis method is primed to generate smug over-confidence in your theories.

Each organ in the body evolved to serve a function: The intestines digest, the heart pumps blood, and the liver detoxifies poisons. The brain’s evolved function is to extract information from the environment and use that information to generate behavior and regulate physiology. Hence, the brain is not just like a computer. It is a computer—that is, a physical system that was designed to process information. Its programs were designed not by an engineer, but by natural selection, a causal process that retains and discards design features based on how well they solved adaptive problems in past environments.[1]

And is my appendix’s function to randomly attempt to kill me? The only people I’ve seen push this biological teleology are creationists who propose an intelligent designer. Few people well studied in biology would buy this line.

But getting back to my field, notice the odd dichotomy at play here: our brains are super-sophisticated computational devices, but not sophisticated enough to re-program themselves on-the-fly. Yet even the most primitive computers we’ve developed can modify the code they’re running, as they’re running it. Why isn’t that an option? Why can’t we be as much of a blank slate as forty-year-old computer chips?

It’s tempting to declare that we’re more primitive than they are, computationally, but there’s a fundamental problem here: algorithms are algorithms are algorithms. If you can compute, you’re a Turing machine of some sort. There is no such thing as a “primitive” computer; at best you could argue some computers have more limitations imposed on them than others.

Human beings can compute, as anyone who’s taken a math course can attest. Ergo, we must be something like a Turing machine. Is it possible that our computation is split up into programs, which themselves change only slowly? Sure, but that’s an extra limitation imposed on our computability. It should not be assumed a priori.

[Part 5]

[1] Tooby, John, and Leda Cosmides. “Conceptual Foundations of Evolutionary Psychology.” The Handbook of Evolutionary Psychology (2005): 5-67.

A Computer Scientist Reads EvoPsych, Part 3

[Part 2]

As a result of selection acting on information-behavior relationships, the human brain is predicted to be densely packed with programs that cause intricate relationships between information and behavior, including functionally specialized learning systems, domain-specialized rules of inference, default preferences that are adjusted by experience, complex decision rules, concepts that organize our experiences and databases of knowledge, and vast databases of acquired information stored in specialized memory systems—remembered episodes from our lives, encyclopedias of plant life and animal behavior, banks of information about other people’s proclivities and preferences, and so on. All of these programs and the databases they create can be called on in different combinations to elicit a dazzling variety of behavioral responses.[1]

“Program?” “Database?” What exactly do those mean? That might seem like a strange question to hear from a computer scientist, but my training makes me acutely aware of how flexible those terms can be. [Read more…]

What is False?

John Oliver weighed in on the replication crisis, and I think he did a great job. I’d have liked a bit more on university press departments, who can write misleading press releases that journalists jump on, but he did have to simplify things for a lay audience.

It got me thinking about what “false” means, though. “True” is usually defined as “in line with reality,” so “false” should mean “not in line with reality,” the precise complement.

But don’t think about it in terms of a single claim; think of multiple data points applied to a specific theory. Suppose we analyze that data, and find that all but a few data points are predicted by the hypothesis we’re testing. Does this mean the hypothesis is false, since it isn’t in line with reality in all cases, or true, because it’s more in line with reality than not? Falsification argues that it is false, and exploits that to come up with this epistemology:

  1. Gather data.
  2. Is that data predicted by the hypothesis? If so, repeat step 1.
  3. If not, replace this hypothesis with another that predicts all the data we’ve seen so far, and repeat step 1.
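
The three steps above can be sketched as a loop. This is only an illustration under some assumptions of mine: hypotheses are plain functions from an input to a predicted output, the replacements come from a fixed list of candidates, and the names `falsify`, `stream` and `candidates` are made up for the example.

```python
def falsify(stream, candidates, tolerance=1e-9):
    """Return which candidate hypothesis is standing after each new datum."""
    seen, current, history = [], 0, []
    for x, y in stream:
        seen.append((x, y))                       # step 1: gather data
        # Steps 2 and 3: keep the current hypothesis while it predicts
        # everything seen so far, otherwise move on to the next candidate.
        while current < len(candidates) and not all(
            abs(candidates[current](a) - b) <= tolerance for a, b in seen
        ):
            current += 1
        if current == len(candidates):
            raise LookupError("every candidate hypothesis has been falsified")
        history.append(current)
    return history

candidates = [lambda x: x, lambda x: x * x]       # "y = x", then "y = x squared"
print(falsify([(1, 1), (2, 4), (3, 9)], candidates))  # [0, 1, 1]
```

A single bad data point is enough to discard "y = x" for good, no matter how many points it got right beforehand.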

That’s what I had in mind when I said that frequentism works on streams of hypotheses, hopping from one “best” hypothesis to the next. The addition of time changes the original definitions slightly, so that “true” really means “in line with reality in all instances” while “false” means “in at least one instance, it is not in line with reality.”

Notice the asymmetry, though. A hypothesis has to reach a pretty high bar to be considered “true,” and “false” hypotheses range from “in line with reality, with one exception” to “never in line with reality.” Some of those “false” hypotheses are actually quite valuable to us, as John Oliver’s segment demonstrates. He never explains what “statistical significance” means, for instance, but later on uses “significance” in the “effect size” sense. This will mislead most of the audience away from the reality of the situation, and in the absolute it makes his segment “false.” Nonetheless, that segment was a net positive at getting people to understand and care about the replication crisis, so labeling it “false” is a disservice.

We need something fuzzier than the strict binary of falsification. What if we didn’t complement “true” in the set-theory sense, but in the definitional sense? Let “true” remain “in line with reality in all instances,” but change “false” from “in at least one instance, it is not in line with reality” to “never in line with reality.” This creates a gap, though: that hypothesis from earlier is neither “true” nor “false,” as it isn’t true in all cases nor false in all. It must be in a third category, as part of some sort of paraconsistent logic.

This is where the Bayesian interpretation of statistics comes from: it deliberately disclaims an absolute “true” or “false” label for descriptions of the world, instead holding them up as two ends of a continuum. Every hypothesis sits in the third category in between, in the hope that future data will reveal that it’s closer to one end of the continuum or the other.
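
As a rough sketch of that continuum, here is Bayes’ rule applied one data point at a time. The numbers are assumptions picked purely for illustration: the hypothesis predicts a "hit" 90% of the time, the alternative is a coin flip, and we start at 50/50. The function name `update` is made up as well.

```python
def update(prior, p_data_if_true, p_data_if_false):
    """Bayes' rule: credence in a hypothesis after seeing one data point."""
    evidence = prior * p_data_if_true + (1 - prior) * p_data_if_false
    return prior * p_data_if_true / evidence

credence = 0.5                                  # assumed starting point
for hit in [True, True, False, True]:           # one miss among three hits
    # Assumed likelihoods: 0.9 / 0.1 under the hypothesis, 0.5 otherwise.
    credence = update(credence, 0.9 if hit else 0.1, 0.5)
    print(round(credence, 3))
```

The miss drags the credence down, but it never hits zero; the hypothesis just slides along the continuum instead of being thrown out.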

I think it’s a neat way to view the Bayesian/frequentist debate, as a mere disagreement over what “false” means.

A Computer Scientist Reads EvoPsych, Part 2

[Part 1]

the concept of “learning” within the Standard Social Science Model itself tacitly invokes unbounded rationality, in that learning is the tendency of the general-purpose, equipotential mind to grow—by an unspecified and undiscovered computational means—whatever functional information-processing abilities it needs to serve its purposes, given time and experience in the task environment.

Evolutionary psychologists depart from fitness teleologists, traditional economists (but not neuroeconomists), and blank-slate learning theorists by arguing that neither human engineers nor evolution can build a computational device that exhibits these forms of unbounded rationality, because such architectures are impossible, even in principle (for arguments, see Cosmides & Tooby, 1987; Symons 1989, 1992; Tooby & Cosmides, 1990a, 1992).[1]

Yeah, these people don’t know much about computer science.

You can divide the field of “artificial” intelligence into two basic approaches. The top-down approach outlined modular code routines like “recognize faces,” then broke those down into sub-tasks like “look for eyes” and “find mouths.” By starting at a high level and dividing these things down into neat, tidy sub-programs, we can chain them together and create a greater whole.

It’s never worked all that well, at least for real-life problems. Take Cyc, the best example I can think of. It takes basic facts about the world, like “water is wet” or “rain is water,” and uses a simple set of rules to query these facts (“is rain wet?”). What it can’t do is make guesses (“are clouds wet?”), nor discover new facts on its own, nor handle anything but simple text. Thirty years and millions of dollars haven’t made a dent in those problems.
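
To show the flavor of that approach, here is a toy fact base with a single chaining rule. It is emphatically not how Cyc stores or queries anything; the `FACTS` table and the `follows` function are made up for this example.

```python
# Toy fact base: each entry reads "<key> is <value>".
FACTS = {"rain": {"water"}, "water": {"wet"}}

def follows(subject, target, facts, seen=None):
    """True if `subject is target` can be derived by chaining known facts."""
    seen = set() if seen is None else seen
    if subject in seen:              # guard against circular facts
        return False
    seen.add(subject)
    for thing in facts.get(subject, ()):
        if thing == target or follows(thing, target, facts, seen):
            return True
    return False

print(follows("rain", "wet", FACTS))    # True: rain is water, water is wet
print(follows("clouds", "wet", FACTS))  # False: nothing is known about clouds
```

The second query fails not because clouds are dry, but because nobody typed in a fact about clouds and the system has no way to guess.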

Meanwhile, the graphics card manufacturer NVidia is betting the farm on something called “deep learning,” one of several “bottom-up” approaches. You present the algorithm with an image (or sound file or object, the number of dimensions can be easily changed), and it maps it to a grid of cells. You toss a slightly smaller grid of cells on top of it, and for each new cell you calculate a weighted sum of the nearby values in the previous grid, weights that are random to start off with. Repeat this several times, and you’ll wind up with a single cell at the end. Assign this cell to an output, say “person,” then rewind all the way back to the start. Wash, rinse, and repeat until you get another single cell, then at least enough single cells to handle every possible solution. All of these single cells have a value associated with them, so that “person” cell might give the image 0.7 “person”s. Having cataloged what’s in the image already, you know there’s actually 1.0 “person” there, and so you propagate that information back down the chain. Prior cell weights which were pro-person are increased, while the anti-person ones are decreased. Do this right to the bottom, and for every input cell, then repeat the process for a new image.
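
Here is a stripped-down sketch of that loop in plain Python, with a lot of assumptions baked in. It uses one shared set of weights (a "kernel") per layer, no non-linearities, a single output cell, and a squared-error measure of how far off the answer is; real deep-learning systems differ on every one of those points. The function names are made up for the example.

```python
import random

def random_kernel(k, rng):
    """A k-by-k grid of weights that are random to start off with."""
    return [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(k)]

def shrink(grid, kernel):
    """Lay a smaller grid on top: each new cell is a weighted sum of neighbours."""
    k, n = len(kernel), len(grid) - len(kernel) + 1
    return [[sum(kernel[a][b] * grid[i + a][j + b]
                 for a in range(k) for b in range(k))
             for j in range(n)] for i in range(n)]

def forward(image, kernels):
    """Keep every intermediate grid; the last one is the single output cell."""
    grids = [image]
    for kernel in kernels:
        grids.append(shrink(grids[-1], kernel))
    return grids

def train_step(image, kernels, target, rate=0.01):
    """One pass up the chain, then push the error back down and adjust weights."""
    grids = forward(image, kernels)
    output = grids[-1][0][0]
    grad = [[output - target]]           # how wrong the output cell was
    for layer in reversed(range(len(kernels))):
        kernel, inp = kernels[layer], grids[layer]
        k, n = len(kernel), len(grad)
        grad_kernel = [[sum(grad[i][j] * inp[i + a][j + b]
                            for i in range(n) for j in range(n))
                        for b in range(k)] for a in range(k)]
        grad_inp = [[0.0] * len(inp) for _ in inp]
        for i in range(n):
            for j in range(n):
                for a in range(k):
                    for b in range(k):
                        grad_inp[i + a][j + b] += grad[i][j] * kernel[a][b]
        for a in range(k):               # nudge this layer's weights
            for b in range(k):
                kernel[a][b] -= rate * grad_kernel[a][b]
        grad = grad_inp                  # pass the blame to the previous grid
    return output

rng = random.Random(0)
image = [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 0.0]]
kernels = [random_kernel(2, rng), random_kernel(2, rng)]   # 3x3 -> 2x2 -> 1x1
for _ in range(5):
    print(train_step(image, kernels, target=1.0))
```

Each printed value is the output cell’s answer before that round’s correction; the target of 1.0 plays the role of the "there’s actually 1.0 person there" label.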

It’s loosely patterned after how our own neurons are laid out. Biology is a bit more liberal with how it connects, but this structure has the virtue of being easy to calculate and massively parallel, quite convenient for a company which manufactures processors that specialize in massively parallel computations. NVidia’s farm-betting comes from the fact that it’s wildly successful; all of the best image recognition algorithms follow the deep-learning pattern, and their success rates are not only impressive but also resemble our own.[2]

Heard of the AI that could play Atari games? Emphasis mine:

Our [Deep action-value Network or DQN] method outperforms the best existing reinforcement learning methods on 43 games without incorporating any of the additional prior knowledge about Atari 2600 games used by other approaches … . Furthermore, our DQN agent performed at a level that was comparable to that of a professional human games tester across the set of 49 games, achieving more than 75% of the human score on more than half of the games […]

Indeed, in certain games DQN is able to discover a relatively long-term strategy (for example, Breakout: the agent learns the optimal strategy, which is to first dig a tunnel around the side of the wall allowing the ball to be sent around the back to destroy a large number of blocks; …). […]

In this work, we demonstrate that a single architecture can successfully learn control policies in a range of different environments with only very minimal prior knowledge, receiving only the pixels and the game score as inputs, and using the same algorithm, network architecture and hyperparameters on each game, privy only to the inputs a human player would have.[3]

This deep learning network has no idea what a video game is, nor is it permitted to peek at the innards of the game itself, yet not only can it learn to play these games at the same level as human beings, it can also develop non-trivial solutions to them. You can’t get more “blank slate” than that.

This basic pattern has repeated multiple times over the decades. Neural nets aren’t as zippy as the new kid on the “bottom-up” block, yet they too have had great success where the modular top-down approach has failed miserably. I haven’t worked with either technology, but I’ve worked with something that’s related: genetic algorithms. Represent your solutions in a sort of genome, come up with a fitness metric for them, then mutate or randomly construct those genomes and keep the fittest ones in the pool until you’ve tried every possibility, or you get bored. Two separate runs might converge to the same solution, or they might not. A lot depends on the “fitness landscape” they occupy, which you can visualize as a 3D terrain map with height representing how “fit” something is.
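
For the curious, a bare-bones version of that recipe looks something like this. The details are my assumptions for the sake of a small example: genomes are lists of bits, mutation flips a single random bit, and "getting bored" is a fixed number of generations. The `evolve` function is hypothetical, not taken from any library.

```python
import random

def evolve(fitness, genome_length, pool_size=20, generations=200, rng=None):
    """Mutate a pool of bit-string genomes and keep the fittest each round."""
    rng = rng or random.Random()
    # Randomly construct the starting pool.
    pool = [[rng.randint(0, 1) for _ in range(genome_length)]
            for _ in range(pool_size)]
    for _ in range(generations):                 # stop when we "get bored"
        children = []
        for parent in pool:
            child = parent[:]
            child[rng.randrange(genome_length)] ^= 1   # flip one random bit
            children.append(child)
        # Parents and children compete; only the fittest stay in the pool.
        pool = sorted(pool + children, key=fitness, reverse=True)[:pool_size]
    return pool[0]

# Toy fitness metric: the more 1s in the genome, the fitter it is.
best = evolve(fitness=sum, genome_length=16, rng=random.Random(0))
print(best, sum(best))
```

Counting 1s gives a landscape with a single peak, so this run has an easy time of it; swap in a bumpier fitness function and two runs can easily end up on different hills.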

[Image: A visualization of three "evolutionary fitness landscapes," ranging from simple to complex to SUPER complex.]

That landscape has probably got more than three dimensions, but those aren’t as easy to visualize and they behave very similarly to the 3D case. The terrain might be a Mount Fuji with a single solution at the top of a fitness peak, or a Himalayas with many peak solutions scattered about but a single tallest standing above them, or a foothills where solutions are aplenty but the best solution is tough to find.

All of these take the “bottom-up” approach, the opposite of the “top-down” one, and work up from very small components towards a high-level goal. The path to there is rarely known in advance, so the system “feels” its way there via evolutionary algorithms.

That path may not go the way you expect, however. Take the case of a researcher, Dr. Adrian Thompson, who used an evolutionary algorithm to find the smallest computer processor that could sense the difference between two tones.

Finally, after just over 4,000 generations, the test system settled upon the best program. When Dr. Thompson played the 1kHz tone, the microchip unfailingly reacted by decreasing its power output to zero volts. When he played the 10kHz tone, the output jumped up to five volts. He pushed the chip even farther by requiring it to react to vocal “stop” and “go” commands, a task it met with a few hundred more generations of evolution. As predicted, the principle of natural selection could successfully produce specialized circuits using a fraction of the resources a human would have required. And no one had the foggiest notion how it worked.

Dr. Thompson peered inside his perfect offspring to gain insight into its methods, but what he found inside was baffling. The plucky chip was utilizing only thirty-seven of its one hundred logic gates, and most of them were arranged in a curious collection of feedback loops. Five individual logic cells were functionally disconnected from the rest— with no pathways that would allow them to influence the output— yet when the researcher disabled any one of them the chip lost its ability to discriminate the tones. Furthermore, the final program did not work reliably when it was loaded onto other FPGAs of the same type.

It seems that evolution had not merely selected the best code for the task, it had also advocated those programs which took advantage of the electromagnetic quirks of that specific microchip environment. The five separate logic cells were clearly crucial to the chip’s operation, but they were interacting with the main circuitry through some unorthodox method— most likely via the subtle magnetic fields that are created when electrons flow through circuitry, an effect known as magnetic flux. There was also evidence that the circuit was not relying solely on the transistors’ absolute ON and OFF positions like a typical chip; it was capitalizing upon analogue shades of gray along with the digital black and white.[4]

Evolutionary approaches are very simple and require no understanding or insight into the problem you’re solving, but they usually require ridiculous amounts of computation or training merely to keep pace with the top-down “modular” approach. The fitness function may lead to a solution much too complicated for you to understand or much too fragile to operate anywhere but where it was generated. But the bottom-up approach may be your only choice for certain problems.

The moral of the story: the ability to do complex calculation can be built up from a blank slate, in principle and practice. When we follow the bottom-up approach we tend to get results that more closely mirror biology than when we work from the top-down and modularize, though this is less insightful than it first appears. Nearly all bottom-up approaches take direct inspiration from biology, whereas top-down approaches owe more to Plato than Aristotle.

Biology prefers the blank slate.

[Part 3]

[1] Tooby, John, and Leda Cosmides. “Conceptual Foundations of Evolutionary Psychology.” The Handbook of Evolutionary Psychology (2005): 5-67.

[2] Kheradpisheh, Saeed Reza, et al. “Deep Networks Resemble Human Feed-forward Vision in Invariant Object Recognition.” arXiv preprint arXiv:1508.03929 (2015).

[3] Mnih, Volodymyr, et al. “Human-level control through deep reinforcement learning.” Nature 518.7540 (2015): 529-533.

[4] Bellows, Alan. “On the Origin of Circuits • Damn Interesting.” Accessed May 4, 2016.