“You might think that’s OK”

If you’ve read my blog for a while, you’ve probably noticed that I treat US Republican politicians as if they were a hive mind. That’s obviously false, but when they act as a unit to continue family separation policies or put partisan hacks on the Supreme Court, their differences are small enough to safely ignore.

Today, we got another example of that. The Intelligence Committee within the US House of Representatives held a hearing on Russian interference. Rather than contribute towards that, however, every Republican on the committee used their time to demand the head of the committee step down. Why? According to a letter they released,

Despite these findings [of the Special Counsel report], you continue to proclaim in the media that there is “significant evidence of collusion.” You further have stated you “will continue to investigate the counterintelligence issues. That is, is the president or people around him compromised in any way to a hostile foreign power?” Your willingness to continue to promote a demonstrably false narrative is alarming.

Either Adam Schiff knew this was coming, or he’s damn quick on his feet, because he shot back with this. Forgive the length of this quote, but it’s worth absorbing in full. [Read more…]

Ugh, Not Again

P-values are back in the news. Nature published an article, signed by 800 scientists, calling for an end to the concept of “statistical significance.” It ruffled my feathers, even though I agreed with its central thesis.

The trouble is human and cognitive more than it is statistical: bucketing results into ‘statistically significant’ and ‘statistically non-significant’ makes people think that the items assigned in that way are categorically different. The same problems are likely to arise under any proposed statistical alternative that involves dichotomization, whether frequentist, Bayesian or otherwise.

Unfortunately, the false belief that crossing the threshold of statistical significance is enough to show that a result is ‘real’ has led scientists and journal editors to privilege such results, thereby distorting the literature. Statistically significant estimates are biased upwards in magnitude and potentially to a large degree, whereas statistically non-significant estimates are biased downwards in magnitude. Consequently, any discussion that focuses on estimates chosen for their significance will be biased. On top of this, the rigid focus on statistical significance encourages researchers to choose data and methods that yield statistical significance for some desired (or simply publishable) result, or that yield statistical non-significance for an undesired result, such as potential side effects of drugs — thereby invalidating conclusions.
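The magnitude bias described in that passage is easy to see in a toy simulation. The sketch below is an illustration, not anything from the Nature article: the true effect of 0.2, the sample size of 25, the known standard deviation, and the two-sided z-test at 1.96 are all assumptions chosen to make the effect visible.

```python
import random
import statistics


def simulate(true_effect=0.2, sigma=1.0, n=25, trials=20000, seed=1):
    """Run many identical studies and sort their estimates by significance.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    se = sigma / n ** 0.5  # standard error of the mean, sigma treated as known
    significant, non_significant = [], []
    for _ in range(trials):
        sample = [rng.gauss(true_effect, sigma) for _ in range(n)]
        estimate = statistics.fmean(sample)
        z = estimate / se  # test against a null effect of zero
        if abs(z) > 1.96:  # the conventional two-sided 5% threshold
            significant.append(estimate)
        else:
            non_significant.append(estimate)
    return significant, non_significant


significant, non_significant = simulate()

# Average size of the estimate within each bucket, ignoring sign.
sig_magnitude = statistics.fmean([abs(e) for e in significant])
nonsig_magnitude = statistics.fmean([abs(e) for e in non_significant])

print("true effect:                 0.2")
print("mean |estimate|, significant:    ", round(sig_magnitude, 3))
print("mean |estimate|, non-significant:", round(nonsig_magnitude, 3))
```

Every study here measures the same true effect, yet the "significant" pile overstates it and the "non-significant" pile understates it. With these settings the overstatement is guaranteed, since an estimate has to exceed 1.96 standard errors (about 0.39) to clear the bar at all.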

Nothing wrong there. While I’ve mentioned some Bayesian buckets, I tucked away a one-sentence counter-argument in an aside over here. Any artificial significant/non-significant boundary is going to promote the distortions they mention here. What got me writing this post was their recommendations.

What will retiring statistical significance look like? We hope that methods sections and data tabulation will be more detailed and nuanced. Authors will emphasize their estimates and the uncertainty in them — for example, by explicitly discussing the lower and upper limits of their intervals. They will not rely on significance tests. When P values are reported, they will be given with sensible precision (for example, P = 0.021 or P = 0.13) — without adornments such as stars or letters to denote statistical significance and not as binary inequalities (P  < 0.05 or P > 0.05). Decisions to interpret or to publish results will not be based on statistical thresholds. People will spend less time with statistical software, and more time thinking.

This basically amounts to nothing. Journal editors still have to decide what to print, and if there is no strong alternative they’ll switch from an arbitrary cutoff of p < 0.05 to an ad-hoc arbitrary cutoff. In the meantime, they’re leaving flawed statistical procedures in place. P-values exaggerate the strength of the evidence, as I and others have argued. Confidence intervals are not an improvement, either. As I put it:

For one thing, if you’re a frequentist it’s a category error to state the odds of a hypothesis being true, or that some data makes a hypothesis more likely, or even that you’re testing the truth-hood of a hypothesis. […]

How does this intersect with confidence intervals? If it’s an invalid move to hypothesise[sic] “the population mean is Y,” it must also be invalid to say “there’s a 95% chance the population mean is between X and Z.” That’s attaching a probability to a hypothesis, and therefore a no-no! Instead, what a frequentist confidence interval is really telling you is “assuming this data is a representative sample, if I repeat my experimental procedure an infinite number of times then I’ll calculate a sample mean between X and Z 95% of the time.” A confidence interval says nothing about the test statistic, at least not directly.
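To make the repeated-sampling idea concrete, here is a minimal sketch under the textbook reading, where the 95% belongs to the interval-building procedure rather than to any single interval. The true mean, standard deviation, and sample size are made-up values, and a known-sigma z interval is assumed to keep the arithmetic simple.

```python
import random
import statistics


def coverage(true_mean=10.0, sigma=2.0, n=30, trials=5000, seed=2):
    """Fraction of repeated experiments whose 95% interval covers the fixed mean.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5  # known-sigma z interval
    hits = 0
    for _ in range(trials):
        # The parameter never changes; only the data do.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        centre = statistics.fmean(sample)
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / trials


rate = coverage()
print("share of intervals covering the true mean:", rate)
```

Note what the simulation never does: it never assigns a probability to the true mean. The mean stays put at one value the whole time, and the only thing that moves is the interval.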

In frequentism, the parameter is fixed and the data varies. It doesn’t make sense to consider other parameters; that’s a Bayesian move. And yet the authors propose exactly that!

We must learn to embrace uncertainty. One practical way to do so is to rename confidence intervals as ‘compatibility intervals’ and interpret them in a way that avoids overconfidence. Specifically, we recommend that authors describe the practical implications of all values inside the interval, especially the observed effect (or point estimate) and the limits. In doing so, they should remember that all the values between the interval’s limits are reasonably compatible with the data, given the statistical assumptions used to compute the interval. Therefore, singling out one particular value (such as the null value) in the interval as ‘shown’ makes no sense.

Much of what the authors proposed would be fixed by switching to Bayesian statistics. Their own suggestions invoke Bayesian ideas, apparently without the authors realizing it. Yet they go out of their way to say nothing’s wrong with p-values or confidence intervals, despite evidence to the contrary. Their proposal is destined to fail, yet it got more support than the arguably superior p < 0.005 proposal.

Maddening. Maybe it’s time I got out my poison pen and added my two cents to the scientific record.

Happy Emmy Noether Day!

Whenever anyone asks me for my favorite scientist, her name comes first.

At a time when women were considered intellectually inferior to men, Noether (pronounced NUR-ter) won the admiration of her male colleagues. She resolved a nagging puzzle in Albert Einstein’s newfound theory of gravity, the general theory of relativity. And in the process, she proved a revolutionary mathematical theorem that changed the way physicists study the universe.

It’s been a century since the July 23, 1918, unveiling of Noether’s famous theorem. Yet its importance persists today. “That theorem has been a guiding star to 20th and 21st century physics,” says theoretical physicist Frank Wilczek of MIT. […]

Although most people have never heard of Noether, physicists sing her theorem’s praises. The theorem is “pervasive in everything we do,” says theoretical physicist Ruth Gregory of Durham University in England. Gregory, who has lectured on the importance of Noether’s work, studies gravity, a field in which Noether’s legacy looms large.

And as luck would have it, today was the day she was born. So read up on why she’s such a critical figure, and use it as an excuse to remember other important women in science.

Rapid Onset Gender Dysphoria

Remember that old thing? No? OK, quick summary:

Parental reports (on social media) of friend clusters exhibiting signs of gender dysphoria and increased exposure to social media/internet preceding a child’s announcement of a transgender identity raise the possibility of social and peer influences.

Littman L (2018) Parent reports of adolescents and young adults perceived to show signs of a rapid onset of gender dysphoria. PLoS ONE 13(8): e0202330.

In short, maybe social media is making the kids transgender? This seems like something someone should study, and someone did!

Poorly. [Read more…]

I Think I Get It

We seem to be in a cycle. Every time PZ Myers posts something about transgender people, the comment thread floods with transphobes. Given the names involved, I suspect this is due to Ophelia Benson’s effect on the atheist/skeptic sphere.

Regardless, there may be another pattern in play. The go-to argument of these transphobes was transgender athletes, with the old bathroom line showing up late in the thread. I had a boo at GenderCritical on Reddit, to assess if this was just a local thing, and noticed there were more stories about athletics than bathrooms over there. Even one of the bigots thought this was new. Has there been a shift of rhetoric among transphobes?

If so, I think I understand why.

[Read more…]