Fundraising Update 1

TL;DR: We’re pretty much on track, though we also haven’t hit the goal of pushing the fund past $78,890.69. Donate and help put the fund over the line!

With the short version out of the way, let’s dive into the details. What’s changed in the past week and change?

import datetime as dt

import matplotlib.pyplot as pl

import pandas as pd
import pandas.tseries.offsets as pdto


cutoff_day = dt.datetime( 2020, 5, 27, tzinfo=dt.timezone(dt.timedelta(hours=-6)) )

donations = pd.read_csv('donations.cleaned.tsv',sep='\t')

donations['epoch'] = pd.to_datetime(donations['created_at'])
donations['delta_epoch'] = donations['epoch'] - cutoff_day
donations['delta_epoch_days'] = donations['delta_epoch'].apply(lambda x: x.days)

# some adjustment is necessary to line up with the current total
donations['culm'] = donations['amount'].cumsum() + 14723

new_donations_mask = donations['delta_epoch_days'] > 0
print( f"There have been {sum(new_donations_mask)} donations since {cutoff_day}." )
There have been 8 donations since 2020-05-27 00:00:00-06:00.

There’s been a reasonable number of donations after I published that original post. What does that look like, relative to the previous graph?

pl.figure(num=None, figsize=(8, 4), dpi=150, facecolor='w', edgecolor='k')

pl.plot( donations['delta_epoch_days'], donations['culm'], '-',c='#aaaaaa')
pl.plot( donations['delta_epoch_days'][new_donations_mask], \
        donations['culm'][new_donations_mask], '-',c='#0099ff')

pl.title("Defense against Carrier SLAPP Suit")

pl.xlabel("days since cutoff")
pl.ylabel("dollars")
pl.xlim( [-365.26,donations['delta_epoch_days'].max()] )
pl.ylim( [55000,82500] )
pl.show()

An updated chart from the past year. New donations are in blue.

That’s certainly an improvement in the short term, though the graph is much too zoomed out to say more. Let’s zoom in, and overlay the posterior.

# load the previously-fitted posterior
flat_chain = np.loadtxt('starting_posterior.csv')


pl.figure(num=None, figsize=(8, 4), dpi=150, facecolor='w', edgecolor='k')

x = np.array([0, donations['delta_epoch_days'].max()])
for m,_,_ in flat_chain:
    pl.plot( x, m*x + 78039, '-r', alpha=0.05 )
    
pl.plot( donations['delta_epoch_days'], donations['culm'], '-', c='#aaaaaa')
pl.plot( donations['delta_epoch_days'][new_donations_mask], \
        donations['culm'][new_donations_mask], '-', c='#0099ff')

pl.title("Defense against Carrier SLAPP Suit")

pl.xlabel("days since cutoff")
pl.ylabel("dollars")
pl.xlim( [-3,x[1]+1] )
pl.ylim( [77800,79000] )

pl.show()

A zoomed-in view of the new donations, with posteriors overlaid.

Hmm, looks like we’re right where the posterior predicted we’d be. My targets were pretty modest, though, consisting of an increase of 3% and 10%, so this doesn’t mean they’ve been missed. Let’s extend the chart to day 16, and explicitly overlay the two targets I set out.

low_target = 78890.69
high_target = 78948.57
target_day = dt.datetime( 2020, 6, 12, 23, 59, tzinfo=dt.timezone(dt.timedelta(hours=-6)) )
target_since_cutoff = (target_day - cutoff_day).days

pl.figure(num=None, figsize=(8, 4), dpi=150, facecolor='w', edgecolor='k')

x = np.array([0, target_since_cutoff])
pl.fill_between( x, [78039, low_target], [78039, high_target], color='#ccbbbb', label='blog post')
pl.fill_between( x, [78039, high_target], [high_target, high_target], color='#ffeeee', label='video')

pl.plot( donations['delta_epoch_days'], donations['culm'], '-',c='#aaaaaa')
pl.plot( donations['delta_epoch_days'][new_donations_mask], \
        donations['culm'][new_donations_mask], '-',c='#0099ff')

pl.title("Defense against Carrier SLAPP Suit")

pl.xlabel("days since cutoff")
pl.ylabel("dollars")
pl.xlim( [-3, target_since_cutoff] )
pl.ylim( [77800,high_target] )

pl.legend(loc='lower right')
pl.show()

The previous graph, this time with targets overlaid.

To earn a blog post and video on Bayes from me, we need the line to be in the pink zone by the time it reaches the end of the graph. For just the blog post, it need only be in the grayish- area. As you can see, it’s painfully close to being in line with the lower of two goals, though if nobody donates between now and Friday it’ll obviously fall quite short.

So if you want to see that blog post, get donating!

Fundraising Target Number 1

If our goal is to raise funds for a good cause, we should at least have an idea of where the funds are at.

(Click here to show the code)
created_at amount epoch delta_epoch culm
0 2017-01-24T07:27:51-06:00 10.0 2017-01-24 07:27:51-06:00 -1218 days +19:51:12 14733.0
1 2017-01-24T07:31:09-06:00 50.0 2017-01-24 07:31:09-06:00 -1218 days +19:54:30 14783.0
2 2017-01-24T07:41:20-06:00 100.0 2017-01-24 07:41:20-06:00 -1218 days +20:04:41 14883.0
3 2017-01-24T07:50:20-06:00 10.0 2017-01-24 07:50:20-06:00 -1218 days +20:13:41 14893.0
4 2017-01-24T08:03:26-06:00 25.0 2017-01-24 08:03:26-06:00 -1218 days +20:26:47 14918.0

Changing the dataset so the last donation happens at time zero makes it both easier to fit the data and easier to understand what’s happening. The first day after the last donation is now day one.

Donations from 2017 don’t tell us much about the current state of the fund, though, so let’s focus on just the last year.

(Click here to show the code)

The last year of donations, for the lawsuit fundraiser.

The donations seem to arrive in bursts, but there have been two quiet portions. One is thanks to the current pandemic, and the other was during last year’s late spring/early summer. It’s hard to tell what the donation rate is just by eye-ball, though. We need to smooth this out via a model.
The simplest such model is linear regression, aka. fitting a line. We want to incorporate uncertainty into the mix, which means a Bayesian fit. Now, what MCMC engine to use, hmmm…. emcee is my overall favourite, but I’m much too reliant on it. I’ve used PyMC3 a few times with success, but recently it’s been acting flaky. Time to pull out the big guns: Stan. I’ve been avoiding it because pystan‘s compilation times drove me nuts, but all the cool kids have switched to cmdstanpy when I looked away. Let’s give that a whirl.

(Click here to show the code)
CPU times: user 5.33 ms, sys: 7.33 ms, total: 12.7 ms
Wall time: 421 ms
CmdStan installed.

We can’t fit to the entire three-year time sequence, that just wouldn’t be fair given the recent slump in donations. How about the last six months? That covers both a few donation burts and a flat period, so it’s more in line with what we’d expect in future.

(Click here to show the code)
There were 117 donations over the last six months.

With the data prepped, we can shift to building the linear model.

(Click here to show the code)

I could have just gone with Stan’s basic model, but flat priors aren’t my style. My preferred prior for the slope is the inverse tangent, as it compensates for the tendency of large slope values to “bunch up” on one another. Stan doesn’t offer it by default, but the Cauchy distribution isn’t too far off.

We’d like the standard deviation to skew towards smaller values. It naturally tends to minimize itself when maximizing the likelihood, but an explicit skew will encourage this process along. Gelman and the Stan crew are drifting towards normal priors, but I still like a Cauchy prior for its weird properties.

Normally I’d plunk the Gaussian distribution in to handle divergence from the deterministic model, but I hear using Student’s T instead will cut down the influence of outliers. Thomas Wiecki recommends one degree of freedom, but Gelman and co. find that it leads to poor convergence in some cases. They recommend somewhere between three and seven degrees of freedom, but skew towards three, so I’ll go with the flow here.

The y-intercept could land pretty much anywhere, making its prior difficult to figure out. Yes, I’ve adjusted the time axis so that the last donation is at time zero, but the recent flat portion pretty much guarantees the y-intercept will be higher than the current amount of funds. The traditional approach is to use a flat prior for the intercept, and I can’t think of a good reason to ditch that.

Not convinced I picked good priors? That’s cool, there should be enough data here that the priors have minimal influence anyway. Moving on, let’s see how long compilation takes.

(Click here to show the code)
CPU times: user 4.91 ms, sys: 5.3 ms, total: 10.2 ms
Wall time: 20.2 s

This is one area where emcee really shines: as a pure python library, it has zero compilation time. Both PyMC3 and Stan need some time to fire up an external compiler, which adds overhead. Twenty seconds isn’t too bad, though, especially if it leads to quick sampling times.

(Click here to show the code)
CPU times: user 14.7 ms, sys: 24.7 ms, total: 39.4 ms
Wall time: 829 ms

And it does! emcee can be pretty zippy for a simple linear regression, but Stan is in another class altogether. PyMC3 floats somewhere between the two, in my experience.

Another great feature of Stan are the built-in diagnostics. These are really handy for confirming the posterior converged, and if not it can give you tips on what’s wrong with the model.

(Click here to show the code)
Processing csv files: /tmp/tmpyfx91ua9/linear_regression-202005262238-1-e393mc6t.csv, /tmp/tmpyfx91ua9/linear_regression-202005262238-2-8u_r8umk.csv, /tmp/tmpyfx91ua9/linear_regression-202005262238-3-m36dbylo.csv, /tmp/tmpyfx91ua9/linear_regression-202005262238-4-hxjnszfe.csv

Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.

Checking sampler transitions for divergences.
No divergent transitions found.

Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory for all transitions.

Effective sample size satisfactory.

Split R-hat values satisfactory all parameters.

Processing complete, no problems detected.

The odds of a simple model with plenty of datapoints going sideways are pretty small, so this is another non-surprise. Enough waiting, though, let’s see the fit in action. First, we need to extract the posterior from the stored variables …

(Click here to show the code)
There are 256 samples in the posterior.

… and now free of its prison, we can plot the posterior against the original data. I’ll narrow the time window slightly, to make it easier to focus on the fit.

(Click here to show the code)

The same graph as before, but now slightly zoomed in on and with trendlines visible.

Looks like a decent fit to me, so we can start using it to answer a few questions. How much money is flowing into the fund each day, on average? How many years will it be until all those legal bills are paid off? Since humans aren’t good at counting in years, let’s also translate that number into a specific date.

(Click here to show the code)
mean/std/median slope = $51.62/1.65/51.76 per day

mean/std/median years to pay off the legal fees, relative to 2020-05-25 12:36:39-05:00 =
	1.962/0.063/1.955

mean/median estimate for paying off debt =
	2022-05-12 07:49:55.274942-05:00 / 2022-05-09 13:57:13.461426-05:00

Mid-May 2022, eh? That’s… not ideal. How much time can we shave off, if we increase the donation rate? Let’s play out a few scenarios.

(Click here to show the code)
median estimate for paying off debt, increasing rate by   1% = 2022-05-02 17:16:37.476652800
median estimate for paying off debt, increasing rate by   3% = 2022-04-18 23:48:28.185868800
median estimate for paying off debt, increasing rate by  10% = 2022-03-05 21:00:48.510403200
median estimate for paying off debt, increasing rate by  30% = 2021-11-26 00:10:56.277984
median estimate for paying off debt, increasing rate by 100% = 2021-05-17 18:16:56.230752

Bumping up the donation rate by one percent is pitiful. A three percent increase will almost shave off a month, which is just barely worthwhile, and a ten percent increase will roll the date forward by two. Those sound like good starting points, so let’s make them official: increase the current donation rate by three percent, and I’ll start pumping out the aforementioned blog posts on Bayesian statistics. Manage to increase it by 10%, and I’ll also record them as videos.

As implied, I don’t intend to keep the same rate throughout this entire process. If you surprise me with your generosity, I’ll bump up the rate. By the same token, though, if we go through a dry spell I’ll decrease the rate so the targets are easier to hit. My goal is to have at least a 50% success rate on that lower bar. Wouldn’t that make it impossible to hit the video target? Remember, though, it’ll take some time to determine the success rate. That lag should make it possible to blow past the target, and by the time this becomes an issue I’ll have thought of a better fix.

Ah, but over what timeframe should this rate increase? We could easily blow past the three percent target if someone donates a hundred bucks tomorrow, after all, and it’s no fair to announce this and hope your wallets are ready to go in an instant. How about… sixteen days. You’ve got sixteen days to hit one of those rate targets. That’s a nice round number, for a computer scientist, and it should (hopefully!) give me just enough time to whip up the first post. What does that goal translate to, in absolute numbers?

(Click here to show the code)
a   3% increase over 16 days translates to $851.69 + $78039.00 = $78890.69

Right, if you want those blog posts to start flowing you’ve got to get that fundraiser total to $78,890.69 before June 12th. As for the video…

(Click here to show the code)
a  10% increase over 16 days translates to $909.57 + $78039.00 = $78948.57

… you’ve got to hit $78,948.57 by the same date.

Ready? Set? Get donating!

It’s Payback Time

I’m back! Yay! Sorry about all that, but my workload was just ridiculous. Things should be a lot more slack for the next few months, so it’s time I got back blogging. This also means I can finally put into action something I’ve been sitting on for months.

Richard Carrier has been a sore spot for me. He was one of the reasons I got interested in Bayesian statistics, and for a while there I thought he was a cool progressive. Alas, when it was revealed he was instead a vindictive creepy asshole, it shook me a bit. I promised myself I’d help out somehow, but I’d already done the obsessive analysis thing and in hindsight I’m not convinced it did more good than harm. I was at a loss for what I could do, beyond sharing links to the fundraiser.

Now, I think I know. The lawsuits may be long over, thanks to Carrier coincidentally dropping them at roughly the same time he came under threat of a counter-suit, but the legal bill are still there and not going away anytime soon. Worse, with the removal of the threat people are starting to forget about those debts. There have been only five donations this month, and four in April. It’s time to bring a little attention back that way.

One nasty side-effect of Carrier’s lawsuits is that Bayesian statistics has become a punchline in the atheist/skeptic community. The reasoning is understandable, if flawed: Carrier is a crank, he promotes Bayesian statistics, ergo Bayesian statistics must be the tool of crackpots. This has been surreal for me to witness, as Bayes has become a critical tool in my kit over the last three years. I suppose I could survive without it, if I had to, but every alternative I’m aware of is worse. I’m not the only one in this camp, either.

Following the emergence of a novel coronavirus (SARS-CoV-2) and its spread outside of China, Europe is now experiencing large epidemics. In response, many European countries have implemented unprecedented non-pharmaceutical interventions including case isolation, the closure of schools and universities, banning of mass gatherings and/or public events, and most recently, widescale social distancing including local and national lockdowns. In this report, we use a semi-mechanistic Bayesian hierarchical model to attempt to infer the impact of these interventions across 11 European countries.

Flaxman, Seth, Swapnil Mishra, Axel Gandy, H Juliette T Unwin, Helen Coupland, Thomas A Mellan, Tresnia Berah, et al. “Estimating the Number of Infections and the Impact of Non- Pharmaceutical Interventions on COVID-19 in 11 European Countries,” 2020, 35.

In estimating time intervals between symptom onset and outcome, it was necessary to account for the fact that, during a growing epidemic, a higher proportion of the cases will have been infected recently (…). Therefore, we re-parameterised a gamma model to account for exponential growth using a growth rate of 0·14 per day, obtained from the early case onset data (…). Using Bayesian methods, we fitted gamma distributions to the data on time from onset to death and onset to recovery, conditional on having observed the final outcome.

Verity, Robert, Lucy C. Okell, Ilaria Dorigatti, Peter Winskill, Charles Whittaker, Natsuko Imai, Gina Cuomo-Dannenburg, et al. “Estimates of the Severity of Coronavirus Disease 2019: A Model-Based Analysis.” The Lancet Infectious Diseases 0, no. 0 (March 30, 2020). https://doi.org/10.1016/S1473-3099(20)30243-7.

we used Bayesian methods to infer parameter estimates and obtain credible intervals.

Linton, Natalie M., Tetsuro Kobayashi, Yichi Yang, Katsuma Hayashi, Andrei R. Akhmetzhanov, Sung-mok Jung, Baoyin Yuan, Ryo Kinoshita, and Hiroshi Nishiura. “Incubation Period and Other Epidemiological Characteristics of 2019 Novel Coronavirus Infections with Right Truncation: A Statistical Analysis of Publicly Available Case Data.” Journal of Clinical Medicine 9, no. 2 (February 2020): 538. https://doi.org/10.3390/jcm9020538.

A significant chunk of our understanding of COVID-19 depends on Bayesian statistics. I’ll go further and argue that you cannot fully understand this pandemic without it. And yet thanks to Richard Carrier, the atheist/skeptic community is primed to dismiss Bayesian statistics.

So let’s catch two stones with one bird. If enough people donate to this fundraiser, I’ll start blogging a course on Bayesian statistics. I think I’ve got a novel angle on the subject, one that’s easier to slip into than my 201-level stuff and yet more rigorous. If y’all really start tossing in the funds, I’ll make it a video series. Yes yes, there’s a pandemic and potential global depression going on, but that just means I’ll work for cheap! I’ll release the milestones and course outline over the next few days, but there’s no harm in an early start.

Help me help the people Richard Carrier hurt. I’ll try to make it worth your while.

Graham Linehan, Cowardly Ass

Sorry all, I’ve been busy. But I thought this situation was worth carving some time out to write about: Graham Linehan is a cowardly ass.

See, EssenceOfThought just released a nice little video calling Linehan out for his support of conversion therapy. As they put it:

Now maybe you read that Tweet and didn’t think much of it. After all, it’s just a call for ‘gender critical therapists’. Why’s that a problem? Well gender critical is euphemism for transphobia in the exact same way that ‘race realist’ is for racism. It’s meant to make the bigotry sound more scientific and therefore more palatable.

The truth meanwhile is that every major medical establishment condemns the self-labelled ‘gender critical’ approach which is a form of reparative ‘therapy’, though as noted earlier it is in fact torture. Said methods are abusive and inflict severe harm on the victim in attempts to turn them cisgender and force them to adhere to strict and archaic gender roles.

I response, Linehan issued a threat:

Hi there I have already begun legal proceedings against Pink News for this defamatory accusation. Take this down immediately or I will take appropriate measures.

Presumably “appropriate measures” involves a defamation lawsuit, though when you’re associated with a transphobic mob there’s a wide universe of possible “measures.”

In all fairness, I should point out that Mumsnet is trying to clean up their act. Linehan, in contrast, was warned by the UK police for harassing a transgender person. He also does the same dance of respectability I called out last post. Observe:

Linehan outlines his view to The Irish Times: “I don’t think I’m saying anything controversial. My position is that anyone suffering from gender dysphoria needs to be helped and supported.” Linehan says he celebrates that trans people are at last finding acceptance: “That’s obviously wonderful.” […]

He characterises some extreme trans activists who have “glommed on to the movement” as “a mixture of grifters, fetishists, and misogynists”. … “All it takes is a few bad people in positions of power to groom an organisation, and in this case a movement. This is a society-wide grooming.”

I suspect Linehan would lump EssenceOfThought in with the “grifters, fetishists, and misogynists,” which is telling. If you’ve never watched an EssenceOfThought video before, do so, then look at the list of citations:

[4] UK Council for Psychotherapy (2015) “Memorandum Of Understanding On Conversion Therapy In The UK”, psychotherapy.org.uk Accessed 31st August 2016: https://www.psychotherapy.org.uk/wp-c…

[5] American Academy Of Pediatrics (2015) “Letterhead For Washington DC 2015”, American Academy Of Pediatrics Accessed 19th September 2018; https://www.aap.org/en-us/advocacy-an…

[6] American Medical Association (2018) “Health Care Needs of Lesbian, Gay, Bisexual, Transgender and Queer Populations H-160.991”, AMA-ASSN.org Accessed 21st September 2019; https://policysearch.ama-assn.org/pol…

[7] Substance Abuse And Mental Health Services Administration (2015) Ending Conversion – Supporting And Affirming LGBTQ Youth”, SAMHSA.gov Accessed 21st September 2019; https://store.samhsa.gov/system/files…

[8] The Trevor Project (2019) “Trevor National Survey On LGBTQ Youth Mental Health”, The Trevor Project Accessed 28th June 2019; https://www.thetrevorproject.org/wp-c…

[9] Turban, J. L., Beckwith, N., Reisner, S. L., & Keuroghlian, A. S. (2019) “Association Between Recalled Exposure To Gender Identity Conversion Efforts And Psychological Distress and Suicide Attempts Among Transgender Adults”, JAMA Psychiatry

[10] Kristina R. Olson, Lily Durwood, Madeleine DeMeules, Katie A. McLaughlin (2016) “Mental Health of Transgender Children Who Are Supported in Their Identities” http://pediatrics.aappublications.org…

[11] Kristina R. Olson, Lily Durwood, Katie A. McLaughlin (2017) “Mental Health And Self-Worth In Socially Transitioned Transgender Youth”, Child And Adolescent Psychiatry, Volume 56, Issue 2, pp.116–123 http://www.jaacap.com/article/S0890-8…

What I love about citation lists is that you can double-check they’re being accurately represented. One reason why I loathe Stephen Pinker, for instance, is because I started hopping down his citation list, and kept finding misrepresentation after misrepresentation. Let’s look at citation 9, as I see EoT didn’t link to the journal article.

Of 27 715 transgender survey respondents (mean [SD] age, 31.2 [13.5] years), 11 857 (42.8%) were assigned male sex at birth. Among the 19 741 (71.3%) who had ever spoken to a professional about their gender identity, 3869 (19.6%; 95% CI, 18.7%-20.5%) reported exposure to GICE in their lifetime. Recalled lifetime exposure was associated with severe psychological distress during the previous month (adjusted odds ratio [aOR], 1.56; 95% CI, 1.09-2.24; P < .001) compared with non-GICE therapy. Associations were found between recalled lifetime exposure and higher odds of lifetime suicide attempts (aOR, 2.27; 95% CI, 1.60-3.24; P < .001) and recalled exposure before the age of 10 years and increased odds of lifetime suicide attempts (aOR, 4.15; 95% CI, 2.44-7.69; P < .001). No significant differences were found when comparing exposure to GICE by secular professionals vs religious advisors.

Compare and contrast with how EssenceOfThought describe that study:

They also found no significant difference when comparing religious or secular conversion attempts. So it’s not a case of finding the right way to do it, there is no right way to do it. You’re simply torturing someone for the sake of inflicting pain. And that is fucking digusting.

And the thing is we know how to help young people who are questioning their gender. And that is to take the gender affirmative approach. That is an approach that allows a child and young teen to explore their identity with support. No mater what conclusion they arrive at.

Compare and contrast both with Linehan’s own view of gender affirmation in youth.

“There are lots of gender non-conforming children who may not be trans and may grow up to be gay adults, but who are being told by an extreme, misogynist ideology, that they were born in the wrong body, and anyone who disagrees with that diagnosis is a bigot.”

“It’s especially dangerous for teenage girls – the numbers referred to gender clinics have shot up – because society, in a million ways, is telling girls they are worthless. Of course they look for an escape hatch.”

“The normal experience of puberty is the first time we all experience gender dysphoria. It’s natural. But to tell confused kids who might every second be feeling uncomfortable in their own skin that they are trapped in the wrong body? It’s an obscenity. It’s like telling anorexic kids they need liposuction.”

So much for helping people with gender dysphoia. If Linehan had his way, the evidence suggests transgender people would commit suicide at a higher rate than they do now. EoT’s accusation that Linehan wishes to “eradicate trans children” is justified by the evidence.

Unable to argue against that truth, Linehan had no choice but to try silencing his critics via lawsuits. Rather than change his mind in the face of substantial evidence, Linehan is trying to sue away reality. It’s a cowardly approach to criticism, and I hope he’s Streisand-ed into obscurity for trying it.