[*CONTENT WARNING: TERFs, mentions of violence and sexual assault*]

Customer service: is run by John and Stacy

[*CONTENT WARNING: TERFs, mentions of violence and sexual assault*]

You’ve heard this meme before, that we need to block some/all transgender people from restrooms due to their inherently violent nature. It’s popular in TERF circles, in fact I’ve covered one example myself. There’s just one problem: it was invented out of thin air.

[*CONTENT WARNING: anti-LGBTQI+ rhetoric*]

Sorry, sorry, got lost in my day job for a bit there. It’s been a month since the fundraising deadline passed, though, and I owe you some follow-up. So, the big question: did we hit the fundraising goal? Let’s load the dataset to find out. [Read more…]

Looking back on her time in the “gender critical” feminist movement, [Amy Dyess] is unequivocal: it’s a cult.

A cult that groomed her when she was vulnerable and sleeping in her car; a cult that sought to control her, keeping tabs on her movements and dictating what she could and couldn’t say; a cult that was emotionally and sexually abusive towards her.

As Amy began to notice more and more red flags about the GC movement – like how it defended abusive women, how it wouldn’t let lesbians speak out about sexual assault perpetrated by women, and how it was forming alliances with homophobic groups – she started asking questions.

I definitely stuck a pin in this article when it popped up in my feeds. And yes, it’s old news by now, but I’m surprised so few people have discussed the central conceit: is the Gender Critical movement a cult?

[*CONTENT WARNING: TERFs, sexism*]

**TL;DR**: We’re pretty much on track, though we also haven’t hit the goal of pushing the fund past $78,890.69. Donate and help put the fund over the line!

With the short version out of the way, let’s dive into the details. What’s changed in the past week and change?

```
import datetime as dt
import matplotlib.pyplot as pl
import pandas as pd
import pandas.tseries.offsets as pdto
cutoff_day = dt.datetime( 2020, 5, 27, tzinfo=dt.timezone(dt.timedelta(hours=-6)) )
donations = pd.read_csv('donations.cleaned.tsv',sep='\t')
donations['epoch'] = pd.to_datetime(donations['created_at'])
donations['delta_epoch'] = donations['epoch'] - cutoff_day
donations['delta_epoch_days'] = donations['delta_epoch'].apply(lambda x: x.days)
# some adjustment is necessary to line up with the current total
donations['culm'] = donations['amount'].cumsum() + 14723
new_donations_mask = donations['delta_epoch_days'] > 0
print( f"There have been {sum(new_donations_mask)} donations since {cutoff_day}." )
```

There have been 8 donations since 2020-05-27 00:00:00-06:00.

There’s been a reasonable number of donations after I published that original post. What does that look like, relative to the previous graph?

```
pl.figure(num=None, figsize=(8, 4), dpi=150, facecolor='w', edgecolor='k')
pl.plot( donations['delta_epoch_days'], donations['culm'], '-',c='#aaaaaa')
pl.plot( donations['delta_epoch_days'][new_donations_mask], \
donations['culm'][new_donations_mask], '-',c='#0099ff')
pl.title("Defense against Carrier SLAPP Suit")
pl.xlabel("days since cutoff")
pl.ylabel("dollars")
pl.xlim( [-365.26,donations['delta_epoch_days'].max()] )
pl.ylim( [55000,82500] )
pl.show()
```

That’s certainly an improvement in the short term, though the graph is much too zoomed out to say more. Let’s zoom in, and overlay the posterior.

```
# load the previously-fitted posterior
flat_chain = np.loadtxt('starting_posterior.csv')
pl.figure(num=None, figsize=(8, 4), dpi=150, facecolor='w', edgecolor='k')
x = np.array([0, donations['delta_epoch_days'].max()])
for m,_,_ in flat_chain:
pl.plot( x, m*x + 78039, '-r', alpha=0.05 )
pl.plot( donations['delta_epoch_days'], donations['culm'], '-', c='#aaaaaa')
pl.plot( donations['delta_epoch_days'][new_donations_mask], \
donations['culm'][new_donations_mask], '-', c='#0099ff')
pl.title("Defense against Carrier SLAPP Suit")
pl.xlabel("days since cutoff")
pl.ylabel("dollars")
pl.xlim( [-3,x[1]+1] )
pl.ylim( [77800,79000] )
pl.show()
```

Hmm, looks like we’re right where the posterior predicted we’d be. My targets were pretty modest, though, consisting of an increase of 3% and 10%, so this doesn’t mean they’ve been missed. Let’s extend the chart to day 16, and explicitly overlay the two targets I set out.

```
low_target = 78890.69
high_target = 78948.57
target_day = dt.datetime( 2020, 6, 12, 23, 59, tzinfo=dt.timezone(dt.timedelta(hours=-6)) )
target_since_cutoff = (target_day - cutoff_day).days
pl.figure(num=None, figsize=(8, 4), dpi=150, facecolor='w', edgecolor='k')
x = np.array([0, target_since_cutoff])
pl.fill_between( x, [78039, low_target], [78039, high_target], color='#ccbbbb', label='blog post')
pl.fill_between( x, [78039, high_target], [high_target, high_target], color='#ffeeee', label='video')
pl.plot( donations['delta_epoch_days'], donations['culm'], '-',c='#aaaaaa')
pl.plot( donations['delta_epoch_days'][new_donations_mask], \
donations['culm'][new_donations_mask], '-',c='#0099ff')
pl.title("Defense against Carrier SLAPP Suit")
pl.xlabel("days since cutoff")
pl.ylabel("dollars")
pl.xlim( [-3, target_since_cutoff] )
pl.ylim( [77800,high_target] )
pl.legend(loc='lower right')
pl.show()
```

To earn a blog post and video on Bayes from me, we need the line to be in the pink zone by the time it reaches the end of the graph. For just the blog post, it need only be in the grayish- area. As you can see, it’s painfully close to being in line with the lower of two goals, though if nobody donates between now and Friday it’ll obviously fall quite short.

So if you want to see that blog post, get donating!

I owe you an update to the fundraiser, but alas I instead got addicted to watching Twitter feeds for protest info. So let’s do this instead.

The thesis of Chris Hayes’ last book was that there were two police systems in the USA: that of “The Nation,” which behaves much as you’d expect, and that of “The Colony,” which is aimed at subjugating a subset of the populace through terror and pain. Citizens of “The Nation” don’t usually see what citizens of “The Colony” see, those visions are hidden both by design and a willful blindness. In the USA, for instance, police killed 1,028 people in the last year. Most are never heard of, like Steven Taylor or Breonna Taylor, both because of the sheer number of times it happens and because we’re taught to think of these deaths as “justified.” Aggressively swing a baseball bat in a Wal-Mart? That justifies the death penalty, without trial. Suspected of having drugs and next to someone firing at the police? Death penalty, no trial. Citizens of The Nation grasp what’s happening on an intuitive level, but because they rarely face reality this knowledge is allowed to slip to the back of their minds.

Every once in a while, though, The Nation gets a glimpse of what The Colony has to live with. Being forced to confront reality can lead to changes, but often those changes are incremental or incomplete, and The Nation comes up with excuses to turn its head away again. Looting and rioting? How dare these villains break the law! If only they followed the example of Martin Luther King, Jr. [Read more…]

[*Content Warning: Police Violence*] [Read more…]

The average age of death from COVID in Alberta is 83, and I remind the House that the average life expectancy in the province is age 82. –

Premier Jason Kenney, May 27th 2020.

That really caught my ear, when it came across the local news. What the hell is our Premier saying, that the elderly are expendable? It was so outrageous, I wanted to dig into it further and get the full context. The original news report I heard only had that one sentence, though, so I did a bit of research. I couldn’t find an unedited clip of his speech, but I did find a clip with one extra sentence in place.

Mr. Speaker, it is critical as we move forward, that we focus our efforts on the most vulnerable; on the elderly, and the immuno-compromised.The average age of death from COVID in Alberta is 83, and I remind the House that the average life expectancy in the province is age 82. –also Premier Jason Kenney, May 27th 2020.

That radically changes the meaning, doesn’t it? Still, I have to point out I’m getting mixed messages here. [Read more…]

One of the ways I’m coping with this pandemic is studying it. Over the span of months I built up a list of questions specific to the situation in Alberta, so I figured I’d fire them off to the PR contact listed in one of the Alberta Government’s press releases.

That was a week ago. I haven’t even received an automated reply. I think it’s time to escalate this to the public sphere, as it might give those who can bend the government’s ear some idea of what they’re reluctant to answer. [Read more…]

If our goal is to raise funds for a good cause, we should at least have an idea of where the funds are at.

(Click here to show the code)

```
import matplotlib.pyplot as pl
import pandas as pd
import pandas.tseries.offsets as pdto
donations = pd.read_csv('donations.cleaned.tsv',sep='\t')
donations['epoch'] = pd.to_datetime(donations['created_at'])
donations['delta_epoch'] = donations['epoch'] - donations['epoch'].max()
# some adjustment is necessary to line up with the current total
donations['culm'] = donations['amount'].cumsum() + (78039 - donations['amount'].sum())
donations.head()
```

created_at | amount | epoch | delta_epoch | culm | |
---|---|---|---|---|---|

0 | 2017-01-24T07:27:51-06:00 | 10.0 | 2017-01-24 07:27:51-06:00 | -1218 days +19:51:12 | 14733.0 |

1 | 2017-01-24T07:31:09-06:00 | 50.0 | 2017-01-24 07:31:09-06:00 | -1218 days +19:54:30 | 14783.0 |

2 | 2017-01-24T07:41:20-06:00 | 100.0 | 2017-01-24 07:41:20-06:00 | -1218 days +20:04:41 | 14883.0 |

3 | 2017-01-24T07:50:20-06:00 | 10.0 | 2017-01-24 07:50:20-06:00 | -1218 days +20:13:41 | 14893.0 |

4 | 2017-01-24T08:03:26-06:00 | 25.0 | 2017-01-24 08:03:26-06:00 | -1218 days +20:26:47 | 14918.0 |

Changing the dataset so the last donation happens at time zero makes it both easier to fit the data and easier to understand what’s happening. The first day after the last donation is now day one.

Donations from 2017 don’t tell us much about the current state of the fund, though, so let’s focus on just the last year.

(Click here to show the code)

```
pl.figure(num=None, figsize=(8, 4), dpi=120, facecolor='w', edgecolor='k')
pl.plot(donations['delta_epoch'].apply(lambda x: x.days),donations['culm'],'-k')
pl.title("Defense against Carrier SLAPP Suit")
pl.xlabel("days since last donation")
pl.ylabel("dollars")
pl.xlim( [-365.26,0] )
pl.ylim( [55000,82500] )
pl.show()
```

The donations seem to arrive in bursts, but there have been two quiet portions. One is thanks to the current pandemic, and the other was during last year’s late spring/early summer. It’s hard to tell what the donation rate is just by eye-ball, though. We need to smooth this out via a model.

The simplest such model is linear regression, *aka.* fitting a line. We want to incorporate uncertainty into the mix, which means a Bayesian fit. Now, what MCMC engine to use, hmmm…. emcee is my overall favourite, but I’m much too reliant on it. I’ve used PyMC3 a few times with success, but recently it’s been acting flaky. Time to pull out the big guns: Stan. I’ve been avoiding it because pystan‘s compilation times drove me nuts, but all the cool kids have switched to cmdstanpy when I looked away. Let’s give that a whirl.

(Click here to show the code)

```
import cmdstanpy as csp
%time success = csp.install_cmdstan()
if success:
print("CmdStan installed.")
```

CPU times: user 5.33 ms, sys: 7.33 ms, total: 12.7 ms Wall time: 421 ms CmdStan installed.

We can’t fit to the entire three-year time sequence, that just wouldn’t be fair given the recent slump in donations. How about the last six months? That covers both a few donation burts and a flat period, so it’s more in line with what we’d expect in future.

(Click here to show the code)

```
mask = (donations['delta_epoch'].apply(lambda x: x.days) > -365.26*.5) # roughly six months
x = donations['delta_epoch'].apply(lambda x: x.days)[mask] # make the current time zero, for convenience
y = donations['culm'][mask]
yerr = donations['amount'].min() * .5 # the minimum donation amount adds some intrinsic uncertainty
print( f"There were {sum(mask)} donations over the last six months." )
```

There were 117 donations over the last six months.

With the data prepped, we can shift to building the linear model.

(Click here to show the code)

```
with open('linear_regression.stan','wt') as file:
file.write("""
/* tweaked version of https://mc-stan.org/docs/2_23/stan-users-guide/linear-regression.html */
data {
int
``` N;
vector[N] x;
vector[N] y;
real y_err;
}
parameters {
real slope;
real intercept;
real sigma;
}
model {
slope ~ cauchy( 0, 0.357369 ); /* approximate the inverse tangent prior */
/* intercept has a flat prior */
sigma ~ cauchy( 0, 1 ); /* prior that favours lower values */
/* fatter tails to reduce the influence of outliers */
y ~ student_t( 3, intercept + slope*x, sigma );
}
""")

I could have just gone with Stan’s basic model, but flat priors aren’t my style. My preferred prior for the slope is the inverse tangent, as it compensates for the tendency of large slope values to “bunch up” on one another. Stan doesn’t offer it by default, but the Cauchy distribution isn’t too far off.

We’d like the standard deviation to skew towards smaller values. It naturally tends to minimize itself when maximizing the likelihood, but an explicit skew will encourage this process along. Gelman and the Stan crew are drifting towards normal priors, but I still like a Cauchy prior for its weird properties.

Normally I’d plunk the Gaussian distribution in to handle divergence from the deterministic model, but I hear using Student’s T instead will cut down the influence of outliers. Thomas Wiecki recommends one degree of freedom, but Gelman and co. find that it leads to poor convergence in some cases. They recommend somewhere between three and seven degrees of freedom, but skew towards three, so I’ll go with the flow here.

The y-intercept could land pretty much anywhere, making its prior difficult to figure out. Yes, I’ve adjusted the time axis so that the last donation is at time zero, but the recent flat portion pretty much guarantees the y-intercept will be higher than the current amount of funds. The traditional approach is to use a flat prior for the intercept, and I can’t think of a good reason to ditch that.

Not convinced I picked good priors? That’s cool, there should be enough data here that the priors have minimal influence anyway. Moving on, let’s see how long compilation takes.

(Click here to show the code)

`%time lr_model = csp.CmdStanModel(stan_file='linear_regression.stan')`

CPU times: user 4.91 ms, sys: 5.3 ms, total: 10.2 ms Wall time: 20.2 s

This is one area where emcee really shines: as a pure python library, it has zero compilation time. Both PyMC3 and Stan need some time to fire up an external compiler, which adds overhead. Twenty seconds isn’t too bad, though, especially if it leads to quick sampling times.

(Click here to show the code)

```
%time lr_model_fit = lr_model.sample( data = {'N':len(x), 'x':list(x), 'y':list(y), 'y_err':yerr}, \
iter_warmup = 936, iter_sampling = 64 )
```

CPU times: user 14.7 ms, sys: 24.7 ms, total: 39.4 ms Wall time: 829 ms

And it does! emcee can be pretty zippy for a simple linear regression, but Stan is in another class altogether. PyMC3 floats somewhere between the two, in my experience.

Another great feature of Stan are the built-in diagnostics. These are really handy for confirming the posterior converged, and if not it can give you tips on what’s wrong with the model.

(Click here to show the code)

`print( lr_model_fit.diagnose() )`

Processing csv files: /tmp/tmpyfx91ua9/linear_regression-202005262238-1-e393mc6t.csv, /tmp/tmpyfx91ua9/linear_regression-202005262238-2-8u_r8umk.csv, /tmp/tmpyfx91ua9/linear_regression-202005262238-3-m36dbylo.csv, /tmp/tmpyfx91ua9/linear_regression-202005262238-4-hxjnszfe.csv Checking sampler transitions treedepth. Treedepth satisfactory for all transitions. Checking sampler transitions for divergences. No divergent transitions found. Checking E-BFMI - sampler transitions HMC potential energy. E-BFMI satisfactory for all transitions. Effective sample size satisfactory. Split R-hat values satisfactory all parameters. Processing complete, no problems detected.

The odds of a simple model with plenty of datapoints going sideways are pretty small, so this is another non-surprise. Enough waiting, though, let’s see the fit in action. First, we need to extract the posterior from the stored variables …

(Click here to show the code)

```
lr_model_names = { v:i for i,v in enumerate(lr_model_fit.column_names) }
flat_chain = list()
for sample in lr_model_fit.sample:
for chain in sample:
flat_chain.append( [chain[i] for i in map(lambda x: lr_model_names[x], ['slope', 'intercept', 'sigma'] )] )
print( f"There are {len(flat_chain)} samples in the posterior." )
```

There are 256 samples in the posterior.

… and now free of its prison, we can plot the posterior against the original data. I’ll narrow the time window slightly, to make it easier to focus on the fit.

(Click here to show the code)

```
pl.figure(num=None, figsize=(8, 4), dpi=120, facecolor='w', edgecolor='k')
pl.plot(donations['delta_epoch'].apply(lambda x: x.days),donations['culm'],'-k')
for m,b,_ in flat_chain:
pl.plot( x, m*x + b, '-r', alpha=0.05 )
pl.title("Defense against Carrier SLAPP Suit, with linear fit")
pl.xlabel("time, days since last donation")
pl.ylabel("fund, dollars")
pl.xlim( [-365.26*.75, 0] )
pl.ylim( [55000,82500] )
pl.show()
```

Looks like a decent fit to me, so we can start using it to answer a few questions. How much money is flowing into the fund each day, on average? How many years will it be until all those legal bills are paid off? Since humans aren’t good at counting in years, let’s also translate that number into a specific date.

(Click here to show the code)

```
import numpy as np
print( f"mean/std/median slope = ${np.mean(flat_chain,axis=0)[0]:.2f}/" + \
f"{np.std(flat_chain,axis=0)[0]:.2f}/{np.median(flat_chain,axis=0)[0]:.2f} per day")
print()
est_years = list()
est_day = list()
numer = 115000 - donations['culm'].max()
for m,b,_ in flat_chain:
target = numer/m
est_day.append( target )
est_years.append( target / 365.26 )
print( f"mean/std/median years to pay off the legal fees, relative to {donations['epoch'].max()} =" )
print( f"\t{np.mean(est_years):.3f}/{np.std(est_years):.3f}/{np.median(est_years):.3f}" )
print()
print( "mean/median estimate for paying off debt =" )
print( f"\t{donations['epoch'].max() + pdto.DateOffset(days=np.mean(est_day))} / " +
f"{donations['epoch'].max() + pdto.DateOffset(days=np.median(est_day))}" )
```

mean/std/median slope = $51.62/1.65/51.76 per day mean/std/median years to pay off the legal fees, relative to 2020-05-25 12:36:39-05:00 = 1.962/0.063/1.955 mean/median estimate for paying off debt = 2022-05-12 07:49:55.274942-05:00 / 2022-05-09 13:57:13.461426-05:00

Mid-May 2022, eh? That’s… not ideal. How much time can we shave off, if we increase the donation rate? Let’s play out a few scenarios.

(Click here to show the code)

```
for imp in [.01,.03,.1,.3,1.]:
est_imp = list()
numer = 115000 - donations['culm'].max()
offset = donations['epoch'].max().timestamp()/86400
for m,_,_ in flat_chain:
denom = m * (1+imp)
est_imp.append( numer/denom + offset )
print( f"median estimate for paying off debt, increasing rate by {imp*100.:3.0f}% = " +
f"{pd.Timestamp(np.median(est_imp),unit='d')}" )
```

median estimate for paying off debt, increasing rate by 1% = 2022-05-02 17:16:37.476652800 median estimate for paying off debt, increasing rate by 3% = 2022-04-18 23:48:28.185868800 median estimate for paying off debt, increasing rate by 10% = 2022-03-05 21:00:48.510403200 median estimate for paying off debt, increasing rate by 30% = 2021-11-26 00:10:56.277984 median estimate for paying off debt, increasing rate by 100% = 2021-05-17 18:16:56.230752

Bumping up the donation rate by one percent is pitiful. A three percent increase will almost shave off a month, which is just barely worthwhile, and a ten percent increase will roll the date forward by two. Those sound like good starting points, so let’s make them official: **increase the current donation rate by three percent, and I’ll start pumping out the aforementioned blog posts on Bayesian statistics. Manage to increase it by 10%, and I’ll also record them as videos.**

As implied, I don’t intend to keep the same rate throughout this entire process. If you surprise me with your generosity, I’ll bump up the rate. By the same token, though, if we go through a dry spell I’ll decrease the rate so the targets are easier to hit. My goal is to have at least a 50% success rate on that lower bar. Wouldn’t that make it impossible to hit the video target? Remember, though, it’ll take some time to determine the success rate. That lag should make it possible to blow past the target, and by the time this becomes an issue I’ll have thought of a better fix.

Ah, but over what timeframe should this rate increase? We could easily blow past the three percent target if someone donates a hundred bucks tomorrow, after all, and it’s no fair to announce this and hope your wallets are ready to go in an instant. How about… sixteen days. You’ve got sixteen days to hit one of those rate targets. That’s a nice round number, for a computer scientist, and it should (hopefully!) give me just enough time to whip up the first post. What does that goal translate to, in absolute numbers?

(Click here to show the code)

```
inc = 0.03
days = 16
start = donations['culm'].max()
delta = np.median(flat_chain,axis=0)[0] * (1.+inc)*days
print( f"a {inc*100.:3.0f}% increase over {days} days translates to" +
f" ${delta:.2f} + ${start:.2f} = ${start + delta:.2f}" )
```

a 3% increase over 16 days translates to $851.69 + $78039.00 = $78890.69

Right, if you want those blog posts to start flowing you’ve got to get that fundraiser total to $78,890.69 before June 12th. As for the video…

(Click here to show the code)

```
inc = 0.1
delta = np.median(flat_chain,axis=0)[0] * (1.+inc)*days
print( f"a {inc*100.:3.0f}% increase over {days} days translates to" +
f" ${delta:.2f} + ${start:.2f} = ${start + delta:.2f}" )
```

a 10% increase over 16 days translates to $909.57 + $78039.00 = $78948.57

… you’ve got to hit $78,948.57 by the same date.

Ready? Set? Get donating!

- 1
- 2
- 3
- …
- 39
- Next Page»