How not to organize a meeting

You really should read Jonathan Eisen’s response to a quantitative biology conference invitation that listed 25 male speakers and organizers, and one lonely woman. See? All kinds of geeks turn out to be unthinkingly sexist.

I wondered how this could happen, and then I saw that the first two organizers are named Ditto and Hasty. Well, of course then.


  1. Holms says

    Um. Should those names ring a bell? Or – this is just desperation here – is it just a pun on those namesake words? It better not be the latter…

  2. A Hermit (that's "A" with a "plus") says

    Well, next time some hyperskeptic pulls out the “where are your statistics” crap when someone mentions sexual discrimination I know where to go…;)

  3. hjhornbeck says

    AWESOME find, PZ! Not only did it put a spring in my step, I now have a template to use on the whiners who say there isn’t sex-based discrimination within X.

  4. jose says

    inb4 “that’s because they’re only inviting the GOOD speakers, it’s not their fault if the good ones happen to be male! You just want strictly regulated quotas and that’s reverse sexism and THAT’s what’s a problem in this country…”

  5. anotherbayesian says

    Wow, the stats in that post are bad! Sure, the guy has a point about there being a case to answer, but a statistical argument like this just damages his point.

    The problem is the “I therefore conclude that the null hypothesis (that having one female out of 26 key participants) can be rejected – and that this meeting has a biased ratio of males: females” bit.

    The whole analysis rests on the assumption that, for each speaker slot, the organisers pick a member of the synthbio/Q-bio community at random, so that the probability the chosen speaker is a woman is p, the proportion of women in that community.

    Then you get that X, the number of chosen women, is Bin(26, p), and you can do the analysis done by the author to find P(X=1) for different choices of p.

    So what is wrong with this? Well, the problem is that the given model does not capture the null hypothesis that the author rejects.

    The null hypothesis is that there was no gender bias adopted by the organisers when they compiled their list of speakers.

    The null hypothesis rejected is that speakers were chosen at random from the population of synth bio members. Should we be surprised by this and does it say anything about a gender bias?

    Well no! Anyone who has ever organised a conference will know that one doesn’t simply put all names in a given branch of science (obtained, for example, by looking at publishing records or attendance of similar conferences), and draw speakers out at random.

    I’d be very surprised if his revised abstract was accepted, and not because his point is contentious, but because his statistical analysis is wrong.
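
The binomial model described in this comment can be sketched in a few lines of Python. This is only an illustration of the model being criticised, not a reproduction of the original post's analysis: the values of p tried below are hypothetical, and the function names are made up for this sketch.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_most(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


N_SLOTS = 26  # key participants listed in the invitation

# Hypothetical proportions of women in the community (assumed, not measured).
for p in (0.1, 0.2, 0.3, 0.5):
    exactly_one = binom_pmf(1, N_SLOTS, p)
    at_most_one = prob_at_most(1, N_SLOTS, p)
    print(f"p={p:.1f}  P(X=1)={exactly_one:.4f}  P(X<=1)={at_most_one:.4f}")
```

Everything here depends on the assumption that the 26 slots are filled by independent random draws, which is exactly the assumption the comment disputes.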

  6. jaybee says

    anotherbayesian —

    His abstract is tongue in cheek, but he is making a valid point. He is using the fraction of women in the field as a proxy for the fraction of women recognized as experts in the field. If you want to disagree with his conclusion, then you have to answer the question — why is this an invalid proxy? Is it that women are just not as smart or dedicated as the men, and thus fail to become experts, or is it because of the selfsame systematic bias that he is objecting to that prevents them from being recognized as experts?

  7. casey says

    My sister graduated from UCSD with a bio degree and she told me that when she was applying for jobs, people from the school straight up told her that she shouldn’t take a job at NOAA because it was such a sexist work environment. I don’t know if it’s true about NOAA but I think it definitely reflects negatively on one or two of the institutions, and this doesn’t give me a lot of faith in the UCSD bio department.

  8. anotherbayesian says

    He has a valid point, but his statistical model does not capture the null hypothesis, so it shouldn’t be used as evidence. The Bible makes a lot of incorrect prophecies, yet using the so-called incorrect value of pi from Kings is a bad argument to illustrate that.

    The proxy is irrelevant. Suppose it was a really good proxy. Forget that, suppose the proportion of recognised women experts in the field was precisely the 20% that leads to the 5% significance level of his test: the model is still wrong.

    Why is it wrong? Because to not have a gender bias (i.e. if the null hypothesis was true), speakers would still not be randomly and independently drawn from the population of recognised experts.

    As the only way this statistical test makes sense (or is even defined) is if we have independent random draws from this population, either we 1. use it and reject the null hypothesis that speakers were drawn randomly and independently from the population at the 5% level or 2. come up with a statistical model that accounts for the way speakers would be chosen if there was no gender bias and analyse that.

    What we can’t do is pretend that not selecting speakers randomly from the population of experts means there is a gender bias.

    If that’s still not clear, let me put it a slightly less mathematical way. The test sets up a false dichotomy. Either Null: speakers are drawn randomly and independently from the population of experts, or Alternative: there is a gender bias in the selection process.

    As I said, it’s fine to call BS on the selection process and to claim a bias. But don’t use shoddy evidence to back yourself up.

  9. eigenperson says

    #13 anotherbayesian:

    You are quite right, of course. Perhaps the speaker selection was performed in the following way:

    1. Choose X uniformly from [0,1]
    2. If X = 0.3, choose 25 men and one woman uniformly at random from their respective sets.

    In this case, despite an unbiased selection process, P(F=1) = 0.7, so the null hypothesis cannot be rejected.

    However, I find this model of speaker selection to be somewhat unlikely for any event not entitled “Almost Single-Gender Q-Bio Conference (Gender Not Yet Selected)”.

    By contrast, I think the model of selection at random from a pool of experts is actually very appropriate.

    In fact, I am having trouble coming up with a plausible selection method that has the following properties: (i) it is not gender-biased and (ii) P(F=1) > 0.05. For example, if we start by selecting the five “most desirable” speakers (who happen to all be men), inviting them, and then filling the rest of the slots at random, this increases P(F=1), but it is gender-biased because the expected number of women is less than their proportion in the population. If we decide to remedy this by increasing the probability of selecting a woman for the remaining 21 slots, then P(F=1) goes right back down again.
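
The two-stage example in this comment (five pre-selected men, then 21 slots filled at random) can be checked numerically. The sketch below assumes a hypothetical proportion of women p = 0.2 and independent draws for the random slots; the function names are illustrative only.

```python
def prob_exactly_one_woman(random_slots: int, p: float) -> float:
    """P(F = 1) when `random_slots` speakers are drawn independently,
    each a woman with probability p (all other slots are filled by men)."""
    return random_slots * p * (1 - p) ** (random_slots - 1)


def expected_share_of_women(random_slots: int, total_slots: int, p: float) -> float:
    """Expected fraction of women among all speakers."""
    return random_slots * p / total_slots


P_WOMEN = 0.2  # hypothetical proportion of women in the pool (assumption)
TOTAL = 26

# All 26 slots filled at random.
print(prob_exactly_one_woman(26, P_WOMEN), expected_share_of_women(26, TOTAL, P_WOMEN))

# Five men invited first, remaining 21 slots filled at random.
print(prob_exactly_one_woman(21, P_WOMEN), expected_share_of_women(21, TOTAL, P_WOMEN))
```

Under these assumptions, reserving five slots raises P(F=1) but pushes the expected share of women below p, which is the trade-off the comment describes.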

  10. eigenperson says

    Hmm. I had some text gobbled up by the system in my previous post. (The system does not like it when you use the less than character, it turns out. It devoured everything between a less than character and the next greater than character.)

    Point 2 should read: If X is less than 0.3, choose 25 women and one man at random from their respective sets. If X is greater than or equal to 0.3, choose 25 men and one woman at random from their respective sets.

    Hopefully it makes sense now.

  11. anotherbayesian says

    Obviously your example model is facetious. Let’s be clear: I am not defending the selection process. There is clearly a problem there. However, as with any mathematical theorem, just because it seems obviously true doesn’t mean that you don’t have to prove it, or that you can submit a proof that is plain wrong.

    So A) What makes you think conference speakers are ever selected according to some random process? B) Why, in particular, does uniform and independent random selection appear to be a good model?

    Here are some factors that have not been considered:

    Recent publication (often invitees are invited because they have a good recent paper). What is the gender distribution of lead authors of recent papers in the good Qbio journals?
    Reputation (there are very big names in any field that are often invited first, there are a very small number of these, how many are women?)
    Turning down invites (the list of speakers is the list of people who accepted. Now what is P(accept) and is P(accept|M)=P(accept|F)? Could the timing of the conference be such that one is higher than the other?)
    How are candidates picked? Does the committee have one list and choose its candidates, or does each person bring a list of candidates and the committee picks from a combined list? If so, who is on the committee and who are the members’ scientific collaborators and co-authors?

    I’m quite happy for anyone to say “look, 25:1 is an appalling ratio of males to females. There is a gender bias here that needs explaining/justifying. It’s out of order”. Just don’t use an incorrect statistical model to try to prove your assertions. If it’s too hard to come up with an appropriate model, then don’t just use the wrong one because it’s easy and then claim victory anyway.

  12. maureenbrian says

    We may have a breakthrough here! Persons resistant to change find they can hide behind long, boring arguments about statistical methods.

    It is evident from what information we have that it would be impossible to arrive at that ratio of speakers by any rational method and almost impossible by a truly random method. End of story.

    Now, go and play with the figures to your heart’s content.

  13. glendon says

    Presumably the sampling procedure is a little suspect too.

    i.e. if he was inspired to examine the probability of ending up with that particular ratio, because of the fact of that ratio, then the whole exercise (and even the decision to blog about it) has very little worth.

    He would really need to discount this year’s meeting and look at a few other similar meetings before he could say with any confidence that there was a problem.

  14. anotherbayesian says

    maureenbrian wrong!

    As I said, there is obviously a problem with the ratio. Fine, say that. Fine, fight it. Fine, try to change it and improve the situation. I am totally with that and not resistant to the change.

    But don’t fudge the statistics and use them to back you up. Sometimes getting the right answer is “boring”. Scientists don’t care. Normally we WANT scientists to get to the right answer no matter how hard and how “boring”.

    You would think that people in a community trying to embrace science would get that!

    But no. The rigour of the scientific method is getting in the way of your indignation and isn’t backing you up this time, so instead you attack it.

    Can anyone name any other groups of people often discussed on these blogs that use science when it suits but dismiss it when it doesn’t?

  15. anotherbayesian says

    #19 Glendon.

    Actually, if you accept that having no gender bias means you have to pick all speakers randomly and independently (independence is so important and so not-applicable here) from the population of experts in the field, then the analysis is fine. That’s how you would prove that there was a gender bias.

  16. glendon says

    #21 anotherbayesian

    Could you? I accept that the analysis could (in principle and with all your caveats) say that the ratio is improbable, but that is different to saying it is biased.

  17. anotherbayesian says

    Well yes, but frequentist statistics only ever rejects hypotheses on the basis that they are improbable to some level.

    So, a drug trial for instance does not result in proof that a drug works. It results in a low probability (<5% usually) of seeing data like these if the null hypothesis that the drug has no effect were true. Unfortunately much of science is wedded to using this type of analysis to "prove" something has happened (by rejecting the hypothesis that nothing happened as very improbable given data).

    Of course what you really want is P(drug helps/cures disease given data) or P(gender bias exists given data). To get that you have to do a Bayesian analysis. Lots of areas of science are coming round to this idea.

  18. glendon says


    Of course, but that is a very different situation. In this case the data that provoked the question was also used to answer it. Given your user name, I’m sure I don’t have to spell out how this will affect the conclusions.

  19. glendon says

    Indeed, it is very easy to counter the kind of viewpoint expressed above, that statistical arguments are dry and irrelevant, using the case of R v Sally Clark, a very notorious case in the UK. In short: a woman ended up in prison for killing her children because an expert witness was very much of the opinion that it was just obvious…

    In the end, a number of statisticians (who were fortunately familiar with the work of Mr Bayes) were able to raise objections to the evidence that resulted in her release (though sadly not before her life was irreparably damaged).

    Obviously the example above is in some respects a more trivial case (though the statistical argument is essentially the same); but I don’t think that that is sufficient excuse for abandoning all attempts at rigour when making claims of this nature. Particularly when those claims are (by implication) attacking the reputation of real and named people.

  20. David Marjanović says

    So many invited speakers? The Society of Vertebrate Paleontology never invites speakers to its annual meetings…

  21. eigenperson says

    “Obviously your example model is facetious.”

    Obviously it is. But I cannot think of a more realistic model that (i) is unbiased and (ii) makes the observed result likely.

    The examples you gave are not unbiased selection processes. To take one example, suppose women are much less likely to accept a speaking invitation than men are. In that case, they should send out more invitations to women! Otherwise they are using a biased selection process.

    For example, one possible reason that women might be less inclined to accept the invitation is that they are more likely to have families and therefore have fewer opportunities to travel. Well, if that is true, and conference organizers contact women at the same rate as they contact men, they will always get fewer women. But this is not inevitable. If 75% of men can make it to Conference X, and only 25% of women can make it, then contacting three times as many women as their proportion in the population suggests will result in an unbiased selection of speakers.

    Another possibility is that women are less inclined to attend this particular conference because they think it is going to be a sausage-fest. Once again, this is not inevitable. If 75% of men are willing to go to the conference and only 25% of women are, then they can just contact 3 times as many women as their proportion in the population suggests. Otherwise, they have a biased selection. (And, incidentally, if one cause of the bias does turn out to be that the conference is perceived as a sausage-fest, this procedure will also eliminate that root cause.)

    The presence of one of the factors you mentioned does not make the bias go away — it merely explains it. One might even argue that it justifies it. If the organizers of this meeting agree that their selection process is biased and want to present an explanation for that, I am quite ready to hear it.
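
The arithmetic behind the "contact three times as many women" argument can be worked through directly. The sketch below uses the 75% and 25% acceptance rates from the comment; the proportion of women in the pool (0.2) is a hypothetical value, and the function name and `weight_f` parameter are made up for illustration.

```python
def share_of_women_among_speakers(
    p_women: float, accept_m: float, accept_f: float, weight_f: float = 1.0
) -> float:
    """Expected share of women among those who accept an invitation.

    Invitations go out in proportion to (weight_f * p_women) for women
    and (1 - p_women) for men; each group accepts at its own rate.
    """
    accepted_f = weight_f * p_women * accept_f
    accepted_m = (1 - p_women) * accept_m
    return accepted_f / (accepted_f + accepted_m)


P_WOMEN = 0.2  # hypothetical proportion of women in the pool (assumption)

# Inviting in proportion to the pool: women end up under-represented.
print(share_of_women_among_speakers(P_WOMEN, accept_m=0.75, accept_f=0.25))

# Inviting women at three times their proportion: the share returns to P_WOMEN.
print(share_of_women_among_speakers(P_WOMEN, accept_m=0.75, accept_f=0.25, weight_f=3.0))
```

The factor of three is simply the ratio of the two acceptance rates (0.75 / 0.25), so the correction works for any pool proportion under this simple model.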

  22. glendon says

    #27 eigenperson

    I think that the main point being made was that the composition of the reference population has not been adequately defined, and without that, it is hard to call ‘bias’. Obviously in an ideal world it would be similar (in terms of gender, at least) to the wider population, but the organising committee cannot just ignore any differences that might exist.

    The primary job of the organising committee is to put together the highest quality science programme that they can. In neither of the cases that you describe (though I acknowledge that your list was not exhaustive) would addressing gender bias at the invitation stage achieve that aim. In the second case, it might have the longer term benefit of changing the perception of the meeting, which would obviously be good for the science.

    In the former case, it is far more effective to give that 75% of women more opportunity to attend. No doubt there is a lot that the organisers could do to this end, but it is too big a problem for them to address alone.

  23. anotherbayesian says

    Glendon, data-driven model selection is how statistics works in the real world (with drug trials perhaps one of the few exceptions). I understand your point of course, but the problem is that very rarely does someone have a problem, contact a statistician, design a careful experiment and then analyse the data. What happens is someone comes along with some data and wants a statistician to analyse it.

    This is why it is always crucial that statisticians report the conclusions that we can really make from the analysis done.

    Eigenperson, your points are valid in terms of what would constitute unbiasedness if the acceptance rate was different between the two populations.

    The main point though, is that the analysis done says nothing about gender bias as the rejected null hypothesis is one for a very specific and unrealistic program selection situation.

  24. says

    I call structuralist fallacy:
    A statistical disparity is interpreted as meaning that the unfavored group is oppressed and/or discriminated against.

    To declare oppression and/or discrimination the following evidence must be presented:
    1. The two relevant categories (men/women) must be proportionally distributed (the same number working in the field with approximately the same level of contribution to their field).
    2. The two relevant categories (men/women) must be proportionally willing (the same number interested in speaking at the conference).

    If one is a feminist and finds that such evidence cannot be presented, one needs to address the willingness and interest of both categories to participate, an action that equals hard work. To instead commit the structuralist fallacy and comment from afar is a level of laziness and unawareness that makes me question whether one is interested in actually working for gender equality in the first place or merely calls oneself a feminist to gain status and popularity and to feel progressive.