Oh, man. What did I just read? It’s on medium.com, and it perfectly reflects the twistedly wonky and uneven character of that site — it’s a piece by techbros for techbros titled How San Francisco’s gender disparity affects the attractiveness pairings of couples.
It is truly, deeply, monumentally awful. It’s the work of a dude who knows how to use R, is happy to invent lots of numbers to feed into R, and is himself full up to the brim with unjustified assumptions that he never bothers to question. And apparently he got enough people remarking on how stupid his article was that he had to add a disclaimer:
There seems to have been quite a bit of misunderstanding stirred up by this article, so please read this disclaimer: this analysis makes absolutely no value judgements about how attractive men & women in SF are, or how attractive they should feel. I lay out all the simplifying assumptions and I’ve tried to explain that this is not how the real world works. Nor do I believe this is how the real world works. No sane human should heed any advice from this article. None of this has any basis in reality. It’s not supposed to. This is just a thought experiment about how one might build an economic model for dating with gender ratio imbalances. I’ve preserved the entirety of the original text below. There’s plenty of room for miscommunication because the assumptions are buried inside the text. That’s my fault. But I urge everyone to read the piece in its entirety before jumping to conclusions.
He is dimly aware that “no sane human” would accept any part of his exercise, yet it has a grand total of 4 comments, 2 of which scorn his model and 2 of which praise it…although one of the latter does make the point that because it doesn’t adequately confirm his biases, it’s not that good a model. From this, I guess we can conclude that half of the author’s readers on Medium are insane, and I will now go on to make an elaborate statistical justification of my assertion that there is a complex association between insane authors and insane readers on Medium, complete with mathematical formulae and lots of charts and graphs of theoretical “data”…
Wait, no. I’m not going to do that, because it would be stupid and false, and would involve a heck of a lot of work…masturbation. Also, thought experiments do have their place, but this kind of extended empirical vacuum offends me in principle.
So what is this wanker trying to do? Here’s the first paragraph.
There’s a joke that I’ve heard passed around the circles of frustrated single men in San Francisco. They claim that this city is home to 49ers — girls that are 4’s but think they’re 9’s in terms of attractiveness. Whether the ineptitude of San Franciscan men or the confidence of San Franciscan women bears responsibility for this sentiment I cannot say, but it did make me curious about whether the numbers might be able to reveal anything.
First off, his entire goal is to evaluate a bad joke built around sexist assumptions by “frustrated single men in San Francisco”. That starting condition might justify some response, but one fully discharged by minimal effort…say, an eyeroll, or a rude non-verbal sound. But no: it prompts the author to build a complex model and test whether the joke is actually true, with a procedure that cannot possibly say anything about accuracy, real effects, or even generalities about the population. It’s pure garbage in, garbage out.
Secondly, what is it with bad wanna-be engineers reducing complex phenomena to single small numbers? “Attractiveness” is not an easily quantifiable parameter; it varies from one observer to the next, and is entirely a product of the perception of a huge number of variables. I might be a 1 or a 2 to most people, but my wife might be willing to judge me a 4, and maybe in the right light (or the complete absence thereof) a 5. So how the heck does he measure this magic number? Right off the bat, we’re dealing with a gigantic assumption that he doesn’t bother to question, and further, that he isn’t going to bother to measure.
He’s going to assume it.
We’ll assume that men and women’s attractivenesses are distributed identically along the classic 0–10 scale. Said another way, their attractivenesses have the same probability density functions. There exists research on the statistical distributions of attractiveness, but they’re all pretty bad. So let’s assume something sensible and simple — that attractiveness follows a normal distribution. But since we want our distribution to have a minimum of 0 and maximum of 10, we need to truncate our distribution. To satisfy that, we can use the truncated normal distribution. Here’s the truncated normal distribution with various standard deviations:
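For the record, the truncated normal he invokes is a perfectly real statistical object; what’s hollow is the data he feeds it. A minimal sketch of what he’s describing (in Python/numpy rather than his R, with a mean of 5 and standard deviation of 2 that are my illustrative choices, not values he reports):

```python
import numpy as np

# Rejection-sample a normal(mu, sigma) clipped to [low, high] -- the
# "truncated normal" the Medium author describes for his 0-10 scale.
# The parameters here are invented for illustration, just like his data.
def truncated_normal(mu, sigma, low, high, n, seed=42):
    rng = np.random.default_rng(seed)
    out = np.empty(0)
    while out.size < n:
        draws = rng.normal(mu, sigma, n)
        # keep only draws inside the bounds, discard the rest
        out = np.concatenate([out, draws[(draws >= low) & (draws <= high)]])
    return out[:n]

scores = truncated_normal(mu=5.0, sigma=2.0, low=0.0, high=10.0, n=10_000)
print(scores.min() >= 0.0, scores.max() <= 10.0)  # True True
```

Note that every number that comes out of this is determined entirely by the numbers you put in; no amount of distributional machinery turns an assumed parameter into a measurement.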
And we’re off, with a long post full of formulae and charts. You don’t need to bother reading it though, since you can actually see the foundation of his argument right there.
Assume attractiveness is a single parameter that can be adequately encoded as a single small integer. Further, use a scale that is “classic” because it is used in men’s locker rooms, by cat-callers on the street, and in cheesy popular movies.
Dignify it with some sciencey terminology: probability density function.
Dismiss any and all genuine research on the subject. I had to highlight that sentence in the middle, because it was so appalling. Everyone else’s work on this subject is pretty bad, but this paper? Sterling quality. Dunning-Kruger, Mr Dunning-Kruger, please come to the white courtesy phone.
In the absence of any information he likes, assume a simple statistical distribution…of a parameter that he can’t show is valid in the first place.
Then fuck empiricism and observation and measurement, let’s just invent a whole data set for his imaginary parameter, complete with standard deviations. And then go on and on exploring this hypothetical data set, with a sample size of 0, to a precision of 4 significant digits.
Really, this rubs me raw. I’ve been grading lab reports lately, and talking with students about their data, and one of the things I try to emphasize with them is making appropriate conclusions. They’re working with a tentative and preliminary data set on cell growth, developed while they were still learning the techniques, with flaws and errors that they acknowledge in their write-ups. But they also have software that allows them to plug in their data and get back a complex formula, with absurd degrees of precision, that claims to describe a best-fit curve for their observations. Sometimes they’re paralyzed by the difficulty of matching the theoretical equation to the reality of their data, and I have to tell them to step back, simplify, and think about what the data is generally telling them. I also explain that if this were a real experiment, we’d throw out this whole initial set of measurements, acquired while they were still learning how to make them, and repeat the lab 10 times with more rigor, something our time constraints don’t allow.
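The trap those students face is easy to reproduce, for what it’s worth: any curve-fitting routine will happily return coefficients to full machine precision no matter how noisy or scant the data is. A toy sketch (Python/numpy, with data I invented for illustration, nothing to do with their actual cell counts):

```python
import numpy as np

# Fit a cubic to five noisy points that are secretly just linear.
# polyfit reports its coefficients to ~16 digits regardless -- that
# precision is an artifact of the arithmetic, not a property of the
# measurements. Toy data, invented for this illustration.
rng = np.random.default_rng(0)
x = np.arange(5.0)                    # five "measurements"
y = 2.0 * x + rng.normal(0, 1.0, 5)  # a truly linear trend, plus noise

coeffs = np.polyfit(x, y, deg=3)      # over-eager cubic fit
print(coeffs)  # four coefficients printed to full precision, none of it meaningful
```

The lesson is the same one the Medium author never learned: the number of digits in your output says nothing about the quality of your input.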
So you can imagine how this article in Medium smacked me between the eyes with its sexist assumptions, its complete lack of any real measurements (which, given the idiocy of its premises, couldn’t possibly be made anyway), and its ludicrously confident mathematical analyses.
Can I fail the author? Can I kick them out of the class? Can I deny them any kind of degree from any respectable university?
One good thing about it is that now I can read my students’ work and better appreciate that they’re honestly grappling with the data they collected. Suddenly I feel like I should just give them all an A+ and send them out to boot the oblivious pseudoscientists out of their establishment positions. (Sorry, any students who stumble across this: I’m still giving a full range of grades on your lab reports.)