# Do this now, OkCupid

One of my favorite blogs is OkTrends (even if they haven’t updated since April, sadface). What’s better than combing dating site data for statistical trends and oddities? All of their articles are super interesting.

I was having lunch with some of my fellow graduate students when the conversation turned geeky (as it tends to do). I mentioned how it would be great to have some sort of measure of sexual compatibility on OkCupid other than skimming through the various questions people have answered (which, don’t lie, is the first thing everyone does). The questions are very telling – just from reading others’ answers to the sex questions, I can tell if we’d be compatible or not. But there’s no good metric for it.

The solution was obvious to us: Principal Component Analysis.

“Principal component analysis (PCA) involves a mathematical procedure that transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.”

You could do a PCA on people with all of their sex answers being their data. The magic of PCA (please don’t make me describe the math) would then decide what the proper variables are to measure. If I had to guess, kinkiness and experience would probably be the two main variables in someone’s sexual preferences. I would guess you’d get a chart looking something like this:

With each dot representing a person, and people potentially forming clusters. You could look and see if you easily fall into the kinky cluster, or whatever. And PCA can have more than two variables, though that’s a little trickier to graph. I can imagine the 3rd being something like desire. Do you want lots of sex, or are you happy with not that much? That’s a major point of conflict in relationships, so it would be great to have that sorted out by the power of statistics.
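For the curious, here is a minimal sketch of what that analysis could look like in Python with NumPy. Everything about the data is an assumption made up for illustration: the answer matrix is randomly generated (200 hypothetical people, 6 hypothetical questions driven by two hidden traits), and the `pca` helper is just one common way to compute principal components (via a singular value decomposition of the centered data), not anything OkCupid actually does.

```python
import numpy as np

def pca(answers, n_components=2):
    """Project each person (row) onto the top principal components."""
    X = np.asarray(answers, dtype=float)
    X = X - X.mean(axis=0)  # center every question at zero
    # SVD of the centered data: rows of Vt are the principal directions,
    # ordered from most to least variance explained.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    components = Vt[:n_components]
    scores = X @ components.T           # coordinates of each person
    explained = S**2 / np.sum(S**2)     # fraction of variance per component
    return scores, components, explained[:n_components]

# Hypothetical data: two made-up hidden traits drive six made-up questions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
answers = latent @ loadings + 0.1 * rng.normal(size=(200, 6))

scores, components, explained = pca(answers)
print(scores.shape)   # one (x, y) point per person, ready for a scatter plot
print(explained)      # share of variance captured by each component
```

Each row of `scores` is one dot on the chart. Note that PCA only hands back directions of maximum variance; deciding that an axis means "kinkiness" or "experience" is an interpretation you make afterwards by looking at which questions load on it.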

Thinking this was pretty much the best idea ever, we emailed OkCupid, highlighting our accolades as computational geniuses and internet nerds. This was the response:

“Hi Jennifer – Thanks for writing.  We only share our data with third parties when they have a budget to license the data…

Thanks,
Sam”

I think that “…” implies “You don’t have the money to do it yourself.”

So, fine, fine. But in the name of science, I want to see it done. Come on, OkCupid. You know you want to reach a new level of geekery in your statistical analysis. Make it so.

1. Alex says

It is disappointing that they won’t allow you access to the data. The data set is big enough, and you only need part of it, so that it shouldn’t be identifiable, and you should be trustworthy with the data. I would hope they would consider at least cutting the price for academics/grad students.

Also, being lumped in with republican senators is kind of insulting :P

2. Praedico says

I think the ‘Desire’ axis would be helpful; it would be good to know if someone hadn’t had much experience because they were shy/young/lived in a small town where buying a whip, double-ended dildo and thirteen gallons of coconut oil would cause comment; versus just not having much of a sex drive.

(Also, thank you for reminding me that I need to update my OkC profile. It hasn’t been updated since before I started getting treatment for my depression, and it kind of shows)

3. Aliasalpha says

“Assorted Virgins”? Great, now I feel like I’m in a box with a sign saying “3 for a dollar”

4. Aliasalpha says

My profile is in much the same state for much the same reason, might have to do some updatifying myself

5. Matt says

This reminds me of a friend of mine who was teaching a math class. At one point he decided to do principal component analysis on the homework of his students. I don’t remember the details of how, but he concluded from the results that two of his students were likely dating. It was only *after* he decided that that he started to realize that they’d been sitting with each other a lot and holding hands a lot. So apparently PCA really can be useful for matchmaking (or determining which matches have already been made)!

6. bitguru says

7. azkyroth says

Is there a way to make it sensitive enough to be useful for people who don’t necessarily demand global kinkiness but have specific activities that aren’t terribly uncommon but are really important to them? I know “he won’t go down on me” was a topic of conversation on Pandagon a few weeks ago, for instance…

8. Gordon says

Wow, I don’t go near the sex or dating answers until I’ve filtered by the religion and ethics answers. Believe in the power of prayer? Think students should hear “both sides”? Put more weight in faith than science? Well then I don’t care what you can do sexually because you are not doing it to me!

9. Sue D. Nymme says

You’re in luck, Jen! I happen to know a developer who works in OKCupid’s data center. I’ll point him to this article.

10. Although I’m currently in a relationship, I think that when/if I break up with my boyfriend I’d like to go back to OKC, but they need to do one thing first.

Allow people to choose Male, Female, or Transgender.

I know it screws up all kinds of demographics and would have to add a whole lot of extra coding to get it so people would be able to choose whether they were interested in TGs or not, but at the moment, I’m pretty much sitting around confused what I’ll choose.

I identify female, but I’m male. If I choose female, I’ll get lesbian women and straight men. If I choose male, I’ll get straight women and gay men. None of those demographics would be interested in a TG. Even bisexual men and women might not be interested in a TG…

11. Aliasalpha says

Would ‘transgender’ be a big enough umbrella term to catch everyone who doesn’t identify with the standard gender binaries? Would simply using ‘other’ seem insensitive?

I’m mostly asking because I’m doing a social networky thing for a uni project and am presently trying to decide how to address this myself

12. lordshipmayhem says

With the “Soon to be kinksters, Republican Senators”, aren’t we repeating ourselves?

13. Annie says

14. While “other” does seem slightly insensitive, it does solve the problem of adding non-traditional non-binary genders, intersex, and agendered persons together in a wide umbrella. For sake of not being too confusing, it might be best to add “other” as the third option rather than just female / male.

15. Mloren says

“Though the questions are very telling – just from reading other’s answers to the sex questions, I can tell if we’d be compatible or not.”

So Jen, what answers do you consider compatible? Inquiring minds want to know :p

16. Alteredstory says

Heehee – sorry, but I find that comment amazingly funny :D

As a fellow lover of OkTrends, I wholeheartedly approve of this. Don’t understand the math/science behind it, but that’s okay!

I also do not approve of being lumped in with “Republican Senators.” D:

18. Will says

And a random pointless attack.
Ah the joys of the internet.

I neither know nor care who you are Annie but I seriously doubt Ms. McCreight has any problems in the area that seems to interest you.

Have a good first day of the quarter Ms. McCreight. We share the campus but as a Library Science Grad student they rarely let me out of the Information school.

19. Will says

Does that mean if we can find one more we get a buck?

Mind you depending how far to the right us yellows fall on the chart we could have lots of fun with three to a box.

Just saying.

20. HeatherR says

I wouldn’t be so sure that the Evangelicals would be on the left of the Kinky axis. Quite a few who I’ve known have an “I’m a bad girl/boy because I like sex so I need to be punished” view of sex. Granted it wouldn’t be all of them, but there would be some further right on that axis.

21. Aliasalpha says

In my case it was a matter of balancing design simplicity (especially important since I have very limited screen space) and trying not to make anyone outside the gender binary feel like they’ve just been described as leftovers or factory seconds

22. Aliasalpha says

Heh but depending on the type of fun we might end up being disqualified for being in the box in the first place and then we’d NEVER get a dollar!

23. Aliasalpha says

24. hoverFrog says

Even Facebook has an “interested in” question.

25. hoverFrog says

I wouldn’t be so sure that the Evangelicals would be on the left of the Kinky axis. Quite a few who I’ve known have an “I’m a bad girl/boy because I like sex so I need to be punished” view of sex.

I think it’s fair to say that we’ve all played that particular game.

26. Yea, that’s always the issue. As I said with my first post, I could see why it would be hard to implement TG onto a dating site, but the same with a social network.

Could you instead make it a blank field? So, say “Gender: [fill in blank field]” instead of a drop-down? That way you could include everyone without necessarily lumping everyone together.

27. Aliasalpha says

The only problem with that is that it gives people an open text box, which always leads to trouble in the form of idiots who think it’d be funny to call their gender Jedi or from people who can’t spell. After all, if you do a search for ‘transgender’, you wouldn’t get users who just use the shortened form ‘trans’ or spell it ‘tranzjender’

29. Ah yea. I figured that might be what would happen. I guess I think too much of people sometimes.

30. “Interested in” covers so little, and with a site like OKCupid, they do a lot of things based off of the two-gender demographic.

31. Tch. That’s crazy. Who would play THAT kind of game with their boyfriend? Heh heh… *shifty glance*

33. loreleion says

I think that transgender and genderqueer should be checkboxes that can be selected in addition to male / female (which should also be optional as many genderqueer people identify as neither).

Also, maybe Seattle has spoiled me, but I find gay women and straight men far more interested in trans women than gay men and straight women. And I wouldn’t really want to be with someone who was attracted to me because I’m male-bodied, since those are the aspects I hate about myself.

34. Yea, that’s another good possibility. I need to move out west, seems a lot more progressive than Virginia…

35. binjabreel says

Also, you’re thinking of Catholics. We’re some kinky bastards, what with all the kneeling down on uncomfortable wood planks while begging a guy in massive physical torment to forgive us.

36. berlieparks says

1st, 2nd, and 3rd base, perhaps?

37. Can not recall my name because of the new system but it has something to do with bats and cats or basments and cats says

38. ewan says

It’s not completely clear that they’ve quite understood that Jen & co. are proposing to do the work for them, not for themselves.

I’d have another shot at it – spell it out that you don’t want them to give you data, you want to give them an analysis.

39. I’ve seen a set of options labeled “Male”, “Female”, and “It’s Complicated”. I thought that was particularly sweet.

40. Raiki says

Just thought I’d pop in to point out how awesome Jen is for ending a blog post (on mathematics as applied to getting freaky, no less) with a star trek reference.

…Or I could be reading too far into three small words.

Either way, the ‘Jen is awesome’ point stands.

41. Aliasalpha says

Presumably the low end of the scale is for people who’ve rarely or never masturbated and the high end is for people who routinely go with stuff like autoerotic asphyxiation whilst covered in honey

42. Aliasalpha says

This quantifying reality business is harder than it sounds.

The main problem for my design (aside from the fact that I have 3 weeks left to finish it and still can’t be bothered to do any real work on it) is that I need to make it simple enough for joe & jane public to use whilst acknowledging the existence of people who aren’t like joe & jane public.

The checkbox idea for trans people is a good idea in theory but it doesn’t cover, for example, hermaphrodites who would require yet another option. The more specific I get the more specific I need to be to not exclude people and the project would just get too detail heavy and unusable.

As an example, one of the options I’m implementing is letting users specify their musical tastes. Adding ‘metal’ as an umbrella term for generally heavy music was the only solution that let me retain my sanity & not spend weeks specifying the dozens of major metal subgenres and has the added benefit of not letting me no-true-scotsman it by declaring that christian metal is not real metal on account of how shite it is. A quick look at http://www.mapofmetal.com shows just how diverse the metal scene is (and it’s also one of the best websites ever made). New Wave Of British Heavy Metal is extremely distinct from Melodic Death Metal, which is extremely distinct from Rap Metal, but to someone not experienced with the genre it’d all sound basically the same.

As much as it sucks to strip away the nuance, especially with something as important to people as their gender identity, it’s pretty much necessary to keep things usable.

44. anon says

OKCupid used to show how someone’s traits compared to the overall population. They even showed it as a bell curve (superimposed on your own bell curve) so you could tell whether they were known to be of average kinkiness/whatever or just hadn’t answered any questions. I thought this was the best feature of their site and then it went away entirely, so I guess they decided this isn’t something people actually want.

45. Beowulf says

You do know that’s not the way PCA works, right?!

46. F says

47. Christoph says

Why use a PCA when you can use a real exploratory factor analysis?

It bothers me that many people use PCA when they actually want to use a factor analysis. It’s convenient, yes, when you click around in SPSS and select factor analysis the preset thing is PCA and you’d need to click two times to change from PCA to a factor analysis and the results are -similar-, but the goal of a PCA is different to that of factor analyses.

For reference: http://pareonline.net/pdf/v10n7.pdf

48. TK says

OkTrends is a gold mine of laughs and “whaaaaaaaat?”.

49. Crip Dyke, Right Reverend Feminist FuckToy of Death & Her Handmaiden says

For computer code, it doesn’t matter what you display as text for the choice.

So, why not use a slightly longer phrase that works better?

I like

“another gender”

rather than just

“other”

but if you’re creating things from scratch for yourself, remember to call sex sex and call gender gender. This is part of what creates the problem that Katherine describes. Katherine appears to be saying that she’s a male woman. If you know the difference between sex & gender, this makes perfect sense. Whereas asking a question labeled “gender” but giving options that describe sex creates terrible confusion. One isn’t Female and Male, but one can be a female man.

Also, think about what you’re using the data for. Are you attempting to do something with drug development where physiology and anatomy are relevant? Then ask about sex and give the choices male & female & “another sex”. If you’re attempting to do something that has to do with social interactions & social power, well, no one checks chromosomes when they decide to discriminate against women (or men) for gendered jobs. In fact, studies show that much discrimination happens in reaction to names on cover letters & resumes. They don’t even wait to see how curvy a person is. Nothing about the body enters into it.

Thus in those situations, asking about gender & giving options, “man, woman, “another gender”” is the appropriate choice.

It’s all about doing the work of thinking about what you’re analyzing – bodies or psychology or social interaction. Then use the appropriate words for what you wish to know.

50. jflcroft says

51. loreleion says

I don’t think of myself as a male woman and neither do any of the other trans women I know. I’m female. Full stop. I also happen to be transgender, which basically just means I have the wrong body. I’m not another gender, not to say that that’s not a valid identity (i.e. genderqueer). If the options are male and female I’m going to pick female. Hell, if it’s male / female / other I’m going to pick female, but if there’s a transgender option I can choose as well I’ll check that.

52. says

I’m curious to see how much of a match you and I are, Jen. :D I just joined the site the other day thanks to this post (and your previous post a while back).

53. Tobias says

Wow, I certainly feel a lot better knowing that OkCupid is only going to share data about my sexuality with people that can pay them a lot of money.

54. says

Well you could always go the route Fetlife had gone (if you aren’t familiar it’s a kinky social networking site). They have a massive list of gender options which I think is really awesome:
Male, Female, Crossdresser/Transvestite, Trans – Female to Male, Trans – Male to Female, Transgender, Genderfluid, Genderqueer, Intersex, Butch, Femme, Not Applicable

That’s maybe going a little overboard (their orientation list is equally long) but it does cover most of the bases for most people.

55. hoverfrog says

56. Sili says

Then why on earth did you hand over that information to some random site on the Internet?

57. aria says

Hey! I’m writing a paper on PCA for a class and stumbled upon this blog. You’re probably in the middle of finals (or just starting them) but if you could take the time, it would be greatly appreciated!!

Thanks!