70,000 OkCupid Users’ Data Has Been Published — But Is That So Wrong?

Source: Flickr

All of the personal information you haphazardly upload to social media and dating sites is a treasure to someone, whether it's advertisers, police or programmers. But would you be fine with someone using all that data for the purpose of advancing scientific research?

A pair of researchers in Denmark compiled a database of 70,000 users from the dating site OkCupid and published it to an open science community for anyone to search or run experiments on.

"Despite many years of advocacy of proponents, it is still uncommon for social scientists to publicly share their datasets and even sharing data on request is rare," the researchers wrote in a paper included with the dataset. "Worse, there is some evidence which indicates that those who refuse to share data upon request make more statistical errors than those who share data."

The searchable database includes 36 points of personal information like username, age, location, "religion-related opinions" and number of photos. Actual photos themselves, as well as the body text of the profiles, were not collected, though that information could easily be found using an OkCupid account once a target profile was identified.

Hacks and exposures of giant databases of personal information have become frequent in recent years, with high-profile incidents like the Ashley Madison hack or Anonymous' KKK dox, so some were outraged when they heard about the OkCupid database:

But Weingart's warning about our ability to unearth personal information was true about OkCupid even before this leak.

Already public? The OkCupid database was collected with a scraper, a program that automatically runs through a website to collect all of the data. It is the algorithmic equivalent of going through the semi-public profiles one by one and jotting down the info.

Although scraping violates OkCupid's terms of service, it's not some sort of illegal hack. It's a convenient collection of information that was already available by inconvenient means.

All kinds of public data, digital and not, are scraped into databases daily. Police use license plate readers to track cars and justify the surveillance by saying it's simply a photograph taken in public. Twitter just asked a big data firm to stop offering its services to law enforcement, but is still making its firehose of tweets searchable for marketers and advertisers.

Scraping and uploading the OkCupid dataset is mischievous, but if the researchers are to be believed, their motivation is to advance academics and science. OkCupid itself constantly runs experiments on its dating data to improve its product or learn more about human relationships; OkCupid cofounder Christian Rudder wrote a whole book about it.

As we move more of our lives online, it's harder to hide personal information from someone who wants to collect it. That data can be used for good or evil, and academic research hardly seems like the worst-case scenario.


Jack Smith IV

Jack Smith IV is a senior writer covering technology and inequality. Send tips, comments and feedback to jack@mic.com.
