A few months ago, a friend and I were working on our taxes, and we had to look something up on the Internet. I searched for it in my browser, and my friend noticed that something was odd. This didn’t look like the ubiquitous Google results page. Somewhat reproachfully, he asked if I was …gasp… using a different search engine. Surely, as a PhD student in a department that specializes in delivering information to users, I should know that Google is peerless in its ability to dig up the most appropriate information. So why was I using something different? I told him that I was worried about my privacy and the amount of data that Google had been collecting about its users. I had switched to DuckDuckGo, which follows a strict no-tracking policy. “But, you don’t do anything illegal. Why do you need to hide?” was his retort. “Why does it bother you if they collect data about you?”
Well, let me give you three examples of data collection that keep me awake at night:
Case 1: A few years ago, the internet activist Eli Pariser ran a little experiment. He asked two of his friends in New York City to search for “Egypt” on Google and send him screenshots of the results page. Neither was signed in to any account, yet one saw results for Egyptian tourism, and the other saw news about the Egyptian revolution. It turns out that even if you have made just one search, Google predicts what sort of person you are. It then constructs a version of its information carefully tailored to this prediction. In a way, it creates a little bubble for you where the only information that filters through is what Google thinks you will click on. This is what Pariser calls the Internet “filter bubble” [1]: a world where we only see things that we like, or rather, that Google thinks we like. And unfortunately, it is becoming harder and harder to opt out of it. We frown upon those countries in the world where governments prevent the press from publishing controversial news. Yet why do we willingly submit to Google when it pushes controversial issues out of view?
When I was in middle school in India, most of my interests lay in science, and all my knowledge of Indian history came from my textbooks. Once, I was leafing through a newspaper looking for the technology section when I came across an article on “The Emergency Years.” I was shocked to find that there had been a two-year period in the 1970s when a few politicians declared “a state of emergency” in India, unilaterally suspending elections and turning the world’s largest democracy into an effective dictatorship. The sanitized version of modern history that I learned in school glossed over this important part of my country’s history. Most Indians would have liked to forget that dark period. Yet it is important for events like these to remain in memory to prevent a relapse into an autocratic state. The newspapers were doing their job in periodically reminding us of our history. But the job of the internet companies is to keep you entertained while they show you advertising. To my middle school self, that would have meant showing me lots of science news. If that had happened, would I ever have known about the emergency years? What kind of person would I have become if I had only seen the things that I wanted to see, or rather, the things that some powerful entity thought were best for me to see?
Case 2: Imagine this situation sometime in the future: you learn that you’re pregnant. You want to keep it a secret until you make the announcement at your party a month later. But before you can talk to your partner, your home is flooded with coupons for baby products. It turns out that the grocery store where you do your weekly shopping has been analyzing what you buy. It used that to figure out that there was a pregnant woman in the family and sent you coupons for baby products. The store was able to find out something about you that even your own father did not know! This is no hypothetical situation – it happened two years ago to a young girl who shopped at Target [2,3].
The store’s actions are actually fairly innocuous compared to what could have been done. Medicine prices, for instance, could shoot up when you are sick and drop when you are healthy. If you think that’s outrageous, consider that insurance companies already adjust how much they charge you based on predictions of when you’ll die or get into a car collision. That’s the reason health insurance costs more when you get older, and car insurance costs more when you are under 25. It’s OK if your doctor wants as much information about you as she can get. Her priority is to save your life. It’s not OK when a company wants as much information as it can get. Its priority is to make money. What happens when companies start predicting that you are more likely to commit a crime based on statistics about your ethnicity? Or that your marriage will end in divorce based on the amount of time you spend at work? Will we start being punished for crimes we haven’t committed, like in Minority Report? Or what if the company decides that it will make more money if you get divorced?
Case 3: An Austrian man used European law to get Facebook to give him every bit of information they had about him, which turned out to be 1,222 pages of data. A substantial portion of that data consisted of things that he thought he had deleted [4]. But Facebook could still use that deleted information to make predictions about him, and sell those predictions to other companies. It is becoming apparent that we might have even less control over our information than we had initially assumed.
Most people know that once you put something on the internet, you can almost never take it down. Facebook and other social networks add another layer of complexity to this problem by inferring information about you from your friends. For instance, Facebook knows what I look like because my friends post group pictures and tag me in them. Even if I had never known what a social network was, Facebook would still know my name and face. And as we’ve seen, Facebook will know this even if my friends take down the picture.
So why do companies work so hard to collect information?
Think of your average company in the 1950s. It makes a product, sells you the product, and keeps the profit. Now, think of your average Internet company today, such as Google, Yahoo, Facebook, Twitter, Tumblr, and even WordPress, on which this blog is being published – all of them offer services for free. But in return, they place advertising on their sites that targets the users. Most people don’t click on ads, so an advertiser that wants to maximize profits would rather show its ad to the right people than spend more money to show it to more people – that’s what we call “targeted advertising.” For example, advertising car insurance to someone like me, who doesn’t own a car, won’t bring in any money. But showing me ads for pizza, knowing that I live next door to a pizza place, will be really effective. Websites that can tune their ads very well can charge advertisers more money to show those ads. The more the websites know about me, the more appropriate the ads they can show me. That’s the reason Gmail is free – Google can read the emails that you send to learn who you really are, and then deliver the most targeted ads. In a way, the Internet companies are still selling products like the companies of old. But this time they’re selling to advertisers, and the product is us.
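To make that arithmetic concrete, here is a rough sketch in Python. Every number in it (the audience sizes, the click rates, the revenue per click) is made up purely for illustration, and the `expected_revenue` function is my own hypothetical helper, not anything a real ad network publishes. It simply multiplies the number of people who see an ad by the chance that each of them clicks and by what a click is worth.

```python
def expected_revenue(impressions, click_rate, revenue_per_click):
    """Expected revenue = people reached x chance each one clicks x value of a click."""
    return impressions * click_rate * revenue_per_click


# Hypothetical numbers, chosen only to illustrate the trade-off.
# Untargeted: show a pizza ad to 10,000 random people; 0.1% of them click.
untargeted = expected_revenue(impressions=10_000, click_rate=0.001, revenue_per_click=0.50)

# Targeted: show the same ad to 1,000 people who live near the pizza place; 2% click.
targeted = expected_revenue(impressions=1_000, click_rate=0.02, revenue_per_click=0.50)

print(f"Untargeted campaign: ${untargeted:.2f}")
print(f"Targeted campaign:   ${targeted:.2f}")
```

With these invented figures, the targeted campaign reaches a tenth as many people and still earns about twice as much, which is why knowing who I am is worth so much to whoever is showing me the ad.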
While I might wear my tinfoil hat a lot while ranting about this, collecting data for better advertising isn’t bad in itself. I would certainly prefer to see ads for stuffed crust pizza rather than for female sanitary pads. Large-scale data collection is also expected to provide huge breakthroughs in fields like artificial intelligence, social psychology, and economics. For instance, if you tracked the speed at which someone read through webpages, you might be able to detect early signs of cognitive decline due to Alzheimer’s. Doctors could then start treatment much earlier than would otherwise be possible. Although I pick on companies like Google, they take the issue of privacy very seriously, and they do give you a certain degree of control over what information they collect about you.
“But, you don’t do anything illegal. Why do you need to hide?”
Going back to my friend’s question, privacy is not the same as hiding – it’s having some amount of control over information about you. In a way, our information belongs to us the same way our money does. We give away some of it in order to get goods and services, such as letting banks look at our credit histories to give us lower interest rates. But that should be a choice. We don’t let people take money from our wallets without our knowledge or permission – why should collecting our information be different?
I don’t do any drugs, but I’m certainly not going to agree to be subjected to daily blood tests administered by any random person on the street. Yet this is precisely what the companies around us are doing today. They have computers that pore over every bit of information about us they can get in order to learn more about us. In the same way that refusing a drug test is no admission of guilt, refusing to divulge information does not mean that I’m doing something illegal. Even if the companies collect data with good intentions, this process has no checks or balances. There are few barriers to abuse of this power. The Fourth Amendment to the US Constitution is intended to protect us from unreasonable searches by the government, but there is nothing right now that protects us from large-scale private data collection.
My main worry is actually that most people don’t even realize that they are giving up privacy, or what sort of data they are giving the companies. In using Google or any other service, there is always a trade-off between giving the website information about you and the quality of the service you receive. I’m just worried that not everyone sees the trade-off that they are making. After all, there is no such thing as a free lunch.