Shut Up and Take My Data


By Jason Lewis and Margaret Roth

For some of us, “data” has become a scary word, especially within the fear culture we have created around it. Here, data is a big evil green monster that is going to follow you around for the rest of your life, ruin every chance you ever had to get a job, and make the cost of your health insurance skyrocket. Like Cthulhu, data is a monster that, emerging from its hidden lair, will destroy your life and then eat you.

This conception of data is a lie. Data is not a monster. Data is information, experiences, discrete facts collected for a purpose. Data is a currency, and we choose where to spend it, what to buy with it, and how much to save up.

Sometimes we spend our data on the attention of others. Social media posts, pictures, selfies, tweets: they’re all self-imposed paparazzi, pieces of data that we freely fling to the world. Sometimes we spend that data on college admission. Sometimes we spend that data on an endless stream of perfect movies that we never knew we wanted to watch next, but that we love.

Here’s the tradeoff: when you’re talking about data, you’re also talking about software, and with software, everything is a tradeoff. As a well-known adage in the software world goes: If you’re not paying for the product, you are the product. If you’re not paying with money, you’re probably paying with data.

Sometimes we decide to pay with data; sometimes we pay people to collect our data. Sometimes we do both at the same time.

Think Netflix. Hulu. Amazon. Anything with a recommendation engine. You are paying to get the content that is most relevant to you. You are paying for a service where computers use massive sets of data collected from across the population to create a system that outputs a perfect experience for you.

By using that service, we are paying with our data: what we watch, what we stop watching, what we watch over and over again. Spread this out over the millions of people using the service, and we get to a point where the individual does not matter. No one is sitting in a room looking at your data and judging you for only watching movies starring Katherine Heigl and Seth Rogen.

When we get to this level, your personally identifiable information is the first thing that is stripped out, because it doesn’t matter. What does matter is that when we get rid of all the personal information and have millions of data points aggregated from across the population, we can begin to teach computers to recognize patterns, and in turn, let you know every time a new Judd Apatow gem becomes available on Instant.
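For the curious, here is a rough sketch of what "strip out the personal information, then aggregate" can look like in Python. Everything in it is made up for illustration: the field names, the sample records, and the helper functions are assumptions, not how Netflix or anyone else actually does it. Real de-identification also takes more than dropping a couple of fields, so treat this as the idea in miniature.

```python
from collections import Counter

# Hypothetical viewing events; the field names are illustrative assumptions.
events = [
    {"name": "Ann Example", "email": "ann@example.com", "title": "Knocked Up", "finished": True},
    {"name": "Bob Example", "email": "bob@example.com", "title": "Knocked Up", "finished": True},
    {"name": "Ann Example", "email": "ann@example.com", "title": "27 Dresses", "finished": False},
]

# Fields we treat as personally identifiable in this toy example.
PII_FIELDS = {"name", "email"}


def strip_pii(event):
    """Return a copy of the event without the personally identifiable fields."""
    return {key: value for key, value in event.items() if key not in PII_FIELDS}


def completion_counts(raw_events):
    """Count how many times each title was watched to the end, using only anonymous records."""
    anonymous = [strip_pii(event) for event in raw_events]
    return Counter(event["title"] for event in anonymous if event["finished"])


counts = completion_counts(events)
```

The resulting counts say which titles people finish, and nothing about who finished them. That is the pattern a recommendation system cares about.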

When we get to this level, it’s not a matter of collecting data, it’s about securing it.

Security is, in every case, a tradeoff with accessibility. You can have 100% security of data: in a computer, powered off, not connected to anything else, in a cement box, buried in lead. You can also have 100% accessibility, through a public website or constant television broadcast, 24 hours a day, forever. Everything else is somewhere in the middle. Security is relative, and it is relative to accessibility. If you want something to be usable, you have to allow access to it in some form; otherwise it’s not going to be very useful to anyone. So the decision has to be made which data is useful, how accessible it needs to be to remain useful, and how secure it needs to be to prevent it from falling into the wrong hands.

When we post on Twitter, or any other publicly accessible forum, we’ve waived the right to the security of that data. This is why laws like COPPA exist, to protect children from sharing potentially sensitive information about themselves when they’re too young to understand the ramifications.

In schools, FERPA exists to protect personally identifiable student data like grades and test scores.

The key in both cases is that the information is personally identifiable. When we’re talking about the type of data Google collects (especially in the case of GAFE, where ads aren’t even in play), individuals don’t matter. You are not a unique and special snowflake; you are completely uninteresting… until you collect millions of snowflakes to make a snowman.

The “snowman” here is developing more sophisticated algorithms to understand this type of data, so that products (the same ones you’re getting for “free”) can be improved and become more useful, and to develop future technologies that are also going to improve life, education, and so on.

And when we build that “snowman” with xAPI, it changes the way we collect, use, and access every single piece of our learning process and our learning ability. Any Learning Record Store created will be more secure than a student records office. The latter can be penetrated with a lock pick; the former uses the same encryption protocols and security measures used by banks, the military and Google. And it is more useful because your data is in a form where we can actually do something with it, where we can derive meaning from it.
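To make "a form where we can actually do something with it" a little more concrete: an xAPI record is a small structured statement saying who did what to which activity. Below is a minimal Python sketch of building one. The learner ID, home page, and activity URL are placeholders we invented for illustration, and the helper function is hypothetical; only the actor/verb/object shape comes from the xAPI specification.

```python
import json


def make_statement(learner_id, verb_id, verb_name, activity_id):
    """Build a minimal xAPI-style statement: an actor, a verb, and an object."""
    return {
        "actor": {
            "objectType": "Agent",
            # An account identifier instead of a name or email address.
            # The home page is a placeholder, not a real system.
            "account": {"homePage": "https://lms.example.org", "name": learner_id},
        },
        "verb": {"id": verb_id, "display": {"en-US": verb_name}},
        "object": {"objectType": "Activity", "id": activity_id},
    }


statement = make_statement(
    learner_id="student-1042",  # hypothetical opaque ID
    verb_id="http://adlnet.gov/expapi/verbs/completed",
    verb_name="completed",
    activity_id="https://lms.example.org/activities/fractions-quiz",  # placeholder
)

# Serialized to JSON, this is roughly what would be sent to a Learning Record Store.
payload = json.dumps(statement, indent=2)
```

Because every record has the same shape, millions of them can be queried and compared in ways a filing cabinet of paper transcripts never could.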

We make these tradeoffs, knowingly and willingly, because data in isolation is useless. When data has friends to play with, it is massively useful, because that data is being used to create something better, to create a better experience down the road.

To create an experience that will fundamentally change the way learning happens on Earth.
