How Personally Identifiable Information (PII) Is Collected And Who Profits

What Is Personally Identifiable Information (PII)?

PII, short for personally identifiable information, is essentially all the information you don’t want getting out, because at its core it represents who you are in simple, mathematical data. Because this data is connected to every facet of your life, stealing, gathering, and selling it is incredibly lucrative.

From the time you’re born you passively generate data that is linked to you; your social security number, for instance, is typically generated within 6 weeks of your birth. You don’t have any control over it, but this little nine-digit number will affect nearly every aspect of your life as you age.

Consider how mundane social media has made the act of sharing even very basic information with the world. We share our day-to-day lives pretty willingly and often without any thought given to who can take and use that information. You might scoff at the idea of sharing your social security number, but with just a little bit of easily attainable data that you willingly share, an identity thief can get their hands on your SSN.

A short list of some (but not all) PII would be:

  • Full name
  • Social Security Number
  • Passport Number
  • Driver’s License number
  • Full personal mailing address
  • Biometric data, like retinal scans, facial recognition data, voice captures, and even fingerprints
  • Vehicle identification numbers (VINs) for cars
  • IP addresses
  • Telephone numbers

Who Is After Your PII?

Not to sound alarmist, but everyone is after your personally identifiable information. Just a sample of potentially infringing entities includes:

  • Identity thieves
  • Companies who want to sell to you
  • Companies that want to sell your data to companies who want to sell to you
  • Scammers
  • Government and police entities
  • Research institutes for everything from psychological studies to marketing and advertising
  • People making money off your photos, in advertising or other methods

Even companies that want your data to better market to you or help you (for instance, a line of credit) can end up being a liability. Think of the massive security breaches in the last few years. Millions of social security, bank, and credit card numbers were leaked and are freely available on the dark web because of what should have been safe, consumer-friendly companies.

But even those listed above are secondary users of your data, not the primary gatherers or the terminus of the information. There are people who gather your data, who may or may not sell it, then there are those who act as a go-between, and finally the end-users. Examples of PII gatherers are:

  • Facebook
  • Instagram
  • Twitter
  • Email providers
  • Email lists
  • Your internet service provider (ISP)
  • Snapchat
  • TikTok and any other “social sharing” app
  • Airbnb in particular is treading into relatively uncharted territory, as is Uber. Both of these companies combine internet-accessible personal information with real-life interaction, the ramifications of which aren’t well researched yet.

These companies provide a service in a sense, but they are getting back far more than they provide. When you sign up, post pictures and video, use chat rooms, send emails, and post tweets, you’re giving these companies carte blanche to your personal information. Though Instagram doesn’t own the rights to your photos, they CAN use them for their own ends.

One of the most potentially problematic parts of these brands is that even IF you wanted to take everything you’ve ever posted online down, they have back-ups upon back-ups; there is truth to the saying that once it’s on the internet, it’s there forever.

Consider how Google targets you with ads that are eerily pointed; you talk to a friend about baby shower invites and suddenly you’re hit with ads for maternity gear, baby toys, and other kid-related stuff. How did this happen? You didn’t search for any of it that you remember, so why?

It’s because Google has access to every email you send. They comb for data and then use it to target you with ads that will most likely generate turnover and therefore create revenue for them. Facebook does the same thing, albeit in a more Machiavellian way. Mark Zuckerberg admitted to Congress that Facebook had been gathering data on conversations through their Messenger service for years, though he didn’t elucidate exactly why.

Making money is one thing but this gathering of data has far wider implications and once the data is gathered, it only takes one breach to make it into the hands of people who very much want to use it for nefarious purposes.

A good breakdown of the utter seriousness of the access something as innocuous as Facebook has to your information looks like this:

  • Facebook has access to your entire device information from your settings to very specific things like your time zone and zip code. In addition, all of those things that you click “allow access” to? They can gather information that way as well, including voice, text, video, and Bluetooth connectivity.
  • Information that is created or accessed when someone shares a picture of you – whether or not you are aware of it or gave consent – is Facebook’s forever. Location data is a big concern here: say you don’t have Facebook but you’re on vacation with a friend who tags you in a photo. That location data is now available to their friends (or the public).
  • Anytime you spend money within Facebook (the Marketplace for instance) that data passes through Facebook, even if you’re not buying from Facebook.
  • Facebook creates a sort of profile-within-a-profile of you based on your actions, shares, likes, and content. They even brand your political, religious, and social affiliations to better target you with news and advertisements. You can see how Facebook evaluates you on many marketable rubrics here: http://facebook.com/ads/preferences.
  • Naturally Facebook isn’t the only app that collects data like this; just about any time you “allow access” to your camera, voice recording software, Bluetooth, WiFi, or GPS, you’re providing much more information than it might seem is necessary for the operation of the app you’re using.

A larger net is cast by Google, which has its tendrils in every facet of online life. Whereas Facebook has somewhat ambiguous, almost supervillain-esque uses for your data, Google simply wants to tailor the internet as best it can to your tastes (and profit immensely off that). The problem with Google is that because they want to help you in every way they can, you end up with data spread to places you weren’t even aware of.

Assuming you’re using a Google app on your phone, it will associate your device with your Google accounts. This means that potentially everything you do on your phone that doesn’t pertain to Google is still under their watchful eye, including location data, which has very far-reaching implications.

As we said in a prior paragraph, Google “may” (read: “will”) analyze your content to provide you personally relevant product features, such as customized search results. This includes literally ALL of the data you transmit through a Google product, such as Drive, Cloud storage (possibly the most frightening because of how easy it is to penetrate) and of course, email.

Naturally the other platforms collect data, and a lot of it is exactly what you’d expect – Snapchat has your photos and videos for advertising, and Instagram has a HUGE database of photos, which could theoretically be used by the government for surveillance or by corporations for smaller-scale creepiness.

We are now, however, in a realm of entirely different ways of getting your data, and entirely different data being collected.

Modern Data Collection Quirks

Your PII can be collected by the companies you willfully engage with, but there are subtler, sneakier ways that companies collect data and then combine it with available PII to create a trail of information back to you. Some examples of these more esoteric data collection methods are:

Free WiFi

WiFi is wonderful but you have to understand the terms of service when you log in to any company’s free network. Whether you’re on your own WiFi at home, at the gym, or a Starbucks, the ISP related to that internet signal is tracking your every move, purchase, and search, gathering data about your associated email addresses, texts, and apps. If it uses their WiFi signal, then it’s their data as well.

Free WiFi at the local corner coffee shop might be fine (though still likely unsafe if there’s someone there who is using the WiFi to steal information), but large corporate WiFi is best avoided.

Social-Media Interface Activity

Naturally social media tracks many parameters of your life: your online activity, your photos, your technically offline activity via check-ins, location tracking when you post, etc. Social media allows us to broadcast ourselves constantly, and one largely unfortunate downside to that is that we’re simply no longer broadcasting to just our target group. So when you post on, say, Instagram, they collect location data in various ways and tie it into the personal data they have on file, and then that becomes a commodity. But this is all common knowledge.

What has evolved in the last several years is the connectivity of various social media platforms, such that you can log into Twitch via Facebook, web forums via your Google information, and so on. This leaves a trail where you might be posting on say Reddit with an anonymous account, but need to post a photo, so you log into Imgur with your Facebook account and there is now a digital trail between anonymous-internet-you, and your real life.

The ease with which you can “login with Facebook” causes people to let their guard down and then all the information from one site flows into the other, and possibly into prying hands as well.
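The trail described above can be sketched as a simple join on a shared identifier. This is only an illustration under stated assumptions, not how any named platform actually stores or shares data: the record layouts, the field names, and the `link_identities` helper are all hypothetical.

```python
from typing import Dict, List


def link_identities(
    forum_posts: List[dict],
    image_uploads: List[dict],
    social_profiles: Dict[str, dict],
) -> List[dict]:
    """Connect pseudonymous forum users to real names via an uploaded image.

    All three inputs are hypothetical record sets:
      forum_posts     -> {"user": ..., "image_url": ...}
      image_uploads   -> {"url": ..., "social_login_id": ...}
      social_profiles -> social_login_id -> {"name": ...}
    """
    # Index the image host's uploads by the URL embedded in the forum post.
    uploads_by_url = {upload["url"]: upload for upload in image_uploads}

    linked = []
    for post in forum_posts:
        upload = uploads_by_url.get(post["image_url"])
        if upload is None:
            continue  # image was not uploaded through a social login
        profile = social_profiles.get(upload["social_login_id"])
        if profile is None:
            continue  # no matching social profile on file
        linked.append({"forum_user": post["user"], "real_name": profile["name"]})
    return linked


# Made-up example data: one anonymous post whose image came from a social login.
posts = [
    {"user": "anon_42", "image_url": "https://img.example/abc"},
    {"user": "lurker", "image_url": "https://img.example/zzz"},
]
uploads = [{"url": "https://img.example/abc", "social_login_id": "fb-1001"}]
profiles = {"fb-1001": {"name": "Jane Doe"}}

matches = link_identities(posts, uploads, profiles)
```

The point of the sketch is that nobody has to "hack" anything: one shared key between two otherwise separate data sets is enough to tie the anonymous account to the real one.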

Map Tracking (Also Known as Heat Mapping)

This is one of a handful of generic terms that refer to the practice of a website tracking how your mouse moves around the page, helping them gain insight into your behavior. The fact is that with some very simple cookie or form data, they can combine your tracking data with your personal information and get a story about what you’re looking for that you might not even consciously be aware of.
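As a rough sketch of the idea, mouse tracking can be reduced to counting how often the pointer lands in each region of the page. The grid size, the coordinate format, and the `build_heatmap` helper below are assumptions made for illustration; real tracking scripts vary widely.

```python
from collections import Counter
from typing import Iterable, Tuple


def build_heatmap(points: Iterable[Tuple[int, int]], cell_size: int = 100) -> Counter:
    """Count mouse positions per grid cell (cell_size is in pixels)."""
    counts: Counter = Counter()
    for x, y in points:
        # Integer division maps a pixel coordinate to its grid cell.
        counts[(x // cell_size, y // cell_size)] += 1
    return counts


# Hypothetical pointer samples recorded for one visitor on one page.
samples = [(10, 20), (50, 90), (150, 20), (120, 30), (130, 40)]
heatmap = build_heatmap(samples)

# The "hottest" cell is where the pointer spent the most samples.
hottest_cell, hits = heatmap.most_common(1)[0]
```

On its own this is anonymous behavioral data. It becomes personal once the same session also carries a cookie or a submitted form that identifies the visitor.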

GPS

Not new, or newly mentioned, but the fact that the convenience of GPS on modern phones is ubiquitous should be unnerving. Your phone company, any WiFi you’re connected to, and any apps with which you’ve shared positioning data all have access to your real-time location. While we’re not all being tracked by government agents or Jason Bourne, the access to that knowledge is invasive and can actually be dangerous if it falls into the wrong hands.

Signal Tracking

Something relatively new to the data gathering market is the concept of signal tracking within brick-and-mortar stores. By using your phone’s WiFi data, a store will track your every movement, how long you stay at each display, and follow you through your purchase, gathering information the entire time. This is really only useful for targeted advertising within a store, until it is coupled with this next entry:

Facial-Recognition Cameras

What began as an effort to stop or identify shoplifters has become another way to gather real-world data and sync it up to PII on the web or within a small network within a business. The days when advertisers pick your face out of a crowd and target only you with advertising are not that far off; facial recognition software on cameras can spot you and, if you’re signed up for email subscriptions from that business, immediately send targeted coupons or ads to your email or phone.

Together with signal tracking, your shopping habits, movements, and even the brands and types of products you buy and use regularly can be free-game to companies willing to pay for that information.
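The signal-tracking half of this can be sketched as adding up how long a device is seen in each part of a store. The sighting format (device identifier, zone name, timestamp in seconds) and the `dwell_times` helper are hypothetical; this is a minimal sketch of the idea, not any vendor’s actual system.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Sighting = Tuple[str, str, int]  # (device_id, zone, timestamp_in_seconds)


def dwell_times(sightings: List[Sighting]) -> Dict[str, Dict[str, int]]:
    """Total seconds each device spent in each zone.

    The time between two consecutive sightings of a device is credited
    to the zone where the device was seen first.
    """
    last_seen: Dict[str, Tuple[str, int]] = {}
    totals: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))

    for device, zone, ts in sorted(sightings, key=lambda s: s[2]):
        if device in last_seen:
            prev_zone, prev_ts = last_seen[device]
            totals[device][prev_zone] += ts - prev_ts
        last_seen[device] = (zone, ts)

    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {device: dict(zones) for device, zones in totals.items()}


# Made-up sightings of a single phone moving through a store.
log = [
    ("aa:bb", "shoes", 0),
    ("aa:bb", "shoes", 60),
    ("aa:bb", "checkout", 200),
    ("aa:bb", "exit", 260),
]
report = dwell_times(log)
```

With this example data the phone is credited with 200 seconds in the shoe section and 60 seconds at checkout. Tie the device identifier to a loyalty account or an email address and the movement data stops being anonymous.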

One example of this data being used in the real world is a company called Realeyes, based in London, which analyzes customers’ faces at checkout, as well as when they peruse online ads in-store, to measure their mood. This allows them to take the personal information you provide – even if it’s just an email or phone number – and combine it with actual real-life data about your face and perceived enjoyment of certain products.

A company in Russia has created a program that analyzes customers as they check out for mood, age, gender – all from facial recognition software. Then they take that information and use it to market directly to that customer, via their provided mobile phone number or email.

WHOIS, DARPA, and the Early Net

Without going into extreme, long-winded detail, the early web had one place for collecting domain registration information, which eventually turned into a database called WHOIS (pronounced as it looks: “who is”). Early on, the data was largely public, and as the window for it to be public closed around 1999, some forward-thinking individuals combed through the data and gathered it up to sell later.

The data is somewhat still available but the database isn’t accessible the way it used to be; originally if you searched for say, your own last name, the search results would pull up EVERY person who had domain data with that name. Now you need extremely specific data in order to complete a WHOIS query and get results, but the data mined before the change happened is still available, if you’re willing to pay companies like domaintools.com.

As you can see, there are a multitude of ways for companies to gather information both online and in the real world. While some of this data is benign – the movement of new customers in a store for instance – it becomes significantly more potent, useful, and sought-after when it’s coupled with personally identifiable information.

License Plates

Due to the ubiquity of traffic cameras, red light cameras, and in some countries, CCTV, your license plate is no longer a private matter. In the US, most red light cameras are actually put there on contract with a non-police entity that is paid for the data they collect (to catch people who speed through intersections or cause accidents).

In some cases, the companies are paid based on how many tickets can be issued from the photos they capture. This is already sleazy and terrible, but those companies can then take your license plate and, if they have enough data, trace it back to your other information.

This information can be fed into companies that will do their best to make you feel insecure or like you’re under surveillance (which you kind of are). Often a photo of your car will be provided along with an offer to check your credit, protect your identity, or something else vaguely creepy in order to coerce you into their product.

A real-world example of this data collection being bought and used is in the role it plays in making repossession of cars easier. Automatic traffic cameras can take up to a hundred photos a minute of cars passing by, and with access to a list of people who are behind on car payments, it makes that job significantly easier. This is literally digital data gathering being used against people against their will or in most cases without their knowledge.
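The matching step in the repossession example is, at its simplest, a lookup of each camera read against a list of plates. The sketch below assumes a made-up record format and a hypothetical `find_flagged_plates` helper; real plate-reader systems are more involved.

```python
from typing import Iterable, List


def normalize(plate: str) -> str:
    """Strip spaces and hyphens and uppercase, so 'abc-1234' equals 'ABC1234'."""
    return plate.replace(" ", "").replace("-", "").upper()


def find_flagged_plates(camera_reads: List[dict], flagged: Iterable[str]) -> List[dict]:
    """Return the camera reads whose plate appears on the flagged list."""
    flagged_set = {normalize(p) for p in flagged}
    return [read for read in camera_reads if normalize(read["plate"]) in flagged_set]


# Hypothetical camera output and a hypothetical list of plates on overdue loans.
reads = [
    {"plate": "abc-1234", "camera": "5th & Main", "time": "08:15"},
    {"plate": "XYZ 9876", "camera": "5th & Main", "time": "08:16"},
]
overdue = ["ABC1234"]

hits = find_flagged_plates(reads, overdue)
```

Each hit carries a place and a time, which is exactly what makes a car easy to find.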

How Is All of This Being Used?

To break it down even further than we have, PII is gathered primarily to sell. At least on the legitimate business side of things, large companies want your data to improve their targeted advertisements.

For example, advertising on Facebook can have very good turnover but it costs per click; just any random person who sees your ad might click through but if they’re not a person who is interested in your product or service, they’ll hit your landing page and then back out to watch cat videos.

This doesn’t matter to Facebook, though – an ad-click is an ad-click. So you as a company might be losing 65%+ of your click-throughs (and the associated money) to people who would never convert. This is why targeted marketing is such a big business and why all these companies are clamoring for your data.

It’s incredibly simple math to figure out who your target market is and if you spend a few hundred dollars gathering relevant data to prime your ads, you’ll save far more than that by avoiding wasted clicks.
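To make that math concrete, here is a small sketch of the wasted-click calculation. Only the 65% waste figure comes from the text above; the click volume, cost per click, improved waste rate, and data cost are invented purely for illustration.

```python
def wasted_spend(clicks: int, cost_per_click: float, waste_rate: float) -> float:
    """Money spent on clicks from people who would never convert."""
    return clicks * cost_per_click * waste_rate


def net_savings(
    clicks: int,
    cost_per_click: float,
    waste_before: float,
    waste_after: float,
    data_cost: float,
) -> float:
    """Reduction in wasted spend from better targeting, minus what the data cost."""
    saved = wasted_spend(clicks, cost_per_click, waste_before) - wasted_spend(
        clicks, cost_per_click, waste_after
    )
    return saved - data_cost


# Hypothetical campaign: 10,000 clicks at $0.50 each, 65% of them wasted.
before = wasted_spend(10_000, 0.50, 0.65)

# Assume (hypothetically) that $300 of targeting data cuts waste to 30%.
savings = net_savings(10_000, 0.50, 0.65, 0.30, 300.0)
```

Under these assumed numbers, roughly $3,250 of a $5,000 budget goes to clicks that never convert, and a few hundred dollars of data pays for itself several times over. That is the incentive behind the whole market.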

Beyond Facebook (and the interactions therein), some other uses for the mined data include:

Retail Companies

Target, Walmart, and other companies will use this data to send out specialized coupons that could lure your purchases. Walgreens is a big proponent of this sort of data gathering; when you buy anything, you get printed coupons that are based on your previous purchases and in some cases, your purchases made at other stores. This is because they pay good money for their customers’ data (and gather it themselves).

Email Companies

Larger entities like Google are able to go through everything you search for and everything you send in emails to find out just what actually interests you. The power of Google is so strong that it has been known to suggest maternity items to women who don’t know they’re pregnant yet, based on data from their searches and the contents of their conversations. Just like the desire of retailers to hit you with targeted ads, Google gets paid based on sales made through their ad placement.

SEO Mills

Traffic data and statistics are sold to various entities in order to be monetized through SEO mills. On the surface this is a large-scale, generic type of marketing, such as blogs that have a bare bit of information just so they can embed a link for a product. With aggregated data, however, this becomes a much more powerful tool, allowing you to zero in on a specific market, based on the habits of that market.

Cold Callers and Autodialing

Finally, there is the art that just won’t die: cold calling. Cold calling companies will pay money for phone numbers, names, and demographic PII data in order to call you up and try to sell things to you. This is like the aforementioned methods of selling, just far more annoying and it’s more invasive in a much more visceral sense.

While all of this is definitely an invasion of privacy and somewhat annoying, it can be harmless. There are, however, more groups trying to gather your personally identifiable information that are not at all harmless and can actively ruin your life.

The Dark Web, Hackers, Scammers, and the Government(s)

Hackers and all of the associated scam artists have been around nearly as long as the internet; the very first email account probably received the first Nigerian prince scam email mere moments after it was opened. As time has gone on, however, the tactics of these people and groups have evolved. No longer do they need to rely on the naivety or carelessness of the casual browser to click on a bad link, as data can now be intercepted and mined from any device as long as it’s connected to WiFi or a network.

Your personally identifiable information is not safe from the prying-yet-innocent hands of Old Navy staff which means it’s definitely not safe from someone who is actually actively attempting to steal it. Threat assessment intel is taken from the Dark Web on a regular basis and then sold to companies to shore up their own cyber security and plan for potential attacks, but you don’t have that luxury most likely.

The Dark Web is a term for the parts of the internet that won’t show up on search engines and that can only be reached with special software such as the Tor browser. It’s a popular place for a lot of horrible things but it’s also a useful tool for people who want to simply protect their privacy. That said, there is a lot of personally identifiable information available on there for the right price. Some examples are:

  • Email passwords are bought and sold on the Dark Web and with a little ingenuity, someone can take that information and affect every facet of your life, essentially taking it over online. When people use things like passwords (or any in-road really) to gain your social security number or other extremely vulnerable personally identifiable information, it is called identity theft. People who steal your identity can use it for a number of negative things from opening credit cards to setting up utilities.
  • Credit card information can be gathered by hackers fairly simply but rather than run the risk of being caught using it themselves, they will put it up on the Dark Web for sale. Someone who is capable of navigating that space readily can run your credit card up to its max in a matter of hours or less, leaving virtually no digital trail.
  • Doxxing, the act of digging up and publishing extremely personal information like your phone number or physical address, is pretty common and usually precedes real-life harassment. Your IP address is another popular thing to be sold so that people can directly attack your computer; the internet is full of petty, angry people who are more than willing to pay money to get information to hurt you if you slight them online.

Government agencies can also monitor public chats and in fact often do to snare drug sellers or worse. While this is probably considered a reasonable use of power by most people, the idea that any chat you have that is on an open forum could be monitored by the government is somewhat unsettling. Beyond that, the Patriot Act and similar laws in the UK give the government the right to spy on your internet activity with very little if any “reason”.

Even police need a search warrant to search your house, and that requires a judge to sign off that their reasoning for the warrant is valid. Not so with federal snooping; if they for some reason think you’re of interest, they can access anything that is related to your actions online without actually coming to you and demanding it (unless they have the aforementioned search warrant).

A very pointed example of how your personally identifiable information can cause you harm in the wrong hands is the story of Jorge Molina. He was arrested and spent nearly a week in jail because police had obtained a warrant requiring Google to hand over tracking data for phones in the area of an unsolved crime. Because his data (erroneously) placed him in the area of the murder, he was suspected of being the culprit.

Because, however, the data is not precise, it turned out that it was NOT Molina and he was set free. This is an example of a person’s data being taken from Google and used to cause harm – accidental or not – to a very real person in real life. Though Molina wasn’t the perpetrator, Google tracked the GPS in his car, which was being used by his mother’s ex-boyfriend, who was the actual culprit.

It’s worth noting that the laws that outline the government’s ability to spy on you are vastly different throughout the world. The EU has incredibly powerful personal internet privacy laws that protect the user from needless snooping. In fact, the “we must inform you that we’re using cookies to track your data” thing at the top or side of every webpage now is a result of the actions of the European Union to create a fair, open, and private internet.

That said, the UK’s laws are still a bit more government-centric, and they don’t follow the same strict, consumer-protecting protocols as the EU. Other countries have modeled their internet security and privacy measures on the EU standard, including Argentina, Afghanistan, Chad, the Bahamas, Canada, Australia, and a host of others.

This is in contrast to countries that are very high on the surveillance list, like China, and some that are just slightly off-center in favor of the government, like the USA.

Protecting Your PII

There are many measures you can take to protect your data that are specific to each type of breach of privacy you’re subject to on a daily basis. Some things, like protecting your license plate from spying cameras, require very specific equipment. Others require a bit of due diligence on your part to simply act more defensively online.

  • Log out of your accounts when you leave a website, and don’t let your browser store login data
  • Turn off tracking cookies, and delete your browsing history and cookies upon exiting your browser
  • Never give out information that you honestly wouldn’t give to a person on the street
  • When you’re checking out at a register, most stores will prompt you to share your email “for special offers” but understand that this is another way for them to gather data, and that it will be connected with any type of traceable payment you make (debit, credit, etc)
  • If you keep using social media, be incredibly selective with whom you add to your network. In addition, NEVER do surveys or quizzes on Facebook – this sounds specific and everyone you know might be doing them, but these are usually the biggest data miners on the platform.
  • Change your password on a regular basis for your primary email account, and if you have a chance, go back and close out all of your old emails that you don’t use anymore
  • Never save your credit card information on any website, from the pizza joint down the road to Amazon. Amazon in particular can save basically an unlimited number of credit or debit cards and if someone hacks your account, that’s going to be a bad time.
  • If you don’t want to pay for something like Lifelock, it’s worth monitoring your credit if not monthly, then at least quarterly. Credit Karma is one of many free services that will give you notifications whenever something changes in your score; in particular when it’s checked for a new line of credit someplace.
  • Never log into personal email on a public computer unless you’re on a protected network, like the one at your job or university

The best advice in general is to be as selective as possible when you personally share your data, and to stop sacrificing your privacy for convenience. Clear your data from every form, delete your delicious, delicious cookies, and keep sharing to an absolute minimum. It will definitely take some re-wiring of your habits and actions to get into a defensive position but it’s well worth knowing that you’re not being broken down into shareable data and sold.
