What does the NSA do with our personal data?

How much data do we produce? According to recent IBM research, humanity generates 2.5 quintillion bytes of information every day. (If each byte were a coin laid flat, the coins would cover the entire globe five layers deep.) This figure includes recorded information - photos, videos, social network messages, text files, records of telephone conversations, financial reports and the results of scientific experiments. It also includes data that exists for only a few seconds, such as the contents of telephone conversations or Skype chats.

Data collection by security services rests on a basic thesis: the entire mass of data can be analyzed in a way that reveals connections between different people. Once these connections are understood, they can provide leads for investigators.

The main principle of data processing is to label each fragment; based on this metadata, computer algorithms can identify the communications of interest to the security service. Metadata is data that describes other data, such as the names and sizes of the files on your computer. In the digital world, a label attached to a piece of data is called a tag. Labeling data is a mandatory first step in processing it, since it is the label that allows the analyst (or the analyst's program) to classify and organize the available information for further processing and analysis. Labels make it possible to manipulate data fragments without delving into their contents. This is a very important legal point in the work of the security service, since US law does not allow the correspondence of US citizens, or of foreigners legally staying in the country, to be opened without an appropriate warrant.
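As a rough illustration of labeling without reading contents, the sketch below builds a label for a file from file-system metadata alone. The function name `label_file` and the fields of the label are hypothetical, chosen for this example; they do not describe any actual agency software.

```python
import os
import tempfile

def label_file(path):
    """Build a metadata label from the file system alone; the file is never opened."""
    info = os.stat(path)
    name = os.path.basename(path)
    return {
        "name": name,
        "size_bytes": info.st_size,
        "extension": os.path.splitext(name)[1].lower(),
        "modified": info.st_mtime,  # seconds since the epoch
    }

# Demo on a throwaway file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "Report.TXT")
    with open(demo_path, "w") as f:
        f.write("hello")
    label = label_file(demo_path)

print(label["name"], label["size_bytes"], label["extension"])
```

The label can then be sorted, filtered and grouped on its own, which is the point made above: the fragment is classified by its description, not by what is inside it.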


IDC, a data analysis company, reports that only 3% of all information circulating in the computer world is accompanied by a label when it is created. The NSA therefore uses a special, very complex program that “sticks” the appropriate labels on all the information it collects. Labels are the basis for any system that establishes relationships between different types of data - for example, between video files, documents and recordings of telephone conversations. A data processing system can, say, draw investigators' attention to a suspect who posts terrorist propaganda online, visits sites that describe how to make improvised explosive devices, and also buys a pressure cooker. (This pattern is consistent with the behavior of the Tsarnaev brothers, who are accused of the terrorist attack at the Boston Marathon.) Such tactics rest on the assumption that terrorists have distinctive data profiles, although many experts question this assumption.
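The kind of cross-source matching described here can be sketched in a few lines. Everything in the example is an assumption made for illustration: the record layout, the tag names and the `match_profile` function are hypothetical, and a real system would be far more elaborate.

```python
from collections import defaultdict

# Hypothetical labeled records from different sources: (person_id, source_type, tag)
records = [
    ("p1", "social_post", "propaganda"),
    ("p1", "web_visit", "explosives_howto"),
    ("p1", "purchase", "pressure_cooker"),
    ("p2", "purchase", "pressure_cooker"),
    ("p3", "web_visit", "explosives_howto"),
]

# Hypothetical profile: all three tags must be present for one person.
PROFILE = {"propaganda", "explosives_howto", "pressure_cooker"}

def match_profile(records, profile):
    """Return the people whose combined tags cover the whole profile."""
    tags_by_person = defaultdict(set)
    for person, _source, tag in records:
        tags_by_person[person].add(tag)
    return sorted(p for p, tags in tags_by_person.items() if profile <= tags)

flagged = match_profile(records, PROFILE)
print(flagged)  # ['p1']
```

Note that `p2`, who only bought a pressure cooker, is not flagged. The sketch also makes the experts' objection concrete: the result is only as good as the assumption that the chosen profile really distinguishes terrorists from everyone else.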

The NSA collects metadata on telephone calls. This metadata can reveal terrorists without delving into the content of the conversations themselves. Among millions of calls, certain patterns can be picked out, as the following scenario illustrates.

1. A call from a well-known organization in Saudi Arabia that supports terrorism goes to a cluster of possible accomplices.
2. A call from an organization known for its terrorist activities goes to a US citizen who has attracted the attention of the National Security Agency.
3. The metadata on the suspect's telephone calls reveals a cluster of accomplices in California.
4. Call detail records show that one of the accomplices in California is contacting someone in the Saudi Arabian cluster. The NSA draws the FBI's attention to this connection and obtains the right to wiretap this line.
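The scenario can be modeled as a small graph built only from who called whom. This is a minimal sketch under stated assumptions: the identifiers, the prefix convention ("SA-", "CA-") and the helper functions are all hypothetical, and the traversal is an ordinary depth-first search, not any method attributed to the NSA.

```python
from collections import defaultdict

# Hypothetical call metadata: (caller, callee). No call content is involved.
calls = [
    ("SA-org", "SA-1"), ("SA-1", "SA-2"),   # step 1: Saudi cluster
    ("SA-org", "US-suspect"),               # step 2: call to the suspect
    ("US-suspect", "CA-1"), ("CA-1", "CA-2"),  # step 3: California cluster
    ("CA-2", "SA-2"),                       # step 4: link back to the Saudi cluster
    ("X-1", "X-2"),                         # unrelated pair
]

def build_graph(calls):
    """Undirected contact graph: an edge means at least one call took place."""
    graph = defaultdict(set)
    for a, b in calls:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def reachable(graph, start):
    """Everyone connected to `start` through any chain of calls (depth-first search)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

def cross_links(calls, prefix_a, prefix_b):
    """Calls that connect a number in one cluster to a number in the other."""
    return [
        (a, b) for a, b in calls
        if (a.startswith(prefix_a) and b.startswith(prefix_b))
        or (a.startswith(prefix_b) and b.startswith(prefix_a))
    ]

graph = build_graph(calls)
network = reachable(graph, "SA-org")
links = cross_links(calls, "CA-", "SA-")
print(sorted(network))
print(links)  # [('CA-2', 'SA-2')]
```

The single cross-cluster call found at the end corresponds to step 4 of the scenario, the connection that would be handed to the FBI. The unrelated pair `X-1`/`X-2` never enters the network.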

The NSA is a major customer for software that works with large databases. One such program is called Accumulo. There is no direct evidence that it is used for surveillance of international communication systems, but it was created specifically to attach labels to billions of disparate pieces of data. This "secret weapon" of the security service, built on Google technology, is open source. This year the company Sqrrl brought the program to market, and it hopes to attract interest from the healthcare and finance sectors, which work with huge volumes of operational data.

The NSA has the right to monitor international communication channels, and it collects huge amounts of data: trillions of fragments of messages that people write around the world. The agency itself does not hunt the criminals, terrorists or spies identified through its work; it simply passes the information it obtains to other government services - the Pentagon, the FBI and the CIA. Further work follows this scheme. First, one of the 11 judges of the secret FISA court (FISA stands for the Foreign Intelligence Surveillance Act) receives a request from a government agency for permission to process certain data obtained by the NSA. Once permission is granted (and there are usually no problems with this), the request goes first to the FBI's Electronic Communications Surveillance Unit (ECSU). This step is meant to ensure legal correctness: FBI agents check the request and confirm that the subject of surveillance is not a US citizen. The ECSU forwards the same request to the FBI's Data Intercept Technology Unit, which obtains the information from Internet servers and passes it to the NSA to be run through the agency's data processing programs. (Many communications companies deny that their servers are open to the NSA. Federal officials, on the contrary, report that such cooperation takes place.) Finally, the NSA passes the relevant information to the government agency that made the original request.

What is the NSA plotting?

The NSA’s troubles began when Snowden revealed to the world that the US government was collecting metadata on the telephone calls of all Verizon customers, including millions of Americans. In response to an FBI request, FISA Judge Roger Vinson issued an order requiring Verizon to provide the FBI with the details of all telephone calls. The NSA calls this practice an “early warning system” that can detect terrorist activity.

Before the public could digest the information about metadata, Snowden followed up with a story about another line of NSA work, designated US-984XN. Each collection platform, each source of raw intelligence, receives its own designation - a SIGAD (Signals Intelligence Activity Designator) - and a code name. SIGAD US-984XN is best known by its most commonly used code name, PRISM. The PRISM system collects digital photos, stored and transmitted files, emails, chats, videos and video calls. This information is taken from nine leading Internet companies. The US government claims that these measures helped capture Khalid Ouazzani, a naturalized US citizen whom the FBI accuses of planning to blow up the New York Stock Exchange.

The diagrams published by Snowden show that the NSA uses, among other things, real-time tracking tools. Agency analysts can receive a notification when a user logs in to a service, sends an email or enters a particular chat.

The rapid growth of digital information has attracted the attention of both the private sector and public services. Processing these data streams is becoming a promising line of work.

In July, Snowden published a top-secret report describing software that can search hundreds of different databases. Snowden claims that these programs allow even the lowest-level analyst to intrude unchecked on other people's communications. The report gives examples: “My target speaks German, but is located in Pakistan. How can I find him?” Or: “My target uses Google Maps to scout target locations. Is it possible to use this information to determine his email address?” By asking one such question, the program can simultaneously search 700 servers scattered all over the world.

Where can this data go?

Dogs trained to search for explosives sometimes raise the alarm when there is no explosive nearby. Such an error, called a false positive, is common. Something similar happens in data collection, when a computer program picks up a suspicious combination of data and draws an erroneous conclusion from it. In such cases, the immense amount of information only increases the probability of error.
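A little arithmetic shows why volume makes false positives worse. The numbers below are hypothetical, picked only to make the effect visible: a population of one million people, ten real targets, and a detector that catches 99% of targets while wrongly flagging 1% of everyone else.

```python
def flagged_counts(population, true_targets, hit_rate, false_alarm_rate):
    """Expected true hits, false alarms, and the share of flags that are real."""
    true_hits = true_targets * hit_rate
    false_alarms = (population - true_targets) * false_alarm_rate
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision

# All four inputs are assumptions made for illustration.
true_hits, false_alarms, precision = flagged_counts(
    population=1_000_000,
    true_targets=10,
    hit_rate=0.99,
    false_alarm_rate=0.01,
)
print(f"true hits: {true_hits:.1f}")
print(f"false alarms: {false_alarms:.1f}")
print(f"share of flags that are real: {precision:.2%}")
```

Under these assumptions, roughly ten real hits are buried among about ten thousand false alarms, so only around one flag in a thousand points at a real target. Enlarging the population while the number of real targets stays small makes the ratio worse, not better.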

Have you ever wondered where the offers that arrive in your inbox from all sorts of companies come from? They are generated by an algorithm based on your own interests, which have left their mark on the Web. Targeted marketing is believed to increase sales.

In 2011, British researchers developed a game called "Bomb in the bus." The DScent program tracked down 60% of the players who were given the role of “terrorists.” It worked from recorded “purchases” and “visits” to a specific monitored site. The ability of a computer to automatically match video from security cameras against purchase records may look like the cherished dream of the law enforcement services that look after our security. But for civil liberties advocates, ubiquitous surveillance is a serious concern.

The article “Secrets of informational intrigues” was published in the journal Popular Mechanics (No. 11, November 2013).

