NSA honing sophisticated skills to mine its vast data
A revolution in software technology that allows for the highly automated and instantaneous analysis of enormous volumes of digital information has transformed the National Security Agency into the virtual landlord of the digital assets of Americans and foreigners alike.
The New York Times
WASHINGTON — When American analysts hunting terrorists sought new ways to comb through the troves of phone records, emails and other data piling up as digital communications exploded over the past decade, they turned to Silicon Valley computer experts who had developed complex equations to thwart Russian mobsters intent on credit-card fraud.
The partnership between the intelligence community and Palantir Technologies, a Palo Alto, Calif., company founded by a group of inventors from PayPal, is one of many the National Security Agency (NSA) and other agencies forged in recent years as they rushed to unlock the secrets of “Big Data.”
Today, a revolution in software technology that allows for the highly automated and instantaneous analysis of enormous volumes of digital information has transformed the NSA, turning it into the virtual landlord of the digital assets of Americans and foreigners alike. The new technology has, for the first time, given America’s spies the ability to track the activities and movements of people almost anywhere without actually watching them or listening to their conversations.
New disclosures that the NSA has secretly acquired the phone records of millions of Americans and access to emails, videos and other data of foreigners from nine U.S. Internet companies have provided a rare glimpse into the growing reach of the nation’s largest spy agency.
With little public debate, the NSA has been undergoing rapid expansion to exploit the mountains of new data being created each day. The government has poured billions of dollars into the agency over the last decade, building a 1-million-square-foot fortress in the mountains of Utah, apparently to store huge volumes of personal data indefinitely. It created intercept stations across the country, according to former industry and intelligence officials, and helped build one of the world’s fastest computers to crack the codes that protect information.
While once the flow of data across the Internet appeared too overwhelming for the NSA to keep up with, the revelations of the past few days suggest the agency’s abilities are far greater than most outsiders believed.
“Five years ago, I would have said they don’t have the capability to monitor a significant amount of Internet traffic,” said Herbert Lin, an expert in computer science and telecommunications at the National Research Council. Now, he said, it appears “that they are getting close to that goal.”
On Saturday, it became clear how close: Another NSA document, again cited by The Guardian, showed a “global heat map” that appeared to represent how much data the NSA sweeps up around the world. It showed that in March 2013, 97 billion pieces of data were collected from networks worldwide; about 14 percent came from Iran, much came from Pakistan and about 3 percent came from inside the United States, though some of that might have been foreign data traffic routed through U.S.-based servers.
Shift in focus
The agency’s ability to mine metadata, data about who is calling or emailing, has made wiretapping and eavesdropping on communications far less vital, according to data experts. That access to data from companies Americans depend on daily raises troubling questions about privacy and civil liberties that officials in Washington, D.C., have yet to address.
“American laws and American policy view the content of communications as the most private and the most valuable, but that is backward today,” said Marc Rotenberg, executive director of the Electronic Privacy Information Center, a D.C. group. “The information associated with communications today is often more significant than the communications itself, and the people who do the data mining know that.”
U.S. laws restrict wiretapping and eavesdropping on the content of the communications of U.S. citizens but offer little protection to the digital data thrown off by the telephone when a call is made. They offer virtually no protection to other forms of nonphone-related data, such as credit-card transactions.
Because of smartphones, tablets, social-media sites, email and other forms of digital communications, the world creates 2.5 quintillion bytes of new data daily, according to IBM. The computer giant estimates 90 percent of the data that now exists in the world has been created in just the past two years. From now until 2020, the digital universe is expected to double every two years, according to a study by the International Data Corp.
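To put the doubling estimate in concrete terms, the short sketch below works out the implied growth factor. It is illustrative arithmetic only: the function name is hypothetical, and the 2013 starting year is an assumption taken from the article's reference to March 2013, not a figure from the International Data Corp. study.

```python
def growth_factor(years, doubling_period_years=2.0):
    """Return how many times larger a quantity becomes after `years`
    if it doubles once every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


# Assumption: "now" is 2013, so the projection spans seven years.
START_YEAR = 2013
END_YEAR = 2020

factor = growth_factor(END_YEAR - START_YEAR)
print(f"Implied growth from {START_YEAR} to {END_YEAR}: about {factor:.1f}x")
```

Under those assumptions, a digital universe that doubles every two years would be roughly 11 times its 2013 size by 2020.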
Accompanying that growth has been rapid progress in the ability to manipulate the data. Just four data points about the location and time of a mobile phone call, a study published in Nature found, make it possible to identify the caller 95 percent of the time.
When President George W. Bush secretly began the NSA’s warrantless wiretapping program in October 2001, to listen in on the international telephone calls and emails of U.S. citizens without court approval, the program was accompanied by large-scale data mining.
Those secret programs prompted a showdown in March 2004 between Bush White House officials and a group of top Justice Department and FBI officials in the hospital room of John Ashcroft, then the attorney general. Justice Department lawyers who were willing to go along with warrantless wiretapping argued the data mining raised greater constitutional concerns.
In 2003, after a Pentagon plan to create a data-mining operation known as the Total Information Awareness program was disclosed, protest forced the Bush administration to back off. But since then, the intelligence community’s data-mining operations have grown enormously, according to experts.
“More and more services like Google and Facebook have become huge central repositories for information,” said Dan Auerbach, a technology analyst with the Electronic Frontier Foundation. “That’s created a pile of data that is an incredibly attractive target for law enforcement and intelligence agencies.”
Industry experts say that intelligence and law-enforcement agencies also use a new technology, trilateration, that allows tracking of an individual’s location, moment to moment, based on cellphone data. The data, obtained from cellphone towers, can track the altitude of a person, down to the specific floor in a building. There is software that exploits the cellphone data seeking to predict a person’s most likely route. “It is extreme Big Brother,” said an industry-data specialist.
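The geometry behind locating a phone from tower data can be sketched in a few lines. The example below is a simplified, two-dimensional illustration, not a description of any agency's or carrier's system: it assumes three towers at known coordinates and an exact distance from the phone to each (in practice such distances would have to be estimated, for instance from signal timing or strength, and would be noisy). The function name and coordinates are hypothetical, and altitude is not modeled.

```python
import math


def trilaterate(towers, distances):
    """Estimate a phone's (x, y) position from three tower positions
    and the phone's distance to each tower (2D, noise-free sketch)."""
    (x1, y1), (x2, y2), (x3, y3) = towers
    d1, d2, d3 = distances

    # Each tower defines a circle around itself. Subtracting the first
    # circle equation from the other two leaves two linear equations
    # of the form a*x + b*y = c.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2

    # Solve the 2x2 linear system with Cramer's rule.
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("towers are collinear; position is ambiguous")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Hypothetical towers (coordinates in kilometers) and a phone at (3, 4).
towers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
phone = (3.0, 4.0)
distances = [math.dist(phone, tower) for tower in towers]

estimate = trilaterate(towers, distances)
print(estimate)
```

With exact distances, the estimate matches the phone's true position up to floating-point rounding. Real measurements are imprecise, so practical systems would typically combine more towers and a least-squares fit rather than solve three circles exactly.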
Nothing revealed in recent days suggests NSA eavesdroppers have violated the law by targeting ordinary Americans. On Friday, President Obama defended the agency’s collection of phone records and other metadata, saying it did not involve listening to conversations or reading the content of emails. “Some of the hype we’ve been hearing over the past day or so — nobody has listened to the content of people’s phone calls,” Obama said.
Privacy advocates say a national debate must take place to come up with new rules to limit the intelligence community’s access to the mountains of data.
Rotenberg, referring to the constitutional limits on search and seizure, said, “It is a bit of a fantasy to think that the government can seize so much information without implicating the Fourth Amendment interests of American citizens.”