The NSA’s Attention Span: Widely Focused on the Narrow

When the power of a nation-state is directed at you, it brings resources that boggle the mind.  This applies even if it’s a minor power: Estonia, Hungary, and Cambodia all have their own capabilities and, while those capabilities are very small compared to some, your ability to hide from a country that makes you Priority One is limited.  Such countries have seasoned pros who are in all likelihood a lot better than you are, and the allies they call in when they need help are even more dangerous to you.

But of all the agencies, the National Security Agency possesses perhaps the most impressive capability for finding information on the planet.  This comes largely from being funded at a level that completely dwarfs its counterparts in every other nation (the NSA’s actual budget is classified, but it is believed to have received at least $10 billion and perhaps as much as $20 billion in the 2012-13 intelligence community budget) and having access to an array of locations and technologies that few if any other nations possess.  Many of its listening posts (not including temporary posts on ships, in aircraft, and set up in vehicles or shacks) are known even if exactly what each does is not, and their presence around the world shows the reach the NSA has through US allies.  Its technological edge includes supercomputers, interception methods, and hacking capabilities that render most defenses nearly moot.

The previous article discussed the difficulties associated with encryption, both in getting it right and in circumventing it by accessing the data via other means when it’s not encrypted.  In short, it requires some very careful planning to make sure that your implementation is as solid as it can be from both a technical and an operational perspective, and this is where most people fail.

This is not to say that encryption is useless.  Far from it.  If you’re trying to secure information from competitors, random attackers, or other enemies, it’s one of the best tools available.  Even if you’re doing something that a national agency doesn’t want you doing, it’s better to encrypt than to not, if possible and practical.  And there are ways to give even the most powerful adversary a headache.  But if you come under the scrutiny of the NSA, it becomes exceptionally difficult to effectively hide the contents of the message unless you take very specific precautions and you do it without failure every single time.

From this rises the second question from the last article: how do you avoid the NSA if they’re looking for you?  This turns out to be extraordinarily difficult, not only because of the NSA’s reach into the world’s communications but also because of the legal framework in which the NSA operates.  We’ll start by looking at how far, and with how much difficulty, the NSA can actually look.

And So Faintly You Came Tapping…

To get a look at someone else’s communications, you have to be able to see them to begin with.  Sometimes, this is easy.  A lot of communications are sent by radio, and that’s trivial to pick up.  Even factoring in some of the listening stations that the NSA has around the world, the concept is the same: find a place where you can collect the communications, put a receiver there, and start recording.  This sometimes requires special gear or conditions, but ultimately isn’t conceptually much different from intercepting a WiFi signal.  But even in this age where mobile devices far outsell wired devices, a lot of communication is still sent over the wire, whether copper or fiber-optic cable.  This is much more secure for various reasons, mostly because intercepting anything requires proximity or contact, but also because a cut in the connection or a change in the signal strength may indicate an attempt to tap the line.

The NSA has its secret abilities there, too, of course, including an ability to tap those lines generally considered secure, even by foreign intelligence agencies.  One of the rumored abilities of the USS Jimmy Carter, a specially-modified Seawolf-class attack submarine, is to surreptitiously tap into undersea fiber-optic lines for the NSA.  Such an ability might be extremely difficult to counter as it would entail manually inspecting cables that run the width of seas and even oceans, an expensive and arduous task that, if the NSA discovered the work and removed the splice, could ultimately be fruitless anyway.

Of course, it’s much easier when there are cooperative entities, and no shortage exists there.  Take, for instance, Room 641A at the AT&T office in San Francisco, revealed to the world by Mark Klein in 2006.  With AT&T’s consent, traffic from the nearby fiber-optic cables is split so that equipment in the room can read it.  The extent of the monitoring there is uncertain, but one can make guesses based on the equipment alleged to be there (the Narus STA 6400 back then, probably something much faster now).  This isn’t unique to AT&T, either; it’s been reported that every single major US telecommunications company has these rooms somewhere in their network, usually at more than one place.  Some foreign-owned companies with US divisions (and perhaps some without) have also signed on, as evidenced by the news that Global Crossing agreed to allow NSA intercepts with notice even though it was in the process of being purchased by an Asian company.  (The deal fell through for other regulatory reasons.)

At one extreme, these rooms could be explained as potential intercept points instead of constant ones.  Indeed, that’s how the agency originally characterized them, in its roundabout way of not quite discussing them.  The Global Crossing agreement was explained as exactly that because the new owners didn’t want a permanent NSA presence.

But this brings about a very important point to consider.  It’s still unclear just how much traffic is being intercepted, where, and to what depth.  It’s dizzying to think of how much storage is required to tuck away all of the traffic that crosses just the communications lines in the United States.  An estimate from Cisco puts the average Internet traffic in North America alone at 130 terabits per second at the moment, or about 40.6 exabytes a month.  An exabyte is a million terabytes.  Keeping all of that would strain the entire storage industry.  In 2012, a little over 352 million PCs and notebooks were shipped around the world.  Even if the average storage available was 1TB per system (probably too high), every drive in that year’s shipments combined would hold fewer than nine months of this traffic.  Even counting storage shipped for servers and add-on drives and some monumental deduplication capabilities, retaining everything year after year would mean absorbing much of the world’s storage output, and the capacity isn’t there unless the NSA has figured out a storage technology so far beyond even the concepts on the distant drawing board that it may as well be magic.
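
The arithmetic can be checked in a few lines.  This is a minimal sketch, assuming decimal (SI) units, where an exabyte is 10^18 bytes or one million terabytes, and a 30-day month; the function and variable names are illustrative only.  The 130 terabit figure and the 352 million PCs at 1TB each are the numbers quoted above.

```python
# Back-of-the-envelope check of the storage figures quoted in the text.
# Decimal (SI) units: 1 TB = 10**12 bytes, 1 EB = 10**18 bytes.
TB = 10**12
EB = 10**18

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def monthly_traffic_bytes(terabits_per_second: float) -> float:
    """Bytes moved in a 30-day month at a sustained rate given in Tbit/s."""
    bytes_per_second = terabits_per_second * 10**12 / 8
    return bytes_per_second * SECONDS_PER_MONTH


traffic = monthly_traffic_bytes(130)   # Cisco's estimate for North America
pc_storage = 352_000_000 * 1 * TB      # 352 million PCs at 1 TB each

print(f"Monthly traffic:          {traffic / EB:.1f} EB")
print(f"One year of PC storage:   {pc_storage / EB:.0f} EB")
print(f"Months of traffic it holds: {pc_storage / traffic:.1f}")
```

With a 30-day month, 130 terabits per second works out to roughly 42 exabytes, in the same range as the 40.6 quoted; the small gap presumably comes from rounding in the source figures.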

A Needle Hiding in a Haystack Surrounded by Magnets

Enter the Narus STA 6400.  The “STA” portion stands for “Semantic Traffic Analyzer” and the devices are intended to sift through vast quantities of data looking for keywords in context.  That’s why sending a benign message that ends in “P.S.: dirty bomb” or posting on a forum using a signature like, “ECHELON? Isn’t that where the government searches for words like bomb, plutonium, assassinate, and anarchy?” isn’t likely to garner any attention.  When such a keyword is combined with other information in the body, perhaps including the name of a city or a landmark and other potential clues, the message might be flagged for further analysis, probably initially by a bigger computer somewhere in the NSA.  From a legal perspective and setting aside questions of the legitimacy of the FISA courts, if either the sender or the recipient is determined to be a US citizen, anything further requires a warrant.
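
To make the idea of “keywords in context” concrete, here is a toy sketch.  It is purely illustrative: how the Narus equipment actually works is not public, and the word lists, the five-word window, and the function name are all hypothetical.  It only shows why a bare keyword in a signature scores nothing while the same keyword surrounded by a place and a time does.

```python
import re

# Hypothetical word lists, chosen only for this illustration.
WATCH_TERMS = {"bomb", "plutonium"}
CONTEXT_TERMS = {"chicago", "bridge", "tomorrow", "deliver"}


def context_score(message: str, window: int = 5) -> int:
    """Count context terms found within `window` words of any watch term."""
    words = re.findall(r"[a-z']+", message.lower())
    score = 0
    for i, word in enumerate(words):
        if word in WATCH_TERMS:
            nearby = words[max(0, i - window): i + window + 1]
            score += sum(1 for w in nearby if w in CONTEXT_TERMS)
    return score


benign = "See you at lunch. P.S.: dirty bomb"
worrying = "Deliver the bomb to the Chicago bridge tomorrow"

print(context_score(benign))    # keyword present, but no supporting context
print(context_score(worrying))  # keyword plus place, target, and time
```

A real system would presumably weigh far more than word proximity, but the principle is the same: the keyword alone is not the signal, the surrounding context is.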

There have been tremendous advances in semantic understanding by machines in the last decade.  IBM’s Watson used it to win at Jeopardy.  It’s what allows systems to (theoretically) figure out what you’re saying when potentially conflicting syllables exist.  You might want to send a message to your spouse expressing affection.  The system can take what it hears and, using semantic context, know that you’re saying, “I love you” instead of saying, “Olive juice.”  Similarly, a message relayed to the store might ask for olive juice instead of sending a creepy message to a random product grabber.  As we send more of these messages through Apple’s Siri and Google Now and other services and we rate how they translate our speech into text, they get better and better.  Neural networks continue to improve as they consume more text and speech and build the understanding required to interpret the language more naturally.  At some point, speaking to a computer will be almost as normal as typing to it.

Even with all this, though, the odds of the NSA focusing on you are very small.  The fact of the matter is that even if you think that all of the NSA’s employees and contractors are analysts reading or listening to messages (they’re not–even the NSA needs janitors, and they all have security clearances), the idea that they could read through more than an infinitesimal fraction of 40 million terabytes a month is ludicrous.  If there were 200,000 analysts and each could reasonably sift through even 3MB per day (the size of War and Peace in raw text), that’s 600GB per day or about 18TB per month.  Yes, there are ways to reduce the overall load, but in order to review everything, it would have to be reduced by a factor of approximately 2,000,000:1.  Unless you communicate with people on a watch list and you use certain terms and you use them in a certain context, even if your message is deemed worthy of further review, the odds of any human other than the recipient looking at it are effectively nil.
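
The same kind of quick check works for the analysts.  This is a minimal sketch using the figures from the paragraph above (200,000 analysts, 3MB each per day, 40.6 exabytes of traffic a month) and assuming decimal units, where an exabyte is 10^18 bytes or one million terabytes, and a 30-day month; the names are illustrative only.

```python
MB = 10**6
TB = 10**12
EB = 10**18


def review_capacity_bytes(analysts: int, bytes_per_day: int, days: int = 30) -> int:
    """Total bytes a workforce could read in `days` days."""
    return analysts * bytes_per_day * days


capacity = review_capacity_bytes(200_000, 3 * MB)  # readable per month
traffic = 40.6 * EB                                # monthly traffic from the text
reduction = traffic / capacity                     # filtering needed to read it all

print(f"Readable per month: {capacity / TB:.0f} TB")
print(f"Required reduction: about {reduction:,.0f} to 1")
```

Everything beyond that sliver has to be discarded or triaged by machines before a person ever sees it.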

That’s not to say that all of this is moral, ethical, or legal.  Those are questions beyond the scope of this article.  But from a technical perspective, worrying about the NSA capturing your particular escapades of questionable morality is probably not worth the time or hair loss unless you’re planning to blow something up.

It’s All Meta.  Even the Meta is Meta.

But it doesn’t take a fully-intercepted, decrypted message to start getting concerned when there’s a surge of encrypted messages going between suspected terrorists.  That’s where metadata comes in.  Metadata is anything other than the actual contents of a communication: participants, time, length, locations, headers, and so forth.  It can even include something that appears in the body of the message that does not betray the contents of the message, such as the header and footer usually included in a PGP-encrypted message or even just the fact that the message was encrypted or otherwise unreadable.
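
As an illustration of how much survives encryption, the sketch below pulls metadata out of a PGP-encrypted e-mail without touching the contents.  It uses Python’s standard `email` module and assumes a simple single-part message; the sample addresses, the placeholder payload (not real ciphertext), and the `extract_metadata` helper are made up for the example.

```python
from email import message_from_string

PGP_HEADER = "-----BEGIN PGP MESSAGE-----"
PGP_FOOTER = "-----END PGP MESSAGE-----"

# Made-up sample; the body is a placeholder, not real ciphertext.
SAMPLE = (
    "From: alice@example.org\n"
    "To: bob@example.net\n"
    "Date: Mon, 01 Jul 2013 09:30:00 +0000\n"
    "Subject: hello\n"
    "\n"
    "-----BEGIN PGP MESSAGE-----\n"
    "\n"
    "hQEMA0000000000000\n"
    "-----END PGP MESSAGE-----\n"
)


def extract_metadata(raw: str) -> dict:
    """Collect what an observer learns without decrypting anything."""
    msg = message_from_string(raw)
    body = msg.get_payload()
    if not isinstance(body, str):   # multipart messages are out of scope here
        body = ""
    return {
        "from": msg["From"],
        "to": msg["To"],
        "date": msg["Date"],
        "size": len(body),
        "looks_pgp_encrypted": PGP_HEADER in body and PGP_FOOTER in body,
    }


metadata = extract_metadata(SAMPLE)
for field, value in metadata.items():
    print(f"{field}: {value}")
```

Who wrote to whom, when, how much, and the fact that it was encrypted are all readable even though the message itself is not.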

There are many mechanisms that one can use to try to avoid surveillance.  Encryption has been discussed here, as have the problems with it.  There are also anonymous proxies and the TOR project.  For those who are unfamiliar, the TOR project uses a series of three nodes in between the client and server.  Each node knows the point immediately before and after, but not anything else.  The idea is to use it to route around interception and to anonymize the source of the connection.  TOR nodes are present around the world, run by volunteers as well as groups that see a need for such anonymity.  It does work–it’s very popular in places like China, Syria, and Iran because it helps get around their national firewalls and, when combined with encryption, their domestic surveillance programs.  And it can almost certainly be used to get around some domestic surveillance in the United States.
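
The layering that keeps each node ignorant of everything but its neighbors can be sketched in a few lines.  This is a toy model only, not the real protocol: the XOR-with-hash “cipher” is insecure and stands in for real encryption, the keys are hard-coded instead of being negotiated with each node, and all names (`toy_cipher`, `build_onion`, `peel`, `server.example`) are hypothetical.

```python
import hashlib
import json


def _keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a hash-derived keystream. NOT secure; encrypt == decrypt."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def build_onion(route, destination: str, payload: bytes) -> bytes:
    """Wrap `payload` in one layer per node, innermost layer for the exit."""
    next_hop, blob = destination, payload
    for name, key in reversed(route):
        layer = json.dumps({"next": next_hop, "data": blob.hex()}).encode()
        blob = toy_cipher(key, layer)
        next_hop = name
    return blob


def peel(key: bytes, blob: bytes):
    """What a single node does: remove its own layer, learn only the next hop."""
    layer = json.loads(toy_cipher(key, blob))
    return layer["next"], bytes.fromhex(layer["data"])


# Hypothetical three-node route with hard-coded keys.
route = [("entry", b"key-1"), ("middle", b"key-2"), ("exit", b"key-3")]
onion = build_onion(route, "server.example", b"hello")

hops = []
blob = onion
for name, key in route:
    next_hop, blob = peel(key, blob)
    hops.append((name, next_hop))
    print(f"{name} forwards to {next_hop}")

print(blob)  # only after the exit node's layer is removed is the payload visible
```

The entry node sees who you are but only an opaque blob addressed to the middle node; the exit node sees the destination but not where the traffic originated.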

But, assuming their claims are accurate, look at the legal framework around the NSA programs.  They can’t spy on Americans without a warrant, but they can legally spy on foreigners without a warrant.  How do they determine if one of the communicants is an American?  They check the name, e-mail address, IP address, and other metadata against domestic databases.  By encrypting your message, hiding your identity, and routing through nodes that may be in another country, and thereby never showing up in any of those databases, you’re setting yourself up for a higher chance of capture and review because they can’t confirm that you’re an American.  This is not an adversarial system: you don’t get to prove that your communications should be private because you don’t know that they’re capturing them.  Even if you did know, how eager would you be to tell them your pseudonym and that you’re anonymizing your traffic for reasons you don’t want to disclose?

Reality: Always Making Life Difficult

I don’t mean for any of this to sound like encryption, anonymity, or TOR are useless.  They’re not, and they provide some very powerful services for both good and evil.  I also don’t mean to sound like I’m trivializing the civil rights aspects of ongoing surveillance.  I’m absolutely not, and based on the Snowden revelations, something needs to change, though what should change is for another article.

But before you use encryption or proxies or TOR for anything serious, you must understand their limitations.  Traffic exiting a foreign node while destined for a foreign node is going to gather more attention, especially if it passes through US-based circuits.  Encrypting a message is going to gather attention unless and until it can be shown that it’s to or from an American citizen or legal resident (and even then perhaps).  Using a pseudonym carefully constructed to hide your true identity basically guarantees the NSA’s authority to look at the message.

Even if you own your own server, the trust you must place in all of the traffic running from your computer through that server to the next server and into the recipient’s system is enormous, and even the best can falter.  Renting a server means trusting the hosting provider, and even other countries work with the NSA when the agency needs physical access to systems.  These are the reasons that many groups that don’t want to be tracked have reverted to using systems without network connections to encrypt messages that are then physically passed by courier either on a USB drive that can be smashed or on paper that can be burned or eaten.

The root problem is the allocation of resources, and it’s the same in any attacker/defender scenario.  Defending against your neighbor is probably trivial (presuming your neighbor doesn’t work for a TLA), but as the attacker’s resources move up from there, defense gets more difficult.  Defending against, say, Anonymous is tricky.  Moving up to something like the FBI, it gets much harder because capabilities grow rapidly.  Once you’re up against a full-fledged state intelligence agency–the NSA, the Russian Spetssviaz, even the Swedish FRA–there are realities that you must accept, including that you are vastly outgunned if they start to focus on you.  The best that you can hope for at that point is strict operational diligence and a lot of luck, because perfection is an ideal, not reality.
