The world recently found out that Tor is being targeted by the NSA. What I’ve suspected for a long time now has some data and technical details to confirm it. We need to learn more. A few random thoughts pop into my head as I read the details.
# The Engineering Task at Hand
The first is derived from years as a network admin: the NSA has some pretty primitive systems. Twenty years ago, I had more details on traffic flows and content running across my own networks than the NSA seems to grab now from the Internet. Granted, they are grabbing data across the entire Internet, at scale, and trying to keep a high signal-to-noise ratio. In 1995, I built what I called a “network flight recorder” for an employer. The goal was to be able to store a few hours of all network traffic in a buffer and be able to replay the buffer when needed. We used it for troubleshooting issues and investigating possible network-level attacks. It took a surprising amount of time and resources to do this, especially as we moved to Fast Ethernet and faster frame relay links between sites. Now imagine doing this in an environment 10x faster and vastly more diverse. Ultimately, the system was shut down because the cost to maintain it in both human and computing resources couldn’t be justified.
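To make the flight recorder idea concrete, here is a minimal Python sketch of a rolling buffer that keeps only the most recent window of traffic and can replay a slice of it. This is not the 1995 system; the class name `FlightRecorder`, its methods, and the `window_seconds` parameter are all hypothetical, and frames are handed in with a timestamp rather than captured from a real interface.

```python
from collections import deque
from typing import Deque, Iterator, Tuple

# (capture timestamp in seconds, raw frame bytes)
Frame = Tuple[float, bytes]


class FlightRecorder:
    """Hypothetical rolling buffer holding the last `window_seconds` of frames."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._frames: Deque[Frame] = deque()

    def record(self, timestamp: float, frame: bytes) -> None:
        """Append a frame, then drop frames older than the retention window."""
        self._frames.append((timestamp, frame))
        cutoff = timestamp - self.window_seconds
        while self._frames and self._frames[0][0] < cutoff:
            self._frames.popleft()

    def replay(self, start: float, end: float) -> Iterator[Frame]:
        """Yield buffered frames captured between start and end, oldest first."""
        for timestamp, frame in self._frames:
            if start <= timestamp <= end:
                yield timestamp, frame

    def stored_bytes(self) -> int:
        """Total payload bytes currently held in the buffer."""
        return sum(len(frame) for _, frame in self._frames)


if __name__ == "__main__":
    # Assumed three-hour window; timestamps assumed to arrive in order.
    recorder = FlightRecorder(window_seconds=3 * 3600)
    recorder.record(0.0, b"old frame")
    recorder.record(3600.0, b"frame a")
    recorder.record(4 * 3600.0, b"frame b")  # pushes the t=0 frame out
    for ts, frame in recorder.replay(0.0, 5 * 3600.0):
        print(ts, frame)
```

Even this toy version hints at the cost problem: at a fully saturated 100 Mbit/s Fast Ethernet link, a three-hour window works out to roughly 135 GB, before any indexing or analysis.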
The second is how low a return on investment they must be getting. Spending hundreds of millions of dollars on a program for some minimal effect is pretty spectacular. I don’t know of any business model where such an upside-down ROI is sustainable for very long.
# The Psychological
Should I change my behavior or not? From a young age, when playing with network sniffers, I knew someone, somewhere could record my traffic, save it, and analyze it all. Having built systems which record and analyze all network traffic, as well as spending time building systems to better learn the user behavior on our websites and email lists, I knew what was possible at scale. Generally, I used a combination of Tor, a private VPN I run myself, and ssh tunnels. I never sent data unencrypted or in the raw.
I started volunteering for Tor in 2003, roughly. I joined Tor full time in 2009. The slides and data released by Snowden are from around this 2008-2009 time period. Likely, this means that since day 1 as a full-time employee of Tor, I was being stalked by the NSA. I already used Tor a lot, but as I was the person building all of the software packages for download, I took extra precautions. I set up build machines in my house. I blocked all incoming traffic. They were only accessible from inside the private network while connected to the right VLAN. I checked package signatures all the time, disabled Internet access for the VLAN, and generally built software like a paranoid. There’s an endless rabbit hole here: when do you trust software? When do you trust the switch software running the VLAN? When do you fully disconnect the network from the Internet and only use purely standalone machines? What about RF interference and influence? How do you verify the hardware hasn’t been tampered with? How far down the rabbit hole shall we go?
The point of all of this detail is that it lays out my mental state on some scale of paranoia. It perhaps seems silly, but I felt it was justified. Friends told me I was crazy. Maybe I was, maybe I still am.
Can I beat the NSA? Hardly. In order to live and prosper in the modern world I have to give up bits of information about my daily life, all of which can be tracked and traded by anyone listening. Do I believe the NSA should be building a massive dossier on everyone in infinitesimal detail? No. I worry that while the contemporary NSA is dutifully saving the data and locking it away, future NSAs and governments may use my past and current data against me. I have four main concerns:
1. Criminal Abuse. With such a massive dataset, some criminal somewhere will crack it. This data is the modern-day buried treasure.
2. Institutional Abuse. Bad policies, poor enforcement, and an unethical administration are all it takes to turn the data into a weapon against me.
3. Personnel Abuse. Employees, contractors, and others given access to this data will search for their friends, family, ex-partners, and the like.
4. Discrimination. Who knows how my data looks now, or will look in the future? Maybe some service I used in the past becomes horribly maligned, and future biases and judgments turn that past usage into a liability.
# Action Items
What to do?
I don’t know. Everything is broken. There probably isn’t one magic bullet solution.
Putting more oversight into an industry already full of oversight sounds great, but is probably not going to work. Without effective enforcement of that oversight, it’s just more forms for the agency to complete and automate. The intelligence agencies are probably under the most surveillance ever. And yet, this situation still occurs.
Encrypt all the things. Encryption is pretty hard to get right. Getting users to make smart decisions by understanding threat models is difficult. I think making software which correctly uses encryption and helps users understand the concepts and make smarter decisions will just take time and iterations. Luckily, a number of projects have started over the past year to address this. I worry that by turning encryption into a panacea, we will just push law enforcement and intelligence agencies to lobby for weak encryption, backdoor access, or flat out make it illegal.
Defunding the NSA won’t work. Agencies never die, they just rename. We need intelligence agencies in the broader scope, though hopefully ones focused only on actual threats backed by human intelligence work, not automated mass sweeping of data. Of course this gets back to the original problems with oversight, mission creep, etc. Old-fashioned police and intelligence work still works very well.
Over the past year, as more and more Snowden documents have been released, I have half-jokingly said aloud, “I know you’re listening.” I wonder what my analyst or analysts at various agencies think and wonder about me. I wonder if they know me better than I know myself. Can they better predict my future actions than I? I wonder if I’ll ever meet them, or if I’m just some name in a database list with an attached dataset a human never sees.