Twitter’s former head of security, Peiter “Mudge” Zatko, has filed a whistleblower complaint against Twitter for, among other things, poor internal security and software development practices.
There are a number of alarming allegations: that Twitter employs known government agents who can spy on protestors; that the executive team knew about security and privacy issues and repeatedly did nothing; and that there was no understanding or documentation of the entire infrastructure, which could result in catastrophic data center failures that would take the entire service permanently offline.
But some of the most interesting information concerns the handling of user data, and this behind-the-curtain look at one company’s approach to managing user relationships holds lessons for every business.
Let’s peel the curtain back and look at Twitter’s user data issues.
Purpose and use violations: Users have a right to control how their data is used by the companies with which they have a relationship. Twitter collected phone numbers and email addresses from users for the express purpose of improving the safety and security of their accounts. This is pretty common: two-factor authentication often uses phone numbers, and an email address can be used to send magic links, password resets, and alerts of suspicious activity. But Twitter also used this data for targeted advertising without its users’ consent. It did the same thing with security cookies, which were ostensibly placed on the user’s browser to improve security but were then misused for marketing purposes without permission. This violates a number of current laws, including GDPR and CCPA, and is also simply an unethical business practice.
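One way to make purpose limits hard to bypass is to store the consented purposes alongside the data and check them on every read. The following is only a minimal sketch of that idea, not a description of any system Twitter ran; the names `Purpose`, `ContactRecord`, and `PurposeViolation` are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Purpose(Enum):
    SECURITY = "security"
    ADVERTISING = "advertising"


class PurposeViolation(Exception):
    """Raised when data is requested for a purpose the user never agreed to."""


@dataclass(frozen=True)
class ContactRecord:
    user_id: str
    phone_number: str
    # Purposes the user explicitly consented to when handing over the data.
    allowed_purposes: frozenset = field(default_factory=frozenset)

    def read_phone(self, purpose: Purpose) -> str:
        # Deny by default: a purpose that was never granted is refused.
        if purpose not in self.allowed_purposes:
            raise PurposeViolation(
                f"{purpose.value} is not a consented purpose for user {self.user_id}"
            )
        return self.phone_number


# Hypothetical record: the phone number was collected for security only.
record = ContactRecord("u1", "+1-555-0100", frozenset({Purpose.SECURITY}))
security_value = record.read_phone(Purpose.SECURITY)  # allowed

try:
    record.read_phone(Purpose.ADVERTISING)
    advertising_blocked = False
except PurposeViolation:
    advertising_blocked = True  # the ad system never sees the number
```

The design choice here is that the caller must state why it wants the data, so a marketing pipeline cannot quietly reuse a field that was collected for account security.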
Un-managed data assets: Twitter creates a lot of data, more than 12 TB per day. Unfortunately, only about 20% of its data assets are registered and managed, which means the company doesn’t even know what 80% of its data is. Even worse, it doesn’t know where that data is! Data assets are spread across multiple geos, without proper redundancy or backups, and most of the systems are not completely documented. Worst of all, this lack of indexing and tagging makes it impossible to redact and delete data according to best practices, laws, or customer requirements. Troves of data assets that should have been deleted a decade ago could still be found in many hidden corners of the infrastructure. Which would be bad but not catastrophic except…
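To make the idea of registered and managed data assets concrete, here is a rough sketch of a registry that records where each asset lives, who owns it, and how long it may be kept. Everything in it is an assumption made for illustration: the `DataAsset` fields, the asset names, the locations, and the retention periods are invented, not taken from the complaint.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataAsset:
    name: str
    location: str        # hypothetical region label
    owner: str           # team accountable for the asset
    created: date
    retention_days: int  # how long the data may be kept


def due_for_deletion(registry, today):
    """Return registered assets whose retention period has expired."""
    return [
        asset for asset in registry
        if asset.created + timedelta(days=asset.retention_days) < today
    ]


def unregistered(discovered_names, registry):
    """Return names found in the infrastructure that the registry does not know."""
    registered = {asset.name for asset in registry}
    return sorted(set(discovered_names) - registered)


# Invented example data.
registry = [
    DataAsset("login_events", "us-east", "security-team", date(2012, 1, 1), 365),
    DataAsset("profile_photos", "eu-west", "media-team", date(2022, 6, 1), 3650),
]
today = date(2022, 9, 1)

expired = [asset.name for asset in due_for_deletion(registry, today)]
unknown = unregistered(["login_events", "old_dm_dump", "profile_photos"], registry)
# expired == ["login_events"], unknown == ["old_dm_dump"]
```

Deletion and redaction only become possible once both lists can be produced: the first says what should already be gone, and the second says what nobody has claimed responsibility for.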
Rampant access control issues: More than half of Twitter’s 11,000 employees have access to production systems and data assets that they should not have. In large part, this was driven by a lack of development and test environments, which meant that engineers could only deploy directly to production. Twitter is full of smart engineers who know that this is not a good way to develop software, but it seems that they never had the leadership to address and remediate these issues. With so many people able to access so much sensitive data, it’s not surprising that the Twitter accounts of Joe Biden, Donald Trump, and other high-profile people were hacked. If a hacker got login credentials from an employee, there was roughly a 50% chance the hacker would have inappropriate access. In fact, 90% of Twitter’s actual security breaches stemmed from these access control issues.
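The usual remedy for this kind of over-provisioning is least privilege: each role gets an explicit list of permissions, everything else is denied, and an audit flags anyone holding more than their role allows. The sketch below illustrates that approach under stated assumptions; the role names, permission strings, and employee data are all hypothetical.

```python
# Hypothetical role-to-permission grants; anything not listed is denied.
ROLE_GRANTS = {
    "sre": {"prod:deploy", "prod:logs"},
    "support": {"prod:account-lookup"},
    "engineer": {"staging:deploy", "staging:logs"},
}


def is_allowed(role, permission):
    """Default-deny check: unknown roles and unlisted permissions are refused."""
    return permission in ROLE_GRANTS.get(role, set())


def excess_access(employees):
    """Map each employee to the permissions they hold beyond their role's grants.

    `employees` maps a name to a (role, held_permissions) pair.
    Employees with no excess permissions are omitted from the result.
    """
    report = {}
    for name, (role, held) in employees.items():
        extra = held - ROLE_GRANTS.get(role, set())
        if extra:
            report[name] = sorted(extra)
    return report


# Invented audit input.
employees = {
    "alice": ("sre", {"prod:deploy", "prod:logs"}),
    "bob": ("engineer", {"staging:deploy", "prod:deploy", "prod:account-lookup"}),
}

audit = excess_access(employees)
# audit == {"bob": ["prod:account-lookup", "prod:deploy"]}
```

Note that the engineer role only works if staging environments exist to deploy to, which is exactly the gap the complaint describes.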
No encryption at rest: Twitter infrastructure is inconsistent around the globe. In particular, many of their servers are out of date and unpatched. With older versions of server software, a large portion of their infrastructure did not have the technical capability to store encrypted data. This basic security flaw means that every unauthorized access event exposes actual, real data to the bad actor.
No SDLC: Perhaps most shockingly, the vast majority of the business is run without a software development life cycle (SDLC). Mudge estimated that only 8-12% of projects actually used an SDLC; the rest had a “template” that was not actively being used. This means the risk of introducing a regression is extremely high, there is no way to trace and back-test problematic changes, and there is no way to roll back. In today’s age of CI/CD, with automated smoke testing, it’s truly shocking.
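For readers unfamiliar with the smoke-test-and-roll-back pattern mentioned above, here is a deliberately simplified sketch. It is an illustration under assumptions, not a real pipeline: the `release` function, the version strings, and the two placeholder checks are invented, and a real smoke test would probe a deployed build rather than inspect a version string.

```python
from typing import Callable, List, Tuple

SmokeTest = Callable[[str], bool]


def release(current: str, candidate: str,
            smoke_tests: List[SmokeTest]) -> Tuple[str, List[str]]:
    """Return the version left running and the names of any failed smoke tests.

    If any test fails, the previous version stays in place (the rollback).
    """
    failed = [test.__name__ for test in smoke_tests if not test(candidate)]
    if failed:
        return current, failed
    return candidate, []


def homepage_loads(version: str) -> bool:
    # Placeholder: always passes in this sketch.
    return True


def login_works(version: str) -> bool:
    # Placeholder: pretend one specific build has a broken login flow.
    return version != "2.0.0-broken"


checks = [homepage_loads, login_works]
good = release("1.9.0", "2.0.0", checks)        # ("2.0.0", [])
bad = release("1.9.0", "2.0.0-broken", checks)  # ("1.9.0", ["login_works"])
```

Because the function returns the failed test names along with the surviving version, the problematic change is traceable as well as reversible.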
One of the saddest parts of this cautionary tale is that Twitter had already been investigated and fined by the FTC for deceiving consumers and failing to safeguard their personal information. In the resulting consent decree, signed in 2011, Twitter agreed to make many of the changes required to improve its service. Apparently, it did not.
The rub is that for most businesses, this type of blatant and ongoing violation of security, user rights, and privacy expectations would simply be fatal. If it were disclosed, customers would flee and the offending services would be excised from the larger company. The reputational risk for most companies is just too high.
Throughout the 2010s, customers didn’t apply the same standards to Twitter. Most Twitter users were laissez-faire about their data being used inappropriately or, if pressed, would declare that being able to tweet was not worth giving up for theoretical user protections. Besides, the thinking went, every other online company is just as bad or worse about protecting my data. I don’t like it, but what am I going to do about it?
In today’s environment, however, that lackadaisical attitude no longer exists, and Twitter’s user numbers show it. Each of these services has more active users than Twitter: TikTok, Snapchat, Discord, WhatsApp, Instagram, and WeChat. All of them are younger than Twitter.
Fortunately, most engineering teams are more organized than Twitter’s, and most executive teams are more supportive of security and privacy for their customers. But even in better organizations, managing dynamic technical changes is difficult. Most teams have a mindset of continuous improvement. It seems that Twitter did not.