Wednesday, October 15, 2014

Adobe, Privacy and the Big Yellow Taxi

Here's the most important thing to understand about privacy on the Internet: Google doesn't know your password. The FBI can't march into Sergey Brin's office and threaten to put him in jail unless he tells them your password (if it thinks you're making WMDs), because it wouldn't do them any good. If Google could produce your password, it would be a sign either of gross incompetence or of the ill-considered choice of your cat's name, "mittens", as your password.

Because Google's engineers are at least moderately competent, they don't store your password anywhere.  Instead, they salt it and hash it. The next time they ask you for your password, they salt it and hash it again and see if the result is the same as the hash they've saved. It would be easier for Jimmy Dean to make a pig from a sausage than it would be to get the password from its hash. And that's how the privacy of your password is constructed.
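For the curious, here's roughly what salt-and-hash looks like in code. This is a minimal sketch using Python's standard library, not a description of what Google actually runs; the function names, the choice of PBKDF2 and the iteration count are all my own assumptions for illustration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative work factor (an assumption); real systems tune this to their hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Salt and hash the attempt again, then compare with the saved hash."""
    _, candidate = hash_password(password, salt)
    # Constant-time comparison, so timing doesn't leak how many bytes matched.
    return hmac.compare_digest(candidate, stored)


# At signup the site saves only (salt, stored); the password itself is discarded.
salt, stored = hash_password("mittens")
print(verify_password("mittens", salt, stored))   # the right password checks out
print(verify_password("whiskers", salt, stored))  # a wrong one does not
```

The salt is what keeps two users who both picked "mittens" from ending up with the same stored hash. It does nothing to rescue "mittens" from being guessed outright.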

Using similar techniques, Apple is able to build strong privacy into the latest version of iOS, and despite short-sighted espio-nostalgia from the likes of James Comey, strong privacy is both essential and achievable for many types of data. I would include reading data in that category. Comey's arguments could easily apply to ebook reading data. After all, libraries have books on explosives, radical ideologies, and civil disobedience. But that doesn't mean that our reading lists should be available to the FBI and the NSA.

Here's the real tragedy: "we take your privacy seriously" has become a punch line. Companies that take care to construct privacy using the tools of modern software engineering and strong encryption aren't taken seriously. The language of privacy has been perverted by lawyers who "take privacy seriously" by crafting privacy policies that allow their clients to do pretty much anything with your data.

CC BY bevgoodin

Which brings me to the second most important thing to understand about privacy on the Internet. Don't it always seem to go that you don't know what you've got till it's gone? (I call this the Big Yellow Taxi principle.)

Think about it. The only way you know if a website is being careless with your password is if it gets stolen, or they send it to you in an email. If any website sends you your password by email, make sure that website has no sensitive information of yours because it's being run by incompetents. Then make sure you're not using that password anywhere else and if you are, change it.

Failing gross incompetence, it's very difficult for us to know if a website or a piece of software has carefully constructed privacy, or whether it's piping everything you do to a server in Kansas. Last week's revelations about Adobe Digital Editions (ADE4) were an example of such gross incompetence, and yes, ADE4 tries to send a message to a server in Kansas every time you turn an ebook page. Much outrage has been directed at Adobe over the fact that the messages were being sent in the clear. Somehow people are less upset at the real outrage: the complete absence of privacy engineering in the messages being sent.

The response of Adobe's PR flacks to the brouhaha is so profoundly sad. They're promising to release a software patch that will make their spying more secret.

Now I'm going to confuse you. By all accounts, Adobe's DRM infrastructure (called ACS) is actually very well engineered to protect a user's privacy. It provides for features such as anonymous activation and delegated authentication so that, for example, you can borrow an ACS-protected library ebook through Overdrive without Adobe having any possibility of knowing who you are. Because the privacy has been engineered into the system, when you borrow a library ebook, you don't have to trust that Adobe is benevolently concerned for your privacy.

Yesterday, I talked with Micah Bowers, CEO of Bluefire, a small company doing a nice (and important) niche business in the Adobe rights management ecosystem. They make the Bluefire Reader App, which they license to other companies who rebrand it and use it for their own bookstores. He is confident that the Adobe ACS infrastructure they use is not implicated at all by the recently revealed privacy breaches. I had reached out to Bowers because I wanted to confirm that ebook sync systems could be built without giving away user privacy. I had speculated that Adobe Digital Editions was phoning home with user reading data as part of an unfinished ebook sync system. "Unfinished" because ADE4 doesn't do any syncing. It's also possible that reading data is being sent to enable business models similar to Amazon's "Kindle Unlimited", which pays authors when a reader has read a defined fraction of the book.

For Bluefire (and the "white label" apps based on Bluefire), ebook syncing is a feature that works NOW. If you read through chapter 5 of a book on your iPhone, the Bluefire Reader on your iPad will know. Bluefire users have to opt in to this syncing and can turn it off with a single button push, even after they've opted in. But even if they've opted in, Bluefire doesn't know what books they're reading. If the FBI wants a list of people reading a particular book, Bluefire probably doesn't have the ability to say who's reading the books. Of course, the sync data is encrypted when transmitted and stored. They've engineered their system to preserve privacy, the same way Google doesn't know your password, and Apple can't decrypt your iPhone data. Maybe the FBI and the NSA can get past their engineering, but maybe they can't, and maybe it would be too much trouble.
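I don't know the details of how Bluefire built this, so what follows is not their design. It's a hypothetical sketch of one way a sync server could be kept ignorant of book titles: each of a user's devices holds a secret that never leaves the devices, and the server is only ever handed a keyed hash of the book identifier. The names `book_token`, `SyncServer` and `device_secret` are all invented for illustration.

```python
import hashlib
import hmac
import os
from typing import Dict, Optional


def book_token(device_secret: bytes, book_id: str) -> str:
    """Opaque per-user label for a book; without the secret it can't be tied to a title."""
    return hmac.new(device_secret, book_id.encode("utf-8"), hashlib.sha256).hexdigest()


class SyncServer:
    """Hypothetical server: it stores reading positions keyed by opaque tokens only."""

    def __init__(self) -> None:
        self._positions: Dict[str, str] = {}

    def put(self, token: str, position: str) -> None:
        self._positions[token] = position

    def get(self, token: str) -> Optional[str]:
        return self._positions.get(token)


# Assumption: the secret is shared among the user's own devices and never uploaded.
device_secret = os.urandom(32)
server = SyncServer()

# The iPhone reports progress under the token, not the title.
server.put(book_token(device_secret, "isbn:9780000000002"), "chapter 5")

# The iPad, holding the same secret, derives the same token and picks up the position.
print(server.get(book_token(device_secret, "isbn:9780000000002")))
```

Because the token depends on the user's secret, two people reading the same book produce unrelated tokens, so the server can't assemble a list of everyone reading a given title. A real system would also encrypt the position payload with a proper authenticated cipher, which I've left out here because Python's standard library doesn't include one.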

To some extent, you have to trust what Bluefire says, but I asked Bowers some pointed questions about ways to evade their privacy cloaking, and it was clear to me from his answers that his team had considered these attacks.  Bluefire doesn't send or receive any reading data to or from Adobe.

For now, Bluefire and other ebook reading apps that use Adobe's ACS (including Aldiko, Nook, and apps from Overdrive and 3M) are not affected by the ADE privacy breach. I'm convinced from talking to Bowers that the Bluefire sync system is engineered to keep reading private. But the Big Yellow Taxi principle applies to all of these. It's very hard for consumers to tell a well-engineered system from a shoddy hack until there's been a breach, and then it's too late.

Perhaps this is where the library community needs to forcefully step in. Privacy audits and third-party code review should be required for any application or website that purports to "take privacy seriously" when library records privacy laws are in play.

Or we could pave over the libraries and put up some parking lots.
