Friday, July 22, 2011

Pirates of the Charles River

First MP3s, then DVDs, then books.  Perhaps the Academy thought itself to be a little too esoteric to find itself squarely lined up as the next target in digital piracy's electronic crosshairs, but this week's arrest of Aaron Swartz for the theft of over 4 million articles from JSTOR, a nonprofit archive of scientific journals and academic papers, should shatter any such conceit and serve as a wake-up call to librarians and content providers alike that the tremors currently rocking the publishing world (academic and non-) are merely the opening act for the kind of tectonic change that turned the entertainment industry on its head.

According to JSTOR's statement, Swartz, a 24-year-old programming wunderkind turned digital activist and former fellow at Harvard University's Center for Ethics, allegedly had been systematically downloading articles from JSTOR in a manner designed to avoid detection by the system, a scheme that involved unauthorized access to MIT's computer network and illegal entry into a restricted wiring closet on the MIT campus.  Once JSTOR got wise to what was going on, they stopped the bulk downloading, contacted Swartz, and confirmed that the content he had obtained was "secure" and had not been saved or uploaded elsewhere.  Shortly thereafter Swartz was indicted by the U.S. Attorney for the District of Massachusetts.

This is not the first time Swartz has been involved in a brouhaha concerning downloading activities.  In 2009 he downloaded over 19 million pages of court documents via a free trial of the Federal government's Pacer (Public Access to Court Electronic Records) system, following the call of open-government activist Carl Malamud to help make such public information more accessible.  Swartz had managed to download approximately 20% of the entire database before angry officials at the Government Printing Office suspended the free Pacer trial and threatened an FBI investigation.

(Interestingly enough, it was the United States Attorney's Office, not JSTOR-- or MIT, whose computer network and facilities were illegally accessed to permit the downloading activities-- that opted to pursue criminal charges in this case.  Coincidence much?  You be the judge.)

Swartz had also downloaded over 400,000 law review articles, this time as part of a research project to determine the source of their funding, the results of which were published in the Stanford Law Review in an article written by Shireen Barday.  While he was at the Center for Ethics in 2010 and 2011, Swartz engaged in additional research involving large data sets on the issue of institutional corruption, and though his motive for the JSTOR downloads is still unclear, some have suggested that he may have had a similar data-mining project in mind rather than an attempt at wholesale data liberation.  As JSTOR points out in their statement, however, they welcome such research and have even set up a site to facilitate these kinds of projects, so this alternative explanation seems unlikely.

Whatever his motivation was (I'd recommend this post at Weibel Lines for the best overall personal assessment of Aaron Swartz, as Stuart Weibel has followed the career of this digital "phenom" for over a decade), the Feds brought down the hammer on him and then some.  If their intent was to make an example of Swartz so as to prevent any additional acts of bulk downloading and/or distribution, the news today that someone just uploaded 33 gigabytes of JSTOR content to the Pirate Bay in apparent retaliation for the arrest suggests that this strategy might end up backfiring big-time.

Indeed, just a casual perusal of the largest Reddit thread about Swartz's indictment (Aaron Swartz was one of the founding members of Reddit) finds little sympathy for JSTOR or publishers in general.  "He's now officially my hero,"  one commenter says.  "I hate journal publishers.  Every scientist hates journal publishers.  They're parasites that control access to content someone else created and that the taxpayer already paid for.  How can I get on his jury?"  Another chimes in with a decidedly egalitarian perspective:  "God forbid anyone should read a scientific journal without paying for the privilege.  What would the world come to if the common people got hold of the knowledge reserved for corporations and universities?"

Now, to be fair, JSTOR is a nonprofit organization whose business model explicitly includes options for affordable and sometimes even free access to the scholarly content they have collected and digitized.  But as library budgets continue to dry up, online resources-- however reasonably priced--  will increasingly find themselves on the chopping block as administrators frantically search for more things they can cut, driving more and more students, scholars, and researchers to the digital black market to get what they need.

Have you ever emailed a colleague from another institution a PDF from your online holdings, whether or not you had the license to share said resources with outside parties?  Odds are you have.  Did you know, however, that there are entire swaths of the internet where people with valid library credentials to proprietary databases provide articles on demand?  The Scholar "Subreddit" on Reddit is just one instance of this scholarly digital black market, which runs its own parallel course to traditional resource sharing (i.e. commercial document delivery and interlibrary loan).

A born-digital generation of scholars does not stop to think about copyright or licensing terms-- they want what they need for their research, and they want it now.  Combine this group with the "hacktivist" demographic that laughs at the notion of paying for MP3s or DVDs and feels that failing to offer DRM-free versions of your software is more than legitimate cause to pirate your warez, and you can see why the government and content providers might be a wee bit concerned that Aaron Swartz is just the tip of the iceberg.

They are of course exactly right to be afraid.  The same revolution which turned the entertainment industry upside-down and devastated the traditional publishing industry has now reached the rarefied heights of the Ivory Tower, where years of growing inequities of access and cost and runaway prices have made academia ripe for just such a reckoning.  For years now the music business has had to learn how to compete with widespread availability of free downloads, something that video and book producers are still only trying to figure out themselves.  Scholarly publishers, aggregators, and content providers would do well to watch these markets, learn from their successes and failures, and adapt quickly to the new reality...  or else.

For if well-meaning and reasonably affordable nonprofits such as JSTOR are not safe from the digital pirates, then who is?

Disclaimer #1:  I work for Harvard University
Disclaimer #2:  I was an undergraduate at MIT
Disclaimer #3:  I am a librarian who specializes in the lawful/licensed sharing of library resources
Disclaimer #4:  I have assisted the JSTOR Project in my work activities
Disclaimer #5:  I am an active member of the Reddit community
Disclaimer #6:  I have never, ever illegally downloaded anything in my entire life
Disclaimer #7:  One of the above disclaimers may not be 100% true
