Author(s): Jason Schultz
Year: 2013
Abstract:
This Amicus Brief was filed in the United States Court of Appeals for the
Second Circuit in the case of Authors Guild v. Hathitrust on June 4,
2013. The case is on appeal from the United States District Court for
the Southern District of New York, No. 11 CV 6351 (Baer, J.).
Amici are over 100 professors and scholars who teach, write, and research in
computer science, the digital humanities, linguistics, or law, and two
associations that represent Digital Humanities scholars generally.
Mass digitization, especially by libraries, is a key enabler of socially
valuable computational and statistical research (often called “data
mining” or “text mining”). While the practice of data mining has been
used for several decades in traditional scientific disciplines such as
astrophysics and in social sciences like economics, it has only recently
become technologically and economically feasible within the humanities.
This has led to a revolution, dubbed “Digital Humanities,” ranging
across subjects from literature and linguistics to history and
philosophy. New scholarly endeavors enabled by Digital Humanities
advancements are still in their infancy but have enormous potential to
contribute to our collective understanding of the cultural, political,
and economic relationships among various collections (or corpora) of
works – including copyrighted works – and with society.
The Court’s ruling in this case on the legality of mass digitization could
dramatically affect the future of work in the Digital Humanities. The
Amici argue that the Court should affirm the decision of the district
court below that library digitization for the purpose of text mining and
similar non-expressive uses presents no legally cognizable conflict with
the statutory rights or interests of the copyright holders. Where, as
here, the output of a database – i.e., the data it produces and displays
– is noninfringing, this Court should find that the creation and
operation of the database itself are likewise noninfringing. The copying
required to convert paper library books into a searchable digital
database is properly considered a “non-expressive use” because the works
are copied for reasons unrelated to their protectable expressive
qualities – the copies are intermediate and, as far as is relevant here,
unread.
The mass digitization of books for text-mining
purposes is a form of incidental or “intermediate” copying that enables
ultimately non-expressive, non-infringing, and socially beneficial uses
without unduly treading on any expressive – i.e., legally cognizable –
uses of the works. The Court should find such copying to be fair use.
Keywords: copyright, intellectual property, text-mining, digital humanities, digitization, non-expressive use, non-consumptive use
Link: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2274832