[Upd-discuss] Google public domain library?

Greg Newby gbnewby@pglaf.org
Fri, 17 Dec 2004 17:41:07 -0800


On Fri, Dec 17, 2004 at 07:19:12PM -0500, Richard Stallman wrote:
> Under US copyright law, converting the work to a different medium
> creates a new copyright.  So scanned files made from a public-domain
> book would not themselves be in the public domain.

I don't think this is a settled issue, but I would be
very interested to know of relevant cases.  We do know from
way back when that WordStar failed in its effort
to claim copyright on files created with its application.

A performance (say, of a Shakespeare play) *does* get a 
new copyright, but that doesn't mean the play becomes
copyrighted: just the performance.  We've heard arguments
that transformations of content might constitute a performance
(say, text to speech conversion).

As for an effort to take a public domain item and
somehow convert it, the Project Gutenberg view is that this
does not warrant a new copyright.  More below.  Note especially
the notion of "authorship."  From what I've heard, there is
no "authorship" going on with Googleberg.

> Whether it is legal to extract the text from those scanned files
> and produce something that is in the public domain, I don't know.

It's pretty clear under the anti-circumvention provisions
(via DMCA) of Title 17 U.S.C. that bypassing access controls is
a violation.  Because those provisions allow bypassing
for essentially nothing but cryptographic research, they are
inconsistent with the Fair Use provisions elsewhere in
Title 17.

My guess is that someone who bypassed access controls (i.e., DRM)
to access public domain content could still be charged.  But
my hope is that such a case would eventually find that
fair use trumps DRM.  This is just a hope; we all know
that neither "fair use" nor the public domain is
faring too well in the US courts these days.


FWIW, here's our "no sweat" copyright statement, which has been
confirmed by our legal advisors but has not undergone any
sort of legal test or challenge.  Nevertheless, we do sometimes
use this as a basis to harvest, transform and/or redistribute
demonstrably public domain items, even when those items
are wrapped with passwords, licenses, etc.


PROJECT GUTENBERG'S POSITION ON "SWEAT OF THE BROW" COPYRIGHT CLAIMS

Work performed on a public domain item, known as "sweat of the brow,"
does not result in a new copyright.  This is the judgment of Project
Gutenberg's copyright lawyers, based on a study of case law
in the United States.  It rests on the notion of authorship,
which is a prerequisite for a new copyright.  Non-authorship
activities do not create a new copyright.

Some organizations erroneously claim a new copyright when they add
value to a public domain item, such as to an old printed book.  But
despite the difficulty of the work involved, none of these activities
result in new copyright protection when performed on a public domain
item:

   - scanning and optical character recognition (OCR)

   - proofreading and OCR error correction

   - fixing spelling and typography, including substantial updates to
spelling such as changing from American to British

   - adding markup (HTML, XML, TeX, etc.)

   - digitizing, cropping, color-adjusting or other modifications
to images

   - addition of trivial new content, such as images to indicate
page breaks in an HTML file, or pictures of gothic letters for the
first letter in a chapter, or adding or removing a few words per
chapter

   - substantial reorganization, such as moving footnotes to end-notes,
or changing the locations of pictures within the text

   - recoding to new character sets, such as Unicode, or new formats,
such as PDF


There is some value-added content that DOES get a new copyright, but
only for the actual new work (that is, it may be possible to remove
the new copyrighted content to go back to a public domain document):

   - translation into another human language

   - creating a new compilation of existing materials (though the
individual items compiled retain their public domain status)

   - creating new original art work

   - creating an original derivative work, such as an audio
performance, a new chapter, or a set of favorite quotations

   - adding a new introduction or critical essay

Project Gutenberg is able to utilize any material which is judged to
be public domain in the country of use (i.e., the United States).  If
it is determined that components of a digital item are public domain,
but others are not, then the copyrighted components may be removed
without the permission of whoever owns the copyright for the new
content.

It is Project Gutenberg's practice to seek permission of copyright
claimants before harvesting their materials.  This is done in order to
be polite, and to allow the producer or distributor to request a
particular credit be used.  But if permission is not given, public
domain items can still be used by Project Gutenberg, typically without
any attribution.  Because Project Gutenberg receives submissions
from many different sources, it is not always clear where an item came
from.  Volunteers who submit content they did not themselves generate
should be diligent about reporting sources, even if the source will
not be credited in the item as distributed by Project Gutenberg.

Most recently updated April 6, 2004


  -- Greg

Dr. Gregory B. Newby
Chief Executive and Director
Project Gutenberg Literary Archive Foundation http://gutenberg.net
A 501(c)(3) not-for-profit organization with EIN 64-6221541
gbnewby@pglaf.org