First Monday

Descriptive metadata for copyright status by Karen Coyle


Abstract
The need to express the intellectual property rights of digital materials has focused on access and usage permissions which must be granted by the rights holder. A key set of permissions not acknowledged by these rights expressions is inherent in the legal copyright status of the item. Digital libraries can hold and provide access to many items for which copyright status is the sole governor of use. This article proposes a small set of descriptive data elements that should accompany digital materials to inform potential users of the copyright status of the item.

Contents

Copyright and digital materials
Rights expression and digital libraries
Current practice: Copyright data elements in metadata
Data elements for copyright status
Contact information
From data elements to metadata
The interaction of copyright and licensing
Conclusion

 


 

++++++++++

Copyright and digital materials

One of the chief characteristics of digital works is the ease with which they can be re–used either in whole or in part. This fact has spawned a near crisis over issues of intellectual property in the highly networked world in which we live today. Two main approaches to this problem have been taken by the industries most affected: the first is the modification of copyright law (Litman, 2001), and the second is the development of protective technologies for works in digital format. These latter solutions address the needs of the commercial sector, where the assumption is that digital materials will be licensed and permitted uses of the materials will be enforced through some technological controls.

Although the scholarly communities interact to some degree with the commercial intellectual property sector, they also contribute to and make use of a large body of academic materials that have significantly different characteristics from the current and popular materials that are of interest in other sectors. The scholarly community generally makes use of materials that have a small audience and little market potential. It also makes use of public domain materials, which are generally of only minor interest to the commercial sector. In their research and teaching roles, scholarly communities frequently rely on fair use principles when re–using materials that are covered by copyright law. For the most part, the materials used by the scholarly community do not have technological protection measures that limit the actual uses of the materials, such as printing or copying. Materials licensed by institutions are often under access controls that limit who can gain access to the materials, and may have some license conditions binding on the institution, such as restrictions on lending to other institutions. The institutions’ users, however, are not party to the institutional contract, which means that the scholarly communities rely on copyright law as guidance for their use of these materials.

When users want to make use of an unprotected digital resource, they have to know the copyright status of the work in order to make a reasonable judgment. Although fair use may be applied to some uses, in other cases re–use requires the researcher or teacher to obtain permission from the copyright holder. Yet, information about the copyright status of the work and contact information for the copyright holder is not generally included in the metadata for the digital material. In some cases, especially with archival materials, this information is not readily available.

It is not uncommon to see statements associated with online resources offered by libraries that read something like:

"It is the researcher’s obligation to determine and satisfy copyright or other use restrictions when publishing or otherwise distributing materials found in the Library’s collections." (Library of Congress, 2005)

The burden of determining the copyright status of the work, and of understanding the circumstances under which permission is required, is left to the user of library materials. It is therefore a matter of good user service for the library to provide all available information relating to the copyright status of the work so that the determination can be made.

The purpose of this paper is to define the metadata that is needed to carry the relevant information about the copyright status of a work, in particular for those works where copyright law — not a license or other contract — determines its usage. Although a simple copyright notice may be sufficient in the case of current works, such a copyright statement must be designed to meet the needs of digital libraries and archives which often provide access to unpublished or ephemeral works as well as works whose copyright status may not be immediately known.

The main requirements for this metadata are that it must capture the data used to make a determination of the copyright status, and that it must positively assert which aspects of the copyright status are unknown. The other requirement is to provide contact information for anyone wishing to explore further, either to determine the status or to request permissions for uses beyond those allowed by fair use.

 

++++++++++

Rights expression and digital libraries

Works disseminated electronically are often covered by a license. Typically, the creator of the content transfers either an exclusive or a nonexclusive license to a distributor, and that license allows the distributor to enter into certain agreements with institutions or individuals. These licenses can include very specific permissions for the distributor and for the end user of the work.

There are relatively well–developed rights expression languages that can be used to define the license terms for these licensed works. There are more complex rights expression languages designed for use by publishers and distributors of commercial content. These languages are realized in a number of available technologies, such as the use of the Open Digital Rights Language (2005) in products of the Open Mobile Alliance. There are proprietary rights languages as well, such as the usage–based protections embedded in Adobe Acrobat’s PDF products, and other solutions used in various e–book formats (MobiPocket, Palm Reader) and in music software (iTunes). Creative Commons licenses offer a simple rights expression designed for creators of digital works (Creative Commons, 2005). The only rights expression standard that has been issued by a formal standards body is the ISO standard rights expression language based on the work of the MPEG–21 group (International Standards Organization, 2004).

These rights expression languages and their related technologies are limited to expressing license terms and therefore are not appropriate for materials that will be distributed or used entirely under copyright law. The need for metadata for copyright information arises whenever there are materials that are being disseminated by a third party that does not hold rights in the works. This is a situation faced by libraries and archives that digitize materials in their collections — such as old photographs, maps, letters, etc. — that are not covered by licenses.

Because libraries are very active providers of access to information resources in both analog and digital formats, some members of the public and even some members of the library profession are under the mistaken impression that libraries hold some intellectual property rights in the content of the materials they own. In fact, libraries rarely hold such legal rights in the materials in their collections. The main exception is with archival material, and even then only when an owner has explicitly deeded rights to the library (Society of American Archivists, 1998). For the vast majority of archival materials and for commercially and non–commercially published materials, libraries hold no intellectual property rights.

Some materials in libraries and archives are in the public domain and are no longer protected by copyright law. For materials still covered by copyright, U.S. libraries frequently rely on exceptions in the copyright law that allow fair use copying of materials (17 U.S.C. Section 107). There is also a specific exception in U.S. copyright law that allows libraries to make copies of works for preservation purposes (17 U.S.C. Section 108). One can think of the library as being a kind of copyright "super–user" because Section 108 exceptions in the copyright law allow libraries to make copies in the course of their social function of preserving cultural heritage. This exception is limited both circumstantially (i.e., to materials that cannot be easily replaced in the marketplace) and in terms of future distribution (digital copies are not to be made available outside of the premises of the library).

There are specific circumstances in which libraries may legally make copies of items in their collections and make those copies available to users: for materials that are in need of preservation; for materials that have entered the public domain; and, for uses that would be considered fair use. Although some argue that a digital copy of a work is worthy of its own copyright, if no new content has been added, the library holds no intellectual property rights in the copy if it held none in the original (Bridgeman, 1999). Use of the digital copy is governed by the copyright law that applied to the original, and cannot be modified by the library. Rights expression languages that have been developed for rights holders cannot apply to the library’s copy because the library cannot grant permissions for items over which it itself has no legal rights. What the library can do, however, is assure that digital copies made by the library carry information that helps subsequent users make some assessment of the copyright status of the item. This is especially important for digitized archival materials that will be accessed and used outside of the context of the physical archive in which they originated, since that context may be lost over time as the items are distributed over digital networks.

 

++++++++++

Current practice: Copyright data elements in metadata

The tradition of library cataloging has not included the recording of information relating to the copyright status of works, except when that information is included for other purposes, such as the date "c2004" representing the copyright date when the publication date is not available. The MARC21 format has no fields for recording a copyright statement. The note field "Restrictions on Access Note" (MARC21 field 506) can be used to indicate a variety of access restrictions, either in terms of the contractual arrangement with the donor of an archive or for materials whose access is limited to a certain class of users. There is a similar note field called "Terms Covering Use and Reproduction" (MARC21 field 540) that can be used to record terms that apply once access has been obtained. Both of these notes are, however, more of the genre of the rights expression languages in that they are used when some arrangement beyond copyright law governs rights, such as a license or a deed of gift.

National libraries often acquire works through copyright deposit programs. The MARC21 record has a field to record the copyright deposit number (MARC21 field 017) and certain usage fees that are based on copyright (MARC21 field 018). The coded date fields (in the MARC21 008 fixed field and the 046 field) can record the copyright date, but only based on a preference list of dates that are specified in the cataloging rules. When a publication date is available, the copyright date is not recorded.

The end result is that a MARC21 record does not show the name listed as the copyright holder of a work. In the two cataloging records that follow, nothing shows that the publisher is listed as the copyright holder of the Krupat book, whereas the second title, Heavy Weather, lists the author, Bruce Sterling, as the copyright holder.

 

Krupat, Arnold.
Ethnocriticism: Ethnography, History, Literature.
Berkeley: University of California Press, c1992.

Sterling, Bruce.
Heavy weather / Bruce Sterling.
New York: Bantam Books, 1994.

 

For non–book materials of complex authorship, it is even more difficult to ascertain who might hold rights in the material:

Dark Star / Jack H. Harris Enterprises ; producer, Jack Harris ; produced and directed by John Carpenter ; screenplay, John Carpenter and Dan O’Bannon. United States : Jack H. Harris Enterprises, 1974 ; United States : VCI Entertainment, c2001.

Part of the lack of concern regarding copyright in our cataloging standards could stem from our pre–Berne copyright laws, which stated that works had to be registered with the Copyright Office for them to be protected under copyright law. Since 1978, however, registration has not been required, and original works in a fixed medium are considered to be worthy of copyright law protection. In addition, since 1978 works no longer need to have a copyright statement (e.g., "© Karen Coyle, 2004") to be protected by copyright law. This means that for many works that are covered by copyright law, especially unpublished works, there is no record of their existence in the registration database, and nothing on the works themselves to tell us who claimed the copyright and when.

The library has the work in hand while making the digital copy, and presumably, for most published works, the copyright notice on the work itself is readily available. For archival works and ephemera, information about the copyright status may not be on the piece itself but some evidence of its status may be provided by the context of the archive.

When rights information is included in online displays for digital library works, it is often stated in the form of permissions and restrictions. It is not uncommon to encounter statements on the pages of digital archives that appear to give permissions and set restrictions on the use of materials, including materials that are in the public domain.

[On a digital copy of a map dated 1873] "Not to be reproduced without permission. To purchase copies of images and/or for copyright information, contact [library name]."
[On an item digitized by the library, but for which no copyright information was retained.] "This item is intended to support research, teaching, and private study. Users may print, download, or link to the image without prior consent of the [library name]."
[On a digital copy of a magazine article from 1858] "These pages may be freely searched and displayed. Permission must be received for subsequent distribution in print or electronically. Please contact [library name] for more information."

It is quite possible that the permissions and restrictions here have no basis in copyright law nor are covered by licenses held by the library. It is understandable that the libraries want to inform users of the permitted uses of the material, but the information given in these rights statements may mislead users regarding the actual use that is permitted by law. Scholars who are knowledgeable about copyright law may question whether the library’s statements are legitimate but do not have the necessary information to make their own assessment. Libraries and other cultural institutions need a way to convey the data elements that inform a copyright determination, both for present day use and as part of the archival package that accompanies longer term digital preservation metadata.

 

++++++++++

Data elements for copyright status

The obvious place to look for data elements for copyright statements is in the data collected by the U.S. Copyright Office on their registration forms. There are separate registration forms for literary works, visual arts, performing arts, sound recordings and serials. The forms vary in some details, but the majority of data elements are shared across them. The following is a list of the elements on form "TX" for literary works (U.S. Copyright Office, June, 2005):

 

| Section | Data element | Description |
| Title section | Title of this work | |
| | Previous or alternative titles | Alternative titles are ones that someone might search under when looking for this work. |
| | Publication as a contribution | If the work was published as a contribution to a collection this field carries the host item title, and if it was published in a serial it also has the volume, number, issue date, and pagination. |
| Author section | Name of author | |
| | Year of birth and death of author | Birth year is optional (although useful for identification purposes), but death year is required if author is dead. |
| | Work made for hire? | yes/no |
| | Author’s nationality or domicile | Either the author’s country of citizenship or the country of residence. |
| | Author is anonymous | yes/no |
| | Author is pseudonymous | yes/no |
| | Nature of authorship | Free text field for a brief general statement, i.e., "Entire text," "English translation," "Editorial revisions." |
| Creation and Publication | Year in which creation of the work was completed | Required in all cases. |
| | Date (month/day/year) and nation of first publication of the work | Only for published works. |
| Claimant(s) | Name(s) and address(es) of copyright claimants | Claimant may be the same as author. |
| | Transfer | If the copyright has been transferred, this is a brief description of the nature of the transfer. Must be filled in if claimant is different from author. |
| Previous Registration | Has the work been previously registered? | yes/no |
| | Reason (check box) | a) This is the first published edition of a work previously registered as unpublished; b) This is the first application with this author as claimant; c) This is a changed version of the work |
| | Previous registration number | |
| | Year of previous registration | |
| Derivative Work or Compilation | Pre–existing material | Describes the preexisting work. |
| | Material added to this work | Statement of what new material is covered by this claim. |
| Correspondence, etc. | Correspondence | The contact information for a correspondence relating to the claim. |
| | Certification | Signature and category: 1) author; 2) other copyright claimant; 3) owner of exclusive rights; 4) authorized agent. |

 

Other sources of data elements for copyright are the step–by–step walkthroughs that some library and archive colleagues have provided for the determination of the copyright status of an item, such as the work of Peter Hirtle (Hirtle, 2005) and Mary Minow (Minow, 2002). Unlike the Copyright Office form, which assumes that an author or other rights holder is completing the data, these instructions are designed for a third party who is not intimately involved with the work itself. The primary question to be answered is: is this work still in copyright? If it is likely that the work is indeed protected by copyright, based on the criteria provided, then other steps lead the researcher to either a fair use determination or a decision to request permission for the intended use. The data elements given here are the ones that inform the determination of copyright status.

The Copyright Determination Algorithm:

A. Unpublished Work

B. Published Work

Those are the main data elements, but in reality the answers may include partial information or the fact that a particular data element is not known. Also, the step–by–step algorithm assumes that one is looking at the item itself. When creating metadata, it would be ideal to transfer some of the data from the piece, such as publisher’s name, to the metadata although this may be available through bibliographic metadata associated with the same item. Taking this into account, we have an expanded set of data elements that are needed for a full description of copyright status:

Unpublished Work
Published Work

There are two sources of information about a digital item: the digital item itself (or its original, in the case of material digitized by the library), or the result of research. Research could include bibliographic resources, the files of the Copyright Office, contact with the author or the institution. It will be useful for a user to know the extent of the research into the rights, especially if many metadata elements are listed as "unknown." Therefore, the metadata needs at least two more elements:

The latter should allow a brief description of what steps were taken to ascertain the copyright status of the item.
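
As a rough illustration of how recorded data elements feed a status determination, the following Python sketch applies a single, deliberately simplified rule: at the time of writing (2005), works published in the United States before 1923 are generally treated as being in the public domain. That rule, the function name assess_published_work, the Status values, and the "undated pamphlet" example are all assumptions made for this sketch. They are not the step–by–step procedures of Hirtle or Minow, and a real determination involves more conditions (notice, renewal, place of publication). The point is only that a missing data element should yield an explicit "undetermined" answer rather than a guess.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    PUBLIC_DOMAIN = "public domain"
    UNDETERMINED = "undetermined"


# Assumption for this sketch: U.S. works published before this year were
# in the public domain when the article was written (2005).
PUBLIC_DOMAIN_CUTOFF_YEAR = 1923


def assess_published_work(publication_year: Optional[int]) -> Status:
    """Apply one simplified rule to a work first published in the U.S.

    Returns PUBLIC_DOMAIN only when the publication year is recorded and
    falls before the cutoff. An unknown year, or a later year that would
    need further research (notice, renewal), stays UNDETERMINED.
    """
    if publication_year is None:
        return Status.UNDETERMINED
    if publication_year < PUBLIC_DOMAIN_CUTOFF_YEAR:
        return Status.PUBLIC_DOMAIN
    return Status.UNDETERMINED


if __name__ == "__main__":
    # The first two items echo examples in this article; the third is hypothetical.
    for label, year in [("map dated 1873", 1873),
                        ("book published 1994", 1994),
                        ("undated pamphlet", None)]:
        print(f"{label}: {assess_published_work(year).value}")
```

A fuller version would take the other data elements (publication status, notice, renewal, creator death date) as inputs and would record which steps were actually checked, so that the result could be stored alongside the description of the rights research.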

 

++++++++++

Contact information

Users of digital library collections sometimes need to locate copyright owners to obtain permissions, such as when a scholar wishes to use a photograph as an illustration in a book. This is one of the more difficult tasks for creators and publishers because the necessary contact information often is not included with the piece itself. For current published works, the publisher name usually appears on the piece, and at times a full or partial address is included. For older works, and for unpublished works, finding the copyright owner is difficult at best and sometimes impossible. As a service to users of digital library materials, contact information must be provided where it is known. In the absence of specific information about the copyright holder, users should at least be referred to the library or archive that holds the original, or that accepted or harvested the digital copy. Contact information should include a full address, phone number, e–mail, fax, and any other information that will facilitate making the contact.

Rights holder contact

Contact information for the rights holder would be especially useful for licensed materials, or for items where copyright is owned by a library. It would also be relevant for self–archived works. For the many orphan materials in libraries and archives, the rights holder will be unknown, however.

Rights researcher contact

In the case of archives or of locally digitized materials, it can be useful to know who did the due diligence that resulted in the decision to digitize the material, and who provided the rights information included in the metadata. The rights researcher could presumably answer questions about the type of research that was done, and may have further background information relevant to the item. This would also be the contact for anyone wishing to dispute or correct the copyright information included with the item, such as the case where a creator finds his or her work presented with mistaken copyright information.

Contact for Other Services

Libraries and archives may offer other services relating to an item. For example, archives may provide thumbnail images of works on an open access Web site but charge for a print–quality copy, or for publication rights. In the case of digitized items this may also be a listing of who holds the originals, for those uses that require access to the physical copy.

 

++++++++++

From data elements to metadata

The metadata listed above is conceptual and does not define a machine–readable schema of data elements. Some redundancy can be eliminated where items like names and dates appear more than once. There are some data elements that modify the category of unpublished works and others that are specific to published works. For some types of works, such as self–archived journal article pre–prints, it may be possible to embed existing rights metadata, such as a Creative Commons license, in the rights metadata statement. Conceptually, the metadata could contain the following structural elements:

General rights information
  Copyright status (copyrighted, public domain, unknown)
  Publication status (published, unpublished)
  Dates
    Year of copyright or creation
    Year of renewal of copyright
  Copyright statement (from the piece)
  Country of publication or creation
Creator
  Creator name, dates, and contact
Copyright holder
  Copyright holder contact
Publisher
  Publisher name and contact
  Year of publication
Administrative data
  Source of information (piece itself or other resources)
  Contact information
    Rights research contact
    Services contact
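
To make the conceptual structure more concrete, here is a minimal Python sketch of how these elements might be held in a record. It is not a proposed schema: the class and field names (CopyrightMetadata, Party, unknown_elements, and so on) are hypothetical, and the grouping is only one reading of the outline above. The sketch uses None to mean "not known" so that unknown elements can be reported explicitly instead of being silently omitted.

```python
from dataclasses import dataclass, field, fields
from enum import Enum
from typing import List, Optional


class CopyrightStatus(Enum):
    COPYRIGHTED = "copyrighted"
    PUBLIC_DOMAIN = "public domain"
    UNKNOWN = "unknown"


class PublicationStatus(Enum):
    PUBLISHED = "published"
    UNPUBLISHED = "unpublished"


@dataclass
class Party:
    """A creator, copyright holder, or publisher; None means 'not known'."""
    name: Optional[str] = None
    contact: Optional[str] = None


@dataclass
class CopyrightMetadata:
    # General rights information
    copyright_status: CopyrightStatus = CopyrightStatus.UNKNOWN
    publication_status: Optional[PublicationStatus] = None
    year_of_copyright_or_creation: Optional[int] = None
    year_of_renewal: Optional[int] = None
    copyright_statement: Optional[str] = None
    country: Optional[str] = None
    # Creator, copyright holder, publisher
    creator: Party = field(default_factory=Party)
    creator_dates: Optional[str] = None
    copyright_holder: Party = field(default_factory=Party)
    publisher: Party = field(default_factory=Party)
    year_of_publication: Optional[int] = None
    # Administrative data
    source_of_information: Optional[str] = None
    rights_research_contact: Optional[str] = None
    services_contact: Optional[str] = None

    def unknown_elements(self) -> List[str]:
        """Names of elements that should be displayed as 'unknown'."""
        missing = []
        for f in fields(self):
            value = getattr(self, f.name)
            if value is None or value is CopyrightStatus.UNKNOWN:
                missing.append(f.name)
            elif isinstance(value, Party) and value.name is None:
                missing.append(f.name)
        return missing


# Example based on the Heavy Weather record cited earlier in the article.
record = CopyrightMetadata(
    copyright_status=CopyrightStatus.COPYRIGHTED,
    publication_status=PublicationStatus.PUBLISHED,
    country="United States",
    creator=Party(name="Bruce Sterling"),
    copyright_holder=Party(name="Bruce Sterling"),
    publisher=Party(name="Bantam Books"),
    year_of_publication=1994,
    source_of_information="piece itself",
)

if __name__ == "__main__":
    print(record.unknown_elements())
```

Keeping "unknown" as a first–class value, rather than leaving a field out, is what lets a display or an archival package state which aspects of the copyright status have not been established.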

 

++++++++++

The interaction of copyright and licensing

As mentioned above, much of the work on rights expression has focused on the expression of licenses or other contracts. These often delineate specific access rules and usage permissions. Although those permissions take precedence over copyright law, they do not negate the need for the description of copyright as defined in this article. A license does not remove the copyright status of an item; it establishes an agreement between parties that is founded on the ownership rights that copyright law defines. One possible challenge to a license is that the licensor does not actually hold the rights that are the subject of the contract.


Information on the copyright status of an item does not become irrelevant when a license is signed. In particular, the description of terms that inform a copyright determination remains among the key data elements that are needed in the long–term preservation of intellectual works. This means that copyright data elements should be recorded for all works, even those whose current business case relies primarily on explicit licenses rather than copyright law.

 

++++++++++

Conclusion

Adding descriptive data elements for copyright status to the metadata created for intellectual works places a burden on the communities that create that metadata. The lack of such descriptive data elements, however, places an even larger burden on those who would like to make use of the works. Today’s massive problem of orphan works (U.S. Copyright Office, August, 2005) arises mainly because information about the initial creation of the work has been lost over time. More particularly, there was no effective means to record that information when it was available. Digital works and analog works that are digitized can be removed from the original context that contains many of the elements that are evidence of the copyright status of a work, such as the provenance of the archive. The provision of descriptive data elements that can be transmitted with the work itself should facilitate subsequent uses of the valuable intellectual content that the work represents. Copyright–related metadata, therefore, should be seen as an essential component of the resource description.

 

About the author

Karen Coyle is a librarian with nearly 30 years’ experience in digital libraries. She worked for over 20 years at the University of California in the California Digital Library, primarily on the development of the online access system used in the University libraries. She is a recognized expert in technical issues, such as metadata and information retrieval, as well as social, political and policy issues. She currently writes and consults on a variety of digital library topics. Her writings can be found at http://www.kcoyle.net.

 

Acknowledgements

A portion of the research supporting this article was done under the auspices of the California Digital Library’s Rights Metadata Framework Project. I benefited from reviews of early drafts by Sharon Farb, University of California, and Mary Minow. Any errors remaining in the work are, however, entirely my own.

 

References

17 U.S.C. § 107. Limitations on exclusive rights: Fair use.

17 U.S.C. § 108. Limitations on exclusive rights: Reproduction by libraries and archives.

Bridgeman Art Library, Ltd. v. Corel Corp., 36 F. Supp. 2d 191 (S.D.N.Y. 1999).

Creative Commons, at http://www.creativecommons.org/ accessed 14 August 2005

Peter Hirtle, 2005. "Copyright Term and the Public Domain in the United States," at http://www.copyright.cornell.edu/training/Hirtle_Public_Domain.htm, accessed 14 August 2005.

International Standards Organization. ISO/IEC 21000–5:2004. Information technology — Multimedia framework (MPEG–21) — Part 5: Rights Expression Language.

Library of Congress, 2005, "Legal," at http://www.loc.gov/homepage/legal.html, accessed 14 August 2005

Jessica Litman, 2001. Digital Copyright. Amherst, N.Y.: Prometheus Books.

Mary Minow, 2002 "Library Digitization Projects and Copyright," at http://www.llrx.com/features/digitization.htm, accessed 14 August 2005.

Open Digital Rights Language, 2005. at http://www.odrl.net, accessed 14 August 2005.

Society of American Archivists, 1998. "A Guide to Deeds of Gift" http://www.archivists.org/publications/deed_of_gift.asp, accessed 14 August 2005.

U.S. Copyright Office, June, 2005 "Forms" at http://www.copyright.gov/forms/, accessed 14 August 2005.

U.S. Copyright Office, August, 2005 "Orphan Works" at http://www.copyright.gov/orphan/, accessed 14 August 2005.


Editorial history

Paper received 14 August 2005; revised 31 August 2005; accepted 1 September 2005.
HTML markup: Kyleen Kenney and Edward J. Valauskas; Editor: Edward J. Valauskas.



This work is licensed under a Creative Commons License.

Copyright ©2005, Karen Coyle

Descriptive metadata for copyright status by Karen Coyle
First Monday, volume 10, number 10 (October 2005),
URL: http://firstmonday.org/issues/issue10_10/coyle/index.html