Wednesday, April 11, 2007

LS 500 Authority Challenges

Bennett, D.B. & Williams, P. (2006). Name Authority Challenges for Indexing and Abstracting Databases. Evidence Based Library and Information Practice, 1(1).

Introduction

Indexing and Abstracting (I&A) databases generally have not implemented name authority control as is used in many library catalogs. Most I&A databases burden the searcher with identifying and selecting name variations. The use of widely varied forms of authors’ names without reference or links to alternatives causes problems for the searchers. End results may be inaccurate or incomplete, resulting in a decrease in the scientific integrity of the research.

Individual library online catalogs have been applying authority control since the implementation of AACR2. Personal name authorities bring together works by an author, regardless of the variations in name as identified in the work itself.

One large challenge lies in managing author name changes. Few databases have chosen to link the variations or name changes to facilitate searching and retrieval of an author’s works. I&A databases may also move all of an author’s works from the former name to the current name, altering some records so that the author name no longer matches the name displayed on the original article.

Examples of Problems with Name Changes

Authors who publish under two forms of their name, and authors who have changed their names, are not easily found in databases, and in most cases not all of the relevant citations are retrieved. If database citations do not contain the form of the name used on the article, citing errors will most often occur.

The Web of Science, the original citation tool, uses the author name exactly as it appears in the citing article. The policy of ISI is not to over-correct “variations” because it cannot check them all and refuses to second-guess an author’s intentions.

Suppose a searcher knows an author only by the name printed on an article, but the I&A database has reformatted that name. If the user then selects the name offered by the I&A database rather than the name on the article, some citations will not be retrieved.
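To make the retrieval problem concrete, here is a minimal sketch in Python. The records, names, and the search_exact function are all made up for illustration and do not come from the article or from any real database. Each record keeps the name printed on the article and the form the database indexed it under, and the database has reformatted only some of the names.

```python
# Hypothetical records: "printed" is the name on the article itself,
# "indexed" is the form the database stored (sometimes reformatted).
records = [
    {"title": "Paper A", "printed": "Smith, Jane A.", "indexed": "Smith, J."},
    {"title": "Paper B", "printed": "Smith, J.", "indexed": "Smith, J."},
    {"title": "Paper C", "printed": "Smith, Jane A.", "indexed": "Smith, Jane A."},
]


def search_exact(records, name):
    """Return titles whose indexed author name equals `name` exactly."""
    return [r["title"] for r in records if r["indexed"] == name]


# Searching with the form printed on the article misses the reformatted record.
by_printed_form = search_exact(records, "Smith, Jane A.")

# Searching with the database's form misses the record that was not reformatted.
by_database_form = search_exact(records, "Smith, J.")
```

In this toy data neither form of the name retrieves all three papers, which is the incomplete result the article warns about when variations are not linked.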

Potential Solutions: Overview

There are many approaches to identifying and linking author name changes within I&A databases. Some are already in production and others are still at the research modeling stage.

Authority Control through the use or linking of Name Authority files

  • Uses a file: MathSciNet or WilsonWeb
  • Proposed file: International Standard Authority Name/Data Number
  • Linking across files: HoPEC, ANAC Levy Project, LEAF

Name Disambiguation through automated methods

  • In Practice: Author-ity
  • Models in development by research teams, including use of social networks

Maintaining name authority files is labor-intensive but rewards the user with high recall and high precision. Automated methods of name disambiguation require less manual labor but cannot match the recall and precision of well-maintained authority files.

Potential Solutions: Authority File in the MathSciNet Database

The MathSciNet database creates and maintains a name authority file to control variations. Much of the identification process is automated; however, around 20% of all items require manual checking. MathSciNet’s solution is workable in small database communities, where it is possible for human indexers to check and correct all problem entries manually. This solution may not work for large databases, but it could prove very useful for databases covering a smaller range of information and topics.
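The split between automated identification and manual checking can be sketched roughly as follows. This is only an illustration of the general idea, not MathSciNet's actual process: the matching key (surname plus first initial), the authority file contents, and the function names are all assumptions of mine.

```python
def normalize(name):
    """Crude matching key (an assumption): lowercase surname plus first initial."""
    surname, _, given = name.partition(",")
    return (surname.strip().lower(), given.strip()[:1].lower())


def triage(incoming_names, authority_file):
    """Assign names that match exactly one authority record; queue the rest."""
    index = {}
    for author_id, name in authority_file.items():
        index.setdefault(normalize(name), []).append(author_id)

    assigned, manual_queue = {}, []
    for name in incoming_names:
        candidates = index.get(normalize(name), [])
        if len(candidates) == 1:
            assigned[name] = candidates[0]   # unambiguous: handled automatically
        else:
            manual_queue.append(name)        # none or several: a person decides
    return assigned, manual_queue


# Hypothetical authority file and incoming names.
authority_file = {"A1": "Smith, Jane A.", "A2": "Smith, John", "A3": "Okafor, Chidi"}
assigned, manual_queue = triage(
    ["Okafor, C.", "Smith, J.", "Nguyen, T."], authority_file
)
```

Here "Okafor, C." is assigned automatically, while "Smith, J." (two candidates) and "Nguyen, T." (no candidate) go to the manual queue. The larger the database, the longer that queue gets, which is why this approach suits smaller communities.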

Potential Solutions: More Examples Creating, Using, or Linking Authority Files

I&A databases may follow Library of Congress (LC) practice but may get an added benefit from consulting the LC Name Authority File (LCNAF) to help collocate the names in their author databases. I&A databases would also benefit from the effort that goes into compiling the LCNAF. But I&A databases make the mistake of changing the authors’ names rather than pointing or linking to the variations as given within the articles themselves.

Several projects are currently in the works to build on LCNAF and other authority files.

FRANAR – Functional Requirements and Numbering of Authority Records

“Is working to develop a conceptual model to assist in an assessment of the potential for international sharing and use of authority data both within the library sector and beyond.”

HoPEc System

Implements an author registration component that places the burden on the authors to create and maintain their own authority files if they wish their papers to be clustered.

Librarians realized long ago that linking methods could take the place of authorized forms of names. In the automated environment a system does not have to select a “correct” form as long as all of the variations link to each other.
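Linking without choosing a “correct” form can be sketched with a small union-find structure. This is my own illustration under stated assumptions, not the mechanism used by HoPEc, LEAF, or any other project named here; the NameLinks class, the search_linked function, and the names and records are all hypothetical.

```python
class NameLinks:
    """Groups name variants into clusters without picking an authorized form."""

    def __init__(self):
        self._parent = {}

    def _find(self, name):
        # Unknown names start as their own single-member cluster.
        self._parent.setdefault(name, name)
        # Walk up to the cluster's internal representative, shortening the path.
        while self._parent[name] != name:
            self._parent[name] = self._parent[self._parent[name]]
            name = self._parent[name]
        return name

    def link(self, a, b):
        """Record that two name forms refer to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def variants(self, name):
        """Return every form linked to `name`, including `name` itself."""
        root = self._find(name)
        return sorted(n for n in list(self._parent) if self._find(n) == root)


def search_linked(records, links, name):
    """Retrieve titles under any linked variant; records keep the printed name."""
    wanted = set(links.variants(name))
    return [r["title"] for r in records if r["author"] in wanted]


links = NameLinks()
links.link("Smith, J.A.", "Smith, Jane A.")      # variant of the same name
links.link("Smith, Jane A.", "Jones, Jane A.")   # later name change

records = [
    {"title": "Paper A", "author": "Smith, J.A."},
    {"title": "Paper B", "author": "Jones, Jane A."},
    {"title": "Paper C", "author": "Smith, John"},
]
found = search_linked(records, links, "Smith, Jane A.")
```

A search on any one of the three linked forms retrieves Paper A and Paper B, and no record has its author name altered, so each citation still matches the name on the article.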

LEAF Project – Linking and Exploring Authority Files

Links all authority records that pertain to the same person based on the automatic linking rules of the project and includes birth/death dates.

Potential Solutions: Alternative Approaches Using Name Disambiguation

Instead of using name authority files, researchers are aiming for an automated method of examining more than the author name to determine the likelihood that any two papers with similar author names have been written by the same person.

Authority name issues can be grouped into three categories: (1) multiple name variations that signify the same author, (2) similar or homonymic names that belong to more than one author and (3) linear changes when an author alters his/her name.

Disambiguation projects generally share similar attributes. All of them use metadata beyond the author name alone. Most have shown that adding more data elements to the models helps to disambiguate names faster and with a higher rate of success than using author names alone. Merging the techniques of adding data elements and relying on disciplines to maintain their own linked name files could garner great success for large, multidisciplinary databases such as I&A.
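The idea of using metadata beyond the author name can be sketched as a simple similarity score between two papers. The fields (co-authors, affiliation, keywords), the weights, and the sample papers are arbitrary assumptions chosen for illustration; they are not taken from Author-ity or from any of the research models the article describes.

```python
def jaccard(a, b):
    """Share of items two collections have in common (0.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_author_score(p1, p2):
    """Rough 0-to-1 score that two similarly named papers share an author.

    The weights are illustrative guesses, not values from any published model.
    """
    score = 0.5 * jaccard(p1["coauthors"], p2["coauthors"])
    score += 0.3 * (1.0 if p1["affiliation"] == p2["affiliation"] else 0.0)
    score += 0.2 * jaccard(p1["keywords"], p2["keywords"])
    return score


# Three hypothetical papers, all signed "Smith, J."
paper_1 = {"coauthors": ["Lee, K.", "Patel, R."], "affiliation": "Univ. X",
           "keywords": ["topology", "knots"]}
paper_2 = {"coauthors": ["Lee, K."], "affiliation": "Univ. X",
           "keywords": ["topology"]}
paper_3 = {"coauthors": ["Garcia, M."], "affiliation": "Univ. Y",
           "keywords": ["genomics"]}

likely_same = same_author_score(paper_1, paper_2)
likely_different = same_author_score(paper_1, paper_3)
```

The first pair shares a co-author, an affiliation, and a keyword, so it scores far higher than the second pair, which shares nothing but the author name. A real system would still have to decide where to set the cut-off, and that is where the trade-off between recall and precision shows up.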

Conclusion

Alternative solutions must be implemented to assure access, retrieval and the proper crediting of authors’ works. Without control or linkage to name variations, searchers may retrieve incomplete or inaccurate results. To meet the access needs of the 21st Century, both catalogs and I&A databases may need to implement options that present a high degree of probability that the items have been authored by the same individual, rather than options that provide high precision with manual maintenance. Striving for name disambiguation rather than name authority control may be the best option for catalogs, I&A databases and digital library collections.

Developing automated methods can reduce the searcher’s burden of determining author name variations while ensuring that the author index entries match the names on the article and that the end user can successfully retrieve all of an author’s works from that database.

Reflection

Anything that furthers getting the correct packets of information to the user is something that we should be pursuing in library science. The average user does not know to look under the author name variations unless the search page tells them to do so, and even then they may not read the instructions. I believe we need to focus on the quality of the results and ensure that people are finding what they are looking for and gaining good quality information. We must, as a profession, do whatever is necessary to deliver the correct information in a quick and timely manner to those who are requesting it.
