

I was waiting my turn when someone left and just ran right out the door.
It really is not that difficult - I have even found some instructions for you to follow.
So Please Wash Your Hands!
Your Stripper Song Is
I'm Too Sexy by Right Said Fred
"And I'm too sexy for your party Too sexy for your party No way I'm disco dancing"
Yes, you're super sexy. But you never take yourself too seriously!
Barite, M. (2000). The Notion of "Category:" Its Implications in Subject Analysis and in the Construction and Evaluation of Indexing Languages. Knowledge Organization 27:4-10.
The author does mention that comprehension of the concept is neither simple nor easily accessible.
Definitions
Usefulness
Category, Object and Analyst
Characters of Categories
Conclusion and Reflection
At the end of the article the author proposes greater attention should be shown to this topic. I have to agree with that along with requesting that it be in simpler terms. After two readings my mind is still boggling with some of the terms and ideas. The references to the time-space continuum made me think of a play I know that involved paradoxes. I can only hope that I will understand this topic more after my class tomorrow night.
Liddy, E. (2001). How a Search Engine Works. Searcher 9(5):38-45.
A Search Engine is the more popular term for an Information Retrieval (IR) System. Whichever term you use, the system contains four different elements:
Document Processor
Prepares, processes and inputs the documents, pages or sites that users are searching. Document processors perform some of the following steps:
Query Processor
Query processing has seven possible steps:
Search and Matching Function
How a system carries out its search and matching functions differs depending on which theoretical model of information retrieval underlies the system’s design philosophy. Searching the inverted file for documents meeting the query requirements, referred to simply as “matching”, is typically a standard binary search, no matter whether the search ends after the first two, five or all seven steps in the query process. Some search engines use scoring algorithms based not on document content but on the relations among documents or on the past retrieval history of documents and pages. After the similarity is computed for each document in the subset of documents, the system presents an ordered list to the searcher. The sophistication of the ordering depends on the model the system uses as well as on how advanced the document and query weighting mechanisms are. Some very sophisticated systems go the extra mile and let the user provide relevance feedback or modify the query based on the results they were given.
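To make the matching step a little more concrete for myself, here is a minimal sketch of an inverted file with a binary search over its sorted vocabulary. The function names and the three tiny documents are my own assumptions for illustration; the article does not give code, and a real engine would be far more elaborate.

```python
from bisect import bisect_left
from collections import defaultdict


def build_inverted_file(documents):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    # Keep the vocabulary sorted so a term can be located by binary search.
    vocabulary = sorted(postings)
    return vocabulary, {term: sorted(ids) for term, ids in postings.items()}


def lookup(vocabulary, postings, term):
    """Binary-search the sorted vocabulary for a term and return its postings."""
    term = term.lower()
    position = bisect_left(vocabulary, term)
    if position < len(vocabulary) and vocabulary[position] == term:
        return postings[term]
    return []


def match_all(vocabulary, postings, query):
    """Return the ids of documents that contain every query term."""
    terms = query.split()
    if not terms:
        return []
    result = set(lookup(vocabulary, postings, terms[0]))
    for term in terms[1:]:
        result &= set(lookup(vocabulary, postings, term))
    return sorted(result)


# Hypothetical sample documents, invented for this sketch.
docs = {
    1: "library catalogs use authority control",
    2: "search engines match queries against an inverted file",
    3: "library search engines rank documents",
}
vocabulary, postings = build_inverted_file(docs)
print(match_all(vocabulary, postings, "library search"))
```

Only document 3 contains both "library" and "search", so it is the only match. Ordering the matches would be a separate step.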
What Document Features Make a Good Match to a Query
Term Frequency
How frequently a term appears in a document is one of the most obvious ways to determine a document’s relevance to a query. However, several situations can undermine this premise. Many words have multiple meanings, such as “pool” or “fire.” Also, in some domains certain words are so common and so frequent that their relevance declines sharply.
Location of Terms
Many search engines give preference to words found in the title or lead paragraph or in the metadata of a document. Terms that occur in the title of a document or page that match a query term are therefore frequently weighted more heavily than terms occurring in the body of the document. Also, query terms that occur in section headings or within the first paragraph of the document may be more likely to be relevant.
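A rough sketch of how location weighting might look: each field of a document gets a weight, and a query term counts for more when it appears in a heavily weighted field. The field names and the weights (3, 2, 1) are assumptions I made up for illustration, not values from the article.

```python
# Illustrative weights only; real engines choose and tune their own.
FIELD_WEIGHTS = {"title": 3.0, "lead": 2.0, "body": 1.0}


def location_score(document, term):
    """Sum the occurrences of a term, weighted by the field they appear in."""
    term = term.lower()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = document.get(field, "").lower().split()
        score += weight * words.count(term)
    return score


# Two hypothetical documents that each mention "authority" once.
doc_a = {
    "title": "Authority control in catalogs",
    "lead": "name variations cause problems",
    "body": "catalogs link names",
}
doc_b = {
    "title": "Notes on catalogs",
    "lead": "a short overview",
    "body": "authority files exist",
}
print(location_score(doc_a, "authority"), location_score(doc_b, "authority"))
```

Both documents contain the term once, but the one with the term in its title scores higher.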
Link Analysis
Link analysis works like bibliographic citation practices. Link analysis is based on how well connected each page is as defined by Hubs and Authorities, where Hub documents link to large numbers of other pages (out-links) and Authority documents are those referred to by many other pages, or have a high number of “in-links.”
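The article does not say how any engine actually computes these scores. The sketch below follows the general hubs-and-authorities idea (often called HITS), where the two scores reinforce each other over repeated passes. The link graph, page names and iteration count are all hypothetical.

```python
def hubs_and_authorities(links, iterations=20):
    """Score pages as hubs (useful out-links) and authorities (many in-links)."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {page: 1.0 for page in pages}
    authority = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages linking to this page.
        authority = {page: 0.0 for page in pages}
        for source, targets in links.items():
            for target in targets:
                authority[target] += hub[source]
        # Hub: sum of the authority scores of the pages this page links to.
        hub = {
            page: sum(authority[t] for t in links.get(page, []))
            for page in pages
        }
        # Normalize so the scores stay bounded across iterations.
        a_norm = sum(v * v for v in authority.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        authority = {p: v / a_norm for p, v in authority.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, authority


# Hypothetical link graph: page -> pages it links to.
links = {
    "portal": ["paper_a", "paper_b", "paper_c"],
    "blog": ["paper_a"],
    "paper_a": [],
}
hub, authority = hubs_and_authorities(links)
print(max(hub, key=hub.get), max(authority, key=authority.get))
```

Here "portal" comes out as the strongest hub because it links to the most pages, and "paper_a" as the strongest authority because the most pages link to it.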
Popularity
Google and several other search engines use popularity to determine page relevance. Popularity uses data on how frequently users choose a page to predict its relevance.
Date of Publication
Some search engines assume that the newer the information is, the more likely it is to be relevant to the user. These engines present the most current results first, followed by the older ones.
Length
When two documents contain the same query terms, the search engine prefers the document in which the term occurs more often relative to the length of the document.
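A small sketch of the difference between a raw count and a count taken relative to document length. The two sample texts are invented, and dividing by the number of words is just one simple way to normalize. It is my assumption, not necessarily what any particular engine does.

```python
def term_frequency(term, text):
    """Raw number of times the term occurs in the text."""
    return text.lower().split().count(term.lower())


def normalized_term_frequency(term, text):
    """Occurrences of the term divided by the total number of words."""
    words = text.lower().split()
    if not words:
        return 0.0
    return words.count(term.lower()) / len(words)


# Hypothetical documents that both mention "pool" twice.
short_doc = "pool safety rules for the pool"
long_doc = (
    "the hotel has a pool a gym a spa a restaurant "
    "and a second pool near the garden"
)
for doc in (short_doc, long_doc):
    print(term_frequency("pool", doc), round(normalized_term_frequency("pool", doc), 3))
```

Both texts contain the term twice, but relative to length the short text is much more "about" pools, so it would be preferred.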
Proximity of Query Terms
When the terms occur near each other in a document it is more likely that the document is relevant to the query than if the terms occur at a greater distance.
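Proximity can be sketched as the smallest distance, counted in words, between two query terms in a document. The function name and sample sentences are hypothetical, and real systems may measure distance differently.

```python
def min_term_distance(text, term_a, term_b):
    """Smallest gap in words between term_a and term_b, or None if one is absent."""
    words = text.lower().split()
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    if not positions_a or not positions_b:
        return None
    return min(abs(a - b) for a in positions_a for b in positions_b)


# Hypothetical documents for the query "search engine".
near = "the search engine ranks pages"
far = "search the library catalog before you start the engine"
print(min_term_distance(near, "search", "engine"))
print(min_term_distance(far, "search", "engine"))
```

A smaller distance suggests the terms are being used together, so the first text would be judged the more likely match.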
Proper Nouns
These sometimes have a greater weight, since many searches are performed on people, places or things.
Summary and Reflection
Up till now search engine providers have primarily opted for less versus more complex processing of documents and queries. This then leaves the bulk of the work to be done by the searcher to pick their way through the results to find what they are seeking. Hopefully this status-quo will not continue and search engines will continue to enhance the quality of the processing.
I have to honestly say it never occurred to me to ask how or what exactly happens when I perform a search. It was interesting to learn exactly how complex the search process is and what all the different components are. Just today I saw an additional article (see below for link) from ZDnet.com that stated that Google drew 64 percent of the search queries for the month of March. Overall I found the article very enlightening and informative as to how the whole process works. I certainly won’t look at performing a search the same way again.
http://news.zdnet.com/2100-9595_22-6175248.html?part=rss&tag=feed&subj=zdnn
Introduction
Indexing and Abstracting (I&A) databases generally have not implemented name authority control as is used in many library catalogs. Most I&A databases burden the searcher with identifying and selecting name variations. The use of widely varied forms of authors’ names without reference or links to alternatives causes problems for the searchers. End results may be inaccurate or incomplete, resulting in a decrease in the scientific integrity of the research.
Individual library online catalogs have been applying authority control since the implementation of AACR2. Personal name authorities bring together works by an author, regardless of the variations in name as identified in the work itself.
One large challenge lies in managing author name changes. Few databases have chosen to link the variations or name changes to facilitate searching and retrieval of an author’s works. I&A databases may also move all of an author’s works from the former name to the current name, altering some records so that the author name no longer matches the name displayed on the original article.
Examples of Problems with Name Changes
Authors who publish works under two forms of their name and authors who have changed their names are not easily found in databases, and in most cases not all of the relevant citations are retrieved. If database citations do not contain the form of the name used on the article, citing errors will most often occur.
The Web of Science, the original citation tool, uses the author name exactly as it appears in the citing article. The policy of ISI is not to over-correct “variations” because it cannot check them all and refuses to second guess an author’s intentions.
When the I&A database has reformatted an author name and the searcher selects the database’s form of the name rather than the name as it appears on the article, some citations will not be retrieved.
Potential Solutions: Overview
Solutions to the problem of identifying and linking author name changes within I&A databases can take many approaches both in production and in the research modeling stage.
Name Disambiguation through automated methods
Maintaining name authority files requires a great deal of labor but rewards the user with high recall and high precision. Automated methods of name disambiguation require less manual labor but cannot match the recall and precision of well-maintained authority files.
Potential Solutions: Authority File in the MathSciNet Database
The MathSciNet database creates and maintains a name authority file to control variations. Much of the identification process is automated; however, around 20% of all the items require manual checking. MathSciNet’s solution is workable in small database communities, where it is possible for human indexers to check and correct all problem entries manually. This solution may not work for large databases, but it could prove very useful for databases covering a smaller range of information and topics.
Potential Solutions: More Examples Creating, Using, or Linking Authority Files
I&A databases may follow Library of Congress (LC) practice and gain an added benefit by consulting the LC Name Authority File (LCNAF) to help collocate the names in their author indexes. They would also benefit from the effort that goes into compiling the LCNAF. However, I&A databases make the mistake of changing the authors’ names rather than pointing or linking to the variations as given within the articles themselves.
Several projects are currently in the works to build on LCNAF and other authority files.
FRANAR – Functional Requirements and Numbering of Authority Records
“Is working to develop a conceptual model to assist in an assessment of the potential for international sharing and use of authority data both within the library sector and beyond.”
HoPEc System
Implements an author registration component that places the burden on the authors to create and maintain their own authority files if they wish for the papers to be clustered.
Librarians realized long ago that linking methods could be exchanged for authorized forms of names. In the automated environment a system does not have to select a “correct” form as long as all of the variations link to each other.
LEAF Project – Linking and Exploring Authority Files
Links all authority records that pertain to the same person based on the automatic linking rules of the project and includes birth/death dates.
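The idea of linking name variations to each other, without picking one preferred form, can be sketched as a small data structure. The class below is my own illustration and is not how LEAF, HoPEc or any other project is actually built; the author names are invented.

```python
from collections import defaultdict


class LinkedNames:
    """Link name variations to one another without choosing a preferred form."""

    def __init__(self):
        self._cluster_of = {}              # variant -> cluster id
        self._variants = defaultdict(set)  # cluster id -> set of variants
        self._next_id = 0

    def link(self, *names):
        """Record that all the given names refer to the same person."""
        existing = {self._cluster_of[n] for n in names if n in self._cluster_of}
        if existing:
            target = min(existing)
        else:
            target = self._next_id
            self._next_id += 1
        # Merge any other clusters these names already belonged to.
        for cluster_id in existing - {target}:
            for variant in self._variants.pop(cluster_id):
                self._cluster_of[variant] = target
                self._variants[target].add(variant)
        for name in names:
            self._cluster_of[name] = target
            self._variants[target].add(name)

    def variants(self, name):
        """Return every known form linked to this name, including itself."""
        cluster_id = self._cluster_of.get(name)
        if cluster_id is None:
            return [name]
        return sorted(self._variants[cluster_id])


# Hypothetical author with an initial-only form and a later name change.
names = LinkedNames()
names.link("Smith, J.", "Smith, Jane")
names.link("Smith, Jane", "Smith-Jones, Jane")
print(names.variants("Smith, J."))
```

A search on any one form can then be expanded to all linked forms, while each record keeps the name exactly as it appeared on the article.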
Potential Solutions: Alternative Approaches Using Name Disambiguation
Instead of using name authority files, researchers are aiming for an automated method of examining more than the author name to determine the likelihood that any two papers with similar author names have been written by the same person.
Authority name issues can be grouped into three categories: (1) multiple name variations that signify the same author, (2) similar or homonymic names that belong to more than one author and (3) linear changes when an author alters his/her name.
Disambiguation projects generally share similar attributes. All of them use metadata beyond the author name alone. Most have shown that adding more data elements to the models helps to disambiguate names faster and with a higher rate of success than using author names alone. Merging the techniques of adding data elements and relying on disciplines to maintain their own linked name files could garner great success for large, multidisciplinary databases such as I&A.
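As a rough illustration of using metadata beyond the author name, the sketch below compares two papers by how much their co-authors, keywords and affiliations overlap. The field names, the equal weighting and the sample records are all assumptions of mine; the projects described above use their own, more sophisticated models.

```python
def jaccard(a, b):
    """Share of items the two collections have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_author_score(paper_x, paper_y):
    """Average metadata overlap, a crude stand-in for 'likely the same author'."""
    fields = ("coauthors", "keywords", "affiliations")
    return sum(jaccard(paper_x[f], paper_y[f]) for f in fields) / len(fields)


# Hypothetical records that all list an author named "J. Smith".
paper_1 = {
    "coauthors": {"Lee", "Gomez"},
    "keywords": {"topology", "knots"},
    "affiliations": {"State University"},
}
paper_2 = {
    "coauthors": {"Lee"},
    "keywords": {"knots", "manifolds"},
    "affiliations": {"State University"},
}
paper_3 = {
    "coauthors": {"Patel"},
    "keywords": {"soil", "irrigation"},
    "affiliations": {"Agri Institute"},
}
print(round(same_author_score(paper_1, paper_2), 3))
print(round(same_author_score(paper_1, paper_3), 3))
```

The first pair shares a co-author, a keyword and an affiliation, so it scores much higher than the second pair, which shares nothing but the name.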
Conclusion
Alternative solutions must be implemented to assure access, retrieval and the proper crediting of authors’ works. Without control or linkage to name variations, searchers may retrieve incomplete or inaccurate results. To meet the access needs of the 21st Century both catalogs and I&A databases may need to implement options that present a high degree of probability that the items have been authored by the same individual rather than options that provide high precision with manual maintenance. Striving for name disambiguation rather than name authority control may be the best option for catalogs, I&A databases and digital library collections.
Developing automated methods can reduce the searcher’s burden of determining author name variations while ensuring that the author index entries match the names on the article and that the end user can successfully retrieve all of an author’s works from that database.
Reflection
Anything that furthers getting the correct packets of information to the user is something that we should be pursuing in library science. The average user does not know to look under the author name variations unless the search page tells them to do so, and even then they may not read the instructions. I believe we need to focus on the quality of the results and ensure that people are finding what they are looking for and gaining good quality information. We must, as a profession, do whatever is necessary to deliver the correct information quickly to those who request it.