National Criminal Justice Reference Service
Administered by the Office of Justice Programs, U.S. Department of Justice

Abstracts Database Online Tutorial

How Results Are Found and Displayed

Documents are found based on the criteria described below. They are displayed in reverse (descending) order by NCJ number, up to the maximum number of documents specified by the user. This order ensures that the most current documents relevant to the search criteria appear at the top of the list.
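The display rule above can be illustrated with a minimal sketch. It assumes that a higher NCJ number means a more recently added document; the `Record` class, its field names, and the `display_order` function are hypothetical and are not part of the NCJRS system.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical record shape, used only for illustration.
    ncj_number: int
    title: str


def display_order(matches, limit):
    """Sort matches by NCJ number, highest first, and keep at most `limit`."""
    ordered = sorted(matches, key=lambda record: record.ncj_number, reverse=True)
    return ordered[:limit]


matches = [Record(101, "A"), Record(305, "B"), Record(207, "C")]
top = display_order(matches, 2)
print([record.ncj_number for record in top])  # prints [305, 207]
```

The user-specified limit is applied after sorting, so the documents that are cut off are always the ones with the lowest NCJ numbers.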

Results of Boolean Searches

A Boolean search finds exactly the words a user types. If the word or combination of words is found anywhere in a document, that document is included in the search results.
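As a rough sketch of exact-word matching, the example below treats a query as a set of words that must all appear somewhere in the document. The AND combination, the case-insensitive comparison, and the function names are assumptions made for illustration; the tutorial does not describe the actual matching code.

```python
import re


def tokenize(text):
    """Split text into lowercase words (an assumed, simplified tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_and_match(query_words, document_text):
    """Return True if every query word appears, exactly, in the document."""
    words = set(tokenize(document_text))
    return all(word.lower() in words for word in query_words)


docs = {
    1: "Juvenile court charges were dropped.",
    2: "The charge was filed in juvenile court.",
}
hits = [doc_id for doc_id, text in docs.items()
        if boolean_and_match(["charge", "court"], text)]
print(hits)  # prints [2]
```

Document 1 is not returned because it contains "charges" rather than the exact word "charge"; no related or inflected forms are considered in this kind of search.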



Results of Concept and Pattern Searches

As a first step, the search simply looks for the existence or absence of query words or related terms in the document. It then evaluates several factors, each of which adds a certain relative "weight" to the document. Added together, these weights determine a document's relevance. If a document has any relevance to the search criteria, no matter how small the relevance, that document is included in the results set. Weights are determined by the following factors:

  • Completeness: The more query words a document contains (either exactly or through related terms), the greater the weight. A relevant document should contain at least one term or related term for each word in the query. If the document contains only a fraction of the original words, then the maximum rank of the document is equal to this fraction. For example, if the document contains only 3 out of the 4 original terms in the query, then its maximum rank is 75 percent. Related terms contribute less weight than the original (exact) words. If a query consists of three query terms, a document containing one instance of each of the three words will be ranked higher than a document containing 100 instances of one of the query terms.

  • Contextual Evidence: The larger the number of related terms, the greater the weight. Words are supported by their related terms. If a document contains a word and its related terms, the word is given a greater weight because it is surrounded by supporting evidence. For example, the word "charge" near the words "credit," "debt," and "card" is more likely to mean "charge card" than to mean "ward," "battery energy," or "to assign a task."

  • Semantic Distance: The more closely related the terms, the greater the weight. For example, synonyms are more closely related than antonyms. This association is used to compute the amount of contextual evidence that supports a word. The closer the terms are in relationship to the query words, the more weight they are given.
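The completeness cap and the idea of contextual evidence can be sketched as follows. This is an illustration only: the function names, the related-terms table, and the simple counting are assumptions, and the actual weighting used by the search is not given in this tutorial.

```python
def completeness_cap(query_terms, doc_words):
    """Fraction of query terms present in the document.

    Per the Completeness factor, this fraction is the document's maximum rank.
    """
    present = sum(1 for term in query_terms if term in doc_words)
    return present / len(query_terms)


def contextual_support(term, related_terms, doc_words):
    """Count how many of a term's related terms also appear in the document."""
    return sum(1 for related in related_terms.get(term, ()) if related in doc_words)


# 3 of the 4 query terms are present, so the maximum rank is 75 percent.
query = ["juvenile", "drug", "court", "sentencing"]
doc = set("the juvenile court reviewed drug cases".split())
print(completeness_cap(query, doc))  # prints 0.75

# One instance of each of three terms outranks 100 instances of one term.
three_terms = ["gang", "violence", "prevention"]
doc_a = set(["gang", "violence", "prevention"])
doc_b = set(["gang"] * 100)
print(completeness_cap(three_terms, doc_a) > completeness_cap(three_terms, doc_b))  # prints True

# Hypothetical related-terms table for the "charge" example.
related = {"charge": ["credit", "debt", "card"]}
card_doc = set("pay the charge on your credit card".split())
print(contextual_support("charge", related, card_doc))  # prints 2
```

Because the cap depends only on how many distinct query terms are covered, repeating a single term many times cannot raise it.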

To further refine this semantic relevance, the search looks at both the physical location of query words and related terms within the document and the total number of terms. By default, a document's rank is calculated using a formula that equally combines the factors described below.

  • Proximity: The closer together the query words and related terms are within the document, the greater the weight. A document is judged more relevant if it contains related terms that occur close together, preferably in the same sentence or paragraph. The system computes a factor for physical proximity, which increases for adjacent terms and lessens as terms become increasingly distant (physically) from each other. Thus, documents with many hits close together are ranked higher than documents in which those same hits are present but scattered far apart.

  • Hit Density: The greater the ratio of query words and related terms to the total number of words in the document, the greater the weight. A document is judged more relevant if a large portion of the total number of words in it are query words or related terms. Thus, short documents with many hits are ranked higher than longer documents that have the same number of hits.
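The two factors above might be approximated as in the sketch below. The specific formulas (the inverse of the gap between consecutive hits for proximity, a plain ratio for hit density, and a simple average to combine them equally) are assumptions chosen for illustration; the tutorial does not give the system's actual formula.

```python
def hit_positions(tokens, terms):
    """Indexes of tokens that are query words or related terms."""
    return [i for i, token in enumerate(tokens) if token in terms]


def hit_density(tokens, terms):
    """Ratio of hits to the total number of words in the document."""
    if not tokens:
        return 0.0
    return len(hit_positions(tokens, terms)) / len(tokens)


def proximity(tokens, terms):
    """Assumed proximity factor: 1.0 for adjacent hits, smaller as hits spread out."""
    positions = hit_positions(tokens, terms)
    if len(positions) < 2:
        return 0.0
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return sum(1.0 / gap for gap in gaps) / len(gaps)


def combined_rank(tokens, terms):
    """Combine the two factors with equal weight (a simple average)."""
    return (proximity(tokens, terms) + hit_density(tokens, terms)) / 2


terms = {"drug", "court"}
close = "the drug court opened last year in the county".split()
far = "the drug program opened last year near the court".split()
short = "drug court".split()

print(combined_rank(close, terms) > combined_rank(far, terms))  # prints True
print(hit_density(short, terms) > hit_density(close, terms))    # prints True
```

The `close` and `far` documents have the same length and the same number of hits, so only the proximity factor separates them. The `short` document scores higher on hit density because its two hits make up all of its words.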

Last updated on: 4/3/2007

