Building Tools for Efficient Searching: Updating the Thesaurus of Psychological Index Terms®

This article explores how the Thesaurus of Psychological Index Terms® is updated and why it is such a valuable resource for building a good search.

There is no question that the index terms found in the Thesaurus of Psychological Index Terms® provide targeted information retrieval for PsycINFO® and its related databases.

As Brenda Evans, PsycINFO's Manager of Bibliographic Production, explains:

I think the most important point of index terms is that they bring a level of consistency to the indexing. And that is both for the person who is doing the indexing and for people searching the database. It gives us a standardized vocabulary with which we can train people to focus on the most important elements in a document — its "about-ness" — as main concepts are translated to index terms. So one of the main things we do in training is to help people understand "What is this about? What is the most important thing here?" And index terms, because you have to choose them, force you to that level of conceptualization.

The Thesaurus also gives us a way to bring a level of user-focus to the way that we think about what we index — we need to put records into the database using a language that will enable searchers to find them. The English language is very broad, and many words mean the same thing. It gives us a common language with which to index, and it removes the ambiguity or the synonyms surrounding a word's use.

Alvin Walker, PsycINFO's Manager of Product Development, edited five editions of the Thesaurus. As he explains, "The most important part of the controlled vocabulary in the Thesaurus is that it does allow you to zero in on specific concepts, without the noise, without the garbage, that you get without using a controlled vocabulary."

But as the language of science changes and evolves, tools such as the index terms used in the APA databases naturally need to evolve in response. As Evans explains, "One thing we find important is that we constantly need new terms to describe the language, to describe the material that we're covering. We're always evaluating, 'Do we need a new term? Is this something we're just going to get once in a while or are we going to see this consistently?'"

So how do terms get added to the Thesaurus? Earning a place in the highly respected Thesaurus of Psychological Index Terms takes an interesting combination of relevance, research trends, and popularity.

Gathering Candidate Index Terms

Ian Galloway, PsycINFO's Senior Specialist for Vocabulary Development, guides the process of determining which terms are selected for consideration and which are eventually included in the Thesaurus. Galloway's background is in computational linguistics, an interdisciplinary field dealing with the statistical or rule-based modeling of natural language from a computational perspective.

Galloway explains that the first step of the process is to gather candidate terms. These come primarily from six sources:

  • PsycINFO Indexers
    Because the indexers are the ones who work most closely with the index terms, they are the ones who are most aware when concepts that may warrant inclusion arise.
  • Machine Aided Indexing
    PsycINFO employs a Machine Aided Indexing tool that reviews all the content indexed in PsycINFO. Data from the tool provides a macro overview of the terms associated with information going into the database. The tool only suggests terms, however; the indexers review, supplement, and often change them.
  • Database Users
    PsycINFO users often have strong opinions about both which terms should be added and which should be dropped.
  • Incoming Journals
    New journals being added to the PsycINFO database are scrutinized for the impact that their inclusion will have on the Thesaurus. For example, when PsycINFO expanded its neuroscience coverage several years ago, the expansion necessitated the addition of more neuroscience-related index terms.
  • Subject Matter Experts
    Subject matter experts, who are often former editors of the prominent scholarly journals, are consulted periodically to help APA identify trends in research and, ultimately, the vocabulary of the science.
  • Related Lexicons
    Lexicons in many of the disciplines covered by PsycINFO are consulted to see how terminology is used and to disambiguate terms that may have totally different meanings to scholars in different fields.

Researching the Candidate Terms

Once the candidate list is established, the work begins. As Galloway explains, "I'll review that list. There are often some that I can disregard right away because we already have coverage for that concept or I can say that will be a 'Used For' term for something we already have coverage for." ("Used For" references represent some, but not all, of the most frequently encountered synonyms, abbreviations, and alternative spellings or word sequences of existing Thesaurus terms.) The remaining candidate terms are then discussed and sifted with the indexing staff.
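To make the first-pass review concrete, here is a minimal Python sketch of how a candidate might be checked against existing terms and "Used For" references. It is an illustration only, not PsycINFO's actual tooling: the function name `triage`, the data structures, and every sample entry are hypothetical and are not taken from the real Thesaurus.

```python
# Hypothetical sample data for illustration; NOT actual Thesaurus content.
EXISTING_TERMS = {"Adolescents", "Short Term Memory"}

# "Used For" references: non-preferred wording -> preferred index term.
USED_FOR = {
    "teenagers": "Adolescents",
    "teens": "Adolescents",
    "working memory span": "Short Term Memory",
}


def triage(candidate):
    """Return (status, preferred_term) for a candidate term.

    status is "already covered", "used for", or "needs discussion".
    Matching is case-insensitive and ignores surrounding whitespace.
    """
    key = candidate.strip().lower()
    existing = {term.lower(): term for term in EXISTING_TERMS}
    if key in existing:
        return ("already covered", existing[key])
    if key in USED_FOR:
        return ("used for", USED_FOR[key])
    # Anything else would go on to discussion with the indexing staff.
    return ("needs discussion", None)


if __name__ == "__main__":
    for word in ["Teens", "adolescents", "Cyberbullying"]:
        print(word, "->", triage(word))
```

A plain dictionary is enough here because each non-preferred wording points to exactly one preferred term, which is what removes the ambiguity for both indexers and searchers.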

Then the research starts. Galloway first searches the title, keyword, and abstract fields of existing PsycINFO records; if the term exists in two of those three fields for a significant number of records, its chances for inclusion improve greatly. He also checks the term in other databases from both APA and outside sources, as well as in other thesauri and reference works such as the APA Dictionary of Psychology. Often Galloway will have to negotiate over the addition or deletion of a term.
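The two-of-three-fields check described above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the actual procedure: the record layout (dictionaries with `title`, `keywords`, and `abstract` strings), the function names, and the sample records are all hypothetical, and the article does not say what counts as a "significant number" of records, so that threshold is left to the caller.

```python
FIELDS = ("title", "keywords", "abstract")  # hypothetical field names


def fields_containing(term, record):
    """Count how many of the three fields mention the term (case-insensitive)."""
    needle = term.lower()
    return sum(needle in record.get(field, "").lower() for field in FIELDS)


def count_strong_matches(term, records, min_fields=2):
    """Count records in which the term appears in at least min_fields fields."""
    return sum(fields_containing(term, record) >= min_fields for record in records)


if __name__ == "__main__":
    # Invented sample records for illustration only.
    sample = [
        {"title": "Cyberbullying among teens",
         "keywords": "cyberbullying, schools",
         "abstract": "We study peer aggression."},
        {"title": "Peer aggression online",
         "keywords": "aggression",
         "abstract": "Cyberbullying is discussed briefly."},
    ]
    print(count_strong_matches("cyberbullying", sample))
```

Simple substring matching is used to keep the sketch short; a real search would also need to handle plurals, spelling variants, and phrase boundaries.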

As terms emerge as finalists, Galloway determines how they should fit into the Thesaurus of Psychological Index Terms, including each term's place in the Thesaurus hierarchy and the scope note for each potential entry.

The final candidates are presented to PsycINFO management and Dr. Gary VandenBos, the APA Publisher, who make the final determination about which will be added to the Thesaurus.

Adding Terms to the Thesaurus

New index terms are incorporated into PsycINFO during the annual refresh of the database. Terms are back-mapped to hundreds or thousands of records that use a predecessor term. This enables researchers to use the new terms from the time that the database is refreshed.

A crucial update to the database occurs when there is a change in index terms for a topic that reflects changes in current scientific thinking — for example, "Bipolar" replacing "Manic Depression." In that case, the term will be back-mapped during the refresh so that previous entries for "Manic Depression" will include "Bipolar." In other cases, terms may be totally replaced, as when "Gypsies" was replaced by "Romanies." Another essential element is the creation of Historical Notes that are added to the Thesaurus to reflect the change.
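The two kinds of update described above, adding a new term alongside its predecessor and replacing a term outright, might be sketched as follows. This is a simplified illustration under stated assumptions, not the real refresh process: the `back_map` function, its `keep_old` flag, and the record layout (a dictionary with an `index_terms` list) are hypothetical.

```python
def back_map(records, old_term, new_term, keep_old=True):
    """Apply a new index term to every record that carries its predecessor.

    keep_old=True  : the record keeps the old term and gains the new one.
    keep_old=False : the old term is replaced by the new one.
    Returns the number of records that were updated.
    """
    updated = 0
    for record in records:
        terms = record["index_terms"]
        if old_term not in terms:
            continue
        if new_term not in terms:
            terms.append(new_term)
        if not keep_old:
            terms.remove(old_term)
        updated += 1
    return updated


if __name__ == "__main__":
    # Invented sample records for illustration only.
    records = [
        {"index_terms": ["Manic Depression", "Drug Therapy"]},
        {"index_terms": ["Gypsies"]},
    ]
    back_map(records, "Manic Depression", "Bipolar")            # add alongside
    back_map(records, "Gypsies", "Romanies", keep_old=False)    # replace outright
    print(records)
```

Returning the count of updated records gives a quick sanity check against the "hundreds or thousands" of records a single change can touch.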

In addition to back-mapping terms in the database, staff also need to add terms to the Machine Aided Indexing (MAI) tool used by the PsycINFO indexers. To train the MAI tool, Galloway identifies 75 records that show an ideal use of each new Thesaurus term; the MAI tool then analyzes these records. As part of this process, potentially ambiguous terms (such as moral/morale) are also identified, and 75 ideal records for each are loaded into the MAI tool. These efforts are undertaken to ensure that the list of MAI suggestions the APA indexers receive is as current and as unambiguous as possible.

Users and the Thesaurus

Galloway explains the benefits of using the index, saying, "When researchers take the time to access the Thesaurus, to build a search, it's easy for them to see the Scope Notes to give them a basic understanding of how we apply a concept. Once they understand the value of the vocabulary, their searches become more targeted, more efficient, and more accurate."