APA PsycNET® Is Now Powered by MarkLogic

We'd like to introduce you to the new engine behind APA PsycNET® searching and discuss why we chose it.

If you're an APA PsycNET® user, you've probably noticed a few things that are different about our platform of late. Most noticeable is the statement in red text on the platform itself that is both announcement and request for your feedback:

APA PsycNET has a new semantic search engine with improved search result speed and 30% faster delivery of new content. This new technology opens the way for emerging semantics, open linked data, and social search on the platform. Help us make PsycNET better with your feedback.

The new engine behind APA PsycNET searching is called MarkLogic, and our decision to move to the new system is a very big deal for us. We'd like to talk about why we made the move and, concomitantly, why we chose MarkLogic as our new search engine.

We'd also like to review some differences you'll encounter between our previous Lucene-based search server and MarkLogic in the way a search is processed and some issues that have cropped up as part of the transition process.

How We Got Here

In 2007 APA's Office of Publications and Databases and Information Technology Services Directorates collaborated on the development and implementation of the state-of-the-art APA PsycNET. Designed exclusively for APA's five databases at the time, PsycINFO®, PsycARTICLES®, PsycBOOKS®, PsycEXTRA®, and PsycCRITIQUES®, it combined the content from multiple databases in one search result set, provided Easy and Advanced search options and results management tools including faceted search, delivered citation finder and cited reference tools, and created a place to save, rerun, or share your searches.

We've added a number of features to the APA PsycNET platform over the years, in response to your requests. We have added fields and limits to the searches, as well as new types of notification services such as RSS feeds and custom email alerts. We have also greatly enhanced the display of full-text articles through more robust, interactive HTML and integrated the APA Dictionary of Psychology.

Challenges

However, five years is a very long time in database years, and we've continued to make more and more demands on our existing platform. Our original five databases are now seven, and we also make a host of other resources available through APA PsycNET, such as PsycSCANS, our Handbooks in Psychology Series, and APA Books® E-Collections. And we have plans for additional features.

In short, we've outgrown our original capabilities. We found that to be able to move in the direction we wanted, we needed to make a change, and we wanted a platform that would allow us to build many more applications than we could before and to leverage our unstructured content. We wanted to be able to do things like create content mashups by combining existing journals and books into content tailored for an individual need, such as a class. We wanted the option of building Web services on top of our content. In addition, we wanted to be able to provide the services we now offer faster and better.

Why MarkLogic?

Founded in 2003 as a relational database alternative, MarkLogic's core product is an XML repository that allows the system to focus on managing complex and unstructured information. In layperson's terms, it's a database and web server and search engine that can form the backbone of an extremely powerful website. After a careful search, APA staff selected it based on its "big data" applications and state-of-the-art digital capabilities, including semantic search, improved search retrieval speed, and accelerated delivery of new content.

We began with a soft launch of the new system in late December, running both the old and new systems, and by mid-January we took the training wheels off and transitioned to MarkLogic only. What does this make possible? All sorts of things. The first two clear changes are that we are now delivering our content 2 days earlier and our searches are faster. You'll also notice that we have introduced Type-Ahead and Spell Suggest (see APA PsycNET® Update: Spell Suggest and Type-Ahead Features Now Available, this issue). And we have many other improvements and new features in the planning stages. These will be released in the months ahead.

Search Differences

Inevitably, there are differences between an old and new system. We've already made some updates to the APA PsycNET Quick Reference Guide (PDF, 425KB), which you may either download from the Web or request from us as a free wallet-sized guide by contacting PsycINFO. Changes will soon be made to the APA PsycNET Help Menu as well.

Also, since a new search engine will come with some variants in the search syntax, there will be a more detailed guide posted to explain how to create complex Booleans.

Here are some of the differences you're likely to encounter:

Boolean Capitalization

Boolean terms must be capitalized. Thus, as you can see, a search for death or taxes is quite different from a search for death OR taxes.

[Screenshot: Boolean capitalization in APA PsycNET searches using MarkLogic]

Relevance

MarkLogic has a different relevance calculation algorithm, so you are likely to get results in a different order than you have experienced in the past.

Proximity Search

You asked for it, and we've now provided a new format for proximity searching. Our previous format was the following:

"perceived stress scale"~1

That has been changed to

"perceived" NEAR/1 "stress" NEAR/1 "scale"
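If you have many old-style proximity searches to update, the rewrite can be scripted. The following is a minimal Python sketch based only on the single example above; the function name to_near_syntax is hypothetical, and it assumes every old proximity search is a quoted phrase followed by ~N and that each word is joined to the next with NEAR/N. The actual APA PsycNET grammar may handle other cases differently.

```python
import re

# Matches the old format: a quoted phrase followed by ~N,
# for example "perceived stress scale"~1
OLD_PROXIMITY = re.compile(r'"([^"]+)"~(\d+)')


def to_near_syntax(query: str) -> str:
    """Rewrite each old-style proximity phrase as quoted words joined by NEAR/N.

    Assumption: this mirrors the one example in the article and is not
    an official conversion rule.
    """
    def rewrite(match: re.Match) -> str:
        words = match.group(1).split()
        distance = match.group(2)
        return f" NEAR/{distance} ".join(f'"{word}"' for word in words)

    return OLD_PROXIMITY.sub(rewrite, query)


print(to_near_syntax('"perceived stress scale"~1'))
# prints: "perceived" NEAR/1 "stress" NEAR/1 "scale"
```

Text that does not match the old pattern is left unchanged, so the function can be run over a whole list of searches.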

Saved Searches

Though My List entries have transitioned without issue, we have had reports of issues rerunning some saved searches. Sometimes the issue is easy to see. For example, a saved search that had used a lowercase Boolean may now have different results than one with an uppercase Boolean, as might a saved search with multiple Booleans not specifically grouped with parentheses.

If you are getting results that don't seem consistent with previous findings, the best thing to do is to reconstruct the search and save it again.
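If you keep copies of your saved search strings, a quick check can point out the ones most likely to behave differently. This is a rough Python sketch of the two cases mentioned above (lowercase Booleans, and multiple Booleans with no parentheses). The function name review_saved_search is hypothetical, the check assumes the Boolean terms are AND, OR, and NOT, and it is only a heuristic: it will also flag an ordinary lowercase "and", "or", or "not" inside a phrase.

```python
import re

LOWERCASE_BOOLEAN = re.compile(r"\b(and|or|not)\b")
UPPERCASE_BOOLEAN = re.compile(r"\b(AND|OR|NOT)\b")


def review_saved_search(query: str) -> list[str]:
    """Return reasons a saved search may be worth reconstructing."""
    warnings = []

    # Lowercase Booleans are no longer treated the same as uppercase ones.
    if LOWERCASE_BOOLEAN.search(query):
        warnings.append("lowercase Boolean")

    # Several Booleans without explicit grouping may be evaluated differently.
    operators = UPPERCASE_BOOLEAN.findall(query) + LOWERCASE_BOOLEAN.findall(query)
    if len(operators) > 1 and "(" not in query:
        warnings.append("multiple Booleans without parentheses")

    return warnings


for saved in ["death or taxes", "death OR taxes AND stress", "(death OR taxes) AND stress"]:
    print(saved, "->", review_saved_search(saved))
```

A search that comes back with an empty list is not guaranteed to return the same results as before; relevance ordering has also changed.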

Growing Pains

In most respects, the "under-the-hood" installation of MarkLogic has gone smoothly and lived up to our expectations. However, the transition has not been without challenges. A move this major involves teams of managers, developers, software engineers, and user acceptance testers, just to scratch the surface.

And in all the coding, connecting, and testing, some things get missed, or act a bit differently than we had expected, or one change can have unanticipated consequences somewhere else. We know — we're preaching to the choir, and you've all been there. But the consequence is that we've encountered a number of bugs over the past couple of months, many of them reported by the library community. We have fixed or will soon fix all those that we've discovered so far, but we need to be vigilant and watch for any other issues that crop up.

That's where you can be so helpful as additional eyes on the platform. If you encounter something that doesn't seem quite right, please let us know. You can use the form on the platform to report to us, the contact us link, or email PsycINFO.