Degree In Sight

If you're thinking about collecting data for your dissertation or research project, consider tapping into a remarkable online resource: public databases.

These resources range from familiar repositories of social science findings to databases of genetic material, video footage and three-dimensional images of neurons.

Public databases give students unparalleled access to large sample sizes, longitudinal data and special populations — benefits difficult to achieve on limited student resources.

"The sample I used is sufficiently powered to allow me to look at intersections of race, sexual orientation and gender — something smaller data sets don't usually offer," says City University of New York Graduate Center student Sara McClelland, who used one of these resources for her dissertation.
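To see why intersections demand large samples, it can help to run the arithmetic. The sketch below is illustrative only: the subgroup proportions, the sample sizes and the function names are assumptions, not figures from McClelland's study or any database listed here. It also assumes the subgroup traits are independent and uses a standard normal approximation for a two-group comparison of means.

```python
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a standardized
    mean difference (Cohen's d) in a two-sided, two-sample comparison.
    Uses the normal approximation, so it slightly understates the
    requirement compared with an exact t-test calculation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return 2 * ((z_alpha + z_power) / effect_size) ** 2


def cell_size(total_n, *proportions):
    """Expected number of participants in one intersectional cell,
    assuming the traits are independent (a simplifying assumption)."""
    n = total_n
    for p in proportions:
        n *= p
    return n


# Hypothetical proportions for gender, sexual orientation and race subgroups.
proportions = (0.5, 0.05, 0.15)

needed = n_per_group(0.5)                    # medium effect size, d = 0.5
large = cell_size(21000, *proportions)       # a large public data set
small = cell_size(300, *proportions)         # a typical student sample

print(f"Needed per group: about {needed:.0f}")
print(f"Expected cell size, N = 21,000: about {large:.0f}")
print(f"Expected cell size, N = 300: about {small:.0f}")
```

Under these assumed proportions, a sample of 300 would be expected to yield only one or two participants in the intersectional cell, while a sample in the tens of thousands yields enough to compare groups.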

The downside is that you're limited to a study design and variables that you didn't choose yourself. But many students combine their own smaller-scale studies with analyses from large data sets, says Bonnie Knoke, who manages the Study of Early Child Care and Youth Development (SECCYD) data coordinating center at RTI International for the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD).

To make the most of these resources, learn what is in them and how to use them properly, advises Cornell University graduate student Taryn Morrissey, who used secondary SECCYD analyses for her dissertation and a paper she published in the Journal of Marriage and Family. Students can attend formal training sessions or peruse the databases' Web sites and read training manuals.

That investment can pay big dividends for both researchers and the general population, notes Carol Cushing, a clinical trials specialist for the National Institute on Drug Abuse's Clinical Trials Network database.

"By opening the door to more people through our database, some smart person may make a new connection or correlation," she says.

Large-scale databases come in a variety of forms and can be useful for grad students in almost any area of psychology. The major ones include:

Database: Inter-university Consortium for Political and Social Research (ICPSR)

Sponsor: A range of government agencies, colleges and universities

What's in it: The world's largest archive of digital social science data, containing 10 archives on topics including substance abuse, criminal justice and aging.

How to use the data: If your university is an ICPSR member, you can contact your institution's ICPSR representative for assistance or to find out about new data releases.


Database: TheDataWeb

Sponsor: The U.S. Census Bureau

What's in it: Federal census, economic, health, income, family and labor data.

How to use the data: The Web site includes tutorials for handling a variety of data types. You can also call (866) 437-0171 for help.


Database: Substance Abuse and Mental Health Data Archive

Sponsor: Substance Abuse and Mental Health Services Administration

What's in it: Data from surveys and studies including SAMHSA's National Survey on Drug Use and Health and the National Institute on Drug Abuse's Monitoring the Future study.

How to use the data: The Web site includes complete documentation on all of the data sets, including code books, descriptive information and lists of published papers.


Database: Clinical Trials Network (CTN)

Sponsor: National Institute on Drug Abuse

What's in it: Data on about 3,000 participants from 13 clinical trials of substance abuse treatments.

How to use the data: Contact CTN clinical trials specialist Carol Cushing at (301) 443-6697 or via e-mail.


Database: Add Health (National Longitudinal Study of Adolescent Health)

Sponsor: NICHD and 17 other agencies

What's in it: Four waves of health, behavior and social contexts data on about 21,000 people first surveyed in 1994 who are being followed into adulthood. It also includes some parent and biomarker data.

How to use the data: The restricted-use contract includes four hours of free consultation with appropriate staff; after that, there's a fee for help. Researchers can also share information through a listserv devoted to the database.


Database: The NICHD Study of Early Child Care and Youth Development

Sponsor: NICHD

What's in it: Longitudinal data on more than 1,000 children and their families, designed to address the relationship between child care and children's development. Since 1991, researchers have followed the same children from birth to adolescence, using an extensive array of variables and measures. Data from the first three phases (birth to sixth grade) are available now; data from Phase IV (adolescents) will be ready in October.

How to use the data: The Web site includes extensive training material and study documentation.


Database: CHILDES and TalkBank databases

Sponsor: National Science Foundation and the National Institutes of Health

What's in it: These related databases capture spoken language from about 150 studies in audio and video formats. CHILDES includes conversations between children and their caretakers and playmates; TalkBank covers adult communication, including that of people with aphasia (language problems that arise from brain damage).

How to use the data: The Web site includes training manuals, a database of transcripts, programs for analyzing transcripts, methods for linguistic coding and systems for linking transcripts to recordings.


Database: National Institute of Mental Health (NIMH) Human Genetics Initiative

Sponsor: NIMH

What's in it: Phenotype data, genotype data and biomaterials (such as DNA samples and cell line cultures) on more than 4,000 people with bipolar disorder, schizophrenia, Alzheimer's disease, autism and depression, as well as 4,000 controls and 12,000 of their relatives.

How to use the data: The Web site contains a wealth of information on using the data and lists a technical support specialist you can e-mail.


Database: The MRI Study of Normal Brain Development

Sponsor: NIH

What's in it: MRI data on a representative sample of approximately 500 children and young adults enrolled in a longitudinal study of normal brain development, as well as behavioral, psychological and neuropsychological data.

How to use the data: The Web site offers a downloadable data dictionary.


Database: NeuroMorpho.Org

Sponsor: NIH

What's in it: The world's largest centralized, curated repository of digital reconstructions of neurons, representing a wide variety of species, brain regions and cell types. It contains colorful two-dimensional images as well as a virtual reality display that allows you to manipulate, zoom or rotate the images.

How to use the data: Training manuals can help you analyze cell geometry or run simulations of electrical activity within neurons. The manuals include links to listservs, where seasoned users trade tips.

By Tori DeAngelis