When doing research in Sociology, you may find it helpful to refer to data or statistics to support your arguments or provide you with more context. But what exactly are data and statistics? To start, it's important to note that these terms mean different things and are not meant to be used interchangeably. While data can be considered unique pieces of information that can be analyzed, statistics are often the result of doing that analysis to answer questions of "why" or "how."
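To make the distinction concrete, here is a minimal Python sketch. The income values are invented for illustration and do not come from any real survey. The list of individual values is the data; the median and mean calculated from it are statistics.

```python
from statistics import mean, median

# Hypothetical raw data: annual household incomes (USD) from an imaginary survey.
# Each value is one data point.
household_incomes = [32000, 41000, 47500, 52000, 58000, 75000, 210000]

# Statistics are the result of analyzing the data.
median_income = median(household_incomes)
mean_income = mean(household_incomes)

print(f"Median household income: {median_income}")
print(f"Mean household income: {mean_income:.2f}")
```

In this made-up sample, one very high income pulls the mean above the median, which is one reason income is often reported as a median.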
You can also watch the following video from the University of Houston to learn more about different types of data and how they might be used:
Acknowledgment: Information in the box came from the University of Houston Libraries Finding Data Research Guide.
Data is defined as facts or information that can be used for reporting, calculations, planning, or analysis. Data can be analyzed and interpreted using statistical procedures to answer “why” or “how.” Data is used to create new information and knowledge, and has the following characteristics:
Qualitative data describes the qualities or characteristics of something. It is non-numerical and often collected through interviews, participant observation, and focus groups. It can be subjective and typically describes a perception or point of view. It is particularly useful for gaining cultural insight into the social contexts and beliefs of a particular population. Qualitative data can take the form of field notes, audio, transcripts, and video.
Quantitative data attempts to quantify an answer to one or more questions. It is numerical and often collected through measurements, surveys, and observations. Quantitative data is usually analyzed in programs such as Excel, R, SPSS, and Stata.
Open data and content can be freely used, modified, and shared by anyone for any purpose. Open data should
For more information about open data and examples of open data, see Unlocking the Power of Open Data, a tutorial created by the Data Equity for Main Street Project, a partnership between the California State Library and the Washington State Office of the Chief Information Officer, funded by the John S. and James L. Knight Foundation.
Proprietary data are generally documented in contracts and legally should not be published or disclosed to outside entities. Proprietary data may be protected under copyright, patent, or trade secret laws. Examples of proprietary data include:
Data from library subscription databases are proprietary data. The use of data from library databases requires authentication, and generally cannot be shared freely on the web or with people outside of the university.
Restricted-use data contain sensitive information (that is, information that can cause harm if revealed) or information that could enable the identification of respondents. Data may also be restricted-use because of confidentiality promises or proprietary status.
Examples of sensitive information are reports of sexual behavior, criminal history, drug use, mental health history, HIV status, information collected from minors, or other materials that warrant extra discretion.
However, such data offers considerable potential for research. The government and the University therefore want to ensure that restricted data is handled in a way that safeguards respondents and research subjects while still allowing access for research that benefits society as a whole. Files containing confidential information are available to researchers only under certain conditions and agreements. Standard requirements may include the following:
Be specific about your topic so that you can narrow your search, but be flexible enough to tailor your needs to existing sources.
Identify the Unit of Analysis
You should be able to define the following:
Who or What?
Social Unit: This is the population that you want to study.
It can be...
When?
Time: This is the period of time you want to study.
Things to think about...
Where?
Space: Geography or place.
There are two main types of geographic classifications...
Keep in mind...
Data is not available for every conceivable topic. Some data is private, must be purchased, was never collected, or is otherwise unavailable. Be prepared to try alternative data.
Content from MSU Libraries-Finding Data & Statistics
Once you have defined the boundaries of your topic, you can use them to identify search terms, or keywords, to get started in the search process. This will ensure that your search methods are efficient and effective, save you time, and yield the most relevant results.
For example, suppose you are studying education equity in schools and would like to collect median household income for Houston from 2010 to 2015. For this question, your main concepts are 2010-2015, median household income, and Houston.
You can also vary your keywords. For example, instead of “Houston household income,” you might try the search terms “Houston income,” “Houston family income,” or “Houston family earnings.”
This is a good strategy if you are not sure what types of variables exist or what data would be relevant for your project. Look within a data archive that collects data in the general subject area you are researching.
Dataset Search enables users to find datasets stored across thousands of repositories on the Web, making these datasets universally accessible and useful.
Click on the "free" filter to remove datasets you can't access.
Ask yourself: Who might collect and publish this type of data? This can be a good strategy if you are familiar with library databases or have a sense of who is a major source of the sort of data you are seeking. Visit the Data and Statistics Library Databases page, or go to the website of a relevant organization to look for data. There are several commonly used secondary data sources listed on this guide you can try.
These are some of the main types of producers of statistical information:
By searching through existing literature, you can discover datasets. When you find a relevant article, it may point you to the dataset it used. What data sources are they using in their methods? Are they working with general-purpose datasets, or did the researchers have to collect their own data? This will give you an idea of the possibilities and limitations of data on your topic. If they are using a secondary dataset, you can try to track that source down. Knowing the exact name of a specific source (or even better, the DOI) can make it much easier to locate.
The library provides access to hundreds of databases that you can search to find scholarly articles on your subject. Check out our Databases A-Z list and limit by subject to find databases that may work for your subject area.
I know, it's obvious! When searching Google, be sure to identify your topic keywords carefully and try using synonyms. Add in terms like “data” or “statistics,” or method terms like “survey.”
You may need to include the dates and variables you are looking for in your search, for example, “2010 Houston median household income.” If you are getting too few results, try reducing the number of concepts in your search. For example, we could change our original search to “Houston income” or “Houston household income.” Another way to broaden your search is to use synonyms or related terms. In our example, you might also try “Houston family income” or “Houston family earnings.”
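If you want to keep track of keyword variations systematically, the idea can be sketched in a few lines of Python. This is only an illustration: the function name `build_queries` and the term lists are assumptions made for this example, and the resulting strings are simply search phrases you could paste into Google or a data archive.

```python
from itertools import product

def build_queries(place_terms, topic_terms, extra_terms=("",)):
    """Combine one term from each concept into a search string.

    An empty string in extra_terms means "no extra term".
    """
    queries = []
    for place, topic, extra in product(place_terms, topic_terms, extra_terms):
        # Skip empty parts so the query has no stray spaces.
        parts = [part for part in (place, topic, extra) if part]
        queries.append(" ".join(parts))
    return queries

# Hypothetical term lists based on the Houston income example.
queries = build_queries(
    place_terms=["Houston"],
    topic_terms=["income", "household income", "family income", "family earnings"],
    extra_terms=["", "data", "statistics"],
)

for query in queries:
    print(query)
```

Starting with the broadest phrases and adding terms such as dates only when you get too many results mirrors the advice above.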
Search strategy #5: Ask for help.
Contact your subject librarian for assistance if you encounter problems in locating the data source you need.
When considering whether to use data created by someone else in your research, it is important to evaluate its usefulness to your work as well as its validity and trustworthiness. To do this, ask yourself these questions:
What is secondary data? Secondary data is data that was collected in the past by someone other than the researcher using it. This data can be analyzed to address the researcher’s questions.
Provides social science research data in disciplines such as sociology, political science, criminology, history, psychology, and more, that may be used to conduct secondary analyses. Includes access to over 7,000 data collections and 500,000 files.
Offers access to data and digital content from the U.S. Census Bureau which includes statistics about population, housing, industry, and business.
Access to authoritative public and proprietary sources. PolicyMap allows users to make maps, find specific locations, and create market reports.
Use Social Explorer to visualize and interact with data, create maps, charts, reports and downloads that help you reach your goals. Explore hundreds of thousands of built-in data indicators related to demography, economy, health, politics, environment, crime and more. Easily add your own data for further impact.
Below are examples of resources that provide data or statistics on a variety of topics. This list is not meant to be exhaustive. For more examples, see the Data for Social Sciences Guide.
The online data sources below cover several disciplines including health, the social sciences, science, and more!
This authority for statistics on the social, political, and economic conditions of the U.S. provides a snapshot of America and its people with data from the Census Bureau, Bureau of Labor Statistics, and other Federal agencies and private organizations.