Week 8: Advanced Application of Practice-Based Research in Health
Secondary Data Sources I
In this information age where data are readily accessible and there is both a great demand for accelerated research projects and strict limitations on research funding, using existing data makes sense. Data used in this way are called secondary data; they come in many forms and contain information on just about anything—depending on who collected the information in the first place, and why.
As a health professional, you have access to a wide range of secondary data sources, including government agencies, such as the Census Bureau or the Centers for Disease Control and Prevention (CDC), and private sources, including local health service providers. Global and international data are available from familiar sources, such as the World Health Organization and the United Nations. In addition, nearly every nation maintains statistics on social, economic, and environmental indicators, which contain a wealth of health information.
As a member of the Walden community, you have access to the Inter-University Consortium for Political and Social Research (ICPSR), the world’s largest archival database of secondary data. You also have access to the Social Change Impact Report (SCIR) data sets—a Walden-owned database. There are also a number of sources and tutorials available to you through Walden’s Center for Research Quality.
No matter the topic—be it vaccination rates, women’s access to mammography, alcohol use on college campuses, or childhood obesity—you can probably find an existing secondary data source with the information you need.
· Locate appropriate secondary data sources
· Evaluate secondary data sets related to Doctoral Study topics and research questions
Kiecolt, K. J., & Nathan, L. E. (1985). Secondary analysis of survey data. Beverly Hills, CA: Sage Publications.
· “Locating Appropriate Data”
Putting existing data to work to improve quality care. (2004). Quality Letter for Healthcare Leaders, 16(3), 2–9, 1.
Evans, E., Grella, C. E., Murphy, D. A., & Hser, Y. (2010). Using administrative data for longitudinal substance abuse research. The Journal of Behavioral Health Services & Research, 37(2), 252–271.
Hofferth, S. L. (2005). Secondary data analysis in family research. Journal of Marriage and Family, 67(4), 891–907.
Pearce, A., Jenkins, R., Kirk, C., & Law, C. (2008). An evaluation of UK secondary data sources for the study of childhood obesity, physical activity and diet. Child: Care, Health and Development, 34(6), 701–709.
Smith, A. K., Ayanian, J. Z., Covinsky, K. E., Landon, B. E., McCarthy, E. P., Wee, C. C., & Steinman, M. A. (2011). Conducting high-value secondary dataset analysis: An introductory guide and resources. Journal of General Internal Medicine, 26(8), 920–929.
Yiannakoulias, N. (2011). Understanding identifiability in secondary health data. Canadian Journal of Public Health, 102(4), 291–293.
FedStats. (2015). Retrieved from http://fedstats.sites.usa.gov/
Inter-University Consortium for Political and Social Research. (2017). Retrieved from http://www.icpsr.umich.edu/icpsrweb/landing.jsp
National Institutes of Health. (n.d.). Retrieved from http://www.nih.gov/
Partners in Information Access for the Public Health Workforce. (2014). Health data tools and statistics. Retrieved from https://phpartners.org/health_stats.html
Walden University. (n.d.b). Center for Research Quality: Research resources. Retrieved from http://academicguides.waldenu.edu/researchcenter/orqm/researchresources
Walden University. (n.d.j). Office of Student Research Administration: DHA Doctoral Study. Retrieved from http://academicguides.waldenu.edu/researchcenter/osra/DHA
Note: At this website, locate and review the Secondary Data Resources for DHA Students, which is titled Secondary Data Source.
Laureate Education (Producer). (2013). Introduction to secondary data [Video file]. Baltimore, MD: Author. Note: The approximate length of this media piece is 10 minutes.
The Video transcript
Introduction to Secondary Data
FEMALE SPEAKER: In this video program, Talmadge Holmes and Tammy Root
talk about the issues you should consider when using secondary data in
research. Reflect upon the perspectives they present about using secondary data.
Then consider the purpose of secondary data and when it may be appropriate for
you to use secondary data sets in your own research.
TALMADGE M. HOLMES, PHD, MPH: One of the barriers in doing research,
especially if you are doing primary data collection, is sometimes you don’t know
where the data are. In my specific case, I had to physically go to a facility and
look through paper records to find the data that I needed.
And of course with any data gathering process– and this is a form of secondary
data analysis in that I wasn’t actually interviewing people or doing anything like
that. So I had to accept what the data provided me. So therefore the same issues
exist. The data might not be complete. The data might not be recorded the same
way from time to time. So those are all common barriers when you’re doing
primary, slash, secondary data collection.
But today more than ever, there is a tendency and actually a preference to do a
secondary data analysis. So here’s a situation where we have data at our
disposal. And now we need to come to the research question, develop the
research question or questions given the data.
The process is, first, what is the data. Characterizing the data. What are the
different data elements that are there? How are they measured? We need to be
very familiar with that data to start out with, otherwise we can’t really ask a
relevant question. And more importantly, we won’t know how to do the analysis.
So we’ve got to start out being very familiar with the data if that’s possible.
A lot of data that are collected and are in the public access have things called
data dictionaries out there. And those basically represent what I’m talking about
in terms of being familiar with the data. Usually there are the variables that are
listed, how they’re measured, and any other characteristics of the data that the
sponsoring agency has collected for use. And that’s what I would say would be
the first step is familiarity with the data.
I have had many Ph.D. students now at Walden. And several of them have done
secondary data analysis using existing public data, if you will. One of my
students used the NHANES data, which is data collected by the National Center
for Health Statistics, which is a part of the Centers for Disease Control and Prevention.
And NHANES has been an ongoing study for many years now. And each year
they may add a few questions on a specific topic. But generally speaking, they’ve
asked the same questions over many years of cohorts of individuals. And there’s
a lot of data there. So this allows a lot of opportunity for doing this, taking data
and asking specific research questions based on those data.
At one point in time I was very much against Ph.D. students using secondary
data. And that partly is based on my own experience. Because I did, essentially,
primary data collection for both my Master’s and Ph.D.
And I had an experience with a colleague who did not have that experience. She
actually was handed a data file and developed her question from the data file,
based on the data file. And then went to her first job, and was asked to do a
study not based on secondary data but to gather the data. And unfortunately, she
had no experience with that. So it can be a problem.
But a lot of our students at Walden have had some experience with perhaps not
doing a study, but they understand about differences in quality, differences in
completeness, those kinds of things.
So one of my processes or policies is that I will actually try to get a student to tell
me what experience they’ve had with that. And then I feel more comfortable
saying to them, especially if they’re proposing to use secondary data, I feel a lot
more comfortable with them doing that as long as they understand these issues
and can talk about them in chapter five and other places in the dissertation where
we need to talk about that as potential limitations to the study.
So I have not changed my mind. I would still rather there be primary data
collection. But given the world as it is today and a lot of data we have out there,
information overload, there’s a lot of data out there that we can take advantage
of. And students often have access to those data.
TAMMY ROOT: Well, my research first started with my dissertation. And my
dissertation was focused on expanding the latent class model for small samples.
And when I was developing this methodology, I thought about, well, I need a
small sample. And something that had come to mind was eating disorders and
substance use. So my dissertation was heavily methodologically focused.
Throughout the research process I’ve experienced lots of barriers, as I’m sure
most colleagues and students will also attest to. One of the barriers I
experienced was data acquisition, finding the data that I needed. I’ve run into
several issues, because I’ve used secondary data analysis throughout my entire
publishing career, with having the data in a format that is friendly enough to use
in a particular data analysis package.
When using secondary data analysis, there are several things to keep in mind.
The first one is data veracity. You want to make sure that the data that you’re
using are valid, and reliable, and were collected in a way that is appropriate.
And one thing to also keep in mind with regard to this is who sponsored the data
collection. For example, if you are interested in looking at smoking behavior, and
Philip Morris was a sponsor for a particular data set, then you have to have
some caution in terms of what those findings were. So it’s important to know who
collected the data, how they were collected, and who sponsored the data collection.
You also have to keep in mind that when you’re using secondary data, you’re
limited to the data that was collected. And you’re also limited to the scales of
measurement in which that data were collected. So if you did not want to do
some type of discrete analysis, but yet your primary outcome is discrete, that
could cause a problem. You can also lose information if you have a discrete
outcome versus a continuous outcome. So that’s something to keep in mind.
Also, you often, unless it’s a publicly available data set, have to have a data
sharing agreement in place with the owner of the data. So if you’re not using a
publicly available data set, you have to consider this and be sure that you
understand what the data sharing agreement entails.
And this can also be related to publishing. If you are using a data set from
someone else where a data sharing agreement is in place, you will sometimes
have to have publishing guidelines set forth initially.
There’s also cost, potentially. If you don’t have access to a publicly available data
set, data can cost quite a bit of money to use. And sometimes the data that you
need for your particular research isn’t available publicly.
For my dissertation I was interested in eating disorders and substance use. I
could not find a publicly available data set. So, what I did was I was part of a
listserv in the eating disorder community. And I sent out an email to everyone on
the listserv explaining what my dissertation was going to be examining, and did
anyone have archival secondary data that they would be interested in sharing
with me. And that is how I found my data.
With that though, it was not a publicly available data set. So, I had to make sure
that the data sharing agreement met with both my and my adviser’s approval.
Topics, Research Question, and Data Set
In a previous week, you developed research questions for your Doctoral Study Prospectus. The next step in developing your Prospectus is selecting a suitable data set. To determine an appropriate data set, there are a number of questions that you will need to answer. For example, is your research question of global scale? Or will you focus on a local community? Are you focusing on a single hospital, or a hospital system? Are you looking at the Medicare population within a state, or comparing states? Do you have command of the language used in the data set? Are the data current enough or old enough for your needs? In this week’s Learning Resources, authors Evans, Grella, Murphy, and Hser (2010); Hofferth (2005); and Smith et al. (2011) offer additional considerations, while Yiannakoulias (2011) reminds us of some of the ethical sensitivities that arise with secondary data sources.
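As a rough illustration of this screening step, the short Python sketch below checks a data dictionary against the variables and years a research question needs. It is only a sketch under stated assumptions: the `Variable` record, the `screen_data_set` function, and every variable name and year in the example are hypothetical and do not describe any real data set or tool.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    """One entry of a hypothetical data dictionary."""
    name: str
    scale: str        # how the variable is measured, e.g. "nominal" or "continuous"
    first_year: int   # first collection year in which the variable appears
    last_year: int    # last collection year in which the variable appears


def screen_data_set(dictionary, needed, start_year, end_year):
    """Return (missing, out_of_range) variable names for a research question.

    missing      -- needed variables the data dictionary does not list
    out_of_range -- needed variables that do not cover start_year..end_year
    """
    by_name = {v.name: v for v in dictionary}
    missing = [n for n in needed if n not in by_name]
    out_of_range = [
        n for n in needed
        if n in by_name
        and (by_name[n].first_year > start_year or by_name[n].last_year < end_year)
    ]
    return missing, out_of_range


# Hypothetical data dictionary; not taken from any real data set.
example_dictionary = [
    Variable("age", "continuous", 2005, 2016),
    Variable("bmi", "continuous", 2005, 2016),
    Variable("smoking_status", "nominal", 2009, 2016),
]

missing, out_of_range = screen_data_set(
    example_dictionary,
    needed=["age", "bmi", "smoking_status", "household_income"],
    start_year=2007,
    end_year=2015,
)
print("Not in the data set:", missing)
print("Incomplete year coverage:", out_of_range)
```

A variable that turns up in either list is a signal to look for another data set or to narrow the research question. The `scale` field is not used in the screening itself; it is recorded because the scale of measurement limits which analyses are possible later.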
1. For this Discussion, you will research secondary data sources, select a secondary quantitative data set from these data sources to answer your research question, and explain the rationale behind your selection.
2. Research secondary data sources, and then select a secondary quantitative data set that could be used to help answer the research questions you developed last week.
By Day 4
Submit a 2- to 3-paragraph post that includes the following:
· The topic you have chosen for your Doctoral Study Prospectus
· The research questions you have chosen
· The secondary data set you would consider to answer your research questions and the rationale behind selecting it for this topic
· Support your Discussion with citations and specific references to all resources used in its preparation. You are asked to provide a reference list for all resources, including those in the resources for this course.
Read a selection of your colleagues’ postings.
By Day 6
Respond to at least two of your colleagues with one or both of the following:
· Describe potential limitations with your colleague’s data set (i.e., what it cannot answer).
· Suggest possible alternative data sets.
Return to this Discussion in a few days to read the responses to your initial posting. Note any insights you have gained as a result of the comments your colleagues made.
Revised Doctoral Study Prospectus
Now that you have received additional feedback on your revised submission of your Doctoral Study Prospectus from your Instructor, it is time to refine your document to ensure it clearly communicates a general sense of the direction of your research. To complete this Assignment, ensure you have addressed and incorporated any feedback from your Instructor on your initial Doctoral Study Prospectus.
The Assignment (3 pages):
· Incorporate any Instructor feedback you have received and additional information learned from this week’s discussion on secondary data and, following the guidance found in the Doctoral Study Prospectus document, create a revised version of your Prospectus.
By Day 7
Submit your Assignment.
Submission and Grading Information
To submit your completed Assignment for review and grading, do the following:
Save your Assignment using the naming convention “WK8Assgn+last name+first initial.(extension)”.
Click the Week 8 Assignment Rubric to review the Grading Criteria for the Assignment.
Click the Week 8 Assignment link. You will also be able to “View Rubric” for grading criteria from this area.
Next, from the Attach File area, click on the Browse My Computer button. Find the document you saved as “WK8Assgn+last name+first initial.(extension)” and click Open.
If applicable: From the Plagiarism Tools area, click the checkbox for I agree to submit my paper(s) to the Global Reference Database.
Click on the Submit button to complete your submission.