The purpose of these extended notes is to explain the rationale for this series of exercises on research methods, to describe the data set and statistical package used, and to suggest other ways that you might use these exercises.
These exercises were built around the idea of a research design. All research starts with one or more research questions. A research design is your plan of action. It lays out how you plan to go about answering your questions. The research design includes how you plan to select the cases for analysis (sampling), how you will measure concepts, how you plan to collect your data, and how you will analyze the data. The first exercise focuses on constructing good research questions and introduces the idea of a research design. Exercises two through five focus on the components of a research design and exercises six through thirteen deal with data analysis.
Part I – Choosing the Data Set
I wanted to use a data set that met three requirements:
- it is publicly accessible,
- it is interesting to students, and
- the data are current.
The data set that I chose is the Monitoring the Future Survey of high school seniors in the United States that has been conducted yearly since 1975. There is a website that will give you a lot of information about this study. Here’s a brief description from the website’s home page.
“Monitoring the Future is an ongoing study of the behaviors, attitudes, and values of American secondary school students, college students, and young adults. Each year, a total of approximately 50,000 8th, 10th and 12th grade students are surveyed (12th graders since 1975, and 8th and 10th graders since 1991). In addition, annual follow-up questionnaires are mailed to a sample of each graduating class for a number of years after their initial participation.”
A major focus of these surveys is students’ drug use. But the surveys include a lot more information than just drug use. The website describes the range of questions asked.
“Questions include drug use and views about drugs, delinquency and victimization, changing roles for women, confidence in social institutions, concerns about energy and ecology, and social and ethical attitudes.”
These are only a few of the areas that students are asked about. Other areas include, for example, their educational goals, religion, politics, the military, race, health, and background information including their family.
The data are publicly accessible. The data set and information about this study are archived at the Inter-university Consortium for Political and Social Research (ICPSR) located at the University of Michigan. You will need an account at ICPSR to access the data. Start by going to their website. In the upper-right corner of the home page click on “Log In/Create Account.” Scroll down and click on “Create Account” below “New User.” Fill in the requested information and click on “Submit.” It will create your account and give you access to the ICPSR archive. You can use your account from anywhere you have internet access. If you don’t use your account for six months, your account will go away.
If you are a student, faculty member or staff at a university or college that belongs to the ICPSR, you will have access to all the archive’s data holdings. If you are not, then you will only have access to public-use data. Fortunately, the Monitoring the Future Surveys were funded for public access so you have access to this study regardless of your status.
Once you have created your account, click on “Find Data” in the menu bar at the top of the screen. Then type “Monitoring the Future” in the “Find Data” box. Look through the search results for the following. It will likely be one of the first couple of search outcomes.
Monitoring the Future: A Continuing Study of the Lifestyles and Values of Youth, 1994 (ICPSR 6517)
Bachman, Jerald G.; Johnston, Lloyd D.; O'Malley, Patrick M.
Click on the link in the lower right for the “Monitoring the Future (MTF) Series.” Scroll down a little until you see “Most Recent Studies” and click on the one that says “Monitoring the Future: A Continuing Study of American Youth (12th-Grade Survey), 2015.” That’s the survey that we will be using in these exercises.
This is a very large data set in terms of both variables and number of cases. Slightly fewer than 14,000 high school seniors completed the survey. In order to include the large number of questions that the researchers wanted to ask, they randomly divided the sample into six subsets. Each subset was asked a series of core questions that were common to all six subsets and a series of questions that were unique to that subset. Under “Dataset(s)” you will see seven downloadable data sets: the core data and the data from the six subsets. To simplify this for students, we’re only going to use the core data set in these exercises. This is referred to as the “DS1: Core Data.”
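To make the idea of randomly dividing a sample into six subsets concrete, here is a minimal sketch in Python. It is only an illustration of simple random assignment. The notes do not say how the Monitoring the Future researchers actually carried out the division, and the function name, the seed, and the respondent IDs are all hypothetical.

```python
import random
from collections import Counter


def assign_forms(respondent_ids, n_forms=6, seed=0):
    """Randomly assign each respondent to one of n_forms questionnaire forms.

    Hypothetical helper for illustration only; not the procedure used by
    the Monitoring the Future researchers.
    """
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)  # put respondents in a random order
    # Dealing the shuffled IDs out in turn gives forms of (nearly) equal size.
    return {rid: (position % n_forms) + 1 for position, rid in enumerate(ids)}


# Twelve made-up respondent IDs split across six forms.
assignment = assign_forms(range(1, 13), n_forms=6)
form_sizes = Counter(assignment.values())
print(sorted(form_sizes.items()))  # each of the six forms gets 2 respondents
```

Every respondent would answer the core questions regardless of form, which is why the core data set covers the whole sample.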
Part II – Choosing the Statistical Package
Exercises 6 through 13 introduce students to the principles of data analysis. There are a number of statistical packages that we could use, including SPSS, PSPP, and SDA. We'll be using SDA in these exercises. SDA stands for Survey Documentation and Analysis, an online statistical package written by the Survey Methods Program at UC Berkeley. SDA can be used without cost wherever one has an internet connection. Students can be shown how to use SDA in approximately ten minutes, making it unnecessary to spend valuable class time on how to use the statistical package. There is also an extensive help menu available to users of SDA. However, students will not need the help menu. Everything they need to know is included in these exercises. In order to avoid repeating some of the SDA instructions in each exercise, I created an Appendix which is an Introduction to SDA. This introduction covers opening the data set, navigating the SDA dialog boxes, and carrying out the statistical analysis. You will want to have your students read this appendix for many of these exercises. Information more specific to a particular exercise will be covered in that exercise.
There is also a spreadsheet that lists the methodological and statistical terms introduced in each exercise, so you can see at a glance where each term is discussed.
SDA is free, accessible over the internet, and easy to use. One disadvantage of SDA is that you will not be able to use your own data to create a data set that SDA can read. Doing that requires a site license and there is a somewhat steep learning curve for creating your own SDA data set. However, this is not a problem for these exercises since we’re using an SDA data set that has already been prepared for us.
Part III – Which Statistical Techniques Should We Use?
For most of the exercises, I have limited the statistical techniques used to those that assume only nominal or ordinal measurement. The one exception is exercise 7RM, which includes a discussion of the mean, range, standard deviation, and variance, all of which assume interval or ratio level measurement. For bivariate (two-variable) and multivariate (sets of three or more variables) analysis we will use crosstabulation.
I did this for two reasons. First, these exercises are designed for a research methods course and not a statistics course. I wanted to provide an introduction to data analysis and not a complete statistics course. Second, the Monitoring the Future surveys include only nominal and ordinal variables.
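For readers who want to see what a crosstabulation involves outside of SDA, here is a minimal sketch in Python. It counts the cases in each combination of two variables and converts the counts to percentages within each category of the independent variable, which is a common convention and an assumption on my part rather than something specified in these notes. The function name and the responses are made up for illustration and are not Monitoring the Future data.

```python
from collections import Counter


def crosstab_column_percents(pairs):
    """Crosstabulate (independent, dependent) pairs.

    Returns {independent_value: {dependent_value: percent}}, where the
    percentages sum to 100 within each category of the independent variable.
    Hypothetical helper for illustration only.
    """
    pairs = list(pairs)
    cell_counts = Counter(pairs)                      # cases in each cell
    column_totals = Counter(ind for ind, _ in pairs)  # cases in each category
    table = {}
    for (ind, dep), count in cell_counts.items():
        table.setdefault(ind, {})[dep] = round(100 * count / column_totals[ind], 1)
    return table


# Made-up responses: (sex, answer to a yes/no question).
responses = [
    ("male", "yes"), ("male", "yes"), ("male", "no"), ("male", "no"),
    ("female", "yes"), ("female", "no"), ("female", "no"), ("female", "no"),
]
table = crosstab_column_percents(responses)
print(table["male"])    # {'yes': 50.0, 'no': 50.0}
print(table["female"])  # {'yes': 25.0, 'no': 75.0}
```

Comparing the percentages across categories (50.0 versus 25.0 answering yes in this made-up example) is how a crosstabulation is read. Because it relies only on counting cases, it is suitable for nominal and ordinal variables.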
However, if you want to include a broader range of statistics, such as measures of skewness and kurtosis, correlation, and regression, there are three series of statistical exercises that you could use. One of these uses SPSS, another uses PSPP, and the third uses SDA. The data set used in all three of these series is the General Social Survey, which includes nominal, ordinal, and ratio level variables.
Part IV – Other Notes
Because each exercise was written to stand on its own, there is some duplication from exercise to exercise. If you are using several exercises, you may want to remove some of that duplication.
In most of the exercises I do not discuss how to compute the various statistics. You might want to add such material to these sections.
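As one example of the kind of computational material you could add, here is a minimal sketch in Python of the mean, range, variance, and standard deviation mentioned in Part III. It uses Python's standard `statistics` module. The scores are made up, and the choice of the sample formulas (dividing by n - 1) is my assumption, since the notes do not specify which version the exercises have in mind.

```python
import statistics

# Made-up interval-level scores, for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)            # sum of scores divided by n
score_range = max(scores) - min(scores)   # highest score minus lowest score
variance = statistics.variance(scores)    # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)        # square root of the sample variance

print("mean:", mean)          # mean: 5
print("range:", score_range)  # range: 7
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```

If you treat the scores as a complete population rather than a sample, `statistics.pvariance` and `statistics.pstdev` divide by n instead.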
You have permission to use these exercises and to revise them to fit your needs. Feel free to revise the exercises in any way you want. Just acknowledge the source of the original exercise. Please send me a copy of the revised exercise so I can see how others are using it.
If you would like to contact me, please email me at email@example.com. I’m Professor Emeritus at California State University, Fresno in the Sociology department. I taught research methods, statistics, and critical thinking before retiring and now teach a critical thinking course part time.