This section explains how to set up a file with new data. After finishing this chapter, you should be able to create an SPSS data file that includes 1) the data and 2) some labeling indicating what the data is about. Also, if you don’t have complete data for a case (for example, if someone didn’t answer a question or chose two answers to a question), you will be able to mark it as missing so it will be excluded from the analysis. To illustrate this process, we will use a shortened version of the questionnaire used by the General Social Survey (GSS), conducted by the National Opinion Research Center (NORC). For this example, our students wanted to see if their opinions on social issues were similar to those of the national sample. More details can be found in the General Social Survey codebook. See General Social Survey, Davis, Smith, and Marsden, 2001.
The students knew they were not a representative sample, even of college students, but this questionnaire is an interesting way to learn how to create a new data file. They decided to use the following questions:
· What is your age?
· Are you male or female?
· What is your religious preference?
· Generally speaking, in politics do you consider yourself conservative, liberal, or middle of the road?
· What kind of marriage do you think is the more satisfying way of life: one where the husband provides for the family and the wife takes care of the house and children, or one where both the husband and wife have jobs and both take care of the house and children?
· Do you think it should be possible for a pregnant woman to obtain a legal abortion:
If there is a strong chance of a serious defect in the baby? [ABDEFECT]
If she is married and does not want any more children? [ABNOMORE]
If the woman's own health is seriously endangered by pregnancy? [ABHLTH]
If the family has a very low income and cannot afford any more children? [ABPOOR]
If she became pregnant as a result of rape? [ABRAPE]
If she is not married and does not want to marry the man? [ABSINGLE]
If the woman wants it for any reason? [ABANY]
Basic Steps in Creating a Data File
There are a few things that always need to be done to create a data file. It is best to start your data file with some careful planning.
1. First, we will want to assign each respondent an identification number, not so individuals can be identified, but so we can keep track of each case when we go back to check the accuracy of the data entry. For each question (variable), we need a variable name that is simple but expresses something about the variable. SPSS limits variable names to eight characters or fewer, starting with a letter. Variable names can contain numbers or letters but not spaces, and only a few special characters are permitted, so don’t use any odd symbols. AGE and SEX would be easy variable names for the first two questions. For the questions on abortion, we decided to use the first three characters of the variable names used by the General Social Survey (in brackets after each question). We used MG for the preferred type of marriage and called political orientation C-L. Each variable name can be given an extended variable label that gives more detail; variable labels can use spaces or special characters. For example, C-L could have a variable label that said Conservative-Liberal.
2. After we have given each variable a name and label, we give each possible response to the question a numeric code, which is often the number corresponding to the order of the answers, and attach a value label to each code. (We could use another system, but this is the easiest because SPSS works best with numeric codes to represent the data.) For example, SEX could use 1 for male and 2 for female; C-L could use 1 for conservative, 2 for liberal, and 3 for middle of the road. These would be given value labels such as Male, Female, Conservative, Liberal, Middle of the Road.
3. Sometimes respondents do not answer a question, give more than one answer, or do something else that would make their answers unusable. In our example, respondent #2 marked both yes and no on the last question, respondent #3 wrote in "none" on question 4, and respondent #13 didn’t answer the marriage question. We can assign these responses missing value codes so they don’t mess up the analysis. Often 9 is used to indicate missing data, or 99 if it is a two-digit value. (Note that this would cause problems in the analysis if 9 or 99 were real codes, for example, if there were 9 possible responses to a question or if age included some ninety-nine-year-olds. So think carefully before you choose numbers for missing values.)
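The three planning steps above can be sketched outside SPSS as a small codebook. The Python below is only an illustration under stated assumptions: the dictionary layout, the helper names (`is_plausible_name`, `decode`, `missing_code_conflicts`), and the variable label given to SEX are hypothetical and are not part of SPSS. The name check covers only the rules stated in step 1 (length, first character, no spaces) and does not test special characters.

```python
# Hypothetical codebook for two variables from the questionnaire.
# The dictionary layout is an illustration, not an SPSS file format.
CODEBOOK = {
    "SEX": {
        "label": "Sex of respondent",  # assumed label, not from the text
        "values": {1: "Male", 2: "Female"},
        "missing": {9},
    },
    "C-L": {
        "label": "Conservative-Liberal",
        "values": {1: "Conservative", 2: "Liberal", 3: "Middle of the Road"},
        "missing": {9},
    },
}


def is_plausible_name(name):
    """Check only the rules stated in step 1: at most eight characters,
    starts with a letter, no spaces. Special characters are not checked."""
    return 0 < len(name) <= 8 and name[0].isalpha() and " " not in name


def decode(variable, code):
    """Return the value label for a code, or None if the code means missing."""
    entry = CODEBOOK[variable]
    if code in entry["missing"]:
        return None
    return entry["values"].get(code, f"Unlabeled code {code}")


def missing_code_conflicts(variable):
    """List missing codes that collide with real response codes (step 3)."""
    entry = CODEBOOK[variable]
    return sorted(entry["missing"] & set(entry["values"]))


if __name__ == "__main__":
    for name in CODEBOOK:
        print(name, is_plausible_name(name), missing_code_conflicts(name))
    print(decode("SEX", 2), decode("C-L", 9))
```

Writing the plan down in this form makes the warning in step 3 checkable: if a missing code such as 9 were also a real response code, `missing_code_conflicts` would report it before any data were entered.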
It is a good idea to plan all this carefully. It is often useful to put the data in a matrix like Table 2.1 before entering it into the SPSS Data Editor.
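To show what a cases-by-variables matrix looks like to a program, here is a minimal Python sketch that holds the first three rows of Table 2.1 and counts responses while leaving out the missing code. The function name `frequencies` and the single constant `MISSING` are assumptions made for this illustration; in SPSS itself, missing values are declared per variable in the Data Editor rather than handled this way.

```python
from collections import Counter

# Column order follows the header of Table 2.1.
COLUMNS = ["id", "age", "sex", "rel", "c-l", "mg",
           "abd", "abn", "abh", "abp", "abr", "abs", "aba"]

# First three respondents from Table 2.1, one row per case.
ROWS = [
    [1, 20, 1, 4, 2, 2, 2, 2, 1, 3, 1, 2, 2],
    [2, 24, 2, 5, 2, 2, 1, 1, 1, 1, 1, 1, 9],
    [3, 21, 2, 2, 9, 2, 2, 2, 2, 2, 2, 2, 2],
]

# Missing-value code for the one-digit variables (not suitable for age).
MISSING = 9

# Turn each row into a {variable name: code} dictionary.
cases = [dict(zip(COLUMNS, row)) for row in ROWS]


def frequencies(cases, variable, missing=MISSING):
    """Count valid codes for one variable and report how many were missing."""
    valid = [case[variable] for case in cases if case[variable] != missing]
    return Counter(valid), len(cases) - len(valid)


if __name__ == "__main__":
    for variable in ("sex", "c-l", "aba"):
        counts, n_missing = frequencies(cases, variable)
        print(variable, dict(counts), "missing:", n_missing)
```

Respondent #2's unusable answer to the last question and respondent #3's write-in on the political question are counted as missing rather than as a response category, which is the behavior the missing value codes are meant to produce.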
Table 2.1. Sample Data Set: Questionnaire Responses
|
id
|
age
|
sex
|
rel
|
c-l
|
mg
|
abd
|
abn
|
abh
|
abp
|
abr
|
abs
|
aba
|
|
01
|
20
|
1
|
4
|
2
|
2
|
2
|
2
|
1
|
3
|
1
|
2
|
2
|
|
02
|
24
|
2
|
5
|
2
|
2
|
1
|
1
|
1
|
1
|
1
|
1
|
9
|
|
03
|
21
|
2
|
2
|
9
|
2
|
2
|
2
|
2
|
2
|
2
|
2
|
2
|
|
04
|
24
|
2
|
5
|
3
|
2
|
1
|
1
|
1
|
1
|
1
|
1
|
1
|
|
05
|
26
|
2
|
4
|
2
|
2
|
1
|
1
|
1
|
1
|
1
|
1
|
1
|
|
06
|
28
|
2
|
2
|
2
|
2
|
2
|
2
|
1
|
2
|
1
|
2
|
2
|
|
07
|
23
|
1
|
1
|
2
|
2
|
1
|
2
|
1
|
1
|
1
|
2
|
2
|
|
08
|
22
|
2
|
4
|
3
|
1
|
1
|
1
|
1
|
1
|
1
|
1
|
1
|
|
09
|
22
|
1
|
5
|
2
|
2
|
1
|
1
|
1
|
1
|
1
|
1
|
1
|
|
10
|
22
|
2
|
4
|
4
|
2
|
1
|
1
|
1
|
1
|
1
|
1
|
1
|
|
11
|
23
|
1
|
2
|
2
|
1
|
2
|
2
|
1
|
2
|
1
|
2
|
3
|
|
12
|
24
|
2
|
2
|
3
|
2
|
1
|
1
|
1
|
1
|
1
|
1
|
2
|
|
13
|
51
|
2
|
1
|
2
|
9
|
1
|
1
|