The term variable
refers to each set of response categories and the associated codes. A
machine-readable data set (whether stored on computer tape or computer disk)
consists of codes for each individual in the sample corresponding to the response
categories for the variables included in the data set. Suppose, for example, the
earlier question on whether civil rights demonstrations have helped Negroes is
the tenth variable in a survey. Suppose, also, that the first respondent in the
sample had said that demonstrations “helped a little.” The data set would then
include a “2” in the tenth location for the first individual. To know precisely
what is included in a data set and wherein the data set is located, a codebook
is prepared and used as a map of the data set. Here, it is sufficient to note that the rudimentary
materials necessary to carry out the sort of analysis dealt with in this book
are a data set, a codebook for the data set, and documentation that describes
the sample; we will not be concerned with data collection problems for
preparing a machine-readable data set, except in passing. These topics require
full treatment in their own right, and we will not have time for them here.
It is customary to classify variables
according to their level of measurement: nominal, ordinal, interval, or ratio.
Nominal variables consist simply of mutually exclusive and collectively
exhaustive categories. Religious affiliation is an example of such a variable.
For example, we might have the following response categories and codes:
Student 1
Local
authorities 2
Teachers 3
Other 4
None 5
No
answer 9
Note that no order is implied
among the responses-no response is “better” or “higher” than any other. The
variable provides a way of classifying people into religious groups. Note,
further, that every individual in the survey has a code, even those
Comments
Post a Comment