Design, measurement, and analysis
Introduction
Many people equate quantitative research with statistical analysis.
In fact, statistics is only a subset of data analysis, and data analysis
is only one of the three components of quantitative research. The three
components are:
- Research design
- Measurement
- Data analysis
Pedhazur and Schmelkin (1991) wrote a clear introduction to the
distinctions among, and the integration of, these three components of
quantitative research. Creswell (2005) also presented a concise history
of quantitative research methodology by partitioning quantitative
methods into these three domains.
Research design: Create the form
There are many forms of research design. One of the seminal works on
research design is Shadish, Cook, and Campbell (2002). If you want to
employ experimentation, design of experiments (DOE) will be your focal
point. DOE optimizes the research setting to minimize overall background
noise, which is also called experimental error. Usually a schematic form
is used to specify the factors and their levels.
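To make the idea of a schematic form concrete, here is a minimal sketch
that enumerates the runs of a hypothetical 2 x 3 full factorial design.
The factor names and levels ("method" and "dosage") are invented for
illustration only and are not taken from any study discussed here.

```python
# Minimal sketch: enumerating the runs of a hypothetical 2 x 3 full
# factorial design. The factors "method" and "dosage" are invented
# examples for illustration only.
from itertools import product

factors = {
    "method": ["lecture", "online"],       # 2 levels
    "dosage": ["low", "medium", "high"],   # 3 levels
}

# Each experimental run is one combination of factor levels.
runs = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for i, run in enumerate(runs, start=1):
    print(i, run)   # 2 x 3 = 6 runs in total
```

Laying out the runs in advance makes it obvious how many observations
are needed and which comparisons the design can support.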
DOE is a crucial step because it affects every subsequent step of the
research. "If your result needs a statistician, then you should design
a better experiment." This saying has been attributed to the physicist
Ernest Rutherford, but no credible source has ever been identified.
Nonetheless, the idea is important: design precedes statistics.
If you have a good research design, you do not need complicated
statistics to interpret the result; you need sophisticated statistics
when there is a lot of noise in the experiment. Likewise, Box, Hunter,
and Hunter (1978) pointed out:
Frequently conclusions are easily drawn from a well-designed
experiment, even when rather elementary methods of analysis are
employed. Conversely, even the most sophisticated statistical analysis
cannot salvage a badly designed experiment. (p. vii)
Simply put, sound research design precedes data
collection. The father of exploratory data analysis, John Tukey (1986a),
wrote, "the combination of some data and an aching desire for an answer
does not ensure that a reasonable answer can be extracted from a given
body of data" (p. 72). In a similar vein, Kerlinger (1960) noted the
naive misconception that gathering data constitutes research. Kerlinger
(1986) criticized researchers who blindly apply standard data
collection and analytic strategies without regard for larger research
design issues; for example, many researchers use analysis of variance
even when the independent variables are intervally scaled or the data
were collected in non-experimental settings.
Well-designed research should meet the following three criteria:
- Completeness: Poor planning can lead to incomplete designs
that leave important questions unanswered. Researchers should start
with clear research questions, explicit hypotheses, operationalized
variables (or open concepts), and a plan for how the data will be collected to address those questions.
- Efficiency: Poor planning can lead to redundant data collection,
whereas an efficient design provides the needed information at a
fraction of the cost of a poor design.
- Insight: A well-designed experiment allows
researchers to reveal patterns in the data that would be difficult
to spot in a simple table of hastily collected numbers.
Measurement: Fill with substance
Measurement
is the process of assigning numbers to the attributes and properties of
something. In conjunction with the design process, measurement can be
viewed as filling the form with substance.
A well-planned experimental design is a necessary,
but not a sufficient, condition for good research. Blalock
(1974) warned researchers that measurement error could cripple
statistical analysis:
The more errors that creep into the data collection stage, the more
complex our analyses must be in order to make allowances for these
errors. In many instances we may discover that there will be too many
unknowns and that untested a priori assumptions must be made before one
can proceed with the analysis...All this implies that our knowledge of
methodological principles and the availability of appropriate
mathematical and statistical tools and models will never, in
themselves, be sufficient. To the degree that our methodological
sophistication greatly exceeds our ability to collect adequate data,
disillusionment is likely to result. (p.2)
(Blalock might have been too optimistic about our knowledge of
mathematical and statistical tools. I have found that in many research
projects, design, measurement, and analysis are all problematic.)
Likewise, Bond and Fox (2015) asserted that measurement is
fundamental. They said that although psychologists and others in the
human sciences have been effective at applying sophisticated statistical
procedures to their data, no statistical procedure can compensate for
bad measurement. They wrote:
Measurement systems are ignored when we routinely express
the results of our research interventions in terms of either
probability levels of p < 0.01 or p < 0.05, or—better yet—as
effect sizes. Probability levels indicate only how un/likely it is that
A is more than B or that C is different from B, and effect size is
meant to tell us by how much the two samples under scrutiny differ.
Instead of focusing on constructing measures of the human condition,
psychologists and others in the human sciences have focused on applying
sophisticated statistical procedures to their data. Although
statistical analysis is a necessary and important part of the
scientific process, and the authors in no way would ever wish to
replace the role that statistics play in examining relations between
variables, the argument throughout this book is that quantitative
researchers in the human sciences are focused too narrowly on
statistical analysis and not concerned nearly enough about the nature
of the data on which they use these statistics (p.1).
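As a purely illustrative companion to the quotation above, the sketch
below contrasts a probability level with an effect size on simulated
scores. The group means, sample sizes, and the use of SciPy are
assumptions made for this example, not anything prescribed by Bond and
Fox.

```python
# Illustrative sketch only: contrasting a probability level (p value)
# with an effect size (Cohen's d) on simulated data. All numbers are
# invented for the example.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
group_a = rng.normal(loc=100, scale=15, size=200)  # hypothetical scores
group_b = rng.normal(loc=103, scale=15, size=200)  # hypothetical scores

# Probability level: how un/likely the observed difference is under the
# null hypothesis of no difference.
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# Effect size: by how much the two samples differ, in pooled SD units.
pooled_sd = np.sqrt((group_a.var(ddof=1) + group_b.var(ddof=1)) / 2)
cohens_d = (group_b.mean() - group_a.mean()) / pooled_sd

print(f"p = {p_value:.4f}, Cohen's d = {cohens_d:.2f}")
```

Neither number says anything about whether the scores themselves are
sound measures of the construct, which is precisely the point of the
quotation.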
Some researchers went even further, asserting that measurement is the
core of research because revolutions in science have often been
preceded by revolutions in measurement (Cukier, 2010). One obvious
example is that the advance of data mining and big data analytics has
been driven by the rapid expansion of data warehousing and our
capability of collecting "big data".
Data analysis: Manipulate the substance within the form
The objective of data analysis is to
make sense of the data under the given design. Metaphorically, data
analysis is the manipulation of the substance within the form. The
objective of analysis is not to make things more complicated; rather,
it is data reduction and clarification (Pedhazur, 1982).
It is important to point out that data analysis is by no means
equivalent to statistical analysis (Tukey, 1986b). Statistical analysis
is essentially tied to probability.
Data analysis invokes probability when it is needed but avoids it when
it is inappropriate. For example, exploratory data analysis focuses on
data patterns rather than probabilistic inference when the data
structure is unclear and a particular model should not be assumed.
Resampling, which is based upon empirical probability rather than
traditional probability theory, can be employed when the sample size is
small, when the data do not meet parametric assumptions, or both.
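As an illustration of the resampling idea, here is a minimal bootstrap
sketch. The data values are made up, and the 10,000 resamples and 95%
interval are arbitrary choices for the example.

```python
# Minimal sketch of a nonparametric bootstrap for a small sample whose
# distribution is unknown. The data values are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
sample = np.array([4.1, 5.6, 3.8, 7.2, 4.9, 6.3, 5.1, 4.4])  # small sample

# Draw many resamples (with replacement) from the observed data and
# record the statistic of interest for each one.
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(10_000)
])

# An interval based on empirical (resampled) probability, with no
# parametric distributional assumption.
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(f"Bootstrap 95% interval for the mean: [{lower:.2f}, {upper:.2f}]")
```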
Positive feedback loop
Please note that design, measurement, and analysis do not form a
linear process. Rather, the process is circular (SPSS Inc., 1997), a
positive feedback loop:
Researchers start with some prior knowledge, which can
be based upon past research or conventional wisdom (a polite term for
"common sense"), to formulate research questions and design the
experiment. At the stage of interpretation, researchers should rely on
not only the result of the analysis but also prior knowledge.
Afterwards, new knowledge is generated and can be used to formulate new
research questions, new experimental designs, and so on.
Causal inference
One of the objectives of conducting experiments is to make
causal inferences. At least three criteria need to be fulfilled to
validate a causal inference (Hoyle, 1995):
- Directionality: The independent variable affects the dependent variable.
- Isolation: Extraneous noise and measurement errors must
be isolated from the study so that the observed relationship cannot be
explained by something other than the proposed theory.
- Association: The independent variable and the dependent variable are mathematically correlated.
To establish the direction of the relationship, the researcher can
apply logic (e.g., physical height cannot cause test performance),
theory (e.g., collaboration affects group performance), and most
powerfully, research design (e.g., competing explanations are ruled out
by the experiment).
experiment). Some researchers hastily assign an event as the dependent
variable and another one as the independent variable. Later the data
might seem to "confirm" this causal relationship. However, the actual
relationship may be reversed or bi-directional. For instance, when the
economy is improving, the crime rate drops significantly. One may
assert that because more people are employed and earn higher salaries,
they do not need to do anything illegal, such as extorting money.
However, it is also possible that because a lower crime rate makes
society safer, foreign investment pours into the market, thus boosting
jobs and the economy. Another famous example is: Does marriage
make people happier, or do happy people tend to get married (Stutzer
& Frey, 2006)?
To meet the criterion of isolation, careful measurement should be
implemented to establish validity and reliability, and to reduce
measurement errors. In addition, extraneous variance, also known as
threats to the validity of the research, must be controlled through the
research design.
Lastly, statistical methods are used to calculate the mathematical
association among variables. However, even with a strong mathematical
association, the causal inference may not make sense at all if
directionality and isolation are not established.
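A small, purely hypothetical sketch of the association criterion:
correlation quantifies the strength of a relationship but is symmetric,
so it cannot by itself tell us which variable drives the other. The
employment and crime figures below are invented for illustration.

```python
# Illustrative sketch: correlation measures association but is symmetric,
# so it says nothing about directionality. All numbers are invented.
import numpy as np

employment_rate = np.array([60, 62, 65, 67, 70, 73])  # hypothetical series
crime_rate      = np.array([52, 50, 47, 45, 41, 38])  # hypothetical series

r_xy = np.corrcoef(employment_rate, crime_rate)[0, 1]
r_yx = np.corrcoef(crime_rate, employment_rate)[0, 1]
print(f"r(employment, crime) = {r_xy:.2f}")
print(f"r(crime, employment) = {r_yx:.2f}")  # identical: correlation is symmetric
```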
In summary, statistical analysis is only a small part of the entire
research process. It is only a subset of data analysis, and data
analysis is one of the three components of quantitative research. Hoyle
(1995) explicitly warned that researchers should not regard statistical
procedures as the only way to establish a cause-and-effect
interpretation.
References
- Blalock, H. M. (Ed.). (1974). Measurement in the social sciences: Theories and strategies. Chicago, IL: Aldine Publishing Company.
- Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences. New York, NY: Routledge.
- Box, G. E. P., Hunter, W. G., & Hunter, J. S. (1978). Statistics for experimenters. New York: Wiley.
- Creswell, J. (2005). Educational research: Planning, conducting, and evaluating quantitative and qualitative research. Upper Saddle River, NJ: Pearson.
- Cukier, K. (2010). Data, data everywhere. The Economist, 25. Retrieved from http://www.economist.com/node/15557443
- Hoyle, R. H. (1995). The structural equation modeling approach: Basic concepts and fundamental issues. In R. H. Hoyle (Ed.),
Structural equation modeling: Concepts, issues, and applications (pp. 1-15). Thousand Oaks, CA: Sage Publications.
- Kerlinger, F. N. (1960). The mythology of educational research: The methods approach. School and Society, 88, 149-151.
- Kerlinger, F. N. (1986). Foundations of behavioral research (3rd ed.). Fort Worth, TX: Holt, Rinehart and Winston.
- Pedhazur, E. J. (1982). Multiple regression in behavioral research: Explanation and prediction (2nd ed.). Fort Worth, TX: Harcourt Brace College Publishers.
- Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design, and analysis: An integrated approach. Hillsdale, NJ: Lawrence Erlbaum Associates.
- Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs
for generalized causal inference. Boston, MA: Houghton Mifflin.
- SPSS, Inc. (1997). Trial Run 1.0 user guide. Chicago, IL: Author.
- Stutzer, A., & Frey, B. (2006). Does marriage make people happy, or do happy people get married? Journal of Socio-Economics, 35(2), 326-347. https://doi.org/10.1016/j.socec.2005.11.043
- Tukey, J. (1986a). Sunset Salvo. American Statistician, 40(1), 72-76.
- Tukey, J. (1986b). The collected works of John W. Tukey (Volume IV): Philosophy and principles of data analysis: 1965-1986.
Monterey, CA: Wadsworth & Brooks/Cole.
Last updated: 2021