# Design, measurement, and analysis

 Chong-ho Yu, Ph.Ds.

### Introduction

Many people equate quantitative research with statistical analysis. In fact, statistics is only a subset of data analysis, and data analysis is only one of three components of quantitative research. The three components are:
• Research design
• Measurement
• Data analysis

Pedhazur and Schmelkin (1991) wrote a clear introduction to the differences among, and the integration of, these three components of quantitative research. Creswell (2005) also presented a concise history of quantitative research methodology by partitioning quantitative methods into these three domains.

### Research design: Create the form

 There are many forms of research designs. One of the seminal works on research designs is Shadish, Cook, and Campbell (2002). If you want to employ experimentation, design of experiment (DOE) will be your focal point. DOE is the optimization of a research setting that can minimize overall background noise, which is also called experimental error. Usually a schematic form is used to specify the factors and the levels of factors.
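The schematic form of factors and levels mentioned above can be sketched in a few lines of Python. This is only a minimal illustration of a full factorial layout, one of many possible designs; the factor names and levels are hypothetical and do not come from any particular study.

```python
from itertools import product

# Hypothetical factors and their levels, chosen only for illustration.
factors = {
    "teaching_method": ["lecture", "online"],
    "class_size": ["small", "large"],
    "feedback": ["none", "weekly", "daily"],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


design = full_factorial(factors)

# Each row is one experimental condition (cell) of the design.
for run_id, cell in enumerate(design, start=1):
    print(run_id, cell)
```

With two, two, and three levels, the design has 2 x 2 x 3 = 12 cells. Writing the cells out before data collection makes it easy to see which conditions must be covered and how many participants each one needs.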

DOE is a crucial step because it affects the subsequent steps of research. "If your result needs a statistician, then you should design a better experiment." This saying has been attributed to the British physicist Ernest Rutherford, but no credible source has been identified. Nonetheless, the idea is important: design precedes statistics. If you have a good research design, you don't need complicated statistics to interpret the result. You need sophisticated statistics when there is a lot of noise in the experiment. Likewise, Box, Hunter, and Hunter (1978) pointed out,

Frequently conclusions are easily drawn from a well-designed experiment, even when rather elementary methods of analysis are employed. Conversely, even the most sophisticated statistical analysis cannot salvage a badly designed experiment. (p. vii)

Simply put, sound research design precedes data collection. John Tukey (1986a), the father of exploratory data analysis, wrote, "the combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data" (p. 72). In a similar vein, Kerlinger (1960) noted a naive misconception that gathering data constitutes research. Kerlinger (1986) criticized researchers who blindly apply standard data collection and analytic strategies without regard for larger research design issues. For example, many researchers used analysis of variance even though the independent variables were interval-scaled or collected in non-experimental settings.

Well-designed research should meet the following three criteria:

• Completeness: Poor planning can lead to incomplete designs that leave important questions unanswered. Researchers should start with clear research questions, explicit hypotheses, and operationalized variables (or open concepts), and should foresee how the data will be collected to address the questions.

• Efficiency: Poor planning can lead to redundant data collection while an efficient design provides the information to the researchers at a fraction of the cost of a poor design.

• Insight: A well-designed experiment allows researchers to reveal the patterns in the data that would be difficult to spot in a simple table of hastily collected data.

### Measurement: Fill with substance

Measurement is a process of assigning numbers to the attributes and properties of something. In conjunction with the design process, measurement can be viewed as filling the form with substance.

A well-planned experimental design is a necessary condition, but not a sufficient condition, for good research. Blalock (1974) warned researchers that measurement error could cripple statistical analysis:

The more errors that creep into the data collection stage, the more complex our analyses must be in order to make allowances for these errors. In many instances we may discover that there will be too many unknowns and that untested a priori assumptions must be made before one can proceed with the analysis... All this implies that our knowledge of methodological principles and the availability of appropriate mathematical and statistical tools and models will never, in themselves, be sufficient. To the degree that our methodological sophistication greatly exceeds our ability to collect adequate data, disillusionment is likely to result. (p. 2)

(Blalock might be too optimistic about our knowledge of mathematical and statistical tools. I have found that in many research projects, design, measurement, and analysis are all problematic.)
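One way to see how measurement error can undermine an analysis is a small simulation. The sketch below assumes the classical test theory view that an observed score is a true score plus independent random error. All numbers (sample size, error variance, the strength of the true relationship) are arbitrary choices made for illustration.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(1)  # fixed seed so the run is reproducible
n = 2000

# Simulated "true" scores on two constructs, correlated at about 0.8.
true_x = [rng.gauss(0, 1) for _ in range(n)]
true_y = [0.8 * x + 0.6 * rng.gauss(0, 1) for x in true_x]

# Observed scores = true score + random measurement error.
# Error variance equals true-score variance, so reliability is about 0.5.
obs_x = [x + rng.gauss(0, 1) for x in true_x]
obs_y = [y + rng.gauss(0, 1) for y in true_y]

r_true = pearson(true_x, true_y)
r_obs = pearson(obs_x, obs_y)
print(f"correlation between true scores:     {r_true:.2f}")
print(f"correlation between observed scores: {r_obs:.2f}")
```

Under these assumptions the observed correlation is roughly half the true one, and no amount of downstream statistical sophistication can tell the researcher how strong the underlying relationship is unless the reliability of the measures is known.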

Likewise, Bond and Fox (2015) asserted that measurement is fundamental. They said that although psychologists and others in human sciences have been effective in applying sophisticated statistical procedures to their data, there is no way to replace bad measurement with good statistical procedures. They wrote,

Measurement systems are ignored when we routinely express the results of our research interventions in terms of either probability levels of p < 0.01 or p < 0.05, or—better yet—as effect sizes. Probability levels indicate only how un/likely it is that A is more than B or that C is different from B, and effect size is meant to tell us by how much the two samples under scrutiny differ. Instead of focusing on constructing measures of the human condition, psychologists and others in the human sciences have focused on applying sophisticated statistical procedures to their data. Although statistical analysis is a necessary and important part of the scientific process, and the authors in no way would ever wish to replace the role that statistics play in examining relations between variables, the argument throughout this book is that quantitative researchers in the human sciences are focused too narrowly on statistical analysis and not concerned nearly enough about the nature of the data on which they use these statistics (p.1).

Some researchers went even further to assert that measurement is the core of research because revolutions in science have often been preceded by measurement revolutions (Cukier, 2010). One obvious example is that the advance of data mining and big data analytics is driven by the rapid expansion of data warehousing and our capability of collecting "big data".

### Data analysis: Manipulate the substance within the form

The objective of data analysis is to make sense of the data under the given design. Data analysis is the manipulation of the substance within the form. The objective of analysis is not to make things more complicated; rather, it is data reduction and clarification (Pedhazur, 1982). It is important to point out that data analysis is by no means equivalent to statistical analysis (Tukey, 1986b). Statistical analysis is essentially tied to probability. Data analysis involves probability when it is needed, but avoids probability when it is improper. For example, exploratory data analysis focuses on the data pattern rather than on probabilistic inferences when the data structure is unclear and a particular model should not be assumed. Resampling, which is based upon empirical probability rather than traditional probability theory, can be employed when the sample size is small, the data structure does not meet parametric assumptions, or both.
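As a rough illustration of resampling, the sketch below computes a percentile bootstrap confidence interval for a mean, which is one common resampling technique among several. The function name `bootstrap_ci`, the data, and the number of resamples are all assumptions made for this example.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(sample)
    # Resample with replacement from the observed data and recompute the statistic.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    lower = estimates[round((alpha / 2) * n_resamples)]
    upper = estimates[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up scores from a small, skewed sample.
scores = [12, 15, 9, 22, 17, 14, 30, 11, 13, 16]

low, high = bootstrap_ci(scores)
print(f"sample mean: {statistics.mean(scores):.1f}")
print(f"95% bootstrap interval: ({low:.1f}, {high:.1f})")
```

The interval is built from the empirical distribution of the resampled means rather than from a theoretical sampling distribution, which is why the approach does not depend on the usual parametric assumptions.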

### Positive feedback loop

Please notice that design, measurement, and analysis do not form a linear process. Rather, they form a circular process (SPSS Inc., 1997), or a positive feedback loop.

Researchers start with some prior knowledge, which can be based upon past research or conventional wisdom (a polite term for "common sense"), to formulate research questions and design the experiment. At the stage of interpretation, researchers should rely not only on the result of the analysis but also on prior knowledge. Afterwards, new knowledge is generated and can be used for new research questions, new experimental designs, and so on.

### Causal inference

One of the objectives of conducting experiments is to make causal inferences. At least three criteria need to be fulfilled to validate a causal inference (Hoyle, 1995):

• Directionality: The independent variable affects the dependent variable.
• Isolation: Extraneous noise and measurement errors must be isolated from the study so that the observed relationship cannot be explained by something other than the proposed theory.
• Association: The independent variable and the dependent variable are mathematically correlated.

To establish the direction of variables, the researcher can apply logic (e.g., physical height cannot cause test performance), theory (e.g., collaboration affects group performance), and, most powerfully, research design (e.g., other competing explanations are ruled out from the experiment). Some researchers hastily assign one event as the dependent variable and another as the independent variable. Later the data might seem to "confirm" this causal relationship. However, the actual relationship may be reversed or bi-directional. For instance, when the economy is improving, the crime rate drops significantly. One may assert that because more and more people are employed and earn higher salaries, they don't need to do anything illegal to extort money. However, it is also possible that because a lower crime rate makes society safer, foreign investment pours into the market, thus boosting jobs and the economy.

To meet the criterion of isolation, careful measurement should be implemented to establish validity and reliability, and to reduce measurement errors. In addition, extraneous variance, also known as threats to the validity of research, must be controlled in the research design.

Last, statistical methods are used to calculate the mathematical association among variables. However, in spite of a strong mathematical association, the causal inference may not make sense at all if directionality and isolation are not established.
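A short simulation can illustrate why association alone is not enough. In the sketch below, a hypothetical extraneous variable `z` drives both `x` and `y`, while `x` has no effect on `y` at all. The variable names, the sample size, and the use of a first-order partial correlation to "isolate" `z` are illustrative assumptions, not a procedure taken from Hoyle (1995).

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(42)  # fixed seed so the run is reproducible
n = 2000

# z is an uncontrolled extraneous variable that influences both x and y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]  # x depends on z only
y = [zi + rng.gauss(0, 0.5) for zi in z]  # y depends on z only, not on x

r_xy = pearson(x, y)
r_xy_given_z = partial_corr(r_xy, pearson(x, z), pearson(y, z))
print(f"correlation of x and y:               {r_xy:.2f}")
print(f"partial correlation, controlling z:   {r_xy_given_z:.2f}")
```

The raw correlation between `x` and `y` is strong, yet it nearly vanishes once `z` is held constant. A researcher who looked only at the association would be tempted to infer a causal link that the data-generating process does not contain.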

In summary, statistical analysis is only a small part of the entire research process. It is only a subset of data analysis, and data analysis is one of the three components of quantitative research. Hoyle (1995) explicitly warned that researchers should not regard statistical procedures as the only way to establish a cause-and-effect interpretation.

Last updated: 2016

### References

• Blalock, H. M. (Ed.). (1974). Measurement in the social sciences: Theories and strategies. Chicago, IL: Aldine Publishing Company.

• Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences. New York, NY: Routledge.

• Box, G. E. P., Hunter, W. G., & Hunter, J. S. (1978). Statistics for experimenters. New York: Wiley.

• Creswell, J. (2005). Educational research: Planning, conducting, and evaluating quantitative and qualitative research. Upper Saddle River, NJ: Pearson.

• Cukier, K. (2010). Data, data everywhere. The Economist, 25. Retrieved from http://www.economist.com/node/15557443

• Hoyle, R. H. (1995). The structural equation modeling approach: Basic concepts and fundamental issues. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 1-15). Thousand Oaks, CA: Sage Publications.

• Kerlinger, F. N. (1960). The mythology of educational research: The methods approach. School and Society, 88, 149-151.

• Kerlinger, F. N. (1986). Foundations of behavioral research (3rd ed.). Fort Worth, TX: Holt, Rinehart and Winston.

• Pedhazur, E. J. (1982). Multiple regression in behavioral research: Explanation and prediction (2nd ed.). Fort Worth, TX: Harcourt Brace College Publishers.

• Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design, and analysis: An integrated approach. Hillsdale, NJ: Lawrence Erlbaum Associates.

• Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin.

• SPSS, Inc. (1997). Trial Run 1.0 user guide. Chicago, IL: Author.

• Tukey, J. (1986a). Sunset Salvo. American Statistician, 40(1), 72-76.

• Tukey, J. (1986b). The collected works of John W. Tukey (Volume IV): Philosophy and principles of data analysis: 1965-1986. Monterey, CA: Wadsworth & Brooks/Cole.
Last updated: 2017
