Checklist
The following questions can help you design and implement your research study. The answers are provided as examples. They are by no means model answers. Also, you don't have to answer every question. Some questions may not be relevant to your study.
Theory
What is the supporting instructional/psychological theory? What previous research can support the theory?
The supporting theory is Anderson's Adaptive Character of Cognition. The theory proposes that human cognition is a schema consisting of numerous nodes. Non-linear learning strengthens the associations among nodes, thereby forming a coherent schema. Research on HyperCard, Authorware, Director, Flash, and online learning modules supports this theory.
What is the treatment program?
A website with various hypertext features.
What is/are the media property(ies)?
Hypertext on the World Wide Web.
What is/are the observed variable(s) to represent the media property(ies)?
Use of websites. Usage will be checked through user access logs.
What is/are the mental construct(s)?
Non-linear thinking.
What is/are the behavioral outcomes to represent the mental construct(s)?
Problem solving that requires an extensive search of scattered information, such as a Web search.
Could the media property(ies) and the mental construct(s) form a direct causal relationship? If not, is/are there more mediating variable(s)?
The relationship is direct. There is no mediating variable.
Hypothesis
What is your null hypothesis?
There is no significant difference in problem-solving skills, as measured by objective tests and subjective observation, between learners who use hypertext-based online tutorials and those who do not.
What is your alternate hypothesis?
There is a significant difference in problem-solving skills, as measured by objective tests and subjective observation, between learners who use hypertext-based online tutorials and those who do not.
Can the hypothesis be falsified? Will it lead to a meaningful result?
Either one of the hypotheses can be wrong. According to Karl Popper, this risky position will lead to findings that carry scientific value.
Design of experiment
Is your study experimental, quasi-experimental, or non-experimental?
The study is quasi-experimental: subjects will be randomly assigned to groups, but the sample will not be drawn by random sampling. It will be a factorial design with two factors. The between-subject factor has two levels: hypertext group and conventional learning group. The within-subject factor also has two levels: pretest and posttest.
What are the potential threats against the validity of your study? How can you counteract those threats?
The study is subject to the threats of instrumentation and subject selection described by Cook and Campbell. Inter-rater reliability will be used to identify measurement errors. Randomization will be used as a countermeasure against selection bias. The attributes of the groups will be examined to avoid Simpson's paradox.
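To see why group attributes need examining, here is a minimal sketch of Simpson's paradox. The counts are hypothetical and chosen only for illustration, and the subgroup labels (low_skill, high_skill) are assumed names, not variables from the study. One group does better within every subgroup yet worse overall, because the subgroups are unevenly split between the groups.

```python
# Hypothetical counts for illustration only: (passed, total) per subgroup.
results = {
    "hypertext":    {"low_skill": (192, 263), "high_skill": (81, 87)},
    "conventional": {"low_skill": (55, 80),   "high_skill": (234, 270)},
}

def pass_rate(passed, total):
    """Proportion of subjects who passed."""
    return passed / total

def overall_rate(subgroups):
    """Pass rate after pooling all subgroups of one group."""
    passed = sum(p for p, _ in subgroups.values())
    total = sum(t for _, t in subgroups.values())
    return passed / total

for group, subgroups in results.items():
    by_subgroup = {name: round(pass_rate(*counts), 3)
                   for name, counts in subgroups.items()}
    print(group, by_subgroup, "overall:", round(overall_rate(subgroups), 3))
```

In this made-up table the hypertext group has the higher pass rate among both low-skill and high-skill subjects, but the pooled rate favors the conventional group. Comparing subgroup composition before pooling is one way to catch this.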
Do you want to study many interaction effects or to have a small study? Which experimental design will you use to optimize the study?
There is only one interaction effect and thus a full-factorial design is acceptable.
Population and sample
What is the target (hypothetical) population to which you will make inference?
All university students who have basic computer skills.
What is the accessible population from which you will draw samples?
University students who have basic computer skills in a local university.
What is your sampling method?
A mixture of convenience sampling and purposive sampling. I will go to several classes to recruit subjects who are qualified to be users of the treatment (e.g., those who have basic computer skills).
How will you choose the alpha level, effect size, power level, and sample size?
I conducted a literature review to collect articles related to this topic. Relevant statistics were extracted from those studies to compute the average effect size. The overall effect size can inform me of the expected direction and magnitude of the treatment effect. Given an alpha level of .05, an effect size of .25, and a power level of .80, the desired sample size is 128. However, I can get only 80 participants.
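The sample size figure can be roughly checked with a normal approximation. This is only a sketch under stated assumptions: the effect size .25 is taken to be Cohen's f for two equal-sized groups (so Cohen's d = 2f = .5), the test is two-sided, and the function names are my own. An exact calculation based on the t or F distribution gives a slightly larger number than this approximation, so the result should land just under 128.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-group mean comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

def approx_power(d, n, alpha=0.05):
    """Approximate power with n subjects in each of two groups."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(d * sqrt(n / 2) - z_alpha)

# Assumption: .25 is Cohen's f; with two equal groups, d = 2 * f.
d = 2 * 0.25
needed = n_per_group(d)               # subjects per group
power_at_80 = approx_power(d, n=40)   # 80 participants split into two groups

print("approximate total N:", needed * 2)
print("approximate power with 80 participants:", round(power_at_80, 2))
```

With only 80 participants the approximate power falls well short of .80, which is why a remedy is needed.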
If you do not conduct a power analysis or your sample size is less than what the power analysis suggests, what remedy will you use?
Resampling.
Measurement
Will you establish content validity, criterion validity, and/or construct validity?
The content validity will be established by a panel of content experts. The construct validity will be verified by factor analysis.
Which reliability estimation method will you use?
Cronbach's coefficient alpha for internal consistency and the Kappa coefficient for inter-rater reliability.
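Both coefficients are simple enough to compute by hand. The sketch below uses the textbook formulas. The Kappa shown is Cohen's kappa for two raters, which is an assumption about which Kappa is meant, and the item scores and ratings are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, each row a list of item scores."""
    k = len(scores[0])                                   # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])   # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical data: 5 respondents x 3 items, and 8 performances rated twice.
item_scores = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 4, 5], [3, 3, 4]]
rater_a = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "fail"]
rater_b = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "fail"]

alpha = cronbach_alpha(item_scores)
kappa = cohen_kappa(rater_a, rater_b)
print(round(alpha, 3), round(kappa, 3))
```

Alpha rises when items vary together, and Kappa is lower than raw percent agreement because it subtracts the agreement expected by chance.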
If your instrument is a performance test, which item difficulty method will you use?
Latent trait model item analysis.
If you conduct a test item bias analysis, which method will you use?
The Mantel-Haenszel (MH) procedure.
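Assuming MH here refers to the Mantel-Haenszel procedure commonly used to detect differential item functioning, its core statistic is a common odds ratio pooled across matched score levels. The sketch below is illustrative only: the counts are hypothetical, the function name is my own, and the -2.35 rescaling is the delta metric often reported alongside the odds ratio.

```python
from math import log

def mantel_haenszel_odds_ratio(strata):
    """strata: one (ref_correct, ref_wrong, focal_correct, focal_wrong)
    tuple per matched total-score level."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical counts for one test item at three score levels.
strata = [
    (12, 8, 8, 12),   # low scorers
    (16, 4, 12, 8),   # middle scorers
    (18, 2, 15, 5),   # high scorers
]

odds_ratio = mantel_haenszel_odds_ratio(strata)
delta = -2.35 * log(odds_ratio)   # negative values favor the reference group

print(round(odds_ratio, 3), round(delta, 3))
```

An odds ratio near 1 (delta near 0) suggests that the item behaves the same for both groups once ability is matched. Values far from 1 flag the item for review.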
How many constructs does your instrument measure? Can factor analysis support the factor structure?
There are five different domains in the problem solving test. Factor analysis confirms that there are five subscales in the test and they are not strongly correlated.
Analysis
If you use parametric tests, can the nature of the study and data structure meet the parametric assumptions? If not, what remedy will you use?
The study is experimental and thus an ANOVA model is appropriate. However, although outliers are removed, the data cannot meet the assumptions of normality and homogeneous variances. According to Monte Carlo simulations, my sample size is not large enough to make the test robust against the violations. ANOVA with data transformation and resampling will be used.
By resampling, can you verify the parametric test results?
The bootstrapping results confirm the parametric test results.
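One simple way to run such a check is a percentile bootstrap of the difference between group means. This is a minimal sketch, not necessarily the resampling scheme used in the study. The gain scores are hypothetical, and the function name and its parameters are my own.

```python
import random
from statistics import mean

def bootstrap_mean_diff_ci(group_a, group_b, n_resamples=5000,
                           level=0.95, seed=0):
    """Percentile bootstrap interval for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lower = diffs[int((1 - level) / 2 * n_resamples)]
    upper = diffs[int((1 + level) / 2 * n_resamples) - 1]
    return lower, upper

# Hypothetical gain scores (posttest minus pretest) for each group.
hypertext = [12, 15, 9, 14, 11, 13, 16, 10]
conventional = [8, 7, 10, 6, 9, 5, 8, 7]

lower, upper = bootstrap_mean_diff_ci(hypertext, conventional)
print(lower, upper)
```

If the interval excludes zero, the bootstrap agrees with a parametric test that rejects the null hypothesis. No normality or equal-variance assumption is needed for this check.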
By Bayesian inference, can you infer the treatment effectiveness?
The Bayesian inference indicates that the hypertext treatment is more effective in enhancing problem solving.
By exploratory data analysis, can you find an interesting pattern of the data?
By brushing and linking, it is found that the scores of female humanities majors tend to center around the post-test means.
Interpretation
Is noise, such as threats to the validity of the experiment and measurement errors, isolated from the study? What other noise remains in the study?
The measurement errors are acceptable but the pre-test effect is not controlled.
Are the predicted relationships among variables confirmed by the statistical methods?
The association between the dependent and independent variables is confirmed by the ANOVA with data transformation. The Bayesian inference and bootstrapping concur with the finding.
Can you establish a cause and effect relationship among variables?
Because the study is experimental in nature, the noise is isolated, and the association is confirmed, a causal inference can be established.
Can you generalize the findings to the entire population?
The subjects are randomized and their characteristics meet the requirements of the target audience; therefore, the findings can be generalized.