MTS 525-0
Special Topics Research Seminar

Section 20: Generalizing about Message Effects
Spring 2020

SYLLABUS: TOPIC 5

 

TOPIC 5:  Interpreting effect size magnitude and variability

 

5.1 Effect size magnitudes

            5.1.1  Abstract characterizations of effect size magnitude

            5.1.2  Observed average effect sizes

            5.1.3  The null as a range: Equivalence testing and second-generation p-values

5.2  Effect size variability

            5.2.1  Heterogeneity indices (I², Q, Birge’s R, etc.)

            5.2.2  Prediction intervals

5.3  The “replication crisis” revisited
 
5.1  Effect size magnitudes

 

5.1.1  Abstract characterizations of effect size magnitude

 

Funder, D. C., & Ozer, D. J. (2019). Evaluating effect size in psychological research: Sense and nonsense. Advances in Methods and Practices in Psychological Science, 2, 156-168. doi:10.1177/2515245919847202

 

For further reading: 

            Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.

            Abelson, R. P. (1985). A variance explanation paradox: When a little is a lot. Psychological Bulletin, 97, 129-133. doi:10.1037/0033-2909.97.1.129

            Prentice, D. A., & Miller, D. T. (1992). When small effects are impressive. Psychological Bulletin, 112, 160-164. doi:10.1037/0033-2909.112.1.160

            Pogrow, S. (2019). How effect size (practical significance) misleads clinical practice: The case for switching to practical benefit to assess applied research findings. The American Statistician, 73(S1), 223-234.  doi:10.1080/00031305.2018.1549101 

            Correll, J., Mellinger, C., McClelland, G. H., & Judd, C. M. (2020). Avoid Cohen’s ‘small’, ‘medium’, and ‘large’ for power analysis. Trends in Cognitive Sciences, 24(3), 200-207. doi:10.1016/j.tics.2019.12.009
 
5.1.2  Observed average effect sizes

 

Rains, S. A., Levine, T. R., & Weber, R. (2018). Sixty years of quantitative communication research summarized: Lessons from 149 meta-analyses. Annals of the International Communication Association, 42, 105-124. doi:10.1080/23808985.2018.1446350 

 

Schäfer, T., & Schwarz, M. A. (2019). The meaningfulness of effect sizes in psychological research: Differences between sub-disciplines and the impact of potential biases. Frontiers in Psychology, 10, article 813.  doi:10.3389/fpsyg.2019.00813 

 

For further reading: 

            Haase, R. F., Waechter, D. M., & Solomon, G. S. (1982). How significant is a significant difference? Average effect size of research in counseling psychology. Journal of Counseling Psychology, 29, 58-65.

            Cooper, H., & Findley, M. (1982). Expected effect sizes: Estimates for statistical power analysis in social psychology. Personality and Social Psychology Bulletin, 8, 168-173. doi:10.1177/014616728281026

            Hemphill, J. F. (2003). Interpreting the magnitudes of correlation coefficients. American Psychologist, 58(1), 78-79. doi:10.1037/0003-066X.58.1.78   

            Richard, F. D., Bond, C. F., Jr., & Stokes-Zoota, J. J. (2003). One hundred years of social psychology quantitatively described. Review of General Psychology, 7(4), 331-363. doi:10.1037/1089-2680.7.4.331 

            Hill, C. J., Bloom, H. S., Black, A. R., & Lipsey, M. W. (2008). Empirical benchmarks for interpreting effect sizes in research. Child Development Perspectives, 2(3), 172-177.  doi:10.1111/j.1750-8606.2008.00061.x

            Ferguson, C. J. (2009). Is psychological research really as good as medical research? Effect size comparisons between psychology and medicine. Review of General Psychology, 13, 130-136. doi:10.1037/a0015103

            Chen, H., Cohen, P., & Chen, S. (2010). How big is a big odds ratio? Interpreting the magnitudes of odds ratios in epidemiological studies. Communications in Statistics: Simulation and Computation, 39(4), 860-864.  doi:10.1080/03610911003650383 

            Bosco, F. A., Aguinis, H., Singh, K., Field, J. G., & Pierce, C. A. (2015). Correlational effect size benchmarks. The Journal of Applied Psychology, 100(2), 431–449. doi:10.1037/a0038047 

            Leucht, S., Helfer, B., Gartlehner, G., & Davis, J. M. (2015). How effective are common medications: A perspective based on meta-analyses of major drugs. BMC Medicine, 13, 253. doi:10.1186/s12916-015-0494-1

            Gignac, G. E., & Szodorai, E. T. (2016). Effect size guidelines for individual differences researchers. Personality and Individual Differences, 102, 74–78. doi:10.1016/j.paid.2016.06.069 

            Paterson, T. A., Harms, P. D., Steel, P., & Credé, M. (2016). An assessment of the magnitude of effect sizes: Evidence from 30 years of meta-analysis in management. Journal of Leadership & Organizational Studies, 23(1), 66-81. doi:10.1177/1548051815614321 

            Lovakov, A., & Agadullina, E. (2017). Empirically derived guidelines for interpreting effect size in social psychology. PsyArXiv manuscript. psyarxiv.com/2epc4. doi:10.17605/OSF.IO/2EPC4

            Brydges, C. R. (2019). Effect size guidelines, sample size calculations, and statistical power in gerontology. Innovation in Aging, 3(4), igz036. doi:10.1093/geroni/igz036  
 
5.1.3  The null as a range: Equivalence testing and second-generation p-values

 

Weber, R., & Popova, L. (2012). Testing equivalence in communication research: Theory and application. Communication Methods and Measures, 6, 190-213. doi:10.1080/19312458.2012.703834 

 

Blume, J. D., Greevy, R. A., Welty, V. F., Smith, J. R., & Dupont, W. D. (2019). An introduction to second-generation p-values. The American Statistician, 73(S1), 157-167.  doi:10.1080/00031305.2018.1537893

 

For further reading: 

            Wellek, S. (2010). Testing statistical hypotheses of equivalence and noninferiority (2nd ed.). Boca Raton, FL: Chapman & Hall/CRC.

            Goertzen, J. R., & Cribbie, R. A. (2010). Detecting a lack of association: An equivalence testing approach. British Journal of Mathematical and Statistical Psychology, 63, 527–537. doi:10.1348/000711009X475853

            Rainey, C. (2014). Arguing for a negligible effect. American Journal of Political Science, 58, 1083-1091. doi:10.1111/ajps.12102

            Lash, T. L., & Kaufman, J. S. (2015). Seeking persuasively null results. Epidemiology, 26, 449-450. doi:10.1097/EDE.0000000000000318

            Lakens, D. (2017). Equivalence tests: A practical primer for t-tests, correlations, and meta-analyses. Social Psychological and Personality Science, 8, 355-362.  doi:10.1177/1948550617697177

            Lakens, D., Scheel, A. M., & Isager, P. M. (2018). Equivalence testing for psychological research: A tutorial. Advances in Methods and Practices in Psychological Science, 1, 259-269. doi:10.1177/2515245918770963 
 
5.2  Effect size variability

 

5.2.1  Heterogeneity indices (I², Q, Birge’s R, etc.)

 

Higgins, J. P. T., & Thompson, S. G. (2002). Quantifying heterogeneity in a meta-analysis. Statistics in Medicine, 21, 1539-1558. doi:10.1002/sim.1186

 

For further reading:

            Birge, R. T. (1932). The calculation of errors by the method of least squares. Physical Review, 40 (2nd ser.), 207-227.

            Hall, J. A., & Rosenthal, R. (1991). Testing for moderator variables in meta-analysis: Issues and methods. Communication Monographs, 58, 437-448. doi:10.1080/03637759109376240

            Sánchez-Meca, J., & Marín-Martínez, F. (1997). Homogeneity tests in meta-analysis: A Monte Carlo comparison of statistical power and Type I error. Quality and Quantity, 31, 385-399.

            Engels, E. A., Schmid, C. H., Terrin, N., Olkin, I., & Lau, J. (2000). Heterogeneity and statistical significance in meta-analysis: An empirical study of 125 meta-analyses. Statistics in Medicine, 19, 1707-1728.

            Higgins, J., Thompson, S., Deeks, J., & Altman, D. (2002). Statistical heterogeneity in systematic reviews of clinical trials: A critical appraisal of guidelines and practice. Journal of Health Services Research and Policy, 7, 51-61. doi:10.1258/1355819021927674

            Higgins, J. P. T., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring inconsistency in meta-analyses. BMJ, 327, 557-560. doi:10.1136/bmj.327.7414.557

            Huedo-Medina, T. B., Sánchez-Meca, J., Marín-Martínez, F., & Botella, J. (2006). Assessing heterogeneity in meta-analysis: Q statistic or I² index? Psychological Methods, 11, 193-206. doi:10.1037/1082-989X.11.2.193

            Rücker, G., Schwarzer, G., Carpenter, J. R., & Schumacher, M. (2008). Undue reliance on I² in assessing heterogeneity may mislead. BMC Medical Research Methodology, 8, 79. doi:10.1186/1471-2288-8-79

            Ioannidis, J. P. A. (2008). Interpretation of tests of heterogeneity and bias in meta-analysis. Journal of Evaluation in Clinical Practice, 14, 951-957. doi:10.1111/j.1365-2753.2008.00986.x

            Pereira, T. A., Patsopoulos, N. A., Salanti, G., & Ioannidis, J. P. A. (2010). Critical interpretation of Cochran's Q test depends on power and prior assumptions about heterogeneity. Research Synthesis Methods, 1, 149–161. doi:10.1002/jrsm.13

            Card, N. A. (2012). Section 8.4: Evaluating heterogeneity among effect sizes. In Applied meta-analysis for social science research (pp. 184-191). New York: Guilford.

            Langan, D., Higgins, J. P. T., & Simmonds, M. (2015). An empirical comparison of heterogeneity variance estimators in 12 894 meta-analyses. Research Synthesis Methods, 6, 195–205. doi:10.1002/jrsm.1140

            Wiernik, B. M., Kostal, J. W., Wilmot, M. P., Dilchert, S., & Ones, D. S. (2017). Empirical benchmarks for interpreting effect size variability in meta-analysis. Industrial and Organizational Psychology, 10(3), 472–479. doi:10.1017/iop.2017.44
 
5.2.2  Prediction intervals

 

Borenstein, M., Higgins, J. P. T., Hedges, L. V., & Rothstein, H. R. (2017). Basics of meta-analysis: I² is not an absolute measure of heterogeneity. Research Synthesis Methods, 8, 5-18. doi:10.1002/jrsm.1230

 

IntHout, J., Ioannidis, J. P. A., Rovers, M. M., & Goeman, J. J. (2016). Plea for routinely presenting prediction intervals in meta-analysis. BMJ Open, 6, e010247. doi:10.1136/bmjopen-2015-010247 

 

For further reading:

            Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Chapter 17: Prediction intervals. In Introduction to meta-analysis (pp. 127-133). Chichester, West Sussex, UK: Wiley.

            Spence, J. R., & Stanley, D. J. (2016). Prediction interval: What to expect when you’re expecting … a replication. PLoS ONE, 11, e0162874. doi:10.1371/journal.pone.0162874 

            Partlett, C., & Riley, R. D. (2017). Random effects meta-analysis: Coverage performance of 95% confidence and prediction intervals following REML estimation. Statistics in Medicine, 36, 301-317. doi:10.1002/sim.7140

            Borenstein, M. (2018). Chapter 9.3: Prediction intervals. In Common mistakes in meta-analysis and how to avoid them (pp. 85-93). Englewood, NJ: Biostat.

            Nagashima, K., Noma, H., & Furukawa, T. A. (2019). Prediction intervals for random-effects meta-analysis: a confidence distribution approach. Statistical Methods in Medical Research, 28, 1689–1702.  doi:10.1177/0962280218773520 
 
5.3  The “replication crisis” revisited

 

Patil, P., Peng, R. D., & Leek, J. T. (2016). What should researchers expect when they replicate studies? A statistical view of replicability in psychological science. Perspectives on Psychological Science, 11, 539-544. doi:10.1177/1745691616646366 

 

De Boeck, P., & Jeon, M. (2018). Perceived crisis and reforms: Issues, explanations, and remedies. Psychological Bulletin, 144, 757-777.  doi:10.1037/bul0000154

 

For further reading:

            Hedges, L. V. (1987). How hard is hard science, how soft is soft science? The empirical cumulativeness of research. American Psychologist, 42(5), 443–455. doi:10.1037/0003-066X.42.5.443

            O’Keefe, D. J. (1999). Variability of persuasive message effects: Meta-analytic evidence and implications. Document Design, 1, 87-97. doi:10.1075/dd.1.2.02oke 

            Kaptein, M., & Eckles, D. (2012). Heterogeneity in the effects of online persuasion. Journal of Interactive Marketing, 26, 176-188. doi:10.1016/j.intmar.2012.02.002

            Bahník, Š., & Vranka, M. A. (2017). If it’s difficult to pronounce, it might not be risky: The effect of fluency on judgment of risk does not generalize to new stimuli. Psychological Science, 28(4), 427–436. doi:10.1177/0956797616685770

            Amrhein, V., Trafimow, D., & Greenland, S. (2019). Inferential statistics as descriptive statistics: There is no replication crisis if we don’t expect replication. The American Statistician, 73(S1), 262-270.  doi:10.1080/00031305.2018.1543137 

            Kenny, D. A., & Judd, C. M. (2019). The unappreciated heterogeneity of effect sizes: Implications for power, precision, planning of research, and replication. Psychological Methods, 24(5), 578-589.  doi:10.1037/met0000209 

            Vivalt, E. (in press). How much can we generalize from impact evaluations? Journal of the European Economic Association. Available at: http://evavivalt.com/wp-content/uploads/How-Much-Can-We-Generalize.pdf