Tests and Measurements II

Upon completion of this course, the student should be able to:

  • Apply APA standards for psychological assessment instruments.
  • Understand core concepts in psychometric theory, including reliability, validity, test construction, item response theory, generalizability theory, and test bias.
  • Apply critical thinking skills regarding psychometric theory and specific psychological test instruments.
  • Understand the potential influence of diversity issues in psychological assessment in all domains.

Syllabus (tentative)
In addition to the readings listed below, we will have occasional reading/reference to Kaplan & Saccuzzo from Tests & Measurements I.

November 26, 2012

Topics: Measurement in psychology

             Factor analysis in test construction and development

Required Text

Furr & Bacharach Chapter 4:  Test dimensionality and factor analysis

Required reading:

Russell, B. (1897). On the relations of number and quantity. Mind, 6(23), 326-341. (pdf)

Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103(2684), 677-680. (pdf)

Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61, 27-41. (pdf)

Kahn, J. H. (2006). Factor analysis in counseling psychology research, training, and practice: Principles, advances, and applications. The Counseling Psychologist, 34(5), 684-718. (pdf)

Lecture #1


December 3, 2012

Topics: Reliability revisited: Concepts and methods

Required text: 

Furr & Bacharach Chapter 5: Reliability: Conceptual basis

Furr & Bacharach Chapter 6: Empirical estimates of reliability

Required reading

Standards for Educational and Psychological Testing Chapter 2 (pdf)

Osburn, H. G. (2000). Coefficient alpha and related internal consistency reliability coefficients. Psychological Methods, 5, 343-355. (pdf)

Lecture #2


December 10, 2012

Topics:   Reliability: Methods and importance

Required text:  

Furr & Bacharach Chapter 7: Importance of reliability

Lecture #3 

Interesting article from the Washington Post on the Myers-Briggs from December 14


December 17, 2012

Topics: Validity revisited: Concepts and methods

Required text: 

Furr & Bacharach,  Chapter 8: Validity: Conceptual basis

Furr & Bacharach, Chapter 9: Validity: Estimating and evaluating convergent and discriminant validity evidence

Required readings:

Standards for Educational and Psychological Testing Chapter 1. (pdf)

Cronbach L.J. & Meehl, P.E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302. (pdf)

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105. (pdf)

Goodwin, L.D., & Leech, N.L. (2003). The meaning of validity in the new Standards for Educational and Psychological Testing: Implications for measurement courses. Measurement and Evaluation in Counseling and Development, 36, 181-191. (pdf)

Hogan, R. & Nicholson, R.A. (1988). The meaning of personality test scores.  American Psychologist, 43, 621-626. (pdf)

Borsboom, D., Mellenbergh, G.J., & Van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061-1071. (pdf)

Borsboom, D., Cramer, A.O.J., Kievit, R.A., Zand Scholten, A., & Franic, S. (2009). The end of construct validity. In: Lissitz, R.W. (Ed.). The concept of validity: Revisions, new directions, and applications. Information Age Publishers. (pdf)

Lecture #4

Borsboom Lecture


January 7, 2013

Topics: Response bias, review of requirements for papers (handout)

Required text

Furr & Bacharach Chapter 10: Response biases

Furr & Bacharach Chapter 11: Test biases

Lecture #5


January 14, 2013

Topics: Generalizability theory

ALSO -- Please read Ch. 1 and Ch. 2 of the Standards for Educational and Psychological Testing, now posted above (the reading that WAS delayed has now arrived)

Required text

Furr & Bacharach Chapter 12: Generalizability theory

Required readings

Shavelson, R.J., Webb, N.M., & Rowley, G.L. (1989).  Generalizability theory. American Psychologist, 44, 922-932 (pdf)

A sample paper using generalizability theory (not required reading, but it might help things make more sense).

Another sample paper comparing generalizability theory to classical test theory.  

Lecture #6


January 21, 2013

NO CLASS - Martin Luther King Jr. Day

An interesting link to The New Yorker dealing with Bayesian statistics and Nate Silver. 


January 28, 2013

Topic: Item response theory

Required text

Furr & Bacharach Chapter 13: Item response theory and Rasch models

Required reading

Harvey, R.J., & Hammer, A.L. (1999) Item response theory.  The Counseling Psychologist, 27, 353-383. (pdf)

Lecture #7

Barnum effect example

A real site where you can take a personality measure and get a report about how you do.


February 4, 2013

Topic: Evaluating psychological tests: applications

Required reading:   

Dawes, R.M., Faust, D., & Meehl, P.E. (1989). Clinical versus actuarial judgment. Science, 243, 1668-1674. (pdf)

Meehl, P.E. (1954). Clinical versus statistical prediction: A theoretical analysis and a review of the evidence. Minneapolis, MN:  University of Minnesota Press. (link)   

  • You don't really have to read the whole book above for class, but it would serve you well to read it at some point in your academic career.  Daniel Kahneman, the only psychologist to win a Nobel Prize in economics ([you can read about his new book in the New York Review of Books here] and, with Tomas Tranströmer, one of the only two Nobel laureate psychologists ever), was inspired to do his award-winning work by the book. Even if you don't read the book now, please do read the article above relating to it as well as the meta-analysis below relating to it for class.

Grove, W.M., Zald, D.H., Lebow, B.S., Snitz, B.E., & Nelson, C. (2000). Clinical versus mechanical prediction: a meta-analysis. Psychological Assessment, 12(1), 19-30. (pdf)

Wei, M.,  Alvarez, A.N.,  Ku, T.Y.,  Russell, D.W.,  & Bonett, D.G.  (2010). Development and validation of a coping with discrimination scale: Factor structure, reliability, and validity.  Journal of Counseling Psychology, 57, 328-344. (pdf)

Lecture #8

Video of Meehl (see also Wikipedia page and obituary) lecturing on clinical vs. statistical prediction, circa March 2, 1989.  Also, you might want to see this exchange between Meehl and Donald Peterson, his former student, or Peterson's chapter here. Also, see a link to an upcoming invited symposium about Meehl's legacy.

Recommended reading:  

Vrieze, S. I., & Grove, W. M. (2009). Survey on the use of clinical and mechanical prediction methods in clinical psychology. Professional Psychology: Research and Practice, 40(5), 525-531. (pdf)

Garb, H. N. (2010). The social psychology of clinical judgments. In J. E. Maddux & J. P. Tangney (Eds.). Social psychological foundations of clinical psychology,  297-311. New York, NY: Guilford Press. (pdf)

From class discussion 

(not required reading, but to verify any claims I made or see examples of the things I discussed):

Borsboom, D. (2006). The attack of the psychometricians. Psychometrika, 71, 425-440.  (pdf)

Dawes, R. M. (1971). A case study of graduate admissions: Application of three principles of human decision making. American Psychologist, 26, 180-188. (pdf)

Dawes, R. M. (1975). Graduate admissions criteria and future success. Science, 187, 721-723. (pdf)

Dawes, R. M. (1979). The robust beauty of improper linear models in decision making. American Psychologist, 34(7), 571-582. (pdf)

Dawes, R. M. (2005). The ethical implication of Paul Meehl's work on comparing clinical and actuarial prediction methods. Journal of Clinical Psychology, 61(10), 1245-1255. (pdf)

Goldberg, L. R. (1977).  Admission to the Ph.D. program in the Department of Psychology at the University of Oregon.  American Psychologist, 32(8), 663-668. (pdf)

Lilienfeld, S.O. (2007). Psychological treatments that cause harm.  Perspectives on Psychological Science, 2, 53-70. (pdf)

Meehl, P. E. (1957). When shall we use our heads instead of the formula? Journal of Counseling Psychology, 4, 268-273.  (pdf)

Reise, S. P., & Waller, N. G. (2003). How many IRT parameters does it take to model psychopathology items? Psychological Methods, 8(2), 164-184. (pdf)

Wilson, M. (2013). Seeking a balance between the statistical and scientific elements in psychometrics. Psychometrika, 78(2), xxx-yyy. (pdf)

A link about professor and adjunct salaries.


February 11, 2013

Topic: Evaluating psychological tests: applications, review for exam

Example article:

Wood, J. M., Lilienfeld, S. O., Nezworski, T. M., Garb, H. N., Allen, K. H., & Wildermuth, J. L. (2010). Validity of Rorschach Inkblot scores for discriminating psychopaths from nonpsychopaths in forensic populations: A meta-analysis. Psychological Assessment, 22(2), 336-349. (pdf) 

(We will wrap up odds and ends for the semester, discuss more application of tests and measurements to evaluating tests, and then do a review driven by YOUR questions rather than based on something I design.  Hence, little will be posted here [if some student wants to share typed notes, I'll post that])


February 18, 2013


Exam (please don't click here, as I won't give you permission; I put the exam here so I can access it easily when I need to do so).