Talent Blog

CATs are the greatest (and cutest), right?


Apr 5, 2017 | by Kate LaPort, John Capman, Caitlin Blackmore, and Michael McKenna

No, no, not this CAT.

We’re talking about Computer Adaptive Tests. These CATs are most often used in college entrance exams such as the SAT and the GRE, and they represent the most sophisticated tests in employment assessment and selection. A CAT’s adaptive capability allows it to respond dynamically to a test taker’s pattern of answers. In other words, the CAT’s adaptive algorithm chooses which item to administer next by evaluating the test taker’s responses to previous items. Items are chosen to provide the most information about the test taker at his or her current estimated level of whatever trait the test is measuring. For example, on a cognitive ability test, a test taker who answers a difficult item incorrectly will be administered a less difficult item next. CATs offer a variety of benefits for test takers and test administrators compared to traditional non-adaptive or “static” tests, such as:

  • Shorter Testing Time
  • Greater Test Security
  • Unproctored Administration
  • Positive Test Taker Reactions
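To make the adaptive loop concrete, here is a minimal sketch in Python. It is an illustration under simplifying assumptions, not any vendor’s actual engine: it assumes a Rasch (one-parameter IRT) model, under which an item is most informative when its difficulty matches the current ability estimate, and it uses a crude shrinking-step ability update in place of a full maximum-likelihood estimator.

```python
def next_item(theta_est, difficulties, used):
    """Pick the most informative unused item. Under the Rasch model,
    item information P*(1-P) peaks when an item's difficulty equals
    the test taker's ability, so we choose the unused item whose
    difficulty is closest to the current ability estimate."""
    unused = [i for i in range(len(difficulties)) if i not in used]
    return min(unused, key=lambda i: abs(difficulties[i] - theta_est))

def administer_cat(answer, difficulties, n_items=5, step=1.0):
    """Run a short CAT. `answer(b)` returns True if the test taker
    answers an item of difficulty `b` correctly. After each response
    the ability estimate moves up or down by a shrinking step."""
    theta_est, used = 0.0, set()
    for k in range(n_items):
        item = next_item(theta_est, difficulties, used)
        used.add(item)
        if answer(difficulties[item]):
            theta_est += step / (k + 1)   # correct: route to harder items
        else:
            theta_est -= step / (k + 1)   # incorrect: route to easier items
    return theta_est
```

A simulee who keeps answering correctly is routed to progressively harder items and ends with a high ability estimate, while one who keeps missing drifts downward, which is exactly the adaptive behavior described above.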

The advantages offered by CATs depend on two attributes: (1) providing more precise estimates of a test taker’s true ability level and (2) administering only items that are appropriate for each test taker. A CAT cannot live up to its full potential as an efficient, secure assessment if either of these elements is compromised.

Though all CATs share common objectives, there is a great deal of variability in how test developers evaluate how effective CATs are at delivering on their promise. A common approach to evaluating the accuracy of both a CAT’s ability estimates and its item administration is to conduct a series of Monte Carlo simulations.

A Monte Carlo simulation virtually administers a CAT assessment to thousands of simulated candidates representing the full range of test takers’ ability. These simulated candidates are programmed to respond in particular ways so that the precision of the CAT’s ability estimation and item administration can be evaluated. For example, some simulated candidates are programmed to respond correctly to every item, while others are programmed to respond incorrectly to every item.

There are a number of questions to address when designing and implementing a Monte Carlo simulation, including:

  • What should the simulated candidates’ true ability levels be (i.e., the anticipated ability distribution)?
  • What is the best approach for choosing and administering an item to a simulated test taker?
  • How should the simulated candidates be programmed to respond to test items?

To date, best practices for evaluating the quality of a CAT simulation have been limited. The lack of standard processes or concrete best practices for conducting simulations can lead to variable or even contradictory information from multiple simulations of the same CAT assessment.

Recognizing this lack of general best practices, we are presenting original work at the 32nd Annual Conference of the Society for Industrial and Organizational Psychology that will:

  • Outline the important considerations when evaluating CAT performance
  • Empirically investigate how choices related to these considerations influence estimates of CAT performance
  • Provide recommendations for practitioners using and evaluating CATs

If you’re developing CATs, evaluating CATs, or simply want to learn more, we’d love to see you in April!

“Evaluating CAT Effectiveness through Simulations: A Better Way Forward”
2017 Conference of the Society for Industrial and Organizational Psychology
Walt Disney World Swan and Dolphin
April 27, 2017
12:30 PM to 1:20 PM
Atlantic BC

About the Authors