Measuring Test Quality: Reliability and Validity
August 28, 2023
Research shows that almost 85% of today's companies use some form of pre-employment assessment in their hiring procedure. So, there is a high possibility that you are part of an organization that uses tests to hire new employees.
Does that mean you can pick up any test, implement it for your hiring needs, and end up with stellar employees at work? Definitely not. Pre-employment tests need to be vetted to determine their usefulness. To do this, you need to answer two questions:
Before jumping to determine the usefulness of these tests, you need to understand what a pre-employment test is.
Pre-employment tests are a standardized way of obtaining data on candidates during hiring. They are used to determine your potential employees' abilities and traits.
For a test to be deemed "good", it must satisfy three main conditions:
Simply put, a test must be reliable and valid for it to be considered "good". The extent to which a test satisfies these conditions is determined by two properties: Reliability and Validity.
Reliability refers to the extent to which a test consistently measures a characteristic.
Think of reliability in this way: No matter how many times you take the test, you obtain similar scores. If this is true for your pre-employment test, it is considered reliable.
Say your candidate obtains different scores each time they take the test. How do you account for the change? Some possible reasons (errors) are:
Simply put, the degree to which the test scores remain unaffected by measurement errors is known as the reliability of the test.
You want to use a pre-employment assessment that is reliable, right? But how do you determine the reliability of a test? There are multiple ways to do that.
You want to choose the test which consistently gives similar scores no matter how many times your candidate takes the test. Using a less reliable test will harm your organization. You end up with inferior employees, a culture mismatch, and an inefficient team.
Reliability is quantified using the reliability coefficient (r). Typically, higher values of 'r' indicate more reliability. For a test to be usable, you want the coefficient to be higher than 0.7 on a scale from 0 to 1. Any test below this benchmark may not have much application in practical workplace scenarios.
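To make the coefficient concrete, here is a minimal sketch of one common way to estimate it: the test-retest approach, where the same candidates take the test twice and you correlate the two sets of scores. The scores and the names `pearson_r`, `first_attempt`, and `second_attempt` are hypothetical, made up for illustration. Only the 0.7 benchmark comes from this post.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations from each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: six candidates take the same test twice
first_attempt = [72, 85, 60, 90, 78, 66]
second_attempt = [70, 88, 63, 87, 80, 64]

reliability = pearson_r(first_attempt, second_attempt)
is_usable = reliability > 0.7  # benchmark quoted in this post
print(f"test-retest r = {reliability:.2f}, usable: {is_usable}")
```

With scores this close between attempts, the coefficient lands well above the 0.7 benchmark. If the second attempt looked unrelated to the first, it would drop toward 0.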
Validity refers to what attribute the test measures and how well it measures that attribute.
It gives meaning to your candidate's test scores. If the test that you are using is valid, then the scores which your candidates obtain on the test directly correlate to their job performance.
To consider an employment assessment valid, you want evidence that higher scores on the test mean better job performance. There are multiple ways in which we can obtain this evidence.
Validity is a direct indication of the usefulness of a test. Remember, reliability tells you how consistent the scores are, but validity gives meaning to these scores. Without validity, there is no way of correlating your candidate's test scores to their job performance.
As with reliability, validity is interpreted using a coefficient: the validity coefficient. Typically, when you consider a single test, a coefficient value greater than 0.35 is considered very beneficial.
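As a rough illustration, a validity coefficient can be estimated by correlating candidates' test scores with a later measure of how they performed on the job. The sketch below assumes hypothetical data: `test_scores` and `performance_ratings` (manager ratings on a 1 to 5 scale) are invented for the example, and `correlation` is a helper written here, not part of any assessment tool. Only the 0.35 benchmark comes from this post.

```python
from math import sqrt

def correlation(scores, outcomes):
    """Pearson correlation between test scores and an outcome measure."""
    n = len(scores)
    if n != len(outcomes) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_s = sum(scores) / n
    mean_o = sum(outcomes) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(scores, outcomes))
    ss_s = sum((s - mean_s) ** 2 for s in scores)
    ss_o = sum((o - mean_o) ** 2 for o in outcomes)
    if ss_s == 0 or ss_o == 0:
        raise ValueError("values must vary for a correlation to be defined")
    return cov / sqrt(ss_s * ss_o)

# Hypothetical data: test scores at hiring time, and manager ratings
# (1 to 5) for the same eight people some months into the job
test_scores = [55, 62, 70, 74, 81, 90, 68, 77]
performance_ratings = [3.1, 2.8, 3.5, 3.0, 3.9, 4.2, 3.6, 3.3]

validity = correlation(test_scores, performance_ratings)
is_beneficial = validity > 0.35  # benchmark quoted in this post
print(f"validity coefficient = {validity:.2f}, beneficial: {is_beneficial}")
```

In practice you would want far more than eight people before trusting the number, since a correlation from a small sample can swing widely.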
Applying these two properties to evaluate the usefulness of tests should be the key takeaway from this post. Picture yourself playing darts. The objective is simple: strike the centre of the board. This is similar to what you are trying to accomplish with a pre-employment test. You want the test to do one thing, and you want it to be done right. Let's consider the three possible scenarios:
Not reliable, not valid:
Sorry to say, you have had a terrible game. Not only are your shots off target, but none of them consistently strike any point on the board. Similarly, tests that are neither valid nor reliable do not have consistent test scores, nor do they have any relation between the test score and job performance.
Reliable, but not Valid:
You had a better game than before, since you managed to strike the same area of the board consistently, but you still aren't hitting the mark. With reliable tests, the scores are consistent, but without validity, the relation with job performance does not exist.
Reliable and Valid:
This is the performance you want to have every game. You consistently strike the target (or close to it). In this scenario, the test scores are consistent, and the scores relate closely to the candidate's job performance. Higher scores imply better job performance, making filtering candidates a walk in the park.
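The three scenarios above can be sketched as a small check against the two benchmarks mentioned in this post (0.7 for reliability, 0.35 for validity). The function `classify_test` and its handling of edge cases are assumptions made for illustration, not an established scoring rule.

```python
RELIABILITY_MIN = 0.7   # reliability benchmark quoted in this post
VALIDITY_MIN = 0.35     # validity benchmark quoted in this post

def classify_test(reliability, validity):
    """Map a test's two coefficients to one of the three darts scenarios."""
    reliable = reliability > RELIABILITY_MIN
    valid = validity > VALIDITY_MIN
    if reliable and valid:
        return "Reliable and valid"
    if reliable:
        return "Reliable, but not valid"
    # Assumption of this sketch: if scores are not consistent, the test is
    # treated as unusable, whatever the validity figure says.
    return "Not reliable, not valid"

# Hypothetical coefficients for three different tests
print(classify_test(0.45, 0.10))  # Not reliable, not valid
print(classify_test(0.85, 0.12))  # Reliable, but not valid
print(classify_test(0.88, 0.41))  # Reliable and valid
```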
Here at Adaface, we focus on objective tests: technical tests that assess the actual skills used on the job, and psychometric tests, which research shows to have the highest correlation with job performance.
Do you want to know more about the various tests you can incorporate into your organization to help with your hiring needs? Check out our library of pre-employment assessments.
Pragnesh is the EiR at Adaface. He loves reading books more than scrolling through social media, which is a big deal if you ask him.