Many parents and teachers are critical of the Standardised Assessment Tests (SATs) that have recently been taken by primary school children. One common complaint is that they are too hard. Teachers at my son’s school sent children home with example questions to quiz their parents on, hoping to show that getting full marks is next to impossible.
Invariably, when parents try out these tests, they focus on the most difficult or confusing items. Some parents and teachers can be heard complaining on social media that if they get questions wrong, surely the tests are too hard for ten-year-olds.
Today's SATs reading test was ridiculous, my little lovelies tried so hard. Surely children crying in a test is a sign that it's too hard?
— Charlotte Higgins (@charlottehiggin) May 9, 2016
But how hard should tests for children be?
As a psychologist, I know we have some well-developed principles that can help us address the question. If we look at the SATs as measures of some kind of underlying ability, then we can turn to one of the oldest branches of psychology – “psychometrics” – for some guidance.
Getting it just right
A good test shouldn’t be too hard. If most people get most questions wrong, then you have what is called a “floor effect”. The result is that you can’t tell any difference in ability between the people taking the test.
If we started the school sports day high jump with the bar at two metres high (close to the world record), then we’d finish sports day with everybody getting the same – zero successful jumps – and no information about how good anyone is at the high jump.
But at the same time, a good test shouldn’t be too easy. If most people get everything right, then the effect is, as you might expect, called a “ceiling effect”. If everybody gets everything right then again we don’t get any information from the test.
The key idea is that tests must discriminate. In psychometric terms, the value of a test is about the match between the thing it is supposed to measure and the difficulty of the items on the test. If I wanted to gauge maths ability in six-year-olds and I gave them all an A-Level paper, we can presume that nearly everyone would score zero. Although the A-Level paper might be a good test, it is completely uninformative if it is badly matched to the ability of the people taking the test.
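The mismatch can be made concrete with a rough sketch. It assumes the Rasch model from item response theory, in which the chance of a correct answer depends only on the gap between a child's ability and a question's difficulty. The three abilities, the 20-question papers and the difficulty values are all made up for illustration.

```python
import math

def p_correct(ability, difficulty):
    """Rasch model: probability of answering one item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def expected_score(ability, difficulties):
    """Expected number of correct answers on a paper."""
    return sum(p_correct(ability, b) for b in difficulties)

# Hypothetical weaker, average and stronger child (arbitrary ability units).
abilities = [-1.0, 0.0, 1.0]

# Three hypothetical 20-question papers.
too_hard = [6.0] * 20    # far above everyone: floor effect
too_easy = [-6.0] * 20   # far below everyone: ceiling effect
matched = [0.0] * 20     # pitched at the average child

def score_gap(difficulties):
    """Difference in expected score between the strongest and weakest child."""
    scores = [expected_score(a, difficulties) for a in abilities]
    return max(scores) - min(scores)

for name, paper in [("too hard", too_hard), ("too easy", too_easy), ("matched", matched)]:
    print(f"{name:8s} paper separates the children by {score_gap(paper):.2f} marks")
```

On the badly matched papers the three children end up within a fraction of a mark of each other, while the matched paper separates them by several marks.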
Here’s the rub: for a test to be sensitive to differences in ability, it must contain items which people get wrong. Actually, there’s a precise answer to the proportion that you should get wrong – in the most sensitive test it should be half of the items. Questions which you are 50% likely to get right are the ones which are most informative.
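The 50% figure can be checked with a few lines of arithmetic. This is a minimal sketch, assuming each question is simply right or wrong: the variance of such an outcome is p × (1 − p), which under the one-parameter (Rasch) model is also the information the question carries. The function name is illustrative.

```python
def item_information(p_correct: float) -> float:
    """Variance of a right/wrong item answered correctly with probability p_correct."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be between 0 and 1")
    return p_correct * (1.0 - p_correct)

# Evaluate the information for chances of success from 0% to 100%.
probabilities = [i / 10 for i in range(11)]
information = {p: item_information(p) for p in probabilities}
most_informative = max(information, key=information.get)

for p in probabilities:
    print(f"P(correct) = {p:.1f}  information = {information[p]:.2f}")
print("Most informative chance of success:", most_informative)
```

The value peaks at 0.25 when the chance of success is 0.5 and falls to zero for questions that everyone, or no one, answers correctly.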
How we feel about measuring and labelling children according to their skill at taking these tests is a big issue, but it is important that we recognise that this is what tests do. A well-designed test will make all children get some items wrong – that is inherent in its design. It is up to us how we conceptualise that: whether tests are an unnecessary distraction from true education, or a necessary challenge we all need to be exposed to.
Better tests?
If you adopt this psychometric perspective, it becomes clear that the tests we use are an inefficient way of measuring any individual child’s particular ability to do the test. Most children will be asked a bunch of questions which are too easy for them, before they get to the informative ones which are at the edge of their ability. Then they will go on to attempt a bunch of questions which are far too hard. And pity the people for whom the test is poorly matched to their ability and consists mostly of questions they’ll get wrong – which is both uninformative in psychometric terms, and dispiriting emotionally.
A hundred years ago, when we began our modern fixation with testing and measuring, it was hard to avoid the waste of asking many uninformative and potentially depressing questions. This was simply because all children had to take the same exam paper.
Nowadays, however, examiners can administer tests via computer, and algorithmically identify the most informative questions for each child’s ability – making the tests shorter, more accurate, and less focused on the experience of failure. You could throw in enough easy questions that no child would ever have the experience of getting most of the questions wrong. But still there’s no getting around the fact that an informative test has to contain questions most people sitting it will get wrong.
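A toy version of such an adaptive test might look like the sketch below. It is an illustration under stated assumptions, not how any real exam board works: the question bank, the ability scale and the simulated child are hypothetical, and the step-halving update is a deliberately crude stand-in for the statistical estimates (such as maximum likelihood) that real adaptive tests typically use.

```python
import math

def p_correct(ability, difficulty):
    """Rasch model: probability of answering one item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def run_adaptive_test(item_difficulties, answers_correctly, n_questions=5):
    """Ask n_questions, always choosing the unused item nearest the current estimate."""
    remaining = list(item_difficulties)
    estimate, step = 0.0, 1.0
    asked = []
    for _ in range(min(n_questions, len(remaining))):
        # The item closest to the estimate has a predicted success rate nearest 50%.
        item = min(remaining, key=lambda b: abs(b - estimate))
        remaining.remove(item)
        predicted = p_correct(estimate, item)
        correct = answers_correctly(item)
        asked.append((item, predicted, correct))
        # Crude update (an assumption): move up after a right answer,
        # down after a wrong one, by a step that halves each time.
        estimate += step if correct else -step
        step /= 2
    return estimate, asked

# Hypothetical question bank: difficulties from -3 (very easy) to +3 (very hard).
bank = [x / 2 for x in range(-6, 7)]
true_ability = 1.2

# Simplifying assumption: the simulated child gets a question right exactly
# when it is no harder than their ability. Real answers would be noisier.
estimate, asked = run_adaptive_test(bank, lambda b: b <= true_ability)

for difficulty, predicted, correct in asked:
    outcome = "right" if correct else "wrong"
    print(f"difficulty {difficulty:+.1f}  predicted P(correct) {predicted:.2f}  {outcome}")
print(f"estimated ability {estimate:.2f} (true value {true_ability})")
```

After five questions the estimate lands close to the simulated child's ability, and none of the very easy or very hard questions in the bank is ever asked.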
Even a good test can measure an educationally irrelevant ability (such as merely the ability to do the test, or memorise abstract grammar rules), or be used in ways that harm children. But having difficult items isn’t a problem with the SATs; it’s a problem with all tests.
Tom Stafford does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond the academic appointment above.
Tom Stafford, Lecturer in Psychology and Cognitive Science, University of Sheffield
This article was originally published on The Conversation. Read the original article.






