A new promise for scenario-based assessments

Do scenario-based assessments hold new promise for uncovering reading ability?

The middle schooler sits at her computer, poised to take a 45-minute test that measures her reading comprehension. In the past, such annual reading exams have presented her with a series of short, sometimes boring, stretches of text about various, unrelated topics, usually followed by multiple-choice questions to see how well she understands what she has read.

But this time, she’s given a different sort of test, one that’s more like a fun project with simulated classmates and a virtual teacher with whom she can interact. Everything in the exam is based on a single theme. In this case, she’s told she needs to help her class create a website about environmentally friendly schools. She reads web text and participates in a message board with her classmates, viewing their comments and adding her own; writes a summary of a section of the text (and sees how her virtual classmate “Brian” wrote a similar one); completes a graphic organiser that’s already been partly filled out by “Diana”; and assesses whether a host of web comments about green schools are fact, opinion, right, wrong or totally off-topic.

This student has just completed a pilot version of a new research test, the Global Integrated Scenario-based Assessment, or GISA (rhymes with visa). The exam was developed by an ETS-led team of researchers as part of a U.S. government-funded initiative called Reading for Understanding. The initiative aims to research and provide effective strategies for improving reading comprehension for students in grades Pre-K–12.

GISA is designed to assess a student’s reading comprehension better than traditional exams have been able to do. Its innovative features could also ultimately help improve how well students read, experts suggest.

Following the science

The Reading for Understanding grant project that led to GISA was helmed by ETS researchers John Sabatini and Tenaha O’Reilly, along with other researchers from ETS as well as three partner universities (Florida State University, Arizona State University and Northern Illinois University). Work began in 2010 and is expected to be finalised in 2017.

During its development, GISA was administered to about 100,000 Pre-K through 12th-grade students around the country. This testing of the test showed that GISA is a reliable measure of comprehension, says Sabatini. Students have found it neither too difficult nor too easy, even though it’s markedly different from previous assessments they’ve taken.

In creating GISA, test developers turned to recent cognitive research about reading to design tasks and a structure that would effectively reveal how well today’s students understand what they read.

For example, says Sabatini, researchers looked to the work of Walter Kintsch, a professor of psychology and neuroscience at the University of Colorado Boulder, who found that people read in two different ways. First is to understand what the text says — the details and basic point. Second is to use this information in some way, such as to solve a problem.

Traditional reading assessments typically evaluate basic understanding of a text, and GISA does too. But unlike most standard tests, GISA also assesses the reader’s ability to purposefully apply the knowledge he or she has gained from a text.

“GISA is a much more challenging, integrative, analytic assessment,” says Catherine Snow, a professor of education at Harvard University, who worked on a Reading for Understanding research project and used GISA to test student outcomes.

Snow says that GISA asks students not just to summarise, but also to select, compare and evaluate.

“That wider array of tasks better replicates what we expect kids to be able to do as a result of reading comprehension instruction these days,” she says.

Each GISA is focused on a single scenario, such as having students prepare a presentation on the features of deserts or fact-check a website on the origins of the Mona Lisa. Students are then given problems or tasks related to the theme, which they must resolve using information they’ve read. This helps assessors see how readers analyse and interpret text.

“One of the ideas was to make assessments more authentic, to really add a purpose to the assessment,” says O’Reilly.

Reading expert P. David Pearson, a professor of education at the University of California, Berkeley, says one reason that he likes GISA is that students are given a clear reason for their activities. Plus, the activities “correspond to the ways in which we ask people to use reading in everyday life.” He adds that “using what you know to apply to the world is part and parcel of reading comprehension.”

Another area of recent reading research, says Sabatini, covers the integration of information across multiple sources. Thanks to the Internet, today’s readers are presented with vast quantities of often un-vetted data in various contexts (blogs, social media posts, websites) and from varying viewpoints. This requires them to know how to critically and strategically sort, sift and synthesise what they’ve read.

GISA’s scenario-based, digital delivery system tests for some of these 21st-century skills by featuring a multitude of digital sources of text and information. These often provide differing viewpoints that readers must evaluate.

Sometimes those differing perspectives come from the test’s simulated classmates, who often model tasks and provide a social, collaborative and engaging context for the test. At other times, however, they might give inaccurate or irrelevant information. Just as in the real world, readers need to know how to tell when sources are credible or untrustworthy, and when information is factual and when it is not.

Can GISA improve reading?

If scenario-based assessments like GISA were to become commonplace, how might this impact not just the testing of reading comprehension, but the teaching of reading? Could the tests actually improve how well students grasp what they read?

Reading expert Pearson believes tests “should reflect rather than lead good curriculum development” but also recognises that tests “have an impact on curriculum and teaching whether you want them to or not.”

If GISA ever moved beyond the pilot phase and became an actual product, it could “have a positive impact on the teaching of reading,” he says. “It would motivate curriculum designers and teachers and policymakers to teach reading comprehension in a way that’s more inherently social, more focused on applying what you learn from reading to problems in the world. And because GISA tasks tend to engage students in gathering information from digital sources, it would of necessity have a critical edge where issues of relevance and validity are part and parcel of reading.”

Snow agrees, noting that assessments are “very powerful” because they can change how educators conceive and teach a subject. “Ideally, if a test like GISA were used more widely, it would change how educators are thinking about comprehension,” she says.

Sabatini doesn’t claim that GISA could, by itself, improve the way children read.

“As much as we’d love to believe the assessment causes the reading development, we’re not there yet — and maybe that’s not an appropriate goal,” he says.

However, “assessments can model and provide a target and feedback,” which can help guide teachers and learners, he adds.

To that end, GISA’s creators included empirically validated reading strategies, says O’Reilly, such as summarisation, paraphrasing and the use of graphic organisers.

“We decided to put some of those types of strategies in the test itself to keep pace with the types of effective instructional practices that are done in the classroom and to support these good reading habits,” he says, noting that “partially serving as a potential model of learning was one of our high-end goals.”

Growing in use — now and in the future

GISA is a pilot test that was developed as part of a research initiative and is not being used in classrooms. However, scenario-based testing is beginning to find its way into other tests, both of reading and of other content areas. The National Assessment of Educational Progress (NAEP), for example, has begun piloting “GISA-like” scenarios in reading, science and other disciplines, says Pearson, who serves as the chair of NAEP’s standing committee on reading.

Meanwhile, Sabatini and O’Reilly hope that scenario-based tests like GISA become more common and familiar to teachers and students, which would allow them to push the envelope in future designs of the test.

“We kind of didn’t want to go too far because students had never taken a test like this before,” says Sabatini, referring to GISA’s first iteration.

But a GISA 2.0 could be even more creative, featuring more adaptations, choices and feedback loops, he says. In other words, students would be given more options during the exam based upon the choices they make. Also under consideration: including a hybrid mix of print and digital material, just as students are confronted with in the real world.

GISA or GISA-like scenarios potentially could be adapted for use in adult literacy and foreign language assessments, says Sabatini. “We hope to continue the work in one direction or another into the future.”

Lorna Collier is a writer specializing in education, technology and business. She writes frequently for the National Council of Teachers of English and the Center for Digital Education.