Alternative-Response Design: Structure and Advice

In the previous post we talked about the pros and cons of the alternative-response (e.g., true-false) question type, as well as how it maps onto Bloom’s Taxonomy.  Next we discuss what to consider when writing the questions themselves.

I found this “Simple Guidelines” list helpful and informative.

(Source: http://webs.rtc.edu/ii/Teaching%20Resources/GuidelinesforWritingTest.htm)

  1. Base the item on a single idea.
  2. Write items that test an important idea.
  3. Avoid lifting statements right from the textbook.
  4. Make the questions as brief as possible.
  5. Write clearly true or clearly false statements.  Write them in pairs, one “true” version and one “false” version, and choose one from each pair to keep the test balanced.
  6. Eliminate giveaways:
    • Keep true and false statements approximately equal in length
    • Make half the statements true and half false.
    • Try to avoid such words as “all,” “always,” “never,” “only,” “nothing,” and “alone.” Students know these words usually signify false statements.
  7. Beware of words denoting indefinite degree.  The use of words like “more,” “less,” “important,” “unimportant,” “large,” “small,” “recent,” “old,” “tall,” “great,” and so on, can easily lead to ambiguity.
  8. State items positively.  Negative statements may be difficult to interpret.  This is especially true of statements using the double negative.  If a negative word, such as “not” or “never,” is used, be sure to underline or capitalize it.
  9. Beware of detectable answer patterns.  Students can pick out patterns, such as TTTTFFFF, that might have been designed to make scoring easier.

All of this makes sense to me.  At first I objected to “Make half the statements true and half false,” but on reflection I wouldn’t insist on exactly half, just something close to it.  In fact this source, http://teaching.uncc.edu/learning-resources/articles-books/best-practice/assessment-grading/designing-test-questions, suggests making the ratio more like 60% false to 40% true, since students who guess are more likely to guess “true.”  (That bias is costly: a student who guessed “true” on every item of such a test would score only 40%.)  A sketch of one way to build that kind of answer key follows.
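Out of curiosity, here is a small sketch of how one might generate such a key.  This is my own illustration, not from either source; the 20-item length and the 60:40 split are just example numbers.

```python
import random

def make_answer_key(n_items=20, fraction_false=0.6, seed=None):
    """Build a shuffled true-false answer key with a fixed ratio.

    Fixing the ratio up front (rather than flipping a coin per item)
    guarantees the intended balance, and shuffling removes any
    TTTTFFFF-style pattern a student could detect.
    """
    rng = random.Random(seed)
    n_false = round(n_items * fraction_false)
    key = [False] * n_false + [True] * (n_items - n_false)
    rng.shuffle(key)
    return key

key = make_answer_key()
print("".join("T" if answer else "F" for answer in key))
# One possible output: FTFTFFTFFTTFFFTFFTTF  (8 T, 12 F, no obvious pattern)
```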

I found other points to add to the guidelines list.  (Source:  http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf)

Two ideas can be included in a true-false statement if the purpose is to show cause and effect.

  • If a proposition expresses a relationship, such as cause and effect or premise and conclusion, present the correct part of the statement first and vary the truth or falsity of the second part.
  • When a true-false statement is an opinion, it should be attributed to someone in the statement.
  • Underlining or circling answers is preferable to having the student write them.
  • Make use of popular misconceptions/beliefs as false statements.
  • Write items so that the incorrect response is more plausible or attractive to those without the specialized knowledge being tested.
  • Avoid the use of unfamiliar vocabulary.
  • Determine that the questions are appropriately answered by “True” or “False” rather than by some other type of response, such as “Yes” or “No.”
  • Avoid the tendency to add details in true statements to make them more precise.  The answers should not be obvious to students who do not know the material.
  • Be sure to include directions that tell students how and where to mark their responses.

This same source gives you a nice tip for writing true-false items:

Write a set of true statements that cover the content, then convert approximately half of them to false statements.  State the false items positively, avoiding negatives or double negatives.

Most of this discussion has been about True-False questions but the category is really Alternative-Response.  Let’s look at the variations available to us.

(Source:  http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf)

  • The True-False-Correction Question
    In this variation, true-false statements are presented with a key word or brief phrase that is underlined.  It is not enough that a student correctly identify a statement as being false.  … the student must also supply the correct word or phrase which, when used to replace the underlined part of the statement, makes the statement a true one.  This type of item is more thorough in determining whether students actually know the information that is presented in the false statements.

    The teacher decides what word/phrase can be changed in the sentence; if students were instructed only to make the statement a true statement, they would have the liberty of completely rewriting the statement so that the teacher might not be able to determine whether or not the student understood what was wrong with the original statement.

    If, however, the underlined word/phrase is one that can be changed to its opposite, the item loses its advantage over the simpler true-false question, because all the student has to know is that the statement is false and change “is” to “is not.”

  • The Yes-No Variation
    The student responds to each item by writing, circling or indicating yes-no rather than true-false.  An example follows:

What reasons are given by students for taking evening classes?  In the list below, circle Yes if that is one of the reasons given by students for enrolling in evening classes; circle No if that is not a reason given by students.

Yes   No   They are employed during the day.
Yes   No   They are working toward a degree.
Yes   No   They like going to school.
Yes   No   There are no good television shows to watch.
Yes   No   Parking is more plentiful at night.

  • The A-B Variation
    The example below shows a question for which the same two answers apply.  The answers are categories of content rather than true-false or yes-no.

Indicate whether each type of question below is a selection type or a supply type by circling A if it is selection or B if it is supply.

Select     Supply
A      B            Multiple Choice
A      B            True-False
A      B            Essay
A      B            Matching
A      B            Short Answer

In summary, the sources tend to agree that the best alternative-response items are unambiguous (“true or false with respect to what?”), concisely written, limited to one idea per question, and aimed at more than rote memorization.  We should avoid trick questions and questions that test trivia.  And the best tests with A-R items have many questions, with a true-to-false ratio of about 40:60.

Next test item type:  Matching!

Objective or Subjective? Those are the Questions


Now that we have studied general test writing strategies, ideas, and tips, it is time to pull our focus inward to the details of the questions themselves.

In general, question types fall into two categories:

  1. Objective
  2. Subjective

I needed specific definitions for these, which I found here.

Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

1. Objective items, which require students to select the correct response from several alternatives or to supply a word or short phrase to answer a question or complete a statement.

Examples: multiple choice, true-false, matching, completion

2. Subjective or essay items, which permit the student to organize and present an original answer.

Examples: short-answer essay, extended-response essay, problem solving, performance test items

This source also suggests guidelines for choosing between them:

Essay tests are appropriate when:

  • The group to be tested is small and the test is not to be reused.
  • You wish to encourage and reward the development of student skill in writing.
  • You are more interested in exploring student attitudes than in measuring his/her achievement.

Objective tests are appropriate when:

  • The group to be tested is large and the test may be reused.
  • Highly reliable scores must be obtained as efficiently as possible.
  • Impartiality of evaluation, fairness, and freedom from possible test-scoring influences are essential.

Either essay or objective tests can be used to:

  • Measure almost any important educational achievement a written test can measure.
  • Test understanding and ability to apply principles.
  • Test ability to think critically.
  • Test ability to solve problems.

And it continues with this bit of advice:

 The matching of learning objective expectations with certain item types provides a high degree of test validity:  testing what is supposed to be tested.

  • Demonstrate or show: performance test items
  • Explain or describe: essay test items

I wanted to see what different sources would say, so I also found this one.

Source: http://www.helpteaching.com/about/how_to_write_good_test_questions/

If you want the student to compare and contrast an issue taught during a history lesson, open ended questions may be the best option to evaluate the student’s understanding of the subject matter.

If you are seeking to measure the student’s reasoning skills, analysis skills, or general comprehension of a subject matter, consider selecting primarily multiple choice questions.

Or, for a varied approach, utilize a combination of all available test question types so that you can appeal to the learning strengths of any student on the exam.

Take into consideration both the objectives of the test and the overall time available for taking and scoring your tests when selecting the best format.

I am not sure that multiple choice should be the primary choice, but I understand them to be suggesting that we avoid open-ended questions when we want to measure reasoning skills, analytic skills, or general comprehension.

This bothers me a little.  It seems to me, from reviewing the previous posts in this blog, that an open-ended question could measure those skills.  The example that comes to mind is the question I had in botany about describing the cell types a pin might encounter when passing through a plant stem.  That was an essay question measuring general comprehension of plant tissues.

The following source brings up good points about analyzing the results.  It also notes that objective tests, when “constructed imaginatively,” can test at higher levels of Bloom’s Taxonomy.

Source: http://www.calm.hw.ac.uk/GeneralAuthoring/031112-goodpracticeguide-hw.pdf

Objective tests are especially well suited to certain types of tasks. Because questions can be designed to be answered quickly, they allow lecturers to test students on a wide range of material. … Additionally, statistical analysis on the performance of individual students, cohorts and questions is possible.

The capacity of objective tests to assess a wide range of learning is often underestimated. Objective tests are very good at examining recall of facts, knowledge and application of terms, and questions that require short text or numerical responses. But a common worry is that objective tests cannot assess learning beyond basic comprehension.

However, questions that are constructed imaginatively can challenge students and test higher learning levels. For example, students can be presented with case studies or a collection of data (such as a set of medical symptoms) and be asked to provide an analysis by answering a series of questions…

Problem solving can also be assessed with the right type of questions. …

A further worry is that objective tests result in inflated scores due to guessing. However, the effects of guessing can be eliminated through a combination of question design and scoring techniques. With the right number of questions and distracters, distortion through guessing becomes largely irrelevant. Alternatively, guessing can be encouraged and measured if this is thought to be a desirable skill.

There are, however, limits to what objective tests can assess. They cannot, for example, test the competence to communicate, the skill of constructing arguments or the ability to offer original responses. Tests must be carefully constructed in order to avoid the decontextualisation of knowledge (Paxton 1998) and it is wise to use objective testing as only one of a variety of assessment methods within a module. However, in times of growing student numbers and decreasing resources, objective testing can offer a viable addition to the range of assessment types available to a teacher or lecturer.
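Before reacting to that, let me put some numbers on the guessing claim.  Here is a quick sketch of the arithmetic (my own illustration, not from the guide), computing the chance that a student guessing blindly on two-option (true-false) items reaches a 70% passing score:

```python
from math import comb

def p_pass_by_guessing(n_items, pass_mark, p_correct=0.5):
    """Probability that a pure guesser gets at least pass_mark items right.

    Binomial tail: sum P(exactly k correct) for k = pass_mark .. n_items.
    """
    return sum(
        comb(n_items, k) * p_correct**k * (1 - p_correct) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )

for n in (10, 20, 50):
    mark = (7 * n + 9) // 10  # integer ceiling of 70% of n
    print(f"{n} items, pass at {mark}: {p_pass_by_guessing(n, mark):.4f}")

# 10 items, pass at 7: 0.1719
# 20 items, pass at 14: 0.0577
# 50 items, pass at 35: 0.0033
```

A ten-item quiz lets a guesser pass about 17% of the time; at fifty items that drops to about 0.3%, which is the sense in which distortion through guessing becomes largely irrelevant.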

I like their point about how objective tests cannot test competence to communicate, construct arguments, or offer original answers.  Training our students to take only multiple choice tests (or simply answer “true” or “false”) does not help them to learn how to explain their thoughts or even ensure that they can write coherent sentences.

This is addressed by the second source and in previous posts: the suggestion is to use a variety of test item types.  A mix can give you a better picture of what your students know, whereas a single type can be biased against students who do not respond well to that format.

Strategy Summary


We are at the point of our investigation where we need to start looking in more detail at test construction.  Here is a brief summary that puts together the pieces of what we have learned so far.

Our challenges

Write an accurate measure of student achievement that

  • motivates students and reinforces learning,
  • enables us to assess student mastery of course objectives,
  • and allows us to recognize what material was or was not communicated clearly.

Some things we can do to accomplish this

In general, when designing a test we need to

  • consider the length of the test,
  • write clear, concise instructions,
  • use a variety of test item formats,
  • test early and/or frequently,
  • proofread and check for accuracy,
  • consider the needs of our disabled or non-native speaker students,
  • and use humor.

More specifically, our test goals are to

  • assess achievement of instructional objectives,
  • measure important aspects of the subject,
  • accurately reflect the emphasis placed on important aspects of instruction,
  • measure an appropriate level of student knowledge,
  • and have the questions vary in levels of difficulty.

We should consider the technical quality of our tests

Quality means “conformance to requirements” and “fitness for use.”  The criteria are

  • offering cognitive complexity,
  • reviewing content quality,
  • writing meaningful questions,
  • using appropriate language,
  • being able to generalize about student learning from their test performance,
  • and writing a fair test with answers that represent what students know.

A useful tool is Bloom’s Taxonomy

It lists learning levels in increasing order of complexity:

  1. Remembering
  2. Understanding
  3. Applying
  4. Analyzing
  5. Evaluating
  6. Creating

To apply Bloom’s directly, we looked at

  • Lists of verbs associated with each level (some were discipline-specific),
  • question frames — nearly complete questions we can modify for our topics,
  • and knowledge domains — the kinds of knowledge that can be tested:
    • factual,
    • conceptual,
    • procedural,
    • and metacognitive.

Next up:  Learning what question types to use to achieve our goals.

Using Bloom’s in Test Writing


When I first started considering Bloom’s Taxonomy, I thought it was good to help expand my ideas on how to test but I struggled with applying it directly.  I appreciated the increasing cognitive levels but needed help in writing test questions that utilized them.

What I found were lists of verbs associated with each level.  A good one to start with is:

Source: http://www.lshtm.ac.uk/edu/taughtcourses/writinggoodexamquestions.pdf

A table of suggested verbs, mapped against the Anderson and Krathwohl adapted levels of Bloom’s Taxonomy of Cognition (cognitive level, followed by verb examples):

  1. Remember: define, repeat, record, list, recall, name, relate, underline.
  2. Understand: translate, restate, discuss, describe, recognise, explain, express, identify, locate, report, review, tell.
  3. Apply: interpret, apply, employ, use, demonstrate, dramatise, practice, illustrate, operate, schedule, sketch.
  4. Analyse: distinguish, analyse, differentiate, appraise, calculate, experiment, test, compare, contrast, criticise, diagram, inspect, debate, question, relate, solve, examine, categorise.
  5. Evaluate: judge, appraise, evaluate, rate, compare, revise, assess, estimate.
  6. Create: compose, plan, propose, design, formulate, arrange, assemble, collect, construct, create, set-up, organise, manage, prepare.

Here is an extensive list that is printable on one page, useful for reference while you are designing your test:

Bloom’s Verbs, one page.

Other useful lists:

Bloom’s Verbs for Math

Bloom’s Question Frames (looks very good for English, literature, history, etc.)  This gives you nearly complete questions which you can manipulate into test items appropriate to your discipline.

More Bloom’s Question Frames (2 pages).

Bloom’s Verbs for Science

What comes across to me again and again throughout the sources is that considering the hierarchy when designing exams creates a culture of learning that involves thinking deeply about the course material, taking it beyond simple rote memorization and recitation.

This culture would benefit from also considering Bloom’s while you are teaching.  Modeling higher level thought processes, showing joy at cognitive challenges, exploring topics in depth (if time permits) or mentioning the depth exists (if time is short) can send a strong signal that thinking is valued and important to learning.

Another view on Bloom’s as applied to test writing is to consider the knowledge domains inherent in your course material.  They are:

Source: http://www.lshtm.ac.uk/edu/taughtcourses/writinggoodexamquestions.pdf

The kinds of knowledge that can be tested

Factual Knowledge

Terminology, Facts, Figures

Conceptual Knowledge

Classification, Principles, Theories, Structures, Frameworks

Procedural Knowledge

Algorithms, Techniques and Methods, and knowing when and how to use them.

Metacognitive Knowledge

Strategy, Overview, Self Knowledge, Knowing how you know.

When I put this list with the verbs lists, I get more ideas for test questions and directions for exploring student acquisition of the course knowledge.

Defining Bloom’s Taxonomy

[Image: pyramid of the revised Bloom’s Taxonomy]

One recurring recommendation in the resources is that we should consider Bloom’s Taxonomy when designing tests. To do so, we should know what it is.

The triangle above is a version of the revised Bloom’s, using active verbs, adding one level, and slightly reordering the top.

According to http://www.learnnc.org/lp/pages/4719,

Bloom’s Taxonomy was created in 1948 by psychologist Benjamin Bloom and several colleagues. Originally developed as a method of classifying educational goals for student performance evaluation, Bloom’s Taxonomy has been revised over the years and is still utilized in education today.

The original intent in creating the taxonomy was to focus on three major domains of learning: cognitive, affective, and psychomotor. The cognitive domain covered “the recall or recognition of knowledge and the development of intellectual abilities and skills”; the affective domain covered “changes in interest, attitudes, and values, and the development of appreciations and adequate adjustment”; and the psychomotor domain encompassed “the manipulative or motor-skill area.” Despite the creators’ intent to address all three domains, Bloom’s Taxonomy applies only to acquiring knowledge in the cognitive domain, which involves intellectual skill development.

The site goes on to say:

Bloom’s Taxonomy can be used across grade levels and content areas. By using Bloom’s Taxonomy in the classroom, teachers can assess students on multiple learning outcomes that are aligned to local, state, and national standards and objectives. Within each level of the taxonomy, there are various tasks that move students through the thought process. This interactive activity demonstrates how all levels of Bloom’s Taxonomy can be achieved with one image.

Further, http://www.edpsycinteractive.org/topics/cognition/bloom.html tells us,

The major idea of the taxonomy is that what educators want students to know (encompassed in statements of educational objectives) can be arranged in a hierarchy from less to more complex.  The levels are understood to be successive, so that one level must be mastered before the next level can be reached.

And also,

In any case it is clear that students can “know” about a topic or subject in different ways and at different levels.  While most teacher-made tests still test at the lower levels of the taxonomy, research has shown that students remember more when they have learned to handle the topic at the higher levels of the taxonomy (Garavalia, Hummel, Wiley, & Huitt, 1999).

Let’s see what each level represents.  The following list is based on the original Bloom’s categories but it is still enlightening.

Source: http://www.calm.hw.ac.uk/GeneralAuthoring/031112-goodpracticeguide-hw.pdf

Table 2.2  Bloom’s taxonomy and question categories (competence, with the skills demonstrated)

  • Knowledge: recall of information; knowledge of facts, dates, events, places.
  • Comprehension: interpretation of information in one’s own words; grasping meaning.
  • Application: application of methods, theories, and concepts to new situations.
  • Analysis: identification of patterns; recognition of components and their relationships.
  • Synthesis: generalizing from given knowledge; using old ideas to create new ones; organizing and relating knowledge from several areas; drawing conclusions and predicting.
  • Evaluation: making judgments; assessing the value of ideas and theories; comparing and discriminating between ideas; evaluating data.

Based on the work by Benjamin S. Bloom et al., Evaluation to Improve Learning (New York: McGraw-Hill, 1981).

We will look at these in more detail in the next post.

The Technical Quality of a Test — Part 2


In the previous post (below) we started looking at features that define the technical quality of a test.

     Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

Criteria for establishing the technical quality of a test

  1. Cognitive complexity

The test questions will focus on appropriate intellectual activity ranging from simple recall of facts to problem solving, critical thinking, and reasoning.

Bloom’s Taxonomy*

  2. Content quality

The test questions will permit students to demonstrate their knowledge of challenging and important subject matter.  The emphasis of the test should be a reflection of the emphasis of the lecture.

  3. Meaningfulness

The test questions will be worth students’ time and students will recognize and understand their value.

  4. Language appropriateness

The language demands will be clear and appropriate to the assessment tasks and to students.  It should reflect the language used in the classroom.  Test items should be stated in simple, clear language, free of nonfunctional material and extraneous clues, and free of race, ethnic, and sex bias.

  5. Transfer and generalizability

Successful performance on the test will allow valid generalizations about achievement to be made.

  6. Fairness

Student performance will be measured in a way that does not give advantage to factors irrelevant to school learning:  scoring schemes will be similarly equitable.

Basic rules of fairness:

  • Test questions should reflect the objectives of the unit
  • Expectations should be clearly known by the students
  • Each test item should present a clearly formulated task
  • One item should not aid in answering another
  • Ample time for test completion should be allowed
  • Assignment of points should be determined before the test is administered.
  7. Reliability

Answers to test questions will be consistently trusted to represent what students know.

 

*More on Bloom’s Taxonomy in a future post.

We have already discussed points #1 and #2; let us continue with the rest.

  3. Meaningfulness

I think if we are writing exam questions that explore the knowledge we want the students to learn, the questions will be meaningful, even when they only test simple recall.  Each question should trigger a memory in any student who has prepared and studied.

I am not always certain our students will recognize and understand the value of the questions we offer, but I am not sure that really matters.  What we do want to avoid is outrage at a question that comes across as grossly unfair or outside the scope of the class, and I think meaningful questions avoid that.

  4. Language appropriateness

When I see the phrase “the language used in the classroom,” I think about how I describe concepts and the level of the vocabulary I use in discussions.  I try to avoid “dumbing down” the words I use but I also try to avoid choosing words that are esoteric or outdated.  In lecture, it is often easy to see student reaction to words they don’t understand, and that tells me I need to define those words, even if they aren’t words in my discipline.  This gives me an opportunity to raise the student vocabulary closer to college level.  Once I have used and defined them, I feel free to use those words in exams.

One hazard of making the questions “free of nonfunctional material and extraneous clues” in mathematics is that students become trained to believe they must use every number and every bit of information in the problem or they won’t be working it correctly.  Unfortunately, real-world problems that use math often contain nonfunctional material and extraneous clues, and our students need to learn how to weed them out.  I introduce this skill at the calculus level.

  5. Transfer and generalizability

The goal I set for my students is for them to learn the course material in such a way that they can perform the skills, recall the ideas, and recognize the vocabulary and notation, and that they are prepared to take the next course in the sequence successfully.  This is my definition of transfer and generalizability.

How would you define it for your discipline?

  6. Fairness

This seems straightforward and reasonable to me.  I don’t always have the time to determine the assignment of points before the test is administered but I always do before I start grading.  If something causes me to rethink the point distribution, I regrade all the problems affected by it.

  7. Reliability

The description given for this point did not help me understand reliability but this source’s definition did [note that “marker” means “the person grading the exam”]:

     Source:  http://www.lshtm.ac.uk/edu/taughtcourses/writinggoodexamquestions.pdf

Does the question allow markers to grade it consistently and reproducibly and does it allow markers to discriminate between different levels of performance? This frequently depends on the quality of the marking guidance and clarity of the assessment criteria. It may also be improved through providing markers with training and opportunities to learn from more experienced assessors.

What resonates with me is the ability to discriminate between different levels of performance.  That can be challenging when grading math problems because I feel partial credit is important.  Students can work problems in so many incorrect or partially correct ways that I have to work hard to determine how much they really knew and how much was due to simple error.

From the criteria list I see the opportunity to consider the overall structure of my exams and assess my general test writing skills.  I like the guidelines and how they direct me to think beyond my personal experiences while considering how the students will perceive the test.

The Technical Quality of a Test — Part 1


We want our tests to be good measures of student achievement so we need to pay attention to what one source calls the “technical quality of a test.”

To help me understand what quality really means, I found these definitions useful:

  1. The characteristics of a product or service that bear on its ability to satisfy stated or implied needs; “conformance to requirements”
  2. A product or service free of deficiencies; “fitness for use”

(from http://asq.org/glossary/q.html)

So what criteria should we use to improve quality?

     Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

Criteria for establishing the technical quality of a test

  1. Cognitive complexity

The test questions will focus on appropriate intellectual activity ranging from simple recall of facts to problem solving, critical thinking, and reasoning.

Bloom’s Taxonomy*

  2. Content quality

The test questions will permit students to demonstrate their knowledge of challenging and important subject matter.  The emphasis of the test should be a reflection of the emphasis of the lecture.

  3. Meaningfulness

The test questions will be worth students’ time and students will recognize and understand their value.

  4. Language appropriateness

The language demands will be clear and appropriate to the assessment tasks and to students.  It should reflect the language used in the classroom.  Test items should be stated in simple, clear language, free of nonfunctional material and extraneous clues, and free of race, ethnic, and sex bias.

  5. Transfer and generalizability

Successful performance on the test will allow valid generalizations about achievement to be made.

  6. Fairness

Student performance will be measured in a way that does not give advantage to factors irrelevant to school learning:  scoring schemes will be similarly equitable.

Basic rules of fairness:

  • Test questions should reflect the objectives of the unit
  • Expectations should be clearly known by the students
  • Each test item should present a clearly formulated task
  • One item should not aid in answering another
  • Ample time for test completion should be allowed
  • Assignment of points should be determined before the test is administered.
  7. Reliability

Answers to test questions will be consistently trusted to represent what students know.

 

*More on Bloom’s Taxonomy in a future post.

Let’s examine this list in detail.

  1. Cognitive complexity

Oh, I like this one.  We should be challenging our students with intellectual activity, and more importantly with a range of it.  This brings me back to the previous post (below), where we discussed classifying questions by how easily they can be answered, from those that most students can get to the few “A-B breakers.”  There should be some questions that make the student think, “Hmmm, how can I use what I have learned to answer this?” and some that bring on the reaction of “Oh yes, I have seen all this before and I can remember it.”

I recall a question on a botany exam that asked me to imagine holding a plant stem in my hand and piercing it with a straight pin.  I needed to describe the various tissue types the pin might touch as it passed through to the middle of the stem.  I had learned the list of tissues already; this question forced me to consider their locations in the plant and organize them from the outside to the inside.  I hadn’t already considered that idea so I was cognitively challenged but I had all the tools I needed to answer the question.

One message that comes across in a number of the sources is that we can be tempted to test on the easier parts of the material rather than the important parts.  Considering cognitive complexity helps us focus on drawing from our students what they have learned beyond simple recall.

  2. Content quality

Again, we need to ensure we are testing more than simple recall but we also have to make sure we are not writing questions that test outside of the course material.

Here is one concern I have about emphasizing on the test what is emphasized in the lecture: if this is taken too literally, our students are at risk of paying attention only to the information we explicitly label as important and ignoring any nuances or “items of lesser importance.”  They are often keenly tuned into the way we write on the board, in a PowerPoint slide, or on digital lecture notes and are quick to infer that words in bold, italics, all capitals, or that are underlined are the only things they should study for a test.  I found I was giving them that impression in my lectures so I changed the way I wrote on the board, forcing my students to consider all the words I presented.

We will continue this discussion in the next post!

Test Goals — Thinking About the Strategies


When considering “What am I testing?”, I agree with this web page’s statement:

Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

In general, test items should

  • Assess achievement of instructional objectives
  • Measure important aspects of the subject (concepts and conceptual relations)
  • Accurately reflect the emphasis placed on important aspects of instruction
  • Measure an appropriate level of student knowledge
  • Vary in levels of difficulty

But my reaction is to laugh because it is easy to make this list.  What do you have to do to implement it?

In beginning to construct a test, I should at least acknowledge the instructional objectives.  I might even make a written list, depending on the time I have available.  I ask myself:

  • What do I want the students to get out of each section and chapter?
  • What are the big-picture goals, the skills, the vocabulary, the concepts?
  • What are the common mistakes previous students have made?
  • Is there any information I want to foreshadow?
  • What have I pointed out in lecture about what to do or what not to do?

The words “measure important aspects of the subject” make me wary.  I think it is easy to interpret them as “focus only on the most important aspects,” which would mean I should not test on anything else.

What I do think it means is “avoid testing on minor facts”, which opens the field up to a great many topics as well as encourages us to think deeply about what is and is not important.  For example, in a history class you might learn that Lincoln was assassinated on April 14, 1865.  Is it important that you know it was April 14?  Maybe not.  But April 14 of this year is the 150th anniversary of that event and that might make it important.  In any other year it might be enough to know it happened in 1865.

I find it challenging to determine “an appropriate level of student knowledge.”  This really deals with how long the test should be, compared with how much time I have to give it and how well I feel the students are learning the material.  I measure how long it takes me to write up the solutions and divide that into the test time, and I am happy if the ratio reaches certain values, depending on the class; a sketch of the arithmetic follows.  This generally works well, although sometimes I am surprised by the students’ reaction.  How do you determine the length?
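For anyone who wants to see the calculation spelled out, here it is with invented numbers (the 50 and 15 are hypothetical, and as I said, the values I am happy with depend on the class):

```python
def length_ratio(test_minutes, solution_minutes):
    """How many times the exam period exceeds my own solution-writing time."""
    return test_minutes / solution_minutes

# Invented example: a 50-minute period, and full solutions that took
# me 15 minutes to write out.
ratio = length_ratio(test_minutes=50, solution_minutes=15)
print(f"ratio = {ratio:.1f}")  # ratio = 3.3
```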

I address the variation in difficulty by thinking about which questions are easily answered, which take a “usual” amount of work and thinking, and then I throw in one or two “A-B breakers.”  These are the questions that will separate the students who have really learned the material from those who are somewhat prepared or not prepared at all.  They might require a little more conceptual thinking or a slight stretch on the skills or knowledge students should have already attained.

The strategies I use are helpful for writing a test in my discipline.  Other strategies might be appropriate for other disciplines.  Feel free to write about them in the comments section.