Multiple Choice Structure


One type of objective question is multiple choice.  We all know what it is, but let's look at its description in detail anyway.

Source: https://www.msu.edu/dept/soweb/writitem.html

Description of a multiple choice item:

Presents a problem or question in the stem of the item and requires the selection of the best answer or option. The options consist of a most-correct answer and one or more distractors or foils.

The major purpose of a multiple choice item is to identify examinees who do not have complete command of the concept or principle involved.

Properties:
• State the problem in the stem
• Include one correct or most defensible answer
• Select diagnostic foils or distractors such as:

o Clichés
o Common misinformation
o Logical interpretations
o Partial answers
o Technical terms or textbook jargon

The distractors must appear reasonable as the correct answer to the students who have not mastered the material.

So the structure of a multiple choice question is a stem followed by options.  The options contain one correct answer and a set of distractors.
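
To make that structure concrete, here is a minimal sketch in Python. The build_item helper, the example stem, and the foils are all illustrative inventions for this post, not taken from any of the sources above:

```python
import random

def build_item(stem, correct, distractors, rng=random):
    """Assemble a multiple choice item: a stem plus shuffled options."""
    options = [correct] + list(distractors)
    rng.shuffle(options)
    return {
        "stem": stem,
        "options": options,
        "key": options.index(correct),  # position of the correct answer
    }

item = build_item(
    stem="What is the slope of the line y = 3x + 2?",
    correct="3",
    distractors=["2", "5", "-3"],  # foils: the intercept, 3 + 2, and a sign error
)
print(item["stem"])
for i, option in enumerate(item["options"]):
    print(f"  {chr(65 + i)}. {option}")
```

Notice that the distractors are not random numbers: each one is an answer a student with partial mastery might actually produce.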

The Stem

Some advice for constructing a good stem is

Source: http://www.iub.edu/~best/pdf_docs/better_tests.pdf

  • Write questions that test a significant concept, that are unambiguous, and that don’t give test-wise students an advantage
  • The stem should fully state the problem and all qualifications. Always include a verb in the statement
  • Items should measure students’ ability to comprehend, apply, analyze, and evaluate as well as recall
  • Include words in the stem that would otherwise be repeated in each option
  • Eliminate excessive wording and irrelevant information in the stem

Here is some advice on good and bad stem design (see the source for worked examples):

Source: http://cft.vanderbilt.edu/guides-sub-pages/writing-good-multiple-choice-test-questions/#stem

A stem that does not present a clear problem may test students' ability to draw inferences from vague descriptions rather than serving as a direct test of students' achievement of the learning outcome.

If a significant learning outcome requires negative phrasing, such as identification of dangerous laboratory or clinical practices, the negative element should be emphasized with italics or capitalization.

A question stem is preferable to a partial sentence because it allows the student to focus on answering the question rather than holding the partial sentence in working memory and sequentially completing it with each alternative.

The best thought about the stem I have seen on the Internet:

Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

Way to judge a good stem: students who know the content should be able to answer before reading the alternatives.

The Options

Sometimes known as “the alternatives”, they are composed of one right answer and a group of “foils” or distractors.

One point that is emphasized regularly in the resources is that the distractors should all be plausible and attractive answers.

Source:  http://cft.vanderbilt.edu/guides-sub-pages/writing-good-multiple-choice-test-questions/

Common student errors provide the best source of distractors.

Alternatives should be stated clearly and concisely. Items that are excessively wordy assess students' reading ability rather than their attainment of the learning objective.

Alternatives should be mutually exclusive. Alternatives with overlapping content may be considered “trick” items by test-takers, excessive use of which can erode trust and respect for the testing process.

[Ed. Note:  I have some issues with this particular example but I get the point of their suggestion.]

Alternatives should be homogeneous in content. Alternatives that are heterogeneous in content can provide cues to students about the correct answer.

The alternatives should be presented in a logical order (e.g., alphabetical or numerical) to avoid a bias toward certain positions.
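
A small sketch of that ordering advice in Python, for the numeric case; the order_numeric_options helper is hypothetical, invented for illustration. Sorting lets the values determine the key's position rather than the test writer's habits:

```python
def order_numeric_options(correct, distractors):
    """Sort numeric options in ascending order and report where the key lands."""
    options = sorted([correct] + list(distractors))
    return options, options.index(correct)

options, key = order_numeric_options(12, [8, 15, 21])
print(options)                                       # [8, 12, 15, 21]
print(f"Correct answer is option {chr(65 + key)}")   # option B
```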

Avoid complex multiple choice items, in which some or all of the alternatives consist of different combinations of options. As with “all of the above” answers, a sophisticated test-taker can use partial knowledge to achieve a correct answer.

Other suggestions from this source:

Alternatives should be free from clues about which response is correct. Sophisticated test-takers are alert to inadvertent clues to the correct answer, such as differences in grammar, length, formatting, and language choice in the alternatives. It's therefore important that alternatives

  • have grammar consistent with the stem.
  • are parallel in form.
  • are similar in length.
  • use similar language (e.g., all unlike textbook language or all like textbook language).

The alternatives “all of the above” and “none of the above” should not be used. When “all of the above” is used as an answer, test-takers who can identify more than one alternative as correct can select the correct answer even if unsure about other alternative(s). When “none of the above” is used as an alternative, test-takers who can eliminate a single option can thereby eliminate a second option. In either case, students can use partial knowledge to arrive at a correct answer.

The number of alternatives can vary among items as long as all alternatives are plausible. Plausible alternatives serve as functional distractors: options chosen by students who have not achieved the objective but ignored by students who have achieved the objective. There is little difference in difficulty, discrimination, and test score reliability among items containing two, three, and four distractors.
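
One way to check empirically whether a distractor is functional, sketched below in Python: compare how often high- and low-scoring students choose each option. The quartile split, the helper name, and the data are illustrative assumptions on my part, not a method prescribed by the source:

```python
def distractor_report(responses, scores, key, n_groups=4):
    """Compare option choices between top and bottom score groups."""
    ranked = sorted(zip(scores, responses))     # order students by total score
    cut = len(ranked) // n_groups
    low = [r for _, r in ranked[:cut]]          # bottom quartile's choices
    high = [r for _, r in ranked[-cut:]]        # top quartile's choices
    for opt in sorted(set(responses)):
        p_low = low.count(opt) / len(low)
        p_high = high.count(opt) / len(high)
        label = "key" if opt == key else "distractor"
        print(f"{opt} ({label}): low group {p_low:.2f}, high group {p_high:.2f}")

# Made-up data for one item: a functional distractor should attract the low
# group and be largely ignored by the high group.
responses = ["A", "B", "B", "C", "A", "B", "A", "B", "D", "B", "B", "A"]
scores    = [ 40,  85,  90,  35,  45,  80,  50,  88,  30,  75,  82,  55]
distractor_report(responses, scores, key="B")
```

A distractor that neither group chooses is dead weight; one the high group favors over the key suggests a flawed item.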

Keep the specific content of items independent of one another. Savvy test-takers can use information in one question to answer another question, reducing the validity of the test.


There is more to think about for multiple choice questions, which we will examine in the next post.

Defining Bloom’s Taxonomy

[Figure: triangle diagram of the revised Bloom's Taxonomy]

One recurring recommendation in the resources is that we should consider Bloom’s Taxonomy when designing tests. To do so, we should know what it is.

The triangle above is a version of the revised Bloom's Taxonomy; it uses active verbs, adds one level, and slightly reorders the levels at the top.

According to http://www.learnnc.org/lp/pages/4719,

Bloom’s Taxonomy was created in 1948 by psychologist Benjamin Bloom and several colleagues. Originally developed as a method of classifying educational goals for student performance evaluation, Bloom’s Taxonomy has been revised over the years and is still utilized in education today.

The original intent in creating the taxonomy was to focus on three major domains of learning: cognitive, affective, and psychomotor. The cognitive domain covered “the recall or recognition of knowledge and the development of intellectual abilities and skills”; the affective domain covered “changes in interest, attitudes, and values, and the development of appreciations and adequate adjustment”; and the psychomotor domain encompassed “the manipulative or motor-skill area.” Despite the creators’ intent to address all three domains, Bloom’s Taxonomy applies only to acquiring knowledge in the cognitive domain, which involves intellectual skill development.

The site goes on to say:

Bloom’s Taxonomy can be used across grade levels and content areas. By using Bloom’s Taxonomy in the classroom, teachers can assess students on multiple learning outcomes that are aligned to local, state, and national standards and objectives. Within each level of the taxonomy, there are various tasks that move students through the thought process. This interactive activity demonstrates how all levels of Bloom’s Taxonomy can be achieved with one image.

Further, http://www.edpsycinteractive.org/topics/cognition/bloom.html tells us,

The major idea of the taxonomy is that what educators want students to know (encompassed in statements of educational objectives) can be arranged in a hierarchy from less to more complex.  The levels are understood to be successive, so that one level must be mastered before the next level can be reached.

And also,

In any case it is clear that students can “know” about a topic or subject in different ways and at different levels.  While most teacher-made tests still test at the lower levels of the taxonomy, research has shown that students remember more when they have learned to handle the topic at the higher levels of the taxonomy (Garavalia, Hummel, Wiley, & Huitt, 1999).

Let’s see what each level represents.  The following list is based on the original Bloom’s categories but it is still enlightening.

Source: http://www.calm.hw.ac.uk/GeneralAuthoring/031112-goodpracticeguide-hw.pdf

Table 2.2 Bloom's taxonomy and question categories (competence: skills demonstrated)

  • Knowledge: recall of information; knowledge of facts, dates, events, places
  • Comprehension: interpretation of information in one's own words; grasping meaning
  • Application: application of methods, theories, concepts to new situations
  • Analysis: identification of patterns; recognition of components and their relationships
  • Synthesis: generalization from given knowledge; using old ideas to create new ones; organizing and relating knowledge from several areas; drawing conclusions and predicting
  • Evaluation: making judgments; assessing the value of ideas and theories; comparing and discriminating between ideas; evaluating data

Based on the work by Benjamin S. Bloom et al., Evaluation to Improve Learning (New York: McGraw-Hill, 1981)
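
One practical way to apply the taxonomy when assembling a test, sketched in Python below: tag each item with its intended level and tally the blueprint, so lower-level recall items don't dominate by accident. The tags, example items, and blueprint function are illustrative, not from any standard tool:

```python
from collections import Counter

BLOOM_LEVELS = ["Knowledge", "Comprehension", "Application",
                "Analysis", "Synthesis", "Evaluation"]

def blueprint(tagged_items):
    """Count items per Bloom level, reported in taxonomy order."""
    counts = Counter(level for _, level in tagged_items)
    return [(level, counts.get(level, 0)) for level in BLOOM_LEVELS]

items = [
    ("State the definition of a derivative", "Knowledge"),
    ("Explain the derivative in your own words", "Comprehension"),
    ("Differentiate f(x) = x * sin(x)", "Application"),
    ("Identify the error in this worked solution", "Analysis"),
]
for level, n in blueprint(items):
    print(f"{level}: {n}")
```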

We will look at these in more detail in the next post.

General Tips About Test Design


In the previous post (below) we asked, “Do you think about what you are testing and how you are assessing that information?” when it comes to test design.

This source provides some general tips:

Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

General tips about testing

  • Length of test

The more items a test has, the more reliable it tends to be (see the Spearman-Brown sketch after this list).  However, if a test is too long, the students may get tired and not respond accurately.  If a test needs to be lengthy, divide it into sections with different kinds of tasks.

  • Clear, concise instructions

It is useful to provide an example of a worked problem, which helps the student understand exactly what is necessary.

  • Mix it up!

It is often advantageous to mix types of items (multiple choice, true-false, essay) on a written exam.  Weaknesses connected with any one kind of item, or in students' test-taking skills, will be minimized.

  • Test Early

Consider discounting the first test if the results are poor.  Students often need a practice test to understand the format each instructor uses and anticipate the best way to prepare and take particular tests.

  • Test frequently

Frequent testing helps students to avoid getting behind, provides instructors with multiple sources of information to use in computing the final course grade, and gives students regular feedback.

  • Check for accuracy

Instructors should be cautious about using tests written by others.  They should be checked for accuracy and appropriateness in the given course.

  • Proofread exams

Check them carefully for misspellings, misnumbering responses, and page collation.

  • One wrong answer

It is wise to avoid having separate items or tasks depend upon answers or skills required in previous items or tasks.

  • Special considerations

Anticipate special considerations that learning disabled students or non-native speakers may need.

  • A little humor

Using a little humor or placing less difficult items or tasks at the beginning of an exam can help reduce test anxiety and thus promote a more accurate demonstration of students' progress.
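
The claim under "Length of test" has a classical formula behind it: the Spearman-Brown prophecy formula from classical test theory predicts the reliability of a test lengthened n-fold from its current reliability r, as r' = nr / (1 + (n - 1)r). A quick Python sketch with illustrative numbers:

```python
def spearman_brown(r, n):
    """Predicted reliability of a test lengthened n-fold, given reliability r."""
    return (n * r) / (1 + (n - 1) * r)

r = 0.70                      # current reliability (illustrative value)
for n in (1.5, 2, 3):
    print(f"{n}x length: predicted reliability {spearman_brown(r, n):.2f}")
```

Doubling a test with reliability 0.70 predicts about 0.82; tripling it predicts about 0.88, with diminishing returns that have to be weighed against student fatigue.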

My reaction to their advice is mixed.  I'm not sure I could provide good examples of worked problems on the test itself because I teach mathematics; working problems for the students defeats the purpose of the test.  However, I can have the students gain that knowledge before the exam by completing homework problems, and I can emphasize that many of the test problems will utilize those skills and strategies.

I am able to “mix it up” sometimes, depending on the course and the material being covered.  When testing vocabulary in statistics, for example, sometimes I use multiple choice and sometimes I use fill-in-the-blank.

I am not fond of the idea of discounting the first test if it is poor.  I get around that by offering my students short quizzes on a regular basis: I write the problems and grade them so students get a feel for my writing style and notation expectations before the longer, high-stakes exams.  My goal is for the quizzes' cumulative points to be similar to an exam's, with that total weighted less heavily than an exam toward the overall grade.

In math it is difficult to completely avoid having separate items or tasks depend upon previous answers.  The dilemma is this:  Do I write a complicated problem and have the students recall all the steps I want?  Or do I walk them through the steps knowing the answer to one may be dependent on the answer of another?

I think humor is a wonderful addition to tests.  Whenever I can (i.e., there is room), I include a math-related cartoon on the last page.

Also, the phrase I have heard about placing less difficult items at the beginning of a test is "establishing a pattern of success."  Give the student who has prepared a chance to start off with a victory, thus building confidence for the rest of the questions.

What do you think of the list?  Would you add to it?  Is there anything with which you disagree?

The Challenges to Writing a Good Test


What sorts of challenges do we, as professors, face when writing an exam?  That was one question in my mind when I started reading the resources.  This site made a statement that really struck a chord with me:

Source: https://www.psychologytoday.com/blog/thinking-about-kids/201212/how-write-final-exam

In reality, most professors develop exams as best they can.

Few have any formal training in assessment (the field that focuses on how to accurately measure performance).

Although many professors spend most of their time teaching, most of us have no formal training in education whatsoever.

So we tend to write questions that sound good and make sense to us.

We try to minimize cheating by writing new exams every semester so we never have a chance to weed out bad questions and develop really good measurement instruments.

We often use the same types of tools used by our own professors to assess the skills and learning of our students instead of thinking about what would work best.

We often don’t think clearly enough about our course goals to accurately measure them.

And sometimes our questions are not clear enough so different students interpret them differently and we only recognize interpretations that match our own.

And all this happens despite our best efforts and all our hard work.

For better or for worse.

Some of us have an education background but many of us do not.  We mimic the test styles we liked in our experience and perhaps avoid the ones we disliked.  Certainly we formed opinions about our teachers and decided which we wanted to emulate when we taught.

I don't see anything wrong with that, but if we want to improve, we need to explore new ideas.

We can start by acknowledging the basic features of a good exam.

A test should be an accurate measure of student achievement.

Source: http://www.iub.edu/~best/pdf_docs/better_tests.pdf :

Problems that keep tests from being accurate measures of students’ achievement:

  • Too many questions measuring only knowledge of facts.
  • Too little feedback.
  • The questions are often ambiguous and unclear.
  • The tests are too short to provide an adequate sample of the body of content to be covered.
  • The number of exams is insufficient to provide a good sample of the students’ attainment of the knowledge and skills the course is trying to develop.

Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf :

Well-constructed tests:

  1. Motivate students and reinforce learning
  2. Enable teachers to assess the students’ mastery of course objectives
  3. Provide feedback on teaching, often showing what was or was not communicated clearly

What makes a test good or bad?  The most basic and obvious answer is that good tests measure what you want to measure and bad tests do not.

The whole point of testing is to encourage learning.  A good test is designed with items that are not easily guessed without proper studying.
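
One way to make "not easily guessed" concrete is the classical correction-for-guessing formula: with k options per item, a blind guesser expects to get 1/k of the items right, and formula scoring discounts exactly that. A short Python sketch with illustrative numbers:

```python
def corrected_score(right, wrong, k):
    """Formula score: right - wrong / (k - 1); omitted items are ignored."""
    return right - wrong / (k - 1)

# A student blindly guessing on 20 four-option items expects 5 right and
# 15 wrong, so the corrected score is 5 - 15/3 = 0: guessing earns nothing
# on average.
print(corrected_score(5, 15, k=4))    # 0.0
print(corrected_score(14, 6, k=4))    # 12.0
```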

Have you ever spent time studying your tests?  When designing them, do you think about what you are testing and how you are assessing that information?

Have you analyzed the responses students put on the test to see if they understood what you were asking?  Do you think about how the wording or design could be improved on future tests?

We will explore these in more detail in the next posts.

List of Resources

Here are the various web links that we will reference for this blog.  The list may be edited over time.

Feel free to post more resources in the Comments section.

  1. http://www.iub.edu/~best/pdf_docs/better_tests.pdf
  2. http://cft.vanderbilt.edu/guides-sub-pages/writing-good-multiple-choice-test-questions/
  3. http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf
  4. https://www.msu.edu/dept/soweb/writitem.html
  5. http://www.helpteaching.com/about/how_to_write_good_test_questions/
  6. http://www.psychologytoday.com/blog/thinking-about-kids/201212/how-write-final-exam
  7. http://www.instructables.com/id/How-to-write-a-test/
  8. http://www.uleth.ca/edu/runte/tests/
  9. http://teachonline.asu.edu/2013/06/quick-reference-guide-for-writing-effective-test-questions/
  10. http://www.lshtm.ac.uk/edu/taughtcourses/writinggoodexamquestions.pdf
  11. http://www.crlt.umich.edu/P8_0
  12. http://www.cmu.edu/teaching/assessment/assesslearning/creatingexams.html
  13. http://www.teaching-learning.utas.edu.au/assessment/authentic-assessment/designing-exams
  14. http://depts.washington.edu/eproject/ExamChecklist.htm
  15. http://www.princeton.edu/mcgraw/library/sat-tipsheets/designing-exam/
  16. http://www4.ncsu.edu/unity/lockers/users/f/felder/public/Papers/TestingTips.htm
  17. http://www.tltc.ttu.edu/teach/TLTC%20Teaching%20Resources/CreateTests.asp
  18. http://www.edutopia.org/better-tests-differentiate
  19. http://teaching.uncc.edu/learning-resources/articles-books/best-practice/assessment-grading/designing-test-questions
  20. http://www.calm.hw.ac.uk/GeneralAuthoring/031112-goodpracticeguide-hw.pdf

What We are About

Welcome!

We are teachers and we want to do our best for our students.  Sometimes we need a chance to see what others are doing to help us “improve our game.”

The goal of this blog is to explore the strategies, philosophies, and various options of test writing.  We’ll take a systematic approach, starting with general tips about tests and test construction and then proceeding through different test item types.

We will look at articles and advice on the Internet and discuss how the ideas may or may not apply to our discipline.  This is not a “one-size-fits-all” topic!  Neither should it be considered a best practices list.  We are the topic experts and the best judges for the information we are assessing.

Feel free to share these posts.

Due to the number of spammers, comments have been disallowed.

Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license

Tracy Johnston
STEM 1 Curriculum and Program Improvement (CPI) Coordinator
Palomar College

This is a sticky post; newer posts appear below.

For a list of the resources on this blog, click here.