Aug 2015 Plenary Talk Documents

The documents used at the Palomar College Plenary breakout sessions for August, 2015 are below.

The slideshow in .PPTX format.  You can click “View” and then “Notes View” to see each slide presented on a single page with the accompanying text:

Test Writing Plenary Talk Aug 2015

The worksheet in .PDF format:

Test Writing Plenary Talk Worksheet

 

The verb lists and question frames documents are found here as well as in context within the blog.

Bloom’s Verbs, one page

Bloom’s Verbs for Math

Bloom’s Question Frames

More Bloom’s Question Frames

Bloom’s Verbs for Science

 

Strategy Summary


We are at the point of our investigation where we need to start looking in more detail at test construction.  Here is a brief summary that puts together the pieces of what we have learned so far.

Our challenges

Write an accurate measure of student achievement that

  • motivates students and reinforces learning,
  • enables us to assess student mastery of course objectives,
  • and allows us to recognize what material was or was not communicated clearly.

Some things we can do to accomplish this

In general, when designing a test we need to

  • consider the length of the test,
  • write clear, concise instructions,
  • use a variety of test item formats,
  • test early and/or frequently,
  • proofread and check for accuracy,
  • consider the needs of our students with disabilities and our non-native speakers,
  • and use humor.

More specifically, our test goals are to

  • assess achievement of instructional objectives,
  • measure important aspects of the subject,
  • accurately reflect the emphasis placed on important aspects of instruction,
  • measure an appropriate level of student knowledge,
  • and have the questions vary in levels of difficulty.

We should consider the technical quality of our tests

Quality means “conformance to requirements” and “fitness for use.”  The criteria are

  • offering cognitive complexity,
  • reviewing content quality,
  • writing meaningful questions,
  • using appropriate language,
  • being able to generalize about student learning from their test performance,
  • and writing a fair test with answers that represent what students know.

A useful tool is Bloom’s Taxonomy

It lists learning levels in increasing order of complexity:

  1. Remembering
  2. Understanding
  3. Applying
  4. Analyzing
  5. Evaluating
  6. Creating

To apply Bloom’s directly, we looked at

  • lists of verbs associated with each level (some were discipline-specific),
  • question frames — nearly complete questions we can modify for our topics,
  • and knowledge domains — the kinds of knowledge that can be tested:
    • factual,
    • conceptual,
    • procedural,
    • and metacognitive.

Next up:  Learning what question types to use to achieve our goals.

Using Bloom’s in Test Writing


When I first started considering Bloom’s Taxonomy, I thought it would help expand my ideas on how to test, but I struggled to apply it directly.  I appreciated the increasing cognitive levels but needed help writing test questions that used them.

What I found were lists of verbs associated with each level.  A good one to start with is:

Source: http://www.lshtm.ac.uk/edu/taughtcourses/writinggoodexamquestions.pdf

A table of suggested verbs mapped against the Anderson and Krathwohl adapted levels of Bloom’s Taxonomy of Cognition

Cognitive Level: Verb Examples

  1. Remember: define, repeat, record, list, recall, name, relate, underline.
  2. Understand: translate, restate, discuss, describe, recognise, explain, express, identify, locate, report, review, tell.
  3. Apply: interpret, apply, employ, use, demonstrate, dramatise, practice, illustrate, operate, schedule, sketch.
  4. Analyse: distinguish, analyse, differentiate, appraise, calculate, experiment, test, compare, contrast, criticise, diagram, inspect, debate, question, relate, solve, examine, categorise.
  5. Evaluate: judge, appraise, evaluate, rate, compare, revise, assess, estimate.
  6. Create: compose, plan, propose, design, formulate, arrange, assemble, collect, construct, create, set-up, organise, manage, prepare.
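One way to put a verb list like this to work is a small lookup that reports which level or levels a question's leading verb suggests.  The following is a minimal Python sketch and is not part of the source handout; the names BLOOM_VERBS and levels_for_verb are made up for illustration, and only a few verbs from each level in the list above are included.

```python
# Hypothetical lookup table holding a few verbs from each level listed above.
BLOOM_VERBS = {
    "Remember": ["define", "list", "recall", "name", "relate"],
    "Understand": ["restate", "describe", "explain", "identify"],
    "Apply": ["interpret", "use", "demonstrate", "illustrate"],
    "Analyse": ["distinguish", "calculate", "compare", "contrast", "relate", "appraise"],
    "Evaluate": ["judge", "appraise", "rate", "compare", "assess"],
    "Create": ["compose", "design", "formulate", "construct"],
}


def levels_for_verb(verb):
    """Return every level whose verb list contains the given verb."""
    verb = verb.strip().lower()
    return [level for level, verbs in BLOOM_VERBS.items() if verb in verbs]


print(levels_for_verb("explain"))
print(levels_for_verb("compare"))
```

Some verbs, such as "compare" and "appraise," appear under more than one level, so the verb alone does not settle the level of a question; the task the student is asked to perform does.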

Here is an extensive list that is printable on one page, useful for reference while you are designing your test:

Bloom’s Verbs, one page.

Other useful lists:

Bloom’s Verbs for Math

Bloom’s Question Frames (looks very good for English, literature, history, etc.)  This gives you nearly complete questions which you can manipulate into test items appropriate to your discipline.

More Bloom’s Question Frames (2 pages).

Bloom’s Verbs for Science

What comes across to me again and again throughout the sources is that considering the hierarchy when designing exams creates a culture of learning that involves thinking deeply about the course material, taking it beyond simple rote memorization and recitation.

This culture would benefit from also considering Bloom’s while you are teaching.  Modeling higher level thought processes, showing joy at cognitive challenges, and exploring topics in depth (if time permits) or mentioning that the depth exists (if time is short) can send a strong signal that thinking is valued and important to learning.

Another view on Bloom’s as applied to test writing is to consider the knowledge domains inherent in your course material.  They are:

Source: http://www.lshtm.ac.uk/edu/taughtcourses/writinggoodexamquestions.pdf

The kinds of knowledge that can be tested

Factual Knowledge

Terminology, Facts, Figures

Conceptual Knowledge

Classification, Principles, Theories, Structures, Frameworks

Procedural Knowledge

Algorithms, Techniques and Methods and Knowing when and how to use them.

Metacognitive Knowledge

Strategy, Overview, Self Knowledge, Knowing how you know.

When I put this list with the verbs lists, I get more ideas for test questions and directions for exploring student acquisition of the course knowledge.

Defining Bloom’s Taxonomy

[Image: triangle diagram of the revised Bloom’s Taxonomy]

One recurring recommendation in the resources is that we should consider Bloom’s Taxonomy when designing tests. To do so, we should know what it is.

The triangle above is a version of the revised Bloom’s, using active verbs and with an addition of one level and a slight reordering at the top.

According to http://www.learnnc.org/lp/pages/4719,

Bloom’s Taxonomy was created in 1948 by psychologist Benjamin Bloom and several colleagues. Originally developed as a method of classifying educational goals for student performance evaluation, Bloom’s Taxonomy has been revised over the years and is still utilized in education today.

The original intent in creating the taxonomy was to focus on three major domains of learning: cognitive, affective, and psychomotor. The cognitive domain covered “the recall or recognition of knowledge and the development of intellectual abilities and skills”; the affective domain covered “changes in interest, attitudes, and values, and the development of appreciations and adequate adjustment”; and the psychomotor domain encompassed “the manipulative or motor-skill area.” Despite the creators’ intent to address all three domains, Bloom’s Taxonomy applies only to acquiring knowledge in the cognitive domain, which involves intellectual skill development.

The site goes on to say:

Bloom’s Taxonomy can be used across grade levels and content areas. By using Bloom’s Taxonomy in the classroom, teachers can assess students on multiple learning outcomes that are aligned to local, state, and national standards and objectives. Within each level of the taxonomy, there are various tasks that move students through the thought process. This interactive activity demonstrates how all levels of Bloom’s Taxonomy can be achieved with one image.

Further, http://www.edpsycinteractive.org/topics/cognition/bloom.html tells us,

The major idea of the taxonomy is that what educators want students to know (encompassed in statements of educational objectives) can be arranged in a hierarchy from less to more complex.  The levels are understood to be successive, so that one level must be mastered before the next level can be reached.

And also,

In any case it is clear that students can “know” about a topic or subject in different ways and at different levels.  While most teacher-made tests still test at the lower levels of the taxonomy, research has shown that students remember more when they have learned to handle the topic at the higher levels of the taxonomy (Garavalia, Hummel, Wiley, & Huitt, 1999).

Let’s see what each level represents.  The following list is based on the original Bloom’s categories but it is still enlightening.

Source: http://www.calm.hw.ac.uk/GeneralAuthoring/031112-goodpracticeguide-hw.pdf

Table 2.2 Bloom’s taxonomy and question categories

Competence: Skills demonstrated

  • Knowledge:
    • Recall of information
    • Knowledge of facts, dates, events, places
  • Comprehension:
    • Interpretation of information in one’s own words
    • Grasping meaning
  • Application:
    • Application of methods, theories, concepts to new situations
  • Analysis:
    • Identification of patterns
    • Recognition of components and their relationships
  • Synthesis:
    • Generalize from given knowledge
    • Use old ideas to create new ones
    • Organize and relate knowledge from several areas
    • Draw conclusions, predict
  • Evaluation:
    • Make judgments
    • Assess value of ideas, theories
    • Compare and discriminate between ideas
    • Evaluate data

Based on the work by Benjamin S. Bloom et al., Evaluation to Improve Learning (New York: McGraw-Hill, 1981)

We will look at these in more detail in the next post.

The Technical Quality of a Test — Part 2


In the previous post (below) we started looking at features that define the technical quality of a test.

     Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

Criteria for establishing the technical quality of a test

  1. Cognitive complexity

The test questions will focus on appropriate intellectual activity ranging from simple recall of facts to problem solving, critical thinking, and reasoning.

Bloom’s Taxonomy*

  2. Content quality

The test questions will permit students to demonstrate their knowledge of challenging and important subject matter.  The emphasis of the test should be a reflection of the emphasis of the lecture.

  3. Meaningfulness

The test questions will be worth students’ time and students will recognize and understand their value.

  4. Language appropriateness

The language demands will be clear and appropriate to the assessment tasks and to students.  It should reflect the language used in the classroom.  Test items should be stated in simple, clear language, free of nonfunctional material and extraneous clues, and free of race, ethnic, and sex bias.

  5. Transfer and generalizability

Successful performance on the test will allow valid generalizations about achievement to be made.

  6. Fairness

Student performance will be measured in a way that does not give advantage to factors irrelevant to school learning:  scoring schemes will be similarly equitable.

Basic rules of fairness:

  • Test questions should reflect the objectives of the unit
  • Expectations should be clearly known by the students
  • Each test item should present a clearly formulated task
  • One item should not aid in answering another
  • Ample time for test completion should be allowed
  • Assignment of points should be determined before the test is administered.

  7. Reliability

Answers to test questions will be consistently trusted to represent what students know.

 

*More on Bloom’s Taxonomy in a future post.

We have already discussed points #1 and 2; let us continue with the rest.

  3. Meaningfulness

I think if we are writing exam questions that explore the knowledge we want the students to learn, the questions will be meaningful, even when they only test simple recall.  Each question should trigger a memory in any student who has prepared and studied.

I am not always certain our students will recognize and understand the value of the questions we offer, but I am not sure that really matters.  We want to avoid outrage at a question that comes across as grossly unfair or outside the scope of the class, and I think meaningful questions accomplish that.

  4. Language appropriateness

When I see the phrase “the language used in the classroom,” I think about how I describe concepts and the level of the vocabulary I use in discussions.  I try to avoid “dumbing down” the words I use but I also try to avoid choosing words that are esoteric or outdated.  In lecture, it is often easy to see student reaction to words they don’t understand, and that tells me I need to define those words, even if they aren’t words in my discipline.  This gives me an opportunity to raise the student vocabulary closer to college level.  Once I have used and defined them, I feel free to use those words in exams.

One hazard of making the questions “free of nonfunctional material and extraneous clues” in mathematics is that students become trained to believe they must use every number and every bit of information in the problem or they won’t be working it correctly.  Unfortunately, real-world problems that use math often contain nonfunctional material and extraneous clues, and our students need to learn how to weed them out.  I introduce this skill at the calculus level.

  5. Transfer and generalizability

The goal I set for my students is for them to learn the course material in such a way that they can perform the skills, recall the ideas, and recognize the vocabulary and notation, and that they are prepared to take the next course in the sequence successfully.  This is my definition of transfer and generalizability.

How would you define it for your discipline?

  6. Fairness

This seems straightforward and reasonable to me.  I don’t always have the time to determine the assignment of points before the test is administered but I always do before I start grading.  If something causes me to rethink the point distribution, I regrade all the problems affected by it.

  7. Reliability

The description given for this point did not help me understand reliability but this source’s definition did [note that “marker” means “the person grading the exam”]:

     Source:  http://www.lshtm.ac.uk/edu/taughtcourses/writinggoodexamquestions.pdf

Does the question allow markers to grade it consistently and reproducibly and does it allow markers to discriminate between different levels of performance? This frequently depends on the quality of the marking guidance and clarity of the assessment criteria. It may also be improved through providing markers with training and opportunities to learn from more experienced assessors.

What resonates with me is the ability to discriminate between different levels of performance.  That can be challenging when grading math problems because I feel partial credit is important.  Students can work problems in so many incorrect or partially correct ways that I have to work hard to determine how much they really knew and how much was due to simple error.

From the criteria list I see the opportunity to consider the overall structure of my exams and assess my general test writing skills.  I like the guidelines and how they direct me to think beyond my personal experiences while considering how the students will perceive the test.

The Technical Quality of a Test — Part 1


We want our tests to be good measures of student achievement so we need to pay attention to what one source calls the “technical quality of a test.”

To help me understand what quality really means, I found these definitions useful:

  1. The characteristics of a product or service that bear on its ability to satisfy stated or implied needs; “conformance to requirements”
  2. A product or service free of deficiencies; “fitness for use”

(from http://asq.org/glossary/q.html)

So what criteria should we use to improve quality?

     Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

Criteria for establishing the technical quality of a test

  1. Cognitive complexity

The test questions will focus on appropriate intellectual activity ranging from simple recall of facts to problem solving, critical thinking, and reasoning.

Bloom’s Taxonomy*

  2. Content quality

The test questions will permit students to demonstrate their knowledge of challenging and important subject matter.  The emphasis of the test should be a reflection of the emphasis of the lecture.

  3. Meaningfulness

The test questions will be worth students’ time and students will recognize and understand their value.

  4. Language appropriateness

The language demands will be clear and appropriate to the assessment tasks and to students.  It should reflect the language used in the classroom.  Test items should be stated in simple, clear language, free of nonfunctional material and extraneous clues, and free of race, ethnic, and sex bias.

  5. Transfer and generalizability

Successful performance on the test will allow valid generalizations about achievement to be made.

  6. Fairness

Student performance will be measured in a way that does not give advantage to factors irrelevant to school learning:  scoring schemes will be similarly equitable.

Basic rules of fairness:

  • Test questions should reflect the objectives of the unit
  • Expectations should be clearly known by the students
  • Each test item should present a clearly formulated task
  • One item should not aid in answering another
  • Ample time for test completion should be allowed
  • Assignment of points should be determined before the test is administered.

  7. Reliability

Answers to test questions will be consistently trusted to represent what students know.

 

*More on Bloom’s Taxonomy in a future post.

Let’s examine this list in detail.

  1. Cognitive complexity

Oh, I like this one.  We should be challenging our students with intellectual activity and, more importantly, with a range of it.  This brings me back to the previous post (below) where we discussed classifying questions by how easily they can be answered, from those that most students can get to the few “A-B breakers.”  There should be some questions that make the student think, “Hmmm, how can I use what I have learned to answer this?” and some that bring on the reaction of “Oh yes, I have seen all this before and I can remember it.”

I recall a question on a botany exam that asked me to imagine holding a plant stem in my hand and piercing it with a straight pin.  I needed to describe the various tissue types the pin might touch as it passed through to the middle of the stem.  I had learned the list of tissues already; this question forced me to consider their locations in the plant and organize them from the outside to the inside.  I hadn’t already considered that idea so I was cognitively challenged but I had all the tools I needed to answer the question.

One message that comes across in a number of the sources is that we can be tempted to test on the easier parts of the material rather than the important parts.  Considering cognitive complexity helps us focus on drawing from our students what they have learned beyond simple recall.

  2. Content quality

Again, we need to ensure we are testing more than simple recall but we also have to make sure we are not writing questions that test outside of the course material.

Here is one concern I have about emphasizing on the test what is emphasized in the lecture: if this is taken too literally, our students are at risk of paying attention only to the information we explicitly label as important and ignoring any nuances or “items of lesser importance.”  They are often keenly tuned into the way we write on the board, in a PowerPoint slide, or on digital lecture notes and are quick to infer that words in bold, italics, all capitals, or that are underlined are the only things they should study for a test.  I found I was giving them that impression in my lectures so I changed the way I wrote on the board, forcing my students to consider all the words I presented.

We will continue this discussion in the next post!

Test Goals — Thinking About the Strategies


When considering “What am I testing?”, I agree with this web page’s statement:

Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

In general, test items should

  • Assess achievement of instructional objectives
  • Measure important aspects of the subject (concepts and conceptual relations)
  • Accurately reflect the emphasis placed on important aspects of instruction
  • Measure an appropriate level of student knowledge
  • Vary in levels of difficulty

But my reaction is to laugh because it is easy to make this list.  What do you have to do to implement it?

In beginning to construct a test, I should at least acknowledge the instructional objectives.  I might even make a written list, depending on the time I have available.  I ask myself:

  • What do I want the students to get out of each section and chapter?
  • What are the big-picture goals, the skills, the vocabulary, the concepts?
  • What are the common mistakes previous students have made?
  • Is there any information I want to foreshadow?
  • What have I pointed out in lecture about what to do or what not to do?

The words, “measure important aspects of the subject,” make me wary.  I think it is easy to interpret them as “only focus on the most important aspects” which means I should not test at all on anything else.

What I do think it means is “avoid testing on minor facts”, which opens the field up to a great many topics as well as encourages us to think deeply about what is and is not important.  For example, in a history class you might learn that Lincoln was assassinated on April 14, 1865.  Is it important that you know it was April 14?  Maybe not.  But April 14 of this year is the 150th anniversary of that event and that might make it important.  In any other year it might be enough to know it happened in 1865.

I find it challenging to determine “an appropriate level of student knowledge.”  This really deals with how long the test should be compared to how much time I have to give it and how well I feel the students are learning the material.  I measure how long it takes me to write up the solutions and divide that into the test time, and am happy if the answer achieves certain values depending on the class.  This generally works well although sometimes I am surprised by the students’ reaction. How do you determine the length?
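As a rough illustration of that calculation, here is a minimal Python sketch.  The function name, the 50-minute class period, the 12-minute solution time, and the target ratio of 4 are all assumptions made for the example, not values I am recommending; the right target depends on the class.

```python
def time_ratio(test_minutes, instructor_minutes):
    """Test time divided by the time the instructor needs to write up the solutions."""
    if instructor_minutes <= 0:
        raise ValueError("instructor_minutes must be positive")
    return test_minutes / instructor_minutes


# Hypothetical numbers: a 50-minute class period and 12 minutes to write the key.
ratio = time_ratio(50, 12)

# Assumed target: students get at least four times as long as the instructor took.
TARGET_RATIO = 4.0
print(f"ratio = {ratio:.2f}, meets target: {ratio >= TARGET_RATIO}")
```

If the ratio falls below the target chosen for a class, that is a signal to shorten the test or simplify a few of the longer problems.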

I address the variation in difficulty by thinking about which questions are easily answered, which take a “usual” amount of work and thinking, and then I throw in one or two “A-B breakers.”  These are the questions that will separate the students who have really learned the material from those who are somewhat prepared or not prepared at all.  They might require a little more conceptual thinking or a slight stretch on the skills or knowledge students should have already attained.

The strategies I use are helpful for writing a test in my discipline.  Other strategies might be appropriate for other disciplines.  Feel free to write about them in the comments section.

General Tips About Test Design


In the previous post (below) we asked, “Do you think about what you are testing and how you are assessing that information?” when it comes to test design.

This site,

Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf,

provides some general tips:

General tips about testing

  • Length of test

The more items it has, the more reliable it is.  However, if a test is too long, the students may get tired and not respond accurately.  If a test needs to be lengthy, divide it into sections with different kinds of tasks.

  • Clear, concise instructions

It is useful to provide an example of a worked problem, which helps the student understand exactly what is necessary.

  • Mix it up!

It is often advantageous to mix types of items (multiple choice, true-false, essay) on a written exam.  Weaknesses connected with one kind of item or component or in students’ test taking skills will be minimized.

  • Test Early

Consider discounting the first test if the results are poor.  Students often need a practice test to understand the format each instructor uses and anticipate the best way to prepare and take particular tests.

  • Test frequently

Frequent testing helps students to avoid getting behind, provides instructors with multiple sources of information to use in computing the final course grade, and gives students regular feedback.

  • Check for accuracy

Instructors should be cautious about using tests written by others.  They should be checked for accuracy and appropriateness in the given course.

  • Proofread exams

Check them carefully for misspellings, misnumbering responses, and page collation.

  • One wrong answer

It is wise to avoid having separate items or tasks depend upon answers or skills required in previous items or tasks.

  • Special considerations

Anticipate special considerations that learning disabled students or non-native speakers may need.

  • A little humor

Using a little humor or placing less difficult items or tasks at the beginning of an exam can help reduce test anxiety and thus promote a more accurate demonstration of their progress.

My reaction to their advice is mixed.  I’m not sure I could provide good examples of worked problems on the test itself because I teach mathematics, and working problems for the students defeats the purpose of the test.  However, I can have the students get that knowledge before the exam by assigning homework problems and emphasizing that many of the test problems will utilize those skills and strategies.

I am able to “mix it up” sometimes, depending on the course and the material being covered.  When testing vocabulary in statistics, for example, sometimes I use multiple choice and sometimes I use fill-in-the-blank.

I am not fond of the idea of discounting the first test if it is poor.  I get around that by offering my students short quizzes on a regular basis — I write the problems and grade them so students get a feel for my writing style and notation expectations before the longer, high-stakes exams.  My goal for the quizzes is to have the cumulative points be similar to an exam but then that total is weighted less than an exam towards the overall grade.

In math it is difficult to completely avoid having separate items or tasks depend upon previous answers.  The dilemma is this:  Do I write a complicated problem and have the students recall all the steps I want?  Or do I walk them through the steps knowing the answer to one may be dependent on the answer of another?

I think humor is a wonderful addition to tests.  Whenever I can (i.e., there is room), I include a math-related cartoon on the last page.

Also, the phrase I have heard about placing less difficult items at the beginning of a test is “establishing a pattern of success.”  Give the student who has prepared a chance to start off with a victory, thus building confidence for the rest of the questions.

What do you think of the list?  Would you add to it?  Is there anything with which you disagree?

The Challenges to Writing a Good Test


What sorts of challenges do we, as professors, face when writing an exam?  That was one question in my mind when I started reading the resources.  This site made a statement that really struck a chord with me:

Source: https://www.psychologytoday.com/blog/thinking-about-kids/201212/how-write-final-exam

In reality, most professors develop exams as best they can.

Few have any formal training in assessment (the field that focuses on how to accurately measure performance).

Although many professors spend most of their time teaching, most of us have no formal training in education whatsoever.

So we tend to write questions that sound good and make sense to us.

We try to minimize cheating by writing new exams every semester so we never have a chance to weed out bad questions and develop really good measurement instruments.

We often use the same types of tools used by our own professors to assess the skills and learning of our students instead of thinking about what would work best.

We often don’t think clearly enough about our course goals to accurately measure them.

And sometimes our questions are not clear enough so different students interpret them differently and we only recognize interpretations that match our own.

And all this happens despite our best efforts and all our hard work.

For better or for worse.

Some of us have an education background but many of us do not.  We mimic the test styles we liked in our experience and perhaps avoid the ones we disliked.  Certainly we formed opinions about our teachers and decided which we wanted to emulate when we taught.

I don’t see anything wrong with that but if we want to improve, we need to explore new ideas.

We can start by acknowledging the basic features of a good exam.

A test should be an accurate measure of student achievement.

Source: http://www.iub.edu/~best/pdf_docs/better_tests.pdf :

Problems that keep tests from being accurate measures of students’ achievement:

  • Too many questions measuring only knowledge of facts.
  • Too little feedback.
  • The questions are often ambiguous and unclear.
  • The tests are too short to provide an adequate sample of the body of content to be covered.
  • The number of exams is insufficient to provide a good sample of the students’ attainment of the knowledge and skills the course is trying to develop.

Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf :

Well-constructed tests:

  1. Motivate students and reinforce learning
  2. Enable teachers to assess the students’ mastery of course objectives
  3. Provide feedback on teaching, often showing what was or was not communicated clearly

What makes a test good or bad?  The most basic and obvious answer is that good tests measure what you want to measure and bad tests do not.

The whole point of testing is to encourage learning.  A good test is designed with items that are not easily guessed without proper studying.

Have you ever spent time studying your tests?  When designing them, do you think about what you are testing and how you are assessing that information?

Have you analyzed the responses students put on the test to see if they understood what you were asking?  Do you think about how the wording or design could be improved on future tests?

We will explore these in more detail in the next posts.