HANDOUT TO - The construction of tests and kinds of testing items
1. The essential step in testing is to be clear about what you want to know and for what purpose. So what questions do you need to answer?
-
-
-
-
2. When designing a test, first set the specifications for the test. These will include information on: content, format and timing, criteria levels of performance, and scoring procedures.
3. Decision on the content.
The content is described in different ways according to its nature. The content of a grammar test, for example, may simply list all the relevant structures. The content of a test of a language skill, such as reading comprehension, may be specified along a number of dimensions.
The test constructors first decide on discrete points and areas they want to test. This involves distinguishing broad objectives from more specific ones. Test items are then developed for each objective.
4. Format and timing - what should they specify?
- the test structure (including the time allocated to each component)
- the item types and procedures, with examples
- how important each component is
- how many passages will be presented (in the case of reading and listening) or required (in the case of writing), and how many items there will be in each component
One basic issue of format is whether the test progresses to increasingly difficult items or whether easy and difficult items are interspersed. There are arguments on both sides. If items get increasingly difficult, the respondents may give up after a while and not attempt any items after the first one they get stuck on. Yet if respondents experience failure too frequently at the outset of a test because of difficult items, they may be discouraged from attempting the remainder of the items in a section. Thus, there may be a psychological advantage to pacing the items so that they become progressively more difficult. A compromise is to start the test with relatively easy items and then to intersperse easy and difficult items.
5. Instructions
What should the instructions be like?
They should be brief, explicit and understandable.
What else should a student know before s/he starts writing the test?
Students should know the rules for each item and section of the test.
The teacher must give the time limit allowed for each subtest and for the total test.
6. Criteria levels of performance - what does it mean?
You should specify the required level of performance for success. For example, to succeed in this test a student must obtain 60% or 55 points.
7. Scoring procedures - If an objective is tested by more than one item (say, five items), then it is possible to speak of mastery of the objective. If somebody gets four of the five items right, that person has displayed 80% mastery of the objective, according to the test. The test may be a series of such items. The test constructor has to consider how long it will take to score particular types of items. The more objective the item, the higher the scorer reliability is likely to be (i.e., the likelihood that two different teachers checking the same test would come up with the same score for a particular respondent's test). Machine scoring involves answer sheets.
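The mastery calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `mastery_by_objective` and the objective label "past simple" are assumptions made for the example, not part of the handout.

```python
def mastery_by_objective(item_objectives, responses):
    """Return the percentage of items answered correctly for each objective.

    item_objectives: the objective each item tests (hypothetical labels)
    responses: booleans, True where the respondent answered the item correctly
    """
    totals, correct = {}, {}
    for objective, is_correct in zip(item_objectives, responses):
        totals[objective] = totals.get(objective, 0) + 1
        correct[objective] = correct.get(objective, 0) + int(is_correct)
    return {obj: 100 * correct[obj] / totals[obj] for obj in totals}


# Five items test one objective; the respondent gets four of them right.
objectives = ["past simple"] * 5
answers = [True, True, False, True, True]
print(mastery_by_objective(objectives, answers))  # {'past simple': 80.0}
```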
8. The next step is to decide on test items
- alternate response item: one in which a correct response must be chosen from two alternatives, such as TRUE/FALSE, YES/NO, A/B
- fixed response item: one in which the correct answer must be chosen from among several alternatives
- free response item, also open-ended response: one in which the student is free to answer a question as he/she wishes, without having to choose from among alternatives provided
- structured response item: one in which some control or guidance is given for the answer, but the student must contribute something of their own, e.g. a comprehension question after a reading passage
9. The most common types of test items: true-false tests; multiple-choice tests; fill-in-the-blank tests; matching tests; cloze tests; C-tests.
10. What is a true-false test? One in which a student is to accept or reject a statement or utterance heard or read. Such tests are useful as tests of listening or reading comprehension, or of knowledge of historical, literary and cultural facts. Since there are only two choices, there is a 50/50 probability of answering correctly by chance.
11. What is a multiple-choice test? One that requires an answer to a question on the meaning of words, items of grammar, or matters of cultural interest. There is usually a choice of four alternatives, of which only one is appropriate; the others are based on probable errors.
To make the item more demanding, at least three of the options must sound or look reasonable and must not be nonsense proposals.
12. The fill-in-the-blank test may be multiple-choice or it may require the student to write in an appropriate word. These tests are useful for assessing knowledge of grammatical structures, use of tenses, levels of language, and vocabulary [they are very often used for testing phrasal verbs, where a student is to supply the correct preposition that is part of a collocation or phrase].
What are the dangers of fill-in-the-blank tests?
Teachers must make sure the sentences are quite unambiguous and can be completed appropriately by only one word. Some test designers provide a pattern to be filled in, giving the number of letters and some of the letters, to help the student fill in the vocabulary item.
13. Matching tests are commonly used as vocabulary tests. Students are asked to match synonyms, antonyms, names of objects with occupations, etc.
14. What is a cloze test?
Words are removed from a reading passage at regular intervals, leaving blanks, e.g. every fifth word is removed. The reader must then read the passage and try to guess the missing words.
The cloze procedure is thus a technique for measuring reading comprehension. But it can also be used to judge the difficulty of reading materials.
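The deletion step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `make_cloze`, the blank marker and the sample passage are invented for the example, and keeping trailing punctuation next to the blank is a design choice, not something the handout prescribes.

```python
import re


def make_cloze(passage, n=5, blank="_____"):
    """Replace every n-th word of the passage with a blank.

    Returns the gapped text and the list of removed words (the answer key).
    """
    words = passage.split()
    key = []
    for i in range(n - 1, len(words), n):
        # Split off trailing punctuation so it stays visible after the blank.
        core, punctuation = re.match(r"^(.*?)([.,;:!?]*)$", words[i]).groups()
        key.append(core)
        words[i] = blank + punctuation
    return " ".join(words), key


text, key = make_cloze(
    "The quick brown fox jumps over the lazy dog near the quiet river bank today."
)
print(text)  # The quick brown fox _____ over the lazy dog _____ the quiet river bank _____.
print(key)   # ['jumps', 'near', 'today']
```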
The reader is given a score according to how well the words guessed match the original words, or whether or not they make sense. Two types of scoring procedure are used:
1. The reader must guess the exact word that was used in the original. This is called the exact word method.
2. The reader can guess any word that is appropriate or acceptable in the context. This is called the acceptable word method (also the appropriate word method).
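The two scoring methods differ only in which answers count as correct, as the sketch below shows. The function name `score_cloze` and the sample lists of acceptable words are assumptions made for illustration; in practice the teacher decides which alternatives are acceptable in context.

```python
def score_cloze(guesses, original_words, acceptable=None):
    """Score a cloze test by the exact word or the acceptable word method.

    guesses: the reader's word for each blank
    original_words: the words removed from the passage
    acceptable: optional list of sets, one per blank, of other words the
        scorer accepts (hypothetical data); None means exact word scoring
    """
    score = 0
    for i, (guess, original) in enumerate(zip(guesses, original_words)):
        allowed = {original.lower()}
        if acceptable is not None:
            allowed |= {word.lower() for word in acceptable[i]}
        if guess.strip().lower() in allowed:
            score += 1
    return score


original = ["big", "quickly", "house"]
guesses = ["large", "quickly", "home"]
alternatives = [{"large", "huge"}, {"fast"}, {"home"}]

print(score_cloze(guesses, original))                # exact word method: 1
print(score_cloze(guesses, original, alternatives))  # acceptable word method: 3
```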
15. What is a C-Test?
- a special kind of cloze test in which the second half of every second word is left out. The student gets only the first half of the word and has to reconstruct the whole word from the surrounding context;
- usually such a test has no more than 200 words, of which the student reconstructs 100. Giving the first half of the word makes the test objective, since it does not allow many interpretations; usually only one answer is correct.
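A C-Test passage can be produced mechanically, as in the sketch below. The handout does not say how to treat words with an odd number of letters or one-letter words, so the choices here are assumptions: the shorter half is kept for odd-length words, and one-letter words are left intact. The function name `make_c_test` and the sample sentence are also invented for the example.

```python
def make_c_test(passage, gap_char="_"):
    """Blank out the second half of every second word.

    Returns the gapped text and the list of full words to be reconstructed.
    """
    gapped, key = [], []
    for i, word in enumerate(passage.split()):
        core = word.rstrip(".,;:!?")
        punctuation = word[len(core):]
        # Every second word (2nd, 4th, ...) is mutilated; one-letter words
        # are skipped (assumption, see the note above).
        if i % 2 == 1 and len(core) > 1:
            keep = len(core) // 2  # assumption: keep the shorter half
            gapped.append(core[:keep] + gap_char * (len(core) - keep) + punctuation)
            key.append(core)
        else:
            gapped.append(word)
    return " ".join(gapped), key


text, key = make_c_test("The weather today is really cold.")
print(text)  # The wea____ today i_ really co__.
print(key)   # ['weather', 'is', 'cold']
```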
18. The final step in test preparation is test evaluation
Piloting the test. If time and resources permit, a test should be piloted on a group of learners similar to that for which it is designed. The pilot administration provides the test constructor with feedback on the items and on the appropriateness of the predicted timing.
Item difficulty refers to the proportion of correct responses to a test item. A test which aims to differentiate among respondents should have items which 40 to 60% of the respondents answer correctly. If the purpose of the test is to determine whether nearly all learners have achieved the objectives, then the item difficulty should be 90% or better.
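Item difficulty as defined above is simply the share of respondents who answered the item correctly. A minimal sketch, assuming the results are stored as one row of 0/1 marks per respondent (the data layout and the function name `item_difficulty` are assumptions for illustration):

```python
def item_difficulty(responses):
    """Return the proportion of correct responses for each item.

    responses: one row per respondent, each row a list of 0/1 marks per item
    """
    n_respondents = len(responses)
    n_items = len(responses[0])
    return [
        sum(row[item] for row in responses) / n_respondents
        for item in range(n_items)
    ]


# Five respondents, three items (made-up marks).
marks = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
]
print(item_difficulty(marks))  # [0.8, 0.6, 0.2]
```

Here only the second item (0.6) falls inside the 40 to 60% band suggested for a test that aims to differentiate among respondents.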
The item discrimination index [moc dyskryminacyjna zadania testowego] tells how well an item performs in separating better learners from the poorer ones. The index is intended to distinguish respondents who know the most or have the skills or abilities being tested from those who do not. The teacher can calculate item difficulty and discrimination for a test in class with the students' help.
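The handout does not give a formula for the index. One common formulation, used here as an assumption, compares the highest-scoring and lowest-scoring groups of respondents: the number of correct answers in the upper group minus the number in the lower group, divided by the group size. The 25% group size, the function name `discrimination_index` and the marks are likewise illustrative.

```python
def discrimination_index(responses, item, fraction=0.25):
    """Upper-lower group discrimination index for one item.

    responses: one row per respondent, each row a list of 0/1 marks per item
    item: position of the item in each row
    fraction: share of respondents in each comparison group (assumption)
    """
    # Rank respondents by total score, best first.
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, int(len(ranked) * fraction))
    upper, lower = ranked[:group_size], ranked[-group_size:]
    upper_correct = sum(row[item] for row in upper)
    lower_correct = sum(row[item] for row in lower)
    return (upper_correct - lower_correct) / group_size


# Eight respondents, four items (made-up marks).
marks = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(discrimination_index(marks, item=0))  # 1.0
print(discrimination_index(marks, item=3))  # 0.0
```

In this made-up data the first item separates the two groups completely, while the last item is answered equally well by both groups and so does not discriminate at all.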
Test revision. If an item has a difficulty coefficient lower than 40% or higher than 60%, and if its discrimination index is low, then the item should probably be eliminated.
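The revision rule can be applied to a whole test at once, as in the sketch below. It reads the rule as "difficulty outside the 40 to 60% band and low discrimination". The handout does not say how low is "low", so the cut-off `min_discrimination=0.2` is a hypothetical value chosen for the example, as are the function name `items_to_eliminate` and the sample figures.

```python
def items_to_eliminate(difficulty, discrimination, min_discrimination=0.2):
    """Return the positions of items that should probably be eliminated.

    difficulty: proportion of correct responses per item (0.0 to 1.0)
    discrimination: discrimination index per item
    min_discrimination: cut-off below which discrimination counts as low
        (hypothetical value, not given in the handout)
    """
    flagged = []
    for position, (p, d) in enumerate(zip(difficulty, discrimination)):
        outside_band = p < 0.40 or p > 0.60
        if outside_band and d < min_discrimination:
            flagged.append(position)
    return flagged


difficulty = [0.5, 0.9, 0.3, 0.2]
discrimination = [0.1, 0.05, 0.5, 0.0]
print(items_to_eliminate(difficulty, discrimination))  # [1, 3]
```

The third item is kept even though it is hard, because it still discriminates well; the first is kept because its difficulty lies inside the band.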