"Reliability" and "validity" in constructing and designing English proficiency tests: points of note for teachers
Testing is an indispensable part of foreign language programs in general and of English
programs in particular. For that reason, attention to the “reliability” and “validity” of an
English proficiency test is genuinely important, because most English teachers today have had
almost no training in testing and assessment and instead rely largely on intuition, experience
and course books to construct and design English tests. For these reasons, this article raises
and discusses several issues related to constructing and designing proficiency tests in
English programs.
Test-retest Method

In this method, the same test is administered twice to the same group of students. The second administration takes place no later than two weeks after the first. Students are not told their results on the first test and receive no feedback on their performance. They are also not warned about the second administration and therefore do not prepare for it during this period. After the second test, individual results are arranged in two columns for comparison. If there is no significant difference between them, the test appears to meet the reliability requirement. Although, as Brown (1996) states, this approach might sound strange and may upset students who are asked to take the same test twice, it could prove to be a useful method of estimating the reliability of a test.

KHOA HỌC NGOẠI NGỮ QUÂN SỰ, Số 14 - 7/2018, NGHIÊN CỨU - TRAO ĐỔI

Parallel Test Method

In this method, two tests that are equivalent in difficulty are administered to the same group of students. The same procedures as in the test-retest method are applied. Although the parallel test method sounds more natural than the test-retest method, it is more challenging because two versions of a test must be designed to be strictly equivalent in difficulty. Consequently, the level of difficulty is defined first and the test items are then developed to match it, which demands a great deal of effort from teachers and test designers.

3.2. Test Validity

As Hughes (1992) states, a test is valid only when it corresponds to the language skills or structures that it is intended to measure. For example, when testing students’ knowledge of vocabulary they have just covered, the test should draw on what they have already been presented.
If the test includes vocabulary items on which students have not yet received instruction or explanation, it loses validity, since it fails to measure what it was designed to measure.

It would be a mistake to discuss language test validity without clarifying construct validity. According to Bachman and Palmer (1996, pp. 254-271), “the so called construct validity is subordinate to the sense and rationality of interpretation of the language test scores, which means this interpretation is the assessment of language skills of the subject”. Bachman holds that by interpreting the test score, we can not only assess the language ability of the subject but also judge the reasonableness of the language adopted in the test. For example, when the aim of the test is to evaluate students’ ability to use the passive voice, it is important that the test be designed to deal directly with this grammatical structure, so that the scores help us to assess our students’ language proficiency. If the test items include other structures, such as conditionals, the test will lack validity. From these ideas, it could be said that construct validity concerns the interpretation of scores, from which both students’ language proficiency and the test tasks can be evaluated.

3.2.1. Factors that Affect Test Validity

A series of factors that have negative effects on validity have been identified. Henning (1987), for example, has listed some of them. The first factor that affects test validity is a mismatch between a test and the construct it is intended to measure. Bachman also proposes that an invalid adaptation of tests is another detrimental factor. If, for instance, a test designed to assess the vocabulary level of first-year students is used with high school students, it is surely invalid.
However, the problem is clarified further only by McNamara (2000), who proposes two major factors: “irrelevant variance of validity” and “underrepresentation of validity”.

Irrelevant Variance of Validity

A test shows “irrelevant variance” if it is too broad and includes a number of variables that are irrelevant to the validity being interpreted. McNamara argues that this happens when the knowledge or skill is tested in a setting that is either outside the students’ experience or irrelevant to the content being tested. For example, in an oral test, candidates may be asked to discuss an abstract topic; if the topic does not interest them or is one they know little about, their performance is likely to appear less competent than when they are asked to speak on a more familiar topic at the same level of abstraction. In this case, the quality being tested, the ability to discuss an abstract topic in English, is confounded with the irrelevant requirement of having particular knowledge of a certain topic.

Underrepresentation of Validity

“Underrepresentation of validity” is the opposite of “irrelevant variance of validity”: the testing is insufficient, because the test is either too narrow in the knowledge it covers or fails to include important aspects of validity. In other words, as Fulcher (2010) states, the extent to which a test fails to measure the relevant knowledge is the degree to which it under-represents the validity that is supposed to be tested.

3.2.2. Methods of Improving Language Proficiency Test Validity

When discussing how to determine test validity, Henning (1987) indicates that there are two main ways to achieve it. One is the experimental method, in which data collection together with statistical formulas is applied to calculate validity. The other is through non-experimental methods.
This involves inspection, intuition and common sense. Since experimental methods require special training in statistics and the use of specialized computer programs for complex calculations, this paper focuses on non-experimental methods. Although, as many worry, a lack of experimental evidence may lead to a lack of objectivity, teachers can take a number of practical steps to improve the chances of upholding the validity of their tests. For example, a teacher who wants to evaluate students’ knowledge of grammar at the end of an elementary course needs to be aware of what knowledge of grammar at the elementary level consists of. He or she should then adopt test items that match what students have been exposed to during the course.

4. CONCLUSION AND IMPLICATIONS FOR TEACHERS

This paper has provided a basic understanding of the English proficiency test, including its definition and the qualities such a test requires. Of these qualities, “reliability” and “validity” were chosen for discussion, together with the factors that affect them and the methods used to improve them. The paper is written in the hope of providing what is fundamental in designing and developing English proficiency tests. Without this foundation, students will face considerable challenges in the process of learning English, since teachers will be unable to give them objective feedback on their progress. This lack of knowledge in turn has a negative effect on teachers as well: they will not be able to address their students’ weaknesses or to promote their strengths. For these reasons, it is important that teachers train themselves in matters relevant to assessment and testing.
Also, our educational institutions should start offering courses in test design and development together with other courses in English language teaching methodology.

References:

Bachman, L. F. (1990). Fundamental Considerations in Language Testing. Oxford: Oxford University Press.
Bachman, L. F., & Palmer, A. S. (1996). Language Testing in Practice: Designing and Developing Useful Language Tests. Oxford: Oxford University Press.
Brown, J. D. (1996). Testing in Language Programs. New Jersey: Prentice Hall Regents.
Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. White Plains, New York: Pearson Education.
Fulcher, G. (2010). Practical Language Testing. London: Hodder Education.
Henning, G. (1987). A Guide to Language Testing: Development, Evaluation, Research. Massachusetts: Heinle & Heinle.
Hughes, A. (1992). Testing for Language Teachers. Cambridge: Cambridge University Press.
Hughes, A. (2003). Testing for Language Teachers (2nd ed.). Cambridge: Cambridge University Press.
McNamara, T. F. (2000). Communication and design of language tests. In H. G. Widdowson (Ed.), Language Testing (pp. 13-22). Oxford, England: Oxford University Press.

A REVIEW OF ENGLISH PROFICIENCY TEST: RELIABILITY, VALIDITY, AND IMPLICATIONS FOR TEACHERS

NGUYEN MANH TUAN

Abstract: Testing is an indispensable component of foreign language programs in general and of English programs in particular. In this context, concerns about reliability and validity are important. Teachers with practically no training in test development often depend mostly on their own intuition, their previous experience and textbooks. For these reasons, this article raises and discusses problems of test design and development in English programs.
Keywords: English proficiency test, English program, reliability, validity

Received: 24/4/2018; Revised: 22/5/2018; Accepted for publication: 20/6/2018
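
As a supplementary illustration of the test-retest method discussed above, the comparison of the two columns of results can be sketched in a few lines of Python. The article does not prescribe a particular statistic for this comparison, so the use of a Pearson correlation together with the mean score change is an assumption made here, and the function names and the scores are hypothetical rather than taken from any real class.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2 students)")
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    var_first = sum((a - mean_first) ** 2 for a in first)
    var_second = sum((b - mean_second) ** 2 for b in second)
    if var_first == 0 or var_second == 0:
        raise ValueError("scores in one column do not vary; correlation is undefined")
    return cov / sqrt(var_first * var_second)


def mean_score_change(first, second):
    """Average per-student change from the first to the second sitting."""
    return sum(b - a for a, b in zip(first, second)) / len(first)


# Hypothetical scores (out of 10) for five students; not data from the article.
first_sitting = [6.0, 7.5, 5.0, 8.0, 9.0]
second_sitting = [6.5, 7.0, 5.5, 8.0, 9.5]

r = pearson_r(first_sitting, second_sitting)
change = mean_score_change(first_sitting, second_sitting)
print(f"correlation between sittings: {r:.2f}")
print(f"mean change per student: {change:+.2f}")
```

A correlation close to 1 together with a small mean change would be consistent with the "no significant difference" outcome described for the test-retest method. The same comparison could be applied to the two versions used in the parallel test method, but what counts as close enough is left to the teacher's judgment.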