Implicating the integrated format on reading test assessment: An evaluation of relevant factors

Trinh Ngoc Thanh* 
HCMC University of Technology and Education 
Received: 27/02/2020; Revised: 18/03/2020; Accepted: 28/04/2020 
Abstract: The present study evaluates factors relevant to assessing reading performance in the context
of English classroom teaching. In the traditional split format, the reading text and the test items are
presented separately; in the integrated format, each test item is placed next to the relevant part of the
reading text. This study compares reading performance across the following variables: test formats, study
subject, learning task, and pre-proficiency level. First, the findings indicated an influence of test formats
on reading performance: participants in the integrated format performed better than those in the split
format. Second, the findings showed a significant effect of the interaction between test formats and
learning task, as well as of the interaction between pre-proficiency level and study subject, on reading test
performance. Third, in combination with test formats, task design also had an influence on reading test
performance.
Keywords: Integrated format, reading assessment, reading performance, split-attention 
1. Introduction 
 The present study reconsiders the extent to which test formats contribute to reducing cognitive
load in language learning. On this basis, the purpose of the study is to evaluate relevant factors
in reading test assessment. More specifically, the study examines study subject, learning task, and
pre-proficiency level under the manipulation of the integrated test format. Given these factors, the
three research questions of the present study are as follows:
 1. What are the combined effects of test formats, study subject, and learning task on test score
performance?
 2. What are the combined effects of pre-proficiency level, test formats, study subjects, and learning
task on test score performance?
 3. What is the single effect of learning task and of test formats on task performance?
2. Theoretical framework 
This section reviews major points of cognitive load theory and its application in learning. Depending
on the cognitive demands of the learning task, element interactivity can be classified as low or high,
where the informational complexity determines how much effort learners must invest in learning the
material (Paas, Renkl, & Sweller, 2004). In a further elaboration by Sweller (1994), low element
interactivity applies when elements can be learned without reference to other elements in the structure of
the learning task, whereas high element interactivity arises when connected elements must be learned
simultaneously and thus requires a higher order of organizing abstract elements for the learning task to
succeed.
A crucial concept in explaining the role of low- and high-element interactivity in the success of
the learning task is cognitive load. Schnotz and Kürschner (2007) described a general division of
cognitive load into intrinsic and extraneous cognitive load. Intrinsic cognitive load is inherent in the
learning material itself and thus draws on the capacity of working memory during the acquisition of
knowledge from different sources of information. Meanwhile, the type of instruction given for the
learning task also plays a role in knowledge acquisition; when information is delivered excessively in the
learning task, the excess information in turn becomes extraneous, or ineffective, cognitive load (Paas et
al., 2004).
 Therefore, to eliminate the split-attention effect in the processing of learning material, worked
examples are often hypothesized to initiate the mental integration of relevant referents only when the
meaningful learning information is embedded in the representation of knowledge (Chandler & Sweller,
1991). Another hypothesis related to the split-attention effect is that physical integration in the
integrated format can prevent the effect from emerging during interaction with the learning material; in
the integrated format, material designers should closely align the two different sources of information
with each other (Sweller, Ayres, & Kalyuga, 2011).
There are three major arguments regarding cognitive load theory. First, according to Sweller (1994),
the major concern of cognitive load theory is the reduction of extraneous cognitive load in the design of
learning tasks. Second, the reduction of extraneous cognitive load in turn reflects the concern that
learners have certain constraints in their cognitive system, and thus instruction should be appropriately
adapted to close the gap between learners' cognitive levels and the instructional design (Schnotz &
Kürschner, 2007). Third, an implication of cognitive load theory for teaching and learning concerns
resolving complex learning tasks by simplifying information elements so that the cognitive load from
long-term memory is sufficient for information retrieval and knowledge storage.
3. Methods 
3.1. Reading task design 
The present study follows the reading task design in Huynh’s (2015) study, in which the reading
assessment took place in a classroom context and the test was presented in either the split or the
integrated format. In the split format, the reading text was separated from the set of reading questions;
in the integrated format, the reading questions were inserted next to the relevant parts of the text.
For the reading text, a free-access online article entitled “Robin Hood: Fact or Fiction” from the
publisher Linguapress was chosen. With a length of approximately 530 words, the text was suitable for a
20-minute classroom reading assessment mini-test. Details of the three reading tasks of the mini-test are
given in Table 1.
3.2. Participants 
The present study collected data from 84 English-major students at tertiary level across four study
subjects: Reading 1 (n=17), Reading 2 (n=22), Reading 4 (n=21), and British-American Civilization (n=24).
While the Reading courses focused on developing reading skills for students in Year 1 and Year 2,
British-American Civilization was a specialized course on historical, political, and cultural knowledge
about the United Kingdom and the United States. According to pre-test scores and scores from previous
courses, the English proficiency level of participants ranged from pre-intermediate to upper-intermediate.
It is noted that the pre-test score (from the pilot test) applied only to participants in Reading 1
because there was no previous marking record of reading courses for this group. Furthermore, among the
four groups, Reading 1 and British-American Civilization formed the experimental group, which was
assigned a recording exercise with a set of comprehension questions regarding the content of the reading
text. Due to the course arrangement over two semesters, the data were collected from November 2015 to
June 2016.
Table 1. Criterion-referenced design of the classroom reading mini-test
Task | PET reading test structure | Classroom reading mini-test (IELTS test format) | Description
Task 1 | Multiple choice questions (part 4) | Multiple choice questions | This task involves the selection of the correct answer among four choices (A, B, C, or D). The classroom reading assessment test similarly adapts test item format in PET and IELTS reading tests.
Task 2 | True-False (part 3) | True-False-Not Given | This task involves the main activity of validating the accuracy of the given statements. PET reading test only requires the performance of reading the given statement and defining True if the statement agrees with the information and False if the statement contradicts the information. Adapting IELTS test format in the classroom reading assessment test, the value Not Given is defined when there is no information derived from the text on this statement.
Task 3 | Multiple choice cloze test (part 5) | Fill in the blank with no more than two words | This task requires test-takers to fill in the gap with the appropriate word or group of words. The multiple choice cloze test in PET requires test takers to choose the right answer among four choices (A, B, C, or D) on vocabulary and grammar items. In the adaptation of IELTS test format, the answer is limited within the length of two words and the gap-filled words should be appropriate and grammatical.
3.3. Description of variables 
The raw data were recorded in Excel 2016 and converted into a CSV file before being processed in
R (version 3.6.1). Table 2 below describes the variables in the CSV file:
Table 2. Description of variables in the study
Variables | Description
class | the study subjects: Reading 1 (read1), Reading 2 (read2), Reading 4 (read4), British-American Civilization (civil)
class_numeric | the study subjects transferred into numeric value: 1=read1; 2=read2; 3=read4; and 4=civil
id | the last one or two digits of student identification number
ver | test formats (0=split; 1=integrated)
score | the total score of the mini-test (from 0 to 12)
pre-score | the score (converted to the 10-point scale) indicating the English proficiency level of participants from PET mock test (for participants in Reading 1) and final grade of previous courses of reading skill (the remaining study subjects)
condition | the unavailability (0=control) and availability (1=experimental) of the recording exercise before participants took part in the class mini-test in split and integrated format.
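The coding scheme in Table 2 can be illustrated with a minimal sketch. It is written in Python rather than the R software used in the study, the two sample rows are invented purely for illustration, and the helper names (`decode_rows`, the label dictionaries) are assumptions; only the column names and the numeric codes come from Table 2.

```python
import csv
import io

# Codes taken from Table 2; the dictionaries themselves are illustrative helpers.
FORMAT_LABELS = {"0": "split", "1": "integrated"}
CONDITION_LABELS = {"0": "control", "1": "experimental"}
CLASS_LABELS = {"1": "read1", "2": "read2", "3": "read4", "4": "civil"}

# Hypothetical rows, NOT data from the study.
SAMPLE_CSV = """class,class_numeric,id,ver,score,pre-score,condition
read1,1,7,0,6,6.5,1
civil,4,12,1,8,7.0,1
"""


def decode_rows(text):
    """Parse CSV text and translate the numeric codes into readable labels."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "class": CLASS_LABELS[raw["class_numeric"]],
            "id": raw["id"],
            "format": FORMAT_LABELS[raw["ver"]],
            "condition": CONDITION_LABELS[raw["condition"]],
            "score": int(raw["score"]),            # mini-test total, 0 to 12
            "pre_score": float(raw["pre-score"]),  # 10-point scale
        })
    return rows


for row in decode_rows(SAMPLE_CSV):
    print(row)
```

Keeping the codes in explicit dictionaries means an unexpected value (for example, a `ver` other than 0 or 1) raises an error instead of passing silently into the analysis.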
3.4. Quantitative analysis 
 With regard to these variables, the present study first analyzed the combined effects of relevant
factors on test score performance using analysis of variance (ANOVA). More specifically, three-way
ANOVAs were applied to analyze (1) the combined effects of test formats, study subject, and learning
task on test score performance and (2) the combined effects of pre-proficiency level, test formats, study
subjects, and learning task on test score performance, the latter in two models. Second, Welch's
two-sample t-test was applied to compare mean test scores across the levels of the factors that showed
significant effects in the ANOVAs.
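As a rough illustration of the second step, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom for two independent groups. It uses Python's standard library instead of the R software used in the study, the function name `welch_t` is an assumption, and the two score lists are hypothetical values, not data from the study. The p-value is omitted because the standard library does not provide the t distribution.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean; variance() divides by n - 1.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical scores, NOT data from the study.
split_scores = [5, 6, 7, 6]
integrated_scores = [7, 8, 9, 8, 8]

t_value, df_value = welch_t(split_scores, integrated_scores)
print(f"t = {t_value:.3f}, df = {df_value:.2f}")
```

A negative t here means the first group scored lower on average than the second. Because the test does not assume equal variances, the degrees of freedom are generally not whole numbers, which is why values such as df=77.9 appear in Tables 7 and 8.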
4. Results 
4.1. Descriptive statistics of test score performance 
 The following descriptive statistics first summarize the number of participants allocated to each test
format and study subject, followed by the mean scores and their corresponding standard deviations (SD).
Overall, the mean scores indicate that participants in the integrated format scored higher than those
in the split format:
Table 3. Descriptive statistics of test score performance
Test format x Study subjects (number of participants)
                  civil  read1  read2  read4
0 (split)            13      9     13     10
1 (integrated)       11      8      9     11
Mean and SD
     Test format  Subject  Mean    SD
1              0    civil  5.53  1.19
2              1    civil  7.63  1.28
3              0    read1  5.88  1.36
4              1    read1  7.12  1.35
5              0    read2  6.15  1.57
6              1    read2  6.66  1.11
7              0    read4  5.90  1.79
8              1    read4  6.09  2.02
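The overall difference between the two formats can be checked from Table 3 by weighting each group mean by its group size. The sketch below is only an illustration in Python (the study itself used R); the function name `weighted_mean` is an assumption, and the inputs are the participant counts and means reported in Table 3.

```python
# (n, mean) per study subject, copied from Table 3.
SPLIT = {"civil": (13, 5.53), "read1": (9, 5.88), "read2": (13, 6.15), "read4": (10, 5.90)}
INTEGRATED = {"civil": (11, 7.63), "read1": (8, 7.12), "read2": (9, 6.66), "read4": (11, 6.09)}


def weighted_mean(groups):
    """Combine group means into one overall mean, weighting each by its group size."""
    total_n = sum(n for n, _ in groups.values())
    total_score = sum(n * m for n, m in groups.values())
    return total_score / total_n


print(f"split: {weighted_mean(SPLIT):.2f}")
print(f"integrated: {weighted_mean(INTEGRATED):.2f}")
```

With the values in Table 3 this gives roughly 5.86 for the split format (n=45) and 6.87 for the integrated format (n=39). Because the group means in Table 3 are already rounded, these figures are approximate.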
4.2. Analyzing the combined effects of test formats, study subject, and learning task on test score
performance
A three-way between-subjects 2x4x2 ANOVA was conducted to measure the effects of test
formats (integrated and split formats; var. name: ver), study subjects (civil, read1, read2, read4; var. name:
class_numeric), and learning task (with and without recording exercise; var. name: condition) on test score
performance. There was a significant effect of test formats on test score performance at the 0.01 level
(F(1, 76)=9.38, p=0.003). The interaction between test formats and learning task was also statistically
significant at the 0.05 level (F(1, 76)=4.30, p=0.04).
Table 4. Analysis of Variance for test formats, study subject, learning task, and test score performance
Response: mydata$test_score
                                                 Df   Sum Sq  Mean Sq  F value    Pr(>F)
mydata$ver                                        1   21.108  21.1077   9.3789    0.0030 **
mydata$class_numeric                              1    0.047   0.0473   0.0210  0.885074
mydata$condition                                  1    2.037   2.0373   0.9052  0.344399
mydata$ver:mydata$class_numeric                   1    2.203   2.2034   0.9790  0.325574
mydata$ver:mydata$condition                       1    9.671   9.6708   4.2971  0.041568 *
mydata$class_numeric:mydata$condition             1    1.683   1.6831   0.7479  0.389875
mydata$ver:mydata$class_numeric:mydata$condition  1    0.876   0.8756   0.3891  0.534665
Residuals                                        76  171.042   2.2505
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
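The F values in Table 4 follow from dividing the mean square of each term by the residual mean square. The short sketch below recomputes the two significant F values from the table entries; it is an arithmetic check written in Python, not the R procedure used in the study, and the function name `f_ratio` is an assumption.

```python
# Residual sum of squares and degrees of freedom, copied from Table 4.
RESIDUAL_SS, RESIDUAL_DF = 171.042, 76

# Mean squares of the two significant terms, copied from Table 4.
EFFECT_MEAN_SQ = {
    "ver": 21.1077,
    "ver:condition": 9.6708,
}


def f_ratio(effect_mean_sq, residual_ss, residual_df):
    """F = mean square of the effect / mean square of the residuals."""
    return effect_mean_sq / (residual_ss / residual_df)


for term, mean_sq in EFFECT_MEAN_SQ.items():
    print(f"{term}: F = {f_ratio(mean_sq, RESIDUAL_SS, RESIDUAL_DF):.4f}")
```

The recomputed ratios agree with the F value column of Table 4 (9.3789 for test formats and 4.2971 for the interaction with learning task) up to rounding in the tabulated mean squares.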
4.3. Analyzing the combined effects of pre-proficiency level, test formats, study subjects, and learning 
task on test score performance 
The follow-up analysis examined the combined effects of four factors on test score performance (var.
name: test_score) for the three groups of participants from Reading 2 (read2), Reading 4 (read4), and
British-American Civilization (civil), and was divided into two models. The first model involves the
combined effects of pre-proficiency level (in the form of pre-test scores; var. name: pre_score), learning
task (with and without recording exercise; var. name: condition), and test formats (integrated and split
formats; var. name: ver).
A three-way ANOVA was conducted to measure the effects of the three factors on test score
performance in the first model. The results indicated no significant effect of pre-proficiency level
(F(1, 59)=0.74, p>0.05) and no significant interaction of pre-proficiency level with learning task
(F(1, 59)=0.35, p>0.05) or with test formats (F(1, 59)=0.02, p>0.05). Apart from the significant effect of
test formats on test score performance (F(1, 59)=5.71, p=0.02), the interaction between learning task
and test formats was also significant at the 0.05 level (F(1, 59)=4.95, p=0.03).
Table 5. Analysis of Variance Table of Model 1
Response: mydata$test_score
                                             Df   Sum Sq  Mean Sq  F value    Pr(>F)
mydata$pre_score                              1    1.806   1.8060   0.7397   0.39322
mydata$condition                              1    1.003   1.0033   0.4110   0.52396
mydata$ver                                    1   13.935  13.9348   5.7078   0.02011 *
mydata$pre_score:mydata$condition             1    0.854   0.8538   0.3497   0.55654
mydata$pre_score:mydata$ver                   1    0.050   0.0504   0.0206   0.88627
mydata$condition:mydata$ver                   1   12.082  12.0820   4.9488   0.02995 *
mydata$pre_score:mydata$condition:mydata$ver  1    0.258   0.2582   0.1058   0.74617
Residuals                                    59  144.041   2.4414
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The second model involves the combined effects of pre-proficiency level (in the form of pre-test
scores; var. name: pre_score), learning task (with and without recording exercise; var. name: condition),
and study subjects (civil, read2, read4; var. name: class_numeric). The second three-way ANOVA
indicated that the effect of pre-proficiency level on test score performance was not significant at the
0.05 level (F(1, 61)=0.70, p>0.05). Meanwhile, the interaction of pre-proficiency level with study
subjects was significant at the 0.05 level (F(1, 61)=4.29, p=0.04).
Table 6. Analysis of Variance Table of Model 2
Response: mydata$test_score
                                      Df   Sum Sq  Mean Sq  F value    Pr(>F)
mydata$pre_score                       1    1.806   1.8060   0.6984   0.40658
mydata$condition                       1    1.003   1.0033   0.3880   0.53568
mydata$class_numeric                   1    1.172   1.1717   0.4531   0.50341
mydata$pre_score:mydata$condition      1    1.204   1.2041   0.4656   0.49759
mydata$pre_score:mydata$class_numeric  1   11.105  11.1047   4.2943   0.04247 *
Residuals                             61  157.740   2.5859
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
4.4. Analyzing the single effect of learning task and of test formats on task performance 
 The preceding analyses showed a significant interaction between learning task and test formats
on test score performance. First, the single effect of learning task on task performance was analyzed by
comparing task scores between two conditions: with the recording exercise (experimental; n=41;
class_numeric: read1 and civil) and without the recording exercise (control; n=43; class_numeric: read2
and read4). Task performance was analyzed using Welch's two-sample t-test in two modes: single-task
and dual-task performance. The results of this first analysis indicated no significant effect of learning
task on task performance at the 0.05 level, even though participants in the experimental group scored
higher than those in the control group (except on Task 1 and on the pair of Tasks 1 and 3).
Table 7. The single effect of learning task on task performance
              M control  M experimental        t      df  p-value
Task 1             2.13            2.09    0.206  79.113     0.83
Task 2            1.697           1.878  -0.7675  81.871    0.445
Task 3            2.348           2.512    -0.85   81.04    0.397
Task 1 + 2        3.837           3.975   -0.436   81.49    0.663
Task 1 + 3         4.48            4.39     0.38    81.7     0.70
Task 2 + 3         4.04            4.60    -1.91  81.802    0.059
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 Second, the single effect of test formats on task performance was analyzed by comparing task scores
between two formats: split (n=45) and integrated (n=39). In the same analysis using Welch's two-sample
t-test, participants in the integrated format scored higher than those in the split format on every task.
However, a significant effect of test formats at the 0.05 level was found only for Task 1 (t=-2.51,
df=77.9, p=0.01), the dual Tasks 1 and 2 (t=-2.94, df=76.99, p=0.004), and the dual Tasks 1 and 3
(t=-2.64, df=75.89, p=0.01).
Table 8. The single effect of test formats on task performance
              M split  M integrated       t      df  p-value
Task 1           1.88          2.38   -2.51    77.9     0.01 *
Task 2            1.6           2.0   -1.72   80.77     0.08
Task 3           2.37          2.48  -0.559  76.597    0.577
Task 1 + 2       3.48          4.38   -2.94   76.99    0.004 *
Task 1 + 3       4.13          4.79  -2.638   75.89     0.01 *
Task 2 + 3       4.11          4.56   -1.52  81.072     0.13
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
5. Discussion and implications 
The purpose of the present study is to investigate the effects of test formats, study subjects, learning
task, and pre-proficiency level on test score performance, both in combination and individually. The
findings first revealed that test formats have a significant effect on test score performance:
participants in the integrated format performed better than those in the split format. Although different
in research scope, Tindall-Ford et al. (2015), who investigated the influence of material design on
geometry test performance, similarly found that participants in the integrated format earned a higher
mean score than those in the split format. As in their study, one possible explanation is that learners
employed a self-management strategy in an effort to reduce the split-attention effect; in addition,
participants in the present study may also have understood how to respond to the test items by
integrating the questions with the relevant text.
Furthermore, the present study also showed that the