NESUG 2012 Posters

Using SAS/OR for Automated Test Assembly from IRT-Based Item Banks

Yung-chen Hsu, GED Testing Service, LLC, Washington, DC
Tsung-hsun Tsai, Research League, LLC, Matawan, NJ

ABSTRACT

In recent years, advanced development in psychometric theory and computer technology has led to dramatic changes in test construction practices. In testing organizations, the task of assembling test forms can be treated as a mathematical programming problem, in which an objective function is optimized subject to specific practical constraints, such as test length and content coverage. SAS/OR is a powerful modeling environment for solving mathematical optimization problems, and large or mid-sized testing organizations can benefit greatly from its solvers when handling real test assembly problems and needs. We used PROC OPTMODEL to present the programming formulation of a test assembly problem. The merit of PROC OPTMODEL is that models are declared in SAS statements in a form that mimics the symbolic formulation of an optimization model. A generic test assembly model with statistical constraints based on item response theory and content-related test specifications is shown as an example. The model can easily be modified to include additional practical constraints to obtain an optimal solution. Testing organizations can use this approach to create test forms automatically if relatively well-developed item banks are available.

INTRODUCTION

A test assembly problem is to select a set of items from a large pool of pre-calibrated items, known as an item bank, based on the test specifications. An item bank is a repository of test items, essentially a database, which stores all information pertaining to the items, such as item format, item characteristics, and content domains. Optimization techniques seek optimal solutions under specific constraints of psychometric and test composition specifications within a given item bank.
These techniques have been a major advancement in helping testing organizations achieve automated test assembly. The early automated test assembly method using a mathematical programming approach was not based on psychometric test theory (Feuerman and Weiss, 1973). Owing to advances in psychometric theory and computer technology, current test construction practices that use mathematical programming techniques are generally based on item response theory (IRT). IRT is a modern psychometric test theory that describes the relationship between item characteristics and test taker abilities. The three-parameter logistic model (3PL) is one of the widely used unidimensional IRT models for dichotomous responses in various large-scale testing programs. The model can be expressed as

    P_i(θ; a_i, b_i, c_i) = c_i + (1 - c_i) / (1 + exp[-D a_i (θ - b_i)])

representing the probability of answering a particular dichotomously scored item correctly given the proficiency level θ. The parameters a_i, b_i, and c_i are the characteristics of item i, and the common choice of the scaling constant D is 1.7. Generally, the item parameters can be estimated by using PROC NLMIXED (Sheu, Chen, Su, & Wang, 2005). The item information function, which is derived from Fisher information (Lord, 1980; Suen, 1990; van der Linden & Boekkooi-Timminga, 1989), is defined as

    I_i(θ) = [P_i'(θ)]^2 / (P_i(θ) Q_i(θ))

where Q_i(θ) = 1 - P_i(θ) and P_i'(θ) = ∂P_i(θ)/∂θ. The test information of a test consisting of n items is defined as

    I(θ) = Σ_{i=1}^{n} I_i(θ).

MATHEMATICAL APPROACHES TO CONSTRUCTING TEST FORMS

Theunissen (1985) first presented a binary integer programming approach to construct a test with a target information function. The objective of the model is to minimize the number of items in the test subject to the constraint that the test information is above a pre-specified target at a number of ability levels. Several practical constraints were later incorporated into this modeling approach (Theunissen, 1986; Baker, Cohen, & Barmish, 1988; de Gruijter, 1990). A different perspective, the so-called Maximin model, which specifies a relative target information function, was formulated by van der Linden & Boekkooi-Timminga (1989) for selecting items from an item bank. The model can be interpreted as specifying the relative shape of the target information function at certain ability points. The automated test assembly problem can then be treated as an optimization that matches a target test information function subject to content coverage. Specifying an absolute target test information function may not be easy in practice if no reference is available. The Maximin model, which targets a given shape of the IRT test information function, is therefore especially useful for new testing programs: the nature of the test can be specified more easily than exact target information values can be assigned.
In fact, the Maximin formulation can be treated as a special case of the goal programming approach (Hsu, 1993), a branch of multi-objective optimization. Most variations developed later for solving practical test assembly issues (e.g., Boekkooi-Timminga, 1990; Hsu, 1993; van der Linden & Adema, 1998) are generally based on the Maximin model for optimal test construction.

Let r_k be the relative information value at ability point θ_k, and let the items in the item bank be represented by decision variables x_i, i = 1, …, N, denoting whether item i is included in the test form (x_i = 1) or not (x_i = 0). The model is formulated as

    maximize    y
    subject to  Σ_{i=1}^{N} I_i(θ_k) x_i ≥ r_k y,   k = 1, …, K
                Σ_{i=1}^{N} x_i = n
                x_i ∈ {0, 1},   i = 1, …, N
                y ≥ 0

where I_i(θ_k) denotes the information function of item i at ability point θ_k and n is the test length. As an example, the second constraint fixes the number of items in the test. The model allows additional practical constraints, such as test composition (e.g., cognitive levels, mutually exclusive items) and administration time, to be taken into account and specified in the model.

AN EXAMPLE TEST ASSEMBLY PROBLEM

We simulated a set of item parameters to create an item bank that has four content domains, each containing 50 items. For simplicity, we set the parameter c to zero. In practice, the c parameter has very small variation because items with large values of c will not be included in the item bank. Generally, parameter c has little impact on test assembly.

ITEM INFORMATION

The Maximin model is used as an illustrative example. We computed the item information at 13 ability points (θ = -3, -2.5, -2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2, 2.5, 3).

libname nesugdat 'c:\nesug2012\dat';
%let D = 1.7;
%let D2 = %sysevalf(&D*&D);

data inf;
   set nesugdat.itembank;
   array ri{13} r1-r13;
   a2 = a*a;

   do i = 1 to 13;
      p = 1/(1 + exp(-&D*a*(((i-7)/2.0)-b)));
      q = 1-p;
      ri{i} = &D2*a2*p*q;
   end;
   keep r1-r13;
run;

proc transpose data=inf out=infc prefix=px; run;
data infc; set infc (drop=_name_); run;

PROBLEM FORMULATION IN PROC OPTMODEL

The task is to compose a classification test of 40 items divided into four equal sections, with items sequenced in positions 1-10, 11-20, 21-30, and 31-40 for the four domains, respectively. We assumed that the classification test has multiple cut points, which means the test information curve would have two peaks. The test assembly problem is formulated as

    maximize    y
    subject to  Σ_{i=1}^{200} I_i(θ_k) x_i ≥ r_k y,   k = 1, …, 13
                Σ_{i=1}^{50} x_i = 10
                Σ_{i=51}^{100} x_i = 10
                Σ_{i=101}^{150} x_i = 10
                Σ_{i=151}^{200} x_i = 10
                x_i ∈ {0, 1},   i = 1, …, 200
                y ≥ 0

The following code shows the use of PROC OPTMODEL for three test assembly tasks: (1) a selective test with a single cut-off point; (2) a classification test with multiple cut-off points; and (3) a diagnostic test with no cut-off point. The relative information values at the different ability points are in Table 1. We specified two cut-off points for the classification test.

TABLE 1. Relative information values at the 13 ability points (-3 to 3) for the selective, classification, and diagnostic tests.

SAS/OR provides a convenient modeling language within PROC OPTMODEL for formulating, solving, and maintaining optimization models. We start the problem statement with PROC OPTMODEL. Because we are dealing with a large number of variables, we use SET statements to group the numbers for indexing the variables. We then declare the decision variables of the model. The code uses the selective test as an example. Once the problem is solved, the value of each x[i] indicates whether item i is selected. The objective function is simple. The constraints relate the decision variables to the four domains. The SOLVE statement invokes an appropriate solver for this mixed integer linear programming problem. The results are stored for further processing.

proc optmodel;
   set theta = 1..13;
   set niBank = 1..200;
   num iInf{theta,niBank};
   read data infc into [j=_N_] {i in niBank} <iInf[j,i]=col("px"||i)>;
   num r1{theta} = [0.1 0.1 0.1 0.1 0.1 0.7 1.0 0.7 0.1 0.1 0.1 0.1 0.1];
   var x{niBank} BINARY, y;
   max obj = y;
   con tInf{j in theta}: sum{i in niBank}iInf[j,i]*x[i] >= r1[j]*y;
   con ca1: sum{i in 1..50}x[i] = 10;
   con ca2: sum{i in 51..100}x[i] = 10;
   con ca3: sum{i in 101..150}x[i] = 10;
   con ca4: sum{i in 151..200}x[i] = 10;
   solve with milp;
   create data nesugdat.testset3 from [id] sel=x;
quit;

Figure 1 shows the distribution of the values of log(a_i) and b_i for the items in the item bank. Figures 2, 3, and 4 show the test information for the three tests, respectively. The test information has one peak for the selective test and two peaks for the classification test. The diagnostic test has flat test information.

FIGURE 1. Scatter plot of log(a) and b.
FIGURE 2. Selective test.
FIGURE 3. Classification test.
FIGURE 4. Diagnostic test.
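The maximin objective solved by the SOLVE WITH MILP statement can also be made concrete outside SAS. The following Python sketch (all names hypothetical; a brute-force stand-in for an MILP solver, workable only for toy banks) enumerates the feasible selections of a tiny two-domain bank and maximizes the minimum ratio of test information to the relative target:

```python
import itertools
import math

D = 1.7

def item_info(theta, a, b):
    # With c = 0 the information reduces to D^2 * a^2 * P * Q,
    # the same expression computed in the DATA step above.
    p = 1 / (1 + math.exp(-D * a * (theta - b)))
    return D * D * a * a * p * (1 - p)

# Toy bank: two domains of four items each, as (a, b) pairs
domain1 = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.1, -1.5)]
domain2 = [(0.9, 1.0), (1.3, 0.5), (1.0, 0.0), (0.7, 1.5)]
thetas = [-1.0, 0.0, 1.0]   # ability points
r = [0.5, 1.0, 0.5]         # relative target information shape

def maximin_select(n_per_domain=2):
    """Brute-force analogue of the MILP: pick n items per domain,
    maximizing y = min_k [sum_i I_i(theta_k) / r_k]."""
    best_y, best_pick = -1.0, None
    for pick1 in itertools.combinations(domain1, n_per_domain):
        for pick2 in itertools.combinations(domain2, n_per_domain):
            items = pick1 + pick2
            y = min(sum(item_info(t, a, b) for a, b in items) / rk
                    for t, rk in zip(thetas, r))
            if y > best_y:
                best_y, best_pick = y, items
    return best_y, best_pick

y_best, picked = maximin_select()
print(len(picked), round(y_best, 3))
```

Exhaustive enumeration is shown only to make the objective concrete; for a realistic 200-item bank the search space is astronomically large, which is why a dedicated MILP solver is used in the SAS code.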

CONCLUSION

For testing organizations, the test assembly task is to solve a constrained combinatorial optimization problem. If relatively well-developed item banks are available, the problem involves a large number of variables. PROC OPTMODEL has a succinct way to read and create data sets. It provides a powerful modeling language to formulate and solve the optimization model, and it can interface with various optimization solvers to compute solutions to the formulated problems. We showed how to formulate a simple test assembly problem. The model can easily be inspected and modified to address a wide variety of test specifications.

REFERENCES

Theunissen, T. J. J. M. (1985). Binary programming and test design. Psychometrika, 50(4), 411-420.

Theunissen, T. J. J. M. (1986). Some applications of optimization algorithms in test design and adaptive testing. Applied Psychological Measurement, 10(4), 381-389.

De Gruijter, D. N. M. (1990). Test construction by means of linear programming. Applied Psychological Measurement, 14(2), 175-181.

Baker, F. B., Cohen, A. S., & Barmish, B. R. (1988). Item characteristics of tests constructed by linear programming. Applied Psychological Measurement, 12(2), 189-199.

Boekkooi-Timminga, E. (1990). A cluster-based method for test construction. Applied Psychological Measurement, 14(4), 341-353.

Feuerman, M., & Weiss, H. (1973). A mathematical programming model for test construction and scoring. Management Science, 19(8), 961-966.

Hsu, Y.-C. (1993). The goal programming approach for test construction (Master's thesis). The University of Arizona, Tucson, AZ.

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.

Sheu, C.-F., Chen, C.-T., Su, Y.-H., & Wang, W.-C. (2005). Using SAS PROC NLMIXED to fit item response theory models. Behavior Research Methods, 37(2), 202-218.

Suen, H. K. (1990). Principles of test theories. Hillsdale, NJ: Erlbaum.

Van der Linden, W. J., & Boekkooi-Timminga, E. (1989). A maximin model for test design with practical constraints.
Psychometrika, 54(2), 237-247.

Van der Linden, W. J., & Adema, J. J. (1998). Simultaneous assembly of multiple test forms. Journal of Educational Measurement, 35(3), 185-198.

ACKNOWLEDGMENTS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Yung-chen Hsu
GED Testing Service, LLC
1155 Connecticut Ave., NW, 4th Floor
Washington, DC 20036
Work Phone: 202-471-2214
Email: [email protected]
Web: www.gedtestingservice.com