However, many educational and psychological assessments are multidimensional by design, and a matching variable using total scores that does not reflect the test structure may not. A Course in Item Response Theory and Modeling with Stata, by Tenko Raykov and George A. Marcoulides, is a comprehensive introduction to the concepts of item response theory (IRT). Previous efforts extended the framework by using item response theory (IRT) based trait scores, and by employing an iterative process using group-specific item parameters to account for DIF in the trait scores, analogous to. It seems to be a bit limited in terms of DIF-related tests. Item response theory (IRT) models can be used to evaluate the relationships between the latent trait of interest and the items intended to measure the trait. As such, software that estimates two-parameter IRT models is required for testing. In other words, multiple-group IRT models allow us to evaluate differential item functioning (DIF). Test information function. Differential item functioning (DIF).
Stata's data management features give you complete control. At the conclusion of this course students will be able to import data from SPSS, SAS, Stata and Excel, install and operate Winsteps software, and analyze dichotomous and polytomous data. Differential item functioning (DIF) is when a test item favors or hinders a characteristic exhibited by group members of a test-taking population. Logistic regression modeling as a unitary framework for binary and Likert-type ordinal item scores. Assessment of differential item functioning in health-related. Item characteristic curves in the reference and focal groups and differential item functioning (DIF) effect size used in the simulation model for the four kinds of DIF. Measure of Activity Performance of the Hand (MAP-Hand). We present an ordinal logistic regression model for identi.
To enhance comprehension, the book employs real datasets in the examples and illustrates the software outputs in detail. Group 1 has more extreme discrimination coefficients than group 0. A Course in Item Response Theory and Modeling with Stata, by Tenko Raykov and George A. Marcoulides, is a comprehensive introduction to the concepts of item response theory (IRT) that includes numerous examples using Stata's powerful suite of IRT commands. difd detects DIF in test items between groups, conditional on the trait that the test is measuring, using logistic regression. In addition, the authors provide overviews of instrument construction and differential item functioning.
Several methods have been proposed in recent decades for identifying items that function differently between two or more groups of examinees. Differential item functioning analysis with ordinal logistic. DIF detection is a first step in assessing bias in test items. Identifies differential item functioning, creates dummy/virtual items to be used to adjust ability (trait) estimates, and calculates the ability estimates and standard errors. Mokken scale analysis of mental health and wellbeing. This article provides an applied example using SIBTEST statistical software to detect DIF in u.
Mar 01, 2011. Logistic regression provides a flexible framework for detecting various types of differential item functioning (DIF). Unlike NCDIF, CDIF is not based on the assumption that all other items in the measure are unbiased, but rather takes into account the covariance of the differences in the expected item scores for the given item. They will investigate test functioning and test equating with a goal of successfully analyzing data and communicating their findings to decision makers. Differential item functioning, Columbia University Mailman. If DIF is found for many items on the test, the final test scores do not represent the same. Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items.
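The logistic regression framework for DIF detection mentioned above can be sketched in a few lines. This is a minimal illustration in plain Python, not the syntax of any Stata module named here: it fits a model with the matching total score only, then a model that adds group membership, and compares them with a likelihood-ratio test for uniform DIF. The function names (`solve`, `fit_logit`, `uniform_dif_test`, `toy_data`) and the toy counts are assumptions made up for the example; adding a score-by-group interaction column would extend the same idea to non-uniform DIF.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_logit(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson; return (coefficients, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for j in range(k):
                grad[j] += (yi - p) * xi[j]
                for m in range(k):
                    hess[j][m] += p * (1.0 - p) * xi[j] * xi[m]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
        loglik += math.log(p) if yi == 1 else math.log(1.0 - p)
    return beta, loglik


def uniform_dif_test(item, total, group):
    """Likelihood-ratio test (1 df): does group predict the item after matching on total score?"""
    reduced = [[1.0, float(t)] for t in total]
    full = [[1.0, float(t), float(g)] for t, g in zip(total, group)]
    _, ll_reduced = fit_logit(reduced, item)
    beta, ll_full = fit_logit(full, item)
    lr = 2.0 * (ll_full - ll_reduced)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return beta[2], lr, p_value


def toy_data():
    """Hypothetical data: 10 examinees per (group, score) cell; group 1 succeeds less often."""
    item, total, group = [], [], []
    correct = {0: [2, 4, 6, 8], 1: [1, 2, 4, 6]}
    for g, counts in correct.items():
        for score, n_correct in enumerate(counts):
            for i in range(10):
                item.append(1 if i < n_correct else 0)
                total.append(score)
                group.append(g)
    return item, total, group


if __name__ == "__main__":
    coef, lr, p = uniform_dif_test(*toy_data())
    print("group coefficient:", coef, "LR statistic:", lr, "p-value:", p)
```

A negative group coefficient here means the focal group (coded 1) is less likely to endorse the item than the reference group at the same matching score. Real analyses would use a dedicated routine rather than this hand-rolled optimizer.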
New programs for differential item function (DIF) and. Item response theory models relationships between latent traits, such as mathematical ability, and the. Other application features for differential item function analysis, scoring, and the package generating. Stata seems to produce very nice item characteristic curves, item information curves, and test information curves.
A handbook on the theory and methods of differential item functioning (DIF). Stata is a complete, integrated software package that provides all your data science needs: data manipulation, visualization, statistics, and reproducible reporting. Stata is a complete, integrated statistical software package that provides everything you need for data science. Title: dif, Introduction to differential item functioning. Description, Remarks and examples, References, Also see. Description: Differential item functioning (DIF) occurs when items that are intended to measure a latent trait are unfair, favoring one group of individuals over another. Multiple ways to detect differential item functioning in SAS. Differential item functioning detection and effect. Starting from a framework for classifying DIF detection methods and from a comparative overview of the most traditional methods, an R.
You can read more about IRT and about Stata's new IRT features, and see several worked examples, in the Stata Item Response Theory Reference Manual. The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. This article compares several procedures in their efficacy for detecting differential item functioning (DIF). Developed in Norway, the Measure of Activity Performance of the Hand (MAP-Hand) assesses 18 activities performed using the hands. Detection of and adjustment for differential item functioning (DIF). A Course in Item Response Theory and Modeling with Stata is an outstanding text.
Differential item functioning procedures for polytomous items. A general framework and an R package for the detection of. Evaluates test items for differential item functioning (DIF). In brief, differential item functioning (DIF) occurs when groups (such as those defined by gender, ethnicity, age, or education) have different probabilities of endorsing a given item on a multi-item scale after controlling for overall scale scores. The software includes an environment to use interactive visual displays of data along with various data exploratory tools, computation of reliability using Cronbach's alpha, stratified alpha, and Feldt-Raju along with jackknife variances, classical item analysis of test items, IRT-based fit statistics including item fit plots, Yen's Q1 and Q3. They also describe estimation methods and diagnostics, including graphical diagnostic tools, parametric and nonparametric tests, and differential item functioning. Proceedings of SAS Global Forum 2015 Conference, Dallas, TX, Apr 26-29. That is, behind the scenes, diflogistic is calculating the sum score to your items as an estimate of theta. Gibbons, PhD, Lance Jolley, MS, and Gerald van Belle, PhD. Introduction. Differential item functioning between ethnic groups in the epidemiological assessment of depression. IRT model-based test of differential item functioning. The cutpoints for items 1-4 are the same, but that item is of higher difficulty for group 1.
Sep 01, 2017. This is the webpage for the handbook on differential item functioning. DIF analyses are statistical procedures used to determine to what extent the content of an item affects the item endorsement of subgroups of test-takers. Differential item functioning (DIF) is an important issue of interest in psychometrics and educational measurement. The compensatory differential item functioning (CDIF) and differential test functioning (DTF) indexes are also a product of DFIT. Item response theory, Columbia University Mailman School of. Measurement invariance and differential item functioning. Differential item functioning (DIF) occurs when items that are intended to measure a trait are unfair, favoring one group of individuals over another. Psychometric software is software that is used for psychometric analysis of data from tests. A Unified Approach Based on R and Stata presents special statistical methods for analyzing data collected by questionnaires.
As part of test score validity, differential item functioning (DIF) is a quantitative characteristic used to evaluate potential item bias. Keywords: item response theory, Stata, item analysis, scoring. This module should be installed from within Stata by typing ssc install difdetect. Stata module to evaluate test items for differential item functioning (DIF). A Course in Item Response Theory and Modeling with Stata, paperback, January. Detecting differential item functioning using generalized. Applying logistic regression to detect differential item. Current methods include classical item analysis, differential item functioning (DIF) analysis. With this random seed, the discrimination parameters for items 3-5 differ materially, but they aren't materially different for items 1 and 2.
IRTLRDIF is software that computes likelihood ratio tests of DIF for the 3PL and graded IRT models. X fits an item response model when x are item scores e. Stata module to detect and adjust for differential item functioning (DIF), Statistical Software Components S4580, Boston College Department of. Relatively fewer studies examined an item-level approach to measurement equivalence, particularly in settings where a large number of groups is included. Investigating test functioning; anchoring (fixing) abilities, difficulties and rating-scale structures; subtotals; differential item functioning (DIF) analysis; dimensionality; Tutorial/Further 4. We provide a tutorial on differential item functioning (DIF) analysis. Differential item functioning is a statistical characteristic of an item that shows the extent to which the item might be measuring different abilities for members of separate subgroups. A variety of statistical procedures have been developed to assess DIF in tests of dichotomous (Hills, 1989).
ERIC EJ690033: Differential item functioning detection and. General Internal Medicine, University of Washington. Statistical Software Components from Boston College Department of Economics. How can I calculate item bias for a scale in SPSS? Dear Mahmoud, there are several available options for estimating item bias (or, more adequately termed, differential item functioning) using SPSS if the items are dichotomous. Sampling weights with IRT and differential item functioning. Measuring differential item and test functioning across. The book takes an applied approach to testing and measurement tasks, mirroring the growing use of statistical methods and software in education, psychology, sociology, and other fields.
Software for the computation of the statistics involved in item response theory likelihood-ratio tests for differential item functioning. Differential item functioning analysis with ordinal logistic regression techniques: difdetect and difwithpar. Paul K. Recommendations for conducting differential item functioning. With IRT, we can also determine how the instrument as a whole relates to the latent trait. Jul 31, 2018. Finally, within the Rasch model framework, emphasis is placed upon the invariance of comparisons between groups, such that at the same level of the trait being measured e.g.
To fit multiple-group IRT models in Stata, we simply add the group option to the irt command. Differential item functioning (DIF) with IRT GRM, Statalist. Average item scores for subgroups having the same overall score on the test are compared to determine whether the item is measuring in essentially the same way for all subgroups. Dec 10, 2019. Helena, you can home brew diflogistic, as I describe in the last post on this thread. Differential item functioning (DIF) in composite health.
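The matched-score comparison described above (average item scores for subgroups with the same overall test score) can be sketched as follows. This is an illustrative plain-Python helper, not part of any package mentioned here; the function name `mean_item_score_by_total` and the tiny response matrix are assumptions made up for the example.

```python
from collections import defaultdict


def mean_item_score_by_total(responses, groups, item):
    """Average score on one item, split by total test score and by group.

    responses: list of per-person lists of item scores (0/1 here)
    groups:    list of group labels, one per person
    item:      index of the studied item
    Returns {total_score: {group_label: mean item score}}.
    """
    cells = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [sum, count]
    for row, g in zip(responses, groups):
        cell = cells[sum(row)][g]
        cell[0] += row[item]
        cell[1] += 1
    return {
        total: {g: s / n for g, (s, n) in by_group.items()}
        for total, by_group in cells.items()
    }


if __name__ == "__main__":
    # Hypothetical data: four examinees, three items, all with total score 2.
    responses = [[1, 0, 1], [0, 1, 1], [0, 1, 1], [0, 1, 1]]
    groups = ["ref", "ref", "focal", "focal"]
    print(mean_item_score_by_total(responses, groups, item=0))
```

If the item measured in the same way for all subgroups, the group means within each total-score stratum would be close to one another. Consistent gaps across strata are the pattern that formal DIF tests try to quantify.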
Differential item functioning procedures for polytomous items when examinee sample sizes are small by. Conventional differential item functioning (DIF) approaches such as logistic regression (LR) often assume unidimensionality of a scale and match participants in the reference and focal groups based on total scores. If the factor bringing about such a difference is not part of the construct of focus in the test, then the test would be biased. Stata module for detection of and adjustment for differential item functioning (DIF), Statistical Software Components S456722, Boston College Department of Economics, revised May 2015. Identifies differential item functioning, creates dummy/virtual items to be used to adjust ability (trait) estimates in PARSCALE, writes the code and data file needed to process the updated item list in PARSCALE, runs PARSCALE, and merges the ability estimates and standard errors into the original data set. A comparison of item response theory-based methods for.