EstCRM: An R package for Samejima's Continuous IRT Model.
http://apm.sagepub.com/content/current
EstCRM: An R Package for Samejima’s Continuous IRT Model, Applied Psychological Measurement, March 2012, 36, 149-150
The Continuous Response Model (CRM) is an IRT model developed for continuous outcomes. The CRM is not commonly used in practice, although it is as old as the other well-known and popular binary and polytomous IRT models. This may be due to the lack of accessible software to estimate the model parameters. The R package EstCRM was developed to estimate the model parameters for the CRM.
Educational Evaluation in the Sudan
http://www.lulu.com/product/hardcover/assessment-of-the-sudan-school-c
Education in the Sudan today has to witness radical changes in the entire educational system and in the establishment of new educational objectives, which are to be carried out to secure the nation's aspiration for an active and effective education system. Many educational conferences are expected to take place to discuss educational issues. This part of my book may add some historical perspectives on assessment in the country.
Does IRT Provide More Sensitive Measures of Latent Traits in Statistical Tests? An Empirical Examination
by Jeff Stewart
Stewart, J. (In Press) Shiken Research Bulletin, 16 (1)
It has been frequently stated that Item Response Theory produces interval-scale measures where raw scores can only provide ordinal measures, and that therefore, researchers should choose IRT measures when selecting variables for common statistical tests, because raw scores may not meet their assumptions (Wright, 1992; Harwell & Gattie, 2001). In this study, this claim is empirically examined by conducting Pearson Correlations and ANOVAs on two data sets using raw scores, Rasch Person Measures and 2-Parameter IRT ability estimates, in order to determine if results differed as a consequence. Raw Scores and Rasch Person Measures were very highly correlated, and led to extremely similar results in all cases. For a well-constructed, reliable test the same was true of 2PL ability estimates. However, in cases where the test has middling to poor reliability, 2PL ability estimates appear to produce a somewhat more sensitive measure of a latent trait than raw scores, which can result in meaningful differences in statistical tests.
Training examiners for a national epidemiological survey of oral mucosal lesions
Ensuring the validity and reliability of data collected in epidemiological surveys is an important consideration. The purpose of the present report is to describe a training and calibration programme for 16 examiners taking part in a national survey of oral mucosal lesions and to present an evaluation of the results. The programme included the distribution of a pictorial manual to participants and a series of lectures followed by three diagnostic sessions, two using slides and the last involving patients. At the final session, the trainees classified 88 per cent of 16 patients correctly in comparison with the definitive diagnoses of the trainer, and their sensitivity in recording oral carcinoma, leukoplakia and lichen planus was at least 0.88. However, correctly classifying submucous fibrosis on the basis of slides alone proved problematic. At the conclusion, the diagnostic accuracy of two examiners for all types of lesion remained appreciably lower than the majority. Training strategies for various types of study are discussed. The method reported is considered to represent a model approach to training and calibrating examiners for this type of survey work.
Identifying Pre-service Elementary School Teachers’ Conceptualization Levels of Rational Numbers
by Halil Eksi
Soner DURMUŞ
Educational Sciences: Theory & Practice
5 (2) • November 2005 • 659-665
The teachers’ subject-domain competencies play a vital role at the source of the difficulties
that students have with rational numbers in elementary schools. The aim of
this study is to identify pre-service elementary school teachers’ (i.e., mathematics,
science and elementary school teachers) conceptualizations of rational numbers. The
sample of the study consisted of 277 senior pre-service teachers; 72 from mathematics
teaching department, 60 from science teaching department, and 145 from primary
school teaching department. The Rational Number Conceptualization Test, including
14 questions, was administered to pre-service teachers and semi-structured interviews,
including 10 questions, were conducted afterwards in order to identify their
rational number conceptualizations. Statistically significant results were found on the
conceptualization levels of conceptual rational number problems among pre-service
teachers in favor of mathematics and science pre-service teachers. While there was no
statistically significant difference between mathematics and science pre-service
teachers or between mathematics and elementary school pre-service teachers on the
conceptual levels of operational rational number problems, a statistically significant
difference was found between science and primary school pre-service teachers in
favor of science pre-service teachers. The collected data from the semi-structured
interviews revealed that pre-service teachers have difficulties on the conceptualization
of rational numbers and on learning and applying the models used to teach
rational numbers. In light of the findings of this study and related studies in the literature,
teaching approaches that might enhance pre-service teachers’ conceptualization
levels of rational numbers were discussed and some suggestions were made.
Assessing Student Achievement: Multiple-choice Testing Versus Performance Assessment
by Texas State PA Applied Research Projects
Deming, Jana C., "Assessing Student Achievement: Multiple-choice Testing Versus Performance Assessment" (1992). Applied Research Projects, Texas State University-San Marcos. Paper 226.
http://ecommons.txstate.edu/arp/226
Techniques of Assessing, Types, and Taxonomy of Measures of Thinking Skills
Published in Philippine Journal of Educational Measurement 9: 1. 23-38 , 2005
Sliders for the smart: Type of rating scale on the Web interacts with educational level
Published 2011 in Social Science Computer Review, 29, 221-231. Co-authored with Frederik Funke and Randall Thomas.
Slider scales and radio button scales were experimentally compared in horizontal and vertical orientation. Slider scales led to statistically significantly higher break-off rates (odds ratio = 6.9) and substantially higher response times. Problems with slider scales were especially prevalent in participants with less than average education, suggesting the slider scale format is more challenging in terms of previous knowledge needed or cognitive load. An alternative explanation, technology-dependent sampling (Buchanan & Reips, 2001), cannot fully account for the current results. The authors clearly advise against the use of Java-based slider scales and advocate low-tech solutions for the design of Web-based data collection. Orientation on screen had no observable effect on data quality or usability of rating scales. Implications of item format for Web-based surveys are discussed.
Test Review: Test of Nonverbal Intelligence-4 (TONI-4)
Ritter, N., Kilinc, E., Navruz, B., Bae, Y. (2011). Test Review: Test of Nonverbal Intelligence-4 (TONI-4). Journal of Psychoeducational Assessment, 29(5), 384-388. doi: 10.1177/0734282911400400
Test Review: Children’s Organizational Skill Scales (COSS).
Kaya, F., Delen, E., and Ritter, N. (2012). Test review: Children’s Organizational Skill Scales (COSS). Journal of Psychoeducational Assessment, 30(2) 205-208. doi: 10.1177/0734282911416320
Test Review: Test of Comprehensive Nonverbal Intelligence-2 (CTONI-2)
Delen, E., Kaya, F., and Ritter, N. (2012). Test review: Test of Comprehensive Nonverbal Intelligence-2 (CTONI-2). Journal of Psychoeducational Assessment, 30(2) 209-213. doi: 10.1177/0734282911415614
Integrating qualitative and quantitative research approaches via the phenomenological method
Fisher, W. P., Jr., & Stenner, A. J. (2011, April). Integrating qualitative and quantitative research approaches via the phenomenological method. International Journal of Multiple Research Approaches, 5(1): 89-103.
Separated and mixed applications of qualitative and quantitative methods are typically encumbered by markedly different philosophical orientations. Multiple inefficiencies arise when mixed methods work at cross purposes with each other. The phenomenological method, however, has the potential to integrate qualitative and quantitative concerns in ways that orient research toward uniform criteria of substantive meaningfulness and mathematical rigor. Three characteristics of a qualitative-quantitative methodological pluralism are described: structural invariance, substantive interpretability, and the display of anomaly. When combined with networked information technologies, new opportunities emerge for a qualitatively meaningful and quantitatively precise measurement framework in the research and practice of the health sciences.
Reliability Statistics
Fisher, W. P., Jr. (1992). Reliability statistics. Rasch Measurement Transactions, 6(3), 238.
Reliabilities are often reported as though they were invariable characteristics of tests. Of course, they are not. They depend not only on the construction of the test, but also on the distribution of the examinee sample tested. Conventionally, only person separation reliability is reported, but item separation statistics are also useful indicators. They tell how well this sample of examinees has spread out the items along the measure of the test, and so defined a meaningful variable.
The standard model in the history of the natural sciences, econometrics, and the social sciences
Fisher, W. P., Jr. (2010). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics: Conference Series, 238(1),
Abstract. In the late 18th and early 19th centuries, scientists appropriated Newton’s laws of motion as a model for the conduct of any other field of investigation that would purport to be a science. This early form of a Standard Model eventually informed the basis of analogies for the mathematical expression of phenomena previously studied qualitatively, such as cohesion, affinity, heat, light, electricity, and magnetism. James Clerk Maxwell is known for his repeated use of a formalized version of this method of analogy in lectures, teaching, and the design of experiments. Economists transferring skills learned in physics made use of the Standard Model, especially after Maxwell demonstrated the value of conceiving it in abstract mathematics instead of as a concrete and literal mechanical analogy. Haavelmo's probability approach in econometrics and R. Fisher's Statistical Methods for Research Workers brought a statistical approach to bear on the Standard Model, quietly reversing the perspective of economics and the social sciences relative to that of physics. Where physicists, and Maxwell in particular, intuited scientific method as imposing stringent demands on the quality and interrelations of data, instruments, and theory in the name of inferential and comparative stability, statistical models and methods disconnected theory from data by removing the instrument as an essential component. New possibilities for reconnecting economics and the social sciences to Maxwell's sense of the method of analogy are found in Rasch's probabilistic models for measurement.
The cash value of reliability
Fisher, W. P., Jr. (2008, Summer). The cash value of reliability. Rasch Measurement Transactions, 22(1), 1160-3.
Mindfulness in measurement: Reconsidering the measurable in mindfulness
Solloway, S., & Fisher, W. P., Jr. (2007). Mindfulness in measurement: Reconsidering the measurable in mindfulness. International Journal of Transpersonal Studies, 26, 58-81.
Can an organic partnership of qualitative and quantitative data confirm the value of mindfulness practice as an assignment in undergraduate education? Working from qualitative evidence suggesting the existence of potentially measurable mindfulness effects expressed in ruler measures, a previous study calibrated a mathematically invariant scale of mindfulness practice effects with substantively and statistically significant differences in the measures before and after the assignment. Current efforts replicated these results. The quantitative model is described in measurement terms defined at an introductory level. Detailed figures and appendices are provided, and a program of future research is proposed.
Bringing human, social, and natural capital to life: Practical consequences and opportunities
Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), 49-66.
Abstract
Capital is defined mathematically as the abstract meaning brought to life in the two phases of the development of 'transferable representations,' which are the legal, financial, and scientific instruments we take for granted in almost every aspect of our daily routines. The first, conceptual and gestational, and the second, parturitional and maturational, phases in the creation and development of capital are contrasted. Human, social, and natural forms of capital should be brought to life with at least the same amounts of energy and efficiency as have been invested in manufactured and liquid capital, and property. A mathematical law of living capital is stated. Two examples of well-measured human capital are offered. The paper concludes with suggestions for the ways that future research might best capitalize on the mathematical definition of capital.
