Gustaf Wellhagen, M. Karlsson, M. Kjellsson (2020)
Comparison of Precision and Accuracy of Five Methods to Analyse Total Score Data. The AAPS Journal, 23
P. Martínez-Martín (2010)
Composite rating scales. Journal of the Neurological Sciences, 289
W. Zetusky, J. Jankovic, F. Pirozzolo (1985)
The heterogeneity of Parkinson's disease. Neurology, 35
D. Conrado, W. Denney, Danny Chen, Kaori Ito (2014)
An updated Alzheimer’s disease progression model: incorporating non-linearity, beta regression, and a third-level random effect in NONMEM. Journal of Pharmacokinetics and Pharmacodynamics, 41
Schindler E, Friberg LE, Karlsson MO. Comparison of item response theory and classical test theory for power/sample size for questionnaire data with various degrees of variability in items’ discrimination parameters. PAGE 24 2015 Abstr 3468 [Internet]. Available from: www.page-meeting.org/?abstract=3468. Accessed 13 Jul 2020.
K. Marek, Sohini Chowdhury, A. Siderowf, Shirley Lasch, C. Coffey, C. Caspell-Garcia, T. Simuni, D. Jennings, C. Tanner, J. Trojanowski, L. Shaw, J. Seibyl, N. Schuff, A. Singleton, K. Kieburtz, A. Toga, B. Mollenhauer, D. Galasko, L. Chahine, D. Weintraub, T. Foroud, D. Tosun-Turgut, K. Poston, V. Arnedo, M. Frasier, T. Sherer, S. Bressman, M. Merchant, W. Poewe, C. Kopil, A. Naito, R. Dorsey, C. Casaceli, N. Daegele, J. Albani, L. Uribe, Eric Foster, J. Long, N. Seedorff, K. Crawford, Danielle Smith, P. Casalin, G. Malferrari, C. Halter, L. Heathers, D. Russell, S. Factor, P. Hogarth, A. Amara, R. Hauser, J. Jankovic, M. Stern, Shu‐Ching Hu, Gretchen Todd, R. Saunders‐Pullman, I. Richard, H. Saint‐Hilaire, K. Seppi, H. Shill, Hubert Fernandez, C. Trenkwalder, W. Oertel, D. Berg, Kathrin Brockman, I. Wurster, L. Rosenthal, Yen Tai, N. Pavese, P. Barone, S. Isaacson, A. Espay, D. Rowe, M. Brandabur, J. Tetrud, G. Liang, Á. Iranzo, E. Tolosa, K. Marder, Maria Sanchez, L. Stefanis, M. Marti, J. Martínez, J. Corvol, O. Assly, S. Brillman, Nir Giladi, Debra Smejdir, Julia Pelaggi, F. Kausar, L. Rees, Barbara Sommerfield, Madeline Cresswell, C. Blair, Karen Williams, Grace Zimmerman, S. Guthrie, Ashlee Rawlins, Leigh Donharl, C. Hunter, B. Tran, A. Darin, Heli Venkov, Cathi Thomas, Raymond James, B. Heim, Paul Deritis, F. Sprenger, D. Raymond, D. Willeke, Zoran Obradov, J. Mule, Nancy Monahan, K. Gauss, D. Fontaine, D. Szpak, Arita McCoy, Becky Dunlop, Laura Payne, Susan Ainscough, L. Carvajal, Rebecca Silverstein, Kristy Espay, Madelaine Rañola, E. Rezola, H. Santana, M. Stamelou, Alicia Garrido, S. Carvalho, G. Kristiansen, Krista Specketer, Anat Mirlman, M. Facheris, H. Soares, A. Mintun, J. Cedarbaum, Peggy Taylor, L. Slieker, Brian McBride, Colin Watson, Etienne Montagut, Z. Sheikh, Baris Bingol, R. Forrat, P. Sardi, T. Fischer, D. Reith, J. Egebjerg, L. Larsen, N. Breysse, D. Meulien, B. Saba, V. Kiyasova, C. Min, Thomas McAvoy, R. Umek, P. Iredale, J. 
Edgerton, Demetrio Santi, C. Czech, F. Boess, J. Sevigny, T. Kremer, I. Grachev, K. Merchant, A. Avbersek, P. Muglia, Alexandra Stewart, R. Prashad, Johannes Taucher (2018)
The Parkinson's progression markers initiative (PPMI) – establishing a PD biomarker cohort. Annals of Clinical and Translational Neurology, 5
UUPharmacometrics/piraid [Internet]. Uppsala University, Pharmacometrics Research Group; 2020. Available from: https://github.com/UUPharmacometrics/piraid. Accessed 14 Apr 2020.
L. Lindbom, Pontus Pihlgren, N. Jonsson (2005)
PsN-Toolkit - A collection of computer intensive statistical methods for non-linear mixed effect modeling using NONMEM. Computer methods and programs in biomedicine, 79(3)
Samantha Holden, T. Finseth, S. Sillau, B. Berman (2018)
Progression of MDS‐UPDRS Scores Over Five Years in De Novo Parkinson Disease from the Parkinson's Progression Markers Initiative Cohort. Movement Disorders Clinical Practice, 5
R Core Team. R: a language and environment for statistical computing [internet]. Vienna, Austria: R Foundation for Statistical Computing; 2019. Available from: https://www.R-project.org/. Accessed 16 Mar 2020.
R. Team (2014)
R: A language and environment for statistical computing. MSOR connections, 1
Baker FB. The basics of item response theory. 2nd Ed. College Park: ERIC Clearinghouse on Assessment and Evaluation, University of Maryland; 2001. http://ericae.net/irt/baker.
F. Baker (1985)
The basics of item response theory
T. Vu, J. Nutt, N. Holford (2012)
Progression of motor and nonmotor features of Parkinson's disease and their response to treatment. British journal of clinical pharmacology, 74(2)
E. Lesaffre, D. Rizopoulos, R. Tsonaka (2007)
The logistic transform for bounded outcome scores. Biostatistics, 8(1)
Wellhagen GJ, Kjellsson MC, Karlsson MO. A bounded integer model for rating and composite scale data. AAPS J. 2019 06;21(4):74.
Simon Buatois, S. Retout, N. Frey, S. Ueckert (2017)
Item Response Theory as an Efficient Tool to Describe a Heterogeneous Clinical Rating Scale in De Novo Idiopathic Parkinson’s Disease Patients. Pharmaceutical Research, 34
E. Germovsek, Anna Hansson, M. Kjellsson, J. Ruixo, Å. Westin, P. Soons, A. Vermeulen, M. Karlsson (2020)
Relating Nicotine Plasma Concentration to Momentary Craving Across Four Nicotine Replacement Therapy Formulations. Clinical Pharmacology & Therapeutics, 107
Gustaf Wellhagen, M. Kjellsson, M. Karlsson (2019)
A Bounded Integer Model for Rating and Composite Scale Data. The AAPS Journal, 21
C. Goetz, S. Fahn, P. Martínez-Martín, W. Poewe, C. Sampaio, G. Stebbins, M. Stern, B. Tilley, R. Dodel, B. Dubois, R. Holloway, J. Jankovic, J. Kulisevsky, A. Lang, A. Lees, S. Leurgans, P. LeWitt, D. Nyenhuis, C. Olanow, O. Rascol, A. Schrag, J. Teresi, J. Hilten, Nancy Lapelle (2007)
Movement Disorder Society‐sponsored revision of the Unified Parkinson's Disease Rating Scale (MDS‐UPDRS): Process, format, and clinimetric testing plan. Movement Disorders, 22
Wellhagen GJ, Karlsson MO, Kjellsson MC. Comparison of Precision and Accuracy of Five Methods to Analyse Total Score Data. The AAPS Journal 2021;23(1). 10.1208/s12248-020-00546-w.
L. Lindbom, J. Ribbing, E. Jonsson (2004)
Perl-speaks-NONMEM (PsN) - a Perl module for NONMEM related programming. Computer methods and programs in biomedicine, 75(2)
The AAPS Journal (2021) 23:45
DOI: 10.1208/s12248-021-00555-3

Research Article

An Item Response Theory–Informed Strategy to Model Total Score Data from Composite Scales

Gustaf J. Wellhagen¹, Sebastian Ueckert¹, Maria C. Kjellsson¹, and Mats O. Karlsson¹,²

Received 21 August 2020; accepted 7 January 2021

Abstract. Composite scale data is widely used in many therapeutic areas and consists of several categorical questions/items that are usually summarized into a total score (TS). Such data is discrete and bounded by nature. The gold standard to analyse composite scale data is item response theory (IRT) models. However, IRT models require item-level data while sometimes only TS is available. This work investigates models for TS. When an IRT model exists, it can be used to derive the information as well as expected mean and variability of TS at any point, which can inform TS-analyses. We propose a new method: IRT-informed functions of expected values and standard deviation in TS-analyses. The most common models for TS-analyses are continuous variable (CV) models, while bounded integer (BI) models offer an alternative that respects scale boundaries and the nature of TS data. We investigate the method in CV and BI models on both simulated and real data. Both CV and BI models were improved in fit by IRT-informed disease progression, which allows modellers to precisely and accurately find the corresponding latent variable parameters, and IRT-informed SD, which allows deviations from homoscedasticity. The methodology provides a formal way to link IRT models and TS models, and to compare the relative information of different model types. Also, joint analyses of item-level data and TS data are made possible. Thus, IRT-informed functions can facilitate total score analysis and allow a quantitative analysis of relative merits of different analysis methods.

KEY WORDS: bounded integer model; composite scale data; IRT-informed total score analysis; total score analysis.
BACKGROUND

Composite scales are commonly used in many disease areas, such as CNS disorders and autoimmune diseases (1). Often these scales were developed for diagnosis, but in the absence of reliable biomarkers they also function as clinical endpoints to evaluate disease progression and treatment efficacy. Such scales consist of several questions/items that are summarized to a total score (TS), often the sum of the item scores. The resulting TS is discrete and bounded.

Item-level data contains all the information collected, and therefore adequately designed item response theory (IRT) models are the most informative way to analyse composite scale data (2). These models include item characteristic curves (ICCs) for each item and handle correlation between items through one or several latent variables. However, IRT models may be complex to develop, require large datasets, include many parameters, and take a long time to estimate. Most importantly, they cannot be used if data on the item level is not available, which is the case we investigate here.

When the item-level data is not available, the TS can be modelled, which is the focus of this work. For this, several approaches are used, such as continuous variable (CV), bounded integer (BI) (3), beta regression (4), or coarsened grid models (5); here we investigate CV and BI models. None of these methods uses the information from item-level data. However, if there exists an IRT model for the same composite scale as the TS data, it might be used to inform TS-analyses. Bounded data has lower variability at the boundaries, as these impose natural limits to the outcome. Thus, a homoscedastic error, as is typical in CV models, is not the best description of the variability. Instead, the mean and variability at each latent variable value in an IRT model can be computed through the ICCs. Therefore, an IRT model can yield the expected variability at any predicted TS value.
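To make the last point concrete, the following minimal Python sketch shows one way the mean and SD of a total score could be obtained from ICCs at a given latent variable value. It assumes a graded-response-type IRT model with logistic ICCs and conditional independence of items given Ψ. The item parameters, the function names, and the 0-based item scoring are hypothetical choices made for illustration; they are not the parameters of any published model.

```python
import math


def category_probs(psi, discrimination, thresholds):
    """Category probabilities for one item under a graded-response-type model.

    thresholds must be increasing; the item is scored 0..len(thresholds).
    """
    # Cumulative probabilities P(score >= k | psi), padded with 1 (k = 0) and 0.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-discrimination * (psi - b))) for b in thresholds]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


def total_score_moments(psi, items):
    """Return (E[TS | psi], SD[TS | psi]).

    Assumes items are conditionally independent given psi, so item means
    and item variances simply add up.
    """
    mean = 0.0
    var = 0.0
    for discrimination, thresholds in items:
        probs = category_probs(psi, discrimination, thresholds)
        item_mean = sum(s * p for s, p in enumerate(probs))
        item_var = sum((s - item_mean) ** 2 * p for s, p in enumerate(probs))
        mean += item_mean
        var += item_var
    return mean, math.sqrt(var)


# Hypothetical items: (discrimination, thresholds). Illustration only.
ITEMS = [
    (1.2, [-1.0, 0.5, 2.0, 3.5]),
    (0.8, [0.0, 1.5, 3.0, 4.5]),
    (1.5, [-0.5, 1.0, 2.5, 4.0]),
]

if __name__ == "__main__":
    for psi in (-4.0, 0.0, 2.0, 8.0):
        m, sd = total_score_moments(psi, ITEMS)
        print(f"psi={psi:+.1f}  E[TS]={m:.2f}  SD[TS]={sd:.2f}")
```

With these made-up items, the SD is smaller near the lower and upper ends of the latent variable range than in the middle, which is the pattern a homoscedastic error cannot capture.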
An example of a composite scale is the Movement Disorder Society–sponsored revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) (6). There are four parts to MDS-UPDRS: nonmotor and motor aspects of experiences of daily living, motor examination, and motor complications, with a total of 68 items. Parkinson’s disease is heterogeneous and different aspects of the disease progress at different rates (7, 8). Since drug effects are unlikely to affect all items equally, drug development within Parkinson’s disease is often focused on a part of the whole MDS-UPDRS scale to gain power; items are typically reassigned as nonmotor-, motor-, or tremor-related, where the total subscore is used as an endpoint.

Indeed, recent work has shown that the disease progressions of nonmotor, motor, and tremor complications can be modelled separately in an IRT framework (9). In this work, we focus only on the motor subscore, which is typically the most sensitive to drug effects since it has the fastest progression rate (10); it is also the largest subscale of MDS-UPDRS.

IRT-informed TS-analysis, presented in this work, is a new method to improve fit and parameter precision as well as description of variability and disease progression in TS-analyses. The idea is that modellers dealing with TS data from a composite scale could gain information from existing IRT models for that composite scale. To this end, IRT-informed link functions between one previously published IRT model and different TS models, and link functions describing the standard deviation (SD), are evaluated on both simulated and real data within Parkinson’s disease.

¹ Pharmacometrics Research Group, Department of Pharmacy, Uppsala University, Box 580, 751 23, Uppsala, Sweden.
² To whom correspondence should be addressed. (e-mail: mats.karlsson@farmaci.uu.se)
1550-7416/21/0000-0001/0 2021 The Author(s). This article is an open access publication

METHODS

Models

Eight different models for total score data were evaluated in this work: four CV models and four BI models (standard models, fully IRT-informed models for both mean and SD, and partially IRT-informed models for either mean or SD). They operate on the TS scale, Z scale (latent variable of BI), and/or the Ψ scale (latent variable of IRT). The TS is bounded, while Ψ, Z ∈ ℝ. A descriptive summary of the models is shown in Table I.

Continuous Variable Models

Under the standard CV model with homoscedastic error (S-CV), the observation j for subject i at time $t_{ij}$ follows:

$$Y_{ij} = f(\Theta, \eta_i, t_{ij}, X_i) + \varepsilon_{ij}$$
$$\eta_i \sim N(0, \omega^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2)$$

where $\Theta$ are fixed effect parameters, $\eta_i$ the random effects of the inter-individual variability (IIV), $X_i$ the covariates, $\varepsilon_{ij}$ the residual unexplained variability (RUV), $\omega^2$ the variance of the IIV, and $\sigma^2$ the variance of the RUV.

The fully IRT-informed CV model (I-CV) is expressed as:

$$\Psi_{ij} = h(\Theta, \eta_i, t_{ij}, X_i)$$
$$Y_{ij} = pn_1(\Psi_{ij}) + \varepsilon_{ij} \, pn_2(\Psi_{ij})$$
$$\eta_i \sim N(0, \omega^2), \qquad \varepsilon_{ij} \sim N(0, 1)$$

where $\Psi_{ij}$ is a latent variable described by the function h(·), which is typically a function for disease progression and treatment effects on the latent variable scale, and $pn_1$ as well as $pn_2$ are predetermined polynomials with coefficients calibrated to theoretical expectations (see “IRT-informed functions” below). All other variables maintain their definition from above.

Furthermore, we define the partially IRT-informed CV model for the mean (MI-CV) as

$$Y_{ij} = pn_1(\Psi_{ij}) + \varepsilon_{ij}$$

and for the SD (SDI-CV) as

$$Y_{ij} = f(\Theta, \eta_i, t_{ij}, X_i) + \varepsilon_{ij} \, pn_2(\Psi_{ij}).$$

Bounded Integer Models

The standard BI model (S-BI) is a discrete data model for bounded outcomes, where the probability of an individual i to have the score k at time $t_{ij}$ is:

$$P(Y_{ij} = k) = \phi\left(\frac{Z_{k/n} - f(\theta, \eta_i, t_{ij}, X_i)}{g(\sigma, \eta_i, t_{ij}, X_i)}\right) - \phi\left(\frac{Z_{(k-1)/n} - f(\theta, \eta_i, t_{ij}, X_i)}{g(\sigma, \eta_i, t_{ij}, X_i)}\right)$$
$$\eta_i \sim N(0, \omega^2)$$

where $\phi$ is the cumulative distribution function for the standard normal distribution, $Z_{k/n}$ and $Z_{(k-1)/n}$ are the cut points between categories k and k-1 defined through the probit function for an n-category scale, f(·) is the function for the mean, and g(·) the function for the variance on the probit scale. The probability mass function across all scores adds to 1. For all BI models, the special cases for the first and last categories (k = 1, k = n) apply:

$$P(Y_{ij} = 1) = \phi\left(\frac{Z_{1/n} - f(\theta, \eta_i, t_{ij}, X_i)}{g(\sigma, \eta_i, t_{ij}, X_i)}\right)$$
$$P(Y_{ij} = n) = 1 - \phi\left(\frac{Z_{(n-1)/n} - f(\theta, \eta_i, t_{ij}, X_i)}{g(\sigma, \eta_i, t_{ij}, X_i)}\right)$$

The fully IRT-informed BI model (I-BI) is expressed as:

$$\Psi_{ij} = h(\Theta, \eta_i, t_{ij}, X_i)$$
$$P(Y_{ij} = k) = \phi\left(\frac{Z_{k/n} - pn_3(\Psi_{ij})}{pn_4(\Psi_{ij})}\right) - \phi\left(\frac{Z_{(k-1)/n} - pn_3(\Psi_{ij})}{pn_4(\Psi_{ij})}\right)$$
$$\eta_i \sim N(0, \omega^2)$$

where $pn_3$ and $pn_4$ are distinct polynomials from $pn_1$ and $pn_2$. Similar to the IRT-informed CV models, we also define the partial versions for the mean (MI-BI):

$$P(Y_{ij} = k) = \phi\left(\frac{Z_{k/n} - pn_3(\Psi_{ij})}{g(\sigma, \eta_i, t_{ij}, X_i)}\right) - \phi\left(\frac{Z_{(k-1)/n} - pn_3(\Psi_{ij})}{g(\sigma, \eta_i, t_{ij}, X_i)}\right)$$

and the SD (SDI-BI):

$$P(Y_{ij} = k) = \phi\left(\frac{Z_{k/n} - f(\theta, \eta_i, t_{ij}, X_i)}{pn_4(\Psi_{ij})}\right) - \phi\left(\frac{Z_{(k-1)/n} - f(\theta, \eta_i, t_{ij}, X_i)}{pn_4(\Psi_{ij})}\right)$$

Table I. Properties of the Investigated Models

Model    Description                               Scale   Standard deviation
S-CV     Standard CV model                         TS      Homoscedastic (estimated θ)
SDI-CV   Partially (SD) IRT-informed CV model      TS      Heteroscedastic (fixed SD(Y|Ψ))
MI-CV    Partially (mean) IRT-informed CV model    Ψ       Homoscedastic (estimated θ)
I-CV     Fully IRT-informed CV model               Ψ       Heteroscedastic (fixed SD(Y|Ψ))
S-BI     Standard BI model                         Z       Homoscedastic (estimated θ)
SDI-BI   Partially (SD) IRT-informed BI model      Z       Heteroscedastic (fixed SD(Y|Ψ))
MI-BI    Partially (mean) IRT-informed BI model    Ψ       Homoscedastic (estimated θ)
I-BI     Fully IRT-informed BI model               Ψ       Heteroscedastic (fixed SD(Y|Ψ))

BI bounded integer, CV continuous variable, IRT item response theory, Ψ latent variable of IRT, SD(Y|Ψ) standard deviation from IRT model, TS total score, Z latent variable of BI

Derivation of IRT-Informed Functions

Based only on the ICCs (no new data is required) from a published IRT model (9) for the MDS-UPDRS motor subscale, analytic links allowing a direct translation between the latent variable of the IRT model and the expected mean and SD were derived on the TS scale for the CV model and on the Z score scale for the BI model. For the CV model, the expected value and the variance of the total score as a function of the latent variable Ψ were calculated according to:

$$E(Y \mid \Psi) = \sum_{m=1}^{M} \sum_{s=1}^{S_m} s \, P(Y_m = s \mid \Psi)$$
$$\mathrm{Var}(Y \mid \Psi) = \sum_{m=1}^{M} \sum_{s=1}^{S_m} \left(s - E(Y_m \mid \Psi)\right)^2 P(Y_m = s \mid \Psi)$$

where $P(Y_m = s \mid \Psi)$ is the response probability for item m and category s given a particular latent variable value (from the published ICCs), M is the number of items in the MDS-UPDRS motor subscale, and $S_m$ is the number of categories for item m. The analytic link for the BI model is given in Supplemental Material 1.

In a second step, the analytically derived link functions for the expected value and the standard deviation ($SD(Y \mid \Psi) = \sqrt{\mathrm{Var}(Y \mid \Psi)}$) were approximated using empirical Chebyshev polynomials across the whole disease range for the motor subscale, i.e. Ψ ranged from −4 to +8. Starting from 1, the polynomial degree was increased until the maximum deviation between analytic link and polynomial approximation was below a pre-specified tolerance (0.01 and 0.01 for the expected value and SD of the CV model, respectively, and 0.01 and 0.005 for the corresponding BI values).

All calculations were implemented in R and were made available in the package piraid (11) (for full details on how to automatically create polynomials and control streams for a given composite scale, see Supplemental Material 2, where relevant piraid R code is supplied with some illustrative examples).

For the fully IRT-informed models (I-CV, I-BI), link functions were used between Ψ and mean TS/Z score as well as between Ψ and SD. The partially IRT-informed models illustrate how much improvement the description of either the mean or the SD brings, but if the IRT-informed approach is taken, it would be natural to use the fully IRT-informed models. For the partially (mean) IRT-informed models (MI-CV, MI-BI), only the link between Ψ and mean TS/Z score was used. For the partially (SD) IRT-informed models (SDI-CV, SDI-BI), link functions were only implemented between mean TS/Z score and SD(Y|Ψ). All these polynomials as well as examples of h(·) functions are given in Supplemental Material 3, and NONMEM control stream examples for I-CV and I-BI are given in Supplemental Material 4. The NONMEM control stream for the IRT simulation model, including the ICC parameters, is shown in Supplemental Material 5.

Information Content

Fisher information, often used in optimal design to understand and compare different study designs, can also be used to appreciate what parts of the data are considered most informative under a given model. For that purpose, the Fisher information for Ψ was calculated for the IRT, the I-CV, and the I-BI model according to the equation presented in Supplemental Material 6. The resulting information, as a function of Ψ, was visualized graphically and the information content from an S-CV model with homoscedastic residual error model was overlaid. The calculation of information content is conditional on the model being an adequate description of the data, which is the case for the IRT, I-CV, and I-BI model, but not the S-CV model.

Applications

Applications to Simulated Data

The approach of IRT-informed TS-analysis was investigated on two simulated datasets. First, one simulation of 1000 individuals at 10 occasions was performed with IRT latent variable baseline at 0.654 and no disease progression, meaning that subjects had the same value of the latent variable throughout the study. This dataset was analysed with S-CV and S-BI models with homoscedastic SD, both with and without an associated log-normally distributed IIV. Alternatively, SD(Y|Ψ) was used to describe the SD, meaning no parameter relating to SD was estimated. Additional complexity on top of SD(Y|Ψ) was also evaluated, which allowed for higher or lower (non-negative) SD while retaining the shape of SD(Y|Ψ): multiplication by a parameter (bound to be positive), lognormal IIV in SD, multiplication by a parameter with lognormal IIV, and lastly addition of a parameter (bound to be positive) with lognormal IIV.

Second, another simulation of 1000 individuals at 10 occasions was performed with IRT latent variable baseline at 0.654 and a linear disease progression of 0.449 year$^{-1}$. The dataset was again analysed with CV and BI models both with homoscedastic SD and SD(Y|Ψ), this time in combination with the disease progression model. Both linear disease progression on TS or Z and linear disease progression on Ψ through E(Y|Ψ) were evaluated.

Parameter Precision and Bias

Models that use the IRT-informed link functions allow estimation of baseline and disease progression parameters on the same scale as the IRT model. It is therefore of interest to assess whether such parameters can be accurately estimated. For each studied scenario, containing studies of 1000 subjects over 10 occasions, 100 datasets were simulated and parameters estimated based on the simulated data. In each of these, the IRT model was used as simulation model and the I-CV, MI-CV, I-BI, and MI-BI models were used for parameter estimation. The precision was evaluated through the width of the percentiles of the estimated parameter values and the bias through the percent difference of the mean parameter estimate from the true parameter value.

Application to Real Data

The real dataset was the same as previously described (3), and came from the Parkinson’s Progression Markers Initiative (PPMI) (www.ppmi-info.org) (12) with 428 de novo patients with Parkinson’s disease who were followed up to 48 months, totaling 2720 observations. Both CV and BI models were used to fit the data, where inclusion of E(Y|Ψ) and/or SD(Y|Ψ) was evaluated, and the base model structure (in both standard and IRT-informed models) was the same as previously reported (3, 9), with disease progression described through a linear slope and the effect of medication included as an offset effect:

$$h(\cdot) = \theta_1 + \theta_2 \, t - \theta_3 \, X_1$$

where $\theta_1$ is the baseline, $\theta_2$ the slope, $\theta_3$ the effect of medication, $X_1$ the covariate of medication (1/0), and t time. The same additional complexity models for SD(Y|Ψ) as described for the simulated datasets were evaluated, but the base model structure was always the same. Improved model fit was assessed by objective function value (OFV) or Akaike information criterion (AIC), and visual predictive checks (VPC).

Software

Nonlinear mixed-effects modelling was performed with NONMEM version 7.4 (ICON Development Solutions, Ellicott City, MD), executed through PsN version 4.9 (13, 14). The Laplacian estimation method with η-ε interaction was used for all the CV models, while BI models were estimated with stochastic approximation expectation maximization (SAEM). To be able to compare the OFV between different models, importance sampling with expectation only was added in a second estimation step. Graphics were made with R version 3.6.2 (15) and the polynomials were calculated in piraid (11).

RESULTS

Derivation of IRT-Informed Functions

High-order polynomials (12–23) were sufficient to adequately map Ψ to TS and Z scales. As Ψ increased from low to high values, the TS increased with an S-shape, as seen in Fig. 1. The SD of TS showed strong deviation from homoscedasticity, with decreasing SD towards the extremes and symmetry around the mid-point. The SD of the total score as a function of the mean total score, shown in Supplemental Fig. 1, had a similar pattern.

Fig. 1. Mean TS and SD as a function of the latent variable (Ψ) for a CV model

The Z scores increased linearly in the range of −2 < Ψ < 5 (see Fig. 2). At the asymptotes, the relation was slightly S-shaped. The SD for the Z score also showed symmetry but with the lowest SD at the mid-point and higher SD towards the extremes. The SD of the Z score as a function of the mean Z score, shown in Supplemental Fig. 2, had a similar pattern.

Fig. 2. Mean Z score and SD as a function of the latent variable (Ψ) for a BI model

Information Content

Figure 3 illustrates the differences in information content for the latent variable under different models. The information is calculated under the assumption that the respective model holds. The curve from the IRT model, hence, represents a theoretical upper bound for the information content as this is the true model in this case. It should be noted that the specific information value for the homoscedastic CV model is dependent on the distribution of scores in the dataset; one should therefore rather focus on the shape of the curve and not its height. The figure highlights how the homoscedastic CV model tends to underestimate the information content at the centre of the scale and overestimate it at the scale boundaries. The two IRT-informed models have similar information content curves and acknowledge the decreased information content at the scale boundaries. At the centre of the scale, where a given total score can be achieved through a larger set of response patterns than at the boundaries, the IRT model maintains the largest advantage in information content. Towards the boundaries, the difference between IRT model and IRT-informed total score models diminishes.

Fig. 3. Comparison of latent variable information content versus total score under the IRT, the I-CV, the I-BI and the (homoscedastic) S-CV model. The grey-shaded areas illustrate that, under the IRT model, total information is the sum of information from the individual items (the 5 most informative items over the whole score range are labelled)

Applications

Application to Simulated Data

In Table II, the results of TS-analysis of the simulated dataset with no disease progression are shown. For both CV and BI models, there was a significant decrease in OFV after adding SD(Y|Ψ) as a description of SD. There was no added benefit of modifying the function by multiplication or addition of extra parameters (results not shown).

Table II. ΔOFV for Simulated Data with No Disease Progression

Model    Standard deviation                 θ      ΔOFV   OFV      No. of estimated parameters   AIC
S-CV     Homoscedastic (estimated θ)        3.5    -      58,064   3                             58,070
SDI-CV   Heteroscedastic (fixed SD(Y|Ψ))    -      −299   57,765   2                             57,769
S-BI     Homoscedastic (estimated θ)        0.15   -      58,118   3                             58,124
SDI-BI   Heteroscedastic (fixed SD(Y|Ψ))    -      −430   57,687   2                             57,691

AIC Akaike information criterion, BI bounded integer, CV continuous variable, IRT item response theory, OFV objective function value, ΔOFV difference in OFV relative to standard model, Ψ latent variable of IRT, S-BI standard BI model, S-CV standard CV model, SD(Y|Ψ) standard deviation from IRT model, SDI-BI partially (SD) IRT-informed BI model, SDI-CV partially (SD) IRT-informed CV model

In Table III, the results of TS-analysis of the simulated dataset with disease progression are shown. The best CV and BI models were fully IRT-informed (I-CV, I-BI) and had a 667-point and 263-point improvement in OFV compared to their respective standard models (S-CV, S-BI). The IRT-informed models also had one parameter less than the standard models, since SD was described through SD(Y|Ψ) and, thus, not estimated. The partially (SD) IRT-informed BI model (SDI-BI) had almost the same OFV as I-BI, indicating that the fit was similar on Z scale and Ψ scale, shown in Supplemental Table 1.

Table III. ΔOFV for Simulated Data with Disease Progression

Model   Disease progression          Standard deviation                 θ      ΔOFV   OFV      No. of estimated parameters   AIC
S-CV    Linear on TS                 Homoscedastic (estimated θ)        3.8    -      62,171   6                             62,183
I-CV    Linear on Ψ (via E(TS|Ψ))    Heteroscedastic (fixed SD(Y|Ψ))    -      −667   61,505   5                             61,515
S-BI    Linear on Z                  Homoscedastic (estimated θ)        0.14   -      61,767   6                             61,779
I-BI    Linear on Ψ (via E(Z|Ψ))     Heteroscedastic (fixed SD(Y|Ψ))    -      −263   61,504   5                             61,514

AIC Akaike information criterion, BI bounded integer, CV continuous variable, I-BI fully IRT-informed BI model, I-CV fully IRT-informed CV model, IIV inter-individual variability, IRT item response theory, OFV objective function value, ΔOFV difference in OFV relative to standard model, Ψ latent variable of IRT, S-BI standard BI model, S-CV standard CV model, SD(Y|Ψ) standard deviation from IRT model, TS total score, Z latent variable of BI

Parameter Precision and Bias

Parameter precision and bias for the case of IRT model simulations and re-estimation by CV and BI models are presented in Supplemental Fig. 3. The parameters of interest were the baseline and linear slope of the IRT model (i.e. on the Ψ scale) used for simulations. The parameter precision was comparable between the models and there was no sign of bias.

Application to Real Data

In Table IV, the fit to the real data is shown for CV and BI models. The partially IRT-informed models are shown in Supplemental Table 2. The best CV model was fully IRT-informed (I-CV) with both a scaling factor and IIV on SD(Y|Ψ), and had a 364-point OFV improvement over the standard homoscedastic CV model (S-CV) with one additional parameter. The best BI model was the partially (SD) IRT-informed (SDI-BI) with scaling and IIV on SD(Y|Ψ), with a 309-point OFV difference compared to the standard BI model (S-BI) and one additional parameter. However, the fully IRT-informed BI model (I-BI) with the same components was comparable in fit and was chosen as the final model. The best BI model had slightly lower OFV than the best CV model.

As seen in Fig. 4, the fit for both CV and BI models was improved between the base and final model by using IRT-informed functions of mean and SD. The base CV model predicted scores outside the scale range, which also resulted in wide confidence intervals for the predictions near the scale boundaries, seen in the 5th and 95th percentiles of the VPC. The final CV model has improved fit and lower uncertainty in the outer percentiles. The base BI model with homoscedastic SD respected the scale boundaries and had quite good fit, but was still improved by IRT-informed functions, with the improvements mostly seen for the median and 5th percentile of the observations/predictions. The final I-CV and I-BI had similar fit.

DISCUSSION

The following section will discuss the general benefits of IRT-informed TS-analyses, which will be followed by the conceptual advantages shown through simulation examples, then the application to real data, and lastly some perspectives on limitations and future prospects.

The IRT-informed functions allow modellers to link IRT and TS models in a formal way. This improves the TS models, as the expected variability in scores can be better described. The functions to describe SD do not require additional parameters, yet improve the fit, as they allow the SD to vary with the disease severity and follow the nature of TS data. The functions further make it possible to retrieve the parameters of an IRT model with a TS model, which for example could be useful for assessing potential treatment effects. The IRT-informed functions also improve the predictive performance of the models, which is especially helpful for the standard CV analyses that otherwise predict scores outside the boundaries.

The information of different model types can also be assessed via the links, and relative information can be compared. When item-level data is available in some datasets but not others, the link functions allow these data to be jointly analysed, which can help in bridging information between different studies and databases.

Conceptual Advantages

The mapping between Ψ and TS showed a strong nonlinear relationship, which is expected since the TS is bounded while the latent variable is not. In contrast, the Z score and Ψ mapped rather linearly in the relevant disease range, both being variables that can take on any real value. [...] lower fold difference between the highest and lowest variability in the BI model as compared to the CV model, which might explain why the addition of SD(Y|Ψ) seemed more important for CV models than for BI models.

With simulated data, modelling a linear disease progression on Ψ via E(Y|Ψ) greatly improved the fit for the CV model, compared to a linear disease progression on TS. When
The Z scores are not equally subsequently adding SD(Y| Ψ), it further improved the fit. spaced; instead, the distance between cut-points increases as Z However, for the BI model there was only a marginal benefit deviates from 0. This affects the shape of the SD(Y| Ψ)curve of modelling via E(Y| Ψ) after adding SD(Y| Ψ). and explains why the function has the nadir around 0. Following the S-shaped relation between Ψ and TS, it is natural that the CV model is improved using the IRT-information for the mean when there is a linear disease progression on Ψ.This Behaviour with Real Data relation could also produce correlations between the baseline and slope parameters in the standard CV model since a baseline close When applied to real data, a linear disease progression to either boundary would mean a lower slope estimate. In the was superior on the Ψ scale over the TS scale for the best CV analysis of simulated data—where slope and baseline were model (ΔOFV 74), but for the best BI model, there was no uncorrelated—there was no large correlation estimated, likely difference in fit between linear disease progression on Ψ scale since the baseline was in the linear range of the relation. However, or Z scale (ΔOFV +5). This is in line with the results from correlations could be introduced or changed when transforming a simulations, and suggests that Ψ and Z map close to each variable and should be investigated. other and a linear disease progression is in better agreement The unexplained variability between observations was with these scales than the TS scale. clearly not homoscedastic for either of the models. Towards For the simulated datasets, the SD(Y| Ψ) function provided the boundaries, the variability of TS was dramatically decreased the best description of SD and the fit was not improved by any almost symmetrically from the maximum value, which it reached additions or multiplications of the polynomial used. For the real at the midpoint. 
The fold difference between the maximum and data, however, it was better to allow individual variation of the minimum variability was roughly 4. For the unexplained SD(Y| Ψ) function as the description of some individuals variability of Z scores, the shape was also symmetrical, but with benefitted from a reduced SD while others from an increased a nadir at the midpoint and around 3 times higher at Z≈−2or SD. The reduced SD for some individuals may be due to Z ≈ 2, and decreasing again at the extremes. Thus, there was a Markovian features in the responses—i.e. sequential observed Table IV. ΔOFV for Real Data Model Disease progression Standard deviation θ IIV (%CV) ΔOFV OFV No. of estimated parameters AIC S-CV Linear on TS Homoscedastic (estimated θ) 5.3 - - 18774 10 18,794 I-CV Linear on Ψ Heteroscedastic (fixed SD(Y|Ψ)) - - + 261 19,035 9 19,053 Heteroscedastic (SD(Y|Ψ)∙θ) 1.4 - − 206 18,568 10 18,588 η 2 Heteroscedastic (SD(Y|Ψ)∙θ∙ e ) 1.4 24 − 364 18411 14 18,439 S-BI Linear on Z Homoscedastic (estimated θ) 0.22 - - 18686 10 18,706 I-BI Linear on Ψ Heteroscedastic (fixed SD(Y|Ψ)) - - + 390 19,076 9 19,094 Heteroscedastic (SD(Y|Ψ)∙θ) 1.4 - − 97 18,590 10 18,610 η 4 Heteroscedastic (SD(Y|Ψ)∙θ∙ e ) 1.3 38 − 304 18383 14 18,411 AIC Akaike information criterion, BI bounded integer, CV continuous variable, %CV coefficient of variation in percent, I-BI fully IRT- informed BI model, I-CV fully IRT-informed CV model, IIV inter-individual variability, IRT item response theory, OFV objective function value, ΔOFV difference in OFV relative to standard model, Ψ latent variable of IRT, S-BI standard BI model, S-CV standard CV model, SD(Y| Ψ) standard deviation from IRT model, TS total score, Z latent variable of BI Base CV model Final CV model Base BI model Final BI model 45 Page 8 of 10 The AAPS Journal (2021) 23:45 Fig. 4. Visual predictive checks (VPC) of model fit for base and final CV and BI models for the real data. 
The circles represent the total score observations, the solid line represents the median, and dashed lines represent the 2.5th and 97.5th percentiles. The shaded areas represent the model predicted 95% confidence interval of the corresponding percentiles scores in an individual have the same value to a higher extent Perspectives than predicted by the model. Such features are well-known in categorical data analyses and have also been described, and We have shown the method applied to a composite scale modelled (16). Most individuals benefitted from a higher where the TS is the sum of item scores. The Extended variability, since the best model had higher SD than SD(Y| Ψ) Disability Status Scale (EDSS) for multiple sclerosis, for example, uses a decision tree to arrive at the TS. To derive for both the CV and BI models. This indicates the presence of the analytical solution of the link functions and information additional sources of variability than those accounted for in the ICCs of the IRT model. One likely explanation is that content would in that case require a different approach. observations of different items are correlated in a way not However, the option of simulating across a wide range of Ψ captured by the IRT model. When simulating items of the IRT from the IRT model to approximate the link functions would models including additional correlations across items, the overall still exist and through these simulation-based link functions SD was indeed different, and typically higher, than the theoret- arrive at the information content. ically expected (results not shown). The highest variability is seen In this work we only focused on one subscale, the MDS- around the mid-point of the scale and very few patients in this UPDRS motor subscale. 
If a scale with N subscales is considered, study were at, or close to, the scale boundaries—which explains the IRT model will have N latent variables and N different TS the increased mean variability in models with homoscedastic should be characterized, which should be possible with a unexplained variability. Further, describing the time trajectory straightforward extension of the methodology presented here. with a simple model induces some model misspecification, which The degree of the polynomials used to fit the mean and adds to the residual error magnitude. SD as a function of the latent variable is a potential source of Since the TS-analysis has no information about which items error. However, the tolerance of the Chebyshev polynomials contributed to the score, it is natural that more observations are canbeadjustedtoachieve asatisfactory fit, and such needed to gain the same information as an IRT model with item- functionality has been built into the piraid package. When level data. The item-level data is more informative because the adjusting the polynomials through higher tolerance, the OFV ICCs are different for all items and different items have different changed only marginally in this work. information about the underlying latent variable. In contrast to The only disease progression model that was investigated the IRT model, the CV and BI models make no use of the was a linear slope on the Ψ scale, as this was reported in the different items ability to inform on the underlying latent variable. previously published IRT model of these data (8). Of course, in Indeed, the more heterogeneity in information content across reality there may be other functions that better describe disease items, the larger the difference between analyses on item-level progression on the latent variable scale for the real data, which and total score level (17). 
The standard CV model assumes also had a medication effect identified as an offset effect on Ψ in constant information across all expected values. This represents the IRT model (9 ). In the CV model, the medication effect was an underestimation of the information in part of the scale and also described as an offset effect, however on TS. Thus, the hence underutilizing the information, but also overestimates the interpretation is different at different disease severities, due to information at other values, and may therefore interpret patterns the S-shaped relation between Ψ and TS. As the BI models that only represent noise as signals of model misspecification. mapped linearly in the relevant disease severity range (5th and The AAPS Journal (2021) 23:45 45 Page 9 of 10 95th percentiles of Z were − 2.3 and − 0.22), the interpretation of are included in the article's Creative Commons licence, unless the medication effect on Z is similar to that on Ψ. Again, more indicated otherwise in a credit line to the material. If material advanced functions could have been evaluated. The properties is not included in the article's Creative Commons licence and of TS models under model misspecification were not investi- your intended use is not permitted by statutory regulation or gated here, but could be a future work. While our current exceeds the permitted use, you will need to obtain permission approach with polynomial link functions assumes perfect directly from the copyright holder. To view a copy of this knowledge of the underlying ICCs, a possible future extension licence, visit http://creativecommons.org/licenses/by/4.0/. could use the analytic link functions in combination with an informative prior to allow taking uncertainty into account. There are many possible approaches to model TS data REFERENCES with CV models. A logit transform for the TS is one way to ensure predictions within the boundaries of the scale—however then the boundaries will only be 1. 
Martinez-Martin P. Composite rating scales. J Neurol Sci. 2010;289(1):7–11. asymptotical. In this work we only investigated untrans- 2. Baker FB. The basics of item response theory. 2nd Ed. College formed TS with additive error as this is a common choice. Park: ERIC Clearinghouse on Assessment and Evaluation, Also for the BI model, a constant SD was used as the aim of University of Maryland; 2001. http://ericae.net/irt/baker. this work was to illustrate the benefits of IRT-informed 3. Wellhagen GJ, Kjellsson MC, Karlsson MO. A bounded integer modelling of TS in a standard setting. Apart from the models model for rating and composite scale data. AAPS J. 2019 06;21(4):74. mentioned above, other models for TS data (not evaluated in 4. Conrado DJ, Denney WS, Chen D, Ito K. An updated Alzheimer’s this work) are for example beta regression and coarsened grid disease progression model: incorporating non-linearity, beta models. regression, and a third-level random effect in NONMEM. J The usefulness of IRT-informed models to better de- Pharmacokinet Pharmacodyn. 2014 Dec;41(6):581–98. 5. Lesaffre E, Rizopoulos D, Tsonaka R. The logistic trans- scribe the unexplained variability of composite scale end- form for bounded outcome scores. Biostat Oxf Engl. points is encouraging. This will facilitate analyses of TS data 2007;8(1):72–85. without the need to develop new IRT models. The impact on 6. Goetz CG, Fahn S, Martinez-Martin P, Poewe W, Sampaio C, precision and accuracy in clinical trials is yet to be quantified, Stebbins GT, et al. Movement Disorder Society-sponsored but is under investigation in a follow-up project (18). The revision of the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS): process, format, and clinimetric testing plan. IRT-informed functions broaden the options available to Mov Disord Off J Mov Disord Soc. 2007;22(1):41–7. modellers dealing with TS data, and could be considered in 7. Zetusky WJ, Jankovic J, Pirozzolo FJ. 
The heterogeneity of standard analysis plans as yet another possible model to be Parkinson’s disease: clinical and prognostic implications. Neu- determined from the data. rology. 1985;35(4):522–6. 8. Vu TC, Nutt JG, Holford NHG. Progression of motor and nonmotor features of Parkinson’s disease and their response CONCLUSIONS to treatment. Br J Clin Pharmacol. 2012;74(2):267–83. 9. Buatois S, Retout S, Frey N, Ueckert S. Item response theory as an IRT-informed functions provide a formal link between efficient tool to describe a heterogeneous clinical rating scale in de IRT models and TS models and allow longitudinal TS novo idiopathic Parkinson’s disease patients. Pharm Res. 2017;34(10):2109–18. modelling to be improved without adding further parameters 10. Holden SK, Finseth T, Sillau SH, Berman BD. Progression of to the model. This approach allows information of different MDS-UPDRS scores over five years in de novo Parkinson model types, based on item- and TS-level data, to be directly disease from the Parkinson’s progression markers initiative compared and their relative merits better understood. To cohort. Mov Disord Clin Pract. 2017;5(1):47–53. 11. UUPharmacometrics/piraid [Internet]. Uppsala University, facilitate for modellers, IRT-informed functions can be Pharmacometrics Research Group; 2020. Available from: automatically generated through the piraid package. https://github.com/UUPharmacometrics/piraid.Accessed14 Apr 2020. SUPPLEMENTARY INFORMATION 12. Marek K, Chowdhury S, Siderowf A, Lasch S, Coffey CS, Caspell-Garcia C, et al. The Parkinson’s progression markers initiative (PPMI) – establishing a PD biomarker cohort. Ann The online version contains supplementary material Clin Transl Neurol. 2018;5(12):1460–77. available at https://doi.org/10.1208/s12248-021-00555-3. 13. Lindbom L, Ribbing J, Jonsson EN. Perl-speaks-NONMEM (PsN)–a Perl module for NONMEM related programming. FUNDING Comput Methods Prog Biomed. 2004 Aug;75(2):85–94. 14. 
Lindbom L, Pihlgren P, Jonsson EN, Jonsson N. PsN-Toolkit–a collection of computer intensive statistical methods for non- Open access funding provided by Uppsala University. linear mixed effect modeling using NONMEM. Comput This work was financially supported by the Swedish Research Methods Prog Biomed. 2005 Sep;79(3):241–57. Council Grant 2018-03317. 15. R Core Team.R:alanguage and environment for statistical computing [internet]. Vienna, Austria: R Foundation for Statistical Computing; 2019. Available from: https://www.R-project.org/. Open Access This article is licensed under a Creative Accessed 16 Mar 2020. Commons Attribution 4.0 International License, which per- 16. Germovsek E, Hansson A, Kjellsson MC, Ruixo JJP, Westin Å, Soons PA, et al. Relating nicotine plasma concentration mits use, sharing, adaptation, distribution and reproduction in to momentary craving across four nicotine replacement any medium or format, as long as you give appropriate credit therapy formulations. Clin Pharmacol Ther. 2020;107(1):238– to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were 17. Schindler E, Friberg LE, Karlsson MO. Comparison of item response theory and classical test theory for power/sample size for made. The images or other third party material in this article 45 Page 10 of 10 The AAPS Journal (2021) 23:45 questionnaire data with various degrees of variability in items’ Data. The AAPS Journal 2021;23(1). https://doi.org/10.1208/ discrimination parameters. PAGE 24 2015 Abstr 3468 [Internet]. s12248-020-00546-w. Available from: www.page-meeting.org/?abstract=3468. Accessed 13 Jul 2020. Publisher’s Note Springer Nature remains neutral with regard 18. Wellhagen GJ, Karlsson MO, Kjellsson MC. Comparison of to jurisdictional claims in published maps and institutional Precision and Accuracy of Five Methods to Analyse Total Score affiliations.
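Illustrative sketch (not part of the published article). The article argues that the total score becomes less variable towards the scale boundaries and that the IRT-informed mean and SD functions capture this. The short Python sketch below illustrates the idea under strong simplifying assumptions: it uses four hypothetical binary items with a two-parameter logistic model, whereas the article works with ordered categorical MDS-UPDRS items and polynomial approximations generated by the piraid package. The item parameters and function names are invented for illustration. The `aic` helper assumes the usual convention AIC = OFV + 2 × (number of estimated parameters), which agrees with the values reported in Tables II to IV.

```python
import math

# Hypothetical 2PL items as (discrimination a, difficulty b); NOT from the article.
ITEMS = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.5)]


def item_prob(psi, a, b):
    """P(item score = 1 | psi) under a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (psi - b)))


def expected_total_score(psi, items=ITEMS):
    """E(TS | psi): the sum of the expected item scores."""
    return sum(item_prob(psi, a, b) for a, b in items)


def sd_total_score(psi, items=ITEMS):
    """SD(TS | psi): items are assumed independent given psi, so variances add."""
    probs = [item_prob(psi, a, b) for a, b in items]
    return math.sqrt(sum(p * (1.0 - p) for p in probs))


def aic(ofv, n_params):
    """AIC assuming OFV is -2 log-likelihood up to a constant."""
    return ofv + 2 * n_params


if __name__ == "__main__":
    # The mean follows an S-shape in psi; the SD peaks mid-scale and shrinks
    # towards the boundaries of the total score.
    for psi in (-6.0, -2.0, 0.0, 2.0, 6.0):
        print(psi, round(expected_total_score(psi), 3), round(sd_total_score(psi), 3))
    # Reproduces the AIC of the S-CV row in Table II from its OFV and parameter count.
    print(aic(58064, 3))
```

In this toy setting the conditional independence of items given Ψ is what lets the variances add; the article notes that correlations between items in real data can make the observed SD higher than this theoretical value.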
"The AAPS Journal" – Springer Journals
Published: Mar 16, 2021