Systematic measurement error in self-reported health: is anchoring vignettes the way out?

aparajita.dasgupta@ashoka.edu.in, Economics, Ashoka University, Sonepat, India
Dasgupta IZA Journal of Development and Migration (2018) 8:12
© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

This paper studies systematic reporting heterogeneity in self-assessed health in India using the World Health Survey (WHS)-SAGE survey, which has subjective assessments of own health and hypothetical vignettes as well as objective measures such as measured anthropometrics and performance tests on a range of health domains. The study implicitly tests and validates the assumption of response consistency in a developing country setting, thus lending support to the use of vignettes. Additionally, we are able to control for unobservable heterogeneities of reporting behavior at the individual level by employing individual fixed-effects estimation using multiple ratings on a set of vignettes by the same person. The study confirms the same pattern of systematic bias by socioeconomic subgroups as is indicated by the vignette technique. It further highlights that a substantial amount of reporting heterogeneity remains unexplained after controlling for the usual socioeconomic control variables. The finding has potentially broader implications for research based on self-reported data in a developing country.
JEL Classification: C83, D91, I12, I18, I15, I32, J10
Keywords: Self-assessed health, Vignette approach, Measurement error, Response consistency

1 Introduction
Self-assessed health (SAH) is one of the most widely used measures in policy design. It is a convenient and informative instrument, often shown to be correlated with actual health, mortality, morbidity, and health access (Rohrer 2009). An individual is typically asked to indicate whether her health status is excellent, good, fair, or poor. Now, any variation in reported health status can come from the following possible sources: variation related to differences in true (latent) health and/or variation in reporting that is driven by the respondent’s personal characteristics. If health perceptions are systematically correlated with socioeconomic characteristics such as income and exposure to health care systems, self-assessed health status can be misleading.
One of the ways to examine systematic measurement error in self-reported health is to formalize the problem of heterogeneous reporting behavior and to formulate tests for its occurrence in the context of subjective health information. In order to correct for systematic differences in reporting across population subgroups, a proposed solution is to anchor an individual’s self-assessed response on her rating of a vignette description of a hypothetical situation that is fixed for all respondents (King et al. 2004; Bago d'Uva et al. 2011). The idea is based on the underlying assumption that any variation in the rating of a vignette (depicting a fixed level of latent health) would identify systematic reporting bias, which can then be adjusted for in the individual’s subjective assessment of her own situation.
The validity of this approach, however, relies on two important assumptions, viz.
“vignette equivalence,” which requires that all individuals perceive the vignette description as corresponding to a given state of the same underlying construct, and “response consistency,” which implies that individuals use the same response categories for their subjective assessment as they have used for evaluating the hypothetical scenarios presented to them in vignettes. The latter assumption will not hold if there are strategic influences on reporting of the individual’s own situation that are absent from evaluation of the vignette (Bago d'Uva et al. 2011). This study is one of the first to test the assumption of response consistency in a developing country setting, where measurement error in survey data may be more of a problem.
The paper first presents a framework to formally test the existence of systematic measurement error across sociodemographic subgroups. We examine systematic reporting heterogeneity in two ways: first using responses from vignette ratings across different health domains and second using a method that combines data on objective and self-reported health indicators. In this process, we implicitly test the validity of the “response consistency” assumption. The paper adds to the existing body of literature by checking the validity of this assumption in a developing country setting. Further, using repeated ratings from the same individual over multiple vignettes, we can control for idiosyncratic heterogeneities through individual fixed-effects estimation. Specifically, we examine whether the pattern of reporting behavior matches that obtained from our first exercise and to what extent it can be explained by socioeconomic characteristics that are usually accounted for in a regression framework.
The study finds a strong presence of systematic reporting heterogeneity in self-assessed health across subgroups and validates the assumption of response consistency.
The result is confirmed in our robustness check, where we even control for individual fixed effects. Further, the study finds that the reporting heterogeneity in SAH can only be partially explained by observable characteristics of individuals and that a large part of it remains unexplained. Thus, the study has important implications for research that relies solely on subjective health data.

2 Theoretical framework
Van Doorslaer and Jones (2002) find that subgroups of the population systematically use different thresholds in classifying their health into a categorical measure. Individuals are likely to use different reference points and interpret the self-assessed health (SAH) question within their own specific context (Lindeboom and van Doorslaer 2004). Sen (1993, 2002) points out that a comparison of self-reported morbidities in a typical developing country setting may find the children in the poorest households to be the healthiest. While various techniques have been proposed for achieving comparable response scales across groups, Murray et al. (2002) indicate anchoring vignettes as “the most promising” of available strategies. Anchoring vignettes reveal how groups may differ in their use of response categories, i.e., where along the health spectrum individuals locate thresholds between the ordered categories. The idea is to vary the health status exogenously in each of the hypothetical cases, so that any difference in rating of these fixed latent health situations identifies a systematic difference in reporting behavior by socioeconomic subgroups. Bago d'Uva et al. (2008), using the vignette technique, rejected reporting homogeneity across educational groups using pilot data from Indonesia, India, and China.
Although the SAH measure is widely used in empirical research, measurement error in SAH is no longer random if, for a given true but unobserved health state, survey respondents report their health differently depending on characteristics such as conceptions of general health, utilization of health services, expectations for own health, financial incentives to report ill health, and comprehension of the survey questions. Bound (1991) highlights that if measurement error in a given variable is not “classical,” it can introduce serious biases in estimates, ranging from simple attenuation to misattributing relationships. Economic circumstances and geographic location may alter health expectations through factors like peer effects, societal norms, and access to medical care. Reporting of health may vary with education through the awareness factor, i.e., conceptions of illness, understanding of disease, and knowledge of the availability, access, and effectiveness of health care. Antman and McKenzie (2007) and Escobal and Laszlo (2008) point to a number of reasons why measurement error may be more of a problem in developing country settings, for which validation data are not readily available. It becomes particularly problematic because there can be a high degree of heterogeneity among respondents in this setting in terms of literacy level and health awareness. Noteworthy is the fact that the state of Kerala (with one of the lowest levels of mortality among Indian states) has consistently reported the highest morbidity rate (approximately three times the all-India average) in three successive rounds of the nationally representative NSS survey, whereas, in contrast, Bihar, with one of the highest mortality rates, reported the lowest morbidity. Banerjee et al.
(2004) mention that sick individuals in a poor, disease-endemic area, with limited health access or opportunities for medical treatment, may report being in good health because some types of illness may be perceived as “normal” phenomena due to their prolonged, widespread occurrence in the area, where people might be adapted to the sickness that they experience. Schultz and Strauss (2008) mention that some illnesses such as blindness, ringworm, or malaria may be perceived as normal phenomena due to their prolonged, widespread occurrence in a disease-prone area without health access, where individuals may not see themselves as particularly unhealthy.
However, one of the key identifying assumptions of the vignette methodology is that of response consistency, which is the assumption that respondents use the same thresholds while evaluating their own health as they use in evaluating the vignettes. Kapteyn et al. (2011), using an Internet-based panel in the USA, find mixed evidence on the response consistency assumption, which holds for certain health domains and not for others. Van Soest et al. (2011) develop an integrated framework in which objective measurements are used to validate vignette-based corrections of subjective assessments of drinking behavior by students in Ireland. Bago d'Uva et al. (2011) point out that the assumption of response consistency is testable given sufficiently comprehensive objective indicators that independently identify response scales. Their study finds mixed results for response consistency in a sample of older English individuals. Although the assumption of response consistency has been debated in the recent literature in the context of developed countries, there exists scant evidence from the developing country setting. The current study addresses this gap by testing this assumption using nationally representative data from India.
This paper provides a formal framework to test the existing pattern of reporting behavior in SAH and offers a simple methodological technique to check the assumption of response consistency used in the vignette approach in a developing country setting, which has important implications for informing survey design using self-assessed responses.

3 Empirical strategy
We employ three distinct methods to test the reporting behavior in SAH responses in this study. First, we test the existence of systematic measurement error in SAH across population subgroups by estimating an ordered probit model of the vignette responses following King et al. (2004) to identify the reporting heterogeneity by covariates. Let $H_i$ be the reported health status for the vignette question, and let $X_i$ be a vector of observed characteristics (sociodemographic covariates potentially susceptible to systematic reporting bias, such as age, gender, education, income, and location). The identification relies on the assumption that a vignette represents a fixed level of latent health; hence, the difference in rating pattern by covariates can be attributed to the systematic reporting heterogeneity associated with the $X$’s. We estimate the following equation:
$$H_i = X_i \beta + u \tag{1}$$
Thus, a positive (negative) and significant coefficient ($\beta$) would imply over-reporting (under-reporting) of worse health, since the categorical response of the dependent variable runs from 1 to 5 in increasing order of health difficulty.
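To make the sign interpretation of Eq. (1) concrete, the following is a minimal sketch of how an ordered probit maps a latent index into the five response categories. It is an illustration under stated assumptions, not the estimation used in the paper: the cutpoints in `CUTS` and the shift of 0.3 are hypothetical values chosen only for the example, and `ordered_probit_probs` and `expected_rating` are helper names introduced here.

```python
from statistics import NormalDist


def ordered_probit_probs(xb, cutpoints):
    """Return [P(H=1), ..., P(H=K)] for latent index xb and increasing cutpoints."""
    cdf = NormalDist().cdf
    # P(H <= k) = Phi(cut_k - xb); pad with 0 and 1 for the outer categories.
    cum = [0.0] + [cdf(c - xb) for c in cutpoints] + [1.0]
    return [cum[k + 1] - cum[k] for k in range(len(cum) - 1)]


def expected_rating(probs):
    """Mean rating on the 1..K scale implied by the category probabilities."""
    return sum((k + 1) * p for k, p in enumerate(probs))


# Hypothetical cutpoints separating the five categories (assumption, not estimated).
CUTS = [-1.5, -0.5, 0.5, 1.5]

baseline = ordered_probit_probs(0.0, CUTS)  # reference group, X*beta = 0
shifted = ordered_probit_probs(0.3, CUTS)   # group with an illustrative beta of +0.3

print([round(p, 3) for p in baseline])
print([round(p, 3) for p in shifted])
print(expected_rating(baseline), expected_rating(shifted))
```

For the same vignette, the group with the positive shift places more probability on category 5 and less on category 1, which is what "over-reporting of worse health" means on this scale.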
As reporting of health status can potentially be influenced by expectations for own health, tolerance of illness, and health norms in society, we include the following controls in the $X$ vector: education categories, gender, age groups, body mass index (BMI) categories, expenditure quintiles, religion, ethnic groups, sector (urban/rural), and an underdeveloped state dummy capturing development in the state (which implicitly captures and controls for access to effective health care and can be a rough measure of tolerance of illness in the society). Banerjee et al. (2004) find that individuals in the upper third income group report the most symptoms over the last 30 days and attribute this to higher awareness of health status. Thus, in order to identify any nonlinear effect of income on reporting bias, we use the middle expenditure quintile as the reference category in our estimated equation.
In our second empirical approach, we identify systematic reporting heterogeneity using both self-reported and objective health indicators collected in the data. We regress the self-reported health ($H_i^{rep}$) on the same set of covariates ($X_i$), controlling for a battery of “objective” health measures ($H_i^{obj}$). The underlying idea is that any systematic variation in subjective assessments that remains after conditioning on the objective indicators can be attributed to systematic biases in reporting behavior.
$$H_i^{rep} = \alpha H_i^{obj} + X_i b + V_i \tag{2}$$
This specification hinges on the premise that, after correcting for “true” health, the reporting heterogeneity (if any) would be reflected in the coefficients of the covariates in Eq. (2). Specifically, the assumption is that the addition of a precise set of objective indicators would soak up the variation coming from differences in true/latent health, leaving the reporting bias to be identified.
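The logic of Eq. (2) can be illustrated with a small synthetic example. This is only a sketch under simplifying assumptions: the self-report is treated as a continuous outcome in a linear regression with a constant (a stand-in for the categorical model), the data are made up and noise-free, and `ols` is a hypothetical helper written here rather than a routine from the paper. The objective score stands in for true health, and males are constructed to report 0.5 points lower for the same objective score.

```python
def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(k):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]


# Synthetic respondents: (objective health score, male dummy). Illustrative only.
people = [(h, m) for h in (1, 2, 3, 4, 5) for m in (0, 1)]
X = [[1.0, float(h), float(m)] for h, m in people]
# Constructed self-reports: same objective health, but males report 0.5 lower.
y = [1.0 + 0.8 * h - 0.5 * m for h, m in people]

const, alpha, b_male = ols(X, y)
print(round(alpha, 6), round(b_male, 6))
```

Once the objective score is in the regression, the coefficient on the male dummy picks up only the constructed reporting gap, which is the role the covariate coefficients play in Eq. (2).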
We maintain that the battery of “objective” health measures is sufficient for this second approach to hold, which in turn lets us test whether the response consistency assumption holds in these data. As we have both the objective and the subjective counterpart for the particular health domains of mobility, functioning, cognition, and memory, we are able to control and condition precisely on the “observable” health counterpart and run this specification. So, one way of implicitly checking the response consistency assumption is to see whether the pattern of reporting heterogeneity by socioeconomic subgroups from Eq. (1) matches that from Eq. (2). Precisely, we claim that the “response consistency” assumption would hold in these data if we get the same signs for the covariate coefficients ($\beta$ in Eq. (1) and $b$ in Eq. (2)) from both estimations.
In our third empirical strategy, we exploit individual fixed-effects estimation to control for individual-specific unobservable factors in reporting heterogeneity. This is more of a robustness exercise in which we tackle even idiosyncratic reporting heterogeneity by employing individual fixed effects. It may be possible that there are certain unobservable characteristics at the individual level (for example, the person not being serious while evaluating responses) that can add to the reporting heterogeneity. With responses on multiple vignettes for the same individual, we are able to control for the individual fixed effects in a two-stage regression estimation. In the first stage (Eq. 3), we regress the vignette responses (10 questions per vignette set for each individual) on individual dummies $ID$ to get their corresponding coefficients $\mu$, which we use in the second stage (Eq. 4) as the dependent variable to be explained by the usual covariates. We claim that any variation in the assessment of $H$ (representing a fixed level of latent health) between respondents is reflected in the $\mu$’s, which capture the reporting heterogeneity at the individual level.
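The two-stage procedure described above can be sketched on a toy data set. This is an illustration under stated assumptions, not the paper's code: the ratings and the `male` covariate are made-up values, every respondent rates the same three vignettes, and the second stage uses a single covariate with a constant. With only individual dummies on the right-hand side, the first-stage least-squares coefficient for each person is simply that person's mean rating.

```python
from statistics import mean

# Hypothetical ratings: each person rates the same three vignettes on the 1-5 scale.
ratings = {
    "p1": [2, 3, 4],
    "p2": [3, 4, 5],
    "p3": [1, 2, 3],
    "p4": [2, 4, 3],
}
# Hypothetical covariate: 1 if the respondent is male, 0 otherwise.
male = {"p1": 0, "p2": 0, "p3": 1, "p4": 1}

# Stage 1: regressing ratings on individual dummies alone gives
# mu_i equal to person i's mean rating across the vignettes.
mu = {pid: mean(r) for pid, r in ratings.items()}

# Stage 2: regress mu_i on the covariate (with a constant) and compute R-squared.
ids = sorted(mu)
x = [male[i] for i in ids]
m = [mu[i] for i in ids]
x_bar, m_bar = mean(x), mean(m)
beta = (
    sum((xi - x_bar) * (mi - m_bar) for xi, mi in zip(x, m))
    / sum((xi - x_bar) ** 2 for xi in x)
)
const = m_bar - beta * x_bar
ss_res = sum((mi - const - beta * xi) ** 2 for xi, mi in zip(x, m))
ss_tot = sum((mi - m_bar) ** 2 for mi in m)
r_squared = 1 - ss_res / ss_tot

print(beta, r_squared)
```

In this toy example, the covariate explains half of the between-person variation in rating levels. The share left unexplained is the part of reporting heterogeneity that observable characteristics cannot account for.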
This method lets us explore the variation in reporting behavior devoid of the noise that can arise due to individual-specific unobservable factors. We estimate the following set of equations:
$$H_i = ID_i \mu + v \tag{3}$$
$$\mu_i = X_i \beta + u \tag{4}$$
Through this exercise, we examine the pattern of reporting heterogeneity that remains after we control for such individual-specific unobservable characteristics. Further, we examine how much of that reporting heterogeneity is explained by observable factors that usually get controlled for in a typical regression, by checking the R-squared of the estimated Eq. (4). The motivation behind this exercise is the fact that vignette adjustments can only detect and correct for reporting heterogeneity by observable characteristics of the respondents. If much of the reporting heterogeneity arises due to unobservable characteristics of the respondents, then the scope of anchoring vignettes for greater inter-personal comparability is limited. The next section discusses the data, followed by the results.

4 Data and summary statistics
The analysis uses the World Health Survey (WHS)-SAGE Wave 1 survey (2007 to 2009) in India, covering six states, namely Maharashtra, Karnataka, West Bengal, Rajasthan, Uttar Pradesh, and Assam. The data collected include self-reported assessments of health linked to anchoring vignettes, which are hypothetical stories that describe the health problems of third parties in several health domains. The data have information on both “subjective” and “objective” measures of identical health questions in addition to the responses on vignettes. The states were selected randomly such that one state was selected from each region (north, central, east, north east, west, and south) as well as from each level of development category.
The level of development was based on four indicators, namely infant mortality rate, female literacy rate, percentage of safe deliveries, and per capita income at the state level. We use the development classification used in WHS to construct a dummy for underdevelopment (=1 for the two least developed states, viz. Rajasthan and Uttar Pradesh, and =0 for the other four states).

Table 1 Descriptive statistics
Variables  Mean  Std. Dev.
Education categories
No formal education  0.45  0.50
Below primary  0.10  0.31
Primary  0.16  0.36
Secondary  0.12  0.33
High school  0.11  0.31
College and above  0.06  0.24
Individual characteristics
Male  0.39  0.49
Age groups
18–29.9  0.14  0.34
30–44.9  0.22  0.41
45–60  0.32  0.47
Above 60  0.32  0.47
Marital status
Currently married  0.78  0.42
BMI categories (measured)
Underweight (BMI < 18.5)  0.35  0.48
Normal (BMI 18.5–24.9)  0.51  0.50
Overweight (BMI 25–29.9)  0.11  0.31
Obese (BMI > 30)  0.03  0.17
Household characteristics
Household’s expenditure quintiles
Q1  0.21  0.41
Q2  0.16  0.37
Q3  0.22  0.42
Q4  0.22  0.41
Q5  0.17  0.38
Religion (Hindu = 1)  0.84  0.37
Caste (SC/ST = 1)  0.41  0.49
Regional characteristics
Urban  0.25  0.43
Underdeveloped dummy (=1 for states: Rajasthan, UP)  0.38  0.49
N = 10873

The data include the following sets of vignettes: mobility and affect; pain and personal relationships; vision, sleep, and energy; and cognition and self-care. The respondent was asked to rate how much of a problem or difficulty the person in the vignette has, on an ordered response scale of 1 to 5, the same scale as used to rate SAH. The survey data include perceptions of well-being and more objective measures of health, including measured performance tests (rapid walk) and cognitive tests (verbal fluency, recall capacity).
We construct four categories of individuals by body mass index using the measured height and weight variables: underweight (BMI < 18.5), normal (BMI 18.5–24.9, the reference category in the regressions), overweight (BMI 25–29.9), and obese (BMI > 30). We include six education categories capturing the highest level of education completed: no formal education (reference category), less than primary education, primary, secondary, high school, and college and above. Age is categorized into four groups: 18 to 29.9 years (reference category), 30 to 44.9 years, 45 to 60 years, and greater than 60. The number of individuals with complete information across the measured health variables is 10,873. Table 1 presents the summary statistics of the key variables of interest.

Fig. 1 Distribution of self-reported health response. Note: SAH is on a 1–5 scale, where 1 = very good and 5 = very poor
Fig. 2 Average self-reported and measured height by expenditure quintiles

Figure 1 presents the variation in SAH responses in the sample. We find that individual responses on SAH cluster around the middle value. We plot average measured and self-reported height across expenditure quintiles and education categories in Figs. 2 and 3, respectively. The plots reveal that, on average, individuals underreport their true height and that this difference becomes smaller for higher expenditure and education categories. Disaggregating by the development category of the states, we find that the difference between reported and measured height is most prominent for individuals from the poorest quintiles (Fig. 4). Interestingly, in states like Karnataka and Uttar Pradesh, the self-reported height is always greater than measured height for all expenditure quintiles, unlike the other states, where it is the opposite. This suggests there may be some cultural factors driving the reporting bias.
Also, individuals from less developed states (correlated with less education and lower access to health facilities) have systematically different reporting behavior. We find that the gap between reported weight and measured weight is significant for all expenditure quintiles in the less developed states but not in the developed states (Fig. 5). Also, the gap is highest in the poorest quintile. This is in line with the finding of Strauss and Thomas (1996), who observe that the gap between maternal reports and measurements of child height is smaller among higher income and better educated mothers. In the next section, we explore this line of enquiry further with our regression framework, followed by robustness checks.

Fig. 3 Average self-reported and measured height by education categories. Note: Categories include: no formal education (=1), below primary (=2), primary (=3), secondary (=4), high school (=5), college (=6), post-graduate degree completed (=7)
Fig. 4 Average self-reported and measured height by expenditure quintiles and state

5 Results
Equation (1) is estimated separately for the 10 health state vignettes from each health domain. We present the regression results for the domains “mobility and affect,” “pain and personal relationships,” “vision, sleep, and energy,” and “cognition and self-care” in Tables 2, 3, 4, and 5, respectively. All specifications for this set of regressions include dummies for education categories, gender, age groups, marital status, body mass index categories, household expenditure quintiles, religion, caste, sector, and level of development in one’s state.
We then regress the dependent variable “how would you rate your health today” on the same set of covariates (Table 6) but include a set of performance tests and interviewer assessments.
Further, we control for (i) performance test scores for mobility and cognitive ability and (ii) biomarkers including tests for lung function, blood pressure, pulse rate, and chronic illness diagnosed (arthritis, stroke, angina, diabetes, chronic lung disease, asthma, depression, hypertension, cataracts, oral health, injuries, cancer screening) in specification (3) in the same table. Specification (4) adds the respective interviewer assessment dummies. The idea is that we are able to control and condition precisely on the “observable” health counterpart and test it by the specific health domains of mobility, functioning, cognition, and memory (results presented in Tables 7, 8, 9, and 10).

Fig. 5 Comparison of self-reported and measured weight by development level and expenditure

Males, on average, show a systematic pattern of under-reporting worse health, consistent across all the health domains. Interestingly, we find that individuals from both lower and higher quintiles have a higher probability of reporting better health compared to the middle income group. Individuals from urban areas are more likely to under-report worse health. The dummy for underdevelopment is negative and statistically significant across specifications. With regard to age, individuals over 60 years tend to over-report illness. Interestingly, those who are underweight or obese, controlling for their objective health, tend to over-report worse health.
Perhaps the most interesting result of this exercise is the systematic reporting bias by the development category of states in India. This is perhaps suggestive of the hypothesis that socially disadvantaged individuals fail to perceive and report the presence of illness because an individual’s assessment of their health is directly contingent on their social experience.
It can be attributed to lower expectations for own health or higher tolerance for disease, where an individual may not see herself as being unhealthy conditional on the health norm prevailing in her community.
We now discuss the findings from the cross-validation exercise estimating Eq. (2) and comment on the validity of the “response consistency” assumption across different health domains. Overall, we find that the patterns in the subjective evaluation of own health problems and of the vignette person are essentially identical, which lends support to the response consistency assumption in the data.
We find that individuals with secondary education and above are more likely to under-report illness, an effect that is statistically significant at the 1% level across all specifications. It is possible that more highly educated respondents feel greater confidence regarding their capacity to handle a given level of health impairment and underrate it. Males, as before, show consistent patterns of under-reporting illness compared to females, statistically significant across specifications. Compared to the youngest age group, individuals over 60 years significantly over-report illness, which is consistent with our earlier finding from the vignette approach.
Both the poor and the rich tend to understate illness compared to the middle expenditure group.
The underdeveloped dummy is consistently negative and statistically

Table 2 Vignettes set 1: mobility and affect
Variables (1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
Education categories (ref category: no formal education)
Below primary 0.02 −0.03 0.19** 0.12 0.01 −0.06 0.07 0.01 0.01 −0.04
Primary −0.01 −0.07 0.10 0.01 0.04 0.01 0.09 0.03 0.00 0.00
Secondary 0.08 −0.07 0.09 −0.05 0.00 −0.01 −0.09 −0.12* 0.14* 0.03
High school 0.06 −0.09 0.11 −0.01 0.14* −0.00 −0.09 −0.18** 0.09 0.06
College and above −0.07 −0.21** 0.09 −0.07 0.18 0.15 −0.03 −0.10 0.02 0.03
Individual characteristics
Male −0.12** 0.12** −0.18*** −0.18*** −0.25*** −0.10* −0.03 −0.10** −0.15*** −0.05
Age groups (ref category: age 18–29.9 years)
30–44.9 −0.02 0.09 0.03 0.01 0.03 0.02 0.12 0.04 0.17** 0.15**
45–60 0.04 0.09 0.05 0.03 −0.03 −0.03 −0.01 −0.09 0.12* 0.20***
Above 60 0.15** 0.21*** 0.13* 0.11 0.12 0.11 0.08 0.03 0.23*** 0.26***
Marital status
Currently married 0.06 −0.03 0.01 0.05 0.00 −0.02 −0.02 0.01 −0.06 −0.07
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) 0.01 0.04 −0.07 −0.06 −0.05 0.02 0.03 0.00 −0.06 −0.03
Overweight (BMI 25–29.9) 0.03 −0.01 −0.06 −0.05 0.13* 0.08 −0.20*** −0.15** −0.02 0.06
Obese (BMI > 30) −0.00 0.23* −0.15 −0.25* 0.11 0.21 0.01 0.03 0.24* 0.10
Household’s expenditure quintiles (ref category: Q3)
Q1 −0.08 −0.16** −0.21*** −0.22*** −0.07 −0.18*** −0.10* −0.13** −0.17*** −0.14**
Q2 −0.06 −0.06 −0.14** −0.14** 0.05 0.02 −0.03 −0.06 −0.09 −0.07
Q4 −0.04 −0.06 −0.13** −0.12** −0.04 −0.09 −0.04 −0.12* −0.12** −0.17***
Q5 0.00 0.06 −0.14* −0.10 −0.01 −0.02 0.03 −0.02 −0.06 −0.05
Religion (Hindu = 1) −0.06 −0.10* 0.01 0.08 −0.15*** −0.12** −0.06 −0.05 0.01 −0.07
Caste (SC/ST = 1) 0.12*** 0.05 0.03 0.03 0.10** 0.10** −0.03 0.02 0.22*** 0.16***
Regional characteristics
Urban −0.12** −0.12** −0.05 −0.02 −0.04 −0.09* −0.04 −0.01 −0.09* −0.14***
Underdeveloped −0.14*** 0.12*** −0.26*** −0.12** −0.36*** −0.13*** −0.02 0.02 −0.20*** 0.05
Observations 2674 2674 2674 2674 2674 2674 2674 2674 2674 2674
***p < 0.01, **p < 0.05, *p < 0.1

Table 3 Vignettes set 2: pain and personal relationships
Variables (1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
Education categories (ref category: no formal education)
Below primary −0.05 −0.09 0.03 0.15** 0.07 0.05 −0.12 −0.05 −0.08 −0.09
Primary −0.17*** −0.12** −0.03 −0.02 0.04 0.03 −0.04 −0.01 −0.10 −0.09
Secondary −0.02 −0.21*** 0.09 0.20*** −0.07 −0.02 −0.02 −0.01 −0.01 0.00
High school −0.01 −0.11 0.03 0.07 0.07 0.03 −0.08 −0.10 −0.01 −0.01
College and above −0.19* −0.25** 0.16 0.23** 0.07 −0.01 −0.13 −0.07 −0.00 0.00
Individual characteristics
Male −0.06 −0.03 −0.08 −0.13*** −0.11** −0.14*** 0.15*** 0.13*** 0.09* −0.03
Age groups (ref category: age 18–29.9 years)
30–44.9 0.05 −0.03 0.06 0.11 0.01 0.09 0.02 −0.06 0.01 0.05
45–60 0.02 −0.03 −0.01 0.07 −0.12* −0.01 0.02 −0.08 −0.03 0.07
Above 60 0.10 0.03 −0.05 0.02 −0.12 0.00 −0.03 −0.09 −0.08 0.05
Marital status
Currently married −0.05 0.04 −0.11** −0.09* −0.08 −0.04 −0.14*** −0.10* −0.07 −0.02
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) 0.01 −0.02 0.05 0.04 0.02 0.02 0.04 0.04 0.03 −0.01
Overweight (BMI 25–29.9) −0.02 −0.08 −0.03 −0.02 −0.02 0.04 0.10 0.10 −0.03 −0.01
Obese (BMI > 30) 0.10 −0.14 0.05 0.09 0.04 0.01 −0.19 −0.08 −0.21* −0.18
Household’s expenditure quintiles (ref category: Q3)
Q1 −0.09 −0.11* −0.15** −0.05 −0.09 −0.03 −0.11* −0.08 −0.03 −0.07
Q2 0.07 0.05 −0.08 −0.05 −0.12* −0.08 0.05 0.09 0.06 0.12*
Q4 0.04 0.09 −0.04 −0.10 −0.08 −0.00 0.04 0.04 0.04 0.02
Q5 −0.05 0.00 −0.13* −0.16** 0.07 −0.06 0.06 0.08 0.02 0.05
Religion (Hindu = 1) −0.02 0.13** −0.08 0.00 0.07 0.01 −0.08 −0.06 −0.12** −0.13**
Caste (SC/ST = 1) 0.13*** 0.08* −0.04 −0.09** 0.15*** 0.06 −0.12*** −0.11** −0.06 −0.06
Regional characteristics
Urban −0.06 −0.07 −0.01 −0.02 −0.08 −0.00 0.01 −0.03 0.01 −0.03
Underdeveloped −0.05 0.03 −0.20*** 0.01 −0.25*** −0.15*** −0.37*** −0.19*** −0.23*** −0.02
Observations 2729 2729 2729 2729 2729 2729 2729 2729 2729 2729
***p < 0.01, **p < 0.05, *p < 0.1

Table 4 Vignettes set 3: vision, sleep, and energy
Variables (1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
Education categories (ref category: no formal education)
Below primary 0.12 0.11 0.01 −0.01 0.06 0.09 0.06 0.06 0.03 0.01
Primary 0.01 −0.13** −0.03 0.03 0.01 −0.11* −0.08 −0.08 −0.00 −0.08
Secondary 0.08 0.00 −0.04 −0.09 0.18** 0.09 0.07 0.08 0.01 0.00
High school 0.30*** 0.14* −0.06 −0.16** 0.12 −0.00 −0.11 −0.08 0.03 −0.15*
College and above 0.47*** 0.26*** 0.05 −0.14 0.17* 0.03 0.24** 0.15 0.18* −0.03
Individual characteristics
Male −0.07 −0.04 −0.05 −0.08 0.01 0.01 −0.18*** −0.21*** −0.13*** −0.09*
Age groups (ref category: age 18–29.9 years)
30–44.9 −0.04 0.00 −0.03 −0.11 0.15** 0.07 0.05 0.02 −0.02 −0.07
45–60 0.03 0.05 0.09 −0.00 0.14** 0.12* 0.02 −0.01 0.01 −0.04
Above 60 0.08 0.07 0.11 0.08 0.20*** 0.11 0.00 0.01 0.05 −0.01
Marital status
Currently married −0.08 −0.09* 0.00 0.07 0.01 −0.08 −0.02 −0.02 0.06 −0.02
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) 0.00 −0.02 0.03 0.05 0.03 −0.00 0.09** 0.08* 0.09** 0.10**
Overweight (BMI 25–29.9) −0.10 0.02 0.01 0.01 −0.11 −0.06 0.04 0.12* −0.00 −0.02
Obese (BMI > 30) −0.05 0.07 −0.19 0.02 −0.20* −0.08 0.11 0.22* 0.09 0.13
Household’s expenditure quintiles (ref category: Q3)
Q1 0.00 −0.05 −0.06 −0.04 0.02 −0.05 0.05 −0.00 0.03 −0.04
Q2 −0.01 −0.06 −0.09 0.00 −0.05 0.00 0.03 0.08 0.07 0.05
Q4 −0.10 −0.12** −0.08 0.06 −0.02 −0.02 0.08 0.02 0.09 0.02
Q5 −0.08 −0.04 −0.06 0.05 0.05 0.07 0.07 −0.02 0.07 0.10
Religion (Hindu = 1) −0.04 −0.02 0.01 −0.03 −0.02 0.06 −0.02 0.05 −0.04 0.06
Caste (SC/ST = 1) 0.08* 0.11*** −0.04 −0.17*** −0.08* 0.01 −0.02 −0.06 −0.20*** −0.12***
Regional characteristics
Urban −0.13** −0.12** −0.01 0.05 −0.08 −0.10* −0.08 −0.02 −0.12** −0.03
Underdeveloped −0.37*** −0.09** −0.25*** −0.21*** −0.10** −0.10** −0.42*** −0.23*** −0.36*** −0.30***
Observations 2771 2771 2771 2771 2771 2771 2771 2771 2771 2771
***p < 0.01, **p < 0.05, *p < 0.1

Table 5 Vignettes set 4: cognition and self-care
Variables (1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
Education categories (ref category: no formal education)
Below primary −0.12* −0.09 −0.12* −0.13* 0.06 0.03 −0.02 0.07 0.01 −0.03
Primary −0.11* −0.11* −0.09 −0.05 0.08 0.02 0.02 0.04 −0.12* −0.15**
Secondary −0.13* −0.05 −0.04 −0.08 −0.09 −0.04 −0.15** −0.10 −0.01 −0.03
High school −0.04 −0.01 −0.08 −0.07 0.09 0.11 0.18** 0.15** 0.02 0.04
College and above −0.06 −0.08 −0.01 −0.04 0.06 0.10 0.18* 0.16 0.05 −0.03
Individual characteristics
Male 0.03 0.01 0.01 −0.09* −0.12** −0.19*** −0.16*** −0.19*** 0.07 0.02
Age groups (ref category: age 18–29.9 years)
30–44.9 0.05 0.03 0.08 0.09 0.06 0.01 −0.15** −0.16** 0.11 0.10
45–60 0.11 0.05 0.10 0.08 0.07 0.07 −0.19*** −0.18** 0.13* 0.04
Above 60 0.24*** 0.19*** 0.18** 0.21*** 0.12* 0.15** −0.14* −0.12 0.11 0.14**
Marital status
Currently married −0.06 −0.03 −0.01 0.05 0.08 0.09* 0.05 0.09* −0.10* −0.09
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) −0.00 0.07 0.05 0.04 0.04 0.00 0.01 −0.04 0.10** 0.05
Overweight (BMI 25–29.9) −0.03 −0.05 −0.02 −0.06 −0.07 −0.03 0.06 0.02 0.01 0.04
Obese (BMI > 30) −0.13 −0.07 −0.13 0.01 0.06 −0.11 −0.09 −0.02 0.01 0.09
Household’s expenditure quintiles (ref category: Q3)
Q1 0.02 −0.01 −0.09 −0.15** 0.03 −0.06 0.06 0.08 −0.04 −0.03
Q2 0.04 0.05 −0.01 −0.09 −0.01 −0.11* 0.09 0.05 −0.06 −0.04
Q4 0.13** 0.09 −0.04 −0.11* −0.05 −0.05 0.03 0.09 0.07 0.03
Q5 0.13* 0.10 −0.01 0.00 0.13* 0.09 0.11* 0.15** −0.02 −0.05
Religion (Hindu = 1) −0.07 −0.04 −0.00 −0.06 −0.00 0.05 0.02 0.03 0.03 −0.01
Caste (SC/ST = 1) 0.19*** 0.17*** 0.01 0.02 0.12*** 0.16*** 0.13*** 0.13*** −0.12*** −0.06
Regional characteristics
Urban −0.02 −0.06 −0.11** −0.11** 0.04 −0.01 0.07 −0.02 −0.01 0.03
Underdeveloped −0.52*** −0.41*** −0.25*** −0.08* −0.38*** −0.19*** −0.28*** −0.23*** −0.23*** −0.09**
Observations 2699 2699 2699 2699 2699 2699 2699 2699 2699 2699
***p < 0.01, **p < 0.05, *p < 0.1

Table 6 Dependent variable: self-reported health
Variables (1) (2) (3) (4)
Health today
Education categories (ref category: no formal education)
Below primary −0.10*** −0.09** 0.07* −0.07*
Primary −0.10*** −0.08** −0.04 −0.03
Secondary −0.25*** −0.23*** −0.16*** −0.14***
High school −0.38*** −0.36*** −0.25*** −0.23***
College and above −0.63*** −0.60*** −0.46*** −0.44***
Individual characteristics
Male −0.12*** −0.13*** −0.09*** −0.08***
Age groups (ref category: age 18–29.9 years)
30–44.9 0.51*** 0.53*** 0.50*** 0.47***
45–60 0.82*** 0.85*** 0.76*** 0.70***
Above 60 1.18*** 1.19*** 1.04*** 0.91***
Marital status
Currently married −0.06** −0.05* −0.04 −0.03
Household’s expenditure quintiles (ref category: Q3)
Q1 −0.02 −0.04 −0.02 −0.03
Q2 −0.04 −0.05 −0.05
−0.05 Q4 −0.01 0.00 0.02 0.03 Q5 −0.09** −0.07** −0.09** −0.08** Religion (Hindu = 1) −0.19*** −0.19*** −0.18*** −0.17*** Caste (SC/ST = 1) −0.05** −0.06*** −0.08*** −0.11*** Regional characteristics Urban −0.11*** −0.10*** −0.10*** −0.08*** Underdeveloped −0.27*** −0.27*** −0.26*** −0.32*** BMI categories (measured) (ref Underweight (BMI < 18.5) 0.20*** 0.17*** 0.15*** category: normal BMI 18.5–24.9) Overweight (BMI 25–29.9) 0.02 0.00 −0.01 Obese (BMI > 30) 0.08 0.01 0.00 Rapid Walk −0.32*** −0.08 Cognitive score 1 −0.02 −0.02 Cognitive score 2 −0.06*** −0.06*** Cognitive score 3 0.00 0.00 Performance tests Chronic illness 0.25*** 0.20*** Lung function 0.0 0.00 Blood pressure systolic 0.00 0.00 Blood pressure diastolic 0.00 0.00 Pulse rate 0.00*** 0.00*** Hearing 0.35*** Vision 0.17*** Interviewer assessments Walking 0.36*** Shortness of breath 0.28*** Overall health problem 0.33*** Observations 10,873 10,873 10,873 10,873 ***p < 0.01, **p < 0.05, *p < 0.1 significant across all specifications, implying a underreporting of worse health among the disadvantaged group. This coefficient increases after interviewer assessments of health states are controlled, confirming that it is picking up the reporting bias. We further estimate a vector of self-reported functioning measures in the domain of mobility (Table 7), daily activities (Table 8), and cognitive outcomes (Table 9). 
While es- timation of self-reported measure for memory would suggest that males fare better, we Dasgupta IZA Journal of Development and Migration (2018) 8:12 Page 20 of 30 Table 7 Dependent variables: self-reported functioning measures across various domains of mobility Variables (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) Education categories (ref category: no formal education) Below primary −0.05 −0.07* −0.02 −0.06 −0.02 −0.02 −0.00 −0.02 −0.06 0.01 −0.08** −0.04 Primary −0.06* −0.05 −0.04 −0.09*** −0.03 −0.09*** −0.08** −0.07* −0.10** −0.08** −0.10*** −0.21*** Secondary −0.23*** −0.30*** −0.26*** −0.26*** −0.22*** −0.27*** −0.28*** −0.22*** −0.18*** −0.26*** −0.25*** −0.25*** High school −0.23*** −0.30*** −0.20*** −0.37*** −0.28*** −0.27*** −0.24*** −0.24*** −0.17*** −0.21*** −0.38*** −0.34*** College and above −0.50*** −0.52*** −0.51*** −0.60*** −0.60*** −0.51*** −0.55*** −0.49*** −0.42*** −0.62*** −0.56*** −0.45*** Individual characteristics Male −0.27*** −0.14*** −0.36*** −0.44*** −0.44*** −0.38*** −0.42*** −0.28*** −0.09*** −0.15*** −0.18*** −0.18*** Age groups (ref category: age 18–29.9 years) 30–44.9 0.48*** 0.48*** 0.49*** 0.35*** 0.56*** 0.52*** 0.44*** 0.51*** 0.35*** 0.42*** 0.21*** 0.44*** 45–60 0.83*** 0.84*** 0.86*** 0.70*** 1.00*** 0.98*** 0.80*** 1.03*** 0.76*** 0.77*** 0.44*** 0.84*** Above 60 1.27*** 01.31*** 1.18*** 1.09*** 1.40*** 1.41*** 1.23*** 1.50*** 1.19*** 1.13*** 0.76*** 1.15*** Marital status Currently married −0.03 −0.05* 0.02 0.02 −0.01 0.00 0.02 −0.01 −0.02 −0.09*** −0.14*** 0.02 Household’s expenditure quintiles (ref category: Q3) Q1 0.08** −0.03 −0.01 0.02 0.03 −0.01 0.03 0.04 0.12*** 0.07** −0.05 0.06 Q2 0.06 −0.02 0.06* 0.07* 0.07** 0.05 0.04 0.04 0.13*** 0.06* −0.04 0.08** Q4 0.04 −0.01 −0.03 0.06* −0.03 −0.01 −0.02 −0.01 −0.07 0.01 −0.02 0.04 Q5 0.00 0.03 −0.03 −0.02 0.02 −0.05 −0.03 −0.03 −0.10** −0.08** 0.00 −0.06 Religion (Hindu = 1) −0.12*** −0.13*** −0.11*** −0.14*** −0.12*** −0.16*** −0.14*** −0.11*** −0.08** 
−0.12*** −0.08** −0.00 Caste (SC/ST = 1) 0.14*** −0.06** 0.07*** −0.00 0.11*** 0.03 0.14*** 0.08*** 0.09*** 0.18*** −0.10*** 0.13*** Regional characteristics Urban −0.11*** −0.17*** −0.09*** −0.03 −0.06** −0.08*** −0.16*** −0.13*** 0.03 −0.05* −0.06** −0.05 Underdeveloped −0.18*** −0.04* −0.22*** −0.33*** −0.22*** −0.06** −0.34*** −0.20*** −0.05 −0.08*** −0.30*** −0.30*** Dasgupta IZA Journal of Development and Migration (2018) 8:12 Page 21 of 30 Table 7 Dependent variables: self-reported functioning measures across various domains of mobility (Continued) Variables (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) BMI categories (measured) (ref category: normal BMI 18.5–24.9) Underweight (BMI < 18.5) 0.08*** 0.12*** 0.06** 0.07*** 0.07*** 0.11*** 0.11*** 0.03 0.06** 0.09*** 0.10*** 0.04 Overweight (BMI 25–29.9) 0.12*** 0.06* 0.12*** 0.10** 0.15*** 0.19*** 0.05 0.17*** 0.07 0.04 0.09** 0.09** Obese (BMI > 30) 0.27*** 0.27*** 0.20*** 0.30*** 0.35*** 0.27*** 0.19*** 0.32*** 0.10 0.14** 0.18*** 0.17** Walk difficulty Timed walk −0.34 0.22 −0.24 −0.03 −0.35 −0.14 0.41 0.24 −0.69** −0.15 0.01 −0.43 Rapid walk −0.36 −0.46* −0.12 −0.59** −0.29 −0.43* −0.88*** −0.78*** 0.17 −0.36 −0.33 0.19 Interviewer assessment −0.73*** −0.34*** −0.49*** −0.37*** −0.50*** −0.54*** −0.36*** −0.46*** −0.29*** −0.39*** −0.37*** −0.52*** Observations 10,873 10,873 10,873 10,873 10,873 10,873 10,873 10,873 10,873 10,873 10,873 10,873 The dependent variables in all the specifications take values 1–5 measuring self-reported difficulty level (1 = no difficulty; 5 = extreme difficulty) faced by the respondent in the specific activity describing some form of mobility ***p < 0.01, **p < 0.05, *p < 0.1 Dasgupta IZA Journal of Development and Migration (2018) 8:12 Page 22 of 30 Table 8 Dependent variables: self-reported functioning measures across various domains of daily activities Variables (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) Education categories (ref category: no formal education) Below 
primary −0.05 −0.04 −0.06 −0.09** −0.08** −0.12** −0.06 0.01 −0.07 −0.05 −0.09** Primary −0.09*** −0.16*** −0.16*** −0.12*** −0.09** −0.12*** −0.19*** −0.17*** −0.18*** −0.14*** −0.19*** Secondary −0.27*** −0.22*** −0.16** −0.31*** −0.23*** −0.21*** −0.09* −0.22*** −0.23*** −0.31*** −0.40*** High school −0.29*** −0.28*** −0.33*** −0.25*** −0.26*** −0.22*** −0.18*** −0.33*** −0.30*** −0.43*** −0.45*** College and above −0.58*** −0.50*** −0.54*** −0.57*** −0.44*** −0.49*** −0.39*** −0.61*** −0.61*** −0.59*** −0.69*** Individual characteristics Male −0.39*** −0.07* −0.03 −0.29*** −0.20*** −0.18*** −0.16*** −0.25*** −0.19*** −0.30*** −0.16*** Age groups (ref category: age 18–29.9 years) 30–44.9 0.31*** 0.22*** 0.38*** 0.32*** 0.39*** 0.35*** 0.34*** 0.50*** 0.43*** 0.31*** 0.29*** 45–60 0.70*** 0.58*** 0.68*** 0.73*** 0.82*** 0.73*** 0.66*** 0.89*** 0.76*** 0.65*** 0.70*** Above 60 1.15*** 0.95*** 1.00*** 1.06*** 1.23*** 1.11*** 1.04*** 1.24*** 1.14*** 1.09*** 1.16*** Marital status Currently married 0.03 −0.06 −0.07* −0.03 −0.06** −0.02 −0.07** −0.03 −0.01 −0.05* −0.05 Household’s expenditure quintiles (ref category: Q3) Q1 0.03 0.00 0.03 0.01 −0.04 −0.00 −0.01 0.03 0.06 −0.04 −0.09*** Q2 0.02 0.01 −0.04 −0.01 −0.02 0.03 −0.02 0.06 0.00 0.03 −0.02 Q4 0.02 −0.02 −0.02 −0.02 0.06* −0.01 −0.01 −0.00 −0.03 0.00 −0.02 Q5 0.02 −0.03 −0.05 −0.08** 0.02 0.02 −0.05 −0.04 −0.12*** −0.05 −0.06* Religion (Hindu = 1) −0.18*** −0.14*** −0.15*** −0.17*** −0.16*** −0.10*** −0.01 −0.07** −0.01 −0.09*** −0.11*** Caste (SC/ST = 1) 0.06*** 0.02 −0.04 0.06** 0.03 −0.01 0.03 0.07*** 0.07** 0.08*** −0.01 Regional characteristics Urban −0.07*** 0.00 0.00 −0.06** −0.20*** −0.06* 0.03 −0.09*** −0.10*** −0.11*** −0.01 Underdeveloped −0.27*** −0.17*** −0.21*** −0.28*** −0.16*** −0.23*** 0.25*** −0.30*** 0.02 0.15*** 0.19*** Dasgupta IZA Journal of Development and Migration (2018) 8:12 Page 23 of 30 Table 8 Dependent variables: self-reported functioning measures across various domains of daily 
activities (Continued) Variables (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) BMI categories (measured) (ref category: normal BMI 18.5–24.9) Underweight (BMI < 18.5) 0.11*** 0.15*** 0.10*** 0.08*** 0.13*** 0.11*** 0.10*** 0.07*** 0.07** 0.10*** 0.12*** Overweight (BMI 25–29.9) 0.14*** 0.08 0.05 0.02 0.08** 0.10** −0.03 0.15*** 0.11** 0.10** 0.13*** Obese (BMI > 30) 0.36*** 0.20** 0.22** 0.21*** 0.26*** 0.24*** 0.06 0.33*** 0.27*** 0.23*** 0.28*** Walk difficulty Timed walk 0.08 −0.48 −1.00*** 0.13 0.39 −0.55* −0.31 −−0.22 −0.55* −0.26 −0.22 Rapid walk −0.71*** −0.14 0.35 −0.65** −0.92*** −0.17 −0.04 −0.13 −0.04 −0.46* −0.44* Interviewer assessment −0.55*** −0.51*** −0.63*** −0.50*** −0.39*** −0.56*** −0.36*** −0.64*** −0.59*** −0.43*** −0.55*** Observations 10,873 10,873 10,873 10,873 10,873 10,873 10,873 10,873 10,873 10,873 10,873 The dependent variables in all the specifications take values 1–5 measuring self-reported difficulty level (1 = no difficulty; 5 = extreme difficulty) faced by the respondent in the specific activity describing some form of daily activities ***p < 0.01, **p < 0.05, *p < 0.1 Dasgupta IZA Journal of Development and Migration (2018) 8:12 Page 24 of 30 Table 9 Dependent variable: self-reported cognitive difficulty Variables (1) (2) Self-reported cognition Memory Concentration Education categories (ref category: no formal education) Below primary −0.12*** −0.11*** Primary −0.09** −0.10*** Secondary −0.30*** −0.33*** High school −0.38*** −0.42*** College and above −0.52*** −0.69*** Individual characteristics Male −0.18*** −0.07** Age groups (ref category: age 18–29.9 years) 30–44.9 0.53*** 0.48*** 45–60 0.88*** 0.81*** Above 60 1.22*** 1.23*** Marital status Currently married −0.08*** −0.08*** Household’s expenditure quintiles (ref category: Q3) Q1 −0.00 −0.02 Q2 −0.01 −0.01 Q4 0.03 −0.00 Q5 −0.01 −0.03 Religion (Hindu = 1) −0.15*** −0.05 Caste (SC/ST = 1) 0.01 −0.01 Regional characteristics Urban −0.23*** −0.13*** Underdeveloped −0.10*** 
−0.10*** Cognitive tests Cognitive score 1 −0.04*** −0.04*** Cognitive score 2 −0.08*** −0.07*** Words recalled −0.03*** −0.02*** The dependent variables in both the specifications take values 1–5 measuring self-reported difficulty level (1 = no difficulty; 5 = extreme difficulty) faced by the respondent in remembering and concentrating things. Objective measures include (test of words recalled after delay, digital recall test and verbal fluency) ***p < 0.01, **p < 0.05 find contrary result when we estimate objective memory test for words recalled (Table 10). As expected, individuals from underdeveloped states score lower on both cognitive tests. The findings reveal systematic underreporting of worse health among males, higher educated groups, and urban and underdeveloped states reconfirming our earlier findings. Interestingly, we find that coefficient on the underdeveloped dummy for interviewer assessed health problem reveals that individuals from underdeveloped states were more likely to have health problems (results not included). The distribution of μ for different health domains (Figs. 6, 7, 8,and 9) reveals substantial reporting heterogeneity between individuals. We examine how much of this variation in μ can be explained by the observable characteristics of the respon- dents in Table 11. The result matches with what we found earlier. 
We find that Dasgupta IZA Journal of Development and Migration (2018) 8:12 Page 25 of 30 Table 10 Dependent variable: objective memory tests Variables Objective memory tests Education categories (ref category: no formal education) Below primary 0.24*** Primary 0.39*** Secondary 0.57*** High school 0.80*** College and above 0.98*** Individual characteristics Male −0.07*** Age groups (ref category: age 18–29.9 years) 30–44.9 −0.25*** 45–60 −0.56*** Above 60 −0.85*** Marital status Currently married 0.08*** Household’s expenditure quintiles (ref category: Q3) Q1 −0.12*** Q2 −0.05 Q4 0.04 Q5 0.12*** Religion (Hindu = 1) 0.04 Caste (SC/ST = 1) −0.01 Regional characteristics Urban 0.16*** Underdeveloped 0.11*** The dependent variables in all the specifications are objective measures of memory and cognition including (test of words recalled after delay, digital recall test, and verbal fluency) ***p < 0.01 males are more likely to favorably rank their health state and individuals above 60 years were likely to overstate bad health. Interestingly, both the quintiles above and below the middle expenditure group were likely to understate ill health. Indi- viduals from underdeveloped states were found to be consistently underestimating health problems. This has important implications for inter-personal comparability of self-reported data even within a geographical region that may not be homoge- neous in terms of development. Now, in order to see how much of the reporting heterogeneity can be attributed to the observable characteristics, we examine the R-square of the estimated Eq. (4) for different health domains. We find that the R-square for estimations (1)to(4)is just explaining 3% (mobility and affect) to 7% (cognition and self-care) of the vari- ation in the self-reported behavior. 
This is alarming given that we can only control and adjust for the observables in the regression, which leaves much of the reporting heterogeneity at the individual level typically unaccounted for. Also, this potentially limits the use of the anchoring vignette approach to make SAH responses more comparable if much of the reporting heterogeneity between individuals is due to unobservable factors that we do not control for in a regression. We discuss the implications of our results in the next section.

Fig. 6 Distribution of individual reporting heterogeneity from vignette set A. Note: Health domains in set A include mobility and affect
Fig. 7 Distribution of individual reporting heterogeneity from vignette set B. Note: Health domains in set B include pain and personal relationships
Fig. 8 Distribution of individual reporting heterogeneity from vignette set C. Note: Health domains in set C include vision, sleep, and energy
Fig. 9 Distribution of individual reporting heterogeneity from vignette set D. Note: Health domains in set D include cognition and self-care

Table 11 Estimations of two-stage regressions using individual fixed effects
Variables | (1) Vignette set A | (2) Vignette set B | (3) Vignette set C | (4) Vignette set D
Education categories
Below primary | 0.02 | −0.03 | 0.04 | −0.11
Primary | 0.00 | −0.05 | −0.05 | −0.10*
Secondary | −0.01 | −0.02 | 0.03 | −0.11*
High school | 0.00 | −0.01 | −0.01 | −0.03
College and above | 0.01 | −0.02 | 0.11** | −0.05
Individual characteristics
Male | −0.08*** | −0.01 | −0.07*** | 0.03
Age groups
30–44.9 | 0.05 | 0.03 | 0.00 | 0.04
45–60 | 0.02 | −0.01 | 0.04 | 0.10
Above 60 | 0.11*** | −0.01 | 0.07* | 0.21***
Marital status
Currently married | −0.01 | −0.06** | −0.02 | −0.05
BMI categories (measured)
Underweight (BMI < 18.5) | −0.01 | 0.02 | 0.04* | 0.00
Overweight (BMI 25–29.9) | −0.02 | −0.01 | −0.01 | −0.03
Obese (BMI > 30) | 0.05 | −0.05 | 0.01 | −0.12
Household’s expenditure quintiles
Q1 | −0.16*** | −0.07** | −0.01 | 0.02
Q2 | −0.08** | 0.01 | 0.01 | 0.03
Q4 | −0.11*** | 0.01 | −0.00 | 0.11*
Q5 | −0.05 | −0.01 | 0.02 | 0.11*
Religion (Hindu = 1) | −0.04 | −0.01 | −0.00 | −0.06
Caste (SC/ST = 1) | 0.08*** | −0.01 | −0.05** | 0.17***
Regional characteristics
Urban | −0.06** | −0.02 | −0.06** | −0.02
Underdeveloped | −0.09*** | −0.12*** | −0.22*** | −0.47***
Constant | −0.29*** | −0.16*** | −0.44*** | 2.45***
Observations | 2673 | 2728 | 2770 | 2698
R-square | 0.03 | 0.02 | 0.04 | 0.07
***p < 0.01, **p < 0.05, *p < 0.1

6 Conclusions
One of the key challenges in the analysis and interpretation of health survey data is improving the interpersonal comparability of subjective indicators that carry systematic measurement error, which arises from differences in the ways that individuals understand and use the available responses for a given question. In this paper, we examine the pattern of reporting differences in SAH from a nationally representative survey in India and find evidence that measurement error in SAH systematically varies with demographic characteristics, such as age, gender, and education, and with community characteristics, such as sector and the level of development in the state. This has important implications in several respects.

First, one should be careful in using self-reported health data for inter-personal comparison of health status. This becomes all the more relevant for policy formulation in a developing country setting like India, where objective data on health are scarce and one has to rely on self-reported health measures for assessing the health situation of the country. This has consequences for the evaluation of health policies that are entirely based on self-reported data on morbidity, utilization of and expenditure on health care, perceived well-being, and self-rated ranking of health service delivery used in citizen and community report cards. Hence, drawing causal inference about a program based on self-reported health measures needs to be re-examined in the light of this problem. Further, one has to reflect on the problem that this reporting heterogeneity cannot be dealt with simply by controlling for the covariates in a typical regression framework.

The findings on systematic reporting behavior by social disadvantage are mixed in our study. While there are non-linearities in systematic reporting bias by education and expenditure quintile of the respondents, we find that individuals from underdeveloped states underreport the presence of illness or health deficits across all specifications. We additionally verify that the “response consistency” assumption holds in these data. Further, controlling for individual fixed effects, we purge the idiosyncratic unobservable features of individual reporting behavior and confirm the earlier patterns of systematic bias by gender, age, and development level. Also, we point out that the observable characteristics of the respondents explain only a small portion of this remaining heterogeneity in SAH. Hence, we argue that given the dearth of objective health information, which is often costly to collect in a developing country setting, the inclusion of vignette profiles in a questionnaire provides an arguably low-cost means of identifying the systematic bias in responses, thus mitigating this problem.

Endnotes
For India, only pilot data from Andhra Pradesh were analyzed. It has been argued that individuals may use different thresholds for rating vignette questions as opposed to rating self-reported health questions. In order to see whether reporting bias varies by true health, we include the measured body mass index categories (viz. underweight, normal, overweight, and obese). Implementation of SAGE Wave 1 was from 2007 to 2010 in six countries over different regions of the world (China, Ghana, India, Mexico, Russian Federation, and South Africa). The sample was stratified by state and locality (urban/rural), resulting in 12 strata, and is nationally representative. Of the 28 states, 19 were included in the design, which covered 96% of the population. The survey implemented a multistage cluster sampling design resulting in nationally representative cohorts. A composite index of the level of development was computed by giving equal weightage to the four indicators. The states were ranked in this decreasing order of development (Maharashtra > Karnataka > West Bengal > Assam > Rajasthan > Uttar Pradesh) based on the composite index of infant mortality rate, female literacy rate, percentage of safe deliveries, and per capita income. Around 500 observations do not have scores or were not measured on some performance tests, i.e., less than 5% of the sample had missing information on X’s; however, they were not dropped from the analysis. The only exception is the health domain of pain and discomfort, where the male dummy changes sign and is actually positive and significant in 3 estimations (Table 3). The inclusion of the interaction terms of the covariates also does not seem to improve the R-square. For example, Gilligan et al. (2009) use self-perceived well-being as an outcome of interest in examining the causal impact of the PSNP food security program in Ethiopia.

Acknowledgements
I would like to thank the anonymous referee and the editor for the useful remarks.
Responsible editor: David Lam

Competing interests
The IZA Journal of Development and Migration is committed to the IZA Guiding Principles of Research Integrity. The author declares that she has observed these principles.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 4 May 2017 Accepted: 16 January 2018

References
Antman F, McKenzie D. Earnings mobility and measurement error: a pseudo-panel approach. Econ Dev Cult Chang. 2007;56(1):125.
Bago d’Uva T, Lindeboom M, O’Donnell O, Van Doorslaer E. Slipping anchor? Testing the vignettes approach to identification and correction of reporting heterogeneity. J Hum Resour. 2011;46(4):875–906.
Bago d’Uva T, Van Doorslaer E, Lindeboom M, O’Donnell O. Does reporting heterogeneity bias the measurement of health disparities? Health Econ. 2008;17:351–75.
Banerjee A, Deaton A, Duflo E. Health, health care, and economic development: wealth, health, and health services in rural Rajasthan. Am Econ Rev. 2004;94(2):326.
Bound J. Self reported versus objective measures of health in retirement models. J Hum Resour. 1991;26(1):107–37.
Escobal J, Laszlo S. Measurement error in access to markets. Oxf Bull Econ Stat. 2008;70(2):209–43.
Gilligan DO, Hoddinott J, Kumar NR, Taffesse AS. An impact evaluation of Ethiopia’s productive safety nets program. Washington, DC: International Food Policy Research Institute; 2009.
Kapteyn A, Smith JP, van Soest A, Vonkova H. Anchoring vignettes and response consistency. Santa Monica: RAND Corporation; 2011. https://www.rand.org/pubs/working_papers/WR840.html
King GA, Murray CJL, Salomon JA, Tandon A. Enhancing the validity and cross-cultural comparability of measurement in survey research. American Political Science Review. 2004;98(1):191–207.
Lindeboom M, van Doorslaer E. Cut-point shift and index shift in self-reported health. J Health Econ. 2004;23(6):1083–99.
Murray CJL, Tandon A, Salomon J, Mathers CD, Sadana R. Cross-population comparability of evidence for health policy. Global Programme on Evidence for Health Policy discussion paper. Geneva: World Health Organization; 2002.
Rohrer JE. Use of published self-rated health-impact studies in community health needs assessment. Journal of Public Health Management and Practice. 2009;15(4):363–6.
Schultz TP, Strauss J. Handbook of development economics (vol. 4). North Holland: Elsevier; 2008.
Sen A. Positional objectivity. Philos Public Aff. 1993;22(2):126–45.
Sen A. Health: perception versus observation: self reported morbidity has severe limitations and can be extremely misleading. Br Med J. 2002;324(7342):860.
Strauss J, Thomas D. Measurement and mismeasurement of social indicators. Am Econ Rev. 1996;86(2):30–4.
Van Doorslaer E, Jones AM. Inequalities in self-reported health: validation of a new approach to measurement. J Health Econ. 2002:61–87.
Van Soest A, Delaney L, Harmon C, Kapteyn A, Smith J. Validating the use of anchoring vignettes for the correction of response scale differences in subjective questions. Journal of the Royal Statistical Society: Series A. 2011;174(3):575–95.

Systematic measurement error in self-reported health: is anchoring vignettes the way out?

IZA Journal of Migration, Volume 8 (1) – Jun 28, 2018

Publisher
Springer Journals
Copyright
Copyright © 2018 by The Author(s).
Subject
Economics; Population Economics; Labor Economics; Migration; Demography
eISSN
2520-1786
DOI
10.1186/s40176-018-0120-z

Abstract

Economics, Ashoka University, Sonepat, India (aparajita.dasgupta@ashoka.edu.in)

This paper studies systematic reporting heterogeneity in self-assessed health in India using the World Health Survey (WHS)-SAGE survey, which has subjective assessments of own health and hypothetical vignettes as well as objective measures such as measured anthropometrics and performance tests on a range of health domains. The study implicitly tests and validates the assumption of response consistency in a developing country setting, thus lending support to the use of vignettes. Additionally, we are able to control for unobservable heterogeneities of reporting behavior at the individual level by employing individual fixed-effects estimation using multiple ratings on a set of vignettes by the same person. The study confirms the identical pattern of systematic bias by socioeconomic subgroups as is indicated by the vignette technique. It further highlights that a substantial amount of reporting heterogeneity remains unexplained after controlling for the usual socioeconomic control variables. The finding has potentially broader implications for research based on self-reported data in a developing country.

JEL Classification: C83, D91, I12, I18, I15, I32, J10
Keywords: Self-assessed health, Vignette approach, Measurement error, Response consistency

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

1 Introduction
Self-assessed health (SAH) is one of the most widely used measures in policy design; it is a convenient and informative instrument often shown to be correlated with actual health, mortality, morbidity, and health access (Rohrer 2009). An individual is typically asked to indicate whether her health status is excellent, good, fair, or poor. Now, any variation in reported health status can come from the following possible sources: variation related to differences in true (latent) health and/or variation in reporting which is driven by the respondent’s personal characteristics. If health perceptions are systematically correlated with socioeconomic characteristics such as income and exposure to health care systems, self-assessed health status can be misleading. One of the ways to examine systematic measurement error in self-reported health is to formalize the problem of heterogeneous reporting behavior and to formulate tests for its occurrence in the context of subjective health information. In order to correct for systematic differences in reporting across population subgroups, a proposed solution is to anchor an individual’s self-assessed response on her rating of a vignette description of a hypothetical situation that is fixed for all respondents (King et al. 2004; Bago d’Uva et al. 2011). The idea is based on the underlying assumption that any variation in the rating of a vignette (depicting a fixed level of latent health) would identify systematic reporting bias, which can then be adjusted for in the individual’s subjective assessment of her own situation.

The validity of this approach, however, relies on two important assumptions, viz. “vignette equivalence,” which requires that all individuals perceive the vignette description as corresponding to a given state of the same underlying construct, and “response consistency,” which implies that individuals use the same response categories for their subjective assessment as they have used for evaluating the hypothetical scenarios presented to them in vignettes. The latter assumption will not hold if there are strategic influences on reporting of the individual’s own situation that are absent from evaluation of the vignette (Bago d’Uva et al. 2011). This study is one of the first to test the assumption of response consistency in a developing country setting where measurement error in survey data may be more of a problem.

The paper first presents a framework to formally test the existence of systematic measurement error across sociodemographic subgroups. We examine systematic reporting heterogeneity in two ways: first using responses from vignette ratings across different health domains and second using a method that combines data on objective and self-reported health indicators. In this process, we implicitly test the validity of the “response consistency” assumption. The paper adds to the existing body of literature by explicitly checking the validity of this assumption in a developing country setting. Further, using repeated ratings from the same individual over multiple vignettes, we can control for idiosyncratic heterogeneities in the individual fixed-effects estimation. We examine whether the pattern of reporting behavior matches that obtained from our first exercise and to what extent it can be explained by socioeconomic characteristics that are usually accounted for in a regression framework.

The study finds a strong presence of systematic reporting heterogeneity in self-assessed health across subgroups and validates the assumption of response consistency. The result is confirmed in our robustness check, where we additionally control for individual fixed effects. Further, it finds that the reporting heterogeneity in SAH can only be partially explained by observable characteristics of individuals and a large part of it remains unexplained. Thus, the study has important implications for research that relies solely on subjective health data.
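The individual fixed-effects idea outlined in this introduction can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the paper's estimation code: ratings are treated as continuous rather than as ordered 1–5 categories, the reporting effect is taken to be each respondent's mean deviation from the vignette averages, and every name and number here (`reporting_effects`, `ols_r2`, `male`, `underdeveloped`, the coefficients in `true_mu`) is hypothetical.

```python
import numpy as np

def reporting_effects(ratings):
    """Per-person reporting effect: mean deviation from each vignette's average rating.

    ratings has shape (n_people, n_vignettes).
    """
    vignette_means = ratings.mean(axis=0, keepdims=True)
    return (ratings - vignette_means).mean(axis=1)

def ols_r2(X, y):
    """OLS of y on X with an intercept; returns (coefficients, R-squared)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return beta, 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
n_people, n_vignettes = 2000, 5

# Hypothetical observables and an unobserved individual reporting style.
male = rng.integers(0, 2, n_people)
underdeveloped = rng.integers(0, 2, n_people)
unobserved = rng.normal(0.0, 1.0, n_people)
true_mu = -0.1 * male - 0.3 * underdeveloped + unobserved

# Every respondent rates the same vignettes, whose latent severity is fixed.
severity = np.linspace(1.5, 4.5, n_vignettes)
noise = rng.normal(0.0, 0.5, (n_people, n_vignettes))
ratings = severity + true_mu[:, None] + noise

# Stage 1: recover individual effects. Stage 2: regress them on observables.
mu_hat = reporting_effects(ratings)
beta, r2 = ols_r2(np.column_stack([male, underdeveloped]), mu_hat)
print("coefficients (const, male, underdeveloped):", beta)
print("share of reporting heterogeneity explained:", r2)
```

Because most of the simulated reporting style is unobserved, the second-stage R-squared stays small even though the observable coefficients are recovered, which mirrors the kind of pattern the paper describes.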
2 Theoretical framework

Van Doorslaer and Jones (2002) find that subgroups of the population systematically use different thresholds in classifying their health into a categorical measure. Individuals are likely to use different reference points and to interpret the self-assessed health (SAH) question within their own specific context (Lindeboom and van Doorslaer 2004). Sen (1993, 2002) points out that a comparison of self-reported morbidities in a typical developing country setting may find that children in the poorest households are the healthiest. While various techniques have been proposed for achieving comparable response scales across groups, Murray et al. (2002) identify anchoring vignettes as “the most promising” of the available strategies. Anchoring vignettes reveal how groups may differ in their use of response categories, i.e., where along the health spectrum individuals locate the thresholds between the ordered categories. The idea is to vary the health status exogenously in each of the hypothetical cases, so that any difference in the rating of these fixed latent health situations identifies systematic differences in reporting behavior by socioeconomic subgroup. Bago d'Uva et al. (2008), using the vignette technique, rejected reporting homogeneity across educational groups in pilot data from Indonesia, India, and China.

Although the SAH measure is widely used in empirical research, measurement error in SAH is no longer random if, for a given true but unobserved health state, survey respondents report their health differently depending on characteristics such as conceptions of general health, utilization of health services, expectations for own health, financial incentives to report ill health, and comprehension of the survey questions.
Bound (1991) highlights that if measurement error in a given variable is not “classical,” it can introduce serious biases in estimates, ranging from simple attenuation to misattribution of relationships. Economic circumstances and geographic location may alter health expectations through factors such as peer effects, societal norms, and access to medical care. Reporting of health may vary with education through awareness, i.e., conceptions of illness, understanding of disease, and knowledge of the availability, accessibility, and effectiveness of health care. Antman and McKenzie (2007) and Escobal and Laszlo (2008) point to a number of reasons why measurement error may be more of a problem in developing country settings, for which validation data are not readily available. It becomes particularly problematic because respondents in this setting can be highly heterogeneous in literacy and health awareness. Notably, the state of Kerala (with one of the lowest levels of mortality among Indian states) has consistently reported the highest morbidity rate (approximately three times the all-India average) in three successive rounds of the nationally representative NSS survey, whereas Bihar, with one of the highest mortality rates, reported the lowest morbidity. Banerjee et al. (2004) mention that sick individuals in a poor, disease-endemic area with limited health access or opportunities for medical treatment may report being in good health, because some types of illness may be perceived as “normal” phenomena owing to their prolonged, widespread occurrence in the area, where people may have adapted to the sickness they experience.
Schultz and Strauss (2008) mention that some illnesses such as blindness, ringworm, or malaria may be perceived as normal phenomena owing to their prolonged, widespread occurrence in a disease-prone area without health access, where individuals may not see themselves as particularly unhealthy.

However, one of the key identifying assumptions of the vignette methodology is response consistency: the assumption that respondents use the same thresholds when evaluating their own health as they use in evaluating the vignettes. Kapteyn et al. (2011), using an Internet-based panel in the USA, find mixed evidence on the response consistency assumption, which holds for certain health domains and not for others. Van Soest et al. (2011) develop an integrated framework in which objective measurements are used to validate vignette-based corrections of subjective assessments of drinking behavior by students in Ireland. Bago d'Uva et al. (2011) point out that the assumption of response consistency is testable given sufficiently comprehensive objective indicators that independently identify response scales. Their study finds mixed results for response consistency in a sample of older English individuals. Although the assumption of response consistency has been debated in the recent literature in the context of developed countries, there is scant evidence from developing country settings. The current study addresses this gap by testing the assumption using nationally representative data from India.

This paper provides a formal framework to test the existing pattern of reporting behavior in SAH and offers a simple methodological technique to check the assumption of response consistency used in the vignette approach in a developing country setting, which has important implications for informing survey design using self-assessed responses.
3 Empirical strategy

We employ three distinct methods to test reporting behavior in SAH responses. First, we test for systematic measurement error in SAH across population subgroups by estimating an ordered probit model of the vignette responses, following King et al. (2004), to identify reporting heterogeneity by covariates. Let $H_i$ be the reported health status for the vignette question and $X_i$ a vector of observed characteristics (sociodemographic covariates potentially susceptible to systematic reporting bias, such as age, gender, education, income, and location). Identification relies on the fact that a vignette represents a fixed level of latent health; hence, differences in rating patterns by covariates can be attributed to the systematic reporting heterogeneity associated with the $X$’s. We estimate the following equation:

$$H_i = X_i \beta + u_i \qquad (1)$$

Because the categorical response of the dependent variable runs from 1 to 5 with increasing degree of worse health or difficulty, a positive (negative) and significant coefficient ($\beta$) implies over-reporting (under-reporting) of worse health.

As reporting of health status can potentially be influenced by expectations for own health, tolerance of illness, and health norms in society, we include the following controls in the $X$ vector: education categories, gender, age groups, body mass index (BMI) categories, expenditure quintiles, religion, ethnic groups, sector (urban/rural), and an underdeveloped-state dummy capturing development in the state (which implicitly controls for access to effective health care and can serve as a rough measure of tolerance of illness in the society). Banerjee et al. (2004) find that individuals in the upper third income group report the most symptoms over the last 30 days, and attribute this to higher awareness of health status.
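To make the interpretation of β in Eq. (1) concrete, the following is a minimal sketch of how an ordered probit maps a latent index into probabilities over the five response categories. The cutpoints and the coefficient value of 0.25 are hypothetical numbers chosen for illustration, not estimates from the paper, and the function names are our own.

```python
import math

def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(index, cutpoints):
    """Return P(H = 1), ..., P(H = K) for a latent index X*beta and K-1 ordered cutpoints."""
    cdf = [normal_cdf(c - index) for c in cutpoints]
    bounds = [0.0] + cdf + [1.0]
    return [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]

def expected_rating(probs):
    """Mean of the 1..K rating under the given category probabilities."""
    return sum((k + 1) * p for k, p in enumerate(probs))

# Hypothetical cutpoints for a 1-5 difficulty scale (illustrative values only).
CUTPOINTS = [-1.5, -0.5, 0.5, 1.5]

baseline = ordered_probit_probs(0.0, CUTPOINTS)   # reference-category respondent
shifted = ordered_probit_probs(0.25, CUTPOINTS)   # covariate with an assumed beta of +0.25

print([round(p, 3) for p in baseline])
print([round(p, 3) for p in shifted])
print(expected_rating(shifted) > expected_rating(baseline))
```

Under these assumptions, a positive coefficient moves probability mass toward the higher (worse health) categories for the same vignette, which is the sense in which a positive β is read as over-reporting of worse health.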
Thus, to identify any nonlinear effect of income on reporting bias, we use the middle expenditure quintile as the reference category in our estimated equation.

In our second empirical approach, we identify systematic reporting heterogeneity using both self-reported and objective health indicators collected in the data. We regress self-reported health ($H_i^{rep}$) on the same set of covariates ($X_i$), controlling for a battery of “objective” health measures ($H_i^{obj}$). The underlying idea is that any systematic variation in subjective assessments that remains after conditioning on the objective indicators can be attributed to systematic biases in reporting behavior.

$$H_i^{rep} = \alpha H_i^{obj} + X_i b + V_i \qquad (2)$$

This specification hinges on the fact that, after correcting for “true” health, any reporting heterogeneity is reflected in the coefficients on the covariates in Eq. (2). Specifically, the assumption is that a precise set of objective indicators soaks up the variation coming from differences in true (latent) health, leaving the reporting bias to be identified. We claim that the battery of “objective” health measures is sufficient for this second approach to hold, which lets us test whether the response consistency assumption holds in these data. As we have both objective and subjective measures for the health domains of mobility, functioning, cognition, and memory, we can precisely condition on the “observable” health counterpart when running this specification.

So, one way of implicitly checking the response consistency assumption is to see whether the pattern of reporting heterogeneity by socioeconomic subgroup from Eq. (1) matches that from Eq. (2). Precisely, we claim that the “response consistency” assumption holds in these data if the covariate coefficients ($\beta$ in Eq. (1) and $b$ in Eq. (2)) have the same signs in both estimations.
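The sign comparison between Eq. (1) and Eq. (2) can be sketched in a few lines. The coefficients below are copied from Table 2, vignette (1), and Table 6, specification (4); the helper names are our own, and the check deliberately ignores statistical significance and covers only one vignette column, so it is an illustration of the idea rather than the paper's full procedure.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)

def sign_agreement(vignette_coefs, own_health_coefs):
    """Map each shared covariate to True if both coefficients have the same sign."""
    shared = sorted(set(vignette_coefs) & set(own_health_coefs))
    return {name: sign(vignette_coefs[name]) == sign(own_health_coefs[name])
            for name in shared}

# Coefficients copied from Table 2, vignette (1), and Table 6, specification (4).
eq1_vignette = {"male": -0.12, "above_60": 0.15, "urban": -0.12, "underdeveloped": -0.14}
eq2_own_health = {"male": -0.08, "above_60": 0.91, "urban": -0.08, "underdeveloped": -0.32}

agreement = sign_agreement(eq1_vignette, eq2_own_health)
share_matching = sum(agreement.values()) / len(agreement)
print(agreement)
print(share_matching)
```

A fuller check would repeat this for every vignette column in Tables 2 to 5, where individual coefficients do not always keep the same sign across vignettes.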
In our third empirical strategy, we use individual fixed-effects estimation to control for individual-specific unobservable factors in reporting heterogeneity. This is more of a robustness exercise, in which we tackle even idiosyncratic reporting heterogeneity by employing individual fixed effects. It may be possible that certain unobservable characteristics at the individual level (for example, the person not being serious while evaluating responses) add to the reporting heterogeneity. With responses on multiple vignettes for the same individual, we are able to control for individual fixed effects in a two-stage regression estimation. In the first stage (Eq. 3), we regress the vignette responses (10 questions per vignette set for each individual) on individual dummies $ID_i$ to obtain their coefficients $\mu_i$, which we use in the second stage (Eq. 4) as the dependent variable to be explained by the usual covariates. We claim that any variation between respondents in the assessment of $H$ (representing a fixed level of latent health) is reflected in the $\mu$’s, which capture reporting heterogeneity at the individual level. This method lets us explore the variation in reporting behavior free of the noise that can arise from individual-specific unobservable factors. We estimate the following set of equations:

$$H_i = ID_i \mu_i + v_i \qquad (3)$$

$$\mu_i = X_i \beta + u_i \qquad (4)$$

Through this exercise, we examine the pattern of reporting heterogeneity that remains after we control for such individual-specific unobservable characteristics. Further, we examine how much of that reporting heterogeneity is explained by the observable factors usually controlled for in a typical regression, by checking the R-squared of the estimated Eq. (4). The motivation behind this exercise is that vignette adjustments can only detect and correct for reporting heterogeneity by observable characteristics of the respondents.
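The two-stage procedure in Eqs. (3) and (4) can be illustrated with a small sketch. It assumes the first stage is a linear regression on individual dummies alone with no intercept, in which case each μ is simply that respondent's mean vignette rating; the second stage is shown with a single binary covariate for brevity. The ratings and the covariate are made-up toy values, not survey data, and the function names are our own.

```python
from statistics import mean

def stage_one(ratings_by_person):
    """Eq. (3): with individual dummies only, each OLS coefficient is the person's mean rating."""
    return {pid: mean(values) for pid, values in ratings_by_person.items()}

def stage_two(mu, x):
    """Eq. (4) with one covariate: OLS of mu_i on x_i; returns (intercept, slope, r_squared)."""
    ids = sorted(mu)
    ys = [mu[i] for i in ids]
    xs = [x[i] for i in ids]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((a - x_bar) ** 2 for a in xs)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_tot = sum((b - y_bar) ** 2 for b in ys)
    ss_res = sum((b - intercept - slope * a) ** 2 for a, b in zip(xs, ys))
    return intercept, slope, 1.0 - ss_res / ss_tot

# Made-up ratings (1-5) on the same three vignettes for four hypothetical respondents.
ratings = {"a": [2, 3, 4], "b": [3, 4, 5], "c": [1, 2, 3], "d": [2, 2, 5]}
male = {"a": 1, "b": 0, "c": 1, "d": 0}  # hypothetical covariate

mu = stage_one(ratings)
intercept, slope, r_squared = stage_two(mu, male)
print(mu)
print(intercept, slope, r_squared)
```

In this toy example the covariate explains half of the variation in the individual effects; the remainder is the kind of unexplained reporting heterogeneity that the R-squared of Eq. (4) is meant to reveal.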
If much of the reporting heterogeneity arises from unobservable characteristics of the respondents, then the scope of anchoring vignettes for improving interpersonal comparability is limited. The next section discusses the data, followed by the results.

4 Data and summary statistics

The analysis uses the World Health Survey (WHS)-SAGE Wave 1 survey (2007 to 2009) in India, covering six states: Maharashtra, Karnataka, West Bengal, Rajasthan, Uttar Pradesh, and Assam. The data include self-reported assessments of health linked to anchoring vignettes, which are hypothetical stories that describe the health problems of third parties in several health domains. The data contain both “subjective” and “objective” measures for identical health questions, in addition to the responses on vignettes. The states were selected randomly such that one state was selected from each region (north, central, east, north east, west, and south) as well as from each level of development category. The level of development was based on four indicators: infant mortality rate, female literacy rate, percentage of safe deliveries, and per capita income at the state level. We use the development classification used in WHS to construct a dummy for underdevelopment (=1 for the two least developed states, viz. Rajasthan and Uttar Pradesh, and =0 for the other four states).

Table 1 Descriptive statistics
Variables | Mean | Std. Dev.
Education categories
No formal education | 0.45 | 0.50
Below primary | 0.10 | 0.31
Primary | 0.16 | 0.36
Secondary | 0.12 | 0.33
High school | 0.11 | 0.31
College and above | 0.06 | 0.24
Individual characteristics
Male | 0.39 | 0.49
Age groups
18–29.9 | 0.14 | 0.34
30–44.9 | 0.22 | 0.41
45–60 | 0.32 | 0.47
Above 60 | 0.32 | 0.47
Marital status
Currently married | 0.78 | 0.42
BMI categories (measured)
Underweight (BMI < 18.5) | 0.35 | 0.48
Normal (BMI 18.5–24.9) | 0.51 | 0.50
Overweight (BMI 25–29.9) | 0.11 | 0.31
Obese (BMI > 30) | 0.03 | 0.17
Household characteristics
Household’s expenditure quintiles
Q1 | 0.21 | 0.41
Q2 | 0.16 | 0.37
Q3 | 0.22 | 0.42
Q4 | 0.22 | 0.41
Q5 | 0.17 | 0.38
Religion (Hindu = 1) | 0.84 | 0.37
Caste (SC/ST = 1) | 0.41 | 0.49
Regional characteristics
Urban | 0.25 | 0.43
Underdeveloped dummy (=1 for states: Rajasthan, UP) | 0.38 | 0.49
N = 10873

The vignette sets in the data cover four groups of domains: mobility and affect; pain and personal relationships; vision, sleep, and energy; and cognition and self-care. The respondent was asked to rate how much of a problem or difficulty the person in the vignette has, on an ordered response scale of 1 to 5, the same scale used to rate SAH. The survey data include perceptions of well-being as well as more objective measures of health, including measured performance tests (rapid walk) and cognitive tests (verbal fluency, recall capacity). We construct four BMI categories from measured height and weight: underweight (BMI < 18.5), normal (BMI 18.5–24.9; reference category in the regressions), overweight (BMI 25–29.9), and obese (BMI > 30).
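The BMI categories can be constructed from measured height and weight along the following lines. This is a sketch under stated assumptions: height is taken to be recorded in centimetres and weight in kilograms, and the boundaries are closed as BMI < 18.5, 18.5 to below 25, 25 to below 30, and 30 or above, since the categories listed in the text leave values such as 24.95 or exactly 30 unassigned. The function names are our own.

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight in kilograms divided by squared height in metres."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2

def bmi_category(value):
    """Assign one of the four categories; boundary handling is an assumption (see lead-in)."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"  # reference category in the regressions
    if value < 30.0:
        return "overweight"
    return "obese"

# Hypothetical measurements for illustration (kg, cm).
for weight, height in [(50, 170), (65, 170), (80, 170), (95, 170)]:
    print(weight, height, bmi_category(bmi(weight, height)))
```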
We include six education categories capturing the highest level of education completed: no formal education (reference category), less than primary education, primary, secondary, high school, and college and above. Age is categorized into four groups: 18 to 29.9 years (reference category), 30 to 44.9 years, 45 to 60 years, and above 60. Complete information across the measured health variables is available for 10,873 individuals. Table 1 presents the summary statistics of the key variables of interest.

Figure 1 presents the variation in SAH responses in the sample. Individual responses on SAH cluster around the middle value. We plot the distribution of measured and reported height across expenditure quintiles and education categories in Figs. 2 and 3, respectively. On average, individuals underreport their true height, and this difference becomes smaller for higher expenditure and education categories. Disaggregating by the development category of the states, we find that the difference between reported and measured height is most prominent for individuals from the poorest quintiles (Fig. 4). Interestingly, in states like Karnataka and Uttar Pradesh, self-reported height is greater than measured height for all expenditure quintiles, unlike the other states, where the opposite holds. This suggests that cultural factors may be driving the reporting bias. Also, individuals from less developed states (correlated with less education and lower access to health facilities) have systematically different reporting behavior.

Fig. 1 Distribution of self-reported health response. Note: SAH is on a 1–5 scale, where 1 = very good; 5 = very poor

Fig. 2 Average self-reported and measured height by expenditure quintiles
We find that the gap between reported and measured weight is significant in all expenditure quintiles for the less developed states but not for the developed states (Fig. 5). The gap is also largest in the poorest quintile. This is in line with Strauss and Thomas (1996), who observe that the gap between maternal reports and measurements of child height is smaller among higher-income and better-educated mothers. In the next section, we pursue this line of enquiry further with our regression framework, followed by robustness checks.

Fig. 3 Average self-reported and measured height by education categories. Note: Categories include: no formal education (=1), below primary (=2), primary (=3), secondary (=4), high school (=5), college (=6), postgraduate degree completed (=7)

Fig. 4 Average self-reported and measured height by expenditure quintiles and state

5 Results

Equation (1) is estimated separately for the 10 health state vignettes in each health domain. We present the regression results for the domains “mobility and affect,” “pain and personal relationships,” “vision, sleep, and energy,” and “cognition and self-care” in Tables 2, 3, 4, and 5, respectively. All specifications in this set of regressions include dummies for education categories, gender, age groups, marital status, body mass index categories, household expenditure quintiles, religion, caste, sector, and the level of development in one’s state.

We then regress the response to “how would you rate your health today” on the same set of covariates (Table 6), adding a set of performance tests and interviewer assessments.
Further, we control for (i) performance test scores for mobility and cognitive ability and (ii) biomarkers, including tests for lung function, blood pressure, and pulse rate, and diagnosed chronic illness (arthritis, stroke, angina, diabetes, chronic lung disease, asthma, depression, hypertension, cataracts, oral health, injuries, cancer screening), in specification (3) of the same table. Specification (4) adds the respective interviewer assessment dummies. The idea is that we are able to precisely condition on the “observable” health counterpart and test it for the specific health domains of mobility, functioning, cognition, and memory (results presented in Tables 7, 8, 9, and 10).

Males, on average, show a systematic pattern of under-reporting worse health, consistent across all health domains. Interestingly, individuals from both lower and higher quintiles are more likely to report better health than the middle income group. Individuals from urban areas are more likely to under-report worse health. The dummy for underdevelopment is negative and statistically significant across specifications. With regard to age, individuals over 60 years tend to over-report illness. Interestingly, those who are underweight or obese, controlling for their objective health, tend to over-report worse health.

Fig. 5 Comparison of self-reported and measured weight by development level and expenditure

Perhaps the most interesting result of this exercise is the systematic reporting bias by the underdevelopment category of states in India. This is suggestive of the hypothesis that socially disadvantaged individuals fail to perceive and report the presence of illness because an individual’s assessment of their health is directly contingent on their social experience.
It can be attributed to lower expectations for own health or higher tolerance for disease, where an individual may not see herself as unhealthy given the health norm prevailing in her community.

We now discuss the findings from the cross-validation exercise estimating Eq. (2) and comment on the validity of the “response consistency” assumption across different health domains. Overall, we find that the subjective evaluation of own health problems and that of the vignette person follow essentially the same pattern, which lends support to the response consistency assumption in the data.

We find that individuals with secondary education and above are more likely to under-report illness, an effect that is statistically significant at the 1% level across all specifications. It is possible that more highly educated respondents feel greater confidence in their capacity to handle a given level of health impairment and so underrate it. Males, as before, show a consistent pattern of under-reporting illness compared to females, statistically significant across specifications. Compared to the youngest age group, individuals over 60 years significantly over-report illness, which is consistent with our earlier finding from the vignette approach.

Both the poor and the rich tend to understate illness compared to the middle expenditure group.
The underdeveloped dummy is consistently negative and statistically significant across all specifications, implying under-reporting of worse health among the disadvantaged group. This coefficient increases in magnitude after interviewer assessments of health states are controlled for, confirming that it is picking up reporting bias. We further estimate a vector of self-reported functioning measures in the domains of mobility (Table 7), daily activities (Table 8), and cognitive outcomes (Table 9).

Table 2 Vignettes set 1: mobility and affect
Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10)
Education categories (ref category: no formal education)
Below primary | 0.02 | −0.03 | 0.19** | 0.12 | 0.01 | −0.06 | 0.07 | 0.01 | 0.01 | −0.04
Primary | −0.01 | −0.07 | 0.10 | 0.01 | 0.04 | 0.01 | 0.09 | 0.03 | 0.00 | 0.00
Secondary | 0.08 | −0.07 | 0.09 | −0.05 | 0.00 | −0.01 | −0.09 | −0.12* | 0.14* | 0.03
High school | 0.06 | −0.09 | 0.11 | −0.01 | 0.14* | −0.00 | −0.09 | −0.18** | 0.09 | 0.06
College and above | −0.07 | −0.21** | 0.09 | −0.07 | 0.18 | 0.15 | −0.03 | −0.10 | 0.02 | 0.03
Individual characteristics
Male | −0.12** | 0.12** | −0.18*** | −0.18*** | −0.25*** | −0.10* | −0.03 | −0.10** | −0.15*** | −0.05
Age groups (ref category: age 18–29.9 years)
30–44.9 | −0.02 | 0.09 | 0.03 | 0.01 | 0.03 | 0.02 | 0.12 | 0.04 | 0.17** | 0.15**
45–60 | 0.04 | 0.09 | 0.05 | 0.03 | −0.03 | −0.03 | −0.01 | −0.09 | 0.12* | 0.20***
Above 60 | 0.15** | 0.21*** | 0.13* | 0.11 | 0.12 | 0.11 | 0.08 | 0.03 | 0.23*** | 0.26***
Marital status
Currently married | 0.06 | −0.03 | 0.01 | 0.05 | 0.00 | −0.02 | −0.02 | 0.01 | −0.06 | −0.07
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) | 0.01 | 0.04 | −0.07 | −0.06 | −0.05 | 0.02 | 0.03 | 0.00 | −0.06 | −0.03
Overweight (BMI 25–29.9) | 0.03 | −0.01 | −0.06 | −0.05 | 0.13* | 0.08 | −0.20*** | −0.15** | −0.02 | 0.06
Obese (BMI > 30) | −0.00 | 0.23* | −0.15 | −0.25* | 0.11 | 0.21 | 0.01 | 0.03 | 0.24* | 0.10
Household’s expenditure quintiles (ref category: Q3)
Q1 | −0.08 | −0.16** | −0.21*** | −0.22*** | −0.07 | −0.18*** | −0.10* | −0.13** | −0.17*** | −0.14**
Q2 | −0.06 | −0.06 | −0.14** | −0.14** | 0.05 | 0.02 | −0.03 | −0.06 | −0.09 | −0.07
Q4 | −0.04 | −0.06 | −0.13** | −0.12** | −0.04 | −0.09 | −0.04 | −0.12* | −0.12** | −0.17***
Q5 | 0.00 | 0.06 | −0.14* | −0.10 | −0.01 | −0.02 | 0.03 | −0.02 | −0.06 | −0.05
Religion (Hindu = 1) | −0.06 | −0.10* | 0.01 | 0.08 | −0.15*** | −0.12** | −0.06 | −0.05 | 0.01 | −0.07
Caste (SC/ST = 1) | 0.12*** | 0.05 | 0.03 | 0.03 | 0.10** | 0.10** | −0.03 | 0.02 | 0.22*** | 0.16***
Regional characteristics
Urban | −0.12** | −0.12** | −0.05 | −0.02 | −0.04 | −0.09* | −0.04 | −0.01 | −0.09* | −0.14***
Underdeveloped | −0.14*** | 0.12*** | −0.26*** | −0.12** | −0.36*** | −0.13*** | −0.02 | 0.02 | −0.20*** | 0.05
Observations | 2674 | 2674 | 2674 | 2674 | 2674 | 2674 | 2674 | 2674 | 2674 | 2674
***p < 0.01, **p < 0.05, *p < 0.1

Table 3 Vignettes set 2: pain and personal relationships
Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10)
Education categories (ref category: no formal education)
Below primary | −0.05 | −0.09 | 0.03 | 0.15** | 0.07 | 0.05 | −0.12 | −0.05 | −0.08 | −0.09
Primary | −0.17*** | −0.12** | −0.03 | −0.02 | 0.04 | 0.03 | −0.04 | −0.01 | −0.10 | −0.09
Secondary | −0.02 | −0.21*** | 0.09 | 0.20*** | −0.07 | −0.02 | −0.02 | −0.01 | −0.01 | 0.00
High school | −0.01 | −0.11 | 0.03 | 0.07 | 0.07 | 0.03 | −0.08 | −0.10 | −0.01 | −0.01
College and above | −0.19* | −0.25** | 0.16 | 0.23** | 0.07 | −0.01 | −0.13 | −0.07 | −0.00 | 0.00
Individual characteristics
Male | −0.06 | −0.03 | −0.08 | −0.13*** | −0.11** | −0.14*** | 0.15*** | 0.13*** | 0.09* | −0.03
Age groups (ref category: age 18–29.9 years)
30–44.9 | 0.05 | −0.03 | 0.06 | 0.11 | 0.01 | 0.09 | 0.02 | −0.06 | 0.01 | 0.05
45–60 | 0.02 | −0.03 | −0.01 | 0.07 | −0.12* | −0.01 | 0.02 | −0.08 | −0.03 | 0.07
Above 60 | 0.10 | 0.03 | −0.05 | 0.02 | −0.12 | 0.00 | −0.03 | −0.09 | −0.08 | 0.05
Marital status
Currently married | −0.05 | 0.04 | −0.11** | −0.09* | −0.08 | −0.04 | −0.14*** | −0.10* | −0.07 | −0.02
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) | 0.01 | −0.02 | 0.05 | 0.04 | 0.02 | 0.02 | 0.04 | 0.04 | 0.03 | −0.01
Overweight (BMI 25–29.9) | −0.02 | −0.08 | −0.03 | −0.02 | −0.02 | 0.04 | 0.10 | 0.10 | −0.03 | −0.01
Obese (BMI > 30) | 0.10 | −0.14 | 0.05 | 0.09 | 0.04 | 0.01 | −0.19 | −0.08 | −0.21* | −0.18
Household’s expenditure quintiles (ref category: Q3)
Q1 | −0.09 | −0.11* | −0.15** | −0.05 | −0.09 | −0.03 | −0.11* | −0.08 | −0.03 | −0.07
Q2 | 0.07 | 0.05 | −0.08 | −0.05 | −0.12* | −0.08 | 0.05 | 0.09 | 0.06 | 0.12*
Q4 | 0.04 | 0.09 | −0.04 | −0.10 | −0.08 | −0.00 | 0.04 | 0.04 | 0.04 | 0.02
Q5 | −0.05 | 0.00 | −0.13* | −0.16** | 0.07 | −0.06 | 0.06 | 0.08 | 0.02 | 0.05
Religion (Hindu = 1) | −0.02 | 0.13** | −0.08 | 0.00 | 0.07 | 0.01 | −0.08 | −0.06 | −0.12** | −0.13**
Caste (SC/ST = 1) | 0.13*** | 0.08* | −0.04 | −0.09** | 0.15*** | 0.06 | −0.12*** | −0.11** | −0.06 | −0.06
Regional characteristics
Urban | −0.06 | −0.07 | −0.01 | −0.02 | −0.08 | −0.00 | 0.01 | −0.03 | 0.01 | −0.03
Underdeveloped | −0.05 | 0.03 | −0.20*** | 0.01 | −0.25*** | −0.15*** | −0.37*** | −0.19*** | −0.23*** | −0.02
Observations | 2729 | 2729 | 2729 | 2729 | 2729 | 2729 | 2729 | 2729 | 2729 | 2729
***p < 0.01, **p < 0.05, *p < 0.1

Table 4 Vignettes set 3: vision, sleep, and energy
Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10)
Education categories (ref category: no formal education)
Below primary | 0.12 | 0.11 | 0.01 | −0.01 | 0.06 | 0.09 | 0.06 | 0.06 | 0.03 | 0.01
Primary | 0.01 | −0.13** | −0.03 | 0.03 | 0.01 | −0.11* | −0.08 | −0.08 | −0.00 | −0.08
Secondary | 0.08 | 0.00 | −0.04 | −0.09 | 0.18** | 0.09 | 0.07 | 0.08 | 0.01 | 0.00
High school | 0.30*** | 0.14* | −0.06 | −0.16** | 0.12 | −0.00 | −0.11 | −0.08 | 0.03 | −0.15*
College and above | 0.47*** | 0.26*** | 0.05 | −0.14 | 0.17* | 0.03 | 0.24** | 0.15 | 0.18* | −0.03
Individual characteristics
Male | −0.07 | −0.04 | −0.05 | −0.08 | 0.01 | 0.01 | −0.18*** | −0.21*** | −0.13*** | −0.09*
Age groups (ref category: age 18–29.9 years)
30–44.9 | −0.04 | 0.00 | −0.03 | −0.11 | 0.15** | 0.07 | 0.05 | 0.02 | −0.02 | −0.07
45–60 | 0.03 | 0.05 | 0.09 | −0.00 | 0.14** | 0.12* | 0.02 | −0.01 | 0.01 | −0.04
Above 60 | 0.08 | 0.07 | 0.11 | 0.08 | 0.20*** | 0.11 | 0.00 | 0.01 | 0.05 | −0.01
Marital status
Currently married | −0.08 | −0.09* | 0.00 | 0.07 | 0.01 | −0.08 | −0.02 | −0.02 | 0.06 | −0.02
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) | 0.00 | −0.02 | 0.03 | 0.05 | 0.03 | −0.00 | 0.09** | 0.08* | 0.09** | 0.10**
Overweight (BMI 25–29.9) | −0.10 | 0.02 | 0.01 | 0.01 | −0.11 | −0.06 | 0.04 | 0.12* | −0.00 | −0.02
Obese (BMI > 30) | −0.05 | 0.07 | −0.19 | 0.02 | −0.20* | −0.08 | 0.11 | 0.22* | 0.09 | 0.13
Household’s expenditure quintiles (ref category: Q3)
Q1 | 0.00 | −0.05 | −0.06 | −0.04 | 0.02 | −0.05 | 0.05 | −0.00 | 0.03 | −0.04
Q2 | −0.01 | −0.06 | −0.09 | 0.00 | −0.05 | 0.00 | 0.03 | 0.08 | 0.07 | 0.05
Q4 | −0.10 | −0.12** | −0.08 | 0.06 | −0.02 | −0.02 | 0.08 | 0.02 | 0.09 | 0.02
Q5 | −0.08 | −0.04 | −0.06 | 0.05 | 0.05 | 0.07 | 0.07 | −0.02 | 0.07 | 0.10
Religion (Hindu = 1) | −0.04 | −0.02 | 0.01 | −0.03 | −0.02 | 0.06 | −0.02 | 0.05 | −0.04 | 0.06
Caste (SC/ST = 1) | 0.08* | 0.11*** | −0.04 | −0.17*** | −0.08* | 0.01 | −0.02 | −0.06 | −0.20*** | −0.12***
Regional characteristics
Urban | −0.13** | −0.12** | −0.01 | 0.05 | −0.08 | −0.10* | −0.08 | −0.02 | −0.12** | −0.03
Underdeveloped | −0.37*** | −0.09** | −0.25*** | −0.21*** | −0.10** | −0.10** | −0.42*** | −0.23*** | −0.36*** | −0.30***
Observations | 2771 | 2771 | 2771 | 2771 | 2771 | 2771 | 2771 | 2771 | 2771 | 2771
***p < 0.01, **p < 0.05, *p < 0.1

Table 5 Vignettes set 4: cognition and self-care
Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10)
Education categories (ref category: no formal education)
Below primary | −0.12* | −0.09 | −0.12* | −0.13* | 0.06 | 0.03 | −0.02 | 0.07 | 0.01 | −0.03
Primary | −0.11* | −0.11* | −0.09 | −0.05 | 0.08 | 0.02 | 0.02 | 0.04 | −0.12* | −0.15**
Secondary | −0.13* | −0.05 | −0.04 | −0.08 | −0.09 | −0.04 | −0.15** | −0.10 | −0.01 | −0.03
High school | −0.04 | −0.01 | −0.08 | −0.07 | 0.09 | 0.11 | 0.18** | 0.15** | 0.02 | 0.04
College and above | −0.06 | −0.08 | −0.01 | −0.04 | 0.06 | 0.10 | 0.18* | 0.16 | 0.05 | −0.03
Individual characteristics
Male | 0.03 | 0.01 | 0.01 | −0.09* | −0.12** | −0.19*** | −0.16*** | −0.19*** | 0.07 | 0.02
Age groups (ref category: age 18–29.9 years)
30–44.9 | 0.05 | 0.03 | 0.08 | 0.09 | 0.06 | 0.01 | −0.15** | −0.16** | 0.11 | 0.10
45–60 | 0.11 | 0.05 | 0.10 | 0.08 | 0.07 | 0.07 | −0.19*** | −0.18** | 0.13* | 0.04
Above 60 | 0.24*** | 0.19*** | 0.18** | 0.21*** | 0.12* | 0.15** | −0.14* | −0.12 | 0.11 | 0.14**
Marital status
Currently married | −0.06 | −0.03 | −0.01 | 0.05 | 0.08 | 0.09* | 0.05 | 0.09* | −0.10* | −0.09
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) | −0.00 | 0.07 | 0.05 | 0.04 | 0.04 | 0.00 | 0.01 | −0.04 | 0.10** | 0.05
Overweight (BMI 25–29.9) | −0.03 | −0.05 | −0.02 | −0.06 | −0.07 | −0.03 | 0.06 | 0.02 | 0.01 | 0.04
Obese (BMI > 30) | −0.13 | −0.07 | −0.13 | 0.01 | 0.06 | −0.11 | −0.09 | −0.02 | 0.01 | 0.09
Household’s expenditure quintiles (ref category: Q3)
Q1 | 0.02 | −0.01 | −0.09 | −0.15** | 0.03 | −0.06 | 0.06 | 0.08 | −0.04 | −0.03
Q2 | 0.04 | 0.05 | −0.01 | −0.09 | −0.01 | −0.11* | 0.09 | 0.05 | −0.06 | −0.04
Q4 | 0.13** | 0.09 | −0.04 | −0.11* | −0.05 | −0.05 | 0.03 | 0.09 | 0.07 | 0.03
Q5 | 0.13* | 0.10 | −0.01 | 0.00 | 0.13* | 0.09 | 0.11* | 0.15** | −0.02 | −0.05
Religion (Hindu = 1) | −0.07 | −0.04 | −0.00 | −0.06 | −0.00 | 0.05 | 0.02 | 0.03 | 0.03 | −0.01
Caste (SC/ST = 1) | 0.19*** | 0.17*** | 0.01 | 0.02 | 0.12*** | 0.16*** | 0.13*** | 0.13*** | −0.12*** | −0.06
Regional characteristics
Urban | −0.02 | −0.06 | −0.11** | −0.11** | 0.04 | −0.01 | 0.07 | −0.02 | −0.01 | 0.03
Underdeveloped | −0.52*** | −0.41*** | −0.25*** | −0.08* | −0.38*** | −0.19*** | −0.28*** | −0.23*** | −0.23*** | −0.09**
Observations | 2699 | 2699 | 2699 | 2699 | 2699 | 2699 | 2699 | 2699 | 2699 | 2699
***p < 0.01, **p < 0.05, *p < 0.1

Table 6 Dependent variable: self-reported health (health today)
Variables | (1) | (2) | (3) | (4)
Education categories (ref category: no formal education)
Below primary | −0.10*** | −0.09** | 0.07* | −0.07*
Primary | −0.10*** | −0.08** | −0.04 | −0.03
Secondary | −0.25*** | −0.23*** | −0.16*** | −0.14***
High school | −0.38*** | −0.36*** | −0.25*** | −0.23***
College and above | −0.63*** | −0.60*** | −0.46*** | −0.44***
Individual characteristics
Male | −0.12*** | −0.13*** | −0.09*** | −0.08***
Age groups (ref category: age 18–29.9 years)
30–44.9 | 0.51*** | 0.53*** | 0.50*** | 0.47***
45–60 | 0.82*** | 0.85*** | 0.76*** | 0.70***
Above 60 | 1.18*** | 1.19*** | 1.04*** | 0.91***
Marital status
Currently married | −0.06** | −0.05* | −0.04 | −0.03
Household’s expenditure quintiles (ref category: Q3)
Q1 | −0.02 | −0.04 | −0.02 | −0.03
Q2 | −0.04 | −0.05 | −0.05 | −0.05
Q4 | −0.01 | 0.00 | 0.02 | 0.03
Q5 | −0.09** | −0.07** | −0.09** | −0.08**
Religion (Hindu = 1) | −0.19*** | −0.19*** | −0.18*** | −0.17***
Caste (SC/ST = 1) | −0.05** | −0.06*** | −0.08*** | −0.11***
Regional characteristics
Urban | −0.11*** | −0.10*** | −0.10*** | −0.08***
Underdeveloped | −0.27*** | −0.27*** | −0.26*** | −0.32***
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) | | 0.20*** | 0.17*** | 0.15***
Overweight (BMI 25–29.9) | | 0.02 | 0.00 | −0.01
Obese (BMI > 30) | | 0.08 | 0.01 | 0.00
Performance tests
Rapid walk | | | −0.32*** | −0.08
Cognitive score 1 | | | −0.02 | −0.02
Cognitive score 2 | | | −0.06*** | −0.06***
Cognitive score 3 | | | 0.00 | 0.00
Chronic illness | | | 0.25*** | 0.20***
Lung function | | | 0.0 | 0.00
Blood pressure systolic | | | 0.00 | 0.00
Blood pressure diastolic | | | 0.00 | 0.00
Pulse rate | | | 0.00*** | 0.00***
Interviewer assessments
Hearing | | | | 0.35***
Vision | | | | 0.17***
Walking | | | | 0.36***
Shortness of breath | | | | 0.28***
Overall health problem | | | | 0.33***
Observations | 10,873 | 10,873 | 10,873 | 10,873
***p < 0.01, **p < 0.05, *p < 0.1
Table 7 Dependent variables: self-reported functioning measures across various domains of mobility

Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12)
Education categories (ref category: no formal education)
Below primary | −0.05 | −0.07* | −0.02 | −0.06 | −0.02 | −0.02 | −0.00 | −0.02 | −0.06 | 0.01 | −0.08** | −0.04
Primary | −0.06* | −0.05 | −0.04 | −0.09*** | −0.03 | −0.09*** | −0.08** | −0.07* | −0.10** | −0.08** | −0.10*** | −0.21***
Secondary | −0.23*** | −0.30*** | −0.26*** | −0.26*** | −0.22*** | −0.27*** | −0.28*** | −0.22*** | −0.18*** | −0.26*** | −0.25*** | −0.25***
High school | −0.23*** | −0.30*** | −0.20*** | −0.37*** | −0.28*** | −0.27*** | −0.24*** | −0.24*** | −0.17*** | −0.21*** | −0.38*** | −0.34***
College and above | −0.50*** | −0.52*** | −0.51*** | −0.60*** | −0.60*** | −0.51*** | −0.55*** | −0.49*** | −0.42*** | −0.62*** | −0.56*** | −0.45***
Individual characteristics
Male | −0.27*** | −0.14*** | −0.36*** | −0.44*** | −0.44*** | −0.38*** | −0.42*** | −0.28*** | −0.09*** | −0.15*** | −0.18*** | −0.18***
Age groups (ref category: age 18–29.9 years)
30–44.9 | 0.48*** | 0.48*** | 0.49*** | 0.35*** | 0.56*** | 0.52*** | 0.44*** | 0.51*** | 0.35*** | 0.42*** | 0.21*** | 0.44***
45–60 | 0.83*** | 0.84*** | 0.86*** | 0.70*** | 1.00*** | 0.98*** | 0.80*** | 1.03*** | 0.76*** | 0.77*** | 0.44*** | 0.84***
Above 60 | 1.27*** | 1.31*** | 1.18*** | 1.09*** | 1.40*** | 1.41*** | 1.23*** | 1.50*** | 1.19*** | 1.13*** | 0.76*** | 1.15***
Marital status
Currently married | −0.03 | −0.05* | 0.02 | 0.02 | −0.01 | 0.00 | 0.02 | −0.01 | −0.02 | −0.09*** | −0.14*** | 0.02
Household’s expenditure quintiles (ref category: Q3)
Q1 | 0.08** | −0.03 | −0.01 | 0.02 | 0.03 | −0.01 | 0.03 | 0.04 | 0.12*** | 0.07** | −0.05 | 0.06
Q2 | 0.06 | −0.02 | 0.06* | 0.07* | 0.07** | 0.05 | 0.04 | 0.04 | 0.13*** | 0.06* | −0.04 | 0.08**
Q4 | 0.04 | −0.01 | −0.03 | 0.06* | −0.03 | −0.01 | −0.02 | −0.01 | −0.07 | 0.01 | −0.02 | 0.04
Q5 | 0.00 | 0.03 | −0.03 | −0.02 | 0.02 | −0.05 | −0.03 | −0.03 | −0.10** | −0.08** | 0.00 | −0.06
Religion (Hindu = 1) | −0.12*** | −0.13*** | −0.11*** | −0.14*** | −0.12*** | −0.16*** | −0.14*** | −0.11*** | −0.08** | −0.12*** | −0.08** | −0.00
Caste (SC/ST = 1) | 0.14*** | −0.06** | 0.07*** | −0.00 | 0.11*** | 0.03 | 0.14*** | 0.08*** | 0.09*** | 0.18*** | −0.10*** | 0.13***
Regional characteristics
Urban | −0.11*** | −0.17*** | −0.09*** | −0.03 | −0.06** | −0.08*** | −0.16*** | −0.13*** | 0.03 | −0.05* | −0.06** | −0.05
Underdeveloped | −0.18*** | −0.04* | −0.22*** | −0.33*** | −0.22*** | −0.06** | −0.34*** | −0.20*** | −0.05 | −0.08*** | −0.30*** | −0.30***
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) | 0.08*** | 0.12*** | 0.06** | 0.07*** | 0.07*** | 0.11*** | 0.11*** | 0.03 | 0.06** | 0.09*** | 0.10*** | 0.04
Overweight (BMI 25–29.9) | 0.12*** | 0.06* | 0.12*** | 0.10** | 0.15*** | 0.19*** | 0.05 | 0.17*** | 0.07 | 0.04 | 0.09** | 0.09**
Obese (BMI > 30) | 0.27*** | 0.27*** | 0.20*** | 0.30*** | 0.35*** | 0.27*** | 0.19*** | 0.32*** | 0.10 | 0.14** | 0.18*** | 0.17**
Walk difficulty
Timed walk | −0.34 | 0.22 | −0.24 | −0.03 | −0.35 | −0.14 | 0.41 | 0.24 | −0.69** | −0.15 | 0.01 | −0.43
Rapid walk | −0.36 | −0.46* | −0.12 | −0.59** | −0.29 | −0.43* | −0.88*** | −0.78*** | 0.17 | −0.36 | −0.33 | 0.19
Interviewer assessment | −0.73*** | −0.34*** | −0.49*** | −0.37*** | −0.50*** | −0.54*** | −0.36*** | −0.46*** | −0.29*** | −0.39*** | −0.37*** | −0.52***
Observations | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873
The dependent variables in all the specifications take values 1–5 measuring the self-reported difficulty level (1 = no difficulty; 5 = extreme difficulty) faced by the respondent in the specific activity describing some form of mobility
***p < 0.01, **p < 0.05, *p < 0.1

Table 8 Dependent variables: self-reported functioning measures across various domains of daily activities

Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11)
Education categories (ref category: no formal education)
Below primary | −0.05 | −0.04 | −0.06 | −0.09** | −0.08** | −0.12** | −0.06 | 0.01 | −0.07 | −0.05 | −0.09**
Primary | −0.09*** | −0.16*** | −0.16*** | −0.12*** | −0.09** | −0.12*** | −0.19*** | −0.17*** | −0.18*** | −0.14*** | −0.19***
Secondary | −0.27*** | −0.22*** | −0.16** | −0.31*** | −0.23*** | −0.21*** | −0.09* | −0.22*** | −0.23*** | −0.31*** | −0.40***
High school | −0.29*** | −0.28*** | −0.33*** | −0.25*** | −0.26*** | −0.22*** | −0.18*** | −0.33*** | −0.30*** | −0.43*** | −0.45***
College and above | −0.58*** | −0.50*** | −0.54*** | −0.57*** | −0.44*** | −0.49*** | −0.39*** | −0.61*** | −0.61*** | −0.59*** | −0.69***
Individual characteristics
Male | −0.39*** | −0.07* | −0.03 | −0.29*** | −0.20*** | −0.18*** | −0.16*** | −0.25*** | −0.19*** | −0.30*** | −0.16***
Age groups (ref category: age 18–29.9 years)
30–44.9 | 0.31*** | 0.22*** | 0.38*** | 0.32*** | 0.39*** | 0.35*** | 0.34*** | 0.50*** | 0.43*** | 0.31*** | 0.29***
45–60 | 0.70*** | 0.58*** | 0.68*** | 0.73*** | 0.82*** | 0.73*** | 0.66*** | 0.89*** | 0.76*** | 0.65*** | 0.70***
Above 60 | 1.15*** | 0.95*** | 1.00*** | 1.06*** | 1.23*** | 1.11*** | 1.04*** | 1.24*** | 1.14*** | 1.09*** | 1.16***
Marital status
Currently married | 0.03 | −0.06 | −0.07* | −0.03 | −0.06** | −0.02 | −0.07** | −0.03 | −0.01 | −0.05* | −0.05
Household’s expenditure quintiles (ref category: Q3)
Q1 | 0.03 | 0.00 | 0.03 | 0.01 | −0.04 | −0.00 | −0.01 | 0.03 | 0.06 | −0.04 | −0.09***
Q2 | 0.02 | 0.01 | −0.04 | −0.01 | −0.02 | 0.03 | −0.02 | 0.06 | 0.00 | 0.03 | −0.02
Q4 | 0.02 | −0.02 | −0.02 | −0.02 | 0.06* | −0.01 | −0.01 | −0.00 | −0.03 | 0.00 | −0.02
Q5 | 0.02 | −0.03 | −0.05 | −0.08** | 0.02 | 0.02 | −0.05 | −0.04 | −0.12*** | −0.05 | −0.06*
Religion (Hindu = 1) | −0.18*** | −0.14*** | −0.15*** | −0.17*** | −0.16*** | −0.10*** | −0.01 | −0.07** | −0.01 | −0.09*** | −0.11***
Caste (SC/ST = 1) | 0.06*** | 0.02 | −0.04 | 0.06** | 0.03 | −0.01 | 0.03 | 0.07*** | 0.07** | 0.08*** | −0.01
Regional characteristics
Urban | −0.07*** | 0.00 | 0.00 | −0.06** | −0.20*** | −0.06* | 0.03 | −0.09*** | −0.10*** | −0.11*** | −0.01
Underdeveloped | −0.27*** | −0.17*** | −0.21*** | −0.28*** | −0.16*** | −0.23*** | 0.25*** | −0.30*** | 0.02 | 0.15*** | 0.19***
BMI categories (measured) (ref category: normal BMI 18.5–24.9)
Underweight (BMI < 18.5) | 0.11*** | 0.15*** | 0.10*** | 0.08*** | 0.13*** | 0.11*** | 0.10*** | 0.07*** | 0.07** | 0.10*** | 0.12***
Overweight (BMI 25–29.9) | 0.14*** | 0.08 | 0.05 | 0.02 | 0.08** | 0.10** | −0.03 | 0.15*** | 0.11** | 0.10** | 0.13***
Obese (BMI > 30) | 0.36*** | 0.20** | 0.22** | 0.21*** | 0.26*** | 0.24*** | 0.06 | 0.33*** | 0.27*** | 0.23*** | 0.28***
Walk difficulty
Timed walk | 0.08 | −0.48 | −1.00*** | 0.13 | 0.39 | −0.55* | −0.31 | −0.22 | −0.55* | −0.26 | −0.22
Rapid walk | −0.71*** | −0.14 | 0.35 | −0.65** | −0.92*** | −0.17 | −0.04 | −0.13 | −0.04 | −0.46* | −0.44*
Interviewer assessment | −0.55*** | −0.51*** | −0.63*** | −0.50*** | −0.39*** | −0.56*** | −0.36*** | −0.64*** | −0.59*** | −0.43*** | −0.55***
Observations | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873 | 10,873
The dependent variables in all the specifications take values 1–5 measuring the self-reported difficulty level (1 = no difficulty; 5 = extreme difficulty) faced by the respondent in the specific activity describing some form of daily activities
***p < 0.01, **p < 0.05, *p < 0.1

Table 9 Dependent variable: self-reported cognitive difficulty

Self-reported cognition
Variables | (1) Memory | (2) Concentration
Education categories (ref category: no formal education)
Below primary | −0.12*** | −0.11***
Primary | −0.09** | −0.10***
Secondary | −0.30*** | −0.33***
High school | −0.38*** | −0.42***
College and above | −0.52*** | −0.69***
Individual characteristics
Male | −0.18*** | −0.07**
Age groups (ref category: age 18–29.9 years)
30–44.9 | 0.53*** | 0.48***
45–60 | 0.88*** | 0.81***
Above 60 | 1.22*** | 1.23***
Marital status
Currently married | −0.08*** | −0.08***
Household’s expenditure quintiles (ref category: Q3)
Q1 | −0.00 | −0.02
Q2 | −0.01 | −0.01
Q4 | 0.03 | −0.00
Q5 | −0.01 | −0.03
Religion (Hindu = 1) | −0.15*** | −0.05
Caste (SC/ST = 1) | 0.01 | −0.01
Regional characteristics
Urban | −0.23*** | −0.13***
Underdeveloped | −0.10*** | −0.10***
Cognitive tests
Cognitive score 1 | −0.04*** | −0.04***
Cognitive score 2 | −0.08*** | −0.07***
Words recalled | −0.03*** | −0.02***
The dependent variables in both specifications take values 1–5 measuring the self-reported difficulty level (1 = no difficulty; 5 = extreme difficulty) faced by the respondent in remembering things and concentrating. Objective measures include a test of words recalled after delay, a digital recall test, and verbal fluency
***p < 0.01, **p < 0.05

While estimation of the self-reported measure for memory would suggest that males fare better, we find the contrary result when we estimate the objective memory test for words recalled (Table 10). As expected, individuals from underdeveloped states score lower on both cognitive tests. The findings reveal systematic underreporting of worse health among males, more highly educated groups, and respondents in urban areas and underdeveloped states, reconfirming our earlier findings. Interestingly, the coefficient on the underdeveloped dummy for the interviewer-assessed health problem reveals that individuals from underdeveloped states were more likely to have health problems (results not included).

The distribution of μ for different health domains (Figs. 6, 7, 8, and 9) reveals substantial reporting heterogeneity between individuals. In Table 11, we examine how much of this variation in μ can be explained by the observable characteristics of the respondents. The results match what we found earlier.
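As an illustration of how an individual reporting effect such as μ could be recovered from repeated vignette ratings, the sketch below assumes an additive model, rating = vignette effect + respondent effect + noise, with every respondent rating the same vignettes. The function name, the data, and the additive form are assumptions made for this example and are not taken from the paper's estimating equations.

```python
from statistics import mean


def reporting_effects(ratings):
    """Respondent effects under an assumed additive two-way model.

    ratings maps a respondent id to a list of ratings, one per vignette,
    with vignettes in the same order for everyone (balanced design).
    """
    n_vignettes = len(next(iter(ratings.values())))
    # Each vignette describes a fixed health state, so its average rating
    # across respondents serves as the common benchmark.
    vignette_means = [
        mean(r[j] for r in ratings.values()) for j in range(n_vignettes)
    ]
    # A respondent's effect is the average deviation from those benchmarks.
    return {
        rid: mean(r[j] - vignette_means[j] for j in range(n_vignettes))
        for rid, r in ratings.items()
    }


# Hypothetical ratings on a 1-5 difficulty scale for five vignettes.
ratings = {
    "r1": [2, 3, 4, 5, 5],   # tends to rate the same states as more severe
    "r2": [1, 2, 3, 4, 5],
    "r3": [1, 1, 2, 3, 4],   # tends to rate the same states as less severe
}
mu = reporting_effects(ratings)
for rid, value in mu.items():
    print(rid, round(value, 2))
```

Because all respondents face identical vignettes, differences in these averages reflect how the response scale is used rather than differences in health. The effects sum to zero by construction.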
Table 10 Dependent variable: objective memory tests

Variables | Objective memory tests
Education categories (ref category: no formal education)
Below primary | 0.24***
Primary | 0.39***
Secondary | 0.57***
High school | 0.80***
College and above | 0.98***
Individual characteristics
Male | −0.07***
Age groups (ref category: age 18–29.9 years)
30–44.9 | −0.25***
45–60 | −0.56***
Above 60 | −0.85***
Marital status
Currently married | 0.08***
Household’s expenditure quintiles (ref category: Q3)
Q1 | −0.12***
Q2 | −0.05
Q4 | 0.04
Q5 | 0.12***
Religion (Hindu = 1) | 0.04
Caste (SC/ST = 1) | −0.01
Regional characteristics
Urban | 0.16***
Underdeveloped | 0.11***
The dependent variables in all the specifications are objective measures of memory and cognition, including a test of words recalled after delay, a digital recall test, and verbal fluency
***p < 0.01

We find that males are more likely to rank their health state favorably, and individuals above 60 years were likely to overstate bad health. Interestingly, the quintiles both above and below the middle expenditure group were likely to understate ill health. Individuals from underdeveloped states were found to consistently underestimate health problems. This has important implications for the inter-personal comparability of self-reported data, even within a geographical region that may not be homogeneous in terms of development.

To see how much of the reporting heterogeneity can be attributed to observable characteristics, we examine the R-square of the estimated Eq. (4) for different health domains. We find that the R-square for estimations (1) to (4) explains just 3% (mobility and affect) to 7% (cognition and self-care) of the variation in reporting behavior.
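The meaning of such a low R-square can be made concrete with a small calculation. The sketch below assumes a single categorical regressor, for which least-squares fitted values are simply the cell means; the values of `mu` and the `underdeveloped` indicator are hypothetical and are not taken from the survey.

```python
from statistics import mean


def r_squared_by_group(mu, group):
    """Share of the variance in mu explained by one categorical variable.

    Regressing on a full set of dummies for a categorical variable gives
    fitted values equal to the cell means, so no matrix algebra is needed.
    """
    grand = mean(mu)
    cells = {}
    for value, g in zip(mu, group):
        cells.setdefault(g, []).append(value)
    fitted = {g: mean(values) for g, values in cells.items()}
    ss_total = sum((value - grand) ** 2 for value in mu)
    ss_resid = sum((value - fitted[g]) ** 2 for value, g in zip(mu, group))
    return 1 - ss_resid / ss_total


# Hypothetical individual reporting effects and a state-level indicator.
mu = [-1.0, 1.0, -0.5, 1.5, -1.5, 0.5, -1.0, 1.0]
underdeveloped = [0, 0, 0, 0, 1, 1, 1, 1]

r2 = r_squared_by_group(mu, underdeveloped)
print(f"share of variation explained: {r2:.3f}")
```

In this made-up sample the two groups differ in their average reporting effect, yet most of the spread lies within each group, so the indicator accounts for only a small share of the total variation.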
This is alarming, given that we can only control and adjust for the observables in the regression, which leaves much of the reporting heterogeneity at the individual level unaccounted for. It also potentially limits the use of the anchoring vignette approach to make SAH responses more comparable if much of the reporting heterogeneity between individuals is due to unobservable factors that we do not control for in a regression. We discuss the implications of our results in the next section.

Fig. 6 Distribution of individual reporting heterogeneity from vignette set A. Note: health domains in set A include mobility and affect

Fig. 7 Distribution of individual reporting heterogeneity from vignette set B. Note: health domains in set B include pain and personal relationships

Fig. 8 Distribution of individual reporting heterogeneity from vignette set C. Note: health domains in set C include vision, sleep, and energy

6 Conclusions

One of the key challenges in the analysis and interpretation of health survey data is improving the interpersonal comparability of subjective indicators that come with systematic measurement error, a consequence of differences in the ways that individuals understand and use the available responses for a given question. In this paper, we examine the pattern of reporting differences in SAH from a nationally representative survey in India and find evidence that measurement error in SAH varies systematically with demographic characteristics, such as age, gender, and education, and with community characteristics, such as sector and level of development in the state. This has several important implications.

First, one should be careful in using self-reported health data for inter-personal comparison of health status.
This becomes all the more relevant for policy formulation in a developing country setting like India, where objective data on health are scarce and one has to rely on self-reported health measures to assess the health situation of the country.

Fig. 9 Distribution of individual reporting heterogeneity from vignette set D. Note: health domains in set D include cognition and self-care

Table 11 Estimations of two-stage regressions using individual fixed effects

Variables | (1) Vignette set A | (2) Vignette set B | (3) Vignette set C | (4) Vignette set D
Education categories
Below primary | 0.02 | −0.03 | 0.04 | −0.11
Primary | 0.00 | −0.05 | −0.05 | −0.10*
Secondary | −0.01 | −0.02 | 0.03 | −0.11*
High school | 0.00 | −0.01 | −0.01 | −0.03
College and above | 0.01 | −0.02 | 0.11** | −0.05
Individual characteristics
Male | −0.08*** | −0.01 | −0.07*** | 0.03
Age groups
30–44.9 | 0.05 | 0.03 | 0.00 | 0.04
45–60 | 0.02 | −0.01 | 0.04 | 0.10
Above 60 | 0.11*** | −0.01 | 0.07* | 0.21***
Marital status
Currently married | −0.01 | −0.06** | −0.02 | −0.05
BMI categories (measured)
Underweight (BMI < 18.5) | −0.01 | 0.02 | 0.04* | 0.00
Overweight (BMI 25–29.9) | −0.02 | −0.01 | −0.01 | −0.03
Obese (BMI > 30) | 0.05 | −0.05 | 0.01 | −0.12
Household’s expenditure quintiles
Q1 | −0.16*** | −0.07** | −0.01 | 0.02
Q2 | −0.08** | 0.01 | 0.01 | 0.03
Q4 | −0.11*** | 0.01 | −0.00 | 0.11*
Q5 | −0.05 | −0.01 | 0.02 | 0.11*
Religion (Hindu = 1) | −0.04 | −0.01 | −0.00 | −0.06
Caste (SC/ST = 1) | 0.08*** | −0.01 | −0.05** | 0.17***
Regional characteristics
Urban | −0.06** | −0.02 | −0.06** | −0.02
Underdeveloped | −0.09*** | −0.12*** | −0.22*** | −0.47***
Constant | −0.29*** | −0.16*** | −0.44*** | 2.45***
Observations | 2673 | 2728 | 2770 | 2698
R-square | 0.03 | 0.02 | 0.04 | 0.07
***p < 0.01, **p < 0.05, *p < 0.1
This has consequences for the evaluation of health policies that are based entirely on self-reported data on morbidity, utilization of and expenditure on health care, perceived well-being, and self-rated rankings of health service delivery used in citizen and community report cards. Hence, causal inference about a program based on self-reported health measures needs to be re-examined in the light of this problem. Further, one has to recognize that this reporting heterogeneity cannot be dealt with simply by controlling for the covariates in a typical regression framework.

The findings on systematic reporting behavior by social disadvantage are mixed in our study. While there are non-linearities in systematic reporting bias by education and expenditure quintile of the respondents, we find that individuals from underdeveloped states underreport the presence of illness or health deficits across all specifications. We additionally verify that the “response consistency” assumption holds in these data. Further, by controlling for individual fixed effects, we purge the idiosyncratic unobservable features of individual reporting behavior and confirm the earlier patterns of systematic bias by gender, age, and development level. We also point out that the observable characteristics of the respondents explain only a small portion of this remaining heterogeneity in SAH. Hence, we argue that, given the dearth of objective health information, which is often costly to collect in a developing country setting, including vignette profiles in questionnaires provides an arguably low-cost means of identifying the systematic bias in responses, thus mitigating this problem.

Endnotes

For India, only pilot data from Andhra Pradesh were analyzed.

It has been argued that individuals may use different thresholds for rating vignette questions as opposed to rating self-reported health questions.
In order to see whether reporting bias varies by true health, we include the measured body mass index categories (viz. underweight, normal, overweight, and obese).

Implementation of SAGE Wave 1 was from 2007 to 2010 in six countries over different regions of the world (China, Ghana, India, Mexico, Russian Federation, and South Africa).

The sample was stratified by state and locality (urban/rural), resulting in 12 strata, and is nationally representative. Of the 28 states, 19 were included in the design, which covered 96% of the population. The survey implemented a multistage cluster sampling design resulting in nationally representative cohorts.

A composite index of the level of development was computed by giving equal weight to the four indicators. The states were ranked in decreasing order of development (Maharashtra > Karnataka > West Bengal > Assam > Rajasthan > Uttar Pradesh) based on the composite index of infant mortality rate, female literacy rate, percentage of safe deliveries, and per capita income.

Around 500 observations do not have scores or were not measured on some performance tests, i.e., less than 5% of the sample had missing information on X’s; however, they were not dropped from the analysis.

The only exception is in the health domain of pain and discomfort, where the male dummy changes sign and is actually positive and significant in 3 estimations (Table 3).

The inclusion of the interaction terms of the covariates also does not seem to improve the R-square.

For example, Gilligan et al. (2009) use self-perceived well-being as an outcome of interest in examining the causal impact of the PSNP food security program in Ethiopia.

Acknowledgements
I would like to thank the anonymous referee and the editor for the useful remarks. Responsible editor: David Lam

Competing interests
The IZA Journal of Development and Migration is committed to the IZA Guiding Principles of Research Integrity. The author declares that she has observed these principles.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 4 May 2017 Accepted: 16 January 2018

References
Antman F, McKenzie D. Earnings mobility and measurement error: a pseudo-panel approach. Econ Dev Cult Chang. 2007;56(1):125.
Bago d’Uva T, Lindeboom M, O’Donnell O, Van Doorslaer E. Slipping anchor? Testing the vignettes approach to identification and correction of reporting heterogeneity. J Hum Resour. 2011;46(4):875–906.
Bago d’Uva T, Van Doorslaer E, Lindeboom M, O’Donnell O. Does reporting heterogeneity bias the measurement of health disparities? Health Econ. 2008;17:351–75.
Banerjee A, Deaton A, Duflo E. Health, health care, and economic development: wealth, health, and health services in rural Rajasthan. Am Econ Rev. 2004;94(2):326.
Bound J. Self reported versus objective measures of health in retirement models. J Hum Resour. 1991;26(1):107–37.
Escobal J, Laszlo S. Measurement error in access to markets. Oxf Bull Econ Stat. 2008;70(2):209–43.
Gilligan DO, Hoddinott J, Kumar NR, Taffesse AS. An impact evaluation of Ethiopia’s productive safety nets program. Washington, DC: International Food Policy Research Institute; 2009.
Kapteyn A, Smith JP, van Soest A, Vonkova H. Anchoring vignettes and response consistency. Santa Monica: RAND Corporation; 2011. https://www.rand.org/pubs/working_papers/WR840.html
King GA, Murray CJL, Salomon JA, Tandon A. Enhancing the validity and cross-cultural comparability of measurement in survey research. American Political Science Review. 2004;98(1):191–207.
Lindeboom M, van Doorslaer E. Cut-point shift and index shift in self-reported health. J Health Econ. 2004;23(6):1083–99.
Murray CJL, Tandon A, Salomon J, Mathers CD, Sadana R. Cross-population comparability of evidence for health policy. Global Programme on Evidence for Health Policy discussion paper. Geneva: World Health Organization; 2002.
Rohrer JE. Use of published self-rated health-impact studies in community health needs assessment. Journal of Public Health Management and Practice. 2009;15(4):363–6.
Schultz TP, Strauss J. Handbook of development economics (vol. 4). North Holland: Elsevier; 2008.
Sen A. Positional objectivity. Philos Public Aff. 1993;22(2):126–45.
Sen A. Health: perception versus observation: self reported morbidity has severe limitations and can be extremely misleading. Br Med J. 2002;324(7342):860.
Strauss J, Thomas D. Measurement and mismeasurement of social indicators. Am Econ Rev. 1996;86(2):30–4.
Van Doorslaer E, Jones AM. Inequalities in self-reported health: validation of a new approach to measurement. J Health Econ. 2002:61–87.
Van Soest A, Delaney L, Harmon C, Kapteyn A, Smith J. Validating the use of anchoring vignettes for the correction of response scale differences in subjective questions. Journal of the Royal Statistical Society Series A. 2011;174(3):575–95.

Journal

IZA Journal of Migration, Springer Journals

Published: Jun 28, 2018
