Mortality Projections for Small Populations: An Application to the Maltese Elderly
Menzietti, Massimiliano;Morabito, Maria Francesca;Stranges, Manuela
2019-03-29 00:00:00
Risks, Article

Massimiliano Menzietti *,†, Maria Francesca Morabito † and Manuela Stranges †

Department of Economics, Statistics and Finance “Giovanni Anania”, University of Calabria, Ponte Pietro Bucci, 87036 Arcavacata di Rende (CS), Italy; mariafr.morabito@gmail.com (M.F.M.); manuela.stranges@unical.it (M.S.)
* Correspondence: massimiliano.menzietti@unical.it; Tel.: +39-(0)984-492250
† These authors contributed equally to this work.

Received: 31 December 2018; Accepted: 15 March 2019; Published: 29 March 2019

Abstract: In small populations, mortality rates are characterized by great volatility, and the datasets are often available for only a few years and suffer from missing data. Therefore, standard mortality models may produce highly uncertain and biologically improbable projections. In this paper, we deal with mortality projections for the population of Malta, a small country with fewer than 500,000 inhabitants, whose data on exposures and observed deaths suffer from all the typical problems of small populations. We concentrate our analysis on older adult mortality. Starting from some recent suggestions in the literature, we assume that the mortality of a small population can be modeled starting from the mortality of a bigger one (the reference population) adding a spread. The first part of the paper is dedicated to the choice of the reference population; then we test alternative mortality models. Finally, we verify the capacity of the proposed approach to reduce the volatility of the mortality projections. The results obtained show that the model is able to significantly reduce the uncertainty of projected mortality rates and to ensure their coherent and biologically reasonable evolution.

Keywords: stochastic mortality model; small population; elderly mortality; Maltese mortality

1.
Introduction

Improving the accuracy of mortality projections is of great importance in actuarial science: in pricing, reserving and risk management of annuity business as well as in pensions and health care plans. In such activities, actuaries are often faced with the problem of modeling the mortality of small populations: e.g., small countries, a specific region of a country, an annuity portfolio or the participants of a pension or a health care plan. In small populations, mortality rates are more affected by random fluctuations than in larger ones. Furthermore, the empirical records from small populations might only be available for relatively short periods and some data could be missing. These features make it difficult to identify the underlying trends. For these reasons, a model usually adopted for large populations could be inadequate for smaller ones. Among others, Booth et al. (2006) have reported the limitations of the Lee–Carter model when applied to small populations. Moreover, because data may only be available for short periods, mortality projections are rather uncertain. Forecasts are very sensitive to the fitting period and simple extrapolation of historic trends could produce “implausible projections and unrealistic age-profiles” (Jarner and Kryger (2011)).

Malta, with its small population, is an exemplary case. Its mortality rates show great variability and irregular patterns, and its data have more missing records than those of other countries; therefore, standard mortality models may not be applicable and could lead to unreliable results.

Risks 2019, 7, 35; doi:10.3390/risks7020035 www.mdpi.com/journal/risks

Besides small populations, the focus of this work is on older adult mortality. This represents an issue faced by many researchers in the field (e.g., Bongaarts (2005)). 
The Lee–Carter methodology tends to systematically under-predict the gains in old-age mortality, because old-age improvement rates have recently risen gradually over time. Consistently with the Lee–Carter hypothesis of improvement rates that are invariant over time, one solution may be to fit the model to data from shorter periods (see, among others, Booth et al. (2006)). As noted by Jarner and Kryger (2011), this approach mitigates the prediction problem, but, because it restricts the observed period, its applicability seems to be limited. For these reasons, we have focused our study only on this segment of the population, considering that infant and child mortality have quite a different nature from adult mortality. Furthermore, older adults are also the age group of greatest interest in actuarial applications, such as pricing, reserving and risk management for life annuity portfolios or pension funds.

The literature on mortality modeling is very extensive but is mostly concerned with regular data. However, mortality in small populations has also been studied with various methodologies. One way to deal with these problems could be the replication of the original data, which may lead to loss of specific information, or data smoothing using graduation methods (see Benjamin and Pollard (1993) and Bravo and Malta (2010)). An example of the modeling of the mortality of small populations is the SAINT model (Jarner and Kryger (2011)). It is based on observations of the Danish population, in which mortality rates vary considerably over time and for different ages, violating the Lee–Carter assumptions. The difficulty in obtaining plausible predictions with the classical models applied to small populations is related to the low number of exposures, to the high variability and to the high sensitivity with respect to the period under consideration. 
It has been widely demonstrated that socially and economically similar populations can be modeled jointly, which is very useful for small populations because it overcomes the disadvantages of limited data. This is why, when working on small demographic data, the most common idea is to borrow information from a larger and similar population. A large population can be used to improve the forecasts of the small one:
- “by mixing appropriately the mortality data obtained from other populations” (e.g., Ahcan et al. (2014));
- by creating a two-population mortality model, which jointly considers the mortality of a small population in relation to a larger one.

Regarding multi-population mortality models, the most important contributions on this topic are:
- Li and Lee (2005), who applied the Lee–Carter model, with the introduction of common factors, to a given group of populations, in order to predict the mortality evolution of each single population;
- Cairns et al. (2011a), who introduced a Bayesian framework to jointly model two populations, referring to one of them as a sub-population of the other;
- Dowd et al. (2011), who proposed the gravity model for two populations in order to obtain coherent mortality forecasts;
- Jarner and Kryger (2011), who proposed a model for Danish mortality (the Spread Adjusted InterNational Trend (SAINT) model) combining the deterministic mortality evolution of a basket of populations with the stochastic evolution of the spread;
- D'Amato et al. (2014), who extended the Lee–Carter model in order to take into account the existence of dependence in mortality data across multiple populations;
- Villegas and Haberman (2014), who applied a relative modeling approach where the death rates of a subpopulation are modeled in relation to the death rates of a reference population; they considered different multiple population extensions of the Lee–Carter model and applied their approach in order to study and forecast socioeconomic mortality differentials across deprivation subgroups in England;
- Kleinov (2015), who developed common age effect variants of the Lee–Carter model with p age and period factors for modeling the mortality of multiple populations;
- Li et al. (2015), who generalized a single-population mortality model in different possible ways, in order to fit two or more populations and to measure the basis risk in longevity hedges;
- Wan and Bertschi (2015), who proposed a two-part model to fit Swiss historical data and make coherent forecasts, taking information from a larger population;
- Antonio et al. (2017), who developed a Li and Lee multi-population model to project Dutch and Belgian mortality evolution and measure the actuarial implications of their model;
- Chen et al. (2017), who proposed “the use of parametric bootstrap methods to investigate the finite sample distribution of the maximum likelihood estimator for the parameter vector of a stochastic mortality model”;
- Hunt and Blake (2017), who modeled the mortality rates of a pension scheme through an Age Period Cohort (APC) model that has the same form as the reference population model but is characterized by scaling factors that multiply period and cohort parameters and reduce or increase the dependence between the two models;
- Villegas et al. (2017), who developed a comprehensive comparative study of mortality models for two populations proposed in the literature and applied them to the case of a population of a pension scheme in order to measure the basis risk involved in longevity hedges;
- Wang et al. (2018), who proposed an approach based on a combination of data aggregation and mortality graduation applied to the empirical data from Taiwan and Taipei City.

We follow the approaches of Jarner and Kryger (2011) and Wan and Bertschi (2015). In both works, a two-step routine is used: in the first phase, the mortality of a reference population is modeled; in the second, the parameters of the mortality spread between the two populations are estimated. The Jarner and Kryger model is based on the hypothesis that mortality in the small population evolves around a smooth surface (the trend, whose parameters are estimated in a frailty model). The deviation from this surface is the spread and it is modeled by regressors referring to the age-profiles and the evolution over time of its components. In particular, they propose three age-profile regressors to capture level, slope and curvature of the deviation between the reference surface and the specific small population mortality. Wan and Bertschi use a Plat model and a Lee–Carter model (with m time factors) to model the reference part and the spread, respectively. In both of these works, the reference population is a basket of populations worldwide that includes the population of interest; unlike their proposal, in our paper, the reference population is not a basket of populations. Similarly to Ahcan et al. (2014), we believe that the procedure is more effective when the reference population is close (geographically, historically or socio-economically) to the small population to be modeled (in our case the Maltese one). 
We use, as reference, the population of a single country, and the first step of the analysis consists of determining the population to be taken as a reference from a set of possible candidates. Once the reference population has been chosen, we construct a two-part mortality model in order to describe the evolution of the small population: the first component is the trend, the second component is the spread. Unlike the two works mentioned above, in this paper, the choice of the models to be used for the two populations takes place through a two-stage selection procedure: first, we identify the mortality models that most suitably represent the mortality of the reference population, then we choose the model for the spread through an analysis of all the possible combinations of the considered models, taking into account the results of the previous stage.

The rest of this paper is organized as follows: in Section 2, we analyze the possible reference populations and choose the best one; in Section 3, we choose the mortality models for the reference population and the Maltese one; Section 4 contains the forecasts obtained from the application of the models; Section 5 presents some final conclusions.

2. The Model Structure and the Choice of the Reference Population

2.1. The Model Structure

The goal of this study is to overcome the problems related to the small size of the mortality data for a small population. In particular, the purpose is to take information about the trend from the mortality dynamics of another population assumed as the reference one, which has the following characteristics:
- being larger than the one for which we want to predict mortality;
- having a background trend similar to that of the smaller population's mortality.

We denote by $d_{x,t,i}$ and $E_{x,t,i}$, respectively, the number of deaths and the exposure in population $i$, at age $x$ in year $t$. Data refer to $k$ ages, $x \in [x_1, x_k]$, and $n$ calendar years, $t \in [t_1, t_n]$. The central death rate is given by:

$$m_{x,t,i} = \frac{d_{x,t,i}}{E_{x,t,i}}. \qquad (1)$$

We assume that the number of deaths follows a Poisson distribution:

$$d_{x,t,i} \sim \mathrm{Poisson}\left(E_{x,t,i}\, m_{x,t,i}\right). \qquad (2)$$

The spread between the mortality of the small population, $s$, and the reference one, $r$, is modeled as follows:

$$m_{x,t,s} = \hat{m}_{x,t,r}\, S_{x,t}, \qquad (3)$$

where $\hat{m}_{x,t,r}$ are the fitted values of death rates for the reference population and $S_{x,t}$ is the spread between the death rates of the two populations.

Under these assumptions, the first step is to choose the reference population. The methodologies for this choice are neither unique nor simple and must be adapted to the research question and to the characteristics of the available data. The shape of the mortality rates of the small population should be similar to that of the reference one. Moreover, there should be no differences in trend between the selected reference population and the small one. In the literature, several useful indices have been presented in order to measure similarities and differences of mortality between different populations. Keyfitz and Caswell (2005), for example, have shown different approaches to the study of similarities between phenomena in different populations and over time. The use of direct and indirect standardization is very common when the populations being compared differ in the structure and scale of their records.

We have considered four different countries as candidates for the reference population: France, Italy, Spain and United Kingdom. These countries have similar patterns of demographic transitions. Malta started its demographic transition later, but completed it within a relatively short period of one generation. By the end of the 1970s, the birth and mortality rates fell to the same levels as other European countries that had completed their demographic transitions many years before. 
Among these four countries, the United Kingdom is historically and culturally close to Malta, which was a British colony from 1814 (Congress of Vienna) until the declaration of independence (21 September 1964). France, Italy and Spain were also evaluated according to a geographical criterion, considering that Italy is the closest to Malta and the other two border the Mediterranean Sea. Other Mediterranean countries were excluded from the analysis due to the socio-political divergences that inevitably differentiate them from Malta as regards mortality levels (for example North-African countries and Turkey) or due to lack of data (for example, Greece, for which data about the number of deaths are available on the Human Mortality Database only until 2013).

2.2. Mortality Data

In order to select the reference population and build the model, for each of the considered populations, we have used the historical mortality dataset composed of the number of deaths and central exposures by age and year, from 1999 up to 2016 and for the ages 60–89. For a short overview of Malta’s demographic situation, we report that the number of Maltese exposures of all ages in 2016 was equal to 437,479, while the number of exposures for the age range 60–89 was equal to 58,959 females and 51,438 males. Considering that mortality data of Malta are not available in the Human Mortality Database (while Eurostat provides data starting from year 2006), we cannot use the same data source for all the countries involved in the analysis. A part of the mortality data of Malta (data about the period from 2001 to 2014) is published online by the National Statistics Office of Malta, but the whole dataset has been obtained thanks to the Population and Migration Unit of the National Statistics Office (NSO) of Malta. The deaths in this dataset represent the current values registered in the country, analysed by the Health Information and forwarded to the NSO. 
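Given deaths and central exposures by age and year, as in the dataset described above, the central death rates of Equation (1), the Poisson log-likelihood implied by Equation (2) and the empirical spread of Equation (3) can be computed directly. The following is a minimal sketch under stated assumptions: the function names and the toy numbers are hypothetical and are not taken from the paper's data, and the "fitted" reference rates are typed in by hand rather than produced by a mortality model.

```python
import math


def central_death_rates(deaths, exposures):
    """Equation (1): m[x][t] = d[x][t] / E[x][t] on an age-by-year grid."""
    return [[d / e for d, e in zip(d_row, e_row)]
            for d_row, e_row in zip(deaths, exposures)]


def poisson_log_likelihood(deaths, exposures, rates):
    """Log-likelihood of Equation (2), d ~ Poisson(E * m), summed over all cells."""
    total = 0.0
    for d_row, e_row, m_row in zip(deaths, exposures, rates):
        for d, e, m in zip(d_row, e_row, m_row):
            mean = e * m
            total += d * math.log(mean) - mean - math.lgamma(d + 1)
    return total


def empirical_spread(rates_small, rates_reference_fitted):
    """Equation (3) rearranged: S[x][t] = m_small[x][t] / m_hat_reference[x][t]."""
    return [[ms / mr for ms, mr in zip(s_row, r_row)]
            for s_row, r_row in zip(rates_small, rates_reference_fitted)]


# Hypothetical toy grid: 2 ages (rows) x 3 years (columns).
deaths_small = [[12, 10, 11], [30, 28, 25]]
exposures_small = [[2000.0, 2000.0, 2200.0], [1500.0, 1400.0, 1250.0]]
# Stand-in for fitted reference-population rates (assumed, not estimated here).
rates_reference_fitted = [[0.005, 0.005, 0.004], [0.016, 0.016, 0.016]]

rates_small = central_death_rates(deaths_small, exposures_small)
spread = empirical_spread(rates_small, rates_reference_fitted)
loglik_observed = poisson_log_likelihood(deaths_small, exposures_small, rates_small)
loglik_reference = poisson_log_likelihood(deaths_small, exposures_small,
                                          rates_reference_fitted)
```

In this toy grid, a spread above 1 means that the small population's observed rate exceeds the reference rate in that cell. The log-likelihood evaluated at the observed rates is never lower than at any other set of rates, which is why raw small-population rates fit their own data well while remaining very noisy.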
Data concerning the mid-year population aged $x$ in year $t$ are obtained as the average of the populations at the end of consecutive years. These data are about the total population in Malta (in this section, “total population” denotes the population that includes both Maltese and foreign residents). The population by age on 31 December 1998 and 1999 was provided only for the Maltese population, $M_{\mathrm{pop}}(x,t)$ (we consider only the “Maltese born population”, excluding permanent foreign residents). However, the total population of all ages, $T_{\mathrm{pop}}(t)$, on 31 December 1998 and 1999 was provided. In order to use consistent data for the entire time period, the estimates of total population by age on 31 December 1998 and 1999, $T_{\mathrm{pop}}(x,t)$, are obtained from the available data by applying the following formula:

$$T_{\mathrm{pop}}(x,t) = \frac{M_{\mathrm{pop}}(x,t)}{M_{\mathrm{pop}}(t)}\, T_{\mathrm{pop}}(t), \qquad (4)$$

where $M_{\mathrm{pop}}(t) = \sum_{x=0}^{\omega} M_{\mathrm{pop}}(x,t)$, $t$ = 31.12.1998, 31.12.1999 and $x = 0, \ldots, \omega$, with $\omega$ denoting the extreme age. This formula has been applied both for female and male populations.

For other countries, data are available on the Human Mortality Database (HMD) (HMD (2018)), except for data about Italy in the years 2015–2016, which are obtained from ISTAT (National Institute of Statistics) (ISTAT (2015), ISTAT (2016)). As documented in the “Data sources” section of the HMD website, the data about Italy available in the Human Mortality Database are provided by ISTAT, therefore the two data sources are aligned. Specifically, for 2015 and 2016, the values of mid-year population by age are obtained from those of the population at the beginning of the years (published on the ISTAT website). As regards the number of deaths in 2015 and 2016, deaths by age are not available on the ISTAT website, so they have been obtained through a two-step procedure: firstly, the mortality rates are derived from the death probabilities; then, from these values and those of mid-year population, the numbers of deaths are easily calculated. 
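The two data adjustments described above can be sketched as follows: the pro-rata estimate of Equation (4), and the two-step derivation of deaths from death probabilities. This is a minimal illustration, not the authors' code: the function names and toy figures are hypothetical, and the conversion from probabilities to rates uses the standard relation $m_x = 2q_x/(2 - q_x)$, which assumes deaths are spread uniformly over the year of age.

```python
def estimate_total_by_age(maltese_by_age, total_all_ages):
    """Equation (4): T_pop(x,t) = M_pop(x,t) / M_pop(t) * T_pop(t).

    The known all-ages total is distributed over ages in proportion to the
    Maltese-born age structure.
    """
    maltese_all_ages = sum(maltese_by_age)  # M_pop(t), summed over all ages
    return [m_x / maltese_all_ages * total_all_ages for m_x in maltese_by_age]


def rate_from_probability(q):
    """Central death rate from a one-year death probability: m = 2q / (2 - q)."""
    return 2.0 * q / (2.0 - q)


def deaths_from_probabilities(probabilities, midyear_population):
    """Step 1: convert q_x to m_x. Step 2: deaths = m_x * mid-year population."""
    return [rate_from_probability(q) * p
            for q, p in zip(probabilities, midyear_population)]


# Hypothetical toy figures for three age groups (not taken from the paper).
maltese_by_age = [100.0, 300.0, 600.0]
total_by_age = estimate_total_by_age(maltese_by_age, total_all_ages=1100.0)

# Hypothetical death probabilities and mid-year populations for two ages.
estimated_deaths = deaths_from_probabilities([0.01, 0.02], [5000.0, 4000.0])
```

By construction, the estimated age-specific totals add up to the known all-ages total, and each age keeps the same share it has in the Maltese-born population.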
In order to check the reliability of this procedure, we have taken data from ISTAT for a randomly chosen previous year and estimated the number of deaths by age in the same way. Comparing the values obtained with those available on the HMD, we find that almost all the percentage differences are negligible.

2.3. The Choice of the Reference Population

In this section, we report different indices, calculated by comparing the mortality of Malta with the mortality of the four candidate countries for the reference population; the indices are calculated for both genders. It is worth plotting the evolution of the death rates by five-year age ranges (Figures 1 and 2 for females and males, respectively).

Footnote: Although the data on the Eurostat site are available for France, United Kingdom, Italy and Spain for the whole period 1999–2016, we considered it more convenient to use the data from HMD, immediately available in R thanks to the functions of the “StMoMo” package.

Footnote: As pointed out in ISTAT (2001), the graduation (smoothing) of rates that leads to probabilities is quite complex; considering the low prominence of these records in this work, the standard formula $m_x = \frac{2 q_x}{2 - q_x}$ is applied.

As expected, the age(-range)-specific death rates of Malta over time are much less stable than those of the other countries. This is attributable to the scale of the available data, which generates more random fluctuations than in bigger populations. In this case, because of the variability of the rates, we cannot gather evidence on which trend is closer to the Maltese one. The comparison between the countries is affected by the different population sizes and mostly by the different age structures. In order to have a more realistic comparison, a common method is to use standardization procedures. We use a direct standardization in which the standard age structure is provided by a standard population. 
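Direct standardization can be sketched in a few lines: each country's age-specific death rates are weighted by the exposures of the standard population, so that differences in age structure no longer affect the comparison. The sketch below is illustrative only; the function name and the toy figures are hypothetical and are not taken from the paper's data.

```python
def standardized_death_rate(standard_exposures, rates):
    """Directly standardized death rate for one year.

    Computes sum_x E[x] * m[x] / sum_x E[x], where E[x] are the exposures of
    the standard population and m[x] the age-specific rates being compared.
    """
    weighted_deaths = sum(e * m for e, m in zip(standard_exposures, rates))
    return weighted_deaths / sum(standard_exposures)


# Hypothetical toy figures for two age groups in a single year.
standard_exposures = [1000.0, 500.0]   # age structure of the standard population
rates_standard = [0.010, 0.040]        # the standard population's own rates
rates_other = [0.008, 0.036]           # rates of a country being compared

# With its own rates, the standard population's SDR is its crude death rate.
sdr_standard = standardized_death_rate(standard_exposures, rates_standard)
sdr_other = standardized_death_rate(standard_exposures, rates_other)

# Ratio of the two: values below 1 indicate lower mortality than the standard.
ratio_to_standard = sdr_other / sdr_standard
```

Because both rates are weighted by the same exposures, the ratio between them reflects differences in age-specific mortality only, not differences in population size or age composition.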
Considering that the purpose here is to compare the mortality of different countries with that of Malta, we decided to take Malta as the standard population. The Standardized Death Rates (SDR), plotted over time in Figure 3, are defined as:

$$SDR_{t,i} = \frac{\sum_{x=60}^{89} E_{x,t,s}\, m_{x,t,i}}{\sum_{x=60}^{89} E_{x,t,s}}, \qquad (5)$$

where $i$ = France, Italy, Malta, Spain and UK, $s$ = standard population (Malta) and $t = 1999, \ldots, 2016$. This SDR allows us to eliminate the differences attributable to the different sizes and structures of the populations. Once the effect of the structure is removed, the values are comparable and give a measure of closeness or distance of mortality in the countries considered. From Figure 3, we can observe that the general trend and the slight decrease over time of Maltese mortality seem to be closer to the dynamics of UK mortality than to those of the other countries.

[Figure 1. Age(-range)-specific female death rates: (a) at age 60–64, (b) at age 65–69, (c) at age 70–74, (d) at age 75–79, (e) at age 80–84, (f) at age 85–89. Each panel plots rates against year for ITA, UK, MT, ESP and FR.]

[Figure 2. Age(-range)-specific male death rates: (a) at age 60–64, (b) at age 65–69, (c) at age 70–74, (d) at age 75–79, (e) at age 80–84, (f) at age 85–89. Each panel plots rates against year for ITA, UK, MT, ESP and FR.]

[Figure 3. (a) female yearly standardized death rates, (b) male yearly standardized death rates. Each panel plots rates against year for ITA, UK, MT, ESP and FR.]

Additionally, we can calculate a Relative Measure of Mortality (RM), by dividing the SDRs of France, Italy, Spain and UK by the crude death rates of Malta, obtaining the following formula:

$$RM_{t,i} = \frac{\sum_{x=60}^{89} E_{x,t,s}\, m_{x,t,i}}{\sum_{x=60}^{89} E_{x,t,s}\, m_{x,t,s}}, \qquad (6)$$

where $i$ = France, Italy, Spain and UK, $s$ = standard population (Malta) and $t = 1999, \ldots, 2016$. The closer a country's RM index stays to 1 over time, the closer its mortality is to that of Malta. From Figure 4, we can observe that the values of RM of France, Italy and Spain are systematically under 1 (and under those of UK), showing that their mortality is lower than that of Malta in each year, except for male mortality in 2010 and 2016. The results obtained with this additional measure suggest that the United Kingdom may be suitable as a reference to model the mortality of Malta. In order to have quantitative support for the selection of the reference population, further analysis follows in the next section.

[Figure 4. (a) RM for female population, (b) RM for male population. Each panel plots RM against year for ITA, UK, ESP and FR.]

2.4. An Alternative Approach for the Choice of the Reference Population

In defining the reference population, an alternative approach could be to mix data from different countries as in Ahcan et al. (2014). In particular, they proposed to mitigate the fluctuations in the mortality profile of a small population (the Slovenian one) while preserving its own features, by replacing the mortality rates of the small population with the weighted averages of neighboring countries' mortality data. 
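The idea of approximating a small population's rates with a weighted average of other countries' rates can be sketched as a constrained least-squares problem: the weights are non-negative, sum to one, and are chosen to minimize the squared differences from the observed rates. The sketch below is only an illustration under stated assumptions. It uses a coarse grid search rather than a numerical optimizer (the source does not say which solver was used), all names are hypothetical, and the toy rates are constructed so that one candidate matches the target exactly; they are not taken from the paper's data.

```python
def integer_compositions(parts, total):
    """Yield every tuple of `parts` non-negative integers that sums to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in integer_compositions(parts - 1, total - first):
            yield (first,) + rest


def squared_error(target, candidates, weights):
    """Sum of squared differences between target rates and the weighted average."""
    error = 0.0
    for k, observed in enumerate(target):
        average = sum(w * rates[k] for w, rates in zip(weights, candidates))
        error += (observed - average) ** 2
    return error


def fit_basket_weights(target, candidates, grid_steps=20):
    """Grid search over weights w_i >= 0 with sum(w) = 1, spacing 1/grid_steps."""
    best_weights, best_error = None, float("inf")
    for composition in integer_compositions(len(candidates), grid_steps):
        weights = [c / grid_steps for c in composition]
        error = squared_error(target, candidates, weights)
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights, best_error


# Hypothetical toy rates, flattened over (age, year) cells.
target = [0.010, 0.020, 0.040, 0.011, 0.021, 0.041]
candidates = [
    [0.008, 0.017, 0.035, 0.008, 0.017, 0.034],  # candidate 1
    [0.007, 0.016, 0.034, 0.007, 0.016, 0.033],  # candidate 2
    [0.007, 0.015, 0.033, 0.007, 0.015, 0.032],  # candidate 3
    [0.010, 0.020, 0.040, 0.011, 0.021, 0.041],  # candidate 4, equal to the target
]

best_weights, best_error = fit_basket_weights(target, candidates)
```

In this constructed example, the search puts all the weight on the fourth candidate, because any weight on the other three pulls the average below the target in every cell. A grid with spacing 0.05 is far too coarse for real data; a quadratic-programming solver would be the natural replacement.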
This approach is based on the minimization of the sum of the squared differences between the observed age-specific mortality rates of the small population and the weighted averages of the age-specific mortality rates of different countries. We have applied this method to the observed Maltese mortality rates and the weighted averages of the mortality rates of the four populations considered as possible reference: France, Italy, Spain and United Kingdom. The optimization problem is defined as follows:

$$\min_{w_i} \sum_{t=1999}^{2016} \sum_{x=60}^{89} \left(m_{x,t,s} - m_{x,t,AVE}\right)^2, \quad i = 1, \ldots, 4,$$
$$\text{s.t.} \quad w_i \ge 0, \quad \sum_{i=1}^{4} w_i = 1, \qquad (7)$$

where

$$m_{x,t,AVE} = \sum_{i=1}^{4} w_i\, m_{x,t,i} \qquad (8)$$

and population 1 is France, population 2 is Italy, population 3 is Spain and population 4 is the United Kingdom, while population $s$ is, as usual, the Maltese one. From this optimization procedure, we find that the weight of the British component in the optimal basket is over 0.99, while the weights for France, Italy and Spain together account for the remainder.

This result confirms that the United Kingdom is the best candidate as reference population and that it would not be useful to consider a basket of countries, given that the weight of the remaining countries would be substantially insignificant. For all these reasons, in the following, we use UK mortality data as reference in the estimation of Maltese mortality.

3. The Mortality Models Choice

Once the reference population used to “borrow” information on the general trend of the mortality profile of the small population has been chosen, the next step concerns the choice of the mortality model for the reference population and the model for the spread. 
In order to do this, we followed a two-step procedure: first, we identified the model that best fits the observed mortality data of the reference population; then, we chose the spread model by analyzing the possible combinations between the models under consideration.

3.1. The Reference Population Mortality Model

In order to choose the model for the reference population, we fit to UK mortality data, separately for each sex, the LC (Lee and Carter (1992)), APC (Currie (2006)), RH (Renshaw and Haberman (2006)) and Plat (Plat (2009)) models, defined in Table 1.

Table 1. Mortality models for the reference population.

Model   Formula
LC      $\log(m_{x,t,r}) = \alpha_x^{(r)} + \beta_x^{(1,r)} \kappa_t^{(1,r)}$
APC     $\log(m_{x,t,r}) = \alpha_x^{(r)} + \kappa_t^{(1,r)} + \gamma_{t-x}^{(r)}$
RH      $\log(m_{x,t,r}) = \alpha_x^{(r)} + \beta_x^{(1,r)} \kappa_t^{(1,r)} + \beta_x^{(0,r)} \gamma_{t-x}^{(r)}$
(r) (1,r) (2,r) (r) Plat log(m ) = a + k + (x ¯