Accuracy of Five Multiple Imputation Methods in Estimating Prevalence of Type 2 Diabetes based on STEPS Surveys
- https://doi.org/10.2991/jegh.k.191207.001
- Multiple imputation, nonresponse, STEPS surveys
Background: This study aimed to evaluate five Multiple Imputation (MI) methods in the context of STEP-wise Approach to Surveillance (STEPS) surveys.
Methods: We selected a complete subsample of a STEPS survey data set and devised an experimental design consisting of 45 states (3 × 3 × 5), which differed by rate of simulated missing data, variable transformation, and MI method. In each state, the simulation of missing data followed by MI was repeated 50 times. Evaluation was based on Relative Bias (RB) as well as five other measurements, all averaged over the 50 repetitions.
Results: In estimation of the mean, Predictive Mean Matching (PMM) and Multiple Imputation by Chained Equation (MICE) could compensate for the nonresponse bias. Ln and Box–Cox (BC) transformations should be applied when the nonresponse rate reaches 40% and 60%, respectively. In estimation of the proportion, PMM, MICE, bootstrap expectation maximization algorithm (BEM), and linear regression accompanied by BC transformation could correct for the nonresponse bias. Our findings show that even with a 60% nonresponse rate, some of the MI methods could yield estimates with negligible RB.
Conclusion: The choice of MI method and variable transformation should be made with caution. No single method can be regarded as the worst or the best overall; each method can outperform the others when it is applied in the right situation. Even within a given situation, one method may be the best in terms of validity while another is the best in terms of precision.
- © 2020 Atlantis Press International B.V.
- Open Access
- This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).
1. INTRODUCTION
In many countries, the noncommunicable disease risk factor survey, also known as the STEP-wise Approach to Surveillance (STEPS), is regarded as the main source of information on the prevalence of type 2 diabetes. The general structure of STEPS, as well as its questionnaire and analysis process, has been devised by the World Health Organization.
However, some participants do not provide a blood sample. As a result, biochemical variables, including Fasting Blood Glucose (FBG), contain some missing data (on average about 19% in Iran’s STEPS surveys), which can damage the estimates in terms of both precision and validity. The degree to which missing data damage the results, and the decision on how to deal with them, depend mostly on the mechanisms through which the missing data arise.
Missing data arise under three mechanisms. First, Missing Completely at Random (MCAR), in which missingness is independent of the values of other variables as well as of the missing value itself. This kind of missing data reduces the sample size but does not bias the estimates. Second, Missing at Random (MAR), in which the probability of a value being missing is independent of the value itself after adjusting for other variables. Third, Missing not at Random (MNAR), in which the probability of a value being missing depends on the value itself even after adjusting for other variables. It has been argued that under MAR or MNAR there is a need for action because, besides the reduction in sample size, the estimates will be biased.
There is a wide range of methods to deal with missing data, among which Multiple Imputation (MI) has received much attention in all fields of research [2,3]. Its popularity stems from its high efficiency and its ability to produce unbiased results under the MAR as well as the MCAR assumption.
Multiple imputation, proposed by Rubin in 1987 and developed by others including Schafer, substitutes each missing value with more than one (M) value and produces M data sets that are complete but differ in their imputed values. The target analysis is then performed on each data set, and finally the corresponding estimates are combined according to Rubin’s combination rules as follows [Equations (1)–(4)]:
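The equations referenced here did not survive in this copy of the text. As a reminder, the standard textbook form of Rubin’s rules for a scalar estimate is sketched below. The symbols are notation assumed for this sketch and may differ from the authors’ original notation: $\hat{Q}_m$ is the estimate from the m-th completed data set, $U_m$ its estimated variance, $\bar{Q}$ the pooled estimate, $W$ the within-imputation variance, $B$ the between-imputation variance, and $T$ the total variance.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{M}\sum_{m=1}^{M}\hat{Q}_m \\
W &= \frac{1}{M}\sum_{m=1}^{M} U_m \\
B &= \frac{1}{M-1}\sum_{m=1}^{M}\left(\hat{Q}_m-\bar{Q}\right)^{2} \\
T &= W+\left(1+\frac{1}{M}\right)B
\end{align*}
```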
Over the years, MI has been developed into several methods to fit different settings, because its performance depends on the structure of the data set, the scale and distribution of the variables, the correlation among the variables involved, and the rate and mechanism of missing data.
So, to find which method is best suited to estimating the prevalence of type 2 diabetes from STEPS data, we need to evaluate the accuracy of the most advanced MI methods in the context of the STEPS survey. This evaluation can be further justified for the following reasons:
First, most studies that evaluated and compared the accuracy of MI methods have not included the wide range of methods now available in statistical software. Second, no study has yet been based on STEPS surveys, possibly because MI may not result in proper imputation in data sets with complex structure [2,7]. Third, it is not clear what percentage of nonresponse corresponds to a given amount of bias in STEPS results, so even if we accept that one method is superior to the others, it is desirable to know how much nonresponse bias that method can compensate for.
In view of the above, this study was carried out to evaluate the accuracy of five MI methods in estimating the prevalence of type 2 diabetes based on STEPS surveys. These MI methods are Expectation–Maximization with Bootstrapping Algorithm (BEM), Multivariate Normal Regression (MVN), Linear Regression (LR), Multiple Imputation by Chained Equation (MICE), and Predictive Mean Matching (PMM).
2. MATERIALS AND METHODS
2.1. Data Source
We used the data set of the STEPS survey conducted by Iran’s Ministry of Health in 2007, in which a sample of 30,000 Iranians aged 15–64 years had been recruited. A subsample of the data set was then drawn in a way that preserved the structure of the survey. For this purpose, the seven of 30 provinces with the least missing data in the FBG variable were selected. The selected subsample consisted of 5099 participants and 21 variables on demographic, behavioral, anthropometrical, biochemical, and sampling characteristics. A full explanation of the STEPS survey and its results is beyond the scope of this article and is available elsewhere [8–10].
2.2. Experimental Design
Experimental design of this study consisted of 45 states (3 × 3 × 5), which differed by rate of simulated missing data (20%, 40%, and 60%), variable transformation [no transformation, natural logarithm (Ln), and Box–Cox (BC)], and MI method (BEM, MICE, MVN, LR, and PMM).
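As a rough illustration of this factorial layout, the 45 states can be enumerated as the Cartesian product of the three factors. The labels and variable names below are placeholders chosen for this sketch, not identifiers from the authors’ code.

```python
from itertools import product

# Factor levels as described in the text; the string labels are illustrative.
MISSING_RATES = (0.20, 0.40, 0.60)
TRANSFORMS = ("none", "ln", "box-cox")
METHODS = ("BEM", "MICE", "MVN", "LR", "PMM")
N_REPETITIONS = 50  # each state is simulated 50 times


def build_design():
    """Return one dict per experimental state (3 x 3 x 5 = 45 states)."""
    return [
        {"missing_rate": rate, "transform": transform, "method": method}
        for rate, transform, method in product(MISSING_RATES, TRANSFORMS, METHODS)
    ]


states = build_design()
print(len(states))                  # 45
print(len(states) * N_REPETITIONS)  # 2250 simulation runs in total
```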
2.3. MI Procedures
In this study, five MI methods were considered as follows:
Bootstrap expectation–maximization (BEM) algorithm, which uses the expectation–maximization algorithm introduced by Dempster to estimate the posterior distribution of incomplete data and then a bootstrap method to take draws from this posterior distribution; it is considered fast and robust. Since this procedure lets us simultaneously impute more than one variable without distinguishing between dependent and independent variables, all biochemical variables were included in the imputation process.
Multiple imputation by chained equation (MICE), also known as switching regression or sequential regression multivariate imputation, is built entirely on conditionally specified models. It does not require the assumption of multivariate normality [13–15]. As with BEM, the biochemical variables were included in the imputation process. Since each variable is imputed using its own imputation model, MICE can handle different variable types (continuous, binary, unordered categorical, and ordered categorical).
Multivariate normal regression (MVN), in which missing values are imputed by drawing values from a posterior distribution estimated by a Markov Chain Monte Carlo algorithm [4,5]. It assumes multivariate normality but is generally robust to departures from this assumption and can still provide valid estimates. With the MVN approach, several response variables can be considered, but all predictors must be completely observed. For MVN and the next two approaches, we applied the MI system implemented in Stata version 11.2 (StataCorp. 2009. Stata Statistical Software: Release 11, College Station, TX: StataCorp LP).
Univariate Linear Regression (UVR; the method abbreviated LR elsewhere in this article), which imputes a single variable by drawing from the posterior predictive distribution of a univariate normal linear regression model. The imputation model includes the variable being imputed as the response and some complete predictors.
Predictive mean matching (PMM), first introduced by Rubin. It reduces bias by imputing real values sampled from the observed data. It is a semi-parametric version of UVR based on the same model, but instead of replacing missing values with model predictions, PMM draws randomly from the observed values whose predicted means lie within a prespecified distance of the predicted mean of the missing value.
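To make the matching idea concrete, here is a minimal single-predictor sketch of PMM. It is a simplification and not the Stata implementation used in the study. It swaps the prespecified distance for a pool of the k nearest donors, and it fits the regression once rather than drawing the coefficients from their posterior, so it yields one imputation rather than a proper set of multiple imputations. The function names and the parameter `k` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(xs, ys, k=5, seed=0):
    """Fill None entries of ys with observed values from the k nearest donors.

    Nearness is measured on the predicted mean, not on the raw predictor.
    """
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed], [y for _, y in observed])
    # Each donor carries its predicted mean and its real observed value.
    donors = [(intercept + slope * x, y) for x, y in observed]
    completed = []
    for x, y in zip(xs, ys):
        if y is not None:
            completed.append(y)
            continue
        target = intercept + slope * x
        nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
        completed.append(rng.choice(nearest)[1])  # impute a real observed value
    return completed
```

Because every imputed value is copied from a donor, the completed variable never contains values outside the observed range, which is one reason PMM tends to cope with skewed variables such as FBG.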
To eliminate skewness in FBG and make it consistent with the normality assumption, we used Ln and BC transformations.
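For reference, the two transformations can be sketched as follows. The Box–Cox family reduces to the natural logarithm when its parameter is zero. The value of `lam` used in the example is purely illustrative; the text does not state which value was used or how it was chosen.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive value; lam == 0 gives the natural log."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Illustrative round trip for the diabetes cut-off of 126 mg/dl.
lam = -0.5  # hypothetical value, for demonstration only
z = box_cox(126.0, lam)
print(round(inverse_box_cox(z, lam), 6))  # 126.0
```

When imputation is carried out on a transformed scale, imputed values would normally be mapped back with the inverse transform before a cut-off such as FBG ≥ 126 mg/dl is applied.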
Simulation was restricted to generating missing data in the FBG variable. For each state of the experimental design, the following five steps were repeated 50 times:
Step 1: Generating MAR data in the FBG variable in such a way that the probability of being missing depended on sex and then on age. For this purpose, a variable P was created that took the value 1 for men, 2 for women aged 25–36, and 3 for all others. Then, a variable R was created that took values between 0 and 1 drawn randomly from a uniform distribution. Next, the data were sorted in ascending order of P and then R, and FBG was set to missing in the first 1000, 2000, or 3000 observations, giving missing data rates of 19.6%, 39.2%, and 58.8%, which were rounded to 20%, 40%, and 60%, respectively.
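Step 1 can be sketched as below. This is an illustrative reconstruction from the description, not the authors’ Stata code; the record fields `sex`, `age`, and `fbg` and the function names are assumptions made for the example.

```python
import random


def priority(record):
    """Variable P from the text: 1 for men, 2 for women aged 25-36, 3 otherwise."""
    if record["sex"] == "male":
        return 1
    if record["sex"] == "female" and 25 <= record["age"] <= 36:
        return 2
    return 3


def simulate_mar(records, n_missing, seed=0):
    """Return copies of records with FBG set to None for the first n_missing
    rows after sorting by P and then by a uniform random number R."""
    rng = random.Random(seed)
    keyed = [(priority(rec), rng.random(), idx) for idx, rec in enumerate(records)]
    keyed.sort()  # ascending P, ties broken by R
    to_blank = {idx for _, _, idx in keyed[:n_missing]}
    return [
        dict(rec, fbg=None) if idx in to_blank else dict(rec)
        for idx, rec in enumerate(records)
    ]
```

With 5099 participants, `n_missing` values of 1000, 2000, and 3000 correspond to 1000/5099 ≈ 19.6%, 2000/5099 ≈ 39.2%, and 3000/5099 ≈ 58.8%. Because missingness depends only on observed sex and age, and not on FBG itself, the mechanism is MAR by construction.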
Step 2: Calculation of FBG-related measurements, including the mean of FBG and the proportion with FBG ≥ 126 mg/dl as the prevalence of type 2 diabetes. This step reflects how the estimates were affected by missing data before MI. In the estimation process, the survey settings and weights, as defined by Iran’s STEPS survey team, were taken into account. More detailed information on the sampling, weighting, and survey settings of Iran’s STEPS survey is available elsewhere.
Step 3: Applying the MI method: the five MI methods considered were MICE, BEM, MVN, LR, and PMM. The independent variables used in all the methods were demographic, behavioral, anthropometrical, biochemical, and survey sampling variables. The survey sampling variables were included because the missing-data pattern might differ between strata or clusters. Finally, the number of imputations was set to 20 [20,21]. Because detailed information on the imputation process for all MI methods is widely available in the literature [5,6,13,22,23], a full explanation of the MI methods is beyond the scope of this work.
Step 4: Recalculation of the FBG-related measurements as in Step 2, intended to show how far MI was able to reduce the effect of missing data on the estimates.
Step 5: Calculation of Fraction of Missing Information (FMI), Relative Efficiency (RE), and Relative Variance Increased (RVI). The MI system of Stata was used to calculate these measurements based on the following Equations (5)–(8) [2,24,25]:
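The equations themselves are not reproduced in this copy of the text. The sketch below uses the commonly cited large-sample forms of these diagnostics for a scalar estimate, which may differ in detail from what Stata reports (for example, small-sample degrees-of-freedom adjustments are omitted). The function name and the returned keys are assumptions made for the example.

```python
import math


def mi_diagnostics(estimates, variances):
    """Pool M completed-data estimates and compute RVI, FMI, and RE.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need M >= 2 estimates with matching variances")
    q_bar = sum(estimates) / m                                # pooled estimate
    w = sum(variances) / m                                    # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)    # between-imputation variance
    t = w + (1 + 1 / m) * b                                   # total variance
    rvi = (1 + 1 / m) * b / w                                 # relative variance increase
    if rvi == 0:
        df, fmi = math.inf, 0.0
    else:
        df = (m - 1) * (1 + 1 / rvi) ** 2                     # large-sample degrees of freedom
        fmi = (rvi + 2 / (df + 3)) / (rvi + 1)                # fraction of missing information
    re = 1 / (1 + fmi / m)                                    # relative efficiency
    return {"estimate": q_bar, "total_variance": t, "df": df,
            "RVI": rvi, "FMI": fmi, "RE": re}
```

Under this definition RE depends only on FMI and the number of imputations. With M = 20, an FMI of 0.784 gives RE = 1/(1 + 0.784/20) ≈ 0.96 and an FMI of 0.14 gives RE ≈ 0.99, which shows why RE stays close to 1 even when FMI is large.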
As Steps 2, 4, and 5 were completed, the calculated measurements were recorded in a separate data set. In the end, this data set consisted of 2250 rows (50 simulations for each of the 45 states) and 10 columns representing measurements of Steps 2, 4, 5, and state.
2.5. Statistical Software
Multiple imputation by chained equation was performed in Stata 11.2 through the ice command. MVN, LR, and PMM were performed through the MI system of Stata version 11.2. BEM was performed with the R-based AMELIA II package version 3.4.2. However, for all MI methods, Steps 2, 4, and 5 were done in Stata to make the calculations consistent across all states.
2.6. Statistical Analysis
Based on the FBG-related measurements derived in Steps 2 and 4, Relative Bias (RB), Average Length (AL) of the 95% confidence interval, and Coverage Rate (CR) were computed before and after MI. These measurements, along with RE, FMI, and RVI, were then averaged over the 50 simulations of each state.
Relative bias is defined as the difference between the true value and the estimated value, divided by the true value. In this context, the true value was obtained from the complete data set, while the estimated value was obtained from the data set with missing data, before and after imputation.
Average length is the difference between the upper and lower limits of an estimate’s confidence interval and indicates the precision of the estimate. AL is preferred over the standard error because it takes into account that the estimate is based on Student’s t distribution.
Coverage rate is the proportion of estimates whose confidence intervals contain the true value.
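The three measures defined above can be computed over the repetitions of one state roughly as follows. This is a sketch under two assumptions: RB is expressed as a percentage (as in the figures quoted later in the article), and each repetition supplies an estimate with its confidence limits. The function names and the tuple layout are hypothetical.

```python
def relative_bias(true_value, estimate):
    """(true - estimate) / true, expressed as a percentage."""
    return 100.0 * (true_value - estimate) / true_value


def summarize(true_value, results):
    """Average RB, AL, and CR over repetitions.

    results: list of (estimate, lower_limit, upper_limit), one per repetition
    """
    n = len(results)
    rb = sum(relative_bias(true_value, est) for est, _, _ in results) / n
    al = sum(upper - lower for _, lower, upper in results) / n
    cr = sum(1 for _, lower, upper in results if lower <= true_value <= upper) / n
    return {"RB": rb, "AL": al, "CR": cr}


# Two made-up repetitions around a true value of 100.
print(summarize(100.0, [(98.0, 95.0, 101.0), (102.0, 101.0, 103.0)]))
# {'RB': 0.0, 'AL': 4.0, 'CR': 0.5}
```

The toy example also shows why RB should not be read alone: the two biases cancel on average, yet only one of the two intervals covers the true value.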
3. RESULTS
Tables 1–6 show RB, CR, and AL before and after imputation, as well as RE, FMI, and RVI, by MI method, transformation, and rate of missing data. In Tables 1–3, the target measurement is the mean of FBG, and in Tables 4–6, the target measurement is the proportion with FBG ≥ 126 mg/dl. For each target measurement, the tables are separated by rate of missing data.
Table 1. Accuracy of multiple imputation methods in estimation of mean of fasting blood glucose when nonresponse rate is 20%
Table 2. Accuracy of multiple imputation methods in estimation of mean of fasting blood glucose when nonresponse rate is 40%
Table 3. Accuracy of multiple imputation methods in estimation of mean of fasting blood glucose when nonresponse rate is 60%
Table 4. Accuracy of multiple imputation methods in estimation of proportion of observations with fasting blood glucose ≥126 mg/dl when nonresponse rate is 20%
Table 5. Accuracy of multiple imputation methods in estimation of proportion of observations with fasting blood glucose ≥126 mg/dl when nonresponse rate is 40%
Table 6. Accuracy of multiple imputation methods in estimation of proportion of observations with fasting blood glucose ≥126 mg/dl when nonresponse rate is 60%
Abbreviations used in Tables 1–6: MICE, multiple imputation by chained equation; MVN, multivariate normal regression; LR, linear regression; PMM, predictive mean matching; RB, relative bias; CR, coverage rate; AL, average length of 95% confidence interval; RE, relative efficiency; FMI, fraction of missing information; RVI, relative variance increased.
Table 1 shows that in estimating the mean of FBG at a 20% nonresponse rate, PMM results in the least RB (−0.01), while LR with Ln transformation performs best regarding AL (2.283), RE, FMI, and RVI. All methods except MICE with Ln transformation had perfect CR.
Table 2 demonstrates that in estimating the mean of FBG at a 40% nonresponse rate, among methods with perfect CR, MICE has the least RB (0.00), whereas BEM has the narrowest AL (2.49). However, regarding RE, FMI, and RVI, the best performance belongs to PMM combined with BC transformation.
Table 3 displays that in estimating the mean of FBG at a 60% nonresponse rate, PMM with BC transformation results in the least RB (0.00), while BEM performs best regarding AL (2.717), RE, FMI, and RVI. Both methods are among those with perfect CR.
Table 4 reveals that in estimating the proportion with FBG ≥ 126 mg/dl at a 20% nonresponse rate, PMM with BC transformation has the least RB (−0.40) and, along with some of the other methods, has perfect CR and the best AL and RE. Those other methods are slightly superior regarding FMI and RVI, but at the cost of much greater RB.
Table 5 illustrates that in estimating the proportion of observations with FBG ≥ 126 mg/dl at a 40% nonresponse rate, MICE with BC transformation yields the least RB (0.72) and perfect CR. Other methods with perfect CR perform better regarding AL, RE, and FMI; among these, PMM with Ln transformation also has the least RVI.
Table 6 indicates that in estimating the proportion of observations with FBG ≥ 126 mg/dl at a 60% nonresponse rate, MVN with BC transformation has the least RB (−0.82), but at the cost of a much lower CR and a very high RVI. Some of the other methods have perfect CR as well as the highest RE and lowest FMI; among these, LR with BC transformation has a shorter AL and lower RVI.
4. DISCUSSION
In this study, five MI methods were evaluated quantitatively in the context of the STEPS survey. In addition, the impact of variable transformation and of the rate of missing data on the performance of the MI methods was examined. As an example, with a 60% nonresponse rate, the RB in the proportion with FBG ≥ 126 mg/dl before imputation was −54.85% (the highest in this study), CR was 0, and the length of the CI was 0.03; however, MICE with BC transformation reduced the RB to 1.78%, increased the CR to 1, and reduced the length of the CI to 0.02.
In fact, we found that no single method can be regarded as the worst or the best overall. Each method can outperform the others when it is applied in the right situation. It is even possible that, in a given situation, one method is the best in terms of validity while another is the best in terms of precision. So, the findings of this study can serve as a guide for choosing the best strategy for dealing with nonresponse in surveys like STEPS. However, if the performance of the MI methods, as shown here, is unsatisfactory, it is still possible to improve it. One way is to enhance the survey questionnaire with variables correlated with the target variable or to gather some information about unit nonparticipants [27,28].
Some features of the present study can be regarded as strengths. For example, for the first time, the performance of a wide range of the most advanced MI methods available in statistical software was evaluated in the context of the STEPS survey. Besides, a simulation technique was used to generate missing data in a real data set, so the true values of the missing data were known for the evaluation of the MI methods.
On the other hand, some limitations of this study should be considered. First, only one quantitative variable was used for generating missing data, imputation, and all estimations. Second, only three rates of missing data were considered, so the results are silent on rates between or outside these values. Third, the number of simulations was arbitrarily set to 50, which might not be sufficient in some cases. Although the data set used in this study was not new, this does not pose a serious challenge, as the study did not aim to provide descriptive or analytical measures but to clarify how to deal with nonresponse in STEPS surveys, which are almost the same regarding the features that affect the performance of MI methods.
Consistent with similar studies, the MI methods were evaluated based on RB, CR, AL, RE, FMI, and RVI [6,30]. RB is a measure of invalidity and should be considered large if it is more than 2% or 5% in either direction [6,31]. CR is also a measure of validity but at the same time is a function of AL, which is a measure of precision. RE, FMI, and RVI are functions of the number of imputations and the ratio of between- to within-imputation variance. Better imputation results in lower values of RVI and FMI and higher values of RE [2,24,25].
We observed that RVI differed only slightly across the five MI methods within each experimental state. However, in the state with 60% missing data, MVN had considerably higher RVI (Tables 3 and 6). This can be explained by the process we employed to generate missing data, which increased the variance of FBG. It seems that when the rate of missing data reaches 60%, MVN cannot tolerate the increase in the variance.
Our findings show that REs were very similar, ranging from 0.96 to 0.995, even between states that differed in other measurements. For example, PMM in Table 3 and BEM in Table 5 have the same RE (0.96) but different RB (0.68 and −52.2) and CR (1.00 and 0), respectively. So, MI methods with the same RE should not necessarily be regarded as completely equivalent.
It is said that FMI would be substantially less than the rate of missing data when the imputed data are highly predictive for the missing data, but we observed the contrary on some occasions. For example, in Table 4, RB is 26.31 while FMI is 0.14, almost 30% less than the rate of missing data. On the other hand, in Table 3, RB is −0.68 while FMI is 0.784, 30% more than the rate of missing data.
Our results showed that the performance of the MI methods was better when the target measurement was the mean rather than the proportion. For example, under MAR, 20% missing values could bias the proportion with FBG ≥ 126 mg/dl, while 60% missing values were needed to bias the mean of FBG.
In situations where missing values do not bias the estimates, the decision to use MI should be made with caution, as some MI methods can yield lower CR while others can result in a narrower confidence interval.
So, our findings are consistent with the literature indicating that the performance of MI methods depends on the target analysis and the rate of missing data and can be improved by selecting a proper variable transformation. Besides, as some studies have also shown in other settings, there are some MI methods with almost similar performance that can considerably correct for nonresponse bias in STEPS surveys [4,18,21,22].
CONFLICTS OF INTEREST
The authors declare they have no conflicts of interest.
HH and JH contributed to the conceptualization and writing of the first draft of the manuscript. EB and RA contributed to literature research. HH and SH did data analysis, writing and revision of the manuscript. JH supervised investigation and methodology. HH supervised the research project. HH and SH reviewed and edited the final draft.
This study was financially supported by
Cite this article
Hamid Heidarian Miri, Jafar Hassanzadeh, Saeedeh Hajebi Khaniki, Rahim Akrami, Ehsan Baradaran Sirjani. Accuracy of Five Multiple Imputation Methods in Estimating Prevalence of Type 2 Diabetes based on STEPS Surveys. Journal of Epidemiology and Global Health. 2020;10(1):36–41. https://doi.org/10.2991/jegh.k.191207.001