workshop

Introduction

Diseases can spread in any country or region of the world. Wherever humans live, diseases can pass from person to person. Much of this spread is driven by extensive human connectivity and travel. One good way to help prevent epidemics is therefore to gather more information about how a disease spreads, which can be done by measuring and mapping human networks and travel patterns. This approach is used in spatial epidemiology, a field concerned with disease and its geographic variation. An important idea in spatial epidemiology is that each region comes with its own geography and set of conditions that affect how a disease emerges, incubates, and spreads. A consistent finding across studies is high heterogeneity between countries, and even between regions of the same country. This makes the characteristics of each disease highly specific to the area being studied.

Overall, the healthcare aspect of spatial epidemiology relates to Amartya Sen’s definition of human development. Sen explains that human development should be viewed as freedom: regions should develop so that their people can expand their freedoms, starting with basic freedoms such as good health. For a country like Nigeria to develop, it should therefore focus on improving healthcare access or providing health interventions to address any situation that prevents good health for all. The ability to provide healthcare facilities, for example, is being improved through data science methodologies, as explained later in this literature review.

This literature review examines Nigeria and its struggle with malaria, an infectious disease in which parasites such as Plasmodium falciparum are transmitted by mosquitoes. A major driver of its spread between regions is human movement, enabled by high and ever-growing connectivity. The disease is currently prevalent in sub-Saharan Africa, where ecological and environmental conditions favor the breeding of mosquitoes and the development of parasites. Data science methods, such as using mobile phone data to map human movement or satellite imagery to locate water sources, can help gather the information needed to design and implement interventions for eliminating the disease.

The purpose of this research is to improve on the use of sub-national household surveys for generating population age-structure density models. It does so by specifying, estimating, and validating such a model, and by comparing it with a population density model estimated from census data.

Human Development Process

Malaria is found in countries all over sub-Saharan Africa. Many of these countries have implemented interventions to reduce the spread of malaria, particularly among children under the age of 5, who are the most vulnerable. Since 2000, effective malaria control has increased and global malaria mortality rates have declined. For example, the Swaziland National Malaria Control Program reports a decline in malaria cases from 2.9 to 0.07 per 1,000 people since 1999. Intervention methods include, but are not limited to, indoor residual spraying (IRS), insecticide-treated bed nets (ITNs), long-lasting insecticidal nets (LLINs), and vaccinations.

The success of these interventions depends on highly contextualized population information, geographical data, and other supporting information. Before interventions or policies can be planned, governments and organizations, such as the Global Malaria Programme (GMP) of the World Health Organization (WHO), need a reliable set of data on which to base their strategies. To reach a sustainable development goal in the most effective way, countries need to invest in high-quality data so that these organizations can strengthen their scientific basis for decision making. Such data can be obtained in a number of ways, including nationally conducted censuses, malaria indicator surveys (MIS), nationally-representative surveys, and demographic health surveys (DHS). Depending on how detailed and well resourced these surveys are, they can capture population demographics, household size, health, and other factors. The precision of a survey ultimately depends on its sample size, budget, sample clusters, and how often it is carried out. Certain types of surveys are more useful for malaria interventions than others. A comparison of these survey types suggests that nationally-representative household surveys are a better basis for estimating health metrics used to target interventions.

Even though nationally-representative household surveys appear better than the alternatives, some doubts about them remain. Examining the effectiveness and precision of such surveys is important because their data often serve as the basis for data science analysis and, in turn, healthcare interventions. Surveys are typically time-consuming, resource-intensive, and expensive, so examining them can lead to a survey design that maximizes effectiveness. The potential harm examined here is the error introduced by inadequate sample sizes when estimating disease prevalence, with malaria in sub-Saharan Africa as the specific example. This topic is significant because there is an increasing demand for reliable, high-quality data to support decision making and progress towards sustainable development goals, especially in low- and middle-income countries. Surveys are a good source of such data because they provide timely information and are widely available. Nationally-representative surveys are the main source of data for establishing a scientific evidence base, monitoring disease, and evaluating overall healthcare, but they sometimes pay too little attention to defining indicator-relevant sample sizes that ensure high precision. All in all, there is a gap in the data-gathering process that should be researched further.
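The link between sample size and precision can be illustrated with a minimal sketch. It assumes a simple normal (Wald) approximation for a surveyed proportion and a design effect that inflates the variance under cluster sampling; the function names, the design effect values, and the numbers are hypothetical and are not taken from the surveys discussed here.

```python
import math


def prevalence_interval(positives, sample_size, design_effect=1.0, z=1.96):
    """Return (estimate, lower, upper) for a surveyed prevalence.

    Uses the normal (Wald) approximation. design_effect inflates the
    variance to reflect cluster sampling (an assumed value, not from the source).
    """
    p = positives / sample_size
    se = math.sqrt(design_effect * p * (1 - p) / sample_size)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


def required_sample_size(expected_prevalence, margin, design_effect=1.0, z=1.96):
    """Smallest sample size whose interval half-width is at most `margin`."""
    n = design_effect * z ** 2 * expected_prevalence * (1 - expected_prevalence) / margin ** 2
    return math.ceil(n)


# Hypothetical numbers for illustration only.
estimate, lower, upper = prevalence_interval(positives=50, sample_size=500)
print(f"prevalence {estimate:.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
print(required_sample_size(expected_prevalence=0.5, margin=0.05))
print(required_sample_size(expected_prevalence=0.5, margin=0.05, design_effect=2.0))
```

In this sketch, doubling the design effect roughly doubles the sample needed for the same margin, which is one way to see why clustered surveys sized for a national indicator can be imprecise for a sub-national one.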

After data has been gathered, whether or not it is sufficient or accurate, it is typically used for estimation before the results are turned into a map or model of higher resolution. A common problem is that surveys do not capture all areas; they cover only a sample, which is then used to estimate values for an entire region. One statistical estimation method used for this is random forest (RF) estimation. Others include Bayesian statistical methods, Gaussian models, integrated nested Laplace approximations (INLA), and Markov chain Monte Carlo algorithms. In the context of malaria, these methods are used to estimate where malaria prevalence is concentrated, population density, population age-structure maps, and covariates and their respective rankings, among other things.

After creating maps and models, the next step in this human development process is to use the new information to strategize and design interventions. A specific example of how models are used appears in the intervention program introduced in the Global Technical Strategy (GTS) by the Global Malaria Programme. In “Spatio-temporal analysis of malaria vector density from baseline through intervention in a high transmission setting”, Alegana et al. explain how interventions can be designed to maximize their impact on malaria vectors, specifically by assessing the effect of indoor residual spraying on malaria vector densities in an area where the use of long-lasting insecticidal nets (LLINs) was high. This paper is significant because understanding how to maximize the impact of interventions will save resources, time, and money. Interventions, such as government mass campaigns and government-initiated indoor residual spraying (IRS) campaigns, require a reliable scientific evidence base, usually gathered through surveys and data science analyses. For example, the study was able to establish seasonal peaks in malaria vector densities, and these data provide guidance for targeting IRS before peak transmission.

In this particular case study, all households enrolled in the malaria study in 2011 were provided with LLINs, and in neighboring communities the proportion of households with at least one LLIN increased over the next few years. Unfortunately, this measure alone did not significantly decrease mosquito density (it had an effect only when combined with IRS), but it does support Amartya Sen’s idea that governments and the authoritative institutions of a region should provide aid to those who need it. In other words, intervention and aid need to be provided so that populations in those communities can have access to freedom, starting with the basic freedom to survive. The dimension of human development addressed here is cross-national healthcare, specifically with respect to malaria vector species. The sustainable development goals in focus in this article are to ensure healthy lives and promote well-being for all ages, and to strengthen the means of implementation and revitalize the global partnership for sustainable development.

In this study, vector control is the main focus, and it is central to the Global Technical Strategy (GTS) of the Global Malaria Programme. The methods include gathering data from a multi-country analysis based on nationally-representative household surveys from DHS and MIS in sub-Saharan Africa. The heterogeneous distribution of mosquitoes was studied through a longitudinal study in Nagongera, Uganda. Linear regression and the Bayesian information criterion (BIC) were used to check correlations between covariates that affect mosquito counts. These results were then combined with spatio-temporal effects to predict continuous maps of vector densities. Thus, the human development process that the authors investigate is spatial epidemiology, with the scientific question: which combinations of malaria interventions produce the maximal impact on malaria vector densities, and when should they be implemented? The study concluded that IRS with bendiocarb complemented high LLIN coverage, but that it must be accompanied by continuous monitoring of insecticide resistance.

Intervention programs are a major reason that the prevalence of malaria infection has decreased in sub-Saharan Africa. Sub-Saharan Africa has the most efficient malaria vector species, and these species prefer indoor biting. The Global Technical Strategy (GTS) focuses on vector control as its primary strategy for eliminating the disease.

Geospatial Data Science Methods

Bayesian hierarchical generalized mixed model with spatio-temporal effects used to predict continuous maps of vector densities [3]

This method begins with surveys in areas of high, stable malaria transmission intensity. The study focused on Nagongera, a sub-county in eastern Uganda. The area is distinctive because it lies at an altitude of 1,095 meters above sea level, is dominated by subsistence farming, has two rainy seasons averaging 1,000-1,500 mm of rainfall annually, and has an average temperature of 23 degrees Celsius. These characteristics contribute to two distinct malaria transmission peaks. The first peak falls in the longer wet season, from March to June, and the second falls in the second wet season, from November to December.

Entomological surveillance was conducted in Nagongera at the household level. First, a sampling frame of all households was established, and 100 of these households were randomly selected to be part of the study. In each selected household, mosquitoes were collected once a month using miniature CDC light traps, and the number of female mosquitoes was recorded.
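The household selection step can be sketched as a simple random sample without replacement. This is only an illustration under stated assumptions: the frame size, the identifier format, and the seed are hypothetical, and the study may have used a different sampling procedure.

```python
import random


def select_households(sampling_frame, n_selected=100, seed=None):
    """Draw a simple random sample of households without replacement."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, n_selected)


# Hypothetical frame of 2,000 household identifiers.
frame = [f"HH-{i:04d}" for i in range(1, 2001)]
selected = select_households(frame, n_selected=100, seed=42)
print(len(selected), selected[:3])
```

Fixing the seed makes the draw reproducible, which helps when the same households must be revisited every month.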

Along with the entomological survey data, climatic and environmental covariates were collected and ranked. Covariates covering climate (rainfall and temperature), ecology (enhanced vegetation index, EVI), topography (elevation), a proxy measure of urbanicity (night-time lights), and household density were measured and recorded. A linear regression model in R was then used to check for correlation between covariates that could result in multicollinearity. Finally, the Bayesian information criterion (BIC) was calculated for each covariate, and covariates were selected on that basis. After covariates were regressed against mosquito counts, the model with the lowest BIC was used for further spatio-temporal analysis of mosquito vector density. This model was applied in the IRS program (part of the national campaign that started in 2006 in south-western Uganda) and expanded to a further 10 districts.
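As a rough illustration of BIC-based covariate ranking, the sketch below regresses a response on one covariate at a time and orders the covariates by BIC. It assumes ordinary least squares with a Gaussian likelihood, which is a simplification for count data, and uses BIC = n ln(RSS/n) + k ln(n) with k = 2. The function names and the data are hypothetical and do not reproduce the study's R analysis.

```python
import math


def simple_ols_bic(x, y):
    """BIC of the least-squares fit y = a + b*x under a Gaussian likelihood.

    Uses BIC = n*ln(RSS/n) + k*ln(n) with k = 2 (intercept and slope),
    dropping constants shared by all models fitted to the same data.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return n * math.log(rss / n) + 2 * math.log(n)


def rank_covariates_by_bic(covariates, response):
    """Return (name, BIC) pairs sorted from lowest (preferred) to highest BIC."""
    scores = {name: simple_ols_bic(values, response) for name, values in covariates.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical monthly values; not data from the study.
mosquito_counts = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
covariates = {
    "rainfall": [1, 2, 3, 4, 5, 6],
    "night_time_lights": [3, 1, 4, 1, 5, 9],
}
ranking = rank_covariates_by_bic(covariates, mosquito_counts)
print(ranking[0][0])
```

Because every single-covariate model here has the same number of parameters, the ranking reduces to comparing residual sums of squares; the penalty term matters once models with different numbers of covariates are compared.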

Afterwards, Bayesian inference was performed using integrated nested Laplace approximations (INLA), and goodness of fit was assessed to validate the models. Root mean square error (RMSE) was also used.

Data and spatially matched covariates used in Bayesian hierarchical spatio-temporal models to map population age-structure densities for health development applications [6]

A 2015 study in Nigeria used Bayesian hierarchical spatio-temporal models to map population age-structure densities for malaria interventions. Data on populations of children under 5 were obtained from three nationally-representative household surveys: the 2008 DHS, the 2010 MIS, and the 2010 LSMS-ISA panel. In these surveys, clusters were formed, and a random sample of households was then selected within each cluster. The proportion of children under 5 was then calculated across all households in a cluster.
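One possible reading of the cluster-level calculation is sketched below: children under 5 and household members are summed within each cluster, and the proportion is their ratio. The record layout and values are hypothetical, and the study may have weighted or aggregated households differently.

```python
from collections import defaultdict


def cluster_under5_proportions(households):
    """Map each cluster id to (children under 5) / (all household members).

    households: iterable of (cluster_id, n_under5, n_members) tuples.
    """
    under5 = defaultdict(int)
    members = defaultdict(int)
    for cluster_id, n_under5, n_members in households:
        under5[cluster_id] += n_under5
        members[cluster_id] += n_members
    # Skip clusters with no recorded members to avoid dividing by zero.
    return {c: under5[c] / members[c] for c in members if members[c] > 0}


# Hypothetical survey records.
records = [
    ("cluster_A", 1, 5),
    ("cluster_A", 2, 5),
    ("cluster_B", 0, 4),
    ("cluster_B", 1, 6),
]
proportions = cluster_under5_proportions(records)
print(proportions)
```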

After estimating the proportion of children under 5 for each cluster, covariates were selected that were found to be highly correlated with the proportion of the population under 5 years. These include land use, urbanization, vegetation indices, climatic conditions, socioeconomic factors, and distance to roads. Covariates were selected via a non-spatial generalized linear regression model (glm): the glm with the lowest BIC was chosen after covariates were regressed against the proportion of under-5s, and its covariates were then used in the Bayesian approach. This method, implemented through SPDE with INLA, produced continuous maps of the estimated proportion of the population under 5 at a high resolution of 1x1 km grid squares.

Model validation was also carried out to ensure that the maps are adequate. The deviance information criterion (DIC) was calculated, along with mean absolute error (MAE) and RMSE, to assess the bias and accuracy of the models. In the end, the application of these models focused on two intervention methods: distribution of ITNs for malaria prevention, and coverage of basic vaccination for childhood diseases.
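The error metrics named above can be computed as in the following minimal sketch, which compares held-out observations with model predictions. The mean error is included as a simple bias measure, the values are hypothetical, and DIC is not covered because it depends on the fitted Bayesian model.

```python
import math


def validation_metrics(observed, predicted):
    """Return mean error (bias), MAE and RMSE for paired values."""
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    mean_error = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {"mean_error": mean_error, "mae": mae, "rmse": rmse}


# Hypothetical held-out proportions of under-5s and model predictions.
observed = [0.20, 0.15, 0.18, 0.22]
predicted = [0.22, 0.14, 0.18, 0.19]
metrics = validation_metrics(observed, predicted)
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, so a gap between the two suggests that a few locations are predicted much worse than the rest.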

Discussion

This literature review discusses gathering data, turning the data into maps, and then applying those maps to real-life strategies that improve health as a dimension of human development, in the context of malaria in Nigeria. The research reviewed points to some of the better methods to use during the human development process: nationally-representative household surveys have been found to be more useful than past censuses, and estimates based on covariate correlations have been shown to provide accurate predictions of densities for mapping.

Although these methods have been identified as stronger than the alternatives, they still have limitations. For example, the sample sizes of nationally-representative surveys can cause variability and inaccuracy in estimates of malaria prevalence, as seen in the Kenya MIS and Rwanda DHS [2]. For the second geospatial data science method discussed in this literature review, which uses covariate relationships to predict age-structure densities, the relationships with environmental factors are not well understood and so cannot yet be generalized to different malaria settings. This is a significant issue often faced in data science and human development because of high heterogeneity between regions.

References

  1. Strano, E., Viana, M.P., Sorichetta, A. et al. Mapping road network communities for guiding disease surveillance and control strategies. Sci Rep 8, 4744 (2018). https://doi.org/10.1038/s41598-018-22969-4
  2. Alegana, V.A., Wright, J., Bosco, C. et al. Malaria prevalence metrics in low- and middle-income countries: an assessment of precision in nationally-representative surveys. Malar J 16, 475 (2017). https://doi.org/10.1186/s12936-017-2127-y
  3. Alegana, V.A., Kigozi, S.P., Nankabirwa, J. et al. Spatio-temporal analysis of malaria vector density from baseline through intervention in a high transmission setting. Parasites Vectors 9, 637 (2016). https://doi.org/10.1186/s13071-016-1917-3
  4. A. Wesolowski, et al. (October 12, 2012). Quantifying the Impact of Human Mobility on Malaria. Retrieved https://science.sciencemag.org/content/338/6104/267
  5. Reiner Jr, R. C., Le Menach, A., Kunene, S., Ntshalintshali, N., Hsiang, M. S., Perkins, T. A., … Greenhouse, B. (2015, December 29). Mapping residual transmission for malaria elimination. Retrieved February 23, 2020, from https://elifesciences.org/articles/09520
  6. V. A. Alegana, P. M. Atkinson, C. Pezzulo, A. Sorichetta, et al. (2015) Fine resolution mapping of population age-structures for health and development applications. Retrieved from https://royalsocietypublishing.org/doi/pdf/10.1098/rsif.2015.0073
  7. L. S. Tusting, C. Bottomley, H. Gibson, I. Kleinschmidt, A. J. Tatem, S. W. Lindsay, and P. W. Gething. (February 21, 2017). Housing Improvements and Malaria Risk in Sub-Saharan Africa: A Multi-Country Analysis of Survey Data. Retrieved from https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1002234
  8. A. J. Tatem, P. Jia, D. Ordanovich, M. Falkner, Z. Huang, and R. Howes. (October 21, 2016). The geography of imported malaria to non-endemic countries: a meta-analysis of nationally reported statistics. Retrieved from https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(16)30326-7/fulltext
  9. Lai, S., Wardrop, N., Huang, Z. et al. Plasmodium falciparum malaria importation from Africa to China and its mortality: an analysis of driving factors. Sci Rep 6, 39524 (2016). https://doi.org/10.1038/srep39524
  10. Tatem, A.J., Qiu, Y., Smith, D.L. et al. The use of mobile phone data for the estimation of the travel patterns and imported Plasmodium falciparum rates among Zanzibar residents. Malar J 8, 287 (2009). https://doi.org/10.1186/1475-2875-8-287