Thiruchelvam, Dass, Zaki, Yahya, and Asirvadam: Correlation analysis of air pollutant index levels and dengue cases across five different zones in Selangor, Malaysia

Correlation analysis of air pollutant index levels and dengue cases across five different zones in Selangor, Malaysia


This study investigated the potential relationship between dengue cases and air quality – as measured by the Air Pollution Index (API) for five zones in the state of Selangor, Malaysia. Dengue case patterns can be learned using prediction models based on feedback (lagged terms). However, the question whether air quality affects dengue cases is still not thoroughly investigated based on such feedback models. This work developed dengue prediction models using the autoregressive integrated moving average (ARIMA) and ARIMA with an exogeneous variable (ARIMAX) time series methodologies with API as the exogeneous variable. The Box Jenkins approach based on maximum likelihood was used for analysis as it gives effective model estimates and prediction. Three stages of model comparison were carried out for each zone: first with ARIMA models without API, then ARIMAX models with API data from the API station for that zone and finally, ARIMAX models with API data from the zone and spatially neighbouring zones. Bayesian Information Criterion (BIC) gives goodness-of-fit versus parsimony comparisons between all elicited models. Our study found that ARIMA models, with the lowest BIC value, outperformed the rest in all five zones. The BIC values for the zone of Kuala Selangor were –800.66, – 796.22, and –790.5229, respectively, for ARIMA only, ARIMAX with single API component and ARIMAX with API components from its zone and spatially neighbouring zones. Therefore, we concluded that API levels, either temporally for each zone or spatio- temporally based on neighbouring zones, do not have a significant effect on dengue cases.


Dengue fever, which is transmitted by Aedes mosquitoes, have been increasing in Malaysia lately, specifically in its most urbanized state, Selangor (Hassan et al., 2012). Uncontrolled population growth, climate change and unplanned urbanization are some of the main factors linked to an increase in dengue cases (Hassan et al., 2012). Control against the vector population is still the best way to curb this disease as vaccination is still under study with mixed outcomes. A dengue vaccine named Dengvaxia has been developed by Sanofi Pasteur Ltd (Sanofi Pasteur Press Release, 2016), but this vaccine has caused many young children who underwent vaccination to become more sick after a three-year time period (Warner, 2016). As a result of this, dengue prediction models are still important and complementary tools for controlling the vector population at present.

Most of the dengue prediction models use meteorological parameters such as mean temperature, relative humidity and rainfall as predictors (Brunkard et al., 2008; Olson et al., 2009; Cheong et al., 2012; Cheong et al., 2013; Dom et al., 2013; Naish et al., 2014). Inclusion of air quality as a potential predictor is still a topic that is not well researched in the literature; there are only a few reported studies, especially from Singapore, which have addressed this issue (Smith et al., 2013). The name of measurement that refers to air quality also varies among countries. For example, it is known as the Air Pollution Index (API) in Malaysia, as Air Quality Index (AQI) in Thailand and as Pollutant Standard Index (PSI) in Singapore (ASEAN Specialised Meteorological Centre, 2017). A previous study (Massad et al., 2010) based in Singapore hypothesized that a severe haze condition (with high PSI levels) causes a drop in dengue since severe haze will produce smoke that would kill the adult mosquitoes. The authors’ discussion was based on observations of drops in dengue cases during the Southeast Asia extreme haze event in 2006. On the other hand, Smith et al. (2010) carried out a haze-dengue analysis, also in Singapore, and concluded oppositely: High PSI value did not necessarily reduce dengue cases. They observed that during the bad haze conditions in years of 1997 and 1998, dengue cases was also at its highest levels. Therefore, they concluded that severe haze conditions did not necessarily bring about a drop in dengue cases. These two opposite findings can be further explained by another study by Oliver et al. (2014) which have discussed that only when the API level is enough high, its negative effect on the mosquito population (these vectors could be destroyed) can be felt and dengue occurences could be reduced (Oliver et al., 2014). Therefore, the relationship between air quality levels and dengue cases remains uncertain.

In view of this, our study aimed to explore API as a potential predictor of dengue cases for zones involved in our study in Malaysia. A challenge in this study is that previous dengue cases provided a positive feedback for current dengue cases due to the nature of transmission of this disease. Thus, the effect of API had to be assessed over and above this feedback effect. To address whether API levels correlate with dengue cases, a number of autoregressive integrated moving average (ARIMA) and autoregressive integrated moving average with exogenous variable (ARIMAX) models were built using the Box-Jenkins (BJ) variable transformation approach. Subsequently, the ARIMA and ARIMAX models were optimized with respect to the extent of their feedback. The optimized ARIMA and ARIMAX models were then compared using the Bayesian Information Criteria (BIC) as the criteria for model selection. Our approach of deciding the presence or absence of correlation of API with dengue cases was based on three stages of model-based analysis: model identification, validation and selection. Our models were able to find long-term correlations between API levels and dengue cases, if any, after accounting for other explanatory attributes proxied through lagged dengue cases.

Materials and Methods

Study area

This study involved the following five zones in Selangor: i) Petaling Jaya; ii) Pelabuhan Klang; iii) Shah Alam; iv) Kuala Selangor; v) Banting. The state of Selangor was selected as it recorded the highest number of dengue cases in Malaysia (Hassan et al., 2012). Selangor is also well known for facing deterioration of air quality every year, where it recorded a number of 59, 48, and 37 unhealthy days (API levels were in the range of 101-200) during the years of 2009-2012. This poor air quality is believed to occur due to a number of factors such as commercial and industrial developments, emissions from motor vehicles and the trans boundary haze (Rahmah et al., 2015).

Data on weekly dengue cases were obtained from the Ministry of Health Malaysia, and it represents the total number of dengue cases that occurred in the respective zones. On the other hand, data on API measurements were obtained from the Department of Environment, Malaysia. The API measurements for Malaysia comprise measurements of five main air pollutants, i.e. carbon monoxide (CO), ozone (O3), nitrogen dioxide (NO2), sulphur dioxide (SO2) and particulate matter with a diameter of less than 10 micron (PM10) (Department of Environment Malaysia, 2017). All dengue cases and API data for the five zones of study were reported on a weekly basis, starting from the year 2009 till 2015, except for Banting where the dengue cases and API data were obtained from year 2010 till 2015. The surrounding environment for air quality monitoring stations varied: the stations in Petaling Jaya, Shah Alam and Pelabuhan Klang are located in the middle with main roads which connect industrial and high density housing areas. The Kuala Selangor station is located in a comparatively less populated area, whereas the Banting station is near the roadside of a highway, thus located in a area that is less populated and having trees and oil palm plantations (Rahmah et al., 2015). These different environments contribute to the variation in levels of deterioration of air quality measurements (Azmi et al., 2010). Figure 1 shows the spatial locations of the API monitoring stations as well as the boundaries of the zones considered. The areas corresponding to these zones are as follows: Petaling Jaya (PJ) = 97.2 km2, Pelabuhan Klang (PK) = 261 km2, Shah Alam (SA) = 293 km2, Kuala Selangor = 217 km2 and Banting = 261 km2. Four of the areas are approximately similar but PJ is half the size of the rest.

The locations of API stations for the 5 zones, Petaling Jaya, Pelabuhan Kelang, Shah Alam, Kuala Selangor and Banting, are marked with black dots and their zonal boundaries indicated by the shaded grey patches.

Presence of feedback

It is proposed that dengue cases can be explained through various time-lagged models with lags of up to ten weeks. The extent of feedback can be understood based on the study of the lifecycle of the A. aegypti mosquito. The mosquito breeding period from mosquito eggs to adulthood (Life cycle of Aedes Agypti, 2017) lasts beween 1.5 to 3 weeks, which is followed by a biting period where the adult mosquitoes, if successful in biting an infected human, will acquire the dengue virus. The infected mosquito then undergoes an extrinsic incubation period (EIP) which takes from 8 to 12 days before becoming infectious (Chen et al., 2012). This is followed by a mosquito transmission period where the infected mosquitoes will bite susceptible (uninfected) humans (Dengue Transmission, 2014). The next phase is the human tramsmission phase where the dengue virus is transferred from an infected person to an uninfected person (Cheong et al. 2013). The virus will undergo an Intrinsic Incubation Period (IIP) inside the newly infected human which lasts for 4-10 days after which the infected person will exhibit symptoms of dengue (World Health Organization, 2009). The number of infected humans at previous time points, therefore, serves as a determinant of current levels of dengue cases based on the lifecycle of Aedes mosquitoes as described above. As the number of infected humans at any time point is difficult to determine, a reliable substitute is the actual number of dengue cases observed at these previous time points. Variability in the biting period, the mosquito transmission period, human transmission period, the EIP and the IIP cause us to consider all previous dengue cases up to lag-10 weeks as potential predictors for the current number of dengue cases, which is the maximum extent of feedback that we consider in our ARIMA and ARIMAX models. Figure 2 shows the timeframe of the dengue infection process. For dengue cases, ARIMA models are considered as models that can incorporate feedback information via lagged terms. In this study, API is the exogenous variable considered in our ARIMAX models. Both models are used to represent stationary time series data (Alain et al., 2007; Cressie and Wikle, 2011). However, when a time series is non-stationary, as is the case for the dengue case trajectories, ARIMA and ARIMAX models cannot be applied directly. The standard practice is to have the original time series differenced up to the first or second order to yield stationarity after which the models can be fit to the observed data (Hyndman and Athanasopoulos, 2013).


In our case, first order differencing was found to be sufficient to yield stationarity; higher order differencing was found to be unnecessary. The general mathematical equation of an ARIMA model for a differenced time series Y* is:

Eq. 1
where ϕ = (ϕ1, ϕ2,…. ϕp) and θ = (θ1, θ2,…, θq) are unknown coefficients with each ϕi ∈ ℜ and θj ∈ ℜ, and ε’s random errors independent and identically distributed with zero mean and constant but unknown variance, σ2. In Eq. 1, q is the extent of lag (i.e. the extent of dengue case feedback) and the extent of lag of the errors (i.e. the extent of correlation in Yt*). The ARIMA model in Eq. 1 with lag orders p and q is denoted ARIMA (p, d, q) where d is the order of differencing to yield stationarity. The values of p and q were obtained based on inspecting the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of Yt*. The method of choosing p and q based on inspection is a wellknown technique in the literature (see Nau, 2014) and hence, we omit further details here. To avoid complicated models, the ϕ - and θ - coefficients in Eq. 1 corresponding to insignificant lags in the ACF and PACF plots were set to zero. The final model was denoted by ARIMA(p, d, q)* where * indicates the reduced model with omission of insignificant coefficients.

The remaining significant but unknown coefficients in the ARIMA(p, d, q)* model were estimated using the maximum likelihood (ML) approach (Wang and Jain, 2003; Levine and Justice, 2013). The ML approach was used as it reports good model fit and prediction accuracy. Subsequently, residual diagnostics based on the fitted model was performed. The fitted model is deemed appropriate if the ACF and PACF plots of the residuals do not show any significant autocorrelations (i.e. significant peaks at lagged time points). If not, the above process mus be repeated for different choices of p and q until desired results are obtained. We provide an illustration of the techniques for one of the zones under study in the Results section. The details on the same methodology are omitted in the analysis corresponding to the other four zones.

Exogenous variables can be incorporated into the ARIMA(p, d, q)* model forming an ARIMAX(p, d, q, m)* model with m lags. The corresponding mathematical equation of an ARIMAX(p, d, q, m)* model is:

Eq. 2
where gh-13-1-613-e000.jpg is the linear combination of lagged API observations up to order m. The ϕ -and θ - coeffiients which were insignificant in the original ARIMA(p, d, q)* model were set to zero in the ARIMAX (p, d, q, m)* model in Eq. 2 as well. The remaining significant ϕ - and θ -coefficients as well as the m+1 β - coefficients in the ARIMAX(p, d, q, m)* model in Eq. 2 were re-estimated using ML. We noted that the API time series also needs to be stationary to avoid spurious regression (Hyndman and Athanasopoulos, 2013) which was achieved by performing first order differencing as mentioned earlier.

We selected an upper bound for m, namely m0, and fit m0 ARIMAX (p, d, q, m)* models for m=0,1, 2,…,m0. Among the ARIMA(p, d, q)* and the m0 ARIMAX(p, d, q, m)* models, the best one was selected using (BIC):

Eq. 3
where L is the model likehood, k is the number of model parameters and n the number of observations in the time series data. Model selection based on BIC accounts for both model fit and model parsimony (Bhatnagar et al., 2012; Sitepu et al., 2013; Sriyasatien et al., 2016). In Eq. 3, k = k0 + 1 where k0 is the number of significant ϕ - and θ - coefficients in the ARIMA(p, d, q)* model; the extra +1 corresponds to the variance parameter σ2. For the ARIMAX(p, d, q, m)* model, the value of k = k0 + m + 2 since there are m + 1 β- coefficients and one variance parameter σ2. The best model is the one with the lowest BIC value in Eq. 3. We also considered spatiotemporal analysis where temporal API measurements from spatially neighbouring zones were incorporated into Eq. 2 in addition to the API measurements from that zone. In this case, the value of gh-13-1-613-e00a.jpg in Eq. 3 where M is the total number of zones considered with mj denoting the lag order for zone j = 1, 2,…M.


Description of data

A total of 354 weekly data were used (covering the period between January 2009 up to middle of October 2015) involving observations of dengue cases and API levels; for Banting, the number of weeks covered was 290 from March 2010 to October 2015. Table 1 gives the range and average values of the data for each zone. The average API readings ranged between 51 and 100 which is classified as moderate, that is, moderate pollution that does not pose any bad effect on health (Department of Environment Malaysia, 2017). This moderate API range should be acceptable for most of the public except for a small group of people who are sensitive to ozone or particulate pollution (Department of Environment Malaysia, 2017).

Prediction models

Prior to model building and selection, we normalized both the response and explanatory variables (i.e. dengue cases and API levels) into the same range (0-1): Letting H represent a generic notation for a variable, normalization entails

Eq. 4
where HN is the normalized variable, and Hmin and Hmax, the minimum and maximum values of H, respectively, in the data set.

Three models were compared in this study, i.e. the ARIMA models, ARIMAX models with single-API station and the ARIMAX with paired-API stations from spatially neighboring zones. To build the ARIMA model, the BJ approach was used: first, differencing was carried out to ensure that the data were stationary; next, the significant lag orders of p and q were selected by inspecting the ACF and PACF plots; finally, the ARIMA model was estimated to obtain the parameter values using the ML approach. An example of ARIMA model building is illustrated for the Kuala Selangor zone in Figures 3-5. Figure 3 shows the weekly dengue case trajectory for Kuala Selangor during the years 2009-2015 which is clearly seen to be non-stationary. First order differencing (d=1) was carried out and significant lag orders identified using the ACF and PACF plots. The ACF plot in Figure 4 shows a significant peak at q=2, hence, the ARIMA (0, 1, 2)* model was selected. We found that significant peaks in both the ACF and PACF plots can be eliminated by choosing exactly one of p or q appropriately, that is, by setting the other to zero, which also reduces model complexity. Figure 5 shows the ACF and PACF plots of the residuals from the fitted model. No significant peaks existed outside the 95% confidence lines indicating that the residuals are indeed white noise. Hence, the ARIMA model (0, 1, 2)* was selected as the model for dengue cases for Kuala Selangor.

To investigate if API had any effect on dengue cases, six ARIMAX (0,1,2,m) *models were formed for different lag orders m=0,1, 2,…,m0 = 5. Table 2 gives the BIC values for ARIMA and best ARIMAX models (out of the six, based on BIC) for Kuala Selangor. A similar methodology was also applied to the rest of the study zones (Table 2).

Next, a spatio-temporal analysis was carried out to find if inclusion of API parameter from spatially neighbouring stations (the closest one) could enhance the model fit. Table 3 shows the results obtained for the ARIMAX model for Kuala Selongor when API data from its closest neighbour, Shah Alam, was additionally incorporated into the ARIMAX model (Figure 1). Results from Table 3 show that best ARIMAX model is from the pairing of API parameters at lag orders of zero and zero. A comparison of the BIC values in Tables 3 and 4 shows that the it reaches its lowest value for the ARIMA model without any API component. We concluded that the dengue trajectory for Kuala Selangor was very localized and intrinsic to that zone only. Once the localized, best model based on feedback had been identified, it was realized that API observations for that zone as well as API data from spatially neigbouring zones do not contribute to better models. Indeed, similar results were found for all other zones: Comparing the best ARIMA and ARIMAX models from Table 2 with the best ARIMAX models from Table 4 (with API data from spatially neighbouring zones), we found that the ARIMA models had best BIC values in all the five zones of study.


In Malaysia, poor air quality mostly occurrs during the haze episodes, known as the trans-boundary haze (see BBC News Online, 2017) due to smoke from open burning and wildfires in Indonesia. Haze and other local particulate matters were believed to kill the mosquito vectors, causing lower dengue transmission and decreased number of dengue incidences as a whole (Massad et al., 2010). Indeed, when their modelled prediction of a dengue outbreak did not materialize, Massad et al. (2010) suggested that haze could have prevented it. This suggestion was based on Singapore data on dengue and the pollutant standards index (PSI) for the year of 2006 and 2007. Smoke is also known to act as an insect repellent and therefore used as such (Tabuti, 2007).

Studies which investigate the relationship between haze (or air quality index) and the number of dengue cases in Malaysia are at best scarce. Rafdzah et al. (2015) conducted an extensive review of existing literature for such studies and came up with a list that does not contain any studies from Malaysia. This motivated us to conduct this study in the state of Selangor in Malaysia. Based on our analysis, we found no relationship between API level and dengue cases in all five study zones in Selangor. Prediction models were better elicited without API as a predictor. Based on data from the region of Kuala Selangor, we found that the optimum ARIMA model was better than the optimal ARIMAX model based on BIC values. The reason was that ARIMAX(p, d, q, m)* models are more complex compared to ARIMA(p, d, q)* and will be penalized by the BIC criteria (i.e. result in higher BIC values) if the inclusion of API as the exogeneous variable is redundant, that is, if API has no effect on dengue cases. A spatio-temporal analysis also yielded similar conclusions. Incorporation of API measurements from Shah Alam, the station closest to Kuala Selangor, in addition to API measurements from Kuala Selangor station did not lower the BIC values, which means that their incorporation into the model was redundant. Similar results were observed for the four other study areas.

Dengue case observations were similar in the closely situated neighbouring zones Petaling Jaya and Shah Alam as can be seen in Table 1. The average and range of dengue cases were 267.92 [0, 1585.00] for Petaling Jaya and 213.27 [0, 1782.00] for Shah Alam, respectively. These similar average and range values differed significantly from the other zones in Table 1. In terms of API levels, there were similarities too, i.e. in the average and range of API levels. As seen in Table 1, the average and range of API levels were 58.61 [34.46, 201.29] for Petaling Jaya and 61.53 [34.33, 248.29] for Shah Alam, respectively. These average and range values again differed significantly compared to the other zones. Due to this similarity between Petaling Jaya and Shah Alam, the best models (which were the ARIMA models although with differing ARIMA components) had BIC values closer to each other compared to the other zones (Table 2), meaning that these models provided similar model fit and parsimonious assessment to be used for prediction purposes.

Our study findings on dengue cases and API levels are consistent with findings from a previous study (Smith et al., 2010) based on 2001-2008 data from Singapore and air quality measured according to the PSI. They concluded that there was no significant association between PSI and dengue. However, their data were limited to a small range of dengue cases (minimum of 45 and maximum of 265) and PSI (minimum of 34.2 and maximum of 41.9) (Rafdzah et al., 2015). We considered a higher range of dengue cases as well as API values in this paper and still did not find any correlation among them.

Interestingly, some positive impact of haze on dengue has been suggested (Ooi et al., 2010; Smith et al., 2013) based on the 1997-1998 dengue outbreak in Singapore, which faced severe haze at that time. This impact was explained in Smith et al. (2013) where they suggested the high haze readings have caused both human and mosquitoes to stay indoors bringing higher possibilities for mosquito bites and dengue transmission. However, Massad et al. (2010) suggested that the observed outbreak during this period could be explained by the high annual average temperature which occurred together with the haze conditions. Annual average temperatures were at the highest during the latest 15 years at the time, and this could have caused an increase in the number of infectious Aedes mosquitoes since the EIP period would have been shortened. This, in turn, would have caused more possibilities for infectious bites and dengue transmission. Thus, in our model, it was imperative to include such explanations in the model development so that the effect of API levels could be assessed after such inclusions; note that our model used time-lagged dengue cases as a proxy for high temperature. Thus, residual correlation of API levels and dengue cases, if any, would only be reflected in our models if truly present and not due to other confounding factors such as high temperature. Indeed, we found that after the inclusion of time-lagged dengue cases, API levels did not provide any additional explanation to the increase or decrease in the number of dengue cases.


This is one of the first studies to investigate the relationship between API levels and the number of dengue cases in Malaysia using feedback modelling techniques such as ARIMA and ARIMAX. Our study found that API level did not correlate with dengue cases in all five study zones in the state of Selangor, Malaysia, which were selected due to their high levels of dengue cases. The inclusion of API for all study zones did not enhance our prediction models as assessed by the BIC values. The BIC values did not improve, either by adding API data from spatially neighbouring zones. We conclude that the dengue cases trajectory for each zone was very localized and intrinsic to that zone only. Once the localized best model was identified; other variables (the API observations for the focal zone as well as API data from spatially neigbouring zones) did not contribute to yield better prediction models. The finding that the ARIMA models with the lowest BIC values outperformed the rest of the models in all five zones should be of future interest.


Authors would like to thank the Ministry of Health and Department of Environment, Malaysia for providing the necessary data to conduct this work.



FZ Alain, NL Elena, MS Graham, 2007. Time series analysis. In: AF Zurr, EN Ieno, GM Smith, eds. Analyzing ecological data. New York, NY: Springer; pp 264-288.


S Anno, K Imaoka, T Tadono, T Igarashi, S Sivaganesh, S Kannathasan, V Kumaran, SN Surendran, 2015. Space-time clustering characteristics of dengue based on ecological, socioeconomic and demographic factors in northern Sri Lanka. Geospat Health 10:215-8.


SZ Azmi, MT Latif, AS Ismail, L Juneng, AA Jemain, 2010. Trend and status of air quality at three different monitoring stations in the Klang Valley, Malaysia. Air Qual Atmosphere Health 3:53-12.


ASEAN Specialised Meteorological Centre (ASMC). Air Qual 2017; Available from: Accessed: October 2017.


BBC News Online. What causes South East Asia’s haze? Available from: Accessed: May 2017.


S Bhatnagar, V Lai, S Gupta, 2012. Forecasting incidence of dengue in Rajasthan, using time series analyses. Indian J Pub Health 56:281-5.


JM Brunkard, E Cifuentes, SJ Rothenberg, 2008. Assessing the roles of temperature, precipitation, and ENSO in dengue reemergence on the Texas-Mexico border region. Salud Publica Mexico 50:227-9.


MJ Chen, CY Lin, YT Wu, 2012. Effects of extreme precipitation to the distribution of infectious diseases in Taiwan. PLoS One 7:1994-16.


YL Cheong, L Burkart, PJ Leitao, T Lakes, 2013. Assessing weather effects on dengue disease in Malaysia. Int J Environ Res Pub Health 10:6319-6.


N Cressie, CK Wikle, 2011. Statistics for spatio-temporal data. New Jersey, USA: Wiley Series in Probability and Statistics.


Department of Environment Malaysia. Air Pollution Index of Malaysia. 2017; Available from: Accessed: January 2017.


Dengue Transmission, 2014. Available from: Accessed: January 2017.


NC Dom, AA Hassan, ZA Latif, R Ismail, 2013. Generating temporal model using climate variables for the prediction of dengue cases in Subang Jaya, Malaysia. Asian Pacific J Trop Dis 3:352-61.


Warner Evelyn, 2016. Too good to be true: Sanofi Pasteur’s dengue vaccine raises serious concerns. Available from: Accessed: May 2017.


H Hassan, S Shohaimi, NR Hashim, 2012. Risk mapping of dengue in Selangor and Kuala Lumpur, Malaysia. Geospat Health 7:21-5.


RJ Hyndman, G Athanasopoulos, 2013. Forecasting: principles and practice. OTexts: Melbourne, Australia. Available from: Accessed: March 2017.


H Levine, JB Justice, 2013. Handbook of local government fiscal health. Burlington, MA, USA: Jones & Bartlett Learning.


Life cycle of Aedes Agypti. Available from: Accessed: January 2017.


E Massad, F Coutinho, S Ma, M Burattini, 2010. A hypothesis for the 2007 dengue outbreak in Singapore. Epidemiol Infect 138:951-7.


S Naish, P Dale, JS Mackenzie, J McBride, K Mengersen, S Tong, 2014. Climate change and dengue: a critical and systematic review of quantitative modelling approaches. BMC Infect Dis 14:167.


R Nau, 2014. Notes on nonseasonal ARIMA models. Available from:—Robert_Nau.pdf. Accessed: January 2017.


EE Ooi, A Wilder-Smith, LC Ng, DJ Gubler, 2010. The 2007 dengue outbreak in Singapore. Epidemiol Infect 138:958-61.


L Oliver, MN Burattini, FA Coutinho, GE Coelho, CJ Struchiner, E Massad, 2014. A negative correlation between dengue and bushfires in Brazil. J Environ Health 76:66-7.


J Olson, K Berggren, M Olofsson, M Viklander, 2009. Applying climate model precipitation scenarios for urban hydrological assessment: A case study in Kalmar City, Sweden. Atmosph Res 92:364-75.


Z Rafdzah, A Nasrin, R Kavita, Y Abqariah, 2015. The relationship between air quality and dengue cases: a systematic review. Available from: Accessed: January 2017.


S Rahmah, S Norkhadijah, MF Ramli, MT Latif, 2015. The assessment of ambient air pollution trend in Klang Valley, Malaysia. World Environment 5:1-11.


Sanofi Pasteur Press Release, 2016. Available from: Publishes-New-Analyses-Confirming-that-Sanofi-Pasteur-sVaccine-Candidate-Safely-Protects-Pre-Adolescents-to-Adults-Against-Dengue.aspx. Accessed: August 2016.


MS Sarfraz, NK Tripathi, FS Faruque, UI Bajwa, A Kitamoto, M Souris, 2014. Mapping urban and peri-urban breeding habitats of Aedes mosquitoes using a fuzzy analytical hierarchical process based on climatic and physical parameters. Geospat Health 8:685-97.


M Sitepu, J Kaewkungwal, N Luplerdlop, 2013. Temporal patterns and a disease forecasting model of dengue hemorrhagic fever in Jakarta based on 10 years of surveillance data. Southeast Asian J Trop Med Pub Health 44:206-17.


P Sriyasatien, A Phumee, P Ongruk, 2016. Analysis of significant factors for dengue fever incidence prediction. BMC Bioinformatics 17:166-9.


JRS Tabuti, 2007. Herbal medicines used in the treatment of malaria in Budiope country, Uganda. J Ethnopharmacol 116:33-42.


U.S. Environmental Protection Agency. A guide to air quality and your health, 2014. Available from: Accessed: July 2017.


GCS Wang, CL Jain. Regression analysis: modelling & forecasting. Flushing, New York: Graceway Publishing Company Inc.; 2003.


A Wilder-Smith, A Earnest, S Tan, E Ooi, D Gubler, 2010. Lack of association of dengue activity with haze. Epidemiol Infect 138:962-6.


A Wilder-Smith, A Earnest, S Ma, LC Ng, 2013. Haze and dengue: the unanswered questions. Ann Acad Med Singapore 42:687-8.


World Health Organization and Special Programme for Research and Training in Tropical Diseases Dengue: Guideline for diagnosis, treatment, prevention and control. France: World Health Organization, 2009. Available from: Accessed: January 2017.

Figure 1.

Map outlining the boundaries of the state of Selangor in West Malaysia.

Figure 2.

Life cycle of the Aedes Aegypti mosquito and dengue transmission to humans: a motivation for lagged time series models for dengue cases.

Figure 3.

Plot of dengue cases for Kuala Selangor region.

Figure 4.

Plot of autocorrelation function and partial autocorrelation function after differencing.

Figure 5.

Plot of autocorrelation function and partial autocorrelation function for residuals of the autoregressive integrated moving average (0,1,2) model.

Table 1.

Summary measures of air pollutant index (API) levels and dengue cases.

Area API levels Dengue cases
Average Range Average Range
Petaling Jaya 58.61 [34.46, 201.29] 267.92 [0, 1585.00]
Pelabuhan Klang 70.24 [37.00, 450.13] 57.48 [0, 522.00]
Shah Alam 61.53 [34.33, 248.29] 213.27 [0, 1782.00]
Kuala Selangor 55.31 [23.58, 214.75] 4.71 [0, 63.00]
Banting 63.69 [40.00, 264.54] 8.63 [0, 89.00]
Table 2.

Best model for each study zone selected using Bayesian information criterion (BIC).

Study zone Best ARIMA model BIC value Best ARIMAX model BIC value
Petaling Jaya ARIMA (7,1,9) -964.5606 ARIMAX (7,1,9,0) -958.7241
Shah Alam ARIMA (2,1,1) -1.0075e+03 ARIMAX (2,1,1,0) -1.0030e+03
Pelabuhan Klang ARIMA (3,1,1) -858.3920 ARIMAX (3,1,1,0) -853.7578
Kuala Selangor ARIMA (0,1,2) -800.6577 ARIMAX (0,1,2,0) -796.2189
Banting ARIMA (0,1,3) -519.0362 ARIMAX (0,1,3,0) -516.1652

[i] ARIMA, Autoregressive integrated moving average, ARIMAX, ARIMA with an exogeneous variable.

Table 3.

Best Bayesian information criterion values for Kuala Selangor autoregressive integrated moving average with an exogeneous variable models for dengue cases with air pollutant index measurements from this zone and its closest spatial neighbouring zone, ShahAlam.

Pairing station Shah Alam Kuala Selangor BIC values for models at
Lag 0 Lag 1 Lag 2
Lag 0 –790.5229 –786.1049 –778.9426
Lag 1 –785.9545 –780.2422 –773.0930
Lag 2 –779.3796 –773.7084 –767.9281

[i] BIC, Bayesian information criterion.

Table 4.

Best Bayesian information criterion (BIC) values for each zone with respect to their nearest spatial neighbours.

Study Zone Closest spatial neighbour(s) Best BIC value(s)
Petaling Jaya Shah Alam –952.92
Shah Alam Pelabuhan Klang, Petaling Jaya –998.25, –997.33
Pelabuhan Klang Shah Alam, Kuala Selangor –848.68, –848.01
Kuala Selangor Shah Alam –790.52
Banting Pelabuhan Klang, Petaling Jaya, Shah Alam –511.61, –511.04, –511.82
Abstract views:


Article Metrics

Metrics Loading ...

Metrics powered by PLOS ALM

Copyright (c) 2018 Loshini Thiruchelvam, Sarat C. Dass, Rafdzah Zaki, Abqariyah Yahya, Vijanth S. Asirvadam

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© PAGEPress 2008-2018     -     PAGEPress is a registered trademark property of PAGEPress srl, Italy.     -     VAT: IT02125780185     •     Privacy