The proportion of the world population that is obese is increasing (OECD, 2013; Han et al., 2016). Obesity can contribute to the development of several diseases, including cardiovascular disease, cancer, and type-2 diabetes. Low energy expenditure and high food intake lead to obesity (Kumanyika et al., 2002). As energy expenditure and food intake are potentially influenced by socioeconomic status (SES), many studies have investigated the relationships between the degree of obesity, measured in terms of body mass index (BMI), and SES-related factors such as age, gender, region of residence and education (Addo et al., 2009; Beydoun and Wang, 2009; Stommel and Schoenborn, 2010). Research has lately been expanded to investigate the relation between obesity and other factors, such as alcohol consumption (Kang et al., 2013), smoking (Pouliou and Elliott, 2010), leisure activity (Ross et al., 2007), and stress (Mak et al., 2015).
Ordinary least squares (OLS) regression analysis has been used to identify these relationships (Hojgaard et al., 2008). However, after recent studies identified that obesity has spatial dependence, spatial analyses were conducted to understand this dependence further (Drewnowski et al., 2009; Duncan et al., 2012). Such studies used the global Moran’s I statistic (Chen and Wen, 2010; Drewnowski et al., 2014), the spatial error model (SEM) (Drewnowski et al., 2009; Duncan et al., 2012), the spatial lag model (SLM) (Slack et al., 2014), and the local indicator of spatial autocorrelation (LISA) (Myers et al., 2015). In these empirical studies, spatial regression analyses took spatial dependence into account when investigating the relationship between obesity and several factors, such as SES (age, gender, and employment), educational level, food environment, and smoking. However, these studies did not focus on the relationship with variables concerning local health policy, which could also play a role.
Many governments and local councils have developed and enforced their own regional public health care plans (Bullough et al., 2015). In Korea, regional public health and medical care plans are outlined in the Regional Public Health Act (Kang and Sohn, 2016). One of these care plans is the management of public sports facilities. The term public sports facility is defined as a physical environment located in an area. These facilities are established to promote healthy lifestyles among local residents via effective and safe physical activities.
In this study, we investigated, at the local-area level, whether public sports facilities established under regional public health plans have indeed contributed to reducing obesity among adult residents. Various sociodemographic variables were used as control variables. Our spatial analysis consisted of three steps: calculating the global Moran’s I statistic; performing the LISA analysis; and comparing different types of spatial regression, namely the spatial lag model, the spatial Durbin model, the spatial error model, the spatial Durbin error model and the general spatial model. We utilized the data obtained from the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V-3) conducted in 2012.
Policies for encouragement of participation in sports have been implemented in many countries (Downward et al., 2009; Sotiriadou, 2009; Downward and Rasciute, 2010). The importance of sports infrastructures has been emphasized for the encouragement of participation in various sports (Wicker et al., 2013). To verify the effect of such infrastructures, many researchers have analyzed the relationship between sport infrastructure and sports participation by considering various variables.
Sallis et al. (2000) reviewed a broad range of literature investigating what makes people become involved in physical activities and concluded that high accessibility of facilities or programmes positively influences the physical activity of children and adolescents. However, conflicting findings have also been reported. Niclasen et al. (2012) showed that proximity to a sports facility is positively associated with a high level of vigorous physical activity, while it is negatively associated with moderate to vigorous physical activity in Greenlandic adolescents. In Dutch adolescents, Prin et al. (2010) concluded that access to sports facilities is not a sufficient condition but a precondition for promoting physical activity. Furthermore, although Holman et al. (1996) identified accessibility of a sports facility as an important factor in encouraging physical activity, Stahl et al. (2001) demonstrated that a supportive physical environment is not associated with physical activity, and Van Lenthe (2005) stated that while residents with lower income participate more in walking and cycling in general, they participate less in sporting activities.
Individual determinants such as age, income and education level as well as proximity to sports infrastructures have also been considered as affecting participation in sport (Berger et al., 2008; Downward and Rasciute, 2010; Ruseski et al., 2011). Recently, Wicker et al. (2013) analyzed the relationship between detailed characteristics of sports infrastructures as well as the individual determinants and sporting through multi-level analyses. They considered the size of sports areas, the number of swimming pools, the number of track and field arenas and the detailed characteristics of sports infrastructures. O’Reilly et al. (2015) discussed the characteristics of the infrastructure in more detail complementing the work by Wicker et al. (2013) with added variables, such as the year it was built or renovated, the types of food services available as well as the facility size and the number of pools and rinks.
In addition, many studies have aimed to identify whether participation in sports contributes to the reduction of obesity. Hojgaard et al. (2008) analyzed the relationship between individual variables, including participation in sport activities, and waist circumference. Since then, some researchers have recognized the spatial dependence in variables representing the extent of obesity and adopted spatial regression for its study. Chen and Wen (2010) confirmed the relationship between individual variables, including physical inactivity, and BMI in this way, and Slack et al. (2014) analyzed such relationships in terms of recreational, economic and health contexts. Many studies like these have been carried out to identify such relationships. In our study, we analyzed whether public sports facilities are positively associated with the reduction of obesity in the public policy context. The intrinsic influence of public sports facilities was analyzed after controlling for some individual determinants and considering the spatial dependence of regional obesity.
Materials and Methods
We used the survey data obtained from the KNHANES V-3 2012 provided by the Korea Centers for Disease Control and Prevention (Korea CDC). For the survey, the Korea CDC extracted 192 primary sampling units named enumeration districts from 3,479 administrative districts across the country. These data were collected based on the resident registration population in 2009 and a survey of apartment prices in 2008. Next, Korea CDC sampled 3,840 secondary sampling units, i.e. households, from the sampled enumeration districts and conducted a survey of all household members. Although the final sampling unit was the household, personal addresses are not available to the public. Therefore, in this study, data analysis was conducted at the enumeration district level. We used only the KNHANES V-3 data because the Korea CDC extracts enumeration districts differently every year. Among the data provided by the Korea CDC, we eliminated research subjects under the age of 19 in order to restrict our subjects of interest to adults. The final sample for this study was 5,436 adults consisting of 2,248 males (41.35%) and 3,188 females (58.65%).
All information used was self-reported by the KNHANES V-3 respondents. We used BMI, calculated as the weight in kilograms divided by the square of the height in meters, as the dependent variable. The independent variables of interest were the public sports facility variables obtained from the Korean Statistical Information Service (KOSIS). These were represented by the number of public sports facilities (as this supplies information about the absolute effect of public sports facilities) and the number of public sports facilities per 10,000 people (which provides information about the relative effect considering the population). As sociodemographic characteristic variables, we used years of education, the average monthly household income and the number of members in the household, as these factors are usually considered as explanatory variables for BMI (Wen et al., 2010). As control variables, we included i) self-rated health status variables and ii) physical activity variables. For the health status variables, we used average hours of sleep, waist circumference, drinking frequency and smoking status. These values were generated from answers to the KNHANES V-3 questions about health status. Drinking frequency and smoking status were used as categorical variables. The activity variables were based on the intensity, duration and frequency of physical activity and were represented by vigorous physical activity days, walking days, muscle-strengthening activity days and flexibility activity days.
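The two derived quantities defined above, BMI and the number of facilities per 10,000 people, are simple ratios. The following is a minimal sketch of both; the function names and the example respondent and district are hypothetical and are not taken from the KNHANES V-3 or KOSIS data.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def facilities_per_10k(n_facilities: int, population: int) -> float:
    """Number of public sports facilities per 10,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return n_facilities / population * 10_000


# Hypothetical respondent and district, for illustration only.
example_bmi = bmi(70.0, 1.75)
example_rate = facilities_per_10k(12, 240_000)
```

The count captures the absolute supply of facilities, whereas the per-10,000 rate rescales that count by the population the facilities have to serve.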
We analyzed the spatial association of public sports facilities with BMI at the enumeration district level by re-arranging data from the individual level to the enumeration district level. For each enumeration district unit, 28.5 participants were included on average. The number of survey participants at enumeration district level is displayed in Figure 1. The summary statistics of the continuous and categorical variables at the individual level (n=5,436 adults) and the enumeration district level (n=191 units) are shown in Tables 1-4.
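Re-arranging the data from the individual level to the enumeration district level can be sketched as a group-wise average. This is only an illustration under the assumption that continuous variables such as BMI were averaged within each district; the record layout, the field names `district` and `bmi`, and the values are hypothetical, and the treatment of categorical variables is not covered.

```python
from collections import defaultdict
from statistics import mean


def district_means(records, key="district", value="bmi"):
    """Average one individual-level variable within each enumeration district."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[value])
    return {district: mean(vals) for district, vals in groups.items()}


# Hypothetical individual-level records, for illustration only.
records = [
    {"district": "A", "bmi": 22.0},
    {"district": "A", "bmi": 26.0},
    {"district": "B", "bmi": 24.5},
]
means = district_means(records)
```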
We conducted a three-step statistical analysis, starting with the calculation of the global Moran’s I statistic, followed by a LISA analysis and a comparison of spatial regression models. The global Moran’s I statistic was essential to our analysis: a significant value implies spatial dependence in the dependent variable, signalling that we could proceed to the spatial regression models. After the LISA analysis was conducted for exploratory spatial data analysis, we determined the best model to account for the relationship between BMI and the independent variables from among the OLS model, the SLM, the spatial Durbin model (SDM), the SEM, the spatial Durbin error model (SDEM) and the general spatial model (GSM) (Anselin, 2013).
The global Moran’s I statistic is used as a measure of the overall spatial autocorrelation by testing the null hypothesis that no spatial correlation exists in the distribution of the dependent variable. If the hypothesis is rejected, this statistic supports either clustering (homogeneity) or dispersion (heterogeneity). A global Moran’s I statistic near +1 indicates clustering (homogeneity), while a value near -1 indicates dispersion (heterogeneity) (Anselin and Bera, 1998). In this analysis, we constructed a row-standardized spatial weights matrix based on radial distance (threshold distance=80 km). We selected the radial distance with the most significant global Moran’s I statistic (i.e., that with the lowest P-value) by increasing the radial distance at 5-km intervals.
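The weights matrix and the statistic described above can be sketched in a few lines. This is a minimal illustration, not the code used in the study: it assumes district centroids given as planar coordinates in kilometres and binary neighbour weights within the threshold before row-standardization, and the function names and toy data are hypothetical.

```python
import math


def distance_band_weights(coords, threshold_km):
    """Row-standardized weights: j neighbours i if it lies within the threshold."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= threshold_km:
                w[i][j] = 1.0
        row_sum = sum(w[i])
        if row_sum > 0:  # units without neighbours keep an all-zero row
            w[i] = [v / row_sum for v in w[i]]
    return w


def morans_i(values, w):
    """Global Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * (num / den)


# Toy example: four units on a line, 1 km apart, neighbours within 1 km.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
w = distance_band_weights(coords, threshold_km=1.0)
clustered = morans_i([1.0, 2.0, 3.0, 4.0], w)    # similar values are adjacent
alternating = morans_i([1.0, 4.0, 1.0, 4.0], w)  # dissimilar values are adjacent
```

In this toy example the smoothly increasing values give a positive statistic and the alternating values give a negative one. The threshold search described above would repeat the calculation for each candidate distance and keep the one with the lowest P-value; the significance test itself is not shown here.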
After conducting the global Moran’s I test examining the overall spatial autocorrelation, the LISA analysis was used to investigate local spatial autocorrelation (Anselin, 2004). The LISA analysis measures the statistical correlation between the value of one area and the values of nearby areas. A LISA value close to zero implies little or no statistical correlation among the neighbourhoods, while a value near +1 indicates a perfect positive spatial autocorrelation (clustered together by high or low values). A LISA value near –1, on the other hand, means a perfect negative spatial autocorrelation (checkerboard pattern) (Moran, 1950). For each enumeration district, the relationship between its BMI and the mean BMI of its neighbourhood was calculated. This relationship falls into four categories: i) high-high (HH) clusters, with high BMI values associated with high BMI neighbours; ii) low-low (LL) clusters, with low BMI values associated with low BMI neighbours; iii) high-low (HL) outliers, with high BMI values and low BMI neighbours; and iv) low-high (LH) outliers, with low BMI values and high BMI neighbours. To test LISA significance, we used a Monte Carlo permutation approach. This method assumes that any observed value is equally likely to occur at any location. The spatial data values are randomly shuffled across all places, and the LISA value is recalculated for each Monte Carlo permutation. The significance of the LISA analysis was determined by constructing a reference distribution from 999 random permutations. The final step included a comparison of the six regression models mentioned above and expressed as follows:
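The display below is a reconstruction that assumes the six models take their standard textbook forms. It uses the symbols defined in the next paragraph, plus u for the spatially autocorrelated disturbance whose lag (Wu) is referred to in the description of the SEM and SDEM.

```latex
\begin{align}
\text{OLS:}  \quad & y = X\beta + \varepsilon \\
\text{SLM:}  \quad & y = \rho W y + X\beta + \varepsilon \\
\text{SDM:}  \quad & y = \rho W y + X\beta + W X \theta + \varepsilon \\
\text{SEM:}  \quad & y = X\beta + u, \quad u = \lambda W u + \varepsilon \\
\text{SDEM:} \quad & y = X\beta + W X \theta + u, \quad u = \lambda W u + \varepsilon \\
\text{GSM:}  \quad & y = \rho W y + X\beta + u, \quad u = \lambda W u + \varepsilon
\end{align}
```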
where y is an n×1 vector of dependent variable (BMI in our analysis) and X and β have conformable dimensions for k exogenous covariates including a constant. ε is a vector of error terms; ρ a coefficient on the spatial lag of the dependent variable; W a n×n spatial weight matrix; θ the regression parameter reflecting the influence of the spatially lagged explanatory variables on variation in the dependent variable y; and λ a coefficient on the spatial lag of the error term.
In order to reflect the spatial autocorrelation, various spatial regression models were used in this study (Han and Sohn, 2017). The spatial dependence was incorporated using a spatially lagged dependent variable (Wy) in the SLM; using spatially lagged dependent variable (Wy) and independent variable (WX) in the SDM; using a spatial lag of error term (Wu) in SEM; and using spatial lags of error term (Wu) and independent variable (WX) in the SDEM (Anselin, 2013). The GSM is a combination of the SLM and the SEM (Anselin, 2013). In order to choose the best model, we utilized the Akaike information criterion (AIC) (Akaike, 1974; 1998) and the log-likelihood (Huelsenbeck and Crandall, 1997) of the regressions OLS, SLM, SDM, SEM, SDEM and GSM.
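The selection rule can be illustrated with the usual definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L is the maximized log-likelihood; the model with the lowest AIC is preferred. The sketch below uses made-up log-likelihoods and parameter counts under generic model labels. They are assumptions for illustration and are not the values from this study.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) pairs, illustration only.
candidates = {
    "model_a": (-250.0, 5),
    "model_b": (-247.5, 6),
    "model_c": (-244.0, 6),
}
scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # label of the model with the lowest AIC
```

Because the AIC adds a penalty of 2 per parameter, a spatial model that adds one coefficient must raise the log-likelihood by more than 1 to be preferred over a simpler model.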
We used two open-source software packages: GeoDa (https://spatial.uchicago.edu/software) and R (https://www.r-project.org/). GeoDa was used to examine the spatial distribution of the variables and to create the LISA significance and cluster maps. R was utilized to conduct the Moran’s I test, the LISA analysis and the spatial regression analysis.
The global Moran’s I statistics for the dependent variable and for the OLS residuals were 0.022 (P-value: 0.083) and 0.098 (P-value<0.001), respectively, indicating positive spatial autocorrelation in both BMI (significant at the 10% level) and the residuals.
Performing the LISA analysis, we found that 41 enumeration district units (21.47%) were spatially autocorrelated at a significance level of 0.05. In addition, 16 HH clusters (8.38%), 7 HL outliers (3.66%), 2 LL clusters (1.05%), 16 LH outliers (8.38%) and 150 non-significant areas (78.53%) were identified. The HH clusters, which indicate clustering of similarly high BMI values, were usually located in Midwest Korea. The BMI values at the enumeration district level, the LISA significance map for BMI and the LISA cluster map for BMI are displayed in Figures 2-4. When we compared the performance of the six models (OLS, SLM, SDM, SEM, SDEM and GSM), the SEM model was selected as the best model. This result indicated that a regression analysis of BMI in Korea should consider the spatial dependence in the error term. Table 5 shows the comparison of the six regression models and Table 6 shows the results of OLS and SEM.
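The HH, LL, HL and LH labels reported above follow from the sign of each district's standardized BMI and the sign of the weighted mean of its neighbours. The following is a minimal sketch of that classification with a permutation pseudo P-value. The permutation step holds the district's own value fixed and shuffles the others, which is the usual conditional variant and is an assumption about the procedure; the weights and BMI values are hypothetical.

```python
import random


def local_moran(values, w):
    """Return standardized values z, their spatial lags, and I_i = z_i * lag_i."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    z = [(v - mean) / sd for v in values]
    lag = [sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]
    return z, lag, [z[i] * lag[i] for i in range(n)]


def quadrant(z_i, lag_i):
    """Classify a unit by its own value (first letter) and its neighbours (second)."""
    if z_i >= 0:
        return "HH" if lag_i >= 0 else "HL"
    return "LH" if lag_i >= 0 else "LL"


def pseudo_p(values, w, i, n_perm=999, seed=0):
    """Share of permutations whose |I_i| is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = local_moran(values, w)[2][i]
    others = values[:i] + values[i + 1:]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(others)  # reassign the other units' values at random
        permuted = others[:i] + [values[i]] + others[i:]
        if abs(local_moran(permuted, w)[2][i]) >= abs(observed):
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Hypothetical row-standardized weights for four units on a line, and BMI values.
w = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
bmi_values = [30.0, 28.0, 20.0, 21.0]
z, lag, local_i = local_moran(bmi_values, w)
labels = [quadrant(z[i], lag[i]) for i in range(len(bmi_values))]
p_first = pseudo_p(bmi_values, w, i=0)
```

A unit would be reported as a cluster or an outlier only when its pseudo P-value falls below the chosen significance level (0.05 in the analysis above); otherwise it is treated as non-significant regardless of its quadrant.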
In the OLS model, the number of Members in household was statistically significant at the 10% level; however, this variable was not significant in the SEM model. The following variables were statistically significant at the 10% level in both the OLS and the SEM models: Flexibility activity days; Waist circumference; Smoking status (non-smoking contrasted with the reference level); Drinking frequency (more than 4 times per week contrasted with the reference level); Age; Years of education; and Muscle-strengthening activity days. Increases in Flexibility activity days and Waist circumference, Non-smoking and Drinking more than 4 times per week tended to be associated with an increase in BMI, while increases in Age, Years of education and Muscle-strengthening activity days tended to be related to a decrease in BMI.
Obesity remains a global public health concern. Both central and local governments focus on obesity prevention by enforcing their own regional public health care plans. The management of public sports facilities is one of the local council plans in Korea. For councils, it is important to identify the effects of the number of public sports facilities on regional obesity. In this study, we investigated the spatial characteristic of BMI and the spatial effects of the number of public sports facilities on the average BMI in Korean regions.
First, we identified that BMI in Korea exhibits spatial dependence using the global Moran’s I statistic. In addition, the residuals of the OLS model for BMI showed significant positive spatial autocorrelation. Next, the spatial clusters and spatial outliers were determined using a LISA analysis. This showed that Midwest Korea had high BMI in general, while the Seoul metropolitan area included several HH clusters and LH outliers.
Finally, we fitted various spatial regression models and found that the number of public sports facilities is not significantly related to BMI. This result could be caused by the low utilization ratio of public sports facilities and the unbalanced spatial distribution of the number and kinds of public sports facilities available. Therefore, councils should try to i) increase the quality of public sports facilities; ii) investigate the preferred types of public sports facilities and push for their establishment; and iii) incorporate the accessibility and needs of local residents in the selection and optimal location of new public sports facilities.
We set the unit of analysis as the enumeration district and not the participant’s residence in this study because personal information was protected. Although our results are significant, data on participants’ residences would provide more accurate and meaningful results. Further studies analyzing more detailed and abundant data may therefore provide more accurate results.
Decreases in age, years of education and muscle-strengthening activity days, increases in waist circumference and flexibility activity days, non-smoking status and drinking more than 4 times per week were found to be significantly associated with an increase in BMI according to the SEM model. Based on these results, we suggest doing more muscle-strengthening activity than flexibility activity to achieve a healthy BMI. In addition, a smaller waist circumference and reduced drinking would be recommended to people in general, as a change in this direction would lead to a healthier lifestyle. The relations of young age, low education level and non-smoking status with BMI need to be investigated further.