Public health researchers have started to pay attention to the built environment (BE) as part of preventive approaches against the global pandemic of obesity and related health problems resulting from physical inactivity (Lee et al., 2012; Sallis et al., 2016). Previous studies of the relationship between BE and public health have, however, predominantly been conducted in relatively low-density city contexts (Rydin et al., 2012; Barton and Grant, 2013). There is now evidence of uncertainty due to spatial scales and the conceptualization of measures (Clark and Scott, 2014), an issue which becomes amplified in high-density cities such as Hong Kong, Tokyo, Mumbai and Shanghai (Low et al., 2016). These megacities are widely known for their geographic and morphological BE heterogeneity, but the uncertainty in relation to constructing and interpreting BE measures needs further examination.
According to Longley et al. (2015), uncertainty arises since most representations of the world are incomplete, erroneous, out of date, subject to generalization or aberrant. The absence of complete BE data for high-density cities often hinders inference about the relationship between BE and health outcomes. For example, a cross-national study reported difficulties in measuring accessibility of public transport because of incomplete bus records in Bogota and Hong Kong (Adams et al., 2014). Although efforts have focused on the development of street audit instruments to collect objective BE data (Sun et al., 2017), current indicators collected by audit instruments are unlikely to reveal variability of locations (Cerin et al., 2013). Only single scales have been used in studies of Hong Kong, for example research at the level of the block (an area bounded by roads and their intersections) for body constitution and sedentary behaviour (Low et al., 2016) or a residence-based buffer and a 15-minute walking radius for walking and physical activity (Cerin et al., 2013; Lu et al., 2016).
Geographic information systems (GIS) are widely used to measure BE, but the conceptualization of such indicators varies considerably across disciplines. For example, street design can be measured and presented as different indicators, e.g. block length, block size, intersection density, street density, connected intersection ratio or link-node ratio (where node signifies the point where roads meet or cross) (Berrigan et al., 2010; Cerin et al., 2011; Sallis et al., 2016). However, there is a lack of clear and precise operational definitions of BE measures (Handy, 2005; Forsyth et al., 2006) together with limited information from uncertainty analysis of such indicators in high-density cities. Kwan (2012) notes that important ambiguities arise in measuring geographical context due to the spatial and temporal uncertainty of where, when and how long individuals experience environmental influences. Spatially, the uncertainty relates to the discussion of geographic scales in measuring spatial phenomena, such as the modifiable areal unit problem (MAUP) of arbitrarily defined boundaries (Clark and Scott, 2014; Longley et al., 2015). Lack of attention to the consistency, validity and reliability of the spatial scale of the measures could result in mischaracterization of the environmental exposure of subjects (Handy, 2005; Brownson et al., 2009).
Hong Kong is a typical high-density city, where little research has been conducted on uncertainty analysis of BE indicators and health associations. We aimed to investigate the reliability of a wide spectrum of measures characterizing the high-density BE of this city in urban health studies, across the size and morphology of spatial measurement units, thereby deriving urban health indicators. The study is expected to provide insights into analyzing uncertainty in BE measures in public health studies, with rigorous stratified location sampling from a territory-wide cohort.
Materials and Methods
The internationally recognized 5-D framework (density, diversity, design, destination accessibility, and distance to transit) (Ewing and Cervero, 2010; Ewing et al., 2015) was used as the design concept for constructing BE indicators based on a comprehensive dataset collected from governmental and private sectors. The 5-D approach includes density, measured as the variable of interest per unit of area; diversity, pertaining to the number of land use classes in that area; design, measured as street patterns and related characteristics; destination accessibility, indicating ease of access to trip attractions; and distance to transit, measured as the number of stations or stops per unit area.
Our measures concerned a representative sampling of residential addresses from the Hong Kong FAMILY Cohort which consists of a composite sample from several sources including a population representative random-core sample (Leung et al., 2015). The cohort covers almost all Hong Kong neighbourhoods (99.8%) enabling detailed spatial epidemiological studies linking BE to health and well-being at the individual, household and neighbourhood levels. Cohort studies take a holistic view of health, investigating its different dimensions including socio-demographics, anthropometrics, lifestyle and behavioural factors, measures of social capital and biomaterials, looking for effective public health and preventive approaches to improving physical, mental and social well-being (Leung et al., 2015).
Population and study area
Hong Kong is one of the most densely populated places in the world. According to the latest available census released by the Census and Statistics Department, Government of Hong Kong Special Administrative Region (SAR) (https://www.censtatd.gov.hk/hkstat/sub/so20.jsp), the population at the end of 2017 was over seven million people in an area covering 1,068 km². The ubiquitous building design in Hong Kong is a podium serving as a base platform with 2 to 4 floors, with several high-rise residential towers built above the podium (Shelton et al., 2011). More than 75% of the land comprises non-BE areas, while most BE exists between the waterfront and the mountains. The dwelling density of the Hong Kong urban area is over 1,250 units per hectare. The shortage of flat land together with high land values has prompted buildings to develop vertically (Shelton et al., 2011). Six spatial buffering and zonal techniques and scales were used for each BE measure in order to understand their performance in local contexts.
Table 1 shows a summary of data types and sources collected from Hong Kong governmental and private sectors and stored in shapefile format, including building outlines and heights, street network, destinations from points of interest (N=329,644) and census tracts. The building density as floor area ratio, i.e. the ratio of a building’s total floor area to the size of the piece of land upon which it is built, measured using the street-block census tract, is shown in Figure 1. Most land parcels in northern Hong Kong Island and Kowloon have a building density >5, which means that if the land of a block census tract is completely built, the number of floors is greater than 5 with the fifth quantile of building density being between 5 and 108 floors (Figure 1).
The residential addresses of FAMILY Cohort participants as of 2011 were geocoded. The detailed study design of the cohort has been described by Leung et al. (2015). The cohort comprised 20,279 households and 46,001 participants. In this study, we used 5,732 geocoded home addresses, which exclude repeated addresses (different apartments) within the same building.
Design of built environment measures
Land use patterns refer to the spatial distribution of areas devoted to different purposes. Using the 5-D model, we measured three main dimensions of the high-density built environment: land use patterns, transport and urban design (Handy et al., 2002). Land use patterns were estimated first by calculating the building density (i.e. total floor areas of podium and residential towers divided by the catchment area) and then destination accessibility, with destinations classified into eleven categories as displayed in Table 2. The destination accessibility measures were constructed using density and quantity, where the density measure was derived from the quantity of accessible destinations divided by the catchment area.
We also constructed a measure of the mixture of destination and land-use pattern. The formula used to calculate this mixture is a variation of the entropy formula used by Frank and his colleagues (Frank et al., 2005).
where DM is the mixture index, di the percentage of destination category i within the study participant’s catchment area and n the number of destination categories.
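The exact variation of the entropy formula is not reproduced in the text above. As an illustration only, a commonly used normalized form is DM = -(sum over i of d_i ln d_i) / ln n, with d_i expressed as a proportion. The Python sketch below assumes that form; the function name `destination_mixture` and the treatment of empty categories (skipped, since p ln p tends to 0) are assumptions of this sketch rather than details taken from the study.

```python
import math


def destination_mixture(counts):
    """Normalized Shannon entropy of destination counts per category.

    counts: number of destinations in each of the n categories within
    one catchment area. Returns a value in [0, 1]: 0 when all
    destinations fall in one category, 1 when they are evenly spread.
    This is an assumed, standard form; the study's variation may differ.
    """
    n = len(counts)
    total = sum(counts)
    if n < 2 or total == 0:
        return 0.0
    entropy = 0.0
    for count in counts:
        if count > 0:  # empty categories contribute nothing
            share = count / total
            entropy -= share * math.log(share)
    return entropy / math.log(n)
```

With this form, a catchment whose destinations are split evenly over all categories scores 1, and one containing a single category scores 0, which is why the index can be compared across catchments of different size.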
We used street connectivity to measure urban design and transport, constructing several measures, including average and median lengths of a block (influenced by the design of the street network), density of the street intersections (a measure of network connectivity) and the link-node ratio (an indicator of connectivity, equal to the number of links divided by the number of nodes within a study area).
The shorter the block length, the more connected the community (Leslie et al., 2007), and the higher the intersection density or the link-node ratio, the more connected the network. Transport infrastructure accessibility was measured as the density and quantity of accessible transit stops, including public bus stops, metro entrances and ferries.
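The connectivity measures described above can be sketched in a few lines of Python. This is a simplified illustration, not the GIS workflow used in the study: it assumes the street network inside a catchment is available as a list of links `(node_a, node_b, length_m)`, approximates block length by link length, and counts a node as an intersection when at least three links meet there. All names are hypothetical.

```python
from statistics import mean, median


def connectivity_measures(links, area_m2):
    """Street connectivity measures for one catchment area.

    links: list of (node_a, node_b, length_m) street segments.
    area_m2: catchment area in square metres.
    Assumption: a node joining three or more links is an intersection.
    """
    degree = {}
    for node_a, node_b, _ in links:
        for node in (node_a, node_b):
            degree[node] = degree.get(node, 0) + 1
    intersections = [node for node, d in degree.items() if d >= 3]
    lengths = [length for _, _, length in links]
    return {
        "mean_block_length_m": mean(lengths),
        "median_block_length_m": median(lengths),
        "intersection_density_per_m2": len(intersections) / area_m2,
        "link_node_ratio": len(links) / len(degree),
    }


# Toy network: four streets meeting at C, plus a short link between A and B.
example = connectivity_measures(
    [("A", "C", 100.0), ("B", "C", 100.0), ("D", "C", 100.0),
     ("E", "C", 100.0), ("A", "B", 50.0)],
    area_m2=10_000.0,
)
```

In the toy network the single short link pulls the mean block length below the median, a small-scale version of the mean/median discrepancy that the block length measures can show.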
Each BE measure was calculated at six spatial scales: street-block group (SG), primary adjacency community (PAC), two circular buffers (CBs) and two network-based service area buffers (SAs), defined as shown in Figure 2 and Table 3. The SG and PAC are zonal buffers consistent with urban design features and statistical census tracts. CBs and SAs are less related to morphological phenomena and more to behaviour, using crow-fly and network distances, respectively. In a literature review of studies using behaviour buffers to describe spatial contexts, 65% used CBs while the rest used network distances (Leal and Chaix, 2011).
Constructing the built environment measures
Density (1 indicator), diversity (1 indicator), design (4 indicators), destination accessibility (22 indicators) and distance to transit (2 indicators) were calculated at the six spatial scales, producing a total of 180 BE indicators, extracted for study participants of the FAMILY Cohort using GIS (ArcGIS 10.31, ESRI, Redlands, CA, USA) and Python scripting with 64-bit desktop background geoprocessing (https://blogs.esri.com/esri/arcgis/2012/11/12/python-scripting-with-64-bit-processing/). Using GIS, we also mapped a destination mixture indicator and a density/quantity indicator of retail (as a destination) across spatial scales as examples to visualize the uncertainty in the design, choice and use of BE measures.
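The indicator count follows from crossing the 30 base indicators with the six spatial scales. A small Python sketch makes the bookkeeping explicit; the per-dimension counts and scale labels come from the text, while the generated indicator names (for example `design_3@SA_400m`) are placeholders invented for this illustration.

```python
from itertools import product

# Number of base indicators per 5-D dimension, as stated in the text.
INDICATORS_PER_DIMENSION = {
    "density": 1,
    "diversity": 1,
    "design": 4,
    "destination_accessibility": 22,
    "distance_to_transit": 2,
}
SCALES = ["SG", "PAC", "CB_400m", "CB_800m", "SA_400m", "SA_800m"]


def indicator_names():
    """Return one placeholder name per (base indicator, scale) pair."""
    base = [
        f"{dimension}_{index}"
        for dimension, count in INDICATORS_PER_DIMENSION.items()
        for index in range(1, count + 1)
    ]
    return [f"{name}@{scale}" for name, scale in product(base, SCALES)]


all_indicators = indicator_names()
```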
Descriptive statistics using Stata (MP 14, StataCorp LP, TX, USA) were applied to the BE measures, and Spearman’s rank correlation was calculated to assess how well the rankings of each measure were preserved across spatial scales. We used the following categories to evaluate rank preservation (Strominger et al., 2016): i) ρ≥0.7 indicates a well-preserved index; ii) 0.5≤ρ<0.7 indicates a moderately preserved index; iii) 0.3≤ρ<0.5 indicates a weakly preserved index; iv) ρ<0.3 indicates that the index is not preserved across scales.
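The rank-preservation check can be illustrated with a short, dependency-free Python sketch. It is not the Stata procedure used in the study: it computes Spearman’s ρ as the Pearson correlation of average ranks (ties share the mean of their positions) and then applies the four thresholds listed above. Function names are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation of one measure at two spatial scales."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


def preservation_category(rho):
    """Map rho to the rank-preservation categories used in the text."""
    if rho >= 0.7:
        return "well preserved"
    if rho >= 0.5:
        return "moderately preserved"
    if rho >= 0.3:
        return "weakly preserved"
    return "not preserved"
```

For each measure, `x` and `y` would hold the values of that measure for the same addresses at two different scales, so a high ρ means that addresses keep roughly the same ordering regardless of the scale chosen.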
The results from the descriptive statistics are shown in Tables 4-6. Judged by the density measures, the building density varied between the spatial scales. SA_400m showed the highest building density (4.0). It was different from CB_400m (2.67) but similar to the SG level (3.77). CB_800m indicated the lowest building density (1.85), which was only half that of its counterpart using the SA buffer. Judged by the diversity measures, the destination mixture was the greatest at the SG level (0.48), followed by SA_400m (0.42), PAC (0.36), SA_800m (0.36) and CB_400m (0.34). CB_800m (0.28) had the smallest diversity in the mixture of destinations/services.
Judged by the design measures, the median block lengths were not consistent with the average block lengths. For example, the median length was 80.2 m compared to a mean of 118.0 m at the SG level. SA_800m had the smallest median length (51.0 m) among all spatial scales, while SG had the longest block length with respect to both the median and average measures. SA buffers had a greater density of street intersections than their CB counterparts (e.g., 0.00145 for SA_400m versus 0.00038 for CB_400m). Intriguingly, SG, which had the lowest connectivity measured by median block length (that is, the longest block length, 80.2 m), had the highest connectivity measured by link-node ratio (0.85).
Judged by the destination accessibility measures, SG had the smallest quantity of destinations in most of the destination categories, while CB_800m held the largest quantity of destinations and showed striking differences between scales in the categories of small business (5.62 versus 209.00), retail (21.70 versus 579.00), companies (10.60 versus 1210.00), and restaurants (19.10 versus 451.00). The quantities of destinations in SA_400m were less than half those of their CB_400m counterparts (e.g., educational (16.80 versus 43.00), entertainment (28.70 versus 70.80), and restaurants (68.10 versus 162.00)). The differences in destination quantities were offset or even reversed in the destination density measures. For example, retail density between SG and CB_800m was reversed compared to the absolute measure: quantity was 21.70 versus 579.00, while the density was 0.00053 per m² versus 0.00029 per m². Generally, SG had the largest density of destinations of all the categories (except for the company category).
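The reversal between quantity and density follows directly from the difference in catchment area. The Python sketch below reworks the retail example using only figures quoted above, under the simplifying assumption that an 800 m circular buffer has its full geometric area π·800² (no clipping by coastline or other boundaries). The helper names are hypothetical.

```python
import math


def circular_buffer_area(radius_m):
    """Area in m² of an unclipped circular buffer (simplifying assumption)."""
    return math.pi * radius_m ** 2


def destination_density(quantity, area_m2):
    """Density measure: accessible destinations per m² of catchment."""
    return quantity / area_m2


# Mean retail quantities reported in the text.
retail_quantity = {"SG": 21.70, "CB_800m": 579.00}
# Mean SG retail density reported in the text (per m²).
sg_retail_density = 0.00053

cb_800_density = destination_density(
    retail_quantity["CB_800m"], circular_buffer_area(800)
)

quantity_rank = sorted(retail_quantity, key=retail_quantity.get, reverse=True)
density_rank = sorted(
    {"SG": sg_retail_density, "CB_800m": cb_800_density}.items(),
    key=lambda item: item[1],
    reverse=True,
)
```

Rounded to five decimals, the computed CB_800m density matches the reported 0.00029 per m², while SG ranks first on density and last on quantity. Averages of per-address ratios need not equal the ratio of averages, so this is a consistency check rather than a reproduction of the reported tables.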
Judged by the distance to transit measures, SG and SA_400m had the largest density of transit (0.00006), while CB_800m had the smallest (0.00003). The quantity measures reversed this ranking, with CB_800m having the largest number of transit stops (68.4 within accessible distance on average) compared to 4.5 at the SG level.
Mapping built environment measurement uncertainty
We used a set of maps to show the diversity of destination mix versus the density and quantity of retail as examples of variability in measures across spatial scales when measuring the high-density BE of Hong Kong (Figure 3). Destination mixture was evenly distributed across the territory using SG. There was no clear difference, on this measure, between new towns in the New Territories and the city centre on Hong Kong Island. In contrast, the distribution of service mix was much more uneven using SA_400m. The areas with relatively low service mix were in the northern part of Hong Kong Island (from Sheung Wan to Causeway Bay) and the linear area from Tsim Sha Tsui to Mong Kok (around Nathan Road) in Kowloon. These were both among the most densely populated areas in Hong Kong.
Quantity measures of the retail service were quite different when comparing SG and SA_400m. Most of the places which had fewer than 20 retail stores in the block measure had more than 100 in the 400 m network buffer measure. The densest places by retail quantity were in areas similar to those with relatively low destination mix, in the northern part of Hong Kong Island and around Nathan Road. The variations in retail density between the two spatial scales were much smaller than with the quantity measure, except for a few places such as the eastern part of Hong Kong Island and the linear area from Hung Hom to To Kwa Wan in Kowloon. Similar degrees and patterns of uncertainty over spatial scale were found in most of the other destination categories referred to in Tables 4-6.
Statistical analyses of variability/uncertainty
Spearman’s rank correlations for BE measures across scales are shown in Table 7. Judged by the density measures, block-level building density correlated weakly with PAC, CB_400m, and SA_800m (0.30<ρ≤0.50). It did not correlate with CB_800m (ρ=0.27). PAC had a weak correlation with SA_400m, a moderate correlation with SA_800m (ρ=0.55) and a strong correlation with CB_400m and CB_800m (ρ>0.70). CB_400m had a strong correlation with CB_800m, SA_400m, and SA_800m (ρ>0.70). CB_800m had a weak correlation with SA_400m and a moderate correlation with SA_800m (ρ=0.61). A strong correlation was found between SA_400m and SA_800m (ρ=0.83).
Judged by the diversity measures, the mix of services was weakly correlated between SG and SA_400m, SG and SA_800m, PAC and CB_800m, and CB_800m and SA_400m (ρ=0.32~0.48). There was no correlation between SG and CB_800m (ρ=0.17). A well-preserved correlation was shown between SA_400m and CB_800m, CB_400m and CB_800m, and CB_400m and SA_800m (ρ=0.72~0.80).
Judged by the design measures, median block length was less consistent across spatial scales than average block length except when comparing SG and PAC (ρ=0.45 versus 0.40), and SA_400m and SA_800m (ρ=0.86 versus 0.84). The average block length exhibits weak to moderate correlations among the six spatial scales (ρ=0.34~0.67), with the exception of a non-correlated relationship between SG and CB_800m (ρ=0.23) and a high correlation between SA_400m and SA_800m (ρ=0.84). Link-node ratio showed extreme variability across scales. Most rankings were not preserved (ρ=-0.09~0.25), except weak correlations between CB_400m and SA_800m (ρ=0.32) and SA_400m and SA_800m (ρ=0.48) and a moderate correlation between SG and PAC (ρ=0.57).
Judged by the destination accessibility measures, rank correlations of service destination density measures were generally better preserved than those of service destination quantity measures. For example, density measures of accessible entertainment showed moderate to high correlation between scales (ρ=0.43~0.85), while quantity measures in SG were not correlated with CB_400m (ρ=0.11), CB_800m (ρ=-0.03), SA_400m (ρ=0.18), or SA_800m (ρ=0.11).
Judged by the distance to transit measures, we observed weak correlations for density measures of accessible transit comparing SG and the other scales (ρ=0.25~0.37). However, the SG measures of accessible transit counts were not preserved with the other scales (ρ=-0.14~-0.02) except with PAC (ρ=0.72). Similarly, PAC measures of quantity were not correlated with CB_400m, CB_800m, SA_400m, and SA_800m (ρ=-0.06~0.07).
This paper deals with our understanding of uncertainty, a long-standing but less studied issue in urban health, with a particular focus on high-density and heterogeneous urban situations. We examined the uncertainty in measuring high-density BE, as typified by Hong Kong, with a view to understanding the variability of BE measures across different design methods and spatial units. Our results reveal how variability affects the sensitivity of BE measures in health studies.
We constructed 30 BE indicators of land use patterns, transport and urban design around 5,732 geocoded residential addresses of Hong Kong FAMILY Cohort members. Each indicator was calculated at six spatial scales commonly applied in health-related behavioural studies, adopting the 5-D BE framework to increase comparability with international studies. These urban morphology metrics were constructed based on a complete database; specifically, we measured destination/service diversity from a complete suite of points of interest in detailed categories. This differs from previous diversity measures using mixture indices constructed at the parcel level (e.g. residential, commercial, or industrial levels) with abstract land use categories (Frank et al., 2005). Abstract land use at the parcel level is unable to capture the variations in high-density environments with intense mixed-use development, which may lead to invalid findings regarding the linkages between BE and health outcomes (Lu et al., 2016). We proposed a detailed taxonomy for the classification of services in destination accessibility measures, which can serve as a protocol for future studies.
We found high variability across the construction methods of BE indicators. For example, significant differences in the urban design dimension appear between block length (median and average) and street connectivity (density of intersections and link-node ratios), and between quantity measures of destination accessibility and their counterparts in density measures. Median block length and link-node ratio showed less preserved correlations in ranking than other measures of the design dimension, the median being more sensitive than the average for measuring variability of the block size. Measurement rankings can even be reversed between different design methods of indicators. For example, the SG level had the lowest connectivity measured by median block length but the highest connectivity measured by link-node ratio, which may lead to extreme inconsistency when applying these measures separately in regression models. Quantity measures of service destination accessibility were found to be more sensitive than the service density measures. These uncertainties are somewhat in line with the findings in low-density cities (Mitra and Buliung, 2012; Clark and Scott, 2014; Strominger et al., 2016), but the uncertainty level is more pronounced in the high-density environment.
We found considerable uncertainties in measuring a high-density BE across spatial scales. Most of the measures using the smallest census tract of Hong Kong, the SG scale, were quite inconsistent with other scales, showing minimal correlation. The census tract scale may therefore be inappropriate for capturing variability of urban morphology in high-density building areas, although it has the advantage of correlating BE measures with census variables. This calls for caution in health research using census tracts in similar high-density contexts. Likewise, we found no or only weak correlations when comparing indicator rankings at the CB_800m level with other scales. The poorly preserved correlations between CB and other scales may partially originate from the constrained layout of the urban area in Hong Kong, where developed land is squeezed between mountains and the sea. A CB used in this morphological context is likely to include a large proportion of uninhabited areas of mountains or water bodies. Similarly, larger zonal PAC measures perform poorly because they tend to involve spaces that are inaccessible due to ownership or institutional, configurational, slope and other reasons. Models of BE-health associations using these scales could, therefore, fail to infer valid linkages between BE and health impact. In contrast, a network buffer, which is less contaminated by inaccessible spaces, becomes a more appropriate unit for analysis.
The uncertainty demands reflection with regard to design methods and spatial scales when constructing BE indicators for health studies. Qualitative interviews of subjects would be required to gain a better understanding of activity space, thus informing selection of BE metrics to better reflect a subject’s geographic context. Taking the quantity and density of accessible destinations as an example, destination counts are better at reflecting a location’s vitality than destination density. However, subjects may perceive no difference in utility between one neighbourhood environment with 10~20 restaurants and another with 20~30. It therefore matters that the two types of measures have significantly different rankings when measured at different scales and with different shapes. Density measures will be biased as an indicator of place viability, due to intrinsic lack of correlation with the more important destination count morphology.
Researchers need to use a scale that is appropriate to the behaviours investigated. When constructing a BE measure, subjects may be asked, for example, to actively draw what they consider their behavioural space to be, or passively reveal the ideal spatial scales for some particular choice set (e.g., active trajectories by GPS) (Tribby et al., 2017). An extension of our study would be to repeat the systematic comparison of BE measures across scales within the context of a BE-health associational study. In this way, the impact of using certain measures rather than others could be assessed in terms of correlation coefficients, not just in terms of the descriptive performance of the measures compared against each other.
Limitations and strengths
This study has several limitations. First, we did not consider the weights of different destinations/services in measuring the diversity of land use. Weighting would be needed when linking these indicators to health outcomes. For example, retail destinations may be more important for walking studies than small businesses, which need to be differentiated from companies. Second, we used a street network rather than a pedestrian network for the urban design measures, and there may be behavioural differences between those two networks (Sun et al., 2015). Third, the study was conducted in a high-density urban environment, so generalization of the findings should be done with caution. However, the methodology can be reliably adapted to other high-density cities and elsewhere, with additional considerations. Finally, we only discussed static BE exposure measures based on residential addresses, while dynamic measures of exposure may be required to connect ‘activity spaces’ with exposures in different environments. Understanding the spatial and temporal variations of exposure is critical when dynamic exposures are required (Kwan, 2013; Burgoine et al., 2014; Tenailleau et al., 2015).
To the best of our knowledge, this is the first study to systematically profile variability of BE measures across the scale and shape of measurement buffers in a high-density city. This is a notable strength, as this approach is fundamental for research and practice promoting healthy high-density cities. We can further explore the linkage between BE and health behaviours and disentangle the impact of BE on health through rigorous and solid measures, which provides evidence of the need for effective BE intervention in high-density cities for urban planners, policy makers and public health practitioners. We are working on building an open data web platform to support scholars who are interested in studies of BE-health associations in Hong Kong. The methodologies, including data collection protocols, measures, and modelling, provide a benchmark for high-density BE-health studies in Hong Kong and other high-density cities around the world.
We measured and classified attributes of the high-density BE based on a comprehensive urban dataset, which is essential for testing health-related research hypotheses. Our findings suggest that complete data, appropriate design methods and suitable spatial scales of measures are crucial in high-density BE-related health studies. Some indicators were found to be more robust than others. When high-density urban space is linearly constrained by uninhabited areas, as in Hong Kong, network buffers are likely to retain more meaningful activity space than circular buffers, while circular buffers and block-based geography, with the latter’s attraction of linked census data, may be more acceptable surrogates in homogeneous lower-density cities. This is the first study to systematically examine BE in a high-density city, and it can serve as a framework for measuring BE in health studies in other similar contexts.