Infectious diseases, such as enteric and respiratory diseases, are frequently transmitted through the environment. This process is often conceptualized with the F-diagram, a model of pathogen transmission through one or more of six environmental reservoirs: fluids, fingers, fields, foods, flies, and fomites (Figure 1). The F-diagram describes the movement and fate of pathogens as a process originating from an infected person and moving through the environment to a susceptible person (Kawata, 1978). The process may be linear (e.g., open defecation contaminates drinking water, Figure 1, arrows A→D) but is frequently more complicated. For example, hand contamination from changing an infant’s diaper may be transferred to stored drinking water during collection and subsequently ingested (Figure 1, arrows B→C→D).
Implicit in the F-diagram, via the arrows connecting the environmental reservoirs, are human-environment interactions, which describe how a person’s movement through and contacts with the environment affect pathogen transmission. The interactions relevant to disease transmission include both those of the infected person and those of the susceptible person. Human-environment interactions that influence disease transmission occur across all scales, from large (e.g., global migration patterns) (Sorichetta et al., 2016) to small (e.g., a person’s individual contacts with contaminated surfaces) (Julian and Pickering, 2015). Accounting for these interactions helps in the understanding and control of environmentally transmissible diseases.
Human-environment interactions can be quantified using diverse methodologies that each provide value in understanding infectious disease transmission. Among the most common methodologies are: i) mining census data; ii) surveys; iii) diaries; iv) structured observations; v) electronic tracking (e.g., radiofrequency identification, GPS, accelerometers); and vi) videography. Videography methods to capture interactions are playing an increasingly important role in assessing the contribution of environmental pathways to pathogen transmission.
As relatively low-cost and simple methods, surveys, diaries, and structured observations have been broadly applied to understand human behavior. Surveys are retrospective questionnaires, and increasingly take advantage of mobile technologies (e.g., mHealth) to disseminate information, survey health, and track disease (World Health Organization, 2011). Diaries encourage self-observation and recording of targeted behaviors (e.g., locations, activities, diet, and/or social networks) (Read et al., 2008). Structured observations shift data collection from the study participants to trained observers who monitor people’s behaviors in real-time. Structured observations have been used, for example, to quantify environmental exposures (Teunis et al., 2016) and water, sanitation, and hygiene behaviors (Ram et al., 2010). Despite their widespread use, these tools often introduce bias (e.g., social desirability, recall, reporting, and/or observer bias), require a priori knowledge of which behaviors to target, and provide only limited data resolution.
Automated tracking technologies (e.g., radiofrequency identification devices (RFID), Global Positioning System (GPS) tracking, accelerometers, and pressure sensors) have been employed to study infectious disease transmission. For example, RFID and GPS have been used to monitor and quantify social contacts, and accelerometers have been used to monitor hygiene intervention compliance (Ram et al., 2010). Sensors for recording human-environment interactions can collect large amounts of data over extended time-scales with little investment. However, human behavior monitoring with tracking technologies remains constrained by technological limitations. GPS, for example, provides data on location, accelerometers track local movements, and pressure sensors track changes in flow or volume, which is useful for monitoring water usage. Translating such limited data into human-environment interactions therefore requires inference.
Videography offers the opportunity to record and quantify human-environment interactions in high resolution through the collection, annotation, and analysis of digitized data. Unlike surveys, diaries, or structured observations, videos provide a permanent objective record. The videos can be reviewed to ensure accuracy, quantify inter-observer variation of targeted activities, and/or observe multiple concurrent phenomena independently. The use of multiple data collectors annotating the same video may help reduce observer bias. When used appropriately, hidden capture techniques may reduce social desirability bias and/or reactivity. Videos can also capture data that would be difficult or impossible to observe in real-time, such as hand configuration during hand-object contacts (AuYeung et al., 2008).
Since videographic methods may capture personal or private information, culturally sensitive data protection and management plans should be developed and institutional review board approval should be sought.
Videos may be collected via third-person perspective, first-person perspective (wearable), or hidden cameras. Depending on the collection method, videographers may need to stay available on-site during recording. Hidden and third-person perspective cameras may be stationary, tracking a single location (e.g., observing hand hygiene stations), or may be carried by a data collector. First-person perspective cameras may be worn on either the head or the chest to capture, for example, hand contact events, mouthing events, and/or social contacts (Julian and Pickering, 2015). Limitations to recording length (e.g., storage, battery life) need to be considered in protocols.
An important challenge in the use of videography is the need to annotate, or translate, the video into a digital record of events that can be analyzed downstream. Annotation by hand follows methods employed by structured observations, where protocols are developed for the observation and recording of specific, targeted events (e.g., monitoring the timing and frequency of handwashing (Ram et al., 2010)). Alternatively, several software packages have been employed to facilitate video annotation.
The recently updated Virtual Timing Device for the Personal Computer [VTDPC] (University of Arizona, Tucson, AZ, USA) was developed to quantify human-environment interactions from previously obtained videos (Zartarian et al., 1997; Ferguson et al., 2006). In brief, the software user annotates a video using a palette of customizable categories of actions or objects. For example, an action category may include child’s hand in mouth, object in mouth, or touching; an object category may include plastic, food, or soil; and a location category may include indoor, outdoor, or more specific locations. The integrated video plays when objects from each category are selected and can be paused by unselecting one or more objects. VTDPC records the items selected in each category as well as the duration of the selection. The resultant time series of contact events is often referred to as a micro-level activity time series [MLATS] (Zartarian et al., 1997; Ferguson et al., 2006). The output is a sequential list of selections from within each category, the duration of each selection, and associated metadata (e.g., date, time, data collector’s ID).
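To make the structure of such an output concrete, a micro-level activity time series can be sketched as an ordered list of records, each holding one selection per category and its duration. The sketch below is illustrative only: the `ContactEvent` class, its field names, and the `build_series` helper are hypothetical and do not reflect the actual VTDPC file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContactEvent:
    """One row of a hypothetical micro-level activity time series."""
    start_s: float     # seconds from the start of the video
    duration_s: float  # how long the selection stayed active
    action: str        # e.g. "touching", "hand in mouth"
    obj: str           # e.g. "plastic", "food", "soil"
    location: str      # e.g. "indoor", "outdoor"


def build_series(selections):
    """Turn (start_s, end_s, action, obj, location) tuples into ordered events."""
    events = []
    for start_s, end_s, action, obj, location in sorted(selections):
        if end_s < start_s:
            raise ValueError("selection ends before it starts")
        events.append(ContactEvent(start_s, end_s - start_s, action, obj, location))
    return events


# Made-up annotations, entered out of order on purpose.
raw = [
    (12.0, 15.5, "touching", "plastic", "indoor"),
    (3.0, 4.0, "hand in mouth", "none", "indoor"),
    (15.5, 21.0, "touching", "soil", "outdoor"),
]
series = build_series(raw)
for event in series:
    print(event)
```

Keeping the events in time order preserves the sequence of contacts, which is what distinguishes a time series of this kind from a simple tally of events.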
LiveTrak (Stanford University, Stanford, CA) was recently developed to complement VTDPC for annotation of events observed in real-time (i.e., structured observations). In brief, LiveTrak is a tablet-based software package that provides similar functionality to VTDPC through the a priori creation of a palette containing categories of objects. As in VTDPC, items are selected from each category and the output is a list of selections from within each category with the duration of each selection and associated metadata (e.g., date, time, data collector’s ID).
Annotation of videos is expensive in terms of labor, in part because of the typical requirement for repeat viewing. For example, 4 to 5 hours of annotation may be required for one hour of video. The development of computer-vision algorithms for automatic videographic annotation is thus a promising and rapidly advancing field that could reduce this labor burden and improve privacy protection. However, automated annotation of natural movements of objects, such as hands in uncontrolled environments, remains challenging.
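As a rough planning aid, the 4 to 5 hours of annotation per hour of video quoted above can be scaled to a whole study. The sketch below is a back-of-the-envelope calculation only; the function name and the `annotators` parameter (the number of people who each independently annotate all footage) are assumptions introduced for illustration.

```python
def annotation_hours(video_hours, ratio_low=4.0, ratio_high=5.0, annotators=1):
    """Return a (low, high) estimate of total annotation labor in hours.

    ratio_low and ratio_high are hours of annotation per hour of video.
    annotators is an assumed count of people each annotating all footage.
    """
    if video_hours < 0 or annotators < 1:
        raise ValueError("video_hours must be >= 0 and annotators >= 1")
    low = video_hours * ratio_low * annotators
    high = video_hours * ratio_high * annotators
    return low, high


# Example: 20 hours of footage annotated once.
low, high = annotation_hours(video_hours=20)
print(f"Estimated annotation labor: {low:.0f} to {high:.0f} hours")
```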
Videographic analysis, like methods reliant on trained enumerators, is subject to inter- and intra-observer variation (Ferguson et al., 2006). Protocols that include training and inter- or intra-observer comparison are needed to reduce observer error and misclassification.
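One common way to express inter-observer agreement on categorical annotations is Cohen's kappa, which corrects raw agreement for agreement expected by chance. It is shown here as one possible statistic, not as the comparison used in the cited studies. The minimal sketch below assumes two observers have labeled the same sequence of video segments; the labels and data are invented for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two observers labeling the same segments."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of segments on which the observers agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each observer's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Invented labels for eight video segments.
observer_1 = ["touch", "touch", "mouth", "none", "touch", "none", "mouth", "none"]
observer_2 = ["touch", "none", "mouth", "none", "touch", "none", "touch", "none"]
kappa = cohens_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```

The same function can be applied to two annotation passes by a single observer to gauge intra-observer consistency.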
Analysis of the annotation output can be relatively simple, typically involving summary statistics of the frequency and duration of targeted events (Zartarian et al., 1997; Clack et al., 2017). More complex analyses use the micro-level activity time series to model sequential events for use in temporal exposure, dose, and risk assessment (Canales and Leckie, 2007; Julian and Pickering, 2015).
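The simpler form of analysis can be sketched as a per-category tally of event counts, hourly frequency, and duration. The event categories, durations, and the `summarize` function below are hypothetical and serve only to illustrate the kind of summary statistics described above.

```python
from collections import defaultdict
from statistics import median


def summarize(events, video_hours):
    """Summarize (category, duration_s) pairs by category."""
    durations = defaultdict(list)
    for category, duration_s in events:
        durations[category].append(duration_s)
    summary = {}
    for category, values in durations.items():
        summary[category] = {
            "count": len(values),                  # number of events
            "per_hour": len(values) / video_hours,  # event frequency
            "total_s": sum(values),                # cumulative contact time
            "median_s": median(values),            # typical event duration
        }
    return summary


# Invented events from 30 minutes of annotated video.
events = [
    ("hand-to-mouth", 1.0),
    ("hand-to-surface", 4.0),
    ("hand-to-mouth", 3.0),
    ("hand-to-surface", 2.0),
    ("hand-to-surface", 6.0),
]
summary = summarize(events, video_hours=0.5)
for category, stats in summary.items():
    print(category, stats)
```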
In the accompanying video, we motivate and highlight the use of videography for the field of infectious disease transmission through demonstrations of VTDPC and LiveTrak. We use the software to demonstrate the capture of hand contact events, which are otherwise difficult to observe through traditional methods. The example focuses on human-environment interactions at small, local scales. However, videography is readily extensible to other scenarios, including: i) human and/or animal migration patterns, social interactions, and/or geographic distributions; ii) behaviors and practices around water, sanitation, hygiene, and/or energy; and/or iii) performance of targeted activities for efficiency and/or safety evaluations.
To conduct accurate health risk assessments and better control infectious disease transmission, a greater understanding of human-environment interactions is needed. Here, we describe using videography and specialized software to better quantify human-environment interactions. Two such programs are VTDPC, which collects sequential micro-level activity time series of contact events and interactions, and LiveTrak, which is optimized to facilitate annotation of events in real-time. Opportunities to annotate behaviors at high resolution using these tools are promising. Further breakthroughs in computer-vision algorithms for annotation will help realize the full potential of videography in quantifying human movements for diverse applications. Windows Movie Maker and Microsoft PowerPoint were used to create media for this work.