Thursday, October 19, 2017

Generating Realistic Populations of Megacities and their Social Networks

In my last post I wrote about how the study of disasters needs to account for three interacting complex adaptive systems: the physical environment, the social environment and the individual cognitive environment. Agent-based experimental simulations of populations responding to disasters need their synthetic populations to represent not just the individual characteristics of people, but also the social connections that influence their behavior. Tomorrow I will present a paper on how to generate such a population at the Computational Social Science Society of the Americas in Santa Fe, New Mexico.

The paper, co-authored with my fellow researchers at George Mason University, discusses how a mixed method of iterative proportional fitting and network generation can be used to build a synthesized subset population of the New York megacity and region. Our approach demonstrates that a robust population and social network relevant to specific human behavior can be synthesized for agent-based models.

Generating Megacity Populations and their Social Networks

The New York Megacity and region includes very dense areas and rural ones.

We are creating a 1:1 population based on the US 2010 Census, and we use Iterative Proportional Fitting (IPF) for the population synthesis. This is a method in which the attributes of an individual unit are taken from a data set with fixed marginal totals. For each agent in the model a random sample is taken from a probability distribution of the relevant attributes existing in the population data. This process is repeated until all the attributes are assigned to the agent population.
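The sampling step described above can be sketched in a few lines of Python. This is only an illustration, not the code from the paper: the joint distribution, the attribute names and the `draw_agents` helper are all hypothetical, and the counts are invented rather than taken from the 2010 Census.

```python
import random

# Hypothetical joint counts over (age group, sex). The numbers are invented
# for illustration and are NOT taken from the 2010 Census.
joint_counts = {
    ("0-17", "F"): 110, ("0-17", "M"): 115,
    ("18-64", "F"): 310, ("18-64", "M"): 300,
    ("65+", "F"): 95, ("65+", "M"): 70,
}

def draw_agents(joint_counts, n_agents, seed=0):
    """Draw attribute tuples with probability proportional to each cell count."""
    rng = random.Random(seed)
    cells = list(joint_counts)
    weights = [joint_counts[cell] for cell in cells]
    draws = rng.choices(cells, weights=weights, k=n_agents)
    return [
        {"id": i, "age_group": age_group, "sex": sex}
        for i, (age_group, sex) in enumerate(draws)
    ]

agents = draw_agents(joint_counts, n_agents=1000)
print(agents[0])
```

Drawing both attributes together from the joint table, rather than one attribute at a time, keeps the correlation between attributes that the table encodes.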

In the last stage of the process we create social networks based on family, work and school connections. For network ties in larger workplace and school populations, we create a small-world network from the potential pool of network connections. The result is thousands of network connections that extend individual and family social ties beyond their households. This shows a family with two working parents and their two children.
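As a rough illustration of the small-world step, here is a minimal Watts-Strogatz-style sketch in plain Python. It is not the shared notebook code: the function name `small_world_ties`, the parameters `k` and `p`, and the 50-person workplace are assumptions made for the example.

```python
import random

def small_world_ties(members, k=4, p=0.1, seed=0):
    """Build Watts-Strogatz-style ties: a ring lattice with random rewiring.

    members: list of agent ids in one workplace or school (hypothetical input)
    k: even number of lattice neighbours per member
    p: probability of rewiring each lattice tie
    """
    rng = random.Random(seed)
    n = len(members)
    if n <= k:
        # Small groups: everyone knows everyone.
        return {frozenset((a, b)) for i, a in enumerate(members) for b in members[i + 1:]}
    edges = set()
    # Ring lattice: tie each member to the k/2 nearest members on each side.
    for i in range(n):
        for j in range(1, k // 2 + 1):
            edges.add(frozenset((members[i], members[(i + j) % n])))
    # Rewire each lattice tie with probability p to a random non-neighbour.
    for i in range(n):
        for j in range(1, k // 2 + 1):
            old = frozenset((members[i], members[(i + j) % n]))
            if old in edges and rng.random() < p:
                candidates = [
                    m for m in members
                    if m != members[i] and frozenset((members[i], m)) not in edges
                ]
                if candidates:
                    edges.remove(old)
                    edges.add(frozenset((members[i], rng.choice(candidates))))
    return edges

workplace = list(range(50))  # 50 hypothetical co-workers
ties = small_world_ties(workplace)
print(len(ties), "workplace ties")
```

Rewiring swaps one tie for another, so the number of ties stays at `n * k / 2`; only the mix of local and long-range connections changes.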

My colleague, Talha Oz, has shared the code on Jupyter's nbviewer:

Friday, October 13, 2017

Organizing Theories for Disaster Study in Computational Social Science

What are frameworks for understanding disasters? And, how can a Complex Adaptive Systems framework provide better understanding?

Check out my presentation at GMU's Computational Social Science (CSS) Friday Seminar:
Organizing Theories for Disaster Study in Computational Social Science

Or view the slides:
Disasters in a Complex Adaptive Systems Framework

Friday, September 22, 2017

Synthetic Populations for ABMs

Agent-based models are being used for computer experiments in epidemiology, transportation, migration, climate change, and urban studies. Researchers use the models to experiment on simulated human behavior with a synthesized population in a controlled environment. What population synthesis methods are currently being used in ABMs, and how have these synthetic populations been used?

Population synthesis is the process of creating agent representations of the model population based on available data. Sample-based methods are more traditional, but new methods also create synthetic populations sample-free.

Sample-based methods involve either synthetic reconstruction or combinatorial optimization (reweighting), and both draw on existing datasets of population characteristics such as census data.

In synthetic reconstruction the joint distribution of relevant population attributes is used to create a fitted population and generate individual units from that population. The most common method is Iterative Proportional Fitting (IPF), a procedure in which the attributes of an individual unit are taken from a contingency table with fixed marginal totals. For each agent in the model a random sample is taken from a probability distribution of the relevant attributes existing in the population. This process is repeated until all the attributes are assigned to the agent population. The result is a reconstruction of the original population.
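The fitting step of IPF can be sketched as follows. This is a minimal two-dimensional version under stated assumptions: the seed table, the marginal totals and the `ipf` function are hypothetical examples, not data or code from any of the tools discussed here.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Scale a 2D seed table until its row and column sums match the targets."""
    table = [row[:] for row in seed]
    for _ in range(max_iter):
        # Row step: scale each row to its marginal total.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [value * target / total for value in table[i]]
        # Column step: scale each column to its marginal total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns now match exactly, so convergence is judged on the rows.
        error = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if error < tol:
            break
    return table

# Hypothetical seed (e.g. from a survey sample): age group by sex.
seed = [[10.0, 20.0], [30.0, 40.0]]
row_targets = [400.0, 600.0]  # invented age-group totals
col_targets = [450.0, 550.0]  # invented sex totals
fitted = ipf(seed, row_targets, col_targets)
for row in fitted:
    print([round(value, 1) for value in row])
```

The fitted table keeps the association pattern of the seed while matching the fixed marginal totals; agents are then drawn from it cell by cell.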

In combinatorial optimization a sample population is generated and then repeatedly modified until it meets a threshold of required constraints. First a set of randomly selected households is taken from an existing population dataset. A random household from this sample and one from the large dataset are assessed for fit. If the replacement gives a better fit, the households are switched. The assessment and potential switch are repeated, and the sample population gradually improves its fit to a set of population constraints. The result is a sample population that approximates those constraints.
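The swap-and-assess loop can be sketched as a simple hill climb. Everything here is an assumption made for illustration: the household pool, the two constraints (`persons` and `children`), the target totals and the function names.

```python
import random

def fit_error(households, targets):
    """Total absolute gap between the sample's totals and the constraint totals."""
    totals = {key: 0 for key in targets}
    for household in households:
        for key in targets:
            totals[key] += household[key]
    return sum(abs(totals[key] - targets[key]) for key in targets)

def combinatorial_optimization(pool, targets, sample_size, n_iter=5000, seed=0):
    """Swap households in and out of a sample whenever the swap improves fit."""
    rng = random.Random(seed)
    sample = rng.sample(pool, sample_size)
    error = fit_error(sample, targets)
    for _ in range(n_iter):
        if error == 0:
            break
        i = rng.randrange(sample_size)   # household to swap out
        candidate = rng.choice(pool)     # household to swap in (repeats allowed)
        trial = sample[:i] + [candidate] + sample[i + 1:]
        trial_error = fit_error(trial, targets)
        if trial_error < error:          # keep the swap only if the fit improves
            sample, error = trial, trial_error
    return sample, error

# Hypothetical household pool: every size from 1 to 5 with 0..size-1 children.
pool = [
    {"persons": p, "children": c}
    for p in range(1, 6) for c in range(0, p)
] * 20
targets = {"persons": 250, "children": 80}  # invented small-area constraints
sample, error = combinatorial_optimization(pool, targets, sample_size=100)
print("remaining error:", error)
```

Published implementations often use simulated annealing rather than a strict hill climb, so that the search can escape local optima; the strict version is used here only to keep the sketch short.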

A sample-free method in synthetic reconstruction involves generating individual units and placing them into households or other groupings until the entire population is used. The method draws the individual's attributes at the most disaggregated level from joint distributions. After all the individuals are generated, the population is compared to the joint distributions and inconsistencies are handled by shifting attribute values. The last step is to gather the individuals into households or groupings. Look for this technique to be applied to migratory groups, areas undergoing rapid change and other underrepresented, marginalized populations.
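A very reduced sketch of the sample-free steps (draw individuals, correct inconsistencies, then group) might look like the following. The joint distribution, the shifting rule and the household grouping rule are all simplifying assumptions for illustration and do not reproduce any published algorithm.

```python
import random
from collections import Counter

# Hypothetical joint distribution over (age group, sex); values are invented.
joint = {
    ("child", "F"): 0.12, ("child", "M"): 0.13,
    ("adult", "F"): 0.38, ("adult", "M"): 0.37,
}

def generate_individuals(joint, n, seed=0):
    """Draw individuals, then shift attribute values to remove inconsistencies."""
    rng = random.Random(seed)
    cells = list(joint)
    people = rng.choices(cells, weights=[joint[c] for c in cells], k=n)
    targets = {c: round(joint[c] * n) for c in cells}
    counts = Counter(people)
    # Move individuals from over-represented cells to under-represented ones.
    for idx, cell in enumerate(people):
        if counts[cell] > targets[cell]:
            short = [c for c in cells if counts[c] < targets[c]]
            if not short:
                break
            people[idx] = short[0]
            counts[cell] -= 1
            counts[short[0]] += 1
    return people, targets

def group_into_households(people, seed=0):
    """Toy grouping rule: pair up adults, then place each child at random.

    Assumes the population contains at least one adult.
    """
    rng = random.Random(seed)
    adults = [i for i, (age, _) in enumerate(people) if age == "adult"]
    children = [i for i, (age, _) in enumerate(people) if age == "child"]
    rng.shuffle(adults)
    households = [adults[i:i + 2] for i in range(0, len(adults), 2)]
    for child in children:
        rng.choice(households).append(child)
    return households

people, targets = generate_individuals(joint, n=1000)
households = group_into_households(people)
print(len(households), "households")
```

A real generator would draw household structure from its own distributions (size, composition, age gaps) instead of the toy pairing rule used here.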

Populations and Generator Tools:
Population generative tools are available now with existing synthetic populations or for use on new population data.

Uses Iterative Proportional Updating (IPU) that, unlike IPF, controls for both the agent and agent grouping at the same time. Used for creating realistic human populations for prediction of anatomical, physiological and phase 1 metabolic variation in the population in response to exposure or dosage.

PopGen for SimTRAVEL
Adds person-level attributes in addition to census data distributions in population synthesis. These populations are designed for application to urban planning and analysis of transportation, routes, activities, vehicles, emissions and land-use. Arizona State University (ASU) is integrating it into UrbanSim.

Synthetic Populations and Ecosystems of the World (SPEW)
Provides a synthetic population and ecosystem from available data of over 80 countries in American Community Survey (ACS), International Public Use Microdata Samples (IPUMS) and other population samples using simple random sampling. Carnegie Mellon University (CMU) plans to add moment matching and iterative proportional fitting in future versions. Moment matching is a statistical technique used to estimate population parameters by deriving equations that describe the population characteristic's expected mean.

Virginia Bioinformatics Institute Synthetic Data 
Synthetic populations of Portland, Oregon; Montgomery, Virginia; West Africa; and Washington, D.C. that have been applied to studies of infectious disease, incarceration rates, and emergency management.

RTI U.S. Synthetic Household Population
Provides a representation of households and persons in U.S. populations from publicly available data sources. These data are placed on a map and represent distribution variations within census blocks. Used for representations of the demographic characteristics of a population including age, gender, race, income, and educational attainment. The map and underlying data are available online for free to track infectious disease, study transportation networks or optimize supply chains.

The data for SPEW, the RTI U.S. Synthetic Household Population and other populations from the Models of Infectious Disease Agent Study (MIDAS) can also be found here:

Further References:
Arentze, Theo, Harry Timmermans, and Frank Hofman. 2007. “Creating Synthetic Household Populations: Problems and Approach.” Transportation Research Record: Journal of the Transportation Research Board 2014 (December): 85–91.

Barthelemy, Johan, and Philippe L. Toint. 2013. “Synthetic Population Generation Without a Sample.” Transportation Science 47 (2): 266–79. doi:10.1287/trsc.1120.0408.

Beckman, Richard J., Keith A. Baggerly, and Michael D. McKay. 1996. “Creating Synthetic Baseline Populations.” Transportation Research Part A: Policy and Practice 30 (6): 415–29.

Deming, W. Edwards, and Frederick F. Stephan. 1940. “On a Least Squares Adjustment of a Sampled Frequency Table When the Expected Marginal Totals Are Known.” The Annals of Mathematical Statistics 11 (4): 427–444.

Huang, Zengyi, and Paul Williamson. 2001. “A Comparison of Synthetic Reconstruction and Combinatorial Optimisation Approaches to the Creation of Small-Area Microdata.” Department of Geography, University of Liverpool.

McNally, Kevin, Richard Cotton, Alex Hogg, and George Loizou. 2014. “PopGen: A Virtual Human Population Generator.” Toxicology 315 (January): 70–85.

Müller, Kirill, and Kay W. Axhausen. 2010. “Population Synthesis for Microsimulation: State of the Art.” In . Monte Verità, Ascona, Switzerland: ETH Zürich, Institut für Verkehrsplanung, Transporttechnik, Strassen-und Eisenbahnbau (IVT).

Williamson, Peter, Michael Birkin, and Phillip H. Rees. 1998. “The Estimation of Population Microdata by Using Data from Small Area Statistics and Samples of Anonymised Records.” Environment and Planning A 30 (5): 785–816.

Wise, Sarah. 2014. “Using Social Media Content to Inform Agent-Based Models for Humanitarian Crisis Response.” George Mason University.

Tuesday, September 12, 2017

Hurricane Response & Crisis Mapping

Crowdsourced, shareable maps for humanitarian relief efforts are becoming the new normal.

Three examples:
Harvey Relief
Irma Response
US Wildfires

These maps include layers of geospatial data as KML files that can be downloaded for analysis. The Harvey Relief map includes crowdsourced location data to help individuals find aid.
The Google Crisis Map layers include links to the original source KML data.
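Once a KML layer is downloaded, its points can be read with Python's standard library. The sketch below is a minimal example under assumptions: the embedded KML document and the placemark in it are invented stand-ins, not data from any of the maps above, and only simple `Point` placemarks are handled.

```python
import xml.etree.ElementTree as ET

# Standard KML 2.2 namespace.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Invented stand-in for a downloaded layer; a real file would be read from disk.
sample_kml = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example shelter</name>
      <Point><coordinates>-95.3698,29.7604,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""

def read_placemarks(kml_text):
    """Return name, longitude and latitude for each point placemark."""
    root = ET.fromstring(kml_text)
    placemarks = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext(
            ".//kml:Point/kml:coordinates", default="", namespaces=KML_NS
        )
        if not coords.strip():
            continue  # skip lines, polygons and other non-point features
        # KML stores coordinates as longitude,latitude[,altitude].
        lon, lat = (float(value) for value in coords.strip().split(",")[:2])
        placemarks.append({"name": name, "lon": lon, "lat": lat})
    return placemarks

placemarks = read_placemarks(sample_kml)
print(placemarks)
```

Note the longitude-first order in KML, which is the reverse of the latitude-first convention many mapping tools display.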

These maps can be used to inform humanitarian efforts around the globe.

Thursday, August 24, 2017

Solutions to bias in the modern scientific process

There is wide recognition of inherent bias problems in the modern scientific process. I attended a good talk by Brian Nosek today discussing the challenges, barriers and solutions. Among the challenges in the scientific process are flexibility in the analysis of data, selective reporting, ignoring null results and a lack of replication. Some of these problems can be traced to basic human behavior and psychology, including perceived norms, motivated reasoning, minimal accountability, and the fact that people are just plain busy.

Nosek proposes that the solution to these problems is to show the work and share it early in the research cycle. Signals and incentives that make this behavior visible are necessary.

Towards that end here are a few efforts to address these problems:

Open Science Framework -- tools for sharing work across the research cycle

Registered reports -- peer review after research design, before collection and analysis

Tracking switched outcomes in trials:

For computation and data science -- code sharing:

Friday, August 18, 2017

Agent-Based Models and Social Network Analysis

Agent-based models (ABMs) are computational models consisting of heterogeneous agents programmed with decision-making heuristics and learning abilities. Interactions between these agents and their simulated environment result in adaptation and emergent behavior. The integration of ABMs and social network analysis provides the opportunity to experiment with the effects of social influence in individual decision-making and emergent social processes, but few ABMs include social networks.
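To make the combination concrete, here is a toy sketch of an ABM in which agents decide based on what their network neighbours have done. It is an assumed example, not a model from the literature: the ring network, the threshold heuristic, the parameter values and the function name are all invented for illustration.

```python
import random

def run_threshold_model(n_agents=100, k=4, n_seeds=5, steps=20, seed=0):
    """Agents adopt a behaviour once enough of their neighbours have adopted it."""
    rng = random.Random(seed)
    # Social network: each agent is tied to its k nearest agents on a ring.
    neighbours = {
        i: [(i + d) % n_agents for d in range(-(k // 2), k // 2 + 1) if d != 0]
        for i in range(n_agents)
    }
    # Heterogeneous agents: each has its own adoption threshold.
    thresholds = [rng.uniform(0.1, 0.6) for _ in range(n_agents)]
    adopted = [False] * n_agents
    for i in rng.sample(range(n_agents), n_seeds):
        adopted[i] = True
    history = [sum(adopted)]
    for _ in range(steps):
        updated = adopted[:]
        for i in range(n_agents):
            if not adopted[i]:
                share = sum(adopted[j] for j in neighbours[i]) / len(neighbours[i])
                if share >= thresholds[i]:  # decision heuristic: social influence
                    updated[i] = True
        adopted = updated
        history.append(sum(adopted))
    return history

history = run_threshold_model()
print(history)
```

Swapping the ring for a different network structure changes how far and how fast the behaviour spreads, which is the kind of experiment the integration of ABMs and social network analysis makes possible.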

A review of the literature does reveal a growing body of social network experimentation in ABMs:

Table: Social Networks in Agent Based Models

Monday, February 27, 2017

Foundations: The Halifax Disaster

Disaster at the intersection of Complex Adaptive Systems - 
The Halifax Disaster in Three Parts

The study of disasters provides a context to understand how individuals, families and social systems operate under extreme stress, how to respond in disrupted social systems, and what can be done to aid those harmed by the disasters. Within these events individuals and groups under extreme stress are observed to demonstrate the best and worst of humanity, and the study of behaviors in these events provides insights into individual and community coping mechanisms. In disaster, human behavior is a response to events occurring at the intersection of three sets of interacting systems: the socio-ecological system, the system of collective social behavior, and the individual actor’s cognitive system. Each of these systems changes and adapts in response to the others as part of a complex adaptive system of systems. On its own, the conception of disaster as a disruption of intersecting complex adaptive systems is too abstract to improve understanding. Instead, real-world case studies must be used to provide context, enable process tracing and identify causation. Samuel Prince’s classic dissertation study of a catastrophe documents the foundational features of disaster and provides evidence for the three interacting systems with behaviors from the environment, groups and individuals.

Source: Maritime Museum, Nova Scotia Canada

Both as a study and a story the event is intrinsically compelling, with an improbable series of disruptions provided by war, earthquake, fire, flood, famine and storm (Prince 1920, p.28). In December 1917, after a failure in ship signaling, a French munitioner transporting explosives collided with an empty Belgian relief vessel in the ocean terminal of Halifax, Canada. The resulting explosion brought World War I to the shores of Canada, shaking the ground, showering soot, oil, water and shrapnel, and producing a tidal wave that flooded the City of Halifax. It triggered gas explosions and a fire that wiped out the north end of the city just before a series of winter storms brought blizzards, wind, rain, flood and freezing temperatures to the area. The event impacted the entire social and environmental system of approximately 50,000 people and forced every individual and social organization to respond in a massive engagement of social change. Globally, aid funding poured in from China, New Zealand, Great Britain and the United States (Ruffman and Findley, 2007), and locally the everyday activities of individuals, families, businesses and government were suspended as citizens in desperate need struggled to deal with the losses: 2,000 dead, 6,000 injured, 10,000 homeless, and ~$35 million in property destruction (Prince 1920, p. 26).

Source: Nova Scotia Museum, Canada

Over a period of days and months the individuals, families, social organizations and the larger socio-ecological system were overwhelmed by necessary, non-routine activities required to survive and recover. Individuals struggled through varying physical and emotional states to make decisions and act in adaptation to the disaster. Prince found that the people of Halifax, known as Haligonians, experienced varying degrees of shock, helplessness, hallucination, delusion, fear, grief, sorrow, kindliness, sympathy, heroism, instinct, self-help, mutual aid, blame, scapegoating, and primitive instincts. At the time of the impact people fled out of a sense of self-preservation or fought to prevent the explosion and fires. After the impact people were engaged in searches for loved ones, rescue and aid, or recovery from injury. Individuals who were physically and mentally able and who held relevant social roles (community leaders or “big men” of civic and philanthropic work, public utility workers, medical professionals and social specialists) put tremendous energy into the response efforts.
Source: Halifax Regional Municipality

Within hours of the explosion existing social organizations like the military, play actors and firemen adapted their activities to become first responders. Refugees informally grouped together for a sense of security and safety, and other groups emerged in imitation to provide improvised shelter and clothing and food depots. Formal social organizations resumed activities more slowly, the first of these being the public utilities providing telegraph service, gas and lighting, and rail transportation. In the three days following the explosion newspapers, the postal service and banks reopened, and ‘social specialists’ began to converge on the city to provide medical and rehabilitative care. Regular city council meetings recommenced after two weeks, and the first businesses reopened four to five weeks after the explosion. Special laws were enacted to ensure the safety of Haligonians during the time of disaster recovery. A quickly appointed investigative task force attempted to find the cause of the explosion and fix responsibility. Responsibility for the explosion was not asserted, but the principles of restitution and indemnity in disaster were formalized (Prince 1920, p.95). The focus of energy and funds on rebuilding Halifax led to an acceleration of city growth. It took only three months to clear the explosion debris from devastated areas by using 950 men and 270 horses working 10 hours a day (Prince 1920, p.78). Such recovery was only possible through the collective efforts of not only the city's own citizens, but also the contributions from more than 200 cities around the world (Prince 1920, p.78).

"McCall Apartments" from rebuilding.
The environment and the human-built system set the context in Prince’s study. At the time of the disaster Halifax was a developing military town and geographically situated in the Northeast of Canada with a narrow ocean strait used as a through-way for ocean-going commerce. Thus situated, the city was vulnerable to accidents caused by increases in ocean traffic and the transportation of lethal cargo. After the disaster new laws were enacted to improve safety at sea and the inspection, control and handling of explosives in harbors (Prince, p.109; Dynes and Quarantelli, 1993). The city and its ocean piers were also rebuilt to better accommodate future commerce. City planning and improvement efforts widened streets and rezoned land for commercial, public use and residential housing. The disaster became the impetus for an acceleration of the development of the city from a sleepy military town to a regional maritime commercial hub in a larger network of commercial ports. Prince concluded that the disaster had created a period of fluidity in which social change could be achieved, and he was able to document these changes with new social organizations, coordinating institutions for medical and social care, new social legislation, and improvements in the built-environment of Halifax.

As disaster studies expanded, the Halifax disaster and Prince’s study became a template for the disaster experience, onto which academics later mapped empirical data and identified human behavior in times of stress (Fritz and Mathewson, 1957; Tyhurst, 1957; Wallace, 1956; Carr, 1932). Prince lays out the groundwork findings of individual, collective and environmental system behaviors that show how the destruction of disasters creates social vacuums in which new behaviors and social organizations emerge. His work can also be analyzed as an intersection of individual cognitive systems that trigger emotional and rational behaviors, groups that emerge to collectively respond to the environmental threats, and the environment that provided a disruption triggered by human error. Although it is impossible to trace the infinite number of behavioral causations, the aggregation of these behaviors results in a set of system interactions that can be analyzed to provide greater understanding of the processes at work in disaster.


Carr, Lowell Juilliard. 1932. “Disaster and the Sequence-Pattern Concept of Social Change.” American Journal of Sociology, 207–218.

Dynes, Russell R., and Enrico Louis Quarantelli. 1993. “The Place of the 1917 Explosion in Halifax Harbor in the History of Disaster Research: The Work of Samuel H. Prince.”

Fritz, Charles E., and J.H. Mathewson. 1957. “Convergence Behavior in Disasters; a Problem in Social Control.” Publication 476. Washington, D.C.: Committee on Disaster Studies National Academy of Sciences - National Research Council. Library of Congress. 

Prince, Samuel Henry. 1920. “Catastrophe and Social Change: Based upon a Sociological Study of the Halifax Disaster.” New York: Columbia University. 

Ruffman, Alan, and Wendy Findley. 2007. “The Halifax Explosion.”

Tyhurst, J. S. 1957. “Psychological and Social Aspects of Civilian Disaster.” Canadian Medical Association Journal 76 (5): 385.

Wallace, Anthony F. C. 1956. “Human Behavior in Extreme Situations: A Survey of the Literature and Suggestions for Further Research.” 390. Disaster Study. National Academy of Sciences -- National Research Council. 56-60013.