1. Using Open 911 Data
One of the most significant things the Reimagine 911 team learned from using open data is how different one city is from the next. Our findings can help set your expectations whether you are hoping to explore a specific city or compare multiple cities.
“911” Data Doesn’t Come (Directly) from 911
The canonical source for 911 data is typically the computer-aided dispatch (CAD) system or records management system (RMS) used by the 911 center. However, this is rarely what is shared in public portals; at least, not in its original form.
This is because most open data portals are not funded and operated by the 911 centers themselves. More often they are operated by police departments and city data providers (see the 911 Open Data Providers section below for more). As a result, the data providers often have additional information at their disposal, and a broader interest that goes beyond 911 data.
In practice, two or more different agencies are often involved in preparing, filtering, and publishing open data. For researchers interested in exploring 911 data, there are three key considerations that need to be reconciled.
Non-Exclusive Datasets
The datasets we reviewed frequently included columns or rows that were not related to 911 calls. Some examples were:
Responder data about the incident: arrival times, disposition codes, etc.
Contextual location data: council districts, police beat, etc.
Supplemental information: parcel number, estimated damage cost, urban/rural flags, etc.
Inter-agency records: serving warrants, administrative requests, etc.
Nearly all datasets also included data from events further down the emergency response chain, or call-for-service data from other agencies. This can provide a richer, more contextual view into calls. However, it also requires a higher level of familiarity with the data if a researcher wishes to limit their questions to public 911 calls.
If data that exclusively represents 911 calls is important to you, some additional data cleaning may be necessary. Importantly, it can be difficult to isolate the public 911 call records in many cases. Some datasets include a column that indicates whether the call originated from 911, but this is somewhat uncommon in the datasets we reviewed.
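As a rough illustration of that cleaning step, the Python sketch below keeps only rows flagged as originating from 911 when such a column is present. The column names and values (`call_origin`, `source`, `how_received`, `"911"`, `"e911"`) are hypothetical; each portal labels these fields differently, and many do not provide them at all.

```python
from typing import Optional

# Hypothetical column names and values; real portals label these differently.
ORIGIN_COLUMNS = ("call_origin", "source", "how_received")
PUBLIC_911_VALUES = {"911", "e911", "911 call"}


def find_origin_column(rows: list[dict]) -> Optional[str]:
    """Return the first candidate origin column present in the data, if any."""
    if not rows:
        return None
    for name in ORIGIN_COLUMNS:
        if name in rows[0]:
            return name
    return None


def public_911_only(rows: list[dict]) -> Optional[list[dict]]:
    """Keep rows marked as originating from 911.

    Returns None when no origin column exists, because the dataset then
    cannot be assumed to contain only public 911 calls.
    """
    column = find_origin_column(rows)
    if column is None:
        return None
    return [
        row for row in rows
        if str(row.get(column, "")).strip().lower() in PUBLIC_911_VALUES
    ]
```

Returning `None` rather than the unfiltered rows is a deliberate choice in this sketch: it makes the "cannot tell" case explicit instead of silently treating a mixed dataset as 911-only.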
Non-Comprehensive Datasets
Researchers should generally assume that for any open 911 dataset, some call records will have been excluded from the public offering. The Open Data Review includes an “exclusions” column that captures the many open data policies that require filtering the public set of calls.
Some reasons the dataset may be incomplete include:
Calls not dispatched: general information, calls resolved without dispatch, etc.
Inter-agency calls: police initiated incidents, calls transferred between departments, etc.
Sensitive call types: sexual assault, suicide, domestic violence, homicide, etc.
Researchers may need to review the datasets they want to use carefully to understand which records are present and which are not.
Inconsistent Anonymization
Anonymization was highly inconsistent across the cities the Reimagine 911 team reviewed. The most commonly anonymized data was location information. The lat/long coordinates were sometimes truncated, and it was common to limit addresses to street blocks or intersections.
However, some providers chose not to anonymize or filter some surprisingly detailed information. Some datasets provide detailed location information, including the floor of the building that calls originated from. Others included the names of authorities that responded to a scene.
While all this data is public, it is up to the researcher to decide what is responsible to review and share—within the limits of the provider’s license (see our Data Licensing section for more information).
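For researchers who decide to coarsen location detail before sharing results, the two techniques mentioned above (truncated coordinates and block-level addresses) can be sketched in Python as follows. The function names, the three-decimal default, and the "BLOCK" output format are assumptions made for illustration, not a standard used by any provider we reviewed.

```python
import math
import re


def truncate_coordinate(value: float, places: int = 3) -> float:
    """Truncate (not round) a coordinate to a fixed number of decimal places."""
    factor = 10 ** places
    return math.trunc(value * factor) / factor


def to_block_address(address: str) -> str:
    """Replace a leading house number with its hundred block.

    Addresses without a leading house number (for example, intersections)
    are returned unchanged apart from surrounding whitespace.
    """
    match = re.match(r"^\s*(\d+)\s+(.*)$", address)
    if match is None:
        return address.strip()
    number, street = match.groups()
    block = (int(number) // 100) * 100
    return f"{block} BLOCK {street.strip()}"
```

Whether this level of coarsening is sufficient depends on the density of the area and the sensitivity of the call types involved, so treat it as a starting point rather than a guarantee of anonymity.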
Exploring Singular Cities Versus Comparing Multiple Cities
Your decision to work with open data may be very different depending on whether you are interested in exploring one city or comparing multiple cities. The Reimagine 911 project used open data to explore local 911 call types across multiple cities and we have a good understanding of the different considerations for each case:
Geographic flexibility improves data availability: Since the type and format of data vary so much across different cities, good data will generally be more available for researchers who are not committed to studying specific cities. The ability to analyze one or more specific cities is subject to the data made available by those cities.
Comparisons across many cities can be limited: As the number of cities being compared grows, the likelihood that they all share richer data drops. For example, census block data, responder type, and disposition codes can be very informative, but we found these in fewer than 25% of the cities we reviewed.
The 911 call type and call time are consistently available: Regardless of whether you are researching a single city or many cities, you can expect open data to provide the initial 911 call type and call time in each dataset. Most other data points are inconsistently available across cities.
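Because call type and call time are the fields most likely to exist everywhere, a multi-city comparison can start by mapping each city's columns onto those two shared fields. The Python sketch below shows one way to do that; the city keys, column names, and timestamp formats in `CITY_SCHEMAS` are invented for illustration and would need to be replaced with what each portal actually publishes.

```python
from datetime import datetime

# Hypothetical per-city column names and timestamp formats.
CITY_SCHEMAS = {
    "city_a": {"type": "Event Type", "time": "Call Time",
               "format": "%m/%d/%Y %H:%M"},
    "city_b": {"type": "nature", "time": "received_at",
               "format": "%Y-%m-%dT%H:%M:%S"},
}


def normalize(city: str, row: dict) -> dict:
    """Map one city's record onto a shared call type / call time schema."""
    schema = CITY_SCHEMAS[city]
    return {
        "city": city,
        # Only whitespace and case are normalized; the labels themselves
        # remain city-specific and are not directly comparable.
        "call_type": row[schema["type"]].strip().upper(),
        "call_time": datetime.strptime(row[schema["time"]], schema["format"]),
    }
```

Note that this only aligns the structure of the records. Call type labels remain local to each city, so comparing them still requires reviewing each city's terminology individually.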
The Onus of Understanding
The 911 data shared openly online is highly variable. Code books and specific descriptions of the data are rarely provided. Terminology also varies and is used inconsistently between different Emergency Communications Centers (ECCs).
This puts the onus of understanding 911 data on the researcher. The localized nature of each dataset means they need to be reviewed carefully and individually.
When to Use Open 911 Data
Despite the relatively low quality of open 911 data, it can still prove useful in many circumstances. Consider using open data in the following situations:
For your own education: If you are not making any public statements, but are seeking only to validate your own hypotheses, open 911 data may be valuable. Be sure you have taken the time to understand your data well.
Identifying the existence of a condition: Although the non-comprehensive nature of open 911 data makes it difficult to confirm the absence of a condition, open data may help you make a strong case for the presence of a condition.
Sharing informal observations: It may be appropriate to use open data for situations that are agreeable to being backed up with soft data.
Pursuing harder data: Using open data (with the appropriate disclaimers and attributions) may be an effective first step towards getting access to similar, cleaner datasets.
When to Avoid Using Open 911 Data
Given the issues we found with open 911 data, we know that using this information may not be appropriate for all scenarios. We recommend not using open 911 data if the following conditions apply:
Data must be exclusively 911 data: Nearly all of the open data we reviewed combined 911 information with information from dispatchers, responders, and municipal metadata. If the datasets you want to use do not clearly distinguish between 911 data and other records, you should assume that they are not exclusively 911 data.
Data must be comprehensive: Open data may not be suitable if missing 911 calls would be a problem for your research. If your research requires unfiltered data you will most likely need to coordinate with each individual Emergency Communications Center (ECC).
Direct data access can be arranged: If you have both time and pathways to access 911 data directly from the CAD system, that may be preferable despite the delay in arrangements. This route may suit academic researchers and researchers working with influential partners, and it may also be pursued through municipal or FOIA information requests.
Findings need a rigorous data foundation: If the data quality considerations discussed in this section would be unacceptable to the audience you are sharing your findings with, then it may not be valuable to use open 911 data.