Thread: STTR BAA: A machine learning application to seismology

Started: 2018-12-01 07:52:59
Last activity: 2018-12-01 07:52:59
2018-12-01 07:52:59
You may be interested in the upcoming AFRL STTR (Small Business Technology Transfer) opportunity STTR 19.A, "Machine Learning Methods to Catalog Sources from Diverse, Widely Distributed Sensors". The specific seismology topic relevant to nuclear explosion monitoring is appended below.

STTR programs are designed to encourage collaboration between small businesses and non-profit research institutions. The Broad Agency Announcement is at [link], the main Air Force STTR/SBIR website is at [link], and detailed instructions are at [link].

The solicitation will be open from January 8, 2019 through February 6, 2019. Phase I contract awards should be made by August 6, 2019. Staff in AFRL's Nuclear Explosion Monitoring R&D Program can respond to questions concerning the topic until January 7, 2019.

G. Eli Baker, 505 846-6070, glenn.baker.3<at>
Raymond Willemann, 505 853-4333, raymond.willemann<at>
Rick Schult, 505 846-6101, frederick.schult<at>
Kenny Ryan, 505 854-4758, kenneth.ryan<at>

AF19A-T012 TITLE: Machine Learning Methods to Catalog Sources from Diverse, Widely Distributed Sensors

OBJECTIVE: Develop algorithms to automate the processing of incoming data streams from a diverse and dynamically changing set of sensors to detect, locate, and classify seismic events and flag suspicious events (i.e. possible nuclear tests) for human analysts at extremely low miss rates and the lowest possible false alarm rates.

DESCRIPTION: The automatic system the Air Force uses to monitor the globe for nuclear test explosions processes incoming data streams from hundreds of seismic stations and arrays, including those of the Comprehensive Nuclear-Test-Ban Treaty Organization's (CTBTO) International Monitoring System (IMS). In near real time the system creates catalogs of seismic sources and identifies and flags suspicious events for human analysts. The system can be overwhelmed when large earthquakes are followed by thousands or tens of thousands of aftershocks. Similarly, current systems can neither process the orders of magnitude more signal detections nor dynamically incorporate the hundreds to thousands of new sensors that will be required to push detection and classification thresholds down to meet mission requirements. New algorithmic approaches are needed to meet this challenge, and the most promising are machine learning (ML) methods: the seismic data volumes involved are appropriate for ML, and the availability of high-performance computing (HPC) resources makes their processing feasible.

The first challenge is to apply ML methods to rapidly and accurately automate the recognition of similar signals in very large data sets (e.g. Yoon et al., 2015). Most observed seismic signals are repeats (repeating earthquakes, aftershocks, and mining explosions), and recognizing them will speed processing by simplifying the association problem: associating all signals across the network with the set of hypothesized events most likely to have generated them is NP-hard (e.g. Arora et al., 2013; Benz et al., 2015). Any signal identified as similar to signals associated with a previously identified event can immediately be associated with a similar repeat event. In addition, the system will need to autonomously identify sets of common nuisance signals generated near stations (e.g. ice quakes at high latitudes, sonic booms near military airfields) and use those to cull signals that are not of monitoring interest.
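
The similarity-search idea above can be sketched with simple template matching: slide a waveform from a previously cataloged event along a continuous record and flag windows whose normalized cross-correlation exceeds a threshold. This is a minimal illustration with synthetic data, not the method any particular system uses; all names, the wavelet, and the threshold are assumptions for the demo.

```python
import numpy as np

def match_template(record, template, threshold=0.85):
    """Return sample indices where the normalized cross-correlation
    (Pearson correlation) between template and record window exceeds
    the threshold."""
    n = len(template)
    t = (template - template.mean()) / (template.std() * n)
    hits = []
    for i in range(len(record) - n + 1):
        w = record[i:i + n]
        s = w.std()
        if s == 0:
            continue  # flat window, correlation undefined
        cc = np.dot(t, (w - w.mean()) / s)
        if cc >= threshold:
            hits.append(i)
    return hits

# Synthetic demo: embed two copies of a tapered wavelet in noise.
rng = np.random.default_rng(0)
wavelet = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 100)) * np.hanning(100)
trace = 0.1 * rng.standard_normal(2000)
trace[300:400] += wavelet
trace[1500:1600] += wavelet
detections = match_template(trace, wavelet)
```

Here `detections` clusters around the two embedded copies; production-scale systems replace the quadratic scan with indexed search (e.g. the fingerprinting approach of Yoon et al., 2015) to handle years of multi-station data.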

Upon detection of new signals that are not similar to previous signals, the system must then distinguish the signal type (e.g. identify the seismic phase), determine the event location most likely to have generated that signal and other signals recorded across the network (determining which signals are associated with each other, and with a single hypothesized event, is the NP-hard problem), and discriminate the source type. The method must also robustly adapt to changes in network configuration and components. Instruments will range from permanent, high-fidelity, three-component seismometer arrays dedicated to and designed for nuclear explosion monitoring to individual sensors operating for other purposes (e.g. seismic hazard monitoring) that can be opportunistically added "on the fly" in regions of interest.
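
A toy version of the association/location step shows why the two problems are coupled: hypothesize candidate source locations, predict arrival times at each station, and keep the hypothesis that explains the most picks. The flat 2-D geometry, uniform velocity, and grid search below are illustrative simplifications; operational associators such as NET-VISA and GLASS use far richer probabilistic models.

```python
import itertools
import numpy as np

V = 6.0  # assumed uniform P-wave speed, km/s (illustrative)

def locate(stations, picks, grid, tol=0.5):
    """Grid-search candidate epicenters; return (best_xy, origin_time,
    matched_pick_indices) for the point explaining the most picks
    within tol seconds."""
    best = (None, None, [])
    for x, y in grid:
        tt = [np.hypot(x - sx, y - sy) / V for sx, sy in stations]
        # Estimate origin time as the median of (pick - travel time).
        t0 = float(np.median([p - t for p, t in zip(picks, tt)]))
        matched = [i for i, (p, t) in enumerate(zip(picks, tt))
                   if abs(p - (t0 + t)) < tol]
        if len(matched) > len(best[2]):
            best = ((x, y), t0, matched)
    return best

# Synthetic network: 4 stations, true source at (30, 40) km, origin at 10 s.
stations = [(0, 0), (100, 0), (0, 100), (100, 100)]
true_xy, true_t0 = (30.0, 40.0), 10.0
picks = [true_t0 + np.hypot(true_xy[0] - sx, true_xy[1] - sy) / V
         for sx, sy in stations]
grid = list(itertools.product(np.arange(0, 101, 10), repeat=2))
xy, t0, matched = locate(stations, picks, grid)
```

The indices in `matched` are the association: picks consistent with the winning hypothesis are assigned to that event, and the combinatorial cost of doing this jointly for many simultaneous events is the source of the NP-hardness noted above.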

PHASE I: Deliver a final report that 1) evaluates the performance of existing machine learning algorithms applied to components of the network processing problem, including signal detection, identification of repeated similar events, signal classification, event building (i.e. determination of the likeliest set of seismic events that could be the source of the detected signals), and event location and classification, and 2) lays out a plan and rationale for further algorithm refinement, and incorporation of the algorithms into a system that will efficiently process incoming data streams from a dynamic network to accurately identify signals of interest (i.e. possible nuclear tests).

PHASE II: Develop an end-to-end system that incorporates refined versions of the algorithms tested in Phase I. Demonstrate that the new system substantially outperforms existing state-of-the-art systems (e.g. those of the CTBTO or the US Geological Survey's National Earthquake Information Center [NEIC]; the catalogs of both, and the underlying data, are available) with respect to signal detection and classification, and event formation, location, and classification. The system should be validated with real data streams (e.g. from the IMS or NEIC networks). Performance metrics should include miss rates and false alarm rates relative to catalogs that have been vetted by human analysts. The system's ability to adapt on the fly to new data sources should be validated by the addition and deletion of data streams from stations not typically used for monitoring, such as those of regional hazard monitoring networks.
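
The two metrics named above can be made concrete by matching an automatic catalog against an analyst-vetted one. Matching on origin time alone, as below, is a simplification (real catalog scoring also compares location and magnitude); the time window and example numbers are assumptions for illustration.

```python
def score_catalog(auto_times, vetted_times, window=10.0):
    """Return (miss_rate, false_alarm_rate): the fraction of vetted
    events with no automatic event within `window` seconds, and the
    fraction of automatic events matching no vetted event."""
    unmatched_auto = sorted(auto_times)
    hits = 0
    for v in vetted_times:
        match = next((a for a in unmatched_auto if abs(a - v) <= window), None)
        if match is not None:
            unmatched_auto.remove(match)  # each automatic event matches once
            hits += 1
    miss_rate = 1.0 - hits / len(vetted_times)
    false_alarm_rate = len(unmatched_auto) / len(auto_times)
    return miss_rate, false_alarm_rate

# Example: 4 vetted events; the automatic system finds 3, plus 2 spurious.
vetted = [100.0, 500.0, 900.0, 1300.0]
auto = [102.0, 498.0, 905.0, 2000.0, 2500.0]
miss, fa = score_catalog(auto, vetted)
# miss = 0.25 (1 of 4 vetted events missed)
# fa = 0.4 (2 of 5 automatic events are false alarms)
```

The topic's asymmetric goal, extremely low miss rates at the lowest achievable false alarm rate, means candidate systems should be compared along the full trade-off curve between these two quantities, not at a single operating point.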

PHASE III DUAL USE APPLICATIONS: In coordination with scientists and engineers from AFRL and AFRL's operational customer, the Air Force Technical Applications Center (AFTAC), transition the system to AFTAC for further evaluation and testing with AFTAC's data. Delivery must include thorough documentation, a users' manual, case examples, support, and training to ensure effective transition. The system will also have commercial application in regional and national networks used to monitor seismic hazards, volcanoes, and induced seismicity (e.g. from mining, geothermal energy production, and fracking).

REFERENCES:
1. Yoon, C. E., O. O'Reilly, K. Bergen, and G. Beroza (2015), Earthquake detection through computationally efficient similarity search, Science Advances, 1(11), doi:10.1126/sciadv.1501057.
2. Arora, N. S., S. Russell, and E. Sudderth (2013), NET-VISA: Network Processing Vertically Integrated Seismic Analysis, Bulletin of the Seismological Society of America, 103(2A), doi:10.1785/0120120107.
3. Benz, H., C. E. Johnson, J. M. Patton, N. D. McMahon, and P. S. Earle (2015), GLASS 2.0: An Operational, Multimodal, Bayesian Earthquake Data Association Engine, American Geophysical Union Fall Meeting 2015, abstract S21B-2687.

KEYWORDS: Nuclear explosion monitoring, machine learning, similarity search
