SAGE: Data Services Products: Noise Toolkit PDF-PSD

Summary

An open-source script bundle that contains a series of Python scripts to:

request waveforms and response data for given station/channels using the ObsPy FDSN client
OR read user’s waveform data files (in SAC, MSEED, CSS, etc. format) and only request response information
compute and populate an hourly file-based PSD database
extract PSDs and PDFs from the PSD database

Quicklinks

Data Product pages:

Citation & DOIs:

Citations

Description

Information on noise levels at seismic stations, obtained from Power Spectral Densities (PSD) of waveform data, has been used routinely to characterize the current and past performance of the stations and for detecting potential operational and/or instrumental station problems. The station PSDs and PDFs are computed and distributed via DMC’s MUSTANG data quality metrics web service. MUSTANG computes over 40 separate metrics and can return measurements to you in XML, text, CSV, and JSON formats using a REST-ful web service interface.

As the role of PSDs in studying background noise is expanding beyond the traditional station quality control objectives, a more customized computation of PSDs is desired. Noise Toolkit PDF/PSD bundle attempts to address this need by providing an open-source bundle of 3 Python scripts that are highly configurable (GitHub Repository). Although this package takes advantage of modern Web Services to retrieve necessary waveform data, but it also allows users to process waveform data that they have locally available. This package provides PSD file collections similar to popular PQLX package (McNamara and Boaz, 2005) and therefore it is compatible with existing user programs (see the PDF/PSD Package Wiki page). The scripts included in this bundle are:

ntk_computePSD.py – a Python script to request waveforms and response data for given station(s) using the ObsPy FDSN client
OR to read user’s waveform data files (in SAC, MSEED, CSS, etc. format), compute PSDs and populate a file-based PSD database
ntk_extractPsdHour.py – a Python script to extract PSDs for a given channel and bounding parameters from the PSD database. The output is similar to PQLX’s exPSDhour script
ntk_binPsdDay.py – a Python script to bin PSD’s to daily files for a given channel and bounding parameters.

PSD Computation

PSD computation is done by running the ntk_computePSD.py script. The script reads its run-time parameters from a file provided by user in addition to its command line arguments. A typical command for running this script and creating a PSD plot as a function of frequency is as follows:

ntk_computePSD.py param=computePSD net=NM sta=SLM loc=DASH
start=2009-01-01T01:00:00 end=2009-01-01T02:00:00 type=frequency 
mode=plot

where param=computePSD specifies the file containing run-time parameters and type=frequency specifies that sampling should be done in the frequency domain. The mode=plot instructs the script to display the computed PSDs.

PSD computation flow chart is shown in Figure 1 (click on the image for a larger view). At each stage the computation is controlled by the parameter file. Waveform data are either requested directly through the FDSN web services or are read directly from user data files. For the latter, a call is made to the response services to obtain the instrument response information.

Figure 2 shows a set of PSDs computed by ntk_computePSD.py script. The PSDs are for the broadband channels of station NM.SLM during the second hour of 2009-01-01 to match a similar plot presented by Koper and Hawley (2010).

Measures of power as a function of frequency — **Figure 2.** Computed power for **NM.SLM** on 2009-01-01
(left) Koper and Hawley, 2010 and (right) **ntk_computePSD.py**.

In figure 2, the 3 panels on the left (in black) from top to bottom represent power for the BHZ, BHE and BHN channels as presented by Koper and Hawley (2010). A similar 3 panels on the right (in blue) show PSDs computed by ntk_computePSD.py for the same time window (click on the image for a larger view). These PSDs are computed using an hour window that is divided into 15 subwindows (each 409.6 s long) with 50% overlap.

Smoothing

Once PSDs are computed, a smoothing algorithm is applied to reduce the number of samples and overall PSD storage requirements (McNamara and Boaz, 2005). The width of the smoothing window is user selectable and should be selected to preserve a desired level of PSD details. Figure 3 is modified version of figure 4 from the sacpsd document that includes the compatible Noise Toolkit smoothed results. In this figure the yellow and blue-green curves are from the original figure and represent the results from running sacpsd and the McNamara and Buland (2004) codes respectively for NM.SLM 2009-11-01 11:00 to 12:00 UTC. The red, blue and green curves are produced by the Noise toolkit for the BHZ channel PSD of figure 2 using smoothing parameters indicated:

MB 2004 – McNamara and Buland (2004) code from sacpsd document
sacpsd 5pt – sacpsd using 5-point averaging
NTK 4-8 – NTK using 1/4 octave of data around each extracted frequency (at 1/8 octave intervals) being the most detailed, however with some gaps at lower frequencies

NTK 2-8 – NTK using 1/2 octave of data around each extracted frequency (at 1/8 octave intervals) showing reasonable level of details, with the most gaps eliminated
NTK 1-8 – NTK using 1 octave of data around each extracted frequency (at 1/8 octave intervals) similar to MB 2004

PSD smoothing — **Figure 3.** smoothing of the computed PSD of the BHZ channel of
NM.SLM (Figure 2), using different smoothing techniques and
smoothing window width. See text for more details.

Storing PSDs

The smoothed PSDs are stored in individual files under the data directory defined by user in the parameter file. Using the default parameters, the daily PSD file storage requirement for 48 one hour windows with 50% overlap per day is about 1.5 MB per channel. The path to the PSD storage area has the following format:

DATA/psdDb/network.station.location/channel/yyyy/jjj/

psdDb - is the top directory for the PSDs
network - is the network code
station - is the station code
location - is the location ID 
channel - is the channel code
yyyy - is the year of the data file 
jjj - is the Julian day of the data file

Each PSD file is clearly named with its start time and window length:

network.station.location.channel.yyyy-mm-ddThh:mi:ss.ssssss.length.type.txt
mm - 2 digit month
dd - 2 digit day of the month
hh:mm:sss.ssssss - UTC time 
length - length of the window in seconds
type - x-axis type (frequency or period)

NOTE: type indicates whether sampling done in frequency or period domain. Rather than converting period to frequency, ntk_computePSD.py selects samples directly in the requested domain, therefore resulting in equally spaced samples in the log space of the selected domain.

Extracting PSDs

ntk_extractPsdHour.py script extracts PSDs for a given channel and bounding parameters from the PSD database (similar to PQLX’s exPSDhour script). The file containing the extracted PSDs is stored under:

DATA/PSD/network.station.location/channel/

PSD - is the top directory for the PSDs
network - is the network code
station - is the station code
location - is the location ID 
channel - is the channel code

with a file name that includes the search window start and end times:

network.station.location.channel.start.end.txt
network - is the network code
station - is the station code
location - is the location ID 
channel - is the channel code
start - is the start of the PSD selection window (as yyyy-mm-ddThh:mi:ss)
end - is the end of the PSD selection window (as yyyy-mm-ddThh:mi:ss)

where

mm - 2 digit month
dd - 2 digit day of the month
hh:mm:sss - UTC time

Bin PSDs

ntk_binPsdDay.py bins PSD’s to daily files for a given channel and bounding parameters. The bins are stored as:

DATA/PDF/network.station.location/channel/Yyyyy/Djjj.bin

where

PDF - is the root directory of the binned PSD data files
network - is the network code
station - is the station code
location - is the location ID 
yyyy - is the year of the data file 
jjj - is the Julian day of the data file

Djjj.bin File Format

Djjj.bin is a cumulative “bin” file containing the hour long PSD measurements for an entire day, where jjj is the julian day for any given file.

 FREQUENCY POWER #HITS 
 FREQUENCY POWER #HITS 
 FREQUENCY POWER #HITS 
 FREQUENCY POWER #HITS 
 FREQUENCY POWER #HITS 
 .
 .
 .
 .

where:

FREQUENCY is the frequency in Hertz
POWER is the power bin in (dB)
#HITS is the number of times

Citations and DOIs

To cite the IRIS DMC Data Products effort:

Hutko, A. R., M. Bahavar, C. Trabant, R. T. Weekly, M. Van Fossen, T. Ahern (2017), Data Products at the IRIS‐DMC: Growth and Usage, Seismological Research Letters, 88, no. 3, https://doi.org/10.1785/0220160190.

To cite the IRIS DMC The IRIS DMC Noise Toolkit or reference use of its bundles:

IRIS DMC (2014), Data Services Products: The IRIS DMC Noise Toolkit, https://doi.org/10.17611/DP/NTK.1.

To cite the source or reference the use of the Noise Toolkit PDF/PSD bundle:

IRIS DMC (2014), Data Services Products: Noise Toolkit PDF-PSD Noise Toolkit PDF/PSD bundle, https://doi.org/10.17611/DP/NTK.2.

References

Koper K.D. and V.L. Hawley, “Frequency dependent polarization analysis of ambient seismic noise recorded at a broadband seismometer in the Central United States”, Earthquake Science, 23, 439-447, 2010.

McNamara D.E. and R.I. Boaz, Seismic Noise Analysis System, Power Spectral Density Probability Density Function: Stand-Alone Software Package, United States Geological Survey Open File Report, NO. 2005-1438, 30p., 2005.

McNamara, D.E. and R.P. Buland, “Ambient noise levels in the continental United States”, Bull. Seism. Soc. Am. 94, 1517-1527, 2004.

Credits

IRIS DMC Products Team
Keith Koper (University of Utah)
Rob Anthony and Rick Aster (New Mexico Tech and Colorado State University)

Contact

Data Help

ABOUT SAGE

COMMUNITY

EVENTS

PUBLICATIONS

SAGE ORGANIZATION

Data Services Products: Noise Toolkit PDF-PSD Noise Toolkit PDF/PSD bundle