Detecting Technical Anomalies in Water-Quality Data From River Networks

Abstract

Confidence in sensor data is one of the most important requirements for any sound environmental analysis. Anomalies in water-quality data from in situ sensors that are caused by technical faults can impair data quality and directly affect the inferences drawn from subsequent analysis. This work focuses on such technical anomalies, which we define as observations with an unexpectedly low probability density. In this talk, we'll first go over the types of technical anomalies found in water-quality sensor data collected at different geographic locations within a river network. Second, we will discuss why different types of conditioning information, such as contemporaneous downstream observations, lagged downstream observations, and upstream observations at the time the conditional correlation is maximized, are important for improving technical anomaly detection in river networks. Third, we will introduce a novel framework that detects anomalies in water-quality data based on the conditional cross-correlation between neighboring sensors that are in close proximity and connected by flow. An approach based on extreme value theory is used to calculate a data-driven anomaly threshold. This approach successfully identified both high-priority and low-priority anomalies involving drifts and abrupt changes, including sudden spikes, sudden isolated drops, and level shifts, while maintaining very low false detection rates. The framework was evaluated using data from in situ sensors in Pringle Creek, one of the NEON (National Ecological Observatory Network) aquatic sites, located in Wise County, Texas. The key functionalities of the framework, implemented in the open-source R package conduits, will also be demonstrated during the talk.
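To make the two main ingredients of the abstract concrete, the following is a minimal Python sketch, not the conduits implementation. It assumes a simulated upstream series and a downstream series that follows it after a fixed delay, picks the lag that maximizes the plain (unconditional) cross-correlation as a stand-in for the conditional cross-correlation described above, and sets a peaks-over-threshold cutoff under the simplifying assumption of an exponential tail. All function names, parameters, and data here are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_lag(upstream, downstream, max_lag):
    """Lag (in time steps) at which upstream correlates best with downstream."""
    n = len(upstream)
    scores = {
        lag: pearson(upstream[: n - lag], downstream[lag:])
        for lag in range(max_lag + 1)
    }
    return max(scores, key=scores.get)


def pot_threshold(scores, tail_fraction=0.1, alarm_rate=1e-3):
    """Peaks-over-threshold cutoff, assuming an exponential tail (GPD shape 0)."""
    ordered = sorted(scores)
    n = len(ordered)
    k = max(1, int(n * tail_fraction))      # number of tail observations
    u = ordered[n - k - 1]                  # high empirical quantile
    excesses = [s - u for s in ordered[n - k:]]
    beta = sum(excesses) / k                # exponential scale estimate
    return u + beta * math.log((k / n) / alarm_rate)


# Simulated data (assumption): a random-walk upstream signal, and a
# downstream sensor that sees the same signal 5 steps later plus noise.
random.seed(42)
true_lag, n_obs = 5, 500
upstream = [0.0]
for _ in range(n_obs - 1):
    upstream.append(upstream[-1] + random.gauss(0, 1.0))
downstream = [
    upstream[max(t - true_lag, 0)] + random.gauss(0, 0.05) for t in range(n_obs)
]
downstream[300] += 3.0                      # injected technical spike

lag = best_lag(upstream, downstream, max_lag=20)
# Anomaly score: how far the downstream reading is from the value
# implied by the lagged upstream reading.
residuals = [abs(downstream[t] - upstream[t - lag]) for t in range(lag, n_obs)]
threshold = pot_threshold(residuals)
flagged = [t for t, r in zip(range(lag, n_obs), residuals) if r > threshold]
print("estimated lag:", lag, "flagged time steps:", flagged)
```

The lagged upstream value plays the role of conditioning information here: a downstream reading is judged against what the neighboring sensor implies, so a genuine change that travels down the river is less likely to be flagged than a fault confined to one sensor.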

Date
May 10, 2022
Location
Grand Rapids, Michigan
Virtual Conference
Priyanga Dilini Talagala
PhD in Statistics

My research interests include Computational Statistics, Anomaly Detection, Time Series Analysis and Machine Learning.