+ - 0:00:00
Notes for current slide
Notes for next slide

Detection of Anomalous Series Within a Large Collection of Streaming Time Series Data

Priyanga Dilini Talagala
with
Rob J Hyndman
Kate Smith-Miles
Sevvandi Kandanaarachchi
Mario A. Muñoz

Monash University
1 / 24

Motivation: Fence-mounted perimeter intrusion detection systems

  • Data obtained using fiber optic cables attached to a fence
2 / 24

Motivation: Fence-mounted perimeter intrusion detection systems

  • Data obtained using fiber optic cables attached to a fence
  • Intrusion events cause measurable changes in intensity, phase, wavelength or transit time of light in the fiber.
2 / 24

Motivation: Fence-mounted perimeter intrusion detection systems

  • Data obtained using fiber optic cables attached to a fence
  • Intrusion events cause measurable changes in intensity, phase, wavelength or transit time of light in the fiber.
  • Aim: Find anomalous time series (the location of the intrusion event)
2 / 24

Motivation: Network intrusion detection systems

  • Yahoo data breach in late 2014 --- world's largest ever cyber attack
3 / 24

Motivation: Network intrusion detection systems

  • Yahoo data breach in late 2014 --- world's largest ever cyber attack
  • Intrusion attacks cause measurable changes in times of logins, command executed during a single user session, number of password failures
3 / 24

Motivation: Network intrusion detection systems

  • Yahoo data breach in late 2014 --- world's largest ever cyber attack
  • Intrusion attacks cause measurable changes in times of logins, command executed during a single user session, number of password failures
  • Aim: find anomalous time series (locate intrusion attacks)
3 / 24

Motivation

  • All these applications generate millions or even billions of individual time series simultaneously
4 / 24

Motivation

  • All these applications generate millions or even billions of individual time series simultaneously
  • Research question: Finding anomalous time series within a large collection of time series
4 / 24

Motivation

  • All these applications generate millions or even billions of individual time series simultaneously
  • Research question: Finding anomalous time series within a large collection of time series
  • Approaches to solving the problem of anomaly detection for temporal data :
4 / 24

Motivation

  • All these applications generate millions or even billions of individual time series simultaneously
  • Research question: Finding anomalous time series within a large collection of time series
  • Approaches to solving the problem of anomaly detection for temporal data :

    Batch scenario:
    whole set of data is available, focus - complete events

4 / 24

Motivation

  • All these applications generate millions or even billions of individual time series simultaneously
  • Research question: Finding anomalous time series within a large collection of time series
  • Approaches to solving the problem of anomaly detection for temporal data :

    Batch scenario:
    whole set of data is available, focus - complete events

    Data stream scenario: continuous, unbounded, flow at high speed, high volume

4 / 24

Automatic anomaly detection algorithm for streaming data is required:

  • to give real-time support
5 / 24

Automatic anomaly detection algorithm for streaming data is required:

  • to give real-time support
  • to provide early detection of anomalies
5 / 24

Automatic anomaly detection algorithm for streaming data is required:

  • to give real-time support
  • to provide early detection of anomalies
  • to learn and adapt to the changing environment automatically (concept drift)
5 / 24

Automatic anomaly detection algorithm for streaming data is required:

  • to give real-time support
  • to provide early detection of anomalies
  • to learn and adapt to the changing environment automatically (concept drift)
  • to deal with large amounts of data efficiently
5 / 24

What is an anomaly ?

6 / 24

Image credit: Wikimedia Commons

What is an anomaly ?

  • By definition, anomalies are rare in comparison to a system's typical behaviour.
  • We define an anomaly as an observation that is very unlikely given the forecast distribution.
7 / 24

Algorithm of the proposed framework

Aim

  • To detect anomalous time series within a large collection of time series in a streaming data context
8 / 24

Algorithm of the proposed framework

Aim

  • To detect anomalous time series within a large collection of time series in a streaming data context

Main Assumptions

8 / 24

Algorithm of the proposed framework

Aim

  • To detect anomalous time series within a large collection of time series in a streaming data context

Main Assumptions

  • Anomaly is a rare event which has a very low chance of occurrence with respect to the typical behavior of the system
8 / 24

Algorithm of the proposed framework

Aim

  • To detect anomalous time series within a large collection of time series in a streaming data context

Main Assumptions

  • Anomaly is a rare event which has a very low chance of occurrence with respect to the typical behavior of the system
  • A representative data set of the system's typical behavior is available to define the model for the typical behavior of the system.
8 / 24

Algorithm of the proposed framework

Aim

  • To detect anomalous time series within a large collection of time series in a streaming data context

Main Assumptions

  • Anomaly is a rare event which has a very low chance of occurrence with respect to the typical behavior of the system
  • A representative data set of the system's typical behavior is available to define the model for the typical behavior of the system.

Proposed Algorithm

8 / 24

Algorithm of the proposed framework

Aim

  • To detect anomalous time series within a large collection of time series in a streaming data context

Main Assumptions

  • Anomaly is a rare event which has a very low chance of occurrence with respect to the typical behavior of the system
  • A representative data set of the system's typical behavior is available to define the model for the typical behavior of the system.

Proposed Algorithm

  • Off-line Phase: Building a model of a system's typical behaviour; (similar to (Clifton, Hugueny & Tarassenko, 2011))
8 / 24

Algorithm of the proposed framework

Aim

  • To detect anomalous time series within a large collection of time series in a streaming data context

Main Assumptions

  • Anomaly is a rare event which has a very low chance of occurrence with respect to the typical behavior of the system
  • A representative data set of the system's typical behavior is available to define the model for the typical behavior of the system.

Proposed Algorithm

  • Off-line Phase: Building a model of a system's typical behaviour; (similar to (Clifton, Hugueny & Tarassenko, 2011))
  • On-line Phases: Testing newly arrived data using the boundary
8 / 24

Feature Based Representation of Time series

  • Mean
  • Variance
  • Changing variance in remainder
  • Level shift using rolling window
  • Variance change
  • Strength of linearity
  • Strength of curvature
  • Strength of spikiness
  • Burstiness of time series (Fano Factor)
  • Minimum
  • Maximum
  • The ratio between interquartile mean and the arithmetic mean
  • Moment
  • Ratio of means of data that is below and upper the global mean
9 / 24

Feature Based Representation of Time series

10 / 24

Dimension Reduction for Time Series

  • First two PCs explain 85% of variation
11 / 24

Classical Extreme Value Theory

Figure: Extreme value distributions corresponding to m = 1; 10; 100; 1000, each describing where the maximum of m samples drawn from N(0; 1) will lie.

Figure: Extreme value distributions corresponding to m = 1; 10; 100; 1000, each describing where the maximum of m samples drawn from N(0; 1) will lie.

12 / 24

Theorem 1: Fisher-Tippett theorem (Limit laws for maxima)

(Embrechts et al. (2013), p. 121)

Let X=X1,X2,...,Xm be a sequence of independent and identically distributed random variables and Xmax=max(X). If there exist centering constant dm() and normalizing constant cm(>0), and some non-degenerate distribution function H+ such that

then H+ belongs to one of the following three distribution functions:

13 / 24

Extreme Value Distribution of the Probability Density Values (Clifton et al., 2011)

  • Estimate the probability density function of the 2D PC space --> Kernel density estimation
  • Draw a large number N of extremes from the estimated probability density function
Figure: Distribution of 1000 extremes generated from bivariate kernel density function with m=500

Figure: Distribution of 1000 extremes generated from bivariate kernel density function with m=500

14 / 24

Extreme Value Distribution of Probability Density Values (Clifton et al., 2011)

  • Define a Ψ-transform space, using the Ψ-transformation defined by

  • Ψ-transform maps the density values back into space into which a Gumbel distribution can be fitted.
Figure: Distribution of transformed values

Figure: Distribution of transformed values

15 / 24

16 / 24

Image credit: Wikimedia Commons

17 / 24

How it works?

18 / 24

How it works?

19 / 24
oddstream::find_odd_streams(train_data, test_stream)

20 / 24

What Next?

  • Explore more on feature extraction and feature selection methods to create a better feature space suitable for streaming data context.
21 / 24

What Next?

  • Explore more on feature extraction and feature selection methods to create a better feature space suitable for streaming data context.
  • Use other dimension reduction techniques such as multidimensional scaling analysis, random projection to see the effect on the performance of the proposed framework.
21 / 24

What Next?

  • Explore more on feature extraction and feature selection methods to create a better feature space suitable for streaming data context.
  • Use other dimension reduction techniques such as multidimensional scaling analysis, random projection to see the effect on the performance of the proposed framework.
  • Do more experiments on density estimation methods to get a better tail estimation.
21 / 24

What Next?

  • Explore more on feature extraction and feature selection methods to create a better feature space suitable for streaming data context.
  • Use other dimension reduction techniques such as multidimensional scaling analysis, random projection to see the effect on the performance of the proposed framework.
  • Do more experiments on density estimation methods to get a better tail estimation.
  • Extend the algorithm to work with Multidimensional Multivariate Data Streams.
21 / 24

References

Images were taken:

Main references

  • Clifton, D. A., Hugueny, S., & Tarassenko, L. (2011). Novelty detection with multivariate extreme value statistics. Journal of signal processing systems, 65 (3), (pp. 371-389).
  • Fulcher, B. D. (2012). Highly comparative time-series analysis. PhD thesis, University of Oxford.
  • Hyndman, R. J., Wang, E., & Laptev, N. (2015). Large-scale unusual time series detection. In 2015 IEEE International Conference on Data Mining Workshop (ICDMW), (pp. 1616-1619). IEEE.
22 / 24

Acknowledgement

Statistical Society of Australia, Victorian Branch

  • for offering financial support to attend the Young Statisticians Conference (YSC) 2017 in Coolangatta, QLD.
23 / 24

Thank You

R package available at: github.com/pridiltal/oddstream

Email: dilini.talagala@monash.edu

Slides available at: http://pritalagala.netlify.com/talk/yscvicbranch2017-talk/

24 / 24

Motivation: Fence-mounted perimeter intrusion detection systems

  • Data obtained using fiber optic cables attached to a fence
2 / 24
Paused

Help

Keyboard shortcuts

, , Pg Up, k Go to previous slide
, , Pg Dn, Space, j Go to next slide
Home Go to first slide
End Go to last slide
Number + Return Go to specific slide
b / m / f Toggle blackout / mirrored / fullscreen mode
c Clone slideshow
p Toggle presenter mode
t Restart the presentation timer
?, h Toggle this help
Esc Back to slideshow