SCH

Seminar

The ultrametric topology perspective on analysis of massive, very high dimensional data stores

Murtagh, F (Royal Holloway)
Wednesday 09 January 2008, 09:00-10:00

Seminar Room 1, Newton Institute

Abstract

An ultrametric topology formalizes the notion of hierarchical structure. An ultrametric embedding, referred to here as ultrametricity, is implied by a hierarchical embedding. Such hierarchical structure can be global in the data set, or local. By quantifying the extent or degree of ultrametricity in a data set, we show that ultrametricity becomes pervasive as dimensionality and/or spatial sparsity increases. This leads us to assert that very high dimensional data are of simple structure. We exemplify this finding through a range of simulated data cases. We also discuss application to very high frequency time series segmentation and modeling. Other applications will be described, in particular in the area of textual data mining.
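
One simple way to make "degree of ultrametricity" concrete is to recall that in an ultrametric space every triangle is isosceles with a small base (or equilateral), so the two largest sides are equal. The sketch below is an illustrative proxy only and is not the coefficient used in the talk or in the references: it samples random triplets of points and reports the fraction whose two largest sides agree within a relative tolerance. The function names, the tolerance `rel_tol`, and the uniform test data are all assumptions made for this example.

```python
import math
import random


def is_ultrametric_triangle(d_xy, d_yz, d_xz, rel_tol=0.05):
    """Return True if the two largest sides are equal within rel_tol.

    The strong triangle inequality d(x,z) <= max(d(x,y), d(y,z)),
    required for every labeling of the three points, is equivalent
    to the two largest sides of the triangle being equal.
    """
    _, second, largest = sorted((d_xy, d_yz, d_xz))
    return largest - second <= rel_tol * largest


def ultrametric_fraction(points, n_triplets=500, rel_tol=0.05, seed=0):
    """Estimate the share of sampled triangles that are approximately ultrametric.

    This is a hypothetical proxy for the degree of ultrametricity,
    based on Euclidean distances between randomly chosen triplets.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_triplets):
        x, y, z = rng.sample(points, 3)
        if is_ultrametric_triangle(
            math.dist(x, y), math.dist(y, z), math.dist(x, z), rel_tol
        ):
            hits += 1
    return hits / n_triplets


def uniform_points(n, dim, seed=0):
    """Generate n points drawn uniformly from the unit hypercube (assumed test data)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    for dim in (2, 20, 500):
        frac = ultrametric_fraction(uniform_points(50, dim))
        print(f"dim={dim:4d}  approx. ultrametric fraction={frac:.3f}")
```

Run on uniform data of increasing dimensionality, the reported fraction should rise with the dimension, which is consistent with the claim that ultrametricity becomes pervasive in very high dimensional spaces. The result depends on the chosen tolerance and on the data distribution, so it is best read as a qualitative illustration.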

References

[1] F. Murtagh, On ultrametricity, data coding, and computation, Journal of Classification, 21, 167-184, 2004.

[2] F. Murtagh, G. Downs and P. Contreras, Hierarchical clustering of massive, high dimensional data sets by exploiting ultrametric embedding, SIAM Journal on Scientific Computing, in press, 2007.

[3] F. Murtagh, The remarkable simplicity of very high dimensional data: application of model-based clustering, submitted, 2007.

[4] F. Murtagh, Symmetry in data mining and analysis: a unifying view based on hierarchy, submitted, 2007.
