Given the recent explosion of interest in streaming data and online algorithms, clustering of time series subsequences, extracted via a sliding window, has received much attention. In this work we make a surprising claim. Clustering of time series subsequences is meaningless. More concretely, clusters extracted from these time series are forced to obey a certain constraint that is pathologically unlikely to be satisfied by any dataset, and because of this, the clusters extracted by any clustering algorithm are essentially random. While this constraint can be intuitively demonstrated with a simple illustration and is simple to prove, it has never appeared in the literature. We can justify calling our claim surprising, since it invalidates the contribution of dozens of previously published papers. We will justify our claim with a theorem, illustrative examples, and a comprehensive set of experiments on reimplementations of previous work. Although the primary contribution of our work is to draw attention to the fact that an apparent solution to an important problem is incorrect and should no longer be used, we also introduce a novel method which, based on the concept of time series motifs, is able to meaningfully cluster subsequences on some time series datasets.
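
For readers unfamiliar with the setup, the following is a minimal sketch of the sliding-window extraction step referred to above. It is an illustration only, not the authors' code: the function name `sliding_window_subsequences`, the window length `w`, and the toy series are all assumptions introduced here.

```python
def sliding_window_subsequences(series, w):
    """Return every length-w subsequence of `series`, one per window position.

    Hypothetical helper for illustration; `w` is the sliding-window length.
    """
    if w < 1 or w > len(series):
        raise ValueError("window length must be between 1 and len(series)")
    # A series of length m yields m - w + 1 subsequences.
    return [series[i:i + w] for i in range(len(series) - w + 1)]


# Toy data, invented for this example.
series = [1.0, 2.0, 4.0, 3.0, 5.0]
subsequences = sliding_window_subsequences(series, 3)

for s in subsequences:
    print(s)
```

Note that each subsequence shares w - 1 points with its neighbor, so the objects handed to a clustering algorithm in this setting are heavily overlapping rather than independent samples.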

Source: http://www.cs.ucr.edu/~eamonn/meaningless.pdf