Actas de congresos
On the effect of endpoints on dynamic time warping
Fecha
2016-08Registro en:
SIGKDD Workshop on Mining and Learning from Time Series, II, 2016, San Francisco.
Autor
Silva, Diego Furtado
Batista, Gustavo Enrique de Almeida Prado Alves
Keogh, Eamonn
Institución
Resumen
While there exist a plethora of classification algorithms for most data types, there is an increasing acceptance that the unique properties of time series mean that the combination of nearest neighbor classifiers and Dynamic Time Warping (DTW) is very competitive across a host of domains, from medicine to astronomy to environmental sensors. While there has been significant progress in improving the efficiency and effectiveness of DTW in recent years, in this work we demonstrate that an underappreciated issue can significantly degrade the accuracy of DTW in real-world deployments. This issue has probably escaped the attention of the very active time series research community because of its reliance on static highly contrived benchmark datasets, rather than real world dynamic datasets where the problem tends to manifest itself. In essence, the issue is that DTW’s eponymous invariance to warping is only true for the main “body” of the two time series being compared. However, for the “head” and “tail” of the time series, the DTW algorithm affords no warping invariance. The effect of this is that tiny differences at the beginning or end of the time series (which may be either consequential or simply the result of poor “cropping”) will tend to contribute disproportionally to the estimated similarity, producing incorrect classifications. In this work, we show that this effect is real, and reduces the performance of the algorithm. We further show that we can fix the issue with a subtle redesign of the DTW algorithm, and that we can learn an appropriate setting for the extra parameter we introduced. We further demonstrate that our generalization is amiable to all the optimizations that make DTW tractable for large datasets.