@ -150,6 +150,9 @@ Finally, we conclude our paper in \Cref{sec:summary}.

Clustering of jobs based on their names.

Multivariate time series.
Levenshtein distance, also known as Edit Distance (ED); the standard recurrence is recalled below.

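As a reminder (textbook definition, not specific to this work), for two strings $a$ and $b$ the Levenshtein distance is $\mathrm{lev}_{a,b}(|a|,|b|)$ with
\[
\mathrm{lev}_{a,b}(i,j) =
\begin{cases}
  \max(i,j) & \text{if } \min(i,j) = 0,\\
  \min\bigl(\mathrm{lev}_{a,b}(i-1,j) + 1,\;
            \mathrm{lev}_{a,b}(i,j-1) + 1,\;
            \mathrm{lev}_{a,b}(i-1,j-1) + [a_i \neq b_j]\bigr) & \text{otherwise,}
\end{cases}
\]
where $[a_i \neq b_j]$ is 1 if the $i$-th character of $a$ differs from the $j$-th character of $b$ and 0 otherwise.
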
Vampir clustering of timelines of a single job.

\section{Methodology}

@ -328,10 +331,11 @@ The runtime is normalized for 100k jobs, i.e., for BIN\_all it takes about 41\,s

Generally, the bin algorithms are the fastest, while the hex algorithms often take 4--5x as long.
Hex\_phases is slow for Job-S and Job-M but fast for Job-L; the reason is that just one phase is extracted for Job-L.
The Levenshtein-based algorithms take longer for longer jobs -- proportional to the job length -- as they apply a sliding window (see the sketch below).
The KS algorithm is 10x faster than the others, but it operates only on the statistics of the time series.

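The following is a minimal sketch of the two comparison schemes mentioned above; it assumes the jobs have already been reduced to per-segment coding strings (for the Levenshtein variants) resp. plain metric value arrays (for KS). The function names and the normalization to a $[0,1]$ score are illustrative assumptions, not the exact implementation evaluated here.
\begin{verbatim}
# Sketch only: names and normalization are illustrative assumptions.
import numpy as np
from scipy.stats import ks_2samp

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two coding strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def sliding_levenshtein_similarity(a: str, b: str) -> float:
    """Slide the shorter coding string over the longer one and keep the best
    window; the cost therefore grows with the job length."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    w = len(short)
    best = min(levenshtein(short, long_[off:off + w])
               for off in range(len(long_) - w + 1))
    return 1.0 - best / w   # 1.0 means an identical window exists

def ks_similarity(ts_a: np.ndarray, ts_b: np.ndarray) -> float:
    """Compare only the value distributions (not the temporal order) of two
    metric time series via the two-sample Kolmogorov-Smirnov statistic."""
    statistic, _ = ks_2samp(ts_a, ts_b)
    return 1.0 - statistic  # statistic = 0 for identical distributions
\end{verbatim}
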
Note that the current algorithms are sequential and executed on just one core.
For computing the similarity to one (or a small set of) reference jobs, they could easily be parallelized (see the sketch below).
We believe this will then allow a near-online analysis of a job.
\jk{To analyze KS jobs}

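A possible parallelization over the job list could look as follows; this is a sketch under the assumption that each per-job comparison is independent, and the function and parameter names are placeholders rather than part of this work.
\begin{verbatim}
# Sketch only: function and parameter names are illustrative placeholders.
from functools import partial
from multiprocessing import Pool

def rank_against_reference(similarity, reference, all_jobs, workers=8):
    """Compute similarity(reference, job) for every job in parallel and
    return the jobs ranked by decreasing similarity. `similarity` can be
    any of the algorithms above (a picklable, module-level function)."""
    with Pool(processes=workers) as pool:
        scores = pool.map(partial(similarity, reference), all_jobs)
    return sorted(zip(all_jobs, scores), key=lambda pair: pair[1],
                  reverse=True)

# Example usage with the sliding-window Levenshtein sketch from above:
#   ranking = rank_against_reference(sliding_levenshtein_similarity,
#                                    reference_coding, job_codings)
\end{verbatim}
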
\begin{figure}
\centering