Performance section
@@ -150,6 +150,9 @@ Finally, we conclude our paper in \Cref{sec:summary}.
Clustering of jobs based on their names
Multivariate time series
Levenshtein distance, also known as Edit Distance (ED).
Vampir clustering of timelines of a single job.
\section{Methodology}
@@ -328,10 +331,11 @@ The runtime is normalized for 100k jobs, i.e., for BIN\_all it takes about 41\,s
Generally, the bin algorithms are fastest, while the hex algorithms often take 4--5x as long.
Hex\_phases is slow for Job-S and Job-M but fast for Job-L; the reason is that only one phase is extracted for Job-L.
The Levenshtein-based algorithms take longer for longer jobs -- the runtime is proportional to the job length, as they apply a sliding window.
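To illustrate the cost per comparison: the Levenshtein distance is computed with the standard dynamic-programming recurrence, which is $O(n \cdot m)$ in the lengths of the two coded sequences; applying it at every window position multiplies this cost by the job length. The sketch below is the generic algorithm, not the paper's implementation:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance, O(len(a) * len(b))."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]
```

Each candidate window of a long job is compared against the reference sequence this way, which is why runtime grows with job length.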
The KS algorithm is about 10x faster than the others, but it operates only on the statistics of the time series.
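The speedup comes from comparing distributions rather than aligned time series: the two-sample Kolmogorov-Smirnov statistic is the maximum distance between the two empirical CDFs, computable in a single merge pass over the sorted samples. A minimal sketch (the paper's exact variant may differ):

```python
def ks_statistic(x, y):
    """Two-sample KS statistic: max vertical distance between the ECDFs."""
    xs, ys = sorted(x), sorted(y)
    nx, ny = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < nx and j < ny:
        v = min(xs[i], ys[j])
        # advance past all ties of the current value in both samples
        while i < nx and xs[i] == v:
            i += 1
        while j < ny and ys[j] == v:
            j += 1
        d = max(d, abs(i / nx - j / ny))
    return d
```

Because only summary samples are compared, the cost is independent of the temporal structure of the job.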
Note that the current algorithms are sequential and executed on just one core.
For computing the similarity to one (or a small set of) reference jobs, they could easily be parallelized.
We believe this will then allow a near-online analysis of a job.
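The parallelization is straightforward because each candidate job's similarity to a fixed reference is independent of all other candidates. A hypothetical sketch (the `similarity` placeholder and function names are illustrative, not from the paper's code):

```python
from concurrent.futures import ThreadPoolExecutor

def similarity(ref, job):
    """Placeholder metric: fraction of matching positions (illustrative only)."""
    matches = sum(a == b for a, b in zip(ref, job))
    return matches / max(len(ref), len(job))

def rank_against_reference(ref, jobs, workers=4):
    # Each comparison is independent, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=workers) as ex:
        scores = list(ex.map(lambda j: similarity(ref, j), jobs))
    return sorted(zip(scores, jobs), reverse=True)
```

With the comparisons distributed across cores, scoring a new job against a small reference set finishes fast enough for near-online use.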
\jk{To analyze KS jobs}
\begin{figure}
\centering
@@ -11,6 +11,9 @@ prefix = args[2]
# Plot the performance numbers of the analysis
data = read.csv(file)
# Shorten algorithm names for more compact plot labels
levels(data$alg_name)[levels(data$alg_name) == "bin_aggzeros"] = "bin_aggz"
levels(data$alg_name)[levels(data$alg_name) == "hex_native"] = "hex_nat"
levels(data$alg_name)[levels(data$alg_name) == "hex_phases"] = "hex_phas"
# Keep only measurements taken close to completion (within the last ~10k jobs)
e = data %>% filter(jobs_done >= (jobs_total - 9998))
# Normalize elapsed runtime to a nominal 100k jobs
e$time_per_100k = e$elapsed / (e$jobs_done / 100000)
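The last line above scales elapsed time to a common baseline of 100,000 processed jobs, which is what makes runs with different job counts comparable (e.g., the ~41 s figure quoted for BIN\_all). The same arithmetic, written out in Python:

```python
def time_per_100k(elapsed_s, jobs_done):
    """Normalize elapsed runtime to a nominal 100,000 jobs."""
    return elapsed_s / (jobs_done / 100_000)
```

For example, a run that processed 50,000 jobs in 20.5 s normalizes to the same 41 s per 100k jobs as one that processed 100,000 jobs in 41 s.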