|
@ -85,7 +85,6 @@ DKRZ --
|
||||||
|
|
||||||
\begin{abstract}
|
\begin{abstract}
|
||||||
|
|
||||||
\todo{Rename algorithm according to JHPS paper, possibly describe each algorithm with one sentence?}
|
|
||||||
One goal of support staff at a data center is to identify inefficient jobs and to improve their efficiency.
|
One goal of support staff at a data center is to identify inefficient jobs and to improve their efficiency.
|
||||||
Therefore, a data center deploys monitoring systems that capture the behavior of the executed jobs.
|
Therefore, a data center deploys monitoring systems that capture the behavior of the executed jobs.
|
||||||
While it is easy to utilize statistics to rank jobs based on the utilization of computing, storage, and network, it is tricky to find patterns in 100,000 jobs, e.g., to determine whether there is a class of jobs that isn't performing well.
|
While it is easy to utilize statistics to rank jobs based on the utilization of computing, storage, and network, it is tricky to find patterns in 100,000 jobs, e.g., to determine whether there is a class of jobs that isn't performing well.
|
||||||
|
@ -200,12 +199,14 @@ After data is reduced across nodes, we quantize the timelines either using binar
|
||||||
By pre-filtering jobs with no I/O activity (their sum across all dimensions and time series is equal to zero), we reduce the dataset from about 1 million jobs to about 580k jobs.
|
By pre-filtering jobs with no I/O activity (their sum across all dimensions and time series is equal to zero), we reduce the dataset from about 1 million jobs to about 580k jobs.
|
||||||
|
|
||||||
\subsection{Algorithms for Computing Similarity}
|
\subsection{Algorithms for Computing Similarity}
|
||||||
We reuse the algorithms developed in \cite{Eugen20HPS}: B-all, B-aggzeros, Q-native, Q-lev, and Q-quant.
|
We reuse the algorithms developed in \cite{Eugen20HPS}: B-all, B-aggz(eros), Q-native, Q-lev, and Q-phases.
|
||||||
They differ in the way data similarity is defined: either the binary or the hexadecimal coding is used, and the distance measure is mostly the Euclidean or the Levenshtein distance.
|
They differ in the way data similarity is defined: either the binary or the hexadecimal coding is used, and the distance measure is mostly the Euclidean or the Levenshtein distance.
|
||||||
For jobs with different lengths, we apply a sliding-windows approach which finds the location for the shorter job in the long job with the highest similarity.
|
For jobs with different lengths, we apply a sliding-windows approach which finds the location for the shorter job in the long job with the highest similarity.
|
||||||
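The sliding-window matching for jobs of different lengths can be sketched as follows. This is a minimal illustration and not the implementation used in the paper: the function names, the default `max_value` of 15 (the largest value of a hexadecimal coding), and the per-window similarity (one minus the normalized mean absolute difference) are assumptions chosen for brevity.

```python
from typing import Sequence, Tuple


def window_similarity(a: Sequence[float], b: Sequence[float], max_value: float) -> float:
    """Hypothetical similarity in [0, 1] of two equal-length coded timelines."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("timelines must be non-empty and of equal length")
    total_diff = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 - total_diff / (max_value * len(a))


def best_sliding_similarity(
    short: Sequence[float], long: Sequence[float], max_value: float = 15.0
) -> Tuple[float, int]:
    """Slide the shorter timeline over the longer one.

    Returns the highest window similarity and the offset where it occurs.
    """
    if len(short) > len(long):
        short, long = long, short  # make sure `short` is the shorter job
    best_sim, best_offset = -1.0, 0
    for offset in range(len(long) - len(short) + 1):
        window = long[offset:offset + len(short)]
        sim = window_similarity(short, window, max_value)
        if sim > best_sim:
            best_sim, best_offset = sim, offset
    return best_sim, best_offset
```

For example, `best_sliding_similarity([0, 15, 15], [0, 0, 0, 15, 15, 0])` locates the short job at offset 2 of the long job, where both windows are identical.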
The Q-quant algorithm extracts I/O phases and computes the similarity between the most similar I/O phases of both jobs.
|
\todo{possibly describe each algorithm with one sentence?}
|
||||||
|
The Q-phases algorithm extracts I/O phases and computes the similarity between the most similar I/O phases of both jobs.
|
||||||
In this paper, we add a new similarity definition based on the Kolmogorov-Smirnov test, which compares the probability distributions of the observed values; we describe it in the following.
|
In this paper, we add a new similarity definition based on the Kolmogorov-Smirnov test, which compares the probability distributions of the observed values; we describe it in the following.
|
||||||
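To make the idea of a distribution-based similarity concrete, the following sketch computes the two-sample Kolmogorov-Smirnov statistic D, i.e., the largest gap between the empirical distribution functions of two samples. Turning it into a similarity as 1 - D is an assumption made here for illustration; the preparation steps and the way the paper combines results across metrics are not shown in this excerpt, and the function names are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both empirical CDFs are step functions that only change at sample
    # points, so the maximum gap is attained at one of the observed values.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_similarity(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Assumed mapping to a similarity in [0, 1]: identical distributions give 1."""
    return 1.0 - ks_statistic(sample_a, sample_b)
```

Because only the distribution of values is compared, two jobs with the same values in a different temporal order obtain a similarity of 1 under this definition.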
|
|
||||||
|
|
||||||
\paragraph{Kolmogorov-Smirnov (KS) algorithm}
|
\paragraph{Kolmogorov-Smirnov (KS) algorithm}
|
||||||
% Summary
|
% Summary
|
||||||
For the analysis, we perform two preparation steps.
|
For the analysis, we perform two preparation steps.
|
||||||
|
@ -391,7 +392,7 @@ We believe this will then allow an online analysis.
|
||||||
In the quantitative analysis, we explore for the different algorithms how the similarity of our pool of jobs to our reference jobs behaves.
|
In the quantitative analysis, we explore for the different algorithms how the similarity of our pool of jobs to our reference jobs behaves.
|
||||||
The cumulative distribution of similarity to a reference job is shown in \Cref{fig:ecdf}.
|
The cumulative distribution of similarity to a reference job is shown in \Cref{fig:ecdf}.
|
||||||
For example, in \Cref{fig:ecdf-job-S}, we see that about 70\% of the jobs have a similarity of less than 10\% to Job-S for Q-native.
|
For example, in \Cref{fig:ecdf-job-S}, we see that about 70\% of the jobs have a similarity of less than 10\% to Job-S for Q-native.
|
||||||
B-aggzeros shows some steep increases, e.g., more than 75\% of jobs have the same low similarity below 2\%.
|
B-aggz shows some steep increases, e.g., more than 75\% of jobs have the same low similarity below 2\%.
|
||||||
The different algorithms lead to different curves for our reference jobs, e.g., for Job-S, Q-phases bundles more jobs with low similarity compared to the other jobs; in Job-L, it is the slowest.
|
The different algorithms lead to different curves for our reference jobs, e.g., for Job-S, Q-phases bundles more jobs with low similarity compared to the other jobs; in Job-L, it is the slowest.
|
||||||
% This indicates that the algorithms
|
% This indicates that the algorithms
|
||||||
|
|
||||||
|
@ -455,7 +456,7 @@ Practically, the support team would start with Rank\,1 (most similar job, presum
|
||||||
\caption{Job-L} \label{fig:hist-job-L}
|
\caption{Job-L} \label{fig:hist-job-L}
|
||||||
\end{subfigure}
|
\end{subfigure}
|
||||||
\centering
|
\centering
|
||||||
\caption{Histogram for the number of jobs (bin width: 2.5\%, numbers are the actual job counts). B-aggzeros is nearly identical to B-all.}
|
\caption{Histogram for the number of jobs (bin width: 2.5\%, numbers are the actual job counts). B-aggz is nearly identical to B-all.}
|
||||||
\label{fig:hist}
|
\label{fig:hist}
|
||||||
\end{figure}
|
\end{figure}
|
||||||
|
|
||||||
|
@ -563,7 +564,7 @@ For Job-L, the job itself isn't included in the chosen Top\,100 (see \Cref{fig:h
|
||||||
|
|
||||||
\subsubsection{Algorithmic differences}
|
\subsubsection{Algorithmic differences}
|
||||||
To verify that the different algorithms behave differently, the intersection for the Top\,100 is computed for all combinations of algorithms and visualized in \Cref{fig:heatmap-job}.
|
To verify that the different algorithms behave differently, the intersection for the Top\,100 is computed for all combinations of algorithms and visualized in \Cref{fig:heatmap-job}.
|
||||||
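The Top-100 intersection for all combinations of algorithms can be sketched as follows. This is an illustrative sketch, not the script that produced the heatmap: the data layout (one dictionary of job id to similarity per algorithm) and the function names are assumptions.

```python
from itertools import combinations
from typing import Dict, Hashable, Set, Tuple


def top_k(similarities: Dict[Hashable, float], k: int = 100) -> Set[Hashable]:
    """Return the ids of the k jobs with the highest similarity."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return {job_id for job_id, _ in ranked[:k]}


def pairwise_overlap(
    results: Dict[str, Dict[Hashable, float]], k: int = 100
) -> Dict[Tuple[str, str], int]:
    """Count the jobs shared by the Top-k sets of every pair of algorithms."""
    tops = {alg: top_k(sims, k) for alg, sims in results.items()}
    return {
        (a, b): len(tops[a] & tops[b])
        for a, b in combinations(sorted(tops), 2)
    }
```

The intersection ignores the order within each Top-k set, so two algorithms that merely reorder the same jobs still reach the full overlap of k.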
Bin\_all and B-aggzeros overlap with at least 99 ranks for all three jobs.
|
B-all and B-aggz overlap with at least 99 ranks for all three jobs.
|
||||||
While there is some reordering, both algorithms lead to a comparable set.
|
While there is some reordering, both algorithms lead to a comparable set.
|
||||||
All algorithms have a significant overlap for Job-S.
|
All algorithms have a significant overlap for Job-S.
|
||||||
For Job-M, however, they lead to a different ranking and a different Top\,100; in particular, KS determines a different set.
|
For Job-M, however, they lead to a different ranking and a different Top\,100; in particular, KS determines a different set.
|
||||||
|
@ -622,13 +623,13 @@ For Job-S, we found that all algorithms work well and, therefore, omit further t
|
||||||
\begin{table}[bt]
|
\begin{table}[bt]
|
||||||
\centering
|
\centering
|
||||||
\begin{tabular}{r|r|r|r|r|r}
|
\begin{tabular}{r|r|r|r|r|r}
|
||||||
B-aggzeros & B-all & Q-lev & Q-native & Q-phases & KS\\ \hline
|
B-aggz & B-all & Q-lev & Q-native & Q-phases & KS\\ \hline
|
||||||
38 & 38 & 33 & 26 & 33 & 0
|
38 & 38 & 33 & 26 & 33 & 0
|
||||||
\end{tabular}
|
\end{tabular}
|
||||||
|
|
||||||
%\begin{tabular}{r|r}
|
%\begin{tabular}{r|r}
|
||||||
% Algorithm & Jobs \\ \hline
|
% Algorithm & Jobs \\ \hline
|
||||||
% B-aggzeros & 38 \\
|
% B-aggz & 38 \\
|
||||||
% B-all & 38 \\
|
% B-all & 38 \\
|
||||||
% Q-lev & 33 \\
|
% Q-lev & 33 \\
|
||||||
% Q-native & 26 \\
|
% Q-native & 26 \\
|
||||||
|
@ -653,7 +654,7 @@ For Job-S, we found that all algorithms work well and, therefore, omit further t
|
||||||
\caption{Non-control job: Rank\,4, SIM=81\%}
|
\caption{Non-control job: Rank\,4, SIM=81\%}
|
||||||
\end{subfigure}
|
\end{subfigure}
|
||||||
|
|
||||||
\caption{Job-S: jobs with different job names when using B-aggzeros}
|
\caption{Job-S: jobs with different job names when using B-aggz}
|
||||||
\label{fig:job-S-bin-agg}
|
\label{fig:job-S-bin-agg}
|
||||||
\end{figure}
|
\end{figure}
|
||||||
|
|
||||||
|
|
|
@ -11,9 +11,9 @@ prefix = args[2]
|
||||||
|
|
||||||
# Plot the performance numbers of the analysis
|
# Plot the performance numbers of the analysis
|
||||||
data = read.csv(file)
|
data = read.csv(file)
|
||||||
levels(data$alg_name)[levels(data$alg_name) == "bin_aggzeros"] = "bin_aggz"
|
levels(data$alg_name)[levels(data$alg_name) == "B-aggzeros"] = "B-aggz"
|
||||||
levels(data$alg_name)[levels(data$alg_name) == "hex_native"] = "hex_nat"
|
levels(data$alg_name)[levels(data$alg_name) == "Q-native"] = "Q-nat"
|
||||||
levels(data$alg_name)[levels(data$alg_name) == "hex_phases"] = "hex_phas"
|
levels(data$alg_name)[levels(data$alg_name) == "Q-phases"] = "Q-phas"
|
||||||
|
|
||||||
e = data %>% filter(jobs_done >= (jobs_total - 9998))
|
e = data %>% filter(jobs_done >= (jobs_total - 9998))
|
||||||
e$time_per_100k = e$elapsed / (e$jobs_done / 100000)
|
e$time_per_100k = e$elapsed / (e$jobs_done / 100000)
|
||||||
|
|
|
@ -11,7 +11,7 @@ library(stringi)
|
||||||
library(stringr)
|
library(stringr)
|
||||||
|
|
||||||
# Turn to TRUE to print individual job images
|
# Turn to TRUE to print individual job images
|
||||||
plotjobs = TRUE
|
plotjobs = FALSE
|
||||||
|
|
||||||
# Color scheme
|
# Color scheme
|
||||||
plotcolors <- c("#CC0000", "#FFA500", "#FFFF00", "#008000", "#9999ff", "#000099")
|
plotcolors <- c("#CC0000", "#FFA500", "#FFFF00", "#008000", "#9999ff", "#000099")
|
||||||
|
|