Comments
parent
64a582cb25
commit
f8465c25d6
|
@ -2,7 +2,7 @@
|
||||||
\documentclass[]{llncs}
|
\documentclass[]{llncs}
|
||||||
|
|
||||||
\usepackage{todonotes}
|
\usepackage{todonotes}
|
||||||
\newcommand{\eb}[1]{\todo[inline]{(EB): #1}}
|
\newcommand{\eb}[1]{\todo[inline, color=green]{EB: #1}}
|
||||||
\newcommand{\jk}[1]{\todo[inline]{JK: #1}}
|
\newcommand{\jk}[1]{\todo[inline]{JK: #1}}
|
||||||
|
|
||||||
\usepackage{silence}
|
\usepackage{silence}
|
||||||
|
@ -11,6 +11,13 @@
|
||||||
\WarningFilter{caption}{Unsupported}
|
\WarningFilter{caption}{Unsupported}
|
||||||
\WarningFilter{caption}{Unknown document}
|
\WarningFilter{caption}{Unknown document}
|
||||||
|
|
||||||
|
\usepackage{changes}
|
||||||
|
\definechangesauthor[name=Betke, color=blue]{eb}
|
||||||
|
\newcommand{\ebrep}[2]{\replaced[id=eb]{#1}{#2}}
|
||||||
|
\newcommand{\ebadd}[1]{\added[id=eb]{#1}}
|
||||||
|
\newcommand{\ebdel}[1]{\deleted[id=eb]{#1}}
|
||||||
|
\newcommand{\ebcom}[1]{\comment[id=eb]{#1}}
|
||||||
|
|
||||||
\let\spvec\vec
|
\let\spvec\vec
|
||||||
\let\vec\accentvec
|
\let\vec\accentvec
|
||||||
\usepackage{amsmath}
|
\usepackage{amsmath}
|
||||||
|
@ -27,7 +34,6 @@
|
||||||
\usepackage[listings,skins,breakable,raster,most]{tcolorbox}
|
\usepackage[listings,skins,breakable,raster,most]{tcolorbox}
|
||||||
\usepackage{caption}
|
\usepackage{caption}
|
||||||
|
|
||||||
|
|
||||||
\lstset{
|
\lstset{
|
||||||
numberbychapter=false,
|
numberbychapter=false,
|
||||||
belowskip=-10pt,
|
belowskip=-10pt,
|
||||||
|
@ -66,6 +72,7 @@
|
||||||
\title{Using Machine Learning to Identify Similar Jobs Based on their IO Behavior}
|
\title{Using Machine Learning to Identify Similar Jobs Based on their IO Behavior}
|
||||||
\author{Julian Kunkel\inst{2} \and Eugen Betke\inst{1}}
|
\author{Julian Kunkel\inst{2} \and Eugen Betke\inst{1}}
|
||||||
|
|
||||||
|
|
||||||
\institute{
|
\institute{
|
||||||
University of Reading--%
|
University of Reading--%
|
||||||
\email{j.m.kunkel@reading.ac.uk}%
|
\email{j.m.kunkel@reading.ac.uk}%
|
||||||
|
@ -75,6 +82,9 @@ DKRZ --
|
||||||
}
|
}
|
||||||
\begin{document}
|
\begin{document}
|
||||||
\maketitle
|
\maketitle
|
||||||
|
\eb{The title is not accurate. No clustering is involved here, and hence no machine learning either. Nothing is actually learned. It is rather something like:
|
||||||
|
``A Workflow for Identifying Similar Jobs by Means of Time-Series-Based I/O Similarity Functions''
|
||||||
|
}
|
||||||
|
|
||||||
\begin{abstract}
|
\begin{abstract}
|
||||||
|
|
||||||
|
@ -308,12 +318,19 @@ Practically, the support team would start with Rank\,1 (most similar job, presum
|
||||||
|
|
||||||
When analyzing the overall population of jobs executed on a system, we expect that some workloads are executed several times (with different inputs but with the same configuration) or are executed with slightly different configurations (e.g., node counts, timesteps).
|
When analyzing the overall population of jobs executed on a system, we expect that some workloads are executed several times (with different inputs but with the same configuration) or are executed with slightly different configurations (e.g., node counts, timesteps).
|
||||||
Thus, potentially our similarity analysis of the job population may just identify the re-execution of the same workload.
|
Thus, potentially our similarity analysis of the job population may just identify the re-execution of the same workload.
|
||||||
|
\ebadd{%
|
||||||
|
In most cases, the support staff would identify the re-execution of jobs simply by their job names.
|
||||||
|
Job names are often user-defined, generic strings and can contain confidential data.
|
||||||
|
It is quite difficult to anonymize them and keep the meaning unchanged.
|
||||||
|
Therefore, they are not available for this analysis.
|
||||||
|
}%
|
||||||
|
|
||||||
To understand if the analysis is inclusive and identifies different applications, we use two approaches with our Top\,100 jobs:
|
To understand if the analysis is inclusive and identifies different applications, we use two approaches with our Top\,100 jobs:
|
||||||
We explore the distribution of users (and groups), runtime, and node count across jobs.
|
We explore the distribution of users (and groups), runtime, and node count across jobs.
|
||||||
The algorithms should include jobs from different users, with different node counts, and across a range of runtimes.
|
The algorithms should include jobs from different users, with different node counts, and across a range of runtimes.
|
||||||
To confirm the hypotheses presented, we analyzed the job metadata and compared job names, which validates our quantitative results discussed in the following.
|
To confirm the hypotheses presented, we analyzed the job metadata and compared job names, which validates our quantitative results discussed in the following.
|
||||||
|
|
||||||
|
|
||||||
\paragraph{User distribution.}
|
\paragraph{User distribution.}
|
||||||
To understand how the Top\,100 are distributed across users, the data is grouped by userid and counted.
|
To understand how the Top\,100 are distributed across users, the data is grouped by userid and counted.
|
||||||
\Cref{fig:userids} shows the stacked user information, where the lowest stack is the user with the most jobs and the topmost user in the stack has the smallest number of jobs.
|
\Cref{fig:userids} shows the stacked user information, where the lowest stack is the user with the most jobs and the topmost user in the stack has the smallest number of jobs.
|
||||||
|
@ -333,8 +350,9 @@ As post-processing jobs use typically one node and the number of postprocessing
|
||||||
The boxplots have different shapes, which indicates that the different algorithms identify different sets of jobs; we will analyze this further later.
|
The boxplots have different shapes, which indicates that the different algorithms identify different sets of jobs; we will analyze this further later.
|
||||||
|
|
||||||
\paragraph{Runtime distribution.}
|
\paragraph{Runtime distribution.}
|
||||||
The runtime of the Top\,100 jobs is shown using boxplots in \Cref{fig:runtime-job}.
|
The \added{job} runtime of the Top\,100 jobs is shown using boxplots in \Cref{fig:runtime-job}.
|
||||||
While all algorithms can compute the similarity between jobs of different length, the bin algorithms and hex\_native penalize jobs of different length leading to a narrow profile.
|
While all algorithms can compute the similarity between jobs of different length, the bin algorithms and hex\_native penalize jobs of different length leading to a narrow profile.
|
||||||
|
\eb{The ``narrow profiles'' are not really visible in the figures. (Or I did not understand it. What I understood is that a ``narrow profile'' is a set of jobs with similar runtimes.)}
|
||||||
For Job-M and Job-L, hex\_phases is able to identify much shorter or longer jobs.
|
For Job-M and Job-L, hex\_phases is able to identify much shorter or longer jobs.
|
||||||
For Job-L, the job itself isn't included in the chosen Top\,100 (see \Cref{fig:hist-job-L}, 393 jobs have a similarity of 100\%) which is the reason why the job runtime isn't shown in the figure itself.
|
For Job-L, the job itself isn't included in the chosen Top\,100 (see \Cref{fig:hist-job-L}, 393 jobs have a similarity of 100\%) which is the reason why the job runtime isn't shown in the figure itself.
|
||||||
|
|
||||||
|
@ -380,6 +398,7 @@ For Job-L, the job itself isn't included in the chosen Top\,100 (see \Cref{fig:h
|
||||||
\caption{Distribution of node counts (for Job-S nodes=1 in all cases)}
|
\caption{Distribution of node counts (for Job-S nodes=1 in all cases)}
|
||||||
\label{fig:nodes-job}
|
\label{fig:nodes-job}
|
||||||
\end{figure}
|
\end{figure}
|
||||||
|
\eb{In \Cref{fig:nodes-job}, we could perhaps also show the node counts of the investigated jobs.}
|
||||||
|
|
||||||
\begin{figure}
|
\begin{figure}
|
||||||
\begin{subfigure}{0.31\textwidth}
|
\begin{subfigure}{0.31\textwidth}
|
||||||
|
@ -401,6 +420,7 @@ For Job-L, the job itself isn't included in the chosen Top\,100 (see \Cref{fig:h
|
||||||
\caption{Distribution of runtime for all 100 top ranked jobs}
|
\caption{Distribution of runtime for all 100 top ranked jobs}
|
||||||
\label{fig:runtime-job}
|
\label{fig:runtime-job}
|
||||||
\end{figure}
|
\end{figure}
|
||||||
|
\eb{In \Cref{fig:runtime-job}, we could perhaps also show the runtimes of the investigated jobs.}
|
||||||
|
|
||||||
\subsubsection{Algorithmic differences}
|
\subsubsection{Algorithmic differences}
|
||||||
To verify that the different algorithms behave differently, the intersection of the Top\,100 is computed for all combinations of algorithms and visualized in \Cref{fig:heatmap-job}.
|
To verify that the different algorithms behave differently, the intersection of the Top\,100 is computed for all combinations of algorithms and visualized in \Cref{fig:heatmap-job}.
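% Sketch, not part of the original text: the symbols $T_{100}$ and $I$ are assumed notation.
As a sketch of this computation, assume $T_{100}(a)$ denotes the set of the 100 top ranked jobs that algorithm $a$ returns for a given reference job. The overlap of two algorithms $a$ and $b$ could then be written as
\begin{equation*}
  I(a, b) = \left| T_{100}(a) \cap T_{100}(b) \right|, \qquad 0 \le I(a, b) \le 100,
\end{equation*}
so that $I(a, a) = 100$ and $I(a, b) = I(b, a)$; a value of 0 would mean that the two algorithms share no job in their Top\,100.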
|
||||||
|
@ -409,6 +429,8 @@ While there is some reordering, both algorithms lead to a comparable order.
|
||||||
The hex\_lev and hex\_native algorithms are also exhibiting some overlap particularly for Job-S and Job-L.
|
The hex\_lev and hex\_native algorithms are also exhibiting some overlap particularly for Job-S and Job-L.
|
||||||
For Job-M, however, they lead to a different ranking and Top\,100.
|
For Job-M, however, they lead to a different ranking and Top\,100.
|
||||||
From the analysis, we conclude that one representative from binary quantization is sufficient while the other algorithms identify mostly disjoint behavioral aspects and, therefore, should be considered together.
|
From the analysis, we conclude that one representative from binary quantization is sufficient while the other algorithms identify mostly disjoint behavioral aspects and, therefore, should be considered together.
|
||||||
|
\eb{Is ``one representative from binary quantization is sufficient'' meant as a general statement? If so, it is very vague. It could be a coincidence.}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\begin{figure}
|
\begin{figure}
|
||||||
|
@ -432,6 +454,7 @@ From the analysis, we conclude that one representative from binary quantization
|
||||||
\caption{Intersection of the 100 top ranked jobs for different algorithms}
|
\caption{Intersection of the 100 top ranked jobs for different algorithms}
|
||||||
\label{fig:heatmap-job}
|
\label{fig:heatmap-job}
|
||||||
\end{figure}
|
\end{figure}
|
||||||
|
\eb{The color palette in \Cref{fig:heatmap-job} needs to be fixed. Nothing is visible on the blue background.}
|
||||||
|
|
||||||
\section{Assessing Timelines for Similar Jobs}
|
\section{Assessing Timelines for Similar Jobs}
|
||||||
|
|
||||||
|
@ -514,6 +537,7 @@ Bin aggzeros works quite well here too. The jobs are a bit more diverse.
|
||||||
\subsection{Job-M}
|
\subsection{Job-M}
|
||||||
|
|
||||||
Bin aggzero returns garbage.
|
Bin aggzero returns garbage.
|
||||||
|
\eb{Because of a bug?}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|