Eugen Betke 2020-08-26 19:08:50 +02:00
parent 64a582cb25
commit f8465c25d6
1 changed files with 27 additions and 3 deletions

@ -2,7 +2,7 @@
\documentclass[]{llncs}
\usepackage{todonotes}
\newcommand{\eb}[1]{\todo[inline]{(EB): #1}}
\newcommand{\eb}[1]{\todo[inline, color=green]{EB: #1}}
\newcommand{\jk}[1]{\todo[inline]{JK: #1}}
\usepackage{silence}
@ -11,6 +11,13 @@
\WarningFilter{caption}{Unsupported}
\WarningFilter{caption}{Unknown document}
\usepackage{changes}
\definechangesauthor[name=Betke, color=blue]{eb}
\newcommand{\ebrep}[2]{\replaced[id=eb]{#1}{#2}}
\newcommand{\ebadd}[1]{\added[id=eb]{#1}}
\newcommand{\ebdel}[1]{\deleted[id=eb]{#1}}
\newcommand{\ebcom}[1]{\comment[id=eb]{#1}}
\let\spvec\vec
\let\vec\accentvec
\usepackage{amsmath}
@ -27,7 +34,6 @@
\usepackage[listings,skins,breakable,raster,most]{tcolorbox}
\usepackage{caption}
\lstset{
numberbychapter=false,
belowskip=-10pt,
@ -66,6 +72,7 @@
\title{Using Machine Learning to Identify Similar Jobs Based on their IO Behavior}
\author{Julian Kunkel\inst{2} \and Eugen Betke\inst{1}}
\institute{
University of Reading--%
\email{j.m.kunkel@reading.ac.uk}%
@ -75,6 +82,9 @@ DKRZ --
}
\begin{document}
\maketitle
\eb{The title is not accurate. No clustering is involved here, and hence no machine learning either, since nothing is actually learned. It is rather something like:
``A Workflow for Identification of Similar Jobs by Means of Timeseries-based I/O Similarity Functions''
}
\begin{abstract}
@ -308,12 +318,19 @@ Practically, the support team would start with Rank\,1 (most similar job, presum
When analyzing the overall population of jobs executed on a system, we expect that some workloads are executed several times (with different inputs but with the same configuration) or are executed with slightly different configurations (e.g., node counts, timesteps).
Thus, potentially our similarity analysis of the job population may just identify the re-execution of the same workload.
\ebadd{%
In most cases, the support staff would identify the re-execution of jobs simply by job names.
However, job names are often user-defined generic strings and can contain confidential data.
It is quite difficult to anonymize them while keeping their meaning unchanged.
Therefore, they are not available for this analysis.
}%
To understand if the analysis is inclusive and identifies different applications, we use two approaches with our Top\,100 jobs:
We explore the distribution of users (and groups), runtime, and node count across jobs.
The jobs selected by the algorithms should cover different users, node counts, and runtimes.
To confirm the hypotheses presented, we analyzed the job metadata and compared job names, which validates our quantitative results discussed in the following.
\paragraph{User distribution.}
To understand how the Top\,100 are distributed across users, the data is grouped by userid and counted.
\Cref{fig:userids} shows the stacked user information, where the lowest stack is the user with the most jobs and the topmost user in the stack has the smallest number of jobs.
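As a sketch of this counting step (the symbols are introduced here for illustration only and assume one stack per algorithm), let $T_a$ be the set of Top\,100 jobs of algorithm $a$ and $\mathrm{uid}(j)$ the userid of job $j$.
The height of the stack segment for user $u$ then corresponds to
\begin{equation*}
  n_a(u) = \left|\{\, j \in T_a : \mathrm{uid}(j) = u \,\}\right|, \qquad \sum_{u} n_a(u) = |T_a| = 100 .
\end{equation*}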
@ -333,8 +350,9 @@ As post-processing jobs use typically one node and the number of postprocessing
The boxplots have different shapes, which indicates that the different algorithms identify different sets of jobs -- we will analyze this further later.
\paragraph{Runtime distribution.}
The runtime of the Top\,100 jobs is shown using boxplots in \Cref{fig:runtime-job}.
The \added{job} runtime of the Top\,100 jobs is shown using boxplots in \Cref{fig:runtime-job}.
While all algorithms can compute the similarity between jobs of different lengths, the bin algorithms and hex\_native penalize jobs of different lengths, leading to a narrow profile.
\eb{The ``narrow profiles'' are somehow not visible in the figures. (Or I did not understand it. What I understood is that a ``narrow profile'' is a set of jobs with similar runtimes.)}
For Job-M and Job-L, hex\_phases is able to identify much shorter or longer jobs.
For Job-L, the job itself is not included in the chosen Top\,100 (see \Cref{fig:hist-job-L}; 393 jobs have a similarity of 100\%), which is why its runtime is not shown in the figure.
@ -380,6 +398,7 @@ For Job-L, the job itself isn't included in the chosen Top\,100 (see \Cref{fig:h
\caption{Distribution of node counts (for Job-S nodes=1 in all cases)}
\label{fig:nodes-job}
\end{figure}
\eb{In \Cref{fig:nodes-job}, one could possibly also display the node counts of the investigated jobs.}
\begin{figure}
\begin{subfigure}{0.31\textwidth}
@ -401,6 +420,7 @@ For Job-L, the job itself isn't included in the chosen Top\,100 (see \Cref{fig:h
\caption{Distribution of runtime for all 100 top ranked jobs}
\label{fig:runtime-job}
\end{figure}
\eb{In \Cref{fig:runtime-job}, one could possibly also display the runtimes of the investigated jobs.}
\subsubsection{Algorithmic differences}
To verify that the different algorithms behave differently, the intersection of the Top\,100 is computed for all combinations of algorithms and visualized in \Cref{fig:heatmap-job}.
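As a sketch of this computation (the notation is introduced here for illustration and is not part of the original analysis), let $T_a$ denote the set of Top\,100 jobs returned by algorithm $a$ for a given reference job.
Assuming the heatmap reports the overlap as a plain count, the value for a pair of algorithms $(a, b)$ would be
\begin{equation*}
  I(a, b) = |T_a \cap T_b|, \qquad 0 \le I(a, b) \le 100,
\end{equation*}
so that $I(a, a) = 100$ on the diagonal, and a value near zero indicates that the two algorithms select mostly disjoint sets of jobs.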
@ -409,6 +429,8 @@ While there is some reordering, both algorithms lead to a comparable order.
The hex\_lev and hex\_native algorithms also exhibit some overlap, particularly for Job-S and Job-L.
For Job-M, however, they lead to a different ranking and Top\,100.
From the analysis, we conclude that one representative from binary quantization is sufficient while the other algorithms identify mostly disjoint behavioral aspects and, therefore, should be considered together.
\eb{Is this a general statement: ``one representative from binary quantization is sufficient''? If so, it is very vague. It could be a coincidence.}
\begin{figure}
@ -432,6 +454,7 @@ From the analysis, we conclude that one representative from binary quantization
\caption{Intersection of the 100 top ranked jobs for different algorithms}
\label{fig:heatmap-job}
\end{figure}
\eb{In \Cref{fig:heatmap-job}, the color palette needs to be fixed. Nothing is readable on the blue background.}
\section{Assessing Timelines for Similar Jobs}
@ -514,6 +537,7 @@ Bin aggzeros works quite well here too. The jobs are a bit more diverse.
\subsection{Job-M}
Bin aggzero returns garbage.
\eb{Because of a bug?}