
\let\accentvec\vec
\documentclass[]{llncs}

\usepackage{todonotes}
\newcommand{\eb}[1]{\todo[inline]{(EB): #1}}
\newcommand{\jk}[1]{\todo[inline]{JK: #1}}

\usepackage{silence}
\WarningFilter{biblatex}{Using}
\WarningFilter{latex}{Float too large}
\WarningFilter{caption}{Unsupported}
\WarningFilter{caption}{Unknown document}

\let\spvec\vec
\let\vec\accentvec
\usepackage{amsmath}
\let\vec\spvec

\usepackage{array}
\usepackage{xcolor}
\usepackage{color}
\usepackage{colortbl}
\usepackage{subcaption}
\usepackage{hyperref}
\usepackage{listings}
\usepackage{lstautogobble}
\usepackage[listings,skins,breakable,raster,most]{tcolorbox}
\usepackage{caption}


\lstset{
	numberbychapter=false,
	belowskip=-10pt,
	aboveskip=-10pt,
}

\lstdefinestyle{lstcodebox} {
	basicstyle=\scriptsize\ttfamily,
	autogobble=true,
	tabsize=2,
	captionpos=b,
	float,
}

\usepackage{graphicx}
\graphicspath{
	{./pictures/},
  {../fig/},
  {../}
}

\usepackage[backend=bibtex, style=numeric]{biblatex}
\addbibresource{bibliography.bib}


\usepackage{enumitem}
\setitemize{noitemsep,topsep=0pt,parsep=0pt,partopsep=0pt}

\definecolor{darkgreen}{rgb}{0,0.5,0}
\definecolor{darkyellow}{rgb}{0.7,0.7,0}


\usepackage{cleveref}
\crefname{codecount}{Code}{Codes}

\title{Using Machine Learning to Identify Similar Jobs Based on their IO Behavior}
\author{Julian Kunkel\inst{1} \and Eugen Betke\inst{2}}

\institute{
University of Reading --
\email{j.m.kunkel@reading.ac.uk}%
\and
DKRZ --
\email{betke@dkrz.de}%
}
\begin{document}
\maketitle

\begin{abstract}

Support staff in a data center may identify a particular job that is not performing well.
How can they then find similar jobs?

A further problem lies in the definition of similarity.

In this paper, a methodology and algorithms to identify similar jobs based on profiles and time series are illustrated.
Similar to a study.

Research question: is this methodology effective in finding similar jobs?

The contribution of this paper...
\end{abstract}

\section{Introduction}

%This paper is structured as follows.
%We start with the related work in \Cref{sec:relwork}.
%Then, in TODO we introduce the DKRZ monitoring systems and explain how I/O metrics are captured by the collectors.
%In \Cref{sec:methodology} we describe the data reduction and the machine learning approaches and do an experiment in \Cref{sec:data,sec:evaluation}.
%Finally, we finalize our paper with a summary in \Cref{sec:summary}.

\section{Related Work}
\label{sec:relwork}

\section{Methodology}
\label{sec:methodology}

The input is the ID of the reference job.
From the 4D time series data (number of nodes, file systems, 9 metrics, time), a feature set is created.

The algorithms are adapted as follows:
\begin{itemize}
	\item iterate over all jobs
		\begin{itemize}
			\item compute the distance to the reference job
		\end{itemize}
	\item sort the jobs based on their distance to the reference job
	\item create a cumulative job distribution based on the distance for visualization, and allow users to output jobs within a given distance
\end{itemize}
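
The steps above can be written compactly.
The following is only a sketch under assumed notation; the symbols $J$, $r$, $d$, $\pi$, and $F_r$ are introduced here for illustration and do not stem from the implementation.
Let $J$ be the set of jobs, $r$ the reference job, and $d(j, r) \geq 0$ the distance computed by the chosen algorithm.
Sorting yields an ordering $\pi$ of the jobs, and the cumulative job distribution is the empirical distribution of the distances:
\begin{gather*}
  d(\pi_1, r) \leq d(\pi_2, r) \leq \dots \leq d(\pi_{|J|}, r), \\
  F_r(x) = \frac{\left|\{\, j \in J : d(j, r) \leq x \,\}\right|}{|J|}.
\end{gather*}
Under this reading, outputting the jobs for a given distance $x$ corresponds to the set $\{\, j \in J : d(j, r) \leq x \,\}$, which contains $F_r(x) \cdot |J|$ jobs.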

A user might be interested in exploring, say, the closest 10 or 50 jobs.

Algorithms:
The profile algorithm operates on job profiles (job duration, job metrics, or a combination of both)
and simply computes the geometric-mean distance between the profiles.

Check time series algorithms:

\begin{itemize}
	\item bin
	\item hex\_native
  \item hex\_lev
	\item hex\_quant
\end{itemize}

\section{Evaluation}
\label{sec:evaluation}

For each reference job and algorithm, we created a CSV file with the computed similarity for all other jobs.
First, we analyze the performance of the algorithms.
Then, we analyze the quantitative behavior and the correlation between the chosen similarity and the number of found jobs, and, finally, the quality of the 100 most similar jobs.

\subsection{Reference Jobs}

In the following, we assume a job is given and we aim to identify similar jobs.
We chose several reference jobs with different compute and IO characteristics:
\begin{itemize}
	\item Job-S: performs post-processing on a single node. This is a typical process in climate science where data products are reformatted and annotated with metadata to a standard representation (so-called CMORization). The post-processing is IO-intensive.
  \item Job-M: a typical MPI-parallel 8-hour compute job on 128 nodes which writes time series data after some spin-up.   %CHE.ws12
	\item Job-L: a 66-hour 20-node job.
  The initialization data is read at the beginning.
  Then only a single master node constantly writes a small volume of data; in fact, the generated data is too small to be categorized as IO-relevant.
\end{itemize}

The segmented timelines of the jobs are visualized in \Cref{fig:refJobs}.
This coding is also used for the HEX class of algorithms (BIN algorithms merge all timelines together as described in \jk{TODO}).
The figures show the values of active metrics ($\neq 0$) only; if only a few are active, they are shown in one timeline, otherwise they are rendered individually to provide a better overview.
For example, we can see in \Cref{fig:job-S} that several metrics increase in Segment\,6.

\begin{figure}
\begin{subfigure}{0.8\textwidth}
\centering
\includegraphics[width=\textwidth]{job-timeseries4296426}
\caption{Job-S} \label{fig:job-S}
\end{subfigure}
\centering


\begin{subfigure}{0.8\textwidth}
\centering
\includegraphics[width=\textwidth]{job-timeseries5024292}
\caption{Job-M} \label{fig:job-M}
\end{subfigure}
\centering


\caption{Reference jobs: segmented timelines of mean IO activity}
\label{fig:refJobs}
\end{figure}


\begin{figure}\ContinuedFloat

\begin{subfigure}{0.8\textwidth}
\centering
\includegraphics[width=\textwidth]{job-timeseries7488914-30}
\caption{Job-L (first 30 segments of 400; remaining segments are similar)}
\label{fig:job-L}
\end{subfigure}
\centering
\caption{Reference jobs: segmented timelines of mean IO activity}
\end{figure}


\subsection{Performance}

\jk{Describe System at DKRZ from old paper}

To measure the performance of computing the similarity to the reference jobs, the algorithms are executed 10 times on a compute node at DKRZ.
A boxplot of the runtimes is shown in \Cref{fig:performance}.
The runtime is normalized to 100k jobs, i.e., for bin\_all it takes about 41\,s to process 100k jobs out of the 500k total jobs that this algorithm will process.
Generally, the bin algorithms are fastest, while the hex algorithms often take 4--5$\times$ as long.
Hex\_phases is slow for Job-S and Job-M while it is fast for Job-L; the reason is that just one phase is extracted for Job-L.
The Levenshtein-based algorithms take longer for longer jobs, proportional to the job length, as they apply a sliding window.
Note that the current algorithms are sequential and executed on just one core.
For computing the similarity to one (or a small set of) reference jobs, they could easily be parallelized.
We believe this will then allow a near-online analysis of a job.
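
As a rough worked example of the normalization, assume that the runtime scales linearly with the number of processed jobs; the symbols $t_{\mathrm{total}}$, $t_{100\mathrm{k}}$, and $N$ are introduced here for illustration only.
With $N$ processed jobs and a total runtime of $t_{\mathrm{total}}$, the normalized runtime is
\begin{equation*}
  t_{100\mathrm{k}} = t_{\mathrm{total}} \cdot \frac{100{,}000}{N},
\end{equation*}
so the roughly 41\,s reported for bin\_all with $N \approx 500{,}000$ would correspond to $t_{\mathrm{total}} \approx 5 \cdot 41\,\mathrm{s} = 205\,\mathrm{s}$ for one sequential pass over all jobs.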

\begin{figure}
\centering
  \begin{subfigure}{0.31\textwidth}
  \centering
  \includegraphics[width=\textwidth]{progress_4296426-out-boxplot}
  \caption{Job-S (runtime=15,551\,s, segments=25)} \label{fig:perf-job-S}
  \end{subfigure}
  \begin{subfigure}{0.31\textwidth}
  \centering
  \includegraphics[width=\textwidth]{progress_5024292-out-boxplot}
  \caption{Job-M (runtime=28,828\,s, segments=48)} \label{fig:perf-job-M}
  \end{subfigure}
  \begin{subfigure}{0.31\textwidth}
  \centering
  \includegraphics[width=\textwidth]{progress_7488914-out-boxplot}
  \caption{Job-L} \label{fig:perf-job-L}
  \end{subfigure}

  \caption{Runtime of the algorithms to compute the similarity to reference jobs}
  \label{fig:performance}
\end{figure}


\subsection{Quantitative Analysis}

In the quantitative analysis, we explore for the different algorithms how the similarity of our pool of jobs to our three reference jobs (Job-S, Job-M, and Job-L) is distributed.
The cumulative distribution of similarity to the reference jobs is shown in \Cref{fig:ecdf}.
For example, in \Cref{fig:ecdf-job-S}, we see that about 70\% of the jobs have a similarity of less than 10\% to Job-S for HEX\_native.
BIN\_aggzeros shows some steep increases, e.g., more than 75\% of jobs have the same low similarity below 2\%.
The different algorithms lead to different curves for our reference jobs, e.g., for Job-S, HEX\_phases bundles more jobs with low similarity compared to the other algorithms; for Job-L, it is the slowest.
% This indicates that the algorithms

The support team in a data center may have time to investigate the most similar jobs.
Time for the analysis is typically bounded; for instance, the team may analyze the 100 most similar jobs (the Top\,100).
In \Cref{fig:hist}, the histograms with the actual number of jobs for a given similarity are shown.
As we focus on a feasible number of jobs, the diagram should be read from right (100\% similarity) to left; for each bin, we show at most 100 jobs (the total number is still given).
It turns out that both BIN algorithms produce nearly identical histograms, so we omit one of them.
In the figures, we can again see a different behavior of the algorithms depending on the reference job.
Especially for Job-S, we can see clusters of jobs with higher similarity (e.g., for hex\_lev at SIM=75\%), while for Job-M, the growth in the relevant section is more steady.
For Job-L, we barely find similar jobs, except when using the HEX\_phases algorithm.

Practically, the support team would start with Rank\,1 (most similar job, presumably, the reference job itself) and walk down until the jobs look different, or until a cluster is analyzed.

\begin{figure}

\begin{subfigure}{0.8\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/ecdf}
\caption{Job-S} \label{fig:ecdf-job-S}
\end{subfigure}
\centering

\begin{subfigure}{0.8\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/ecdf}
\caption{Job-M} \label{fig:ecdf-job-M}
\end{subfigure}
\centering

\begin{subfigure}{0.8\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/ecdf}
\caption{Job-L} \label{fig:ecdf-job-L}
\end{subfigure}
\centering
\caption{Quantitative job similarity -- empirical cumulative distribution function}
\label{fig:ecdf}
\end{figure}


\begin{figure}
\centering

\begin{subfigure}{0.75\textwidth}
\centering
\includegraphics[width=\textwidth,trim={0 0 0 2.2cm},clip]{job_similarities_4296426-out/hist-sim}
\caption{Job-S} \label{fig:hist-job-S}
\end{subfigure}

\begin{subfigure}{0.75\textwidth}
\centering
\includegraphics[width=\textwidth,trim={0 0 0 2.2cm},clip]{job_similarities_5024292-out/hist-sim}
\caption{Job-M} \label{fig:hist-job-M}
\end{subfigure}

\begin{subfigure}{0.75\textwidth}
\centering
\includegraphics[width=\textwidth,trim={0 0 0 2.2cm},clip]{job_similarities_7488914-out/hist-sim}
\caption{Job-L} \label{fig:hist-job-L}
\end{subfigure}
\centering
\caption{Histogram for the number of jobs (bin width: 2.5\%, numbers are the actual job counts). BIN\_aggzeros is nearly identical to BIN\_all.}
\label{fig:hist}
\end{figure}

\subsubsection{Inclusivity and Specificity}


The user count and the group count are the same, i.e., the number of unique groups is identical to the number of unique users, which suggests that each user maps to its own group; for Job-L, the user and group counts differ a bit, and for Job-M a bit more.
There are up to about 2$\times$ as many users as groups.

To understand how the Top\,100 are distributed across users, the data is grouped by user ID and counted.
\Cref{fig:userids} shows the stacked user information, where the lowest stack is the user with the most jobs and the topmost user in the stack has the smallest number of jobs.
For Job-S, we can see that about 70--80\% of jobs stem from one user; for the hex\_lev and hex\_native algorithms, the other jobs stem from a second user, while bin includes jobs from additional users (5 in total).
For Job-M, jobs from more users are included (13); about 25\% of jobs stem from the same user. Here, hex\_lev and hex\_native include more users (30 and 33, respectively) than the other three algorithms.
For Job-L, the two hex algorithms include a slightly more diverse user community (12 and 13 users) than the bin algorithms (9), but hex\_phases covers 35 users.

\begin{figure}
\begin{subfigure}{0.31\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/user-ids}
\caption{Job-S} \label{fig:users-job-S}
\end{subfigure}
\begin{subfigure}{0.31\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/user-ids}
\caption{Job-M} \label{fig:users-job-M}
\end{subfigure}
\begin{subfigure}{0.31\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/user-ids}
\caption{Job-L} \label{fig:users-job-L}
\end{subfigure}


\caption{User information for all 100 top ranked jobs}
\label{fig:userids}
\end{figure}

\begin{figure}
%\begin{subfigure}{0.31\textwidth}
%\centering
%\includegraphics[width=\textwidth]{job_similarities_4296426-out/jobs-nodes}
%\caption{Job-S} \label{fig:nodes-job-S}
%\end{subfigure}
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/jobs-nodes}
\caption{Job-M (ref. job runs on 128 nodes)} \label{fig:nodes-job-M}
\end{subfigure}
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/jobs-nodes}
\caption{Job-L (reference job runs on 20 nodes)} \label{fig:nodes-job-L}
\end{subfigure}
\centering
\caption{Distribution of node counts (for Job-S nodes=1 in all cases)}
\label{fig:nodes-job}
\end{figure}

\begin{figure}
\begin{subfigure}{0.31\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/jobs-elapsed}
\caption{Job-S ($job=10^{4.19}$)} \label{fig:runtime-job-S}
\end{subfigure}
\begin{subfigure}{0.31\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/jobs-elapsed}
\caption{Job-M ($job=10^{4.46}$)} \label{fig:runtime-job-M}
\end{subfigure}
\begin{subfigure}{0.31\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/jobs-elapsed}
\caption{Job-L ($job=10^{5.3}$)} \label{fig:runtime-job-L}
\end{subfigure}
\centering
\caption{Distribution of runtime for all 100 top ranked jobs}
\label{fig:runtime-job}
\end{figure}

To see how differently the algorithms behave, the intersection of two algorithms is computed for the 100 jobs with the highest similarity and visualized in \Cref{fig:heatmap-job}.
As expected, we can observe that bin\_all and bin\_aggzeros are very similar for all three jobs.
While there is some reordering, both algorithms lead to a comparable order.
The hex\_lev and hex\_native algorithms also exhibit some overlap, particularly for Job-S and Job-L.
For Job-M, however, they lead to a different ranking and Top\,100.
From the analysis, we conclude that one representative from binary quantization is sufficient, while the other algorithms identify mostly disjoint behavioral aspects and, therefore, should be considered together.
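
The pairwise overlap can be stated as follows; the notation is an assumption made here for illustration, and the text does not state whether the heatmap shows counts or fractions.
Let $T_{100}(A)$ denote the set of the 100 jobs that algorithm $A$ ranks as most similar to the reference job.
For two algorithms $A$ and $B$, the intersection count is
\begin{equation*}
  I(A, B) = \left| T_{100}(A) \cap T_{100}(B) \right|, \qquad 0 \leq I(A, B) \leq 100,
\end{equation*}
with $I(A, B) = I(B, A)$ and $I(A, A) = 100$.
A value near 100 means that both algorithms select nearly the same jobs, possibly in a different order, whereas a value near 0 means that they select mostly disjoint jobs.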

One consideration is to identify jobs that meet a rank threshold for all different algorithms.
\jk{TODO}

\begin{figure}
\begin{subfigure}{0.31\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/intersection-heatmap}
\caption{Job-S} \label{fig:heatmap-job-S}
\end{subfigure}
\begin{subfigure}{0.31\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/intersection-heatmap}
\caption{Job-M} \label{fig:heatmap-job-M} %,trim={2.5cm 0 0 0},clip
\end{subfigure}
\begin{subfigure}{0.31\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/intersection-heatmap}
\caption{Job-L} \label{fig:heatmap-job-L}
\end{subfigure}

\centering
\caption{Intersection of the 100 top ranked jobs for different algorithms}
\label{fig:heatmap-job}
\end{figure}

\section{Assessing Timelines for Similar Jobs}

\subsection{Job-S}

\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/hex_lev-0.9615--1timeseries4296288}
\caption{Rank 2, SIM=0.9615}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/hex_lev-0.9012--15timeseries4296277}
\caption{Rank 15, SIM=0.9012}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/hex_lev-0.7901--99timeseries4297842}
\caption{Rank\,100, SIM=0.790}
\end{subfigure}

\caption{Job-S with Hex-Lev, selection of similar jobs}
\label{fig:job-S-hex-lev}
\end{figure}

\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/hex_native-0.9808--1timeseries4296288}
\caption{Rank 2, SIM=}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/hex_native-0.9375--15timeseries4564296}
\caption{Rank 15, SIM=}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/hex_native-0.8915--99timeseries4296785}
\caption{Rank\,100, SIM=}
\end{subfigure}

\caption{Job-S with Hex-Native, selection of similar jobs}
\label{fig:job-S-hex-native}
\end{figure}

% \ContinuedFloat

Hex\_phases is very similar to hex\_native.
A strange job worth inspecting: \verb|job_similarities_4296426-out/hex_phases-0.7429--93timeseries4237860|


Bin aggzeros works quite well here too. The jobs are a bit more diverse.


\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/bin_aggzeros-0.8462--1timeseries4296280}
\caption{Rank 2, SIM=}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/bin_aggzeros-0.7778--14timeseries4555405}
\caption{Rank 15, SIM=}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_4296426-out/bin_aggzeros-0.6923--99timeseries4687419}
\caption{Rank\,100, SIM=}
\end{subfigure}

\caption{Job-S with bin\_aggzero, selection of similar jobs}
\label{fig:job-S-bin-aggzeros}
\end{figure}


\subsection{Job-M}

Bin\_aggzeros returns unusable results for this job.


\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/bin_aggzeros-0.7755--1timeseries8010306}
\caption{Rank 2, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/bin_aggzeros-0.7347--14timeseries4498983}
\caption{$SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/bin_aggzeros-0.5102--99timeseries5120077}
\caption{$SIM=$ }
\end{subfigure}

\caption{Job-M with Bin-Aggzero, selection of similar jobs}
\label{fig:job-M-bin-aggzero}
\end{figure}


\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_lev-0.9546--1timeseries7826634}
\caption{Rank 2, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_lev-0.9365--2timeseries5240733}
\caption{Rank 3, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_lev-0.7392--15timeseries7651420}
\caption{$SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_lev-0.7007--99timeseries8201967}
\caption{$SIM=$ }
\end{subfigure}

\caption{Job-M with hex\_lev, selection of similar jobs}
\label{fig:job-M-hex-lev}
\end{figure}


\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_native-0.9878--1timeseries5240733}
\caption{Rank 2, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_native-0.9651--2timeseries7826634}
\caption{Rank 3, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_native-0.9084--14timeseries8037817}
\caption{$SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_native-0.8838--99timeseries7571967}
\caption{$SIM=$ }
\end{subfigure}

\caption{Job-M with hex\_native, selection of similar jobs}
\label{fig:job-M-hex-native}
\end{figure}


\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_phases-0.8831--1timeseries7826634}
\caption{Rank 2, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_phases-0.7963--2timeseries5240733}
\caption{Rank 3, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_phases-0.4583--14timeseries4244400}
\caption{$SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_5024292-out/hex_phases-0.2397--99timeseries7644009}
\caption{$SIM=$ }
\end{subfigure}

\caption{Job-M with hex\_phases, selection of similar jobs}
\label{fig:job-M-hex-phases}
\end{figure}

\subsection{Job-L}


\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/bin_aggzeros-0.1671--1timeseries7869050}
\caption{Rank 2, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/bin_aggzeros-0.1671--2timeseries7990497}
\caption{Rank 3, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\includegraphics[width=\textwidth]{job_similarities_7488914-out/bin_aggzeros-0.1521--14timeseries8363584}
\caption{$SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/bin_aggzeros-0.1097--97timeseries4262983}
\caption{$SIM=$ }
\end{subfigure}

\caption{Job-L with bin\_aggzero, selection of similar jobs}
\label{fig:job-L-bin-aggzero}
\end{figure}


\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_lev-0.9386--1timeseries7266845}
\caption{Rank 2, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_lev-0.9375--2timeseries7214657}
\caption{Rank 3, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_lev-0.7251--14timeseries4341304}
\caption{$SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_lev-0.1657--99timeseries8036223}
\caption{$SIM=$ (30s)}
\end{subfigure}

\caption{Job-L with hex\_lev, selection of similar jobs}
\label{fig:job-L-hex-lev}
\end{figure}


\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_native-0.9390--1timeseries7266845}
\caption{Rank 2, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_native-0.9333--2timeseries7214657}
\caption{Rank 3, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_native-0.8708--14timeseries4936553}
\caption{$SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_native-0.1695--99timeseries7942052}
\caption{$SIM=$ }
\end{subfigure}

\caption{Job-L with hex\_native, selection of similar jobs}
\label{fig:job-L-hex-native}
\end{figure}

\begin{figure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_phases-1.0000--14timeseries4577917}
\caption{Rank 2, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_phases-1.0000--1timeseries4405671}
\caption{Rank 3, $SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_phases-1.0000--2timeseries4621422}
\caption{$SIM=$}
\end{subfigure}
\begin{subfigure}{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{job_similarities_7488914-out/hex_phases-1.0000--99timeseries4232293}
\caption{$SIM=$ }
\end{subfigure}

\caption{Job-L with hex\_phases, selection of similar jobs}
\label{fig:job-L-hex-phases}
\end{figure}


\section{Conclusion}
\label{sec:summary}

%\printbibliography
\end{document}