Conclusion
parent
0a581098bb
commit
fd867d55f0
|
@ -987,10 +987,24 @@ As expected, the histograms mimics the profile of the reference job, and thus, t
|
||||||
\section{Conclusion}
|
\section{Conclusion}
|
||||||
\label{sec:summary}
|
\label{sec:summary}
|
||||||
|
|
||||||
One consideration could be to identify jobs that are found by all algorithms, i.e., jobs that meet a certain (rank) threshold for different algorithms.
|
In this article, we conducted a study to identify similar jobs based on timelines of nine I/O statistics.
|
||||||
|
To this end, we applied six previously developed algorithmic strategies and additionally included a distance metric based on the Kolmogorov-Smirnov test.
|
||||||
|
The quantitative analysis shows that a diverse set of results can be found and that only a tiny subset of the 500k jobs is very similar to each of the three reference jobs.
|
||||||
|
For the small post-processing job, which is executed many times, all algorithms produce suitable results.
|
||||||
|
For Job-M, the algorithms exhibit a different behavior.
|
||||||
|
Job-L is tricky to analyze because it is compute-intensive with only a single I/O phase at the beginning.
|
||||||
|
Generally, the KS algorithm finds jobs with similar histograms, which are not necessarily what we are subjectively looking for.
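A distance of this kind can be illustrated with the two-sample Kolmogorov-Smirnov statistic, i.e., the largest gap between two empirical distribution functions. The following is only a minimal sketch: the name `ks_distance` and the assumption that each job is represented by a flat list of observed values for one I/O statistic are illustrative and not taken from the implementation used in this study.

```python
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (hypothetical helper).

    Returns the maximum absolute difference between the empirical
    CDFs of the samples a and b; 0.0 means identical distributions,
    1.0 means the value ranges do not overlap at all.
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    # The supremum of |F_a - F_b| is attained at one of the sample points.
    for x in sa + sb:
        fa = bisect_right(sa, x) / len(sa)  # share of a that is <= x
        fb = bisect_right(sb, x) / len(sb)  # share of b that is <= x
        d = max(d, abs(fa - fb))
    return d
```

Because the statistic only compares value distributions, two jobs whose I/O phases occur at different times can still obtain a small distance, which is consistent with the observation that similar histograms are not always the similarity one is looking for.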
|
||||||
|
|
||||||
|
We found that computing the similarity of a reference job to all jobs and ranking them by this similarity was successful in finding the related jobs that we were interested in.
|
||||||
|
The HEX\_lev and HEX\_native algorithms work best according to our subjective qualitative analysis.
|
||||||
|
Typically, a related job stems from the same user/group and may have a related job name, but the approach was inclusive.
|
||||||
|
However, all algorithms perform their task as intended.
|
||||||
|
The pre-processing steps and distance metrics of the algorithms differ, leading to different definitions of similarity.
|
||||||
|
The data center support staff or the user must define what similarity means in order to select the algorithm that suits best.
|
||||||
|
Another consideration could be to identify jobs that are found by all algorithms, i.e., jobs that meet a certain (rank) threshold for different algorithms.
|
||||||
That would increase the likelihood that these jobs are very similar and what the user is looking for.
|
That would increase the likelihood that these jobs are very similar and what the user is looking for.
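The idea of keeping only jobs that meet a rank threshold for every algorithm can be sketched as a set intersection over the top-ranked jobs. This is a hedged illustration only: the function name `jobs_found_by_all`, the dictionary layout, and the job identifiers in the usage example are assumptions and not part of the study's tooling.

```python
def jobs_found_by_all(rankings, threshold):
    """Return the jobs ranked within the top `threshold` by every algorithm.

    rankings: dict mapping an algorithm name to a list of job ids,
              ordered from most to least similar to the reference job
              (assumed layout for this sketch).
    threshold: number of top-ranked jobs considered per algorithm.
    """
    top_sets = [set(ranked[:threshold]) for ranked in rankings.values()]
    if not top_sets:
        return set()
    # A job qualifies only if it appears in every algorithm's top list.
    return set.intersection(*top_sets)


# Usage with made-up job ids; only the algorithm names come from the text.
example = {
    "HEX_lev": ["j1", "j2", "j3", "j4"],
    "HEX_native": ["j2", "j1", "j5", "j3"],
    "KS": ["j2", "j6", "j1", "j7"],
}
common = jobs_found_by_all(example, threshold=3)
```

A stricter threshold shrinks the result set, so the threshold would have to be chosen according to how many candidate jobs the user is willing to inspect.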
|
||||||
|
|
||||||
The KS algorithm finds jobs with similar histograms which are not necessarily what we are looking for.
|
Our next step is to foster a discussion in the community to identify and define suitable similarity metrics for the different analysis purposes.
|
||||||
|
|
||||||
\printbibliography
|
\printbibliography
|
||||||
\end{document}
|
\end{document}
|
||||||
|
|