Conclusion
parent
0a581098bb
commit
fd867d55f0
@ -987,10 +987,24 @@ As expected, the histograms mimics the profile of the reference job, and thus, t
\section{Conclusion}
\label{sec:summary}
In this article, we conducted a study to identify similar jobs based on the timelines of nine I/O statistics.
To this end, we applied six algorithmic strategies developed in earlier work and additionally included a distance metric based on the Kolmogorov-Smirnov test.
The quantitative analysis shows that a diverse set of results can be found and that only a tiny subset of the 500k jobs is very similar to each of the three reference jobs.
For the small post-processing job, which is executed many times, all algorithms produce suitable results.
For Job-M, the algorithms behave differently.
Job-L is tricky to analyze because it is compute-intensive, with only a single I/O phase at the beginning.
Generally, the KS algorithm finds jobs with similar histograms, which are not necessarily what we are subjectively looking for.

We found that computing the similarity of a reference job to all jobs and ranking them by similarity successfully identified the related jobs that we were interested in.
The HEX\_lev and HEX\_native algorithms work best according to our subjective qualitative analysis.
Typically, a related job stems from the same user/group and may have a related job name, but the approach was inclusive.
However, all algorithms perform their task as intended.
The algorithms differ in their pre-processing and distance metrics, which leads to different definitions of similarity.
The data center support staff or the user must therefore decide how to define similarity in order to select the algorithm that suits best.
Another consideration could be to identify jobs that are found by all algorithms, i.e., jobs that meet a certain (rank) threshold for different algorithms.
This would increase the likelihood that these jobs are indeed very similar and are what the user is looking for.
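As a sketch of this idea, with notation introduced here only for illustration: let $A$ denote the set of algorithms, $r_a(j)$ the rank that algorithm $a \in A$ assigns to job $j$ relative to the reference job, and $k$ a freely chosen rank threshold.
The jobs found by all algorithms could then be written as
\[
  C_k = \bigcap_{a \in A} \{\, j \mid r_a(j) \le k \,\}.
\]
Lowering $k$ can only shrink $C_k$, so the threshold trades the number of candidates against how highly every algorithm must rank them.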

Our next step is to foster a discussion in the community to identify and define suitable similarity metrics for the different analysis purposes.
\printbibliography
\end{document}