Conclusion
parent 0a581098bb
commit fd867d55f0
				| @ -987,10 +987,24 @@ As expected, the histograms mimics the profile of the reference job, and thus, t | ||||
| \section{Conclusion} | ||||
| \label{sec:summary} | ||||
| 
 | ||||
| One consideration could be to identify jobs that are found by all algorithms, i.e., jobs that meet a certain (rank) threshold for different algorithms. | ||||
| In this article, we conducted a study to identify similar jobs based on timelines of nine I/O statistics. | ||||
| To this end, we applied six previously developed algorithmic strategies and additionally included a distance metric based on the Kolmogorov-Smirnov test. | ||||
| The quantitative analysis shows that a diverse set of results can be found and that only a tiny subset of the 500k jobs is very similar to each of the three reference jobs. | ||||
| For the small post-processing job, which is executed many times, all algorithms produce suitable results. | ||||
| For Job-M, the algorithms exhibit a different behavior. | ||||
| Job-L is tricky to analyze because it is compute-intensive with only a single I/O phase at the beginning. | ||||
| Generally, the KS algorithm finds jobs with similar histograms, which are not necessarily what we are subjectively looking for. | ||||
| 
 | ||||
| We found that computing the similarity of a reference job to all jobs and ranking them by that similarity succeeded in finding the related jobs we were interested in. | ||||
| The HEX\_lev and HEX\_native algorithms work best according to our subjective qualitative analysis. | ||||
| Typically, a related job stems from the same user/group and may have a related job name, but the approach was inclusive. | ||||
| However, all algorithms perform their task as intended. | ||||
| The pre-processing steps and distance metrics of the algorithms differ, leading to different definitions of similarity. | ||||
| The data center support staff or the user must define what similarity means in order to select the algorithm that suits best. | ||||
| Another consideration could be to identify jobs that are found by all algorithms, i.e., jobs that meet a certain (rank) threshold for different algorithms. | ||||
| That would increase the likelihood that these jobs are very similar and what the user is looking for. | ||||
| 
 | ||||
| The KS algorithm finds jobs with similar histograms which are not necessarily what we are looking for. | ||||
| Our next step is to foster a discussion in the community to identify and define suitable similarity metrics for the different analysis purposes. | ||||
| 
 | ||||
| \printbibliography | ||||
| \end{document} | ||||
|  | ||||
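The conclusion suggests identifying jobs that meet a certain rank threshold for several algorithms at once. A minimal sketch of that idea follows, assuming each algorithm yields a list of job IDs ordered from most to least similar. The function name `consensus_jobs`, the job IDs, and the example rankings are hypothetical and only illustrate the intersection step; they are not taken from the study.

```python
from typing import Dict, List, Set


def consensus_jobs(rankings: Dict[str, List[str]], rank_threshold: int) -> Set[str]:
    """Return the job IDs that appear within the top `rank_threshold`
    positions of every algorithm's ranking.

    `rankings` maps an algorithm name to its job IDs, ordered from most
    to least similar to the reference job (assumed input format).
    """
    if not rankings:
        return set()
    # Take the top-k jobs of each algorithm, then intersect across algorithms.
    top_sets = [set(ranked[:rank_threshold]) for ranked in rankings.values()]
    return set.intersection(*top_sets)


if __name__ == "__main__":
    # Hypothetical rankings for illustration only.
    example = {
        "HEX_lev": ["job-a", "job-b", "job-c", "job-d"],
        "HEX_native": ["job-b", "job-a", "job-e", "job-c"],
        "KS": ["job-f", "job-b", "job-a", "job-c"],
    }
    print(sorted(consensus_jobs(example, rank_threshold=3)))
```

A strict intersection is the most conservative choice: a job is reported only if every algorithm ranks it within the threshold. Requiring agreement from only a subset of the algorithms would be a softer variant of the same idea.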