diff --git a/paper/main.tex b/paper/main.tex
index 358e9a6..3dd165d 100644
--- a/paper/main.tex
+++ b/paper/main.tex
@@ -994,13 +994,11 @@ For the small post-processing job, which is executed many times, all algorithms
 For Job-M, the algorithms exhibit a different behavior.
 Job-L is tricky to analyze, because it is compute intense with only a single I/O phase at the beginning.
 Generally, the KS algorithm finds jobs with similar histograms which are not necessarily what we subjectively are looking for.
-
-We found that the approach to compute similarity of a reference jobs to all jobs and ranking these based on their similarity was successful to find related jobs that we were interested in.
+We found that the approach of computing the similarity of a reference job to all jobs and ranking them was successful in finding the related jobs we were interested in.
 The Q-lev and Q-native work best according to our subjective qualitative analysis.
-Typically, a related job stems from the same user/group and may have a related job name but the approach was inclusive.
-However, all algorithms perform their task as intended.
+Typically, a related job stems from the same user/group and may have a related job name, but the approach was able to find other jobs as well.
 The pre-processing of the algorithms and distance metrics differ leading to a different definition of similarity.
-The the data center support/user must define how to define similarity to select the algorithm that suits best.
+The data center support/user must decide how similarity should be defined in order to select the algorithm that suits best.
 Another consideration could be to identify jobs that are found by all algorithms, i.e., jobs that meet a certain (rank) threshold for different algorithms.
 That would increase the likelihood that these jobs are very similar and what the user is looking for.
 
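
A possible illustration of the rank-threshold idea in the closing sentences: compute one similarity ranking per algorithm and keep only the jobs that appear within the top ranks of every ranking. This is a minimal sketch, not code from the paper; the Similarity callables standing in for Q-lev, Q-native, and KS, the Job representation, and the top_k cutoff are all hypothetical placeholders.

# Sketch (hypothetical, not from the paper): rank jobs by similarity to a
# reference job, then keep only jobs ranked highly by *all* algorithms.
from typing import Callable, Dict, List, Set

Job = Dict[str, object]                    # placeholder job record
Similarity = Callable[[Job, Job], float]   # higher score = more similar

def rank_jobs(reference: Job, jobs: List[Job], sim: Similarity) -> List[int]:
    """Return job indices sorted by descending similarity to the reference."""
    return sorted(range(len(jobs)),
                  key=lambda i: sim(reference, jobs[i]),
                  reverse=True)

def consensus_jobs(reference: Job, jobs: List[Job],
                   algorithms: Dict[str, Similarity],
                   top_k: int = 100) -> Set[int]:
    """Indices of jobs within the top-k ranks of every algorithm.

    Implements the rank-threshold idea: a job retained by all
    algorithms is more likely to be truly similar to the reference.
    """
    survivors: Set[int] = set(range(len(jobs)))
    for sim in algorithms.values():
        survivors &= set(rank_jobs(reference, jobs, sim)[:top_k])
    return survivors

if __name__ == "__main__":
    # Toy demonstration with two trivial similarity metrics.
    jobs = [{"io": i} for i in range(10)]
    reference = {"io": 3}
    algorithms = {
        "abs_diff": lambda a, b: -abs(a["io"] - b["io"]),
        "sq_diff":  lambda a, b: -float((a["io"] - b["io"]) ** 2),
    }
    print(sorted(consensus_jobs(reference, jobs, algorithms, top_k=3)))

Intersecting rankings this way trades recall for precision: a larger top_k keeps more candidates, while a smaller one retains only the jobs all similarity definitions agree on.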