Minor improvement

2020-12-04 16:04:35 +00:00 · 2020-12-04 16:04:35 +00:00 · 7be00c5a3b
commit 7be00c5a3b
parent fc99affb60
1 changed file with 4 additions and 4 deletions
--- a/paper/main.tex
+++ b/paper/main.tex
@@ -205,17 +205,17 @@ They differ in the way data similarity is defined; either the time series is enc
 B-all determines similarity between binary codings by means of Levenshtein distance.
 B-aggz is similar to B-all, but computes similarity on binary codings where subsequent segments of zero activities are replaced by just one zero.
 Q-lev determines similarity between quantized codings by using Levenshtein distance.
-Q-native uses a performance-aware similarity function, i.e., distance for a metric is $\frac{|m_{job1} - m_{job2}|}{16}$.
+Q-native uses a performance-aware similarity function, i.e., the distance between two jobs for a metric is $\frac{|m_{job1} - m_{job2}|}{16}$.
 For jobs with different lengths, we apply a sliding-window approach that finds the location in the longer job at which the shorter job has the highest similarity.
 Q-phases extracts phase information and performs a phase-aware and performance-aware similarity computation.
 The Q-phases algorithm extracts I/O phases and computes the similarity between the most similar I/O phases of both jobs.
-In this paper, we add a new similarity definition based on Kolmogorov-Smirnov-Test that compares the probability distribution of the observed values which we describe in the following.
-In brief, KS concatenates individual node data (instead of averaging) and computes similarity be means of Kolmogorov-Smirnov-Test.
+In this paper, we add a similarity definition based on the Kolmogorov-Smirnov test, which compares the probability distributions of the observed values; we describe it in the following.
+%In brief, KS concatenates individual node data and computes similarity by means of the Kolmogorov-Smirnov test.

 \paragraph{Kolmogorov-Smirnov (KS) algorithm}
 % Summary
 For the analysis, we perform two preparation steps.
-Dimension reduction by computing means across the two file systems and by concatenating the time series data of the individual nodes.
+Dimension reduction by computing means across the two file systems and by concatenating the time series data of the individual nodes (instead of averaging them).
 This reduces the four-dimensional dataset to two dimensions (time, metrics).

 % Aggregation
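
Note on the hunk above: the two descriptions touched by this commit (the Q-native distance with its sliding window, and the KS preparation steps) can be illustrated with a small sketch. This is not the paper's implementation. The function names, the nested-list layout `job[node][fs][t]` for a single metric, and the use of the mean over segments to aggregate the per-segment distance are assumptions made for illustration; only the per-segment formula $\frac{|m_{job1} - m_{job2}|}{16}$, the mean across file systems, and the concatenation of node data come from the text.

```python
from bisect import bisect_right


def q_native_distance(job1, job2):
    """Smallest mean per-segment distance over all window positions.

    Assumption: per-segment distances are aggregated by their mean.
    """
    short, long = sorted((job1, job2), key=len)
    if not short:
        raise ValueError("jobs must contain at least one segment")
    best = None
    # Slide the shorter job over the longer one and keep the best match.
    for offset in range(len(long) - len(short) + 1):
        window = long[offset:offset + len(short)]
        # Per-segment distance from the text: |m_job1 - m_job2| / 16.
        dist = sum(abs(a - b) / 16 for a, b in zip(short, window)) / len(short)
        if best is None or dist < best:
            best = dist
    return best


def reduce_job(job):
    """Reduce job[node][fs][t] (one metric) to a flat list of values.

    Values are averaged across file systems; node series are concatenated.
    """
    series = []
    for node in job:
        # zip(*node) groups the values of all file systems per time step.
        series.extend(sum(values) / len(values) for values in zip(*node))
    return series


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
        for v in xs + ys
    )
```

For example, `ks_statistic(reduce_job(job_a), reduce_job(job_b))` would compare the value distributions of one metric for two hypothetical jobs `job_a` and `job_b`. How the paper turns this statistic into a similarity score is not shown in this hunk, so the sketch stops at the statistic.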