bin algorithms -> B algorithms

This commit is contained in:
parent d20d36922e
commit 0b8858fcfc
@@ -356,7 +356,7 @@ Finally, the quantitative behavior of the 100 most similar jobs is investigated.
 To measure the performance for computing the similarity to the reference jobs, the algorithms are executed 10 times on a compute node at DKRZ which is equipped with two Intel Xeon E5-2680v3 @2.50GHz and 64GB DDR4 RAM.
 A boxplot for the runtimes is shown in \Cref{fig:performance}.
 The runtime is normalized for 100k jobs, i.e., for B-all it takes about 41\,s to process 100k jobs out of the 500k total jobs that this algorithm will process.
-Generally, the bin algorithms are fastest, while the Q algorithms often take 4-5x as long.
+Generally, the B algorithms are fastest, while the Q algorithms often take 4-5x as long.
 Q\_phases is slow for Job-S and Job-M while it is fast for Job-L; the reason is that just one phase is extracted for Job-L.
 The Levenshtein-based algorithms take longer for longer jobs -- proportional to the job length, as they apply a sliding window.
 The KS algorithm is 10x faster than the others, but it operates only on the statistics of the time series.
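The sliding-window behavior noted above explains why the Levenshtein-based algorithms scale with job length. A minimal sketch, assuming jobs are encoded as strings (the paper's actual encoding of the monitoring data is not reproduced here, and the function names are hypothetical):

```python
def levenshtein(s, t):
    # Classic dynamic-programming edit distance between two sequences.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]

def sliding_similarity(ref, job):
    # Slide a window of the reference length across the (assumed longer)
    # job and keep the best match; the cost grows with the job length,
    # which is why these algorithms take longer for longer jobs.
    m = len(ref)
    best = min(levenshtein(ref, job[i:i + m]) for i in range(len(job) - m + 1))
    return 1.0 - best / m
```

This is only an illustration of the windowing idea; the paper's own distance-to-similarity normalization may differ.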
@@ -480,7 +480,7 @@ To understand how the Top\,100 are distributed across users, the data is grouped
 \Cref{fig:userids} shows the stacked user information, where the lowest stack is the user with the most jobs and the topmost user in the stack has the smallest number of jobs.
 For Job-S, we can see that about 70-80\% of jobs stem from one user; for the Q-lev and Q-native algorithms, the other jobs stem from a second user, while bin includes jobs from additional users (5 in total).
 For Job-M, jobs from more users are included (13); about 25\% of jobs stem from the same user; here, Q-lev, Q-native, and KS include more users (29, 33, and 37, respectively) than the other three algorithms.
-For Job-L, the two Q algorithms include a somewhat more diverse user community (12 and 13 users) than the bin algorithms (9), but Q-phases covers 35 users.
+For Job-L, the two Q algorithms include a somewhat more diverse user community (12 and 13 users) than the B algorithms (9), but Q-phases covers 35 users.
 We did not include the group analysis in the figure as user count and group ID are proportional; at most, the number of users is 2x the number of groups.
 Thus, a user is likely from the same group, and the number of groups is similar to the number of unique users.
 
@@ -494,7 +494,7 @@ The boxplots have different shapes, which is an indication that the different al
 
 \paragraph{Runtime distribution.}
 The job runtime of the Top\,100 jobs is shown using boxplots in \Cref{fig:runtime-job}.
-While all algorithms can compute the similarity between jobs of different length, the bin algorithms and Q-native penalize jobs of different length, preferring jobs of very similar length.
+While all algorithms can compute the similarity between jobs of different length, the B algorithms and Q-native penalize jobs of different length, preferring jobs of very similar length.
 For Job-M and Job-L, Q-phases and KS are able to identify much shorter or longer jobs.
 For Job-L, the job itself isn't included in the chosen Top\,100 (see \Cref{fig:hist-job-L}; 393 jobs have a similarity of 100\%), which is the reason why the job runtime isn't shown in the figure itself.
 
@@ -733,7 +733,7 @@ So this job type isn't necessarily executed frequently and, therefore, our Top\,
 Some applications are more prominent in these sets, e.g., for B-aggzero, 32~jobs contain WRF (a model) in the name.
 The number of unique names is 19, 38, 49, and 51 for B-aggzero, Q-phases, Q-native, and Q-lev, respectively.
 
-The jobs that are similar according to the bin algorithms (see \Cref{fig:job-M-bin-aggzero}) differ from our expectations.
+The jobs that are similar according to the B algorithms (see \Cref{fig:job-M-bin-aggzero}) differ from our expectations.
 The other algorithms like Q-lev (\Cref{fig:job-M-hex-lev}) and Q-native (\Cref{fig:job-M-hex-native}) seem to work as intended:
 while jobs exhibit short bursts of other active metrics even for low similarity, we can eyeball a relevant similarity.
 The KS algorithm, working on the histograms, ranks the jobs correctly on the similarity of their histograms.
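The KS algorithm compares the distributions of the (concatenated) time series rather than the sequences themselves, which is also why it is so much faster. A minimal pure-Python sketch of a two-sample Kolmogorov-Smirnov similarity; the function names and the distance-to-similarity mapping are assumptions, not the paper's code:

```python
import bisect

def ks_statistic(a, b):
    # Two-sample Kolmogorov-Smirnov statistic: the maximum vertical
    # distance between the empirical CDFs of the two samples.
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        f_a = bisect.bisect_right(a, x) / len(a)
        f_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d

def ks_similarity(a, b):
    # Hypothetical mapping of the KS distance (in [0, 1]) to a
    # similarity score: identical distributions give 1.0.
    return 1.0 - ks_statistic(a, b)
```

In practice, a library routine such as `scipy.stats.ks_2samp` computes the same statistic; the sketch above only illustrates why comparing summary distributions is cheap compared to sequence alignment.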
@@ -871,7 +871,7 @@ Remember, for the KS algorithm, we concatenate the metrics of all nodes together
 
 \subsection{Job-L}
 
-The bin algorithms find a low similarity (the best 2nd-ranked job is 17\% similar); the inspection of job names (14 unique names) leads to two prominent applications: bash and xmessy, with 45 and 48 instances, respectively.
+The B algorithms find a low similarity (the best 2nd-ranked job is 17\% similar); the inspection of job names (14 unique names) leads to two prominent applications: bash and xmessy, with 45 and 48 instances, respectively.
 In \Cref{fig:job-L-bin-aggzero}, it can be seen that the found jobs have little in common with the reference job.
 
 The Q-lev and Q-native algorithms identify a more diverse set of applications (18 unique names and no xmessy job).