Reformed related work.
parent a0a2ebabf7
commit 6d508d53fc
				| @ -148,8 +148,9 @@ Finally, we conclude our paper in \Cref{sec:summary}. | ||||
| \section{Related Work} | ||||
| \label{sec:relwork} | ||||
| 
 | ||||
| Related work can be classified into: distance measures, time series analysis of HPC applications, and IO monitoring tools. | ||||
| Related work can be classified into: distance measures, analysis of HPC application performance, inter-comparison of jobs in HPC, and IO-specific tools. | ||||
| 
 | ||||
| %% DISTANCE MEASURES | ||||
| The ranking of similar jobs performed in this article is related to clustering strategies. | ||||
| The comparison of time series using various metrics has been extensively investigated. | ||||
| In \cite{khotanlou2018empirical}, an empirical comparison of distance measures for clustering of multivariate time series is performed. | ||||
| @ -160,32 +161,30 @@ In this model, gaps imply a cost. | ||||
| Levenshtein distance is often referred to as Edit Distance (ED) \cite{navarro2001guided}. | ||||
| % Lock-Step Measures and Elastic Measures | ||||
| 
 | ||||
| 
 | ||||
| Monitoring systems that record statistics about hardware usage are widely used in HPC. | ||||
| In \cite{halawa2020unsupervised}, 11 performance metrics, including CPU and network, are utilized for agglomerative clustering, showing the general effectiveness of the approach. | ||||
| 
 | ||||
| % Analysis of HPC application performance | ||||
| The performance of applications can be analyzed using one of many tracing tools, such as Vampir \cite{weber2017visual}, that record the behavior of an application explicitly, or implicitly by collecting information about resource usage with a monitoring system. | ||||
| Monitoring systems that record statistics about hardware usage are widely deployed in data centers to capture the system utilization of applications. | ||||
| There are various tools for analyzing the IO behavior of an application \cite{TFAPIKBBCF19}. | ||||
| 
 | ||||
| Comparison of applications by extracting the IO patterns from application traces. | ||||
| With PAS2P \cite{mendez2012new}... | ||||
| 
 | ||||
| % time series analysis for inter-comparison of processes or jobs in HPC | ||||
| For Vampir, a popular tool for trace file analysis, \cite{weber2017visual} introduces the Comparison View, which allows users to manually compare traces of application runs, e.g., to compare optimized with original code. | ||||
| Vampir generally supports the clustering of process timelines of a single job, allowing users to focus on relevant code sections and processes when investigating a large number of processes. | ||||
| 
 | ||||
| Chameleon \cite{bahmani2018chameleon} extends ScalaTrace for recording MPI traces but reduces the overhead by clustering processes and collecting information from one representative of each cluster. | ||||
| For the clustering, a signature is created for each process that includes the call-graph. | ||||
| 
 | ||||
| Characterization of jobs | ||||
| In \cite{halawa2020unsupervised}, 11 performance metrics, including CPU and network, are utilized for agglomerative clustering of jobs, showing the general effectiveness of the approach. | ||||
| 
 | ||||
| In \cite{rodrigo2018towards}, a characterization of the NERSC workload is performed based on job scheduler information (profiles). | ||||
| Profiles that include the MPI activities have been shown to be effective for identifying the code that is executed \cite{demasi2013identifying}. | ||||
| Approaches for clustering HPC applications typically operate on profiles for compute, network, and IO \cite{emeras2015evalix,liu2020characterization,bang2020hpc}. | ||||
| Many approaches for clustering applications operate on profiles for compute, network, and IO \cite{emeras2015evalix,liu2020characterization,bang2020hpc}. | ||||
| For example, Evalix \cite{emeras2015evalix} monitors system statistics (from proc) in 1-minute intervals, but for the analysis they are converted to a profile that removes the time dimension, i.e., the averages of CPU, memory, and IO over the job runtime are computed. | ||||
| 
 | ||||
| % IO-specific tools | ||||
| PAS2P \cite{mendez2012new} extracts the IO patterns from application traces and then allows users to manually compare them. | ||||
| In \cite{white2018automatic}, a heuristic classifier is developed that analyzes the I/O read/write throughput time series to extract the periodicity of the jobs -- similar to Fourier analysis. | ||||
| The LASSi tool \cite{AOPIUOTUNS19} periodically monitors Lustre I/O statistics and computes a ``risk'' factor to identify IO patterns that stress the file system. | ||||
| 
 | ||||
| In \cite{white2018automatic}, a heuristic classifier is developed that analyzes the I/O read/write throughput time series to extract the periodicity of the jobs -- there is considerable similarity to Fourier analysis. | ||||
| 
 | ||||
| In contrast to existing work, our approach allows a user to identify similar activities based on the temporal I/O behavior recorded by a monitoring system deployed across the data center. | ||||
| 
 | ||||
| \section{Methodology} | ||||
| \label{sec:methodology} | ||||
|  | ||||