Typos fixed

Julian M. Kunkel 2020-10-23 18:18:37 +01:00
parent 6d508d53fc
commit cceff91731
1 changed files with 10 additions and 9 deletions


@@ -148,14 +148,14 @@ Finally, we conclude our paper in \Cref{sec:summary}.
\section{Related Work} \section{Related Work}
\label{sec:relwork} \label{sec:relwork}
Related work can be classified into: distance measures, analysis of HPC application performance, inter-comparison of jobs in HPC, and IO-specific tools. Related work can be classified into distance measures, analysis of HPC application performance, inter-comparison of jobs in HPC, and IO-specific tools.
%% DISTANCE MEASURES %% DISTANCE MEASURES
The ranking of similar jobs performed in this article is related to clustering strategies. The ranking of similar jobs performed in this article is related to clustering strategies.
The comparison of the time series using various metrics has been extensively investigated. The comparison of the time series using various metrics has been extensively investigated.
In \cite{khotanlou2018empirical}, an empirical comparison of distance measures for clustering of multivariate time series is performed. In \cite{khotanlou2018empirical}, an empirical comparison of distance measures for the clustering of multivariate time series is performed.
14 similarity measures are applied to 23 data sets. 14 similarity measures are applied to 23 data sets.
It shows that no similarity measure produces statistical significant better results than another. It shows that no similarity measure produces statistically significant better results than another.
However, the Swale scoring model \cite{morse2007efficient} produced the most disjoint clusters. However, the Swale scoring model \cite{morse2007efficient} produced the most disjoint clusters.
In this model, gaps imply a cost. In this model, gaps imply a cost.
Levenshtein distance is often referred to as Edit Distance (ED) \cite{navarro2001guided}. Levenshtein distance is often referred to as Edit Distance (ED) \cite{navarro2001guided}.
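The last context line of this hunk mentions the Levenshtein (edit) distance. As a rough illustration only, the following sketch shows the textbook dynamic-programming formulation with unit costs for insertion, deletion, and substitution. It is not taken from the paper or from the cited works; the function name `levenshtein` and the unit-cost choice are assumptions made for this example.

```python
def levenshtein(a, b):
    """Minimum number of single-element insertions, deletions, and
    substitutions needed to turn sequence a into sequence b.

    Illustrative sketch with unit costs (an assumption); works on
    strings as well as on lists of comparable items.
    """
    # Keep the shorter sequence in the inner loop so that only
    # O(min(len(a), len(b))) memory is needed.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] = distance between the empty prefix of a and b[:j]
    previous = list(range(len(b) + 1))

    for i, x in enumerate(a, start=1):
        # Distance between a[:i] and the empty prefix of b is i deletions.
        current = [i]
        for j, y in enumerate(b, start=1):
            substitution_cost = 0 if x == y else 1
            current.append(min(
                previous[j] + 1,                      # delete x from a
                current[j - 1] + 1,                   # insert y into a
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current

    return previous[-1]
```

Because the function only iterates over its arguments and compares elements, it could in principle be applied to sequences of quantized monitoring values as well as to strings, for example `levenshtein([0, 1, 1, 2], [0, 1, 2])`. Whether that matches the coding used in the paper is not stated in this diff.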
@@ -167,24 +167,25 @@ Monitoring systems that record statistics about hardware usage are widely deploy
There are various tools for analyzing the IO behavior of an application \cite{TFAPIKBBCF19}. There are various tools for analyzing the IO behavior of an application \cite{TFAPIKBBCF19}.
% time series analysis for inter-comparison of processes or jobs in HPC % time series analysis for inter-comparison of processes or jobs in HPC
For Vampir, a popular tool for trace file analysis, in \cite{weber2017visual} the Comparison View is introduced that allows to manually compare traces of application runs, e.g., to compare optimized with original code. For Vampir, a popular tool for trace file analysis, in \cite{weber2017visual} the Comparison View is introduced that allows them to manually compare traces of application runs, e.g., to compare optimized with original code.
Vampir generally supports the clustering of process timelines of a single job allowing to focus on relevant code sections and processes when investigating large number of processes. Vampir generally supports the clustering of process timelines of a single job allowing to focus on relevant code sections and processes when investigating a large number of processes.
Chameleon \cite{bahmani2018chameleon} extends ScalaTrace for recording MPI traces but reduces the overhead by clustering processes and collecting information from one representative of each cluster. Chameleon \cite{bahmani2018chameleon} extends ScalaTrace for recording MPI traces but reduces the overhead by clustering processes and collecting information from one representative of each cluster.
For the clustering, a signature is created for each process that includes the call-graph. For the clustering, a signature is created for each process that includes the call-graph.
In \cite{halawa2020unsupervised}, 11 performance metrics including CPU and network are utilized for agglomerative clustering of jobs showing the general effectivity of the approach. In \cite{halawa2020unsupervised}, 11 performance metrics including CPU and network are utilized for agglomerative clustering of jobs showing the general effectivity of the approach.
In \cite{rodrigo2018towards}, a characterization of the NERSC workload is performed based on job scheduler information (profiles). In \cite{rodrigo2018towards}, a characterization of the NERSC workload is performed based on job scheduler information (profiles).
Profiles that include the MPI activities have shown effective to identify the code that is executed \cite{demasi2013identifying}. Profiles that include the MPI activities have shown effective to identify the code that is executed \cite{demasi2013identifying}.
Many approaches for clustering applications operate on profiles for compute, network, and IO \cite{emeras2015evalix,liu2020characterization,bang2020hpc}. Many approaches for clustering applications operate on profiles for compute, network, and IO \cite{emeras2015evalix,liu2020characterization,bang2020hpc}.
For example, Evalix \cite{emeras2015evalix} monitors system statistics (from proc) in 1 minute intervals but for the analysis they are converted to a profile removing the time dimension, i.e., compute the average CPU, memory, and IO over the job runtime. For example, Evalix \cite{emeras2015evalix} monitors system statistics (from proc) in 1-minute intervals but for the analysis, they are converted to a profile removing the time dimension, i.e., compute the average CPU, memory, and IO over the job runtime.
% IO-specific tools % IO-specific tools
PAS2P \cite{mendez2012new} extracts the IO patterns from application traces and then allows users to manually compare them. PAS2P \cite{mendez2012new} extracts the IO patterns from application traces and then allows users to manually compare them.
In \cite{white2018automatic}, a heuristic classifier is developed that analyzes the I/O read/write throughput time series to extract the periodicity of the jobs -- similar to Fourier analysis. In \cite{white2018automatic}, a heuristic classifier is developed that analyzes the I/O read/write throughput time series to extract the periodicity of the jobs -- similar to Fourier analysis.
The LASSi tool \cite{AOPIUOTUNS19} periodically monitors Lustre I/O statistics and computes a "risk" factor to identify IO patterns which stress the file system. The LASSi tool \cite{AOPIUOTUNS19} periodically monitors Lustre I/O statistics and computes a "risk" factor to identify IO patterns that stress the file system.
In contrast to existing work, our approach allows a user to identify similar activities based on the temporal I/O behavior recorded with a data center-wide deployed monitoring system.
In contrast to existing work, our approach allows a user to identify similar activities based on the temporal I/O behavior recorded with a data center wide deployed monitoring system.
\section{Methodology} \section{Methodology}
\label{sec:methodology} \label{sec:methodology}