\documentclass{ResearchProposal} % The class file specifying the document structure

\thesistitle{Potential of Microservices in HPC}
\author{Kyle \textsc{Spindler}}
\supervisor{Dr. Julian \textsc{Kunkel}} % Your prospect supervisor's name (if known already), leave empty if you are looking for one
\university{University of Reading} % The university you apply for
\department{Department of Computer Science} % The department's name
\degree{for a PhD at a distance in part-time}
\group{\href{http://hps.vi4io.org}{High-Performance, Storage Software Architectures}} % The research area/group
\keywords{Microservice, Serverless, HPC, Storage} % Use a few describing the thesis better

\RequirePackage{todonotes}
\newcommand{\jk}[1]{\todo[inline]{JK: #1}}

\addbibresource{example.bib}

\startMain
\section{Motivation}

Software architecture involves weighing multiple characteristics, such as separation of concerns, quality attributes (maintainability, scalability, loose coupling, high cohesion, etc.), and architectural styles. Some architectural styles are better suited to performance, while others, such as microservices, favour maintainability and loose coupling.
Microservices are a popular architectural style that is used in many domains because of these benefits.

\medskip

Scientific codes suffer from a lack of good software engineering practices. HPC and storage applications are typically tightly coupled in order to utilise the available resources efficiently. While it is claimed that this provides the best performance, the benefits and drawbacks of alternative software architectures for HPC software have not been thoroughly investigated. Microservices, for example, provide a scalable architecture and ease the software development process by enforcing separation of concerns through techniques from Domain-Driven Design. When deciding on a software architecture, not only performance and scalability matter, but also the flexibility and maintainability of the software.

\medskip

In this regard, the HPC community struggles to recruit sufficient developers to keep up with the development of software, which can often be seen in important utility tools. For example, existing tools for the pre/post-processing of HPC workflows and the analysis of HPC data are typically not the main focus of scientists and developers; hence, they are implemented in a way that shows limited scalability, i.e., they are executed sequentially in bash scripts.
\section{Research question}

Understand the impact that modern software architectures (microservices, event-driven) have on HPC, particularly in the climate/weather domain.

The goal of this thesis is to determine whether HPC applications and storage systems can be redeveloped using a modern software architecture such as microservices with minimal or no overhead, while gaining the benefits of a loosely coupled architecture.

\begin{enumerate}
\item Which parts of HPC and storage solutions could benefit from microservices or other software architectures?
\item How can HPC and storage solutions be made more maintainable, scalable, loosely coupled, cohesive, and independent?
\item In which areas of HPC and storage solutions could efficiency be improved through the use of software architecture?
\end{enumerate}
\section{Related work}

Relevant related work can be classified into:
\begin{enumerate}
\item The usage of microservices in different disciplines and particularly HPC
\item Software engineering and software architectures in HPC
\item Performance analysis of microservices
\end{enumerate}
\paragraph{1. The usage of microservices in different disciplines and particularly HPC.}
Although microservices are found in HPC applications, the majority of the research found applies microservices to storage \citep{Orcutt2015}, pre/post-processing, middleware, scheduling, workflow, and caching services that wrap around the main HPC processing. Storage services such as iRODS also harness microservices; however, these systems are not as performance-critical as a typical HPC application, which might be one of the reasons why iRODS is not used in HPC environments.
\paragraph{2. Software engineering and software architectures in HPC.} Software engineering focuses on maintainable code, using various designs, patterns, and principles that influence the software architecture \citep{Johanson2018}. Weighing trade-offs is an important part of selecting an architecture: an architecture that comes close to the performance of a traditional HPC application while increasing maintainability might be preferable to an application with high performance and no maintainability \citep{Jenkins2017}. Besides performance, other attributes such as security may be included in the trade-off \citep{Joab2018}.
\paragraph{3. Performance analysis of microservices.}
The loosely coupled architecture influences performance, as shown, for example, in \citep{Fatema2017}.
Some preliminary analyses of RESTful services exist, but REST is only one potential framework for microservices.
% \medskip
%
% \paragraph{Other areas within an HPC and storage environment where microservices has been used:}
%
% \paragraph{Caching Service.} Microservices were used as a distributed interpolation-based memoization cache \citep{Jenkins2017}.
%
% \paragraph{Containers for High Performance Computing.} Thoughts on how containers may or may not be used in HPC. Containers are commonly used in microservice architectures. \citep{Joab2018}.
%
% \paragraph{Distributed Virtual Machine Cloud Microservice for HPC:SPMD Applications} How Virtual Machines were used in a cloud based microservice architecture in an HPC environment:SPMD. \citep{Fatema2017}.
%
% \paragraph{Scheduling Scientific Workflows in HPC} How to dynamically approach to scheduling reconfigurable scientific workflows in heterogeneous HPC environments. \citep{Cheptsov2016}.
%
% \paragraph{Middleware cloud based microservice in HPC} Shows how middleware has made use of microservices within a HPC environment. \citep{Benchara2016}.
%
% \paragraph{iRODS integrated Microservice Rulebook} Shows how iRODS uses Microservices for storage. \citep{Rajasekar2015}.
%
% \paragraph{Microservices in Ocean Climate Data } Shows how Ocean-Derived Climate Data uses Microservices. \citep{Arne2016}
%
% \paragraph{Software Engineering in Computational Science} How Software Engineering practices can be used in Computational Science environments. \citep{Johanson2018}
%
% \paragraph{Workflow-Oriented Cyberinfrastructure for Sensor Data Analytics} How the use of iRODS was used with streaming sensors. \citep{Orcutt2015}
%
% \paragraph {http://eprints.uni-kiel.de/42726/1/2018-04-19GeomarDataScience.pdf}
\section{Research methodology}
Firstly, I must understand the limitations and performance characteristics of alternative software architectures. This is performed in two steps:
\begin{enumerate}
\item Benchmarking of microservice communication protocols in contrast to the HPC communication path.
\item Modelling of systems with different hardware/software architectures.
\end{enumerate}

Next, HPC use cases and scenarios must be derived and characterised with their pros and cons and their performance characteristics.

Finally, to validate the expectations, prototypes of selected applications must be built and compared to the native applications. This may involve adjusting an existing package or creating a new application that behaves similarly but has a limited feature set (a so-called mini-app).

\medskip

An orthogonal aspect is to conduct surveys with scientists and scientific developers to understand the reasons for the architectural choices made and to identify strategies to adjust existing practice.
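\paragraph{Illustrative cost model.} As a minimal sketch of how the benchmarking and modelling steps above could be connected, the time to transfer a message of $n$ bytes over a communication path $p$ could be approximated with a simple latency-bandwidth model. The model and the symbols $\alpha_p$ (start-up latency) and $\beta_p$ (sustained bandwidth) are assumptions introduced here for illustration only; they are not results, and the actual models are to be derived during the thesis:
\[
  T_p(n) = \alpha_p + \frac{n}{\beta_p}, \qquad p \in \{\mathrm{ms}, \mathrm{hpc}\}.
\]
Under this assumption, the relative overhead of a microservice path ($\mathrm{ms}$) compared to a native HPC path ($\mathrm{hpc}$) would be
\[
  R(n) = \frac{T_{\mathrm{ms}}(n)}{T_{\mathrm{hpc}}(n)}
       = \frac{\alpha_{\mathrm{ms}} + n/\beta_{\mathrm{ms}}}{\alpha_{\mathrm{hpc}} + n/\beta_{\mathrm{hpc}}},
\]
which approaches $\alpha_{\mathrm{ms}}/\alpha_{\mathrm{hpc}}$ for small messages and $\beta_{\mathrm{hpc}}/\beta_{\mathrm{ms}}$ for large messages. The benchmarks would supply measured values for these parameters, and a more detailed model may well be needed if this approximation does not fit the measurements.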
\section{Required infrastructure}

Access to HPC systems is required to benchmark existing software and the developed prototypes. Access to the scientific network is required to foster discussions and to explore the benefits of the prototypes with scientists through co-development.
These requirements will be addressed by my supervisor; there is no special requirement for the university to provide any of this infrastructure.
\section{Workplan}

The following sketches the tentative workplan for the different years of the PhD.
\paragraph{First year:} Set up the work environment, research related work, start investigating and analysing the system under study, send out surveys, perform the performance analysis of the microservice architecture and of the frameworks used to realise it, and write the introduction and related work chapters of the thesis.
\paragraph{Second year:} Continue researching related work, add more detail to the thesis based on previous findings, derive performance models for microservices, and start implementing a prototype to validate the model.
\paragraph{Third year:} Continue implementing the prototype, add quality assurance and testing to the prototype, continue researching, improve the model, and add design details to the thesis.
\paragraph{Fourth year:} Conclude the tests of benchmarks and mini-applications, adjust the prototype, add the benchmark findings to the thesis, and provide a critical discussion of the results.
\paragraph{Fifth year:} Finalise the thesis and prepare it for the final stages and delivery.
\proposalAppendix

%Add here any appendix, if needed

\printbibliography[heading=bibintoc]

\label{LastPage}

\end{document}