\documentclass{ResearchProposal} % The class file specifying the document structure

\thesistitle[Optional Short Title]{PhD Thesis}
\author{Kyle \textsc{Spindler}}
\supervisor{Dr. Julian \textsc{Kunkel}} % Your prospective supervisor's name (if already known); leave empty if you are looking for one
\university{University of Reading} % The university you are applying to
\department{Department of Computer Science} % The department's name
\group{\href{http://hps.vi4io.org}{High-Performance, Storage Software Architectures}} % The research area/group
\keywords{Microservice, Serverless, HPC, Storage} % A few keywords that describe the thesis

\addbibresource{example.bib}

\startMain

\section{Motivation}
Software architecture involves weighing multiple characteristics such as separation of concerns, quality attributes (maintainability, scalability, loose coupling, high cohesion, etc.), and architectural styles. Some architectural styles are better suited to performance, while others, such as microservices, favour maintainability and loose coupling. Microservices are a popular architecture used in many domains because of these benefits.

\medskip

Scientific codes often suffer from a lack of good software engineering practices. HPC and storage applications are typically tightly coupled in order to utilise the available resources efficiently. While it is claimed that this provides the best performance, the benefits and drawbacks of alternative software architectures for HPC software have not been thoroughly investigated. Microservices, for example, provide a scalable architecture and ease the software development process by enforcing separation of concerns through techniques from Domain-Driven Design. When deciding on a software architecture, not only performance and scalability matter but also the flexibility and maintainability of the software.

\medskip

In this regard, the HPC community struggles to recruit sufficient developers to keep up with software development, which is often visible in important utility tools. For example, existing tools for pre/post-processing of HPC workflows and for the analysis of HPC data are typically not the main focus of scientists and developers; hence, they are implemented in a way that shows limited scalability, e.g. they are executed sequentially in bash scripts.

\section{Research question}
The overarching question is what impact modern software architectures (microservices, event-driven) have on HPC, particularly in the climate/weather domain.

The goal of this thesis is to determine whether HPC applications and storage systems can be redeveloped using a modern software architecture such as microservices with minimal or no overhead while gaining the benefits of a loosely coupled architecture.

\begin{enumerate}
  \item Which parts of HPC and storage solutions could benefit from microservices or other software architectures?
  \item How can HPC and storage solutions be made more maintainable, scalable, loosely coupled, cohesive, and independent?
  \item In which areas of HPC and storage solutions could efficiency be improved through the choice of software architecture?
\end{enumerate}

\section{Related work}
Microservices have become popular because of the benefits they offer and the problems they solve. They are used in many domains where maintainability, scalability, and resilience are very important, as opposed to scientific applications, for which performance is the primary attribute. Although microservices are found in HPC applications, most of the research found applies them to storage, pre/post-processing, middleware, scheduling, workflow, and caching services that wrap around the main HPC processing. Storage services such as iRODS also harness microservices; however, these systems are not as performance-critical as a typical HPC application, which might be one of the reasons why iRODS is not used in HPC environments.

\medskip

Relevant work can be classified into: a) LaTeX studies, b) performance analysis in HPC, ....

\paragraph{Caching Microservices.} Microservices were used as a distributed, interpolation-based memoization cache \citep{Jenkins2017}.

\paragraph{Containers for High Performance Computing.} Discusses how containers, which are commonly used in microservice architectures, may or may not be used in HPC \citep{Joab2018}.

\paragraph{Distributed Virtual Machine Cloud Microservice for HPC: SPMD Applications.} Describes how virtual machines were used in a cloud-based microservice architecture for SPMD applications in an HPC environment \citep{Fatema2017}.

\paragraph{Scheduling Scientific Workflows in HPC.} Presents a dynamic approach to scheduling reconfigurable scientific workflows in heterogeneous HPC environments \citep{Cheptsov2016}.

\paragraph{Middleware Cloud-Based Microservice in HPC.} Shows how middleware has made use of microservices within an HPC environment \citep{Benchara2016}.

\paragraph{iRODS Integrated Microservice Rulebook.} Shows how iRODS uses microservices for storage \citep{Rajasekar2015}.

\paragraph{Microservices in Ocean Climate Data.} Shows how microservices are used for ocean-derived climate data \citep{Arne2016}.

\paragraph{Software Engineering in Computational Science.} Discusses how software engineering practices can be used in computational science environments \citep{Johanson2018}.

\paragraph{Workflow-Oriented Cyberinfrastructure for Sensor Data Analytics.}

\paragraph{http://eprints.uni-kiel.de/42726/1/2018-04-19GeomarDataScience.pdf}

\section{Research methodology}
First, I must understand the limitations and performance characteristics of alternative software architectures. This is done in two steps:

\begin{enumerate}
  \item Benchmarking of microservice communication protocols in contrast to HPC communication paths.
  \item Modelling of systems with different hardware/software architectures.
\end{enumerate}
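As a hedged illustration of how these two steps could be connected (a first-order sketch under stated assumptions, not a result of this proposal), the time to transfer a message of $n$ bytes over a communication path $p$ can be approximated by the standard latency-bandwidth model
\[
  T_p(n) = \alpha_p + \frac{n}{\beta_p},
\]
where $\alpha_p$ is the per-message latency and $\beta_p$ the sustained bandwidth of path $p$; both would have to be fitted from the benchmark measurements. The symbols $\alpha_p$ and $\beta_p$ and the path labels $\mathrm{ms}$ (a microservice protocol) and $\mathrm{hpc}$ (a native HPC communication path) are introduced here for illustration only. Under this assumption, the relative overhead of the microservice path for a given message size is
\[
  O(n) = \frac{T_{\mathrm{ms}}(n) - T_{\mathrm{hpc}}(n)}{T_{\mathrm{hpc}}(n)} .
\]
For large messages $O(n)$ tends to $\beta_{\mathrm{hpc}}/\beta_{\mathrm{ms}} - 1$, whereas for small messages it approaches $\alpha_{\mathrm{ms}}/\alpha_{\mathrm{hpc}} - 1$, so the two regimes would need to be benchmarked separately.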
\medskip

Next, HPC use cases and scenarios must be derived and characterised with their pros and cons and their performance characteristics.

To test these expectations, prototypes of selected applications must be built and compared to the native applications. This may involve adapting an existing package or creating a new application that behaves similarly but has a limited feature set (a so-called mini-app).

\medskip

An orthogonal aspect is to conduct surveys with scientists and scientific developers to understand the reasons for the architectural choices made and to identify strategies for adjusting existing practice.

\section{Required infrastructure}
This work requires access to HPC systems to benchmark existing software and the developed prototypes, and access to the scientific network to foster discussion and to explore the benefits of the prototypes with scientists in co-development.

\section{Workplan}
The following sketches the workplan for the individual years of the PhD.

\paragraph{First year:} Set up the work environment, research related work, start investigating the system under study, send out surveys, analyse the system under study, and write the introduction and related-work chapters of the thesis.

\paragraph{Second year:} Continue researching related work, add detail to the thesis based on previous findings, and start implementing the prototype.

\paragraph{Third year:} Continue implementing the prototype, add quality assurance and testing to the prototype, continue researching, and add implementation details to the thesis.

\paragraph{Fourth year:} Run benchmark tests, adjust the prototype, add the benchmark findings to the thesis, and critically evaluate the results.

\paragraph{Fifth year:} Finalise the thesis and prepare it for the final stages and delivery.

\proposalAppendix

Add here any appendix, if needed

\printbibliography[heading=bibintoc]

\label{LastPage}

\end{document}