diff --git a/.DS_Store b/.DS_Store
new file mode 100644
index 0000000..ca7165e
Binary files /dev/null and b/.DS_Store differ
diff --git a/research-proposal/example.bib b/research-proposal/example.bib
index 629e1a1..047c261 100644
--- a/research-proposal/example.bib
+++ b/research-proposal/example.bib
@@ -91,4 +91,17 @@ amount of climate-relevant time series data. To interactively explore and analyz
   keywords = {Software;Scientific computing;Software engineering;Computational modeling;Computers;Productivity;Object recognition;scientific software development;domain-specific languages;software performance engineering;software testing;requirements engineering},
 }

+@Article{Orcutt2015,
+  author = {{Orcutt}, J.~A. and {Rajasekar}, A. and {Moore}, R.~W. and {Vernon}, F.},
+  title = {{Workflow-Oriented Cyberinfrastructure for Sensor Data Analytics}},
+  journal = {AGU Fall Meeting Abstracts},
+  year = {2015},
+  pages = {IN31C-1778},
+  month = dec,
+  adsnote = {Provided by the SAO/NASA Astrophysics Data System},
+  adsurl = {http://adsabs.harvard.edu/abs/2015AGUFMIN31C1778O},
+  eid = {IN31C-1778},
+  keywords = {1908 Cyberinfrastructure, INFORMATICS, 1910 Data assimilation, integration and fusion, INFORMATICS, 1920 Emerging informatics technologies, INFORMATICS, 1998 Workflow, INFORMATICS},
+}
+
 @Comment{jabref-meta: databaseType:bibtex;}
diff --git a/research-proposal/main.tex b/research-proposal/main.tex
index 514734a..e2bbdc1 100644
--- a/research-proposal/main.tex
+++ b/research-proposal/main.tex
@@ -25,7 +25,7 @@ Microservices is a very popular architecture that is used in many domains becaus

 \medskip

-Scientific codes suffer from good software engineering practices. HPC and store applications are typically tightly coupled to utilize the available resources efficiently. While it is claimed that this provides the best performance, the benefit and drawbacks of alternative software architectures for HPC software is not thoroughly investigated.
Microservices, for example, provide a scalable architecture and ease the software development process by providing separation of concerns by applying techniques from Domain-Driven Design. When deciding a software architecture not only performance and scalability matters, but also flexibility and maintainability of the software.
+Scientific codes suffer from a lack of good software engineering practices. HPC and storage applications are typically tightly coupled to utilise the available resources efficiently. While it is claimed that this provides the best performance, the benefits and drawbacks of alternative software architectures for HPC software have not been thoroughly investigated. Microservices, for example, provide a scalable architecture and ease the software development process by separating concerns through techniques from Domain-Driven Design. When deciding on a software architecture, not only performance and scalability matter, but also the flexibility and maintainability of the software.

 \medskip

@@ -47,21 +47,26 @@ The goal of this thesis is to see if HPC applications and storage systems can be

 \section{Related work}

-Relevant related work can be classified into: a) the usage of Microservices in different disciplines and particularly HPC; b) software engineering and architectures in HPC; and c) performance analysis of Microservices.
-Given the scope of this document, an excerpt of related work is shown, focusing on b).
+Relevant related work can be classified into:
+\begin{enumerate}
+  \item a) The usage of microservices in different disciplines and particularly HPC
+  \item b) Software engineering and software architectures in HPC
+  \item c) Performance analysis of microservices
+\end{enumerate}
+
+
+\paragraph{a) The usage of microservices in different disciplines and particularly HPC.}
+Although microservices are found in HPC applications, the majority of the research found has applied microservices in storage \citep{Orcutt2015}, pre/post processing, middleware, scheduling, workflow and caching services that wrap around the main HPC processing. Storage services like iRODS also harness microservices; however, these systems are not as performance-critical as a typical HPC application, which might be one of the reasons why iRODS is not used in HPC environments.

-\jk{This section deserves a bit polishment, a structure like I suggested above. Not necessarily all points must be answered. I would add one paper for each topic at least. Where possible and that is what you have a focus on b)}
+\paragraph{b) Software engineering and software architectures in HPC.} Software engineering focuses on maintainable code through various designs, patterns and principles, which in turn influences the software architecture \citep{Johanson2018}. Trade-offs are an important part of the decision-making process when selecting an architecture: an architecture that is close in performance to a traditional HPC application but more maintainable, as opposed to an application with high performance and no maintainability, might be worth that particular trade-off \citep{Jenkins2017}. Other attributes besides performance, such as security, may also be included in the trade-off \citep{Joab2018}.
-\jk{I suggest to write one section like the following for each of a,b,c and add the references inline.}
+\paragraph{c) Performance analysis of microservices.} Here we will see how the loosely coupled architecture influences the performance aspect of this analysis \citep{Fatema2017}.

-Microservices are becoming very popular in today's world due to the benefits and problems it solves.
-The application of microservices are used in many domains where maintainability, scalability and resilience is very important as oppose to scientific applications that requires performance is their primary attribute.

 \medskip

-\paragraph{The usage of microservices in HPC.}
-Although microservices are found in HPC applications the majority of research found has applied microservices in storage \jk{cite iRODS}, pre/post processing, middleware, scheduling, workflow and caching services that wrap around the main HPC processing. Also storage services like iRODS harness microservices, however, these systems are not as performance-critical as a typical HPC application which might be one of the reasons why iRODS is not used in HPC environments.
+\paragraph{Other areas within an HPC and storage environment where microservices have been used:}

-Microservices were used as a distributed interpolation-based memoization cache \citep{Jenkins2017}.
+\paragraph{Caching Service.} Microservices were used as a distributed interpolation-based memoization cache \citep{Jenkins2017}.

 \paragraph{Containers for High Performance Computing.} Thoughts on how containers may or may not be used in HPC. Containers are commonly used in microservice architectures. \citep{Joab2018}.

@@ -77,7 +82,7 @@ Microservices were used as a distributed interpolation-based memoization cache \

 \paragraph{Software Engineering in Computational Science} How Software Engineering practices can be used in Computational Science environments.
\citep{Johanson2018}

-\paragraph{Workflow-Oriented Cyberinfrastructure for Sensor Data Analytics}
+\paragraph{Workflow-Oriented Cyberinfrastructure for Sensor Data Analytics} How iRODS was used with streaming sensors. \citep{Orcutt2015}

 \paragraph {http://eprints.uni-kiel.de/42726/1/2018-04-19GeomarDataScience.pdf}