- 31st March 2016, Due date of AWS research grant, See other related issue A)
- 15th April 2016, Due date of Informatics Research Proposal
- 2nd June 2016, Official start of work on the projects
- Early July, Presentation of project progress to peers
- 19th August 2016, Noon, deadline for dissertation submission
HemeLB currently runs on the ARCHER supercomputer. The exclusivity of this resource, coupled with the long list of dependencies and the technical know-how required, limits access to the simulation for domain experts such as scientists and doctors.
This project is an attempt to solve this problem. We decouple the components of the simulation and provide a web interface so that people can upload the necessary files easily and run the simulation without the technical know-how. Running it on the cloud also means that the project can run on commodity servers, so people can replicate the simulation or workflow on any other capable fleet of servers without resorting to supercomputers.
To understand how we can improve the workflow, we first need to understand the current workflow.
These are the steps we are going to take for this project:
This is a straightforward step. Whatever the final implementation, it is better to have HemeLB in its own container, separate from the container that holds the whole workflow from the setup step onward. We want HemeLB-core-only clusters that do not need the other parts.
Decision Point! a) (Adam asked) Do we even need a Docker container? Yes. The purpose of packaging HemeLB core into its own container on Docker Hub is that people can easily install HemeLB core on their own. It is not required for this project to succeed, but it will benefit the community by making HemeLB core easy to distribute.
We need tools to deploy HemeLB core as a compute server on AWS easily. There are a few possible tools:
- CfnCluster, basically a Python CLI built on the boto library, optimized for HPC, doc
- Ansible, more general-purpose system automation, Ansible-Docker
- Chef doc
- Other deployment tools
Decision Point! Decide which tool is most appropriate for deploying the cluster.
This web interface should receive two files (.xml and .gmy) that will be fed to the HemeLB core cluster. After the simulation is done, an .xpf file will be available for the user to download.
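As a minimal sketch of the upload check described above, the web interface could verify that exactly one .xml and one .gmy file were received before handing them to the cluster. The function name `pair_simulation_inputs` and the error messages are hypothetical and are not part of HemeLB; only the two extensions come from the plan.

```python
from pathlib import PurePath

# Extensions named in the plan; everything else here is an assumption.
REQUIRED_EXTENSIONS = (".xml", ".gmy")


def pair_simulation_inputs(filenames):
    """Map each required extension to the uploaded file that carries it.

    Hypothetical helper: raises ValueError if a file has an unexpected
    extension, if an extension appears twice, or if one is missing.
    """
    found = {}
    for name in filenames:
        ext = PurePath(name).suffix.lower()
        if ext not in REQUIRED_EXTENSIONS:
            raise ValueError(f"unexpected file type: {name}")
        if ext in found:
            raise ValueError(f"more than one {ext} file uploaded")
        found[ext] = name
    missing = [ext for ext in REQUIRED_EXTENSIONS if ext not in found]
    if missing:
        raise ValueError(f"missing required file(s): {', '.join(missing)}")
    return found
```

Rejecting incomplete uploads at this point would avoid starting a cluster job that cannot run.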
The system should look like this:
The planned approach is for the web interface to receive an upload from the user in the form of .stl and .pr2 files. The web server will then spin up a Docker container in a new EC2 instance, or in some kind of reserved instance used specifically for Docker containers. The user will then get access to the Docker container via noVNC and proceed to do the geometry generation inside the container.
Once the geometry generation is done, the resulting .xml and .gmy files will be fed into the HemeLB cluster to run the simulation.
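With both phases in place, a job can enter the workflow at two points: with .stl and .pr2 files (geometry generation first) or with .xml and .gmy files (straight to simulation). The sketch below illustrates one way the web server might pick the entry stage from the uploaded file names. The `Stage` enum and `entry_stage` function are hypothetical names introduced for illustration.

```python
from enum import Enum
from pathlib import PurePath


class Stage(Enum):
    """Workflow stages a job can start from (illustrative names)."""
    GEOMETRY_GENERATION = "geometry_generation"
    SIMULATION = "simulation"


# File extensions each stage expects, as listed in the plan.
STAGE_INPUTS = {
    Stage.GEOMETRY_GENERATION: frozenset({".stl", ".pr2"}),
    Stage.SIMULATION: frozenset({".xml", ".gmy"}),
}


def entry_stage(filenames):
    """Return the stage whose expected extensions match the upload exactly."""
    extensions = frozenset(PurePath(n).suffix.lower() for n in filenames)
    for stage, required in STAGE_INPUTS.items():
        if extensions == required:
            return stage
    raise ValueError(f"upload does not match any stage: {sorted(extensions)}")
```

Keeping the mapping in one table means the simulation-only path still works if the geometry generation phase is not finished.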
The system should look like this at this point:
There are two possible extensions at this point. First, we could accommodate even more of the pre-processing step, allowing users to supply only the STL file and do the domain definition step (GPU intensive) to generate the .stl + .pr2 files that are used in step 4). Alternatively, we could do the post-processing step, allowing the user to view the result of the simulation directly in the browser.
As with all projects with limited time and budget, there are risks involved in planning the project. The result of the project does not always match what we have planned.
This project is structured in a way that minimizes the risk of having nothing at the end of the dissertation period. We decoupled the components and plan to work on the most important ones first: the HemeLB container cluster and the web interface.
After we have finished the first phase and have everything working, we can take a crack at the other component of the workflow, which is the geometry generation. This way, the project can "degrade gracefully" if the plan does not work perfectly.
I have to make sure that we are in the clear on the IP issues. HemeLB is under LGPL3, so HemeWeb should also be LGPL3. Will the Indonesia Endowment Fund for Education be okay with having their name on the web interface and in my dissertation report (some kind of "This project is supported/funded by the Indonesia Endowment Fund for Education (LPDP)")? Technically, under the license, they cannot prevent somebody from working on it in the future.
Also, regarding the source of funding: will a research grant from AWS be a problem? If it is, I should use the budget allocated for my dissertation, but will that be okay?
Status: Got through to the Customer Service Officer. She explained that they would prefer me to use my allocated dissertation budget. When asked about the IP-related questions, she referred me to the higher-ups.
The higher-up is "not sure", since she does not fully understand the legal repercussions of open source software, but in her opinion it should not be a problem, and she would prefer me to use my allocated dissertation budget. She will check with the legal team before writing a formal response to my email.
- ✓ Computational Fluid Dynamics [CFD] Direct, link, this simulation is done on AWS
- ✓ Massively parallel fluid simulations on Amazon’s HPC cloud
- ✓ Performance Evaluation of Amazon EC2 for NASA HPC Applications
- ✓ Shifter Paper
- ✓ Shifter Press release
- ✓ Nekkloud: A Software Environment for High-order Finite Element Analysis on Clusters and Clouds
- ✓ An Introduction to High Performance Computing on AWS
- ✓ Models and Simulations as a Service: Exploring the Use of Galaxy for Delivering Computational Models
- Cloud Computing and Grid Computing 360-Degree Compared -- Definition of cloud and grid computing
- Scientific Cloud Computing: Early Definition and Experience
- Cost-benefit analysis of Cloud Computing versus desktop grids
- sshfs
- AWS Elastic File System (EFS), preview state
- ✓ NFS -- This works, but might not be the "best" solution


