Commit 692f533

author
Marco Edoardo Santimaria
committed
Updated README.md and Doxygen
Updated README.md file with new information and added Doxygen automatic scripts.
1 parent fef5ebb commit 692f533

4 files changed

Lines changed: 3023 additions & 154 deletions

.gitignore

Lines changed: 9 additions & 1 deletion
@@ -44,7 +44,15 @@ cmake-build-*
4444
files_location*.txt
4545
capio_logs
4646

47+
#Doxygen generated documentation
48+
doxy/html
49+
doxy/latex
50+
doxy/doxygen-awesome-css-*
51+
doxy/theme
52+
4753
# Other
4854
debug
4955
build
50-
56+
.devcontainer
57+
.DS_Store
58+
*.alive_connection

README.md

Lines changed: 129 additions & 153 deletions
@@ -1,31 +1,52 @@
1-
# CAPIO
1+
# CAPIO: Cross Application Programmable IO
22

3-
CAPIO (Cross-Application Programmable I/O), is a middleware aimed at injecting streaming capabilities to workflow steps
4-
without changing the application codebase. It has been proven to work with C/C++ binaries, Fortran Binaries, JAVA,
5-
python and bash.
3+
CAPIO is a middleware aimed at injecting streaming capabilities into workflow steps
4+
without changing the application codebase. It has been proven to work with C/C++ binaries, Fortran, Java, Python, and
5+
Bash.
66

7-
[![codecov](https://codecov.io/gh/High-Performance-IO/capio/graph/badge.svg?token=6ATRB5VJO3)](https://codecov.io/gh/High-Performance-IO/capio)
8-
![CI-Tests](https://github.com/High-Performance-IO/capio/actions/workflows/ci-tests.yaml/badge.svg)
9-
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://raw.githubusercontent.com/High-Performance-IO/capio/master/LICENSE)
7+
[![codecov](https://codecov.io/gh/High-Performance-IO/capio/graph/badge.svg?token=6ATRB5VJO3)](https://codecov.io/gh/High-Performance-IO/capio) ![CI-Tests](https://github.com/High-Performance-IO/capio/actions/workflows/ci-tests.yaml/badge.svg) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://raw.githubusercontent.com/High-Performance-IO/capio/master/LICENSE)
108

11-
## Build and run tests
9+
> [!TIP]
10+
> CAPIO is now multibackend and dynamic by nature: you do not need MPI to benefit from the in-memory IO improvements!
11+
> Just use an MTCL-provided backend if you want in-memory IO, or fall back to the file system backend (default) if
12+
> you just want to coordinate IO operations between workflow steps!
13+
14+
Compatible with:
15+
- ![Architecture](https://img.shields.io/badge/Architecture-x86__64_/_amd64-50C878.svg)
16+
- ![Architecture](https://img.shields.io/badge/Architecture-RISC--V_(riscv64)-50C878.svg)
17+
- ![Architecture](https://img.shields.io/badge/Architecture-ARM64_coming_soon-red.svg)
18+
19+
---
20+
## Automatic install with SPACK
21+
22+
CAPIO is on Spack! To install it automatically, just add the High Performance IO
23+
repo to Spack and then install CAPIO:
24+
```bash
25+
spack repo add https://github.com/High-Performance-IO/hpio-spack.git
26+
spack install capio
27+
```
28+
29+
> [!WARNING]
30+
> To use this method, you need spack >= v1.0.0
31+
32+
## 🔧 Manual Build and Install
1233

1334
### Dependencies
1435

15-
CAPIO depends on the following software that needs to be manually installed:
36+
**Required manually:**
1637

17-
- `cmake >=3.15`
18-
- `c++20` or newer
19-
- `openmpi`
38+
- `cmake >= 3.15`
39+
- `C++20`
2040
- `pthreads`
2141

22-
The following dependencies are automatically fetched during cmake configuration phase, and compiled when required.
42+
**Fetched/compiled during configuration:**
2343

24-
- [syscall_intercept](https://github.com/pmem/syscall_intercept) to intercept syscalls
25-
- [Taywee/args](https://github.com/Taywee/args) to parse server command line inputs
26-
- [simdjson/simdjson](https://github.com/simdjson/simdjson) to parse json configuration files
44+
- [syscall_intercept](https://github.com/pmem/syscall_intercept) - Intercepts and handles Linux system calls
45+
- [Taywee/args](https://github.com/Taywee/args) - Parse user input arguments
46+
- [simdjson/simdjson](https://github.com/simdjson/simdjson) - Fast parsing of JSON files
47+
- [MTCL](https://github.com/ParaGroup/MTCL) - Provides abstractions over multiple communication backends
2748

28-
### Compile capio
49+
### Compile CAPIO
2950

3051
```bash
3152
git clone https://github.com/High-Performance-IO/capio.git capio && cd capio
@@ -35,169 +56,124 @@ cmake --build . -j$(nproc)
3556
sudo cmake --install .
3657
```
3758

38-
It is also possible to enable log in CAPIO, by defining `-DCAPIO_LOG=TRUE`.
59+
To enable logging support, pass `-DCAPIO_LOG=TRUE` during the CMake configuration phase.
3960

40-
## Use CAPIO in your code
61+
---
4162

42-
Good news! You don't need to modify your code to benefit from the features of CAPIO. You have only to do three steps (
43-
the first is optional).
63+
## 🧑‍💻 Using CAPIO in Your Code
4464

45-
1) Write a configuration file for injecting streaming capabilities to your workflow
65+
Good news! You **don’t need to modify your application code**. Just follow these steps:
4666

47-
2) Launch the CAPIO daemons with MPI passing the (eventual) configuration file as argument on the machines in which you
48-
want to execute your program (one daemon for each node). If you desire to specify a custom folder
49-
for capio, set `CAPIO_DIR` as a environment variable.
50-
```bash
51-
[CAPIO_DIR=your_capiodir] [mpiexec -N 1 --hostfile your_hostfile] capio_server -c conf.json
52-
```
67+
### 1. Create a Configuration File *(optional but recommended)*
5368

54-
> [!NOTE]
55-
> if `CAPIO_DIR` is not specified when launching capio_server, it will default to the current working directory of
56-
> capio_server.
69+
Write a CAPIO-CL configuration file to inject streaming into your workflow. Refer to
70+
the [CAPIO-CL Docs](https://capio.hpc4ai.it/docs/coord-language/) for details.
5771

58-
3) Launch your programs preloading the CAPIO shared library like this:
59-
```bash
60-
CAPIO_DIR=your_capiodir \
61-
CAPIO_WORKFLOW_NAME=wfname \
62-
CAPIO_APP_NAME=appname \
63-
LD_PRELOAD=libcapio_posix.so \
64-
./your_app <args>
65-
```
72+
### 2. Launch the workflow with CAPIO
6673

67-
> [!WARNING]
68-
> `CAPIO_DIR` must be specified when launching a program with the CAPIO library. if `CAPIO_DIR` is not specified, CAPIO
69-
> will not intercept syscalls.
74+
To launch your workflow with CAPIO, you can follow one of two routes:
7075

71-
### Available environment variables
76+
#### A) Use `capiorun` for simplified operations
7277

73-
CAPIO can be controlled through the usage of environment variables. The available variables are listed below:
78+
You can simplify the execution of workflow steps with CAPIO using the `capiorun` utility. See the
79+
[`capiorun` documentation](capio-run/readme.md) for usage and examples. `capiorun` provides an easier way to manage
80+
daemon startup and environment preparation, so that users do not need to prepare the environment manually.
7481

75-
#### Global environment variable
82+
#### B) Manually launch CAPIO
7683

77-
- `CAPIO_DIR` This environment variable tells to both server and application the mount point of capio;
78-
- `CAPIO_LOG_LEVEL` this environment tells both server and application the log level to use. This variable works only
79-
if `-DCAPIO_LOG=TRUE` was specified during cmake phase;
80-
- `CAPIO_LOG_PREFIX` This environment variable is defined only for capio_posix applications and specifies the prefix of
81-
the logfile name to which capio will log to. The default value is `posix_thread_`, which means that capio will log by
82-
default to a set of files called `posix_thread_*.log`. An equivalent behaviour can be set on the capio server using
83-
the `-l` option;
84-
- `CAPIO_LOG_DIR` This environment variable is defined only for capio_posix applications and specifies the directory
85-
name to which capio will be created. If this variable is not defined, capio will log by default to `capio_logs`. An
86-
equivalent behaviour can be set on the capio server using the `-d` option;
87-
- `CAPIO_CACHE_LINES`: This environment variable controls how many lines of cache are presents between posix and server
88-
applications. defaults to 10 lines;
89-
- `CAPIO_CACHE_LINE_SIZE`: This environment variable controls the size of a single cache line. defaults to 256KB;
84+
Launch the CAPIO Daemons: start one daemon per node. Optionally set `CAPIO_DIR` to define the CAPIO mount point:
9085

91-
#### Server only environment variable
86+
```bash
87+
[CAPIO_DIR=your_capiodir] capio_server -c conf.json
88+
```
9289

93-
- `CAPIO_FILE_INIT_SIZE`: This environment variable defines the default size of pre allocated memory for a new file
94-
handled by capio. Defaults to 4MB. Bigger sizes will reduce the overhead of malloc but will fill faster node memory.
95-
Value has to be expressed in bytes;
96-
- `CAPIO_PREFETCH_DATA_SIZE`: If this variable is set, then data transfers between nodes will be always, at least of the
97-
given value in bytes;
90+
> [!CAUTION]
91+
> If `CAPIO_DIR` is not set, it defaults to the current working directory.
9892
99-
#### Posix only environment variable
93+
You can now start your application. Just set the required environment variables and remember to set `LD_PRELOAD` to the
94+
`libcapio_posix.so` intercepting library:
10095

101-
> [!WARNING]
102-
> The following variables are mandatory. If not provided to a posix, application, CAPIO will not be able to correctly
103-
> handle the
104-
> application, according to the specifications given from the json configuration file!
105-
106-
- `CAPIO_WORKFLOW_NAME`: This environment variable is used to define the scope of a workflow for a given step. Needs to
107-
be the same one as the field `"name"` inside the json configuration file;
108-
- `CAPIO_APP_NAME`: This environment variable defines the app name within a workflow for a given step;
109-
110-
## How to inject streaming capabilities into your workflow
111-
112-
With CAPIO is possible to run the applications of your workflow that communicates through files concurrently. CAPIO will
113-
synchronize transparently the concurrent reads and writes on those files. If a file is never modified after it is closed
114-
you can set the streaming semantics equals to "on_close" on the configuration file. In this way, all the reads done on
115-
this file will hung until the writer closes the file, allowing the consumer application to read the file even if the
116-
producer is still running.
117-
Another supported file streaming semantics is "append" in which a read is satisfied when the producer writes the
118-
requested data. This is the most aggressive (and efficient) form of streaming semantics (because the consumer can start
119-
reading while the producer is writing the file). This semantic must be used only if the producer does not modify a piece
120-
of data after it is written.
121-
The streaming semantic on_termination tells CAPIO to not allowing streaming on that file. This is the default streaming
122-
semantics if a semantics for a file is not specified.
123-
The following is an example of a simple configuration:
124-
125-
```json
126-
{
127-
"name": "my_workflow",
128-
"IO_Graph": [
129-
{
130-
"name": "writer",
131-
"output_stream": [
132-
"file0.dat",
133-
"file1.dat",
134-
"file2.dat"
135-
],
136-
"streaming": [
137-
{
138-
"name": ["file0.dat"],
139-
"committed": "on_close"
140-
},
141-
{
142-
"name": ["file1.dat"],
143-
"committed": "on_close",
144-
"mode": "no_update"
145-
},
146-
{
147-
"name": ["file2.dat"],
148-
"committed": "on_termination"
149-
}
150-
]
151-
},
152-
{
153-
"name": "reader",
154-
"input_stream": [
155-
"file0.dat",
156-
"file1.dat",
157-
"file2.dat"
158-
]
159-
}
160-
]
161-
}
96+
```bash
97+
CAPIO_DIR=your_capiodir \
98+
CAPIO_WORKFLOW_NAME=wfname \
99+
CAPIO_APP_NAME=appname \
100+
LD_PRELOAD=libcapio_posix.so \
101+
./your_app <args>
102+
103+
killall -USR1 capio_server
162104
```
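As a rough sketch of the manual launch above, the per-step environment can be wrapped in a small shell function. `run_capio_step` and `CAPIO_PRELOAD_LIB` are hypothetical names introduced here for illustration and are not part of CAPIO; only the four environment variables come from this README. Falling back to the current directory for `CAPIO_DIR` is likewise an assumption of the sketch.

```bash
# Hypothetical helper (not shipped with CAPIO): run one workflow step with the
# CAPIO variables applied to that command only.
# Usage: run_capio_step <workflow_name> <app_name> <command> [args...]
run_capio_step() {
    wf_name="$1"
    app_name="$2"
    shift 2
    # env(1) sets the variables for the child process only, so the calling
    # shell is left untouched. CAPIO_PRELOAD_LIB is an illustrative override
    # for the library path; when unset, the name used in this README applies.
    env CAPIO_DIR="${CAPIO_DIR:-$PWD}" \
        CAPIO_WORKFLOW_NAME="$wf_name" \
        CAPIO_APP_NAME="$app_name" \
        LD_PRELOAD="${CAPIO_PRELOAD_LIB-libcapio_posix.so}" \
        "$@"
}
```

A call such as `run_capio_step wfname appname ./your_app <args>` would then replace the multi-line invocation, with the workflow and application names still expected to match the CAPIO-CL configuration file.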
163105

164-
> [!NOTE]
165-
> We are working on an extension of the possible streaming semantics and in a detailed
166-
> documentation about the configuration file!
106+
> [!CAUTION]
107+
> If `CAPIO_APP_NAME` and `CAPIO_WORKFLOW_NAME` are not set (or are set but do not match the values present in the
108+
> CAPIO-CL configuration file), CAPIO will not be able to operate correctly!
167109
168-
## Examples
110+
> [!TIP]
111+
> To gracefully shut down the `capio_server` instance, just send it the SIGUSR1 signal.
112+
> The `capio_server` process will then automatically clean up and terminate itself!
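The shutdown step can also be scripted when the daemon was started from the same shell. The function below is a minimal sketch under that assumption; `stop_capio_daemon` is a hypothetical name, not a CAPIO utility, and it targets a single PID instead of using `killall`.

```bash
# Hypothetical helper (not part of CAPIO): ask a daemon started by this shell
# to shut down with SIGUSR1, then block until it has exited.
# Usage: stop_capio_daemon <pid>
stop_capio_daemon() {
    daemon_pid="$1"
    # kill fails if the process no longer exists; report that to the caller.
    kill -s USR1 "$daemon_pid" || return 1
    # wait only works for children of the current shell. Its status is the
    # daemon's own exit status, which this sketch deliberately ignores.
    wait "$daemon_pid" 2>/dev/null || true
}
```

For instance, after starting `capio_server -c conf.json &`, the value of `$!` could be stored and passed to the function once the last workflow step has finished.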
169113
170-
The [examples](examples) folder contains some examples that shows how to use mpi_io with CAPIO.
171-
There are also examples on how to write JSON configuration files for the semantics implemented by CAPIO:
114+
---
172115

173-
- [on_close](https://github.com/High-Performance-IO/capio/wiki/Examples#on_close-semantic): A pipeline composed by a
174-
producer and a consumer with "on_close" semantics
175-
- [no_update](https://github.com/High-Performance-IO/capio/wiki/Examples#noupdate-semantics): A pipeline composed by a
176-
producer and a consumer with "no_update" semantics
177-
- [mix_semantics](https://github.com/High-Performance-IO/capio/wiki/Examples#mixed-semantics): A pipeline composed by a
178-
producer and a consumer with mix semantics
116+
## ⚙️ Environment Variables
179117

180-
## Report bugs + get help
118+
### 🔄 Global
181119

182-
[Create a new issue](https://github.com/High-Performance-IO/capio/issues/new)
120+
| Variable | Description |
121+
|-------------------------|----------------------------------------------------|
122+
| `CAPIO_DIR` | Shared mount point for server and application |
123+
| `CAPIO_LOG_LEVEL` | Logging level (requires `-DCAPIO_LOG=TRUE`) |
124+
| `CAPIO_LOG_PREFIX` | Log file name prefix (default: `posix_thread_`) |
125+
| `CAPIO_LOG_DIR` | Directory for log files (default: `capio_logs`) |
126+
| `CAPIO_CACHE_LINE_SIZE` | Size of a single CAPIO cache line (default: 256KB) |
183127

184-
[Get help](https://github.com/High-Performance-IO/capio/wiki)
128+
### 🖥️ Server-Only
185129

186-
> [!TIP]
187-
> A [wiki](https://github.com/High-Performance-IO/capio/wiki) is in development! You might want to check the wiki to get
188-
> more in depth information about CAPIO!
130+
| Variable | Description |
131+
|----------------------|----------------------------------------------------------------------------|
132+
| `CAPIO_METADATA_DIR` | Directory for metadata files. Defaults to `CAPIO_DIR`. Must be accessible. |
133+
134+
### POSIX-Only (Mandatory)
135+
136+
> ⚠️ These are required by CAPIO-POSIX. Without them, your app will not behave as configured in the JSON file.
137+
138+
| Variable | Description |
139+
|-----------------------|-------------------------------------------------|
140+
| `CAPIO_WORKFLOW_NAME` | Must match `"name"` field in your configuration |
141+
| `CAPIO_APP_NAME` | Name of the step within your workflow |
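Because both variables are mandatory, a launch script may want to fail early when one is missing. The check below is a sketch only: `check_capio_env` is a hypothetical name, and it verifies that the variables are non-empty, not that they match the configuration file.

```bash
# Hypothetical pre-flight check (not part of CAPIO): verify that the two
# mandatory CAPIO-POSIX variables are set before starting a step.
check_capio_env() {
    missing=""
    [ -n "${CAPIO_WORKFLOW_NAME:-}" ] || missing="$missing CAPIO_WORKFLOW_NAME"
    [ -n "${CAPIO_APP_NAME:-}" ] || missing="$missing CAPIO_APP_NAME"
    if [ -n "$missing" ]; then
        # List every missing variable on stderr and signal failure.
        echo "missing CAPIO variables:$missing" >&2
        return 1
    fi
    return 0
}
```

A script could call `check_capio_env || exit 1` right before launching the application with `LD_PRELOAD` set.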
142+
143+
---
144+
145+
## 📖 Extended documentation
146+
147+
Documentation and examples are available on the official site:
148+
149+
[https://capio.hpc4ai.it/docs](https://capio.hpc4ai.it/docs)
150+
151+
---
152+
153+
## 🐞 Report Bugs & Get Help
154+
155+
- [Create an issue](https://github.com/High-Performance-IO/capio/issues/new)
156+
- [Official Documentation](https://capio.hpc4ai.it/docs)
157+
158+
---
159+
160+
## 👥 CAPIO Team
161+
162+
Made with ❀️ by:
163+
164+
- Marco Edoardo Santimaria – <marcoedoardo.santimaria@unito.it> (Designer & Maintainer)
165+
- Iacopo Colonnelli – <iacopo.colonnelli@unito.it> (Workflow Support & Maintainer)
166+
- Massimo Torquati – <massimo.torquati@unipi.it> (Designer)
167+
- Marco Aldinucci – <marco.aldinucci@unito.it> (Designer)
189168

190-
## CAPIO Team
169+
**Former Members:**
191170

192-
Made with :heart: by:
171+
- Alberto Riccardo Martinelli – <albertoriccardo.martinelli@unito.it> (Designer & Maintainer)
193172

194-
Alberto Riccardo Martinelli <albertoriccardo.martinelli@unito.it> (designer and maintainer) \
195-
Marco Edoardo Santimaria <marcoedoardo.santimaria@unito.it> (Designer and maintainer) \
196-
Iacopo Colonnelli <iacopo.colonnelli@unito.it> (Workflows expert and maintainer) \
197-
Massimo Torquati <massimo.torquati@unipi.it> (Designer) \
198-
Marco Aldinucci <marco.aldinucci@unito.it> (Designer)
173+
---
199174

200-
## Papers
201-
[![CAPIO](https://img.shields.io/badge/CAPIO-10.1109/HiPC58850.2023.00031-red)]([https://arxiv.org/abs/2206.10048](https://dx.doi.org/10.1109/HiPC58850.2023.00031))
175+
## 📚 Publications
202176

177+
[![CAPIO](https://img.shields.io/badge/CAPIO-10.1109/HiPC58850.2023.00031-red)](https://dx.doi.org/10.1109/HiPC58850.2023.00031)
203178

179+
[![](https://img.shields.io/badge/CAPIO--CL-10.1007%2Fs10766--025--00789--0-green?style=flat&logo=readthedocs)](https://doi.org/10.1007/s10766-025-00789-0)
