Verified Commit cec3111c authored by Andrei Plamada

small improvements

parent c48e9a00
@@ -65,9 +65,10 @@ sys 0m3.319s
When benchmarking:
- make sure the system is not used for other tasks (we can check this by monitoring the resource)
- it is hard to compare different infrastructures
- it is hard to compared using different inputs
- make sure the system is not used for other tasks (we can check this by monitoring the resources)
- it is hard to compare different infrastructures (we should avoid doing it)
- it is hard to compare when using different inputs (we should avoid doing it)
- IO operations make the benchmarking less predictable (more later)
## Resource Monitoring (CPU and Memory)
@@ -251,7 +252,9 @@ _hint_: [Resource monitor](https://cctools.readthedocs.io/en/latest/resource_mon
### Estimating your hardware requirements
**Sample!** Take a representive small sample from a batch, whether it's data, file or parameters. If the batch is heterogeneous, take a few representative samples. If possible, run a pipeline or workflow start-to-finish with that sample. Run the same pipeline with more resources and ask yourself, "does it speed up?" Extrapolate the time needed for the entire data or parameter set from this small sample. Doing this will also help you find bugs or errors in the code much faster than when starting with the entire data or parameter set.
It works well for embarrassingly parallel problems.
**Sample!** Take a representative small sample from a batch, whether it's data, file or parameters. If the batch is heterogeneous, take a few representative samples. If possible, run a pipeline or workflow start-to-finish with that sample. Run the same pipeline with more resources and ask yourself, "does it speed up?" Extrapolate the time needed for the entire data or parameter set from this small sample. Doing this will also help you find bugs or errors in the code much faster than when starting with the entire data or parameter set.
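The extrapolation step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original material: the function names `extrapolate_runtime` and `speedup`, and all the numbers, are hypothetical. It assumes the runtime grows linearly with the number of items and that the work divides evenly across workers, which is roughly true only for embarrassingly parallel problems.

```python
def extrapolate_runtime(sample_seconds, sample_items, total_items, workers=1):
    """Estimate wall-clock seconds for the full batch from a timed sample.

    Assumption (hypothetical model): cost per item is constant and the
    work splits evenly across `workers` with no overhead.
    """
    if sample_items <= 0 or total_items < 0 or workers <= 0:
        raise ValueError("item counts and workers must be positive")
    seconds_per_item = sample_seconds / sample_items
    return seconds_per_item * total_items / workers


def speedup(baseline_seconds, scaled_seconds):
    """Ratio > 1 means the run with more resources was faster."""
    if scaled_seconds <= 0:
        raise ValueError("scaled_seconds must be positive")
    return baseline_seconds / scaled_seconds


# Made-up measurements: a 50-item sample took 120 s on one core.
estimate = extrapolate_runtime(
    sample_seconds=120.0, sample_items=50, total_items=10_000, workers=4
)
print(f"estimated full run: {estimate / 3600:.2f} h on 4 workers")

# Made-up measurements: the same sample took 70 s with twice the cores.
print(f"observed speedup with 2x cores: {speedup(120.0, 70.0):.2f}x")
```

If the observed speedup is well below the increase in resources, the linear assumption does not hold and the estimate should be treated as optimistic.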
### Running jobs