**Sample!** Take a small, representative sample from a batch, whether it is data, files, or parameters. If the batch is heterogeneous, take a few representative samples. If possible, run the pipeline or workflow start to finish with that sample. Then run the same pipeline with more resources and ask yourself, "Does it speed up?" Extrapolate the time needed for the entire data or parameter set from this small sample. This will also help you find bugs or errors in the code much faster than starting with the entire data or parameter set.
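This approach works well for embarrassingly parallel problems, where each item is processed independently and total runtime scales roughly linearly with the number of items.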
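As a rough illustration of the sample-then-extrapolate idea, here is a minimal Python sketch. Everything in it is an assumption made for the example: `process_item` is a hypothetical stand-in for one unit of your real pipeline, and `extrapolate_seconds` assumes ideal linear scaling with item count and worker count, which real workloads only approximate.

```python
import random
import time


def process_item(item):
    """Hypothetical stand-in for one unit of work in the real pipeline."""
    return sum(i * i for i in range(item))


def time_sample(items, sample_size, seed=0):
    """Run the pipeline on a random sample and return the elapsed seconds."""
    rng = random.Random(seed)  # fixed seed so the same sample is drawn each run
    sample = rng.sample(items, min(sample_size, len(items)))
    start = time.perf_counter()
    for item in sample:
        process_item(item)
    return time.perf_counter() - start


def extrapolate_seconds(sample_seconds, sample_size, total_size, workers=1):
    """Scale the sample time to the full batch.

    Assumes runtime is linear in the number of items and that the work
    divides evenly across workers (the embarrassingly parallel case).
    """
    return sample_seconds / sample_size * total_size / workers


def speedup(baseline_seconds, scaled_seconds):
    """Ratio answering "does it speed up?" after adding resources."""
    return baseline_seconds / scaled_seconds


if __name__ == "__main__":
    batch = list(range(1_000, 11_000))  # hypothetical batch of 10,000 items
    sample_size = 50
    elapsed = time_sample(batch, sample_size)
    estimate = extrapolate_seconds(elapsed, sample_size, len(batch))
    print(f"sample took {elapsed:.3f} s, full batch estimated at {estimate:.1f} s")
```

To check scaling, time the same sample again with more resources and compare the two measurements with `speedup`. A ratio well below the increase in resources suggests that the linear extrapolation will underestimate the full run.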