Quick Tip: Compress Your Output Files
Blog post from Rescale
Cloud high-performance computing (HPC) users need to minimize data transfer between on-premise machines and cloud-based systems, because the Wide Area Network between them is slower and less reliable than a local network. The first step is to perform post-processing remotely and transfer only the data that is actually needed.

In a typical workflow, a user runs a simulation and then needs to bring the output files back. The output files are encrypted and stored in the cloud, so they can be downloaded selectively. However, every individual file transfer incurs overhead, so bundling the files into a single archive and compressing it reduces both the number of transfers and the total size to move.

Compression takes time too, so the choice of tool matters. Jeff Gilchrist's MPI-compatible bz2 compressor spreads the work across multiple cores, which makes it significantly faster than single-core gzip. A compressed bz2 file can be up to five times smaller than the uncompressed tar file, which significantly reduces download time, especially over slower internet connections. Automating these steps is expected to further streamline data handling for cloud HPC users in the future.
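As a rough illustration of the bundle-then-compress step, here is a minimal Python sketch. It is not the tool described in the post: it uses the single-threaded bz2 support in Python's standard `tarfile` module rather than a multi-core compressor, and the function name `pack_outputs`, the `run_output` directory, and the generated log files are hypothetical placeholders. It only shows how many output files become one smaller archive to download.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_outputs(output_dir, archive_path):
    """Bundle every file under output_dir into one bzip2-compressed tar archive.

    Returns (raw_bytes, archive_bytes) so the size reduction can be inspected.
    The archive must be written outside output_dir.
    """
    output_dir = Path(output_dir)
    # Total size of the individual files before bundling.
    raw_bytes = sum(p.stat().st_size for p in output_dir.rglob("*") if p.is_file())
    # "w:bz2" writes a tar stream compressed with bzip2 (single-threaded here).
    with tarfile.open(archive_path, "w:bz2") as tar:
        tar.add(output_dir, arcname=output_dir.name)
    return raw_bytes, Path(archive_path).stat().st_size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical simulation output: many small, repetitive text files.
        run_dir = Path(tmp) / "run_output"
        run_dir.mkdir()
        for i in range(20):
            (run_dir / f"step_{i:03d}.log").write_text(
                "iteration residual 1.0e-06\n" * 2000
            )
        raw, packed = pack_outputs(run_dir, Path(tmp) / "run_output.tar.bz2")
        print(f"{raw} bytes in 20 files -> {packed} bytes in 1 archive")
```

The ratio you get depends entirely on the data: repetitive text logs like the ones generated here compress far better than binary result files, so treat the numbers from this sketch as illustrative only.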