We have an HPC system. In our application, the head node assigns tasks to the compute nodes. Each compute node stores its results in a matrix. We want every compute node to send its matrix directly back to the head node, which will then process the results. We are not considering saving the intermediate results to a database or to files, since that is too slow.
Please let us know if there is any documentation on this. Thank you very much!