OpenPaths Help

Changes Summary

In the new cluster topology, a “Cluster Manager” process runs on each host. It handles communication with the Cluster Managers on other hosts and controls its own collection of “slave” processes (hereafter referred to as ‘auxiliary processes’).

Auxiliary processes no longer need to be started manually on each host. The Cluster Manager handles this, based on the configuration in the initial Pilot script. The Cluster Manager on the machine where the “master” node (hereafter referred to as the ‘main process’) is executing contacts each required Cluster Manager and starts the required number of processes on each.

Error detection has been improved: if any process fails, it triggers a failure for all processes in the run, and the entire run exits cleanly.
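The fail-together behaviour can be pictured with a small Python sketch. This is not the Cluster Manager implementation and not Pilot script syntax; `run_all`, `failing_task` and `patient_task` are hypothetical names, and threads stand in for processes. It only illustrates the idea that one failure signals every other participant to stop so the whole run ends cleanly.

```python
import threading


def run_all(tasks):
    """Run every task in its own thread; the first failure tells the rest to stop."""
    abort = threading.Event()   # shared "the run has failed" flag (assumption)
    errors = []

    def guarded(task):
        try:
            task(abort)
        except Exception as exc:
            errors.append(exc)
            abort.set()         # signal every other task to stop

    threads = [threading.Thread(target=guarded, args=(task,)) for task in tasks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()           # the run ends only once every task has exited
    return len(errors) == 0, errors


def failing_task(abort):
    raise RuntimeError("simulated process failure")


def patient_task(abort):
    # Would wait up to 5 seconds, but returns as soon as the abort flag is set.
    abort.wait(timeout=5.0)


ok, errors = run_all([failing_task, patient_task])
print("run succeeded" if ok else f"run failed: {errors[0]}")
```

In this sketch the healthy task stops as soon as it sees the abort flag instead of running to completion, which mirrors the "entire run exits" behaviour described above.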

Auxiliary processes are now simply allocated an ID. No “process name” or number needs to be specified, and any process from the pool that can perform the task is used. Intrastep tasks simply request the number of processes desired. As a result, multiple requests for the same process are no longer possible.
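The pool-based allocation can be modelled conceptually as follows. This is a minimal Python sketch, not the product's code: the `ProcessPool` class, its methods, and the choice to raise an error when too few processes are free are all assumptions made for illustration. It shows why requesting a count, rather than naming a process, rules out two tasks being given the same process.

```python
class ProcessPool:
    """Toy model of auxiliary processes that are identified only by an ID."""

    def __init__(self, size):
        self._free = set(range(1, size + 1))   # IDs 1..size, all idle at start
        self._busy = set()

    def request(self, count):
        """Hand out `count` idle process IDs; the caller never names a process."""
        if count > len(self._free):
            # Assumption: the real system may queue or wait instead of failing.
            raise RuntimeError(f"only {len(self._free)} process(es) free")
        granted = sorted(self._free)[:count]
        self._free.difference_update(granted)
        self._busy.update(granted)
        return granted

    def release(self, ids):
        """Return processes to the pool once the task has finished with them."""
        for pid in ids:
            if pid not in self._busy:
                raise ValueError(f"process {pid} was not allocated")
            self._busy.remove(pid)
            self._free.add(pid)


pool = ProcessPool(4)
first = pool.request(2)
second = pool.request(2)
print(first, second)   # the two grants never share an ID
pool.release(first)
```

Because callers only ever ask for a number of processes, the pool decides which IDs to hand out, and an ID that is already busy is never offered again until it has been released.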

A separate wait function, “BARRIER”, has been implemented in Pilot for multistep operations. WAIT4FILES still exists, but because it operates only on files, it no longer serves as a wait for cluster operations to complete.
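The general idea of a barrier, as opposed to waiting for files to appear, can be sketched in Python using the standard `threading.Barrier` class. This is a conceptual illustration only and does not show the syntax or internals of the Pilot BARRIER statement; `run_step` and `worker` are hypothetical names, and threads stand in for cluster processes.

```python
import threading


def run_step(worker_count):
    """Start workers, then block until every one of them reaches the barrier."""
    results = []
    lock = threading.Lock()
    # One party per worker, plus one for the coordinating (main) thread.
    barrier = threading.Barrier(worker_count + 1)

    def worker(worker_id):
        with lock:
            results.append(worker_id)   # stand-in for the real work
        barrier.wait()                  # report "done" by arriving at the barrier

    threads = [
        threading.Thread(target=worker, args=(worker_id,))
        for worker_id in range(1, worker_count + 1)
    ]
    for thread in threads:
        thread.start()

    barrier.wait(timeout=10)            # main continues only after all workers arrive
    for thread in threads:
        thread.join()
    return sorted(results)


print(run_step(3))
```

The coordinating thread here synchronises on the workers themselves rather than on the existence of their output files, which is the distinction the paragraph above draws between BARRIER and WAIT4FILES.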