Configuring Load Balancing on Platform Agents
When your processes can run on more than one process server, you will want to spread the load as evenly as possible. By default, the system runs a new process on the process server that has the fewest running processes. For example, if three process servers are eligible and are running 1, 2, and 3 processes respectively, the system attempts to run the process on the process server with 1 running process.
You can optionally use OS metric load balancing, where the decision is based on a metric rather than on the number of running processes. The metric is user-defined. All metrics are first stored in the monitor tree, which is a tree of nodes and values.
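The default rule can be sketched in a few lines of Python. This is only an illustration of the selection logic described above, not the product's implementation; the function name and the server names are hypothetical.

```python
def pick_least_busy(running_by_server):
    """Return the name of the server with the fewest running processes.

    running_by_server maps a (hypothetical) process server name to the
    number of processes it is currently running.
    """
    return min(running_by_server, key=running_by_server.get)


# Three eligible process servers running 1, 2 and 3 processes.
running = {"ps_a": 1, "ps_b": 2, "ps_c": 3}
print(pick_least_busy(running))  # prints: ps_a
```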
Built-in platform agent metrics
The platform agents can send CPU and memory usage data to the server. The frequency at which this data is sent is controlled by the MonitorInterval process server parameter. If you do not set this parameter, the default value is 60 (seconds). Set the value to 0 (zero) if you do not plan to use load balancing, or if you plan to base it on values other than the system metrics.
Once you set MonitorInterval to a non-zero value, the data for load balancing is stored at:
/System/ProcessServer/<process_server>/Performance/CPUBusy
/System/ProcessServer/<process_server>/Performance/PageRate
Custom defined metrics
The load balancing system can also use other metrics. These can be written to the monitoring tree using the Java API, or using the jmonitor tool within a (possibly long-running) process.
For example, if you want to base your decision on the number of free printers available at a particular system, you should call the following in your job(s) whenever the number of available printers changes:
jmonitor -j /System/ProcessServer/<process_server>/Custom/FreePrinters=$FREE
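If the value is computed inside a script, the command line can be assembled programmatically. The following Python sketch only builds the argument list using the `-j <path>=<value>` form shown above; it does not run jmonitor. The function name and the process server name `PS_PRINT` are hypothetical.

```python
def jmonitor_args(process_server, leaf, value):
    """Build the jmonitor argument list for a custom monitor leaf."""
    # Path layout taken from the example above:
    # /System/ProcessServer/<process_server>/Custom/<leaf>
    path = f"/System/ProcessServer/{process_server}/Custom/{leaf}"
    return ["jmonitor", "-j", f"{path}={value}"]


# Hypothetical process server with 3 free printers.
print(" ".join(jmonitor_args("PS_PRINT", "FreePrinters", 3)))
# prints: jmonitor -j /System/ProcessServer/PS_PRINT/Custom/FreePrinters=3
```

The resulting list could be passed to `subprocess.run` on a host where jmonitor is installed.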
Setting the Load Factor
You influence the computed load on a particular process server by adding one or more load factors to its definition. Each load factor has three attributes:
- Threshold - the maximum allowed value; once it is reached, the process server is set to Overloaded until the threshold is no longer met.
- Multiplier - the weight applied to the monitor value when the load is compared with that of other process servers.
- Monitor Value - the monitor tree leaf whose value is used.
The following fields are set at process server level:
- Load Threshold - the maximum allowed load, counting values from all load factors; once reached, the process server is put to Overloaded until the load threshold is no longer met.
- Execution Size - the maximum number of concurrent processes the process server can run
Once you have set the load factors, the system runs a new process on the process server that has the lowest value for the following equation:
sum(Multiplier * Monitor Value)
The result of the above equation is also used to determine if the Load Threshold has been reached.
If a process server has at least one load factor whose current monitor value exceeds its threshold, or if its computed load exceeds the Load Threshold, it has the status Overloaded and is not chosen.
If two or more process servers have an identical load, the process server that was created first is used.
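The rules above can be sketched as follows. This is a minimal illustration, not the product's implementation. The class and field names are hypothetical, the Load Threshold of 500 is an arbitrary value, and the sketch assumes that a threshold counts as met when the value is greater than or equal to it.

```python
from dataclasses import dataclass, field


@dataclass
class LoadFactor:
    multiplier: float
    threshold: float
    monitor_value: float  # current value of the monitor leaf


@dataclass
class ProcessServer:
    name: str
    created_order: int      # lower means created earlier
    load_threshold: float
    load_factors: list = field(default_factory=list)

    def load(self):
        # sum(Multiplier * Monitor Value) over all load factors
        return sum(f.multiplier * f.monitor_value for f in self.load_factors)

    def overloaded(self):
        # Assumption: "reached" is treated as >= for both kinds of threshold.
        factor_hit = any(f.monitor_value >= f.threshold for f in self.load_factors)
        return factor_hit or self.load() >= self.load_threshold


def choose(servers):
    """Return the non-overloaded server with the lowest load, or None."""
    candidates = [s for s in servers if not s.overloaded()]
    if not candidates:
        return None
    # Ties on load go to the server that was created first.
    return min(candidates, key=lambda s: (s.load(), s.created_order))


a = ProcessServer("A", 1, 500, [LoadFactor(2, 90, 40)])  # load 2 * 40 = 80
b = ProcessServer("B", 2, 500, [LoadFactor(1, 75, 60)])  # load 1 * 60 = 60
print(choose([a, b]).name)  # prints: B
```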
Note: If any of the eligible process servers has no load factors at all, OS metric load balancing is not used and the system reverts to counting the number of processes that are already executing. This means that for load factor-based load balancing to be applied to a queue, all process servers serving the queue must have at least one load factor. This allows you to control for which queues of a process server load balancing is taken into account.
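The fallback condition can be expressed as a simple check. This sketch is illustrative only; the function name and the returned mode labels are hypothetical.

```python
def balancing_mode(load_factor_counts):
    """Decide which balancing method applies to a set of eligible servers.

    load_factor_counts maps a process server name to the number of
    load factors defined on it.
    """
    if load_factor_counts and all(n > 0 for n in load_factor_counts.values()):
        return "load-factors"
    # At least one eligible server has no load factors.
    return "process-count"


print(balancing_mode({"A": 1, "B": 2}))  # prints: load-factors
print(balancing_mode({"A": 1, "B": 0}))  # prints: process-count
```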
Procedure
Setting up load balancing between two agents
- Navigate to "Environment > Process Servers".
- Choose Edit from the context-menu of the weak process server A.
- On the Load Factors tab choose Add and enter System load into the description field.
- In the Multiplier field enter 2.
- In the Threshold field enter 90.
- In the Monitor Value field choose the CPUBusy monitor for the path /System/ProcessServer/<agent>/Performance, where <agent> is the agent that you are editing.
- Choose Save & Close.
- Choose Edit from the context-menu of the strong process server B.
- On the Load Factors tab choose Add and enter System load into the description field.
- In the Multiplier field enter 1.
- In the Threshold field enter 75.
- In the Monitor Value field choose the CPUBusy monitor for the path /System/ProcessServer/<agent>/Performance, where <agent> is the agent that you are editing.
- Choose Save & Close.
Example
You want to balance the workload between two platform agents running on two different machines. The first machine has 2 slow CPUs; the second has 4 CPUs that are each twice as fast. You want to maximize the throughput of the system by allocating more processes to the bigger server, but you also need to reserve capacity on that machine for online users. To do so, you could implement the following settings:
Process server A (slow, with 2 CPUs) suggested load factor:
Field | Value |
---|---|
Multiplier | 1 |
Threshold | 100 |
Monitor Value | CPUBusy (for that process server) |
Process server B (fast, with 4 CPUs) suggested load factor:
Field | Value |
---|---|
Multiplier | 1 |
Threshold | 75 |
Monitor Value | CPUBusy (for that process server) |
The multiplier value can remain 1, as a process running on the faster system B will have less of an impact on CPU load than the same process will have on system A. Multipliers are most often used when you combine multiple load factors.
The threshold is set to 75 on process server B, so that we reserve one CPU (one out of four = 25%) for non-batch work.
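To see how these settings behave, the following Python sketch applies them to a few CPUBusy readings. The readings are invented for illustration. The sketch assumes that a threshold counts as met when the value is greater than or equal to it, that process server A was created first, and that the Load Threshold is high enough not to matter.

```python
def pick(cpu_a, cpu_b):
    """Return "A", "B" or None for the given CPUBusy readings (percent)."""
    # (name, multiplier, threshold, current CPUBusy) from the tables above
    servers = [("A", 1, 100, cpu_a), ("B", 1, 75, cpu_b)]
    eligible = [
        (multiplier * cpu, name)
        for name, multiplier, threshold, cpu in servers
        if cpu < threshold  # a server at or above its threshold is Overloaded
    ]
    if not eligible:
        return None
    # Lowest load wins; on a tie min() keeps the first entry, which is A.
    return min(eligible, key=lambda entry: entry[0])[1]


print(pick(60, 40))  # prints: B  (B has the lower load)
print(pick(60, 70))  # prints: A  (A has the lower load)
print(pick(60, 80))  # prints: A  (B is over its threshold of 75)
```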