Agent Self-Surveillance

PerformanceGuard agents are typically installed on key computers that many users and business processes rely on, so it's important that an agent doesn't degrade the performance of the computer it's installed on. Agents therefore have a number of built-in features that protect the systems they run on.

To make its measurements, an agent collects information from many parts of the Windows operating system. This means that the agent itself is affected if the operating system isn't performing well.

Example: The agent fetches Windows performance counter data by querying a special key in the Windows registry. If the registry, or the underlying mechanism used to collect performance counters, starts using more CPU resources, then the PerformanceGuard Agent service will begin to use more CPU resources as the performance data methods are called and executed within the agent service's context.



Service Recovery Handled by Windows SCM


The PerformanceGuard agent runs as a service. Service recovery is handled by Windows' Service Control Manager (SCM).

The service is installed with the following recovery options:

  • First failure: Restart the service
  • Second failure: Restart the service
  • Subsequent failures: Take no action
  • Reset count after: 10 minutes

This means that the operating system will restart the agent a maximum of two times within 10 minutes if the agent service stops unexpectedly.
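The recovery options above can be modelled as a small state machine. The following is a simplified sketch, not the SCM's actual implementation: the class name RecoveryPolicy and its method are hypothetical, and the sketch assumes that the failure count resets once 10 minutes pass without a failure.

```python
from dataclasses import dataclass
from typing import Optional

RESET_PERIOD_SECONDS = 10 * 60  # "Reset count after: 10 minutes"
# Actions for the first, second, and subsequent failures, as listed above.
ACTIONS = ["restart", "restart", "none"]


@dataclass
class RecoveryPolicy:
    """Hypothetical model of the recovery options; not a Windows API."""

    failure_count: int = 0
    last_failure: Optional[float] = None

    def on_failure(self, now: float) -> str:
        """Record a failure at time `now` (in seconds) and return the action taken."""
        # Assumption: the count resets after a failure-free reset period.
        if (
            self.last_failure is not None
            and now - self.last_failure >= RESET_PERIOD_SECONDS
        ):
            self.failure_count = 0
        self.failure_count += 1
        self.last_failure = now
        # The last entry in ACTIONS applies to every subsequent failure.
        index = min(self.failure_count, len(ACTIONS)) - 1
        return ACTIONS[index]
```

In this model, three failures in quick succession produce two restarts followed by no action, and a failure that occurs after a quiet period of 10 minutes is treated as a first failure again.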


Agent Thread Checks


The various kinds of data collected by the agent are handled by a number of threads of execution, all hosted within the PerformanceGuard agent executable.

All the threads of execution within the agent are checked by a special monitoring thread.

A thread is either:

  • Waiting for an event from another thread
  • Waiting for a timer to expire
  • Collecting measurements

Every time a thread wakes up from waiting, it signals to the monitoring thread. Likewise, a thread will signal to the monitoring thread before it begins to wait.

The monitoring thread keeps track of all other threads and checks that they are all alive. A thread should not be collecting data for more than a few seconds without checking in with the monitoring thread.

Example: If a thread gets stuck in an operating system call that never returns, or enters an infinite loop, the monitoring thread will detect this.

When the monitoring thread detects a thread that hasn't reported back in due time, it instructs the Windows Service Control Manager (SCM) to restart the PerformanceGuard Agent service.
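The check-in scheme described in this section can be sketched as follows. This is an illustration only, not the agent's actual code: the Monitor class, its method names, and the 5-second timeout (standing in for "a few seconds") are all assumptions.

```python
import threading
import time
from typing import Callable, Dict, List, Tuple

CHECK_IN_TIMEOUT_SECONDS = 5.0  # assumed value for "a few seconds"


class Monitor:
    """Hypothetical watchdog that tracks when each worker last checked in."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._lock = threading.Lock()
        # Worker name -> (state, time of last check-in).
        self._workers: Dict[str, Tuple[str, float]] = {}

    def check_in(self, name: str, state: str) -> None:
        """Called by a worker before it waits ("waiting") and when it wakes ("collecting")."""
        with self._lock:
            self._workers[name] = (state, self._clock())

    def stalled(self) -> List[str]:
        """Return the workers that have been collecting for longer than the timeout."""
        now = self._clock()
        with self._lock:
            return sorted(
                name
                for name, (state, since) in self._workers.items()
                if state == "collecting" and now - since > CHECK_IN_TIMEOUT_SECONDS
            )
```

In such a design, the monitoring thread would call stalled() at regular intervals and request a service restart when the list is not empty. Workers that are waiting are never reported, because waiting for an event or a timer can legitimately take a long time.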


Agent Memory Usage Checks


The PerformanceGuard agent regularly checks how much of the host computer's memory it uses. If the agent detects that it uses 60 MB or more of the host computer's RAM, the PerformanceGuard Agent service will automatically restart itself in order not to use excessive amounts of the computer's memory.

Such an agent restart will be logged, and you can view information about it in the message log for the computer in question. To view the message log, select ANALYZE > Computer Search, and search for a computer. Then, in the search results, click the name of the required computer, and then select the Message Log tab. In the log, agent restarts caused by memory usage will be listed with the name Agent restart caused by : Agent memory size exceeded.


Example: Agent restart caused by : Agent memory size exceeded - Heap size (kB) = 65290
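If you export message log entries and want to extract the reported heap size, a small parser is enough. The following is a minimal sketch: the function name parse_heap_size_kb is hypothetical, and the pattern is based only on the example message above, so other message variants may not match.

```python
import re
from typing import Optional

# Pattern derived from the example message; other variants are not covered.
_HEAP_SIZE = re.compile(
    r"Agent restart caused by\s*:\s*Agent memory size exceeded"
    r"\s*-\s*Heap size \(kB\)\s*=\s*(\d+)"
)


def parse_heap_size_kb(message: str) -> Optional[int]:
    """Return the reported heap size in kB, or None if the message does not match."""
    match = _HEAP_SIZE.search(message)
    return int(match.group(1)) if match else None
```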

Older versions of the PerformanceGuard agent use a limit of 40 MB, and very old agent versions don't have the memory usage check feature at all.
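The limits can be expressed as a simple threshold check. This sketch is illustrative only: the names are hypothetical, it assumes 1 MB = 1024 kB, and it assumes that the value compared against the limit is the heap size reported in the log message, which the documentation does not state explicitly.

```python
KB_PER_MB = 1024  # assumption: 1 MB = 1024 kB

CURRENT_LIMIT_MB = 60  # limit used by current agent versions
OLDER_LIMIT_MB = 40    # limit used by older agent versions


def exceeds_limit(heap_size_kb: int, limit_mb: int = CURRENT_LIMIT_MB) -> bool:
    """Return True when the heap size has reached the limit ("60 MB or more")."""
    return heap_size_kb >= limit_mb * KB_PER_MB
```

Under these assumptions, the 65290 kB reported in the example message is above the 60 MB limit of 61440 kB.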
