
Synthetics job manager configuration

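This doc guides you through configuring your synthetics job manager: environment variables, user-defined variables for scripted monitors, custom node modules, permanent data storage, and sizing.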

Configuration using environment variables

Environment variables allow you to fine-tune the synthetics job manager configuration to meet your specific environmental and functional needs.

User-defined variables for scripted monitors

Private synthetics job managers let you configure environment variables for scripted monitors. These variables are managed locally on the SJM and can be accessed via $env.USER_DEFINED_VARIABLES. You can set user-defined variables in two ways. You can mount a JSON file or you can supply an environment variable to the SJM on launch. If both are provided, the SJM will only use values provided by the environment.
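As a minimal sketch of the JSON-file option, the snippet below writes a user_defined_variables.json file. It assumes the file is a flat JSON object mapping variable names to string values; the helper name write_user_defined_variables, the variable MY_VARIABLE, and the temporary output directory are hypothetical, and the directory the SJM expects the file in is not shown here.

```python
import json
import tempfile
from pathlib import Path


def write_user_defined_variables(directory: Path, variables: dict) -> Path:
    """Write variables as a flat JSON object to user_defined_variables.json.

    Assumption: names and values are plain strings (not confirmed by this doc).
    """
    for name, value in variables.items():
        if not isinstance(name, str) or not isinstance(value, str):
            raise TypeError(f"variable {name!r} must map a string name to a string value")
    target = directory / "user_defined_variables.json"
    target.write_text(json.dumps(variables, indent=2), encoding="utf-8")
    return target


# Hypothetical variable; a script would read it as $env.USER_DEFINED_VARIABLES.MY_VARIABLE.
demo_dir = Path(tempfile.mkdtemp())
path = write_user_defined_variables(demo_dir, {"MY_VARIABLE": "example-value"})
print(path.read_text(encoding="utf-8"))
```

Keep in mind that values supplied through the environment take precedence over the file when both are provided.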

Accessing user-defined environment variables from scripts

To reference a configured user-defined environment variable, use the reserved $env.USER_DEFINED_VARIABLES followed by the name of a given variable with dot notation (for example, $env.USER_DEFINED_VARIABLES.MY_VARIABLE).

Caution

User-defined environment variables are not sanitized from logs. Consider using the secure credentials feature for sensitive information.

Custom node modules

Custom node modules are provided in both CPM and SJM. They allow you to create a customized set of node modules and use them in scripted monitors (scripted API and scripted browser) for synthetic monitoring.

Set up your custom modules directory

Create a directory with a package.json file following npm official guidelines in the root folder. The SJM will install any dependencies listed in the package.json's dependencies field. These dependencies will be available when running monitors on the private synthetics job manager. See an example of this below.

Example

In this example, a custom module directory is used with the following structure:

/example-custom-modules-dir/
├── counter
│ ├── index.js
│ └── package.json
└── package.json ⇦ the only mandatory file

The package.json defines dependencies as both a local module (for example, counter) and any hosted modules (for example, smallest version 1.0.1):

{
  "name": "custom-modules",
  "version": "1.0.0", ⇦ optional
  "description": "example custom modules directory", ⇦ optional
  "dependencies": {
    "smallest": "1.0.1", ⇦ hosted module
    "counter": "file:./counter" ⇦ local module
  }
}

Add your custom modules directory to the SJM for Docker, Podman, or Kubernetes

To check if the modules were installed correctly or if any errors occurred, look for the following lines in the synthetics-job-manager container or pod logs:

2024-06-29 03:51:28,407{UTC} [main] INFO c.n.s.j.p.options.CustomModules - Detected mounted path for custom node modules
2024-06-29 03:51:28,408{UTC} [main] INFO c.n.s.j.p.options.CustomModules - Validating permission for custom node modules package.json file
2024-06-29 03:51:28,409{UTC} [main] INFO c.n.s.j.p.options.CustomModules - Installing custom node modules...
2024-06-29 03:51:44,670{UTC} [main] INFO c.n.s.j.p.options.CustomModules - Custom node modules installed successfully.

Now you can add "require('smallest');" into the script of monitors you send to this private location.

Change package.json for custom modules

In addition to local and hosted modules, you can use Node.js modules as well. To update the custom modules used by your SJM, make changes to the package.json file and restart the SJM. During the restart, the SJM will recognize the configuration change and automatically perform cleanup and re-installation operations to ensure the updated modules are applied.

Caution

Local modules: While your package.json can include any local module, these modules must reside inside the tree under your custom module directory. If stored outside the tree, the initialization process will fail and you will see an error message in the docker logs after launching SJM.
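One way to catch this problem before launching the SJM is a small pre-flight check. The sketch below is not part of the SJM; it assumes local modules are declared with npm's file: prefix (as in the example above), and the function name local_modules_outside_tree and the "outside" dependency are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def local_modules_outside_tree(modules_dir: Path) -> list:
    """Return names of file: dependencies that resolve outside modules_dir."""
    root = modules_dir.resolve()
    package = json.loads((root / "package.json").read_text(encoding="utf-8"))
    offenders = []
    for name, spec in package.get("dependencies", {}).items():
        if not spec.startswith("file:"):
            continue  # hosted module, for example "smallest": "1.0.1"
        target = (root / spec[len("file:"):]).resolve()
        if target != root and root not in target.parents:
            offenders.append(name)
    return offenders


# Build a throwaway directory that mirrors the example layout.
demo = Path(tempfile.mkdtemp()) / "example-custom-modules-dir"
(demo / "counter").mkdir(parents=True)
(demo / "package.json").write_text(json.dumps({
    "name": "custom-modules",
    "dependencies": {
        "smallest": "1.0.1",
        "counter": "file:./counter",
        "outside": "file:../outside",  # hypothetical entry pointing outside the tree
    },
}), encoding="utf-8")

print(local_modules_outside_tree(demo))  # ['outside']
```

An empty list means every local module sits under the custom modules directory.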

Permanent data storage

You may want to use permanent data storage to provide the user_defined_variables.json file or to support custom node modules.

Docker

To set permanent data storage on Docker:

  1. Create a directory on the host where you are launching the Job Manager. This is your source directory.

  2. Launch the Job Manager, mounting the source directory to the target directory /var/lib/newrelic/synthetics.

    Example:

    $ docker run ... -v /sjm-volume:/var/lib/newrelic/synthetics:rw ...

Podman

To set permanent data storage on Podman:

  1. Create a directory on the host where you are launching the Job Manager. This is your source directory.
  2. Launch the Job Manager, mounting the source directory to the target directory /var/lib/newrelic/synthetics.

Example:

$ podman run ... -v /sjm-volume:/var/lib/newrelic/synthetics:rw,z ...

Kubernetes

To set permanent data storage on Kubernetes, the user has two options:

  1. Provide an existing PersistentVolumeClaim (PVC) for an existing PersistentVolume (PV), setting the synthetics.persistence.existingClaimName configuration value. Example:

    $ helm install ... --set synthetics.persistence.existingClaimName=sjm-claim ...
  2. Provide an existing PersistentVolume (PV) name, setting the synthetics.persistence.existingVolumeName configuration value. Helm will generate a PVC for the user. The user may optionally set the following values as well:

  • synthetics.persistence.storageClass: The storage class of the existing PV. If not provided, Kubernetes will use the default storage class.

  • synthetics.persistence.size: The size for the claim. If not set, the default is currently 2Gi.

    $ helm install ... --set synthetics.persistence.existingVolumeName=sjm-volume --set synthetics.persistence.storageClass=standard ...

Sizing considerations for Docker, Podman, Kubernetes, and OpenShift

Docker and Podman

To ensure your private location runs efficiently, you must provision enough CPU resources on your host to handle your monitoring workload. Many factors impact sizing, but you can quickly estimate your needs. You'll need 1 CPU core for each heavyweight monitor (i.e., simple browser, scripted browser, or scripted API monitor). Below are two formulas to help you calculate the number of cores you need, whether you're diagnosing a current setup or planning for a future one.

Formula 1: Diagnosing an Existing Location

If your current private location is struggling to keep up and you suspect jobs are queuing, use this formula to find out how many cores you actually need. It's based on the observable performance of your system.

The equation:

$$C_{req} = (R_{proc} + R_{growth}) \times D_{avg,m}$$

  • $C_{req}$ = Required CPU Cores.
  • $R_{proc}$ = The rate of heavyweight jobs being processed per minute.
  • $R_{growth}$ = The rate your jobManagerHeavyweightJobs queue is growing per minute.
  • $D_{avg,m}$ = The average duration of heavyweight jobs in minutes.

Here's how it works: This formula calculates your true job arrival rate by adding the jobs your system is processing to the jobs that are piling up in the queue. Multiplying this total load by the average job duration tells you exactly how many cores you need to clear all the work without queuing.
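The diagnostic formula can be sketched in a few lines of Python. The input values below are illustrative assumptions, not measurements, and the function name is hypothetical; rounding up to whole cores is a convenience added here.

```python
import math


def required_cores_diagnostic(r_proc: float, r_growth: float, d_avg_min: float) -> float:
    """C_req = (R_proc + R_growth) * D_avg,m"""
    return (r_proc + r_growth) * d_avg_min


# Assumed inputs: 10 jobs/min processed, queue growing by 2 jobs/min,
# average heavyweight job duration of 0.5 minutes.
cores = required_cores_diagnostic(r_proc=10.0, r_growth=2.0, d_avg_min=0.5)
print(cores)             # 6.0
print(math.ceil(cores))  # 6 whole cores to provision
```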

Formula 2: Forecasting a New or Future Location

If you're setting up a new private location or planning to add more monitors, use this formula to forecast your needs ahead of time.

The equation:

$$C_{req} = N_{mon} \times D_{avg,m} \times \frac{1}{P_{avg,m}}$$

  • $C_{req}$ = Required CPU Cores.
  • $N_{mon}$ = The total number of heavyweight monitors you plan to run.
  • $D_{avg,m}$ = The average duration of a heavyweight job in minutes.
  • $P_{avg,m}$ = The average period for heavyweight monitors in minutes (e.g., a monitor that runs every 5 minutes has $P_{avg,m} = 5$).

Here's how it works: This calculates your expected workload from first principles: how many monitors you have, how often they run, and how long they take.
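A minimal sketch of the forecasting formula, using assumed numbers for illustration (the function name is hypothetical):

```python
def required_cores_forecast(n_mon: float, d_avg_min: float, p_avg_min: float) -> float:
    """C_req = N_mon * D_avg,m * (1 / P_avg,m)"""
    if p_avg_min <= 0:
        raise ValueError("average monitor period must be positive")
    return n_mon * d_avg_min / p_avg_min


# Assumed inputs: 100 heavyweight monitors, each job averaging 0.5 minutes,
# running on an average period of 5 minutes.
print(required_cores_forecast(n_mon=100, d_avg_min=0.5, p_avg_min=5))  # 10.0
```

Halving the period (running monitors twice as often) doubles the result, which is why the average period matters as much as the monitor count.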

Important sizing factors

When using these formulas, remember to account for these factors:

  • Job duration ($D_{avg,m}$): Your average should include jobs that time out (often ~3 minutes), as these hold a core for their entire duration.
  • Job failures and retries: When a monitor fails, it's automatically retried. These retries are additional jobs that add to the total load. A monitor that consistently fails and retries effectively multiplies its job rate, as if it ran on a shorter period, significantly impacting throughput.
  • Scaling out: In addition to adding more cores to a host (scaling up), you can deploy additional synthetics job managers with the same private location key to load balance jobs across multiple environments (scaling out).

It's important to note that a single Synthetics Job Manager (SJM) has a throughput limit of approximately 15 heavyweight jobs per minute. This is due to an internal threading strategy that favors the efficient competition of jobs across multiple SJMs over the raw number of jobs processed per SJM. If your calculations indicate a need for higher throughput, you must scale out by deploying additional SJMs. You can check if your job queue is growing to determine if more SJMs are needed.

Adding more SJMs with the same private location key provides several advantages:

  • Load balancing: Jobs for the private location are distributed across all available SJMs.
  • Failover protection: If one SJM instance goes down, others can continue processing jobs.
  • Higher total throughput: The total throughput for your private location becomes the sum of the throughput from each SJM (e.g., two SJMs provide up to ~30 jobs/minute).
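To estimate how many SJMs a private location needs, you can divide the heavyweight job arrival rate by the approximate per-SJM limit. This is a rough sketch: the ~15 jobs/minute figure comes from the text above, while the function name and the example rates are assumptions.

```python
import math

SJM_JOBS_PER_MINUTE = 15  # approximate per-SJM throughput limit stated above


def sjm_instances_needed(jobs_per_minute: float) -> int:
    """Smallest number of SJMs whose combined throughput covers the job rate."""
    return max(1, math.ceil(jobs_per_minute / SJM_JOBS_PER_MINUTE))


print(sjm_instances_needed(12))  # 1
print(sjm_instances_needed(30))  # 2
print(sjm_instances_needed(40))  # 3
```

Because the limit is approximate, treat the result as a lower bound and confirm by watching whether the queue keeps growing.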

NRQL queries for diagnosis

You can run these queries in the query builder to get the inputs for the diagnostic formula. Make sure to set the time range to a long enough period to get a stable average.

1. Find the rate of jobs processed per minute ($R_{proc}$): This query counts the number of non-ping (heavyweight) jobs completed over the last day and shows the average rate per minute.

FROM SyntheticCheck SELECT rate(uniqueCount(id), 1 minute) AS 'job rate per minute' WHERE location = 'YOUR_PRIVATE_LOCATION' AND type != 'SIMPLE' SINCE 1 day ago

2. Find the rate of queue growth per minute ($R_{growth}$): This query calculates the average per-minute growth of the jobManagerHeavyweightJobs queue on a time series chart. A line above zero indicates the queue is growing, while a line below zero means it's shrinking.

FROM SyntheticsPrivateLocationStatus SELECT derivative(jobManagerHeavyweightJobs, 1 minute) AS 'queue growth rate per minute' WHERE name = 'YOUR_PRIVATE_LOCATION' TIMESERIES SINCE 1 day ago

Tip

Make sure to select the account where the private location exists. It's best to view this query as a time series because the derivative function can vary wildly. The goal is to get an estimate of the rate of queue growth per minute. Play with different time ranges to see what works best.

3. Find the total number of heavyweight monitors ($N_{mon}$): This query finds the unique count of heavyweight monitors.

FROM SyntheticCheck SELECT uniqueCount(monitorId) AS 'monitor count' WHERE location = 'YOUR_PRIVATE_LOCATION' AND type != 'SIMPLE' SINCE 1 day ago

4. Find the average job duration in minutes ($D_{avg,m}$): This query finds the average execution duration of completed non-ping jobs and converts the result from milliseconds to minutes. executionDuration represents the time the job took to execute on the host.

FROM SyntheticCheck SELECT average(executionDuration)/60e3 AS 'avg job duration (m)' WHERE location = 'YOUR_PRIVATE_LOCATION' AND type != 'SIMPLE' SINCE 1 day ago

5. Find the average heavyweight monitor period ($P_{avg,m}$): If the private location's jobManagerHeavyweightJobs queue is growing, the average monitor period can't be calculated accurately from existing results. Estimate it instead from the list of monitors on the Synthetic Monitors page. Make sure to select the correct New Relic account; you may also need to filter by privateLocation.

Tip

Synthetic monitors may exist in multiple sub accounts. If you have more sub accounts than can be selected in the query builder, choose the accounts with the most monitors.

Note about ping monitors and the pingJobs queue

Ping monitors are different. They are lightweight jobs that do not consume a full CPU core each. Instead, they use a separate queue (pingJobs) and run on a pool of worker threads.

While they are less resource-intensive, a high volume of ping jobs, especially failing ones, can still cause performance issues. Keep these points in mind:

  • Resource model: Ping jobs utilize worker threads, not dedicated CPU cores. The core-per-job calculation does not apply to them.
  • Timeout and retry: A failing ping job can occupy a worker thread for up to 60 seconds. It first attempts an HTTP HEAD request (30-second timeout). If that fails, it immediately retries with an HTTP GET request (another 30-second timeout).
  • Scaling: Although the sizing formula is different, the same principles apply. To handle a large volume of ping jobs and keep the pingJobs queue from growing, you may need to scale up and/or scale out. Scaling up means increasing CPU and memory resources per host or namespace. Scaling out means adding more instances of the ping runtime. This can be done by deploying more job managers on more hosts, in more namespaces, or even within the same namespace. Alternatively, the ping-runtime in Kubernetes allows you to set a larger number of replicas per deployment.

Kubernetes and OpenShift

Each runtime used by the Kubernetes and OpenShift synthetics job manager can be sized independently by setting values in the Helm chart. The node-api-runtime and node-browser-runtime are sized independently using a combination of the parallelism and completions settings.

A key consideration when sizing your runtimes is that a single SJM instance has a maximum throughput of approximately 15 heavyweight jobs per minute (scripted API and browser monitors). This is due to an internal threading strategy that favors the efficient competition of jobs across multiple SJMs over the raw number of jobs processed per SJM.

You can use your average job duration to calculate the maximum effective parallelism for a single SJM before hitting this throughput ceiling:

$$Parallelism_{max} \approx 15 \times D_{avg,m}$$

Where $D_{avg,m}$ is the average heavyweight job duration in minutes.
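As a worked illustration with an assumed average job duration of 30 seconds ($D_{avg,m} = 0.5$):

```latex
Parallelism_{max} \approx 15 \times D_{avg,m} = 15 \times 0.5 = 7.5
```

Under that assumption, a parallelism beyond roughly 7 or 8 on a single SJM would likely not add throughput, because the ~15 jobs/minute ceiling is reached first.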

If your monitoring needs exceed this ~15 jobs/minute limit, you must scale out by deploying multiple SJM instances. You can check if your job queue is growing to see if more instances are needed.

The parallelism setting controls how many pods of a particular runtime run concurrently, and it is the equivalent of the HEAVYWEIGHT_WORKERS environment variable in the Docker and Podman SJM. The completions setting controls how many pods of a particular runtime must complete before the CronJob can start another Kubernetes Job for that runtime. For improved efficiency, completions should be set to 6-10x the parallelism value.

The following equations can be used as a starting point for completions and parallelism for each runtime.

$$Completions = \frac{300}{D_{avg,s}}$$

Where $D_{avg,s}$ is the average job duration in seconds.

$$Parallelism = \frac{N_m}{Completions}$$

Where $N_m$ is the number of synthetic jobs you need to run every 5 minutes.
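The two starting-point equations can be sketched together as follows. The input values are assumptions for illustration, the function name is hypothetical, and rounding up to whole numbers is a choice made here because both settings must be integers.

```python
import math


def starting_point(avg_duration_s: float, jobs_per_5_min: float) -> tuple:
    """Return (completions, parallelism) from the two equations above."""
    completions = math.ceil(300 / avg_duration_s)           # Completions = 300 / D_avg,s
    parallelism = math.ceil(jobs_per_5_min / completions)   # Parallelism = N_m / Completions
    return completions, parallelism


# Assumed inputs: 15-second average job duration, 60 jobs every 5 minutes.
print(starting_point(avg_duration_s=15, jobs_per_5_min=60))  # (20, 3)
```

In this example completions is about 6.7 times parallelism, which falls within the 6-10x range suggested above.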

The following queries can be used to obtain average duration and rate for a private location.

-- non-ping average job duration in seconds by runtime type (duration is reported in milliseconds)
FROM SyntheticCheck SELECT average(duration)/1e3 AS 'avg job duration (s)'
WHERE type != 'SIMPLE' AND location = 'YOUR_PRIVATE_LOCATION' FACET typeLabel SINCE 1 hour ago
-- non-ping jobs per 5 minutes by runtime type
FROM SyntheticCheck SELECT rate(uniqueCount(id), 5 minutes) AS 'jobs per 5 minutes'
WHERE type != 'SIMPLE' AND location = 'YOUR_PRIVATE_LOCATION' FACET typeLabel SINCE 1 hour ago

Tip

The above queries are based on current results. If your private location does not have any results or the job manager is not performing at its best, query results may not be accurate. In that case, try a few different values for completions and parallelism until you see a kubectl get jobs -n YOUR_NAMESPACE duration of at least 5 minutes (enough completions) and the queue is not growing (enough parallelism).

| Example | Description |
| --- | --- |
| parallelism=1, completions=1 | The runtime will execute 1 synthetics job per minute. After 1 job completes, the CronJob configuration will start a new job at the next minute. Throughput will be extremely limited with this configuration. |
| parallelism=1, completions=6 | The runtime will execute 1 synthetics job at a time. After the job completes, a new job will start immediately. After the completions setting number of jobs completes, the CronJob configuration will start a new Kubernetes Job and reset the completions counter. Throughput will be limited, but slightly better. A single long running synthetics job will block the processing of any other synthetics jobs of this type. |
| parallelism=3, completions=24 | The runtime will execute 3 synthetics jobs at once. After any of these jobs complete, a new job will start immediately. After the completions setting number of jobs completes, the CronJob configuration will start a new Kubernetes Job and reset the completions counter. Throughput is much better with this or similar configurations. A single long running synthetics job will have limited impact to the processing of other synthetics jobs of this type. |

If synthetics jobs take longer to complete, fewer completions are needed to fill 5 minutes with jobs but more parallel pods will be needed. Similarly, if more synthetics jobs need to be processed per minute, more parallel pods will be needed. The parallelism setting directly affects how many synthetics jobs per minute can be run. Too small a value and the queue may grow. Too large a value and nodes may become resource constrained.

If your parallelism setting is working well to keep the queue at zero, setting a higher value for completions than what is calculated from 300 / avg job duration can help to improve efficiency in a couple of ways:

  • Accommodate variability in job durations such that at least 1 minute is filled with synthetics jobs, which is the minimum CronJob duration.
  • Reduce the number of completions cycles to minimize the "nearing the end of completions" inefficiency where the next set of completions can't start until the final job completes.

It's important to note that the completions value should not be too large or the CronJob will experience warning events like the following:

8m40s Warning TooManyMissedTimes cronjob/synthetics-node-browser-runtime too many missed start times: 101. Set or decrease .spec.startingDeadlineSeconds or check clock skew

Tip

New Relic is not liable for any modifications you make to the synthetics job manager files.

Scaling out with multiple SJM instances

To achieve higher total throughput, you can install multiple SJM Helm releases in the same Kubernetes namespace. Each SJM will compete for jobs from the same private location, providing load balancing, failover protection, and an increased total job throughput.

When installing multiple SJM releases, you must provide a unique name for each release. All instances should be configured with the same private location key in their values.yaml file. While not required, setting the fullnameOverride is recommended to create shorter, more manageable resource names.

For example, to install two SJMs named sjm-alpha and sjm-beta into the newrelic namespace:

$ helm upgrade --install sjm-alpha -n newrelic newrelic/synthetics-job-manager -f values.yaml --set fullnameOverride=sjm-alpha --create-namespace
$ helm upgrade --install sjm-beta -n newrelic newrelic/synthetics-job-manager -f values.yaml --set fullnameOverride=sjm-beta

You can continue this pattern for as many SJMs as needed to keep the job queue from growing. For each SJM, set parallelism and completions to a reasonable value based on your average job duration and the ~15 jobs per minute limit per instance.

Copyright © 2025 New Relic Inc.
