Integrating New Relic alerts into VuGen via the NerdGraph API

January 9, 2023

First, let’s make the main building blocks clear.

  • Instrumented/monitored application: a simple Node.js application that uses the Express web framework to expose a REST interface and produce some observable errors.
  • New Relic: the tool and agent to monitor the instrumented application and set up alerts.
  • VuGen: a performance testing tool that executes a replayable action script to fetch the New Relic alerts.

How will it work?

  • As usual, the New Relic agent will send MELT data (metrics, events, logs, and traces) from the application to the New Relic account identified by the assigned license key.
  • The application will expose a REST service that produces internal server errors: more precisely, a simulated database connection error returned with HTTP status 500.
  • We set up an alert policy with a condition that defines a threshold for the number of server errors.
  • When the condition is violated, a new issue is created.
  • The VuGen script queries the issues via the New Relic NerdGraph API. In general, you can also configure push notifications or REST hooks to be notified about issues and incidents; in this case, the alerts are pulled instead of pushed.

Let’s implement our Node.js application

First of all, we need an application that can be monitored and that produces some errors.

This simple code snippet exposes a REST service at the URL http://localhost:8080/fetchCategories and returns an HTTP 500 Internal Server Error with the error message: Could not fetch categories due to a database connection issue


const express = require('express');
const app = express();

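// Simulate a failing database call: Express answers an uncaught error with HTTP 500.
app.get('/fetchCategories', function (req, res) {
  throw new Error('Could not fetch categories due to a database connection issue');
});

// Listen on the port from the PORT environment variable, falling back to 8080.
const port = process.env.PORT || 8080;
app.listen(port, () => { console.log(`Listening on port ${port}`); });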

To reproduce an error, please execute the curl command below:

curl http://localhost:8080/fetchCategories

Get started with New Relic and keys

Now that we have the application, let’s instrument it with the New Relic agent.

First, let’s sign in to our New Relic account. If you don't have an account yet, you can sign up here.

Open up https://one.eu.newrelic.com/marketplace/install-data-source, then select the Node.js type and the Package Manager instrumentation method.

Afterward, please configure the application name and follow the instructions to set up monitoring.

As a best practice, generate a new license key immediately: it can be rotated later, whereas the initial license key cannot be deleted or changed. Now we can start the application by executing:

node -r newrelic app.js

Add an alert policy to observe server errors

Let’s produce an error by executing the curl command above and verify that monitoring works. If it does, the error should show up in the APM/Error view.

Let’s open up the Alert conditions (Policies) view on the Alerts & AI page and click the New alert policy button to create a collection of conditions.

We will call the policy Node.js alert policy and select the One issue per condition issue creation preference, which means that at most one issue will be open at a time for each condition in our policy.

Now we define the alert condition that creates alerts when server errors occur. Click Create an alert condition and define the name and the query (in NRQL), which represents the number of HTTP 500 server errors. Judging by the deepLinkUrl in the API response shown later in this article, the query is along the lines of SELECT count(*) FROM Transaction WHERE `http.statusCode`='500' FACET `http.statusCode`.

In the next step, we define the threshold of the condition. The time window of the query will be 1 minute long, and if more than 2 errors are produced within it, an alert will be triggered; in other words, an issue will be created. The issue will be closed automatically if the error does not reoccur for 10 minutes (signal lost). Please note that in production, longer windows with less strict and more sophisticated conditions (such as error rate) are advisable.
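With this threshold, the endpoint has to fail at least three times within one minute before an issue is opened. The following is a minimal sketch of a helper that does so from Node.js instead of repeating the curl command by hand. The function name `triggerErrors` and the injectable `fetchFn` parameter are assumptions made for illustration; the global `fetch` it defaults to is available in Node.js 18 and later.

```javascript
// Hypothetical helper (not part of the original setup).
// Calls the failing endpoint `count` times and returns the HTTP status codes.
// `fetchFn` defaults to the global fetch of Node.js 18+; it is injectable so
// the helper can be exercised without a running server.
async function triggerErrors(url, count, fetchFn = fetch) {
  const statuses = [];
  for (let i = 0; i < count; i++) {
    // fetch resolves normally for HTTP 500; it only rejects on network errors.
    const response = await fetchFn(url);
    statuses.push(response.status);
  }
  return statuses;
}

// Example (requires the sample application to be running):
// triggerErrors('http://localhost:8080/fetchCategories', 3).then(console.log);
```

The requests are sent one after another, which keeps all of them well inside the one-minute window for a local application.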

The Fine-tune advanced signal settings are not relevant for this guide. The only recommendation is to change the Streaming method to Event timer, since the events are not sent to New Relic continuously.

Query alerts via NerdGraph API

The NerdGraph API is built on GraphQL and provides a flexible way to query data or to change New Relic configuration (via mutations).

To fetch the alerts, we need a GraphQL query that lists the requested fields (formatted here for readability; the curl command further below sends it on a single line):


{"query":"{
  actor {
    account(id: [--- REPLACE IT WITH YOUR ACCOUNT ID ---]) {
      aiIssues {
        issues(timeWindow: {startTime: 1672843028110, endTime: 1672908823000}) {
          issues {
            issueId
            priority
            state
            title
            createdAt
            isIdle
            totalIncidents
            incidentIds
            conditionName
            activatedAt
            conditionFamilyId
            deepLinkUrl
            description
            entityGuids
            entityNames
            isCorrelated
            policyIds
            policyName
            updatedAt
          }
        }
      }
    }
  }
}
", "variables":""}
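Writing this request body by hand is error-prone because the GraphQL query has to be embedded in a JSON string. As a minimal sketch, the body can be assembled in JavaScript and serialized with `JSON.stringify`. The names `ISSUE_FIELDS` and `buildIssuesRequestBody` are assumptions made for illustration; the fields and the empty `variables` entry mirror the query above.

```javascript
// Fields requested for each issue, as listed in the query above.
const ISSUE_FIELDS = [
  'issueId', 'priority', 'state', 'title', 'createdAt', 'isIdle',
  'totalIncidents', 'incidentIds', 'conditionName', 'activatedAt',
  'conditionFamilyId', 'deepLinkUrl', 'description', 'entityGuids',
  'entityNames', 'isCorrelated', 'policyIds', 'policyName', 'updatedAt',
];

// Hypothetical helper: builds the JSON request body for the aiIssues query.
// accountId is the numeric account ID; startTime and endTime are epoch milliseconds.
function buildIssuesRequestBody(accountId, startTime, endTime) {
  // Only integers are interpolated, so nothing unexpected can end up in the query.
  if (![accountId, startTime, endTime].every(Number.isInteger)) {
    throw new TypeError('accountId, startTime and endTime must be integers');
  }
  const query =
    `{ actor { account(id: ${accountId}) { aiIssues { ` +
    `issues(timeWindow: {startTime: ${startTime}, endTime: ${endTime}}) { ` +
    `issues { ${ISSUE_FIELDS.join(' ')} } } } } } }`;
  // JSON.stringify takes care of escaping, so the result is always valid JSON.
  return JSON.stringify({ query, variables: '' });
}
```

For example, `buildIssuesRequestBody(accountId, Date.now() - 10 * 60 * 1000, Date.now())` would cover the last 10 minutes.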

The curl command below sends this query. Access goes through an API gateway using an API key, and the user is authorized for the given account ID (see Introduction to New Relic NerdGraph, our GraphQL API).

It’s highly recommended to create a user key instead of using a license key: license keys are for ingesting data into New Relic, while user keys are for running queries and configuring the account. For more information, please see New Relic API keys. To create a new key, click the Create a key button.

The newly created API key can be placed in the curl command.

The account ID is the unique identifier of your account and can be found in multiple places, e.g. in the API key creation view.


curl https://api.eu.newrelic.com/graphql \
  -H 'Content-Type: application/json' \
  -H 'API-Key: [---REDACTED---]' \
  --data-binary '{"query":"{ actor { account(id: [--- REPLACE IT WITH YOUR ACCOUNT ID ---]) { aiIssues { issues(timeWindow: {startTime: 1672843028110, endTime: 1672908823000}) { issues { issueId priority state title createdAt isIdle totalIncidents incidentIds conditionName activatedAt conditionFamilyId deepLinkUrl description entityGuids entityNames isCorrelated policyIds policyName updatedAt } } } } } } ", "variables":""}'
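The same request can also be sent from Node.js. The following is a minimal sketch under a few assumptions: the function name `fetchIssues` and the injectable `fetchFn` parameter are made up for illustration, the global `fetch` requires Node.js 18 or later, and the request body is expected to be the same JSON string that is passed to curl above.

```javascript
// EU endpoint, as used in the curl command above.
const NERDGRAPH_URL = 'https://api.eu.newrelic.com/graphql';

// Hypothetical helper: posts a NerdGraph request body and returns the issue list.
// apiKey is a New Relic user key; requestBody is the JSON string sent by curl above.
async function fetchIssues(apiKey, requestBody, fetchFn = fetch) {
  const response = await fetchFn(NERDGRAPH_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'API-Key': apiKey },
    body: requestBody,
  });
  if (!response.ok) {
    throw new Error(`NerdGraph request failed with HTTP ${response.status}`);
  }
  const payload = await response.json();
  // GraphQL reports query-level problems in an "errors" array, even with HTTP 200.
  if (payload.errors) {
    throw new Error(`NerdGraph returned errors: ${JSON.stringify(payload.errors)}`);
  }
  return payload.data.actor.account.aiIssues.issues.issues;
}
```

Reading the key from an environment variable rather than hard-coding it in the script keeps it out of version control.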

If we produce some errors and an alert/issue has been created in New Relic, the response to the query will contain the requested fields of the issues.


{
  "data": {
    "actor": {
      "account": {
        "aiIssues": {
          "issues": {
            "issues": [
              {
                "activatedAt": "1672907641562",
                "conditionFamilyId": [
                  1968728
                ],
                "conditionName": [
                  "Number of server errors"
                ],
                "createdAt": "1672907641561",
                "deepLinkUrl": [
                  "https://insights.eu.newrelic.com/accounts/3752132/query?query=SELECT+count%28*%29+FROM+Transaction+WHERE+%60http.statusCode%60%3D%27500%27++FACET+%60http.statusCode%60+TIMESERIES+1+MINUTE+SINCE+%272023-01-05+02%3A35%3A00%27+UNTIL+%272023-01-05+08%3A34%3A00%27"
                ],
                "description": [
                  "Policy: 'Node.js alert policy'. Condition: 'Number of server errors'"
                ],
                "entityGuids": [
                  "Mzc1MjEzMnxBUE18QVBQTElDQVRJT058NDQ5NDA0ODgx"
                ],
                "entityNames": [
                  "sample-node-js-app"
                ],
                "incidentIds": [
                  "efb0622f-9b33-4152-bb07-2e8ac64be6d1"
                ],
                "isCorrelated": false,
                "isIdle": false,
                "issueId": "df54dbf1-d3d6-47da-98f4-1754071e4eb5",
                "policyIds": [
                  874059
                ],
                "policyName": [
                  "Node.js alert policy"
                ],
                "priority": "CRITICAL",
                "state": "ACTIVATED",
                "title": [
                  "sample-node-js-app query result is > 2.0 for 1 minutes on 'Number of server errors'"
                ],
                "totalIncidents": 1,
                "updatedAt": "1672907641562"
              }
            ]
          }
        }
      }
    }
  }
}
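Two details of this response are easy to miss: several fields (title, policyName, conditionName, and others) are arrays even when they hold a single value, and the timestamps are epoch milliseconds encoded as strings. The following is a minimal sketch of how such a response could be condensed in JavaScript; the function name `summarizeIssues` and the choice of fields are assumptions made for illustration.

```javascript
// Hypothetical helper: condenses a parsed NerdGraph aiIssues response
// into one flat summary object per issue.
function summarizeIssues(response) {
  // Fall back to an empty list if any level of the response is missing.
  const issues = response?.data?.actor?.account?.aiIssues?.issues?.issues ?? [];
  return issues.map((issue) => ({
    issueId: issue.issueId,
    priority: issue.priority,
    state: issue.state,
    // title and policyName are arrays in the response; take the first entry.
    title: issue.title[0],
    policyName: issue.policyName[0],
    // Timestamps arrive as strings holding epoch milliseconds.
    activatedAt: new Date(Number(issue.activatedAt)),
  }));
}
```

A check such as `summarizeIssues(response).some((issue) => issue.state === 'ACTIVATED')` could then tell whether any issue is currently open.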

Let’s see the alerts in VuGen

To create replayable scripts, please install Virtual User Generator (VuGen) from the AppDelivery Marketplace (Windows only) first, then click New Script and Solution in the File menu.

In the next view please select the Web - HTTP/HTML type.

You will see an Action script in C, which has to be changed to JavaScript in the Recording options.

Click OK, then start and stop the recording. Insert the JavaScript implementation below into the newly created script to obtain the alerts.


function Action() {

    const now = Date.now()
    const duration = 10 * 60 * 1000 //10 minutes in milliseconds
    const startTime = now - duration


    const query = '{"query":"{ actor { account(id: 3752132) { aiIssues { issues(timeWindow: {startTime:  ' + startTime
        + ', endTime: ' + now + '}) { issues { issueId priority state title createdAt isIdle totalIncidents incidentIds conditionName activatedAt conditionFamilyId deepLinkUrl description entityGuids entityNames isCorrelated policyIds policyName updatedAt } } } } } } ", "variables":""}'

    web.setSocketsOption('SSL_VERSION', 'AUTO');
    
    lr.startTransaction('FetchNewRelicAlerts')

    web.rest(
        {
            name: 'New Relic alerts',
            url: 'https://api.eu.newrelic.com/graphql',
            method: 'POST',
            enctype: 'raw',
            snapshot: 't809911.inf',
            body: query,
            headers:
                [
                    { name: 'API-Key', value: '[---REDACTED---]' },
                    { name: 'Content-Type', value: 'application/json' }
                ]
        }
    );

    lr.endTransaction('FetchNewRelicAlerts', lr.AUTO)

    return 0;
}

To execute the script, please click the Replay button. The screenshot below shows a successful execution of our action script.

Please see the verbose logs of the script execution.


Virtual User Script started at: 1/5/2023 11:28:47 AM
Starting action vuser_init.
Web Turbo Replay of LoadRunner 2022.0.0 for Windows 11; build 605 (Mar 08 2022 19:29:30)  	[MsgId: MMSG-26983]
Run mode: HTML  	[MsgId: MMSG-26993]
Replay user agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36 Edg/108.0.1462.54  	[MsgId: MMSG-26988]
Runtime Settings file: "\\Mac\Home\Documents\VuGen\Scripts\NewRelic\\default.cfg"  	[MsgId: MMSG-27141]
Vuser directory: "\\Mac\Home\Documents\VuGen\Scripts\NewRelic"  	[MsgId: MMSG-27052]
Vuser output directory: "\\Mac\Home\Documents\VuGen\Scripts\NewRelic"  	[MsgId: MMSG-27050]
Operating system's current working directory: "\\Mac\Home\Documents\VuGen\Scripts\NewRelic"  	[MsgId: MMSG-27048]
UTC (GMT) start date/time   : 2023-01-05 10:28:47  	[MsgId: MMSG-26992]
LOCAL start date/time       : 2023-01-05 11:28:47  	[MsgId: MMSG-26991]
Local daylight-Savings-Time : No  	[MsgId: MMSG-26990]
Some of the Runtime Settings:  	[MsgId: MMSG-27142]
    Download non-HTML resources: Yes  	[MsgId: MMSG-27018]
    Verification checks: No  	[MsgId: MMSG-27017]
    Convert from/to UTF-8: No  	[MsgId: MMSG-27016]
    Simulate a new user each iteration: Yes  	[MsgId: MMSG-27009]
    Non-critical item errors as warnings: Yes  	[MsgId: MMSG-27008]
    HTTP errors as warnings: No  	[MsgId: MMSG-27007]
    WinInet replay instead of Sockets: No  	[MsgId: MMSG-27006]
    HTTP version: 1.1  	[MsgId: MMSG-27005]
    Keep-Alive HTTP connections: Yes  	[MsgId: MMSG-27004]
    Max self Meta refresh updates: 2  	[MsgId: MMSG-27003]
    No proxy is used (direct connection to the Internet)  	[MsgId: MMSG-27171]
    DNS caching: Yes  	[MsgId: MMSG-27035]
    Simulate browser cache: Yes  	[MsgId: MMSG-27034]
        Cache URLs requiring content (e.g., HTMLs): Yes  	[MsgId: MMSG-27033]
            Additional URLs requiring content: None  	[MsgId: MMSG-27032]
        Check for newer versions every visit to the page: No  	[MsgId: MMSG-27031]
    Page download timeout (sec): 120  	[MsgId: MMSG-27030]
    Resource Page Timeout is a Warning: No  	[MsgId: MMSG-27029]
    ContentCheck enabled: Yes  	[MsgId: MMSG-27028]
    ContentCheck script-level file: "\\Mac\Home\Documents\VuGen\Scripts\NewRelic\LrwiAedScript.xml"  	[MsgId: MMSG-27027]
    Enable Web Page Breakdown: Yes  	[MsgId: MMSG-27026]
    Enable connection data points: Yes  	[MsgId: MMSG-27023]
    Process socket after reschedule: Yes  	[MsgId: MMSG-27022]
    Snapshot on error: No  	[MsgId: MMSG-27021]
    Define each step as a transaction: No  	[MsgId: MMSG-27020]
    Read beyond Content-Length: No  	[MsgId: MMSG-26994]
    Parse HTML Content-Type: TEXT  	[MsgId: MMSG-26999]
    Graph hits per second and HTTP status codes: Yes  	[MsgId: MMSG-26998]
    Graph response bytes per second: Yes  	[MsgId: MMSG-26997]
    Graph pages per second: No  	[MsgId: MMSG-26996]
    Web recorder version ID: 10  	[MsgId: MMSG-26995]
Ending action vuser_init.
Running Vuser...
Starting iteration 1.
Maximum number of concurrent connections per server: 6  	[MsgId: MMSG-26989]
Starting action Action.


Action.js(18): web.setSocketsOption started  	[MsgId: MMSG-26355]
Action.js(18): web.setSocketsOption was successful  	[MsgId: MMSG-26392]
Action.js(20): Notify: Transaction "FetchNewRelicAlerts" started.
Action.js(22): web.rest("New Relic alerts") started  	[MsgId: MMSG-26355]
Action.js(22): Warning: The string 'startTime:  1672913927684, endTime: 1672914527684' with parameter delimiters is not a parameter.
Action.js(22): Warning: The string ' issueId priority state title createdAt isIdle totalIncidents incidentIds conditionName activatedAt conditionFamilyId deepLinkUrl description entityGuids entityNames isCorrelated policyIds policyName updatedAt ' with parameter delimiters is not a parameter.
Action.js(22): Notify: ****************   web_add_header is called internally from web_rest. The following messages are from web_add_header   *****************
Action.js(22): web.addHeader("API-Key") started  	[MsgId: MMSG-26355]
Action.js(22): An unrecognized header ("API-Key") is being added  	[MsgId: MMSG-26595]
Action.js(22): "API-Key: [---REDACTED---]" header registered for adding to requests from the immediately following Action function  	[MsgId: MMSG-26506]
Action.js(22): web.addHeader("API-Key") was successful  	[MsgId: MMSG-26392]
Action.js(22): web.addHeader("Content-Type") started  	[MsgId: MMSG-26355]
Action.js(22): Warning -26594: The header being added may cause unpredictable results if applied to ALL the URLs generated on behalf of the next script function. It will apply to the primary URL only.  	[MsgId: MWAR-26594]
Action.js(22): "Content-Type: application/json" header registered for adding to requests from the immediately following Action function  	[MsgId: MMSG-26506]
Action.js(22): web.addHeader("Content-Type") highest severity level was "warning"  	[MsgId: MMSG-26391]
Action.js(22): Notify: ****************   End of messages from web_add_header   *****************
Action.js(22): t=12603ms: Connecting [0] to host 185.221.86.9:443  	[MsgId: MMSG-26000]
Action.js(22): t=12660ms: Connected socket [0] from 10.211.55.3:49918 to 185.221.86.9:443 in 54 ms  	[MsgId: MMSG-26000]
Action.js(22): t=12663ms: Trying to set SNI with servername api.eu.newrelic.com  	[MsgId: MMSG-26000]
Action.js(22): t=12665ms: Setting SNI was succesful  	[MsgId: MMSG-26000]
Action.js(22): t=12751ms: 398-byte request headers for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22):     POST /graphql HTTP/1.1\r\n
Action.js(22):     Content-Type: application/json\r\n
Action.js(22):     API-Key: [---REDACTED---]\r\n
Action.js(22):     User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Geck
Action.js(22):     o) Chrome/108.0.0.0 Safari/537.36 Edg/108.0.1462.54\r\n
Action.js(22):     Accept-Encoding: gzip, deflate, br\r\n
Action.js(22):     Accept-Language: en-US,en;q=0.9\r\n
Action.js(22):     Accept: */*\r\n
Action.js(22):     Connection: Keep-Alive\r\n
Action.js(22):     Host: api.eu.newrelic.com\r\n
Action.js(22):     Content-Length: 376\r\n
Action.js(22):     \r\n
Action.js(22): t=12764ms: 376-byte request body for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22):     {"query":"{ actor { account(id: 3752132) { aiIssues { issues(timeWindow: {startTime:  1672
Action.js(22):     913927684, endTime: 1672914527684}) { issues { issueId priority state title createdAt isId
Action.js(22):     le totalIncidents incidentIds conditionName activatedAt conditionFamilyId deepLinkUrl desc
Action.js(22):     ription entityGuids entityNames isCorrelated policyIds policyName updatedAt } } } } } } ",
Action.js(22):      "variables":""}
Action.js(22): t=13192ms: 419-byte response headers for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22):     HTTP/1.1 200 OK\r\n
Action.js(22):     Proxied-By: Service Gateway\r\n
Action.js(22):     Strict-Transport-Security: max-age=31536000; includeSubDomains\r\n
Action.js(22):     Content-Security-Policy: frame-ancestors *.newrelic.com\r\n
Action.js(22):     Cache-Control: max-age=0, private, must-revalidate\r\n
Action.js(22):     Content-Type: application/json; charset=utf-8\r\n
Action.js(22):     Date: Thu, 05 Jan 2023 10:28:47 GMT\r\n
Action.js(22):     Served-By: nerd-graph\r\n
Action.js(22):     Server: Cowboy\r\n
Action.js(22):     Vary: accept-encoding\r\n
Action.js(22):     content-encoding: gzip\r\n
Action.js(22):     transfer-encoding: chunked\r\n
Action.js(22):     \r\n
Action.js(22): t=13214ms: 5-byte chunked response overhead for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22):     277\r\n
Action.js(22): t=13216ms: 2-byte chunked response overhead for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22):     \r\n
Action.js(22): t=13218ms: 631-byte ENCODED response body received for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22): t=13220ms: 1050-byte DECODED response body for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22):     {"data":{"actor":{"account":{"aiIssues":{"issues":{"issues":[{"activatedAt":"1672914498096
Action.js(22):     ","conditionFamilyId":[1968728],"conditionName":["Number of server errors"],"createdAt":"1
Action.js(22):     672914498095","deepLinkUrl":["https://insights.eu.newrelic.com/accounts/3752132/query?quer
Action.js(22):     y=SELECT+count%28*%29+FROM+Transaction+WHERE+%60http.statusCode%60%3D%27500%27++FACET+%60h
Action.js(22):     ttp.statusCode%60+TIMESERIES+1+MINUTE+SINCE+%272023-01-05+04%3A29%3A17%27+UNTIL+%272023-01
Action.js(22):     -05+10%3A28%3A17%27"],"description":["Policy: 'Node.js alert policy'. Condition: 'Number o
Action.js(22):     f server errors'"],"entityGuids":["Mzc1MjEzMnxBUE18QVBQTElDQVRJT058NDQ5NDA0ODgx"],"entityN
Action.js(22):     ames":["sample-node-js-app"],"incidentIds":["923022a5-969e-459d-99c8-83f17e3f25a2"],"isCor
Action.js(22):     related":false,"isIdle":false,"issueId":"e2760211-6295-4b16-9739-d02a43c6cdf8","policyIds"
Action.js(22):     :[874059],"policyName":["Node.js alert policy"],"priority":"CRITICAL","state":"ACTIVATED",
Action.js(22):     "title":["sample-node-js-app query result is > 2.0 for 1 minutes on 'Number of server erro
Action.js(22):     rs'"],"totalIncidents":1,"updatedAt":"1672914498096"}]}}}}}}
Action.js(22): t=13232ms: 3-byte chunked response overhead for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22):     a\r\n
Action.js(22): t=13234ms: 7-byte chunked response overhead for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22):     \r\n
Action.js(22):     0\r\n
Action.js(22):     \r\n
Action.js(22): t=13237ms: 10-byte ENCODED response body received for "https://api.eu.newrelic.com/graphql" (RelFrameId=1, Internal ID=1)
Action.js(22): t=13240ms: Request done "https://api.eu.newrelic.com/graphql"  	[MsgId: MMSG-26000]
Action.js(22): web.rest("New Relic alerts") was successful, 641 body bytes, 419 header bytes, 17 chunking overhead bytes  	[MsgId: MMSG-26385]
Action.js(38): Notify: Transaction "FetchNewRelicAlerts" ended with a "Pass" status (Duration: 1.1134 Wasted Time: 0.1494).
Ending action Action.
Ending iteration 1.
Ending Vuser...
Starting action vuser_end.
Ending action vuser_end.

Summary

So what did we do in the previous steps? Long story short: we implemented and instrumented a basic application whose only job is to produce some errors. With our New Relic alert policy and condition in place, issues are created automatically when the condition threshold is exceeded. The Virtual User Generator script provides a replayable action to test and verify that New Relic issues are created and that they can be fetched via the NerdGraph API.