Problem
During SNMP monitoring, you don't see all of the expected metrics for your device.
Solution
Identify what metrics exist in New Relic by running the following NRQL query, replacing $DEVICE_NAME
as necessary:
FROM Metric SELECT uniques(metricName) WHERE instrumentation.provider = 'kentik' AND device_name = '$DEVICE_NAME'SINCE 1 HOUR AGO LIMIT MAX
This query will give you a list of every dimensional metric being collected on your device in the last hour. If the metric is not listed, you should try these tests:
Run the snmpwalk utility from the host where your ktranslate
agent is running, using the SNMP credentials you configured in the snmp-base.yaml
configuration file.
If the test fails, the device most likely does not support the OID you want to collect. This is a limitation of the device itself, as controlled by the vendor.
팁
If you are using SNMPv3, validate the configuration of the v3 user on the device. In most situations, device administrators need to explicitly grant access to MIBs for a v3 user account.
Check whether the OID exists in the device profile itself. If there seems to be an issue with an OID that already exists in the profile, open a GitHub issue to contact the repository maintainers so they work towards a resolution. If the OID does not exist in the profile, you can submit a pull request to have them added. Follow the steps in the SNMP profiles documentation.
팁
The value of instrumentation.name
on your dimensional metrics maps to the profile file name where the metrics collection is configured.
Verify that the configured value for mib_profile
in your snmp-base.yaml
file matches the correct profile file name. For example:
devices: deviceOne: ... mib_profile: cisco-catalyst.yml ...
You can check this in New Relic with the following NRQL query, replacing $DEVICE_NAME
as necessary:
FROM Metric SELECT latest(instrumentation.name)WHERE instrumentation.provider = 'kentik'AND device_name = '$DEVICE_NAME'
The library of SNMP profiles is constantly being updated, and sometimes the container image you're using doesn't have the profile settings you're seeking. If the mib_profile
doesn't match the expected profile, you can either manually update your configuration file, or run a new discovery.
You should always pull the latest image for your container before making changes by running docker pull kentik/ktranslate:v2
.
Alternatively, you can get the latest via apt-get:
$curl -s https://packagecloud.io/install/repositories/kentik/ktranslate/script.deb.sh | sudo bash && \>sudo apt-get install ktranslate
Check your account for Warn
-severity errors that signify ktranslate
is having issues collecting certain metrics from your device.
Logs UI:
$collector.name:"ktranslate" message:"*OID failed to return results*"
NRQL:
FROM Log SELECT * WHERE `collector.name` = 'ktranslate' AND `message` LIKE '%OID failed to return results%'
Expected Results:
KTranslate>cisco-7513 OID failed to return results, Metric Name: ipIfStatsHCInOctets, Profile: cisco-asr
팁
In this example, you can see that the target device, cisco-7513
is not returning metrics for the ipIfStatsHCInOctets
OID, which is found in the cisco-asr
SNMP profile.
Next, you should run a single SNMP poll against your device to see exactly what ktranslate
receives from the request, using the supplied configuration.
To do this, run ktranslate
as a short-lived container, utilizing the -snmp_poll_now
flag. You can run this container using this command, replacing TARGET_DEVICE_NAME
with the value of devices.[].device_name
in your configuration YAML file for the device in question:
$docker run -d --name ktranslate-poll_now --rm --pull=always -p 162:1620/udp \>-v `pwd`/snmp-base.yaml:/snmp-base.yaml \>kentik/ktranslate:v2 \> -snmp /snmp-base.yaml \> -service_name=poll_now \> -snmp_poll_now=$TARGET_DEVICE_NAME \> -format=new_relic_metric
The results of this polling can be seen in the container logs using docker logs --follow ktranslate-poll_now
Device metadata polling example of success:
2022-01-03T23:08:50.583 ktranslate/poll_now [Info] KTranslate SNMP Device Metadata: Data received: {SysName:router123 SysObjectID:.1.3.6.1.4.1.9.1.46 SysDescr:Cisco Internetwork Operating System Software ...}2022-01-03T23:08:50.585 ktranslate/poll_now [Info] nrmFormat New Metadata for router123
Device statistics polling example of success:
[{"metrics":[{"name":"kentik.snmp.ifInErrors","type":"count","value":0,"attributes":{"if_Speed":2,"mib-name":"IF-MIB","poll_duration_sec":60,"if_Type":"proppointtopointserial", "if_AdminStatus":"up","objectIdentifier":".1.3.6.1.2.1.2.2.1.14","mib-table":"if","if_OperStatus":"up","device_name":"router123","provider":"kentik-router","if_interface_name":"Se11/0/0:16","instrumentation.name":"cisco-asr","if_Index":"63","if_Address":"10.201.0.65","eventType":"KSnmpInterfaceMetric","if_Netmask":"255.255.255.252","if_Alias":"pkt.ds1"}}]...}]
Looking at the "prettified" JSON, you can see here that polling is working as expected for this device:
[ { "metrics": [ { "name": "kentik.snmp.ifInErrors", "type": "count", "value": 0, "attributes": { "if_Speed": 2, "mib-name": "IF-MIB", "poll_duration_sec": 60, "if_Type": "proppointtopointserial", "if_AdminStatus": "up", "objectIdentifier": ".1.3.6.1.2.1.2.2.1.14", "mib-table": "if", "if_OperStatus": "up", "device_name": "router123", "provider": "kentik-router", "if_interface_name": "Se11/0/0:16", "instrumentation.name": "cisco-asr", "if_Index": "63", "if_Address": "10.201.0.65", "eventType": "KSnmpInterfaceMetric", "if_Netmask": "255.255.255.252", "if_Alias": "pkt.ds1" } } ] }]