Stream connectivity metadata and build dashboards to monitor your devices in real-time

Before a device can send data to an application server, it needs to communicate with the mobile network to ensure that the data is allowed to be sent over the network. This communication happens through signaling events which are usually hidden from the application. The EMnify DataStreamer makes this connectivity metadata and device data usage available in real time in your Devicepilot account for detailed insights on the consumption and state of your devices.

In this guide we will be building an example dashboard with several relevant widgets to troubleshoot connectivity. The following queries will be created and will have associated charts added to our dashboard:

  • Top 10 devices with highest data usage in last 24 hours
  • Top 10 Devices by SMS consumption last 24 hours
  • Top devices with highest number of alerts in the last hour
  • Count of unique devices per network last 1 hour
  • Data usage by device in the last hour (5 minute aggregate)
  • Data consumption per hour over last 24 hours
  • SMS consumption by device in last 24 hours
  • PDP create and delete events
  • Count of devices that reached service limits and are blocked from data usage

Prerequisites

  • A device with a cellular modem and EMnify SIM card
  • A Devicepilot account

Benefits 

  • data consumption integrated into operational dashboards for service teams
  • allows faster triaging between device, connectivity and application issues
  • view on service usage and cost per device 
  • directly delivered to Devicepilot without need for managing an application server

Integration steps

The Devicepilot platform has a dedicated guide for streaming data from an EMnify account. Devicepilot ingests device data through a webhook, which can be set up in three steps:

  1. In your Devicepilot account navigate to Connect -> Available
  2. In the Connectivity section, select EMnify
  3. Click Generate Webhook

This webhook can now be used to stream usage and event data from your EMnify account.

  1. Navigate to the Tech Settings page on the EMnify portal and click +Add Data Stream
  2. Select Usage Data and Events as the stream type and RestAPI for the API Type
  3. Click the cog icon to manage API URLs
  4. Create a new URL with Devicepilot as the purpose, the generated webhook as the URL and click the plus + icon to save
  5. Click OK done, select the Devicepilot URL from the dropdown selector, and click Add Data Stream

Verifying the integration

In the EMnify portal, an HTTP 200 code in the Remote Status column indicates that the data stream is successfully configured in the Devicepilot account and is receiving events.

To verify that the stream is being processed correctly in Devicepilot, use the Test your Data button or navigate to the test page to check whether data is incoming.

Navigating to the View page will display a list of devices that have sent data to Devicepilot. When you click on any of the devices in the List view, the other widgets update their results so that you can drill down into the details.

Managing KPIs, Filters and the Cohort View

Before building widgets and dashboards in Devicepilot, queries must be built that can be applied to graphs. To create a new KPI, navigate to Cohort and click New KPI. In this view, the following are the main building blocks:

  • Timeframe to apply the query
  • KPI name which is editable on mouse-click
  • Group by property (which is typically endpoint.id or endpoint.name for our use case)
  • Filter which can be used to exclude devices by certain properties. This can be useful if you are making use of Tags to group collections of devices.
  • Options which is useful for limiting results to top or bottom 10, for example, or to compare with values in the previous timeframe.

To save time later, first prepare the Filters that our queries will use to focus on PDP context activity, SMS activity, and warnings raised when devices are about to exhaust data quotas or have services blocked.

Prepare three filters for use:

  • Navigate to Settings -> Filters and click Create
  • Add the name PDP Context Events where the event_type.description contains PDP and click save
  • Create a filter named Warnings where the event_severity.description is WARN and click save
  • Create a filter named SMS Events where the traffic_type.name is SMS and click save

These filters can now be applied to queries at later steps for targeting the scope of our graphs.
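
For readers who prefer to see the filter logic spelled out, the three filters can be sketched as predicates over event records. This is only an illustration in Python and not how Devicepilot evaluates filters internally. The property names come from the filter definitions above; the helper get_path, the sample records and the description "Quota threshold reached" are hypothetical, and the case-sensitive "contains" check is an assumption.

```python
from typing import Any


def get_path(record: dict[str, Any], path: str) -> Any:
    """Resolve a dotted property such as 'event_type.description'; None if missing."""
    value: Any = record
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def is_pdp_context_event(record: dict[str, Any]) -> bool:
    # "PDP Context Events": event_type.description contains PDP
    description = get_path(record, "event_type.description")
    return isinstance(description, str) and "PDP" in description


def is_warning(record: dict[str, Any]) -> bool:
    # "Warnings": event_severity.description is WARN
    return get_path(record, "event_severity.description") == "WARN"


def is_sms_event(record: dict[str, Any]) -> bool:
    # "SMS Events": traffic_type.name is SMS
    return get_path(record, "traffic_type.name") == "SMS"


# Hypothetical sample records, shaped only to exercise the three predicates.
SAMPLE_RECORDS = [
    {"endpoint": {"id": 101},
     "event_type": {"description": "Create PDP Context"},
     "event_severity": {"description": "INFO"}},
    {"endpoint": {"id": 102},
     "event_type": {"description": "Quota threshold reached"},
     "event_severity": {"description": "WARN"}},
    {"endpoint": {"id": 101},
     "traffic_type": {"name": "SMS"},
     "volume": {"total": 1}},
]

pdp_events = [r for r in SAMPLE_RECORDS if is_pdp_context_event(r)]
```

Each predicate returns False when the property is absent, which mirrors the idea that a filter only scopes in records that actually carry the matching property.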

Top 10 devices with highest data usage in last 24 hours

Showing the top devices by data usage is a simple and effective way to identify devices which are using more data than expected. This can be especially helpful for troubleshooting misconfigured devices which may consume a large amount of data. To build this query:

  • Name the KPI Top 10 Devices by Data Usage last 24 Hours
  • Set the timeframe to Last 24 hours
  • Set the Metric to Sum of volume.total
  • Group by endpoint.id
  • Limit to Top 10
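
As a rough mental model of what this KPI computes, the following Python sketch sums volume.total per endpoint.id and keeps the ten largest totals. It is an illustration under assumptions, not Devicepilot's query engine: the function name and the nested record shape are hypothetical, and the records are assumed to be already restricted to the last 24 hours.

```python
from collections import defaultdict


def top_devices_by_volume(records, limit=10):
    """Sum volume.total per endpoint.id and return the largest totals.

    Returns a list of (endpoint_id, total) pairs, highest total first.
    """
    totals = defaultdict(float)
    for record in records:
        endpoint_id = record.get("endpoint", {}).get("id")
        volume = record.get("volume", {}).get("total")
        if endpoint_id is None or volume is None:
            continue  # skip records that carry no usage data
        totals[endpoint_id] += volume
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]
```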

Top devices with highest number of alerts in the last hour

  • Name the KPI Alerts by Device last hour
  • Set the timeframe to Last 1 hour
  • Set the Metric to Count distinct values of alert
  • Group by endpoint.id and group by time of 1 minute
  • Choose the stacked bar chart to vertically stack by device

Hint: clicking on the H icon above the graph allows you to set titles for the axes of the graph, which improves the readability of these graphs on the final dashboard.

Data usage by device in the last hour (5 minute aggregate)

  • Name the KPI Data Usage by Device last Hour
  • Set the timeframe to Last 1 hour
  • Set the Metric to Sum of volume.total
  • Group by endpoint.id and group by time 5 minutes
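
Grouping by time means that every record is assigned to a fixed-width window before the metric is summed. The sketch below shows one way to do this for 5 minute windows in Python. The function names and the record shape are hypothetical, and the timestamp property is assumed to be a timezone-aware datetime that has already been parsed; how Devicepilot aligns its windows is not covered by this guide.

```python
from collections import defaultdict
from datetime import datetime, timezone

BUCKET_SECONDS = 5 * 60  # 5 minute aggregate


def bucket_start(timestamp: datetime, bucket_seconds: int = BUCKET_SECONDS) -> datetime:
    """Round a timezone-aware timestamp down to the start of its bucket (UTC)."""
    epoch = int(timestamp.timestamp())
    return datetime.fromtimestamp(epoch - epoch % bucket_seconds, tz=timezone.utc)


def usage_per_bucket(records):
    """Sum volume.total per (endpoint.id, bucket start) pair."""
    totals = defaultdict(float)
    for record in records:
        key = (record["endpoint"]["id"], bucket_start(record["timestamp"]))
        totals[key] += record["volume"]["total"]
    return dict(totals)
```

Changing BUCKET_SECONDS to 60 or 3600 gives the 1 minute and 1 hour groupings used by the other KPIs in this guide.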

Top 10 Devices by SMS consumption last 24 hours

  • Name the KPI Top 10 Devices by SMS Usage
  • Set the timeframe to Last 24 hours
  • Set the Metric to Count of $id
  • Group by endpoint.id
  • Limit to Top 10

Hint: clicking on the traffic light icon allows for setting performance thresholds which can format graphs according to expected values. In this graph, devices consuming less than 20 SMS will be formatted as green, more than 20 as yellow and more than 90 as red.
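
The thresholds described in the hint amount to a simple classification, sketched here in Python. The function name is hypothetical, and treating a count of exactly 20 as green is an assumption, since the hint does not say how the boundary value is handled.

```python
def sms_traffic_light(sms_count: int) -> str:
    """Map an SMS count to the colour described in the hint above."""
    if sms_count > 90:
        return "red"
    if sms_count > 20:
        return "yellow"
    return "green"  # assumption: exactly 20 still counts as green
```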

Data consumption per hour over last 24 hours

  • Name the KPI Data Usage by device last 24 Hours
  • Set the timeframe to Last 24 hours
  • Set the Metric to Sum of volume.total
  • Group by endpoint.id and set group by time 1 hour

Hint: If KPIs are skewed by outlier devices, these devices can be deselected from the current view by clicking on the endpoint ID in the graph legend. Devices that are not displayed are greyed out.

SMS consumption by device in last 24 hours

  • Create a new KPI with the name SMS activity last 24 Hours
  • Set the timeframe to Last 24 hours
  • Set the Metric to Count of $id (this is used to count the total number of events)
  • Group by endpoint.id and set group by time 5 minutes
  • For the Scope Filter, choose the SMS Events filter and click Save

PDP create and delete events

  • Create a new KPI with the name PDP Events last 24 Hours
  • Set the timeframe to Last 24 hours
  • Set the Metric to Count of $id
  • Group by event_type.description and set group by time 30 minutes
  • For the Scope Filter, choose the PDP Context Events filter and click Save

Count of unique devices per network last 1 hour

As this KPI uses a filter with two logical operators, we can use an accelerator for building scopes on-the-fly when designing the query. In this KPI, we will be counting the number of devices which used a network operator at least once.

To create a filter expression while building the KPI:

  • Create a new KPI with the name Count of unique devices per network
  • Set the timeframe to Last 6 hours
  • Set the Metric to Count of endpoint.id
  • Group by detail.name and set group by time 1 hour
  • For the Scope Filter, click Filters -> + Create New
  • In the sidebar menu, give the filter the name Has Operator Info
  • Create an expression where the detail.name isn't null and click + add condition
  • Create an and operator for the expression where the detail.name has any value and click save
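
The intent of this KPI, counting each device once per network it used, can be sketched in Python as follows. This is an illustration only: the function name and record shape are hypothetical, the time grouping is left out for brevity, and counting distinct endpoint IDs (rather than events) is an assumption based on the description above.

```python
from collections import defaultdict


def unique_devices_per_network(records):
    """Count distinct endpoint.id values per detail.name.

    Records without detail.name are skipped, which plays the role of the
    Has Operator Info filter.
    """
    devices = defaultdict(set)
    for record in records:
        network = (record.get("detail") or {}).get("name")
        endpoint_id = (record.get("endpoint") or {}).get("id")
        if network is None or endpoint_id is None:
            continue
        devices[network].add(endpoint_id)
    return {network: len(ids) for network, ids in devices.items()}
```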


Count of Devices that reached Service Limits

This is a straightforward query that involves a single filter, but can be used as a quick gauge to check if any devices are blocked from sending data due to Usage Limits set in the EMnify portal.

  • Create a new KPI with the name Service Limit Reached
  • Set the timeframe to Last 1 hour
  • Set the metric to count devices where, then click + Filter
  • Set the filter name to Service Limit Reached where the event_type.description is Endpoint blocked and click save

 

Building the Dashboard

After at least one query has been saved, it can be included in a dashboard. To create a dashboard, navigate to Dashboards and name the new dashboard Emnify Data.

  • Click + Add widget, select a KPI (PDP Events, for example), and click Save
  • The graph can then be resized in the dashboard, and the chart style can be changed by clicking on the graph icon, for example to switch the KPI from a line chart to a bar chart
  • Repeat the steps to add a widget for each KPI until the dashboard shows all the graphs needed for key insights

Next Steps - Alerts via the Rules engine and Slack Notifications

In this guide, we have learned how to create simple dashboards which provide actionable insights into the data consumption, SMS and network activity of your devices. To switch to a more proactive approach to monitoring device data, alerting can be configured in Devicepilot for truly actionable insights. One of the most useful ways for operations teams to view and manage alerts like this is via Slack actions.

Devicepilot has built-in components for activating Slack integrations, and alerts can be generated for metrics and thresholds that suit your use case. To create a Slack alert for a KPI you are interested in, navigate to Connect -> Active and click connect for the Slack integration.

To create alerts that can be sent via Slack:

  • Navigate to Rules and click + Create
  • Select a KPI as a template to alert on; this example uses Data Usage by Device last hour
  • Choose a Y Value which in our case is MB, and the trigger is Y is more than 1 (MB)
  • The job will run every 5 minutes and check device data consumption over the last hour

The Slack notification can be customised using a mix of handlebars tags and HTML to add richer content to notifications. To direct the user to the relevant endpoint in the EMnify portal, the following example snippet can be copied:

{{ rule.name }} has fired - endpoint {{ kpi.x }} has consumed {{ kpi.y }}MB.

To configure connectivity of this endpoint in the EMnify portal, visit:
<a href="https://cdn.emnify.net/#/endpoints/{{ kpi.x }}">Endpoint ID {{ kpi.x }}</a>
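
To preview what such a notification might look like once the tags are filled in, the following Python sketch substitutes the placeholders with sample values. Devicepilot renders the template itself; this is only a simplified stand-in for handlebars that handles plain dotted tags, and the rule name and KPI values are made up for illustration.

```python
import re

TEMPLATE = (
    "{{ rule.name }} has fired - endpoint {{ kpi.x }} has consumed {{ kpi.y }}MB.\n"
    "To configure connectivity of this endpoint in the EMnify portal, visit:\n"
    '<a href="https://cdn.emnify.net/#/endpoints/{{ kpi.x }}">Endpoint ID {{ kpi.x }}</a>'
)

# Matches tags such as {{ kpi.x }} and captures the dotted name inside.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Replace each {{ a.b }} tag with the matching value from a nested dict."""
    def lookup(match: re.Match) -> str:
        value = context
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


# Hypothetical values: kpi.x is the endpoint ID, kpi.y the consumed MB.
message = render(
    TEMPLATE,
    {"rule": {"name": "High data usage"}, "kpi": {"x": 12345, "y": 1.7}},
)
print(message)
```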

Alerting when there are sudden unexpected changes in device activity can be a useful way to be proactive when it comes to device troubleshooting. The following alert can be used as a starting point to be customised according to the expected pattern of device activity per use case.

  • Create a new rule and select the KPI Count of Devices with events last 24 hours
  • For the trigger section, select current has changed more than 20 percent
  • Run every hour
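
The trigger condition compares the current KPI value with the previous one. A minimal Python sketch of such a check is shown below. The function name is hypothetical, and both the handling of a previous value of zero and the treatment of exactly 20 percent as "not more than" are assumptions, not documented Devicepilot behaviour.

```python
def has_changed_more_than(previous: float, current: float, percent: float = 20.0) -> bool:
    """Return True if current differs from previous by more than `percent` percent."""
    if previous == 0:
        # Assumption: any move away from zero counts as a change.
        return current != 0
    change = abs(current - previous) * 100 / abs(previous)
    return change > percent
```

Because the check uses the absolute difference, it fires on sudden drops in active devices as well as on sudden increases.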

In a similar way, we can create a rule which triggers an action if a device that previously sent data has not sent any in the last 12 hours. The calculated Y value is 1 if the device has not sent data in the last 12 hours and 0 if it has. The rule measures the change from 0 (data sent in the previous 12 hours) to 1 (no data sent in the last 12 hours).

  • Create a new rule and select count devices where
  • Click + add Filter and create a filter where volume.total is less than or equal to 0
  • Set the timeframe to be over the last 12 hours and the grouping to endpoint.id
  • For the trigger section, select current has increased by more than 0
  • Run every 12 hours
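
The 0 to 1 logic of this rule can be sketched in Python as follows. The function names are hypothetical, and the per-device volumes for the previous and current 12 hour windows are assumed to be computed elsewhere.

```python
def silent_flag(window_volume: float) -> int:
    """Y value of the rule: 1 if no data was sent in the window, otherwise 0."""
    return 1 if window_volume <= 0 else 0


def should_alert(previous_window_volume: float, current_window_volume: float) -> bool:
    """Fire only when Y increases, i.e. the device sent data before and is silent now."""
    return silent_flag(current_window_volume) - silent_flag(previous_window_volume) > 0
```

A device that was already silent in the previous window does not fire again, because its Y value stays at 1 and the increase is 0.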

With notifications delivered to your private or public Slack channels, you can stay up to date on the current status of your device fleet and even stay ahead of potential issues by alerting on sudden, unexpected changes in the connectivity data of your devices. A possible next step is to integrate alerts into Zendesk so that tickets are opened automatically for operations teams depending on KPIs. More details on this integration can be found in the Devicepilot documentation on deep-linking.