Tips & Tricks: Working with Thousands of Dimension Values
Data-driven alerts are among the most compelling features of Metric Insights. They're even more compelling when combined with dimensions. A single, dimensioned metric can easily be used to monitor many data streams. You'll want to be careful, however, when the number of dimension values becomes very large. You can easily wind up with an unmanageable set of charts that don't provide the key insights you really need. In this issue of Tips & Tricks, we'll show you a general method to handle large numbers of dimension values.
You can accomplish similar functions by creating a Change Report: see Create a Change Report
Let's say that you're a manager for a bank that has 10,000 customers. As a protection against accounting errors, you've asked to be alerted if any account balance changes by more than 20% on any given day. Your first inclination might be to create a Daily Balance metric, dimensioned by Account. You could then set at alert that would fire if today's balance differs from yesterday's balance by more than 20%. What's wrong with that?
For starters, you probably expect some number of large transactions every day. You'll therefore receive daily alerts even when nothing is wrong. When a computer glitch does occur, you might not notice that your email digest includes an unusual number of charts. And even if you do take notice, you'll need to interpret a large number of charts, when what you really want is a list of affected accounts.
Fortunately, there is a better approach. Let's start with what you want, namely, a list of affected accounts. You can easily create such a list by writing an appropriate SQL statement against your database. For instance:
Select account_id, balance_today, balance_yesterday From source_table Where balance_today < 0.8 * balance_yesterday Or balance_today > 1.2 * balance_yesterday
Use your SQL statement as the basis for a report named 'Affected Accounts'. You could place the report in a favorite folder, but since you only want to see it when glitches occur, we'll connect it to the alerting mechanism instead. Here are the necessary steps:
1. Report Settings
At the Report Editor > Data tab:
Set the Snapshot Report (keep history) field to 'yes'
Set Use Report as Source (save as system table) to yes. Metric Insights will preserve each day's list of affected accounts in a MySQL table that can be used to create other elements.
2. Build the Metric to trigger the alert
Build a metric based on the report. The SQL command is simple -- we'll just count the number of entries by day:
Select measurement_time, count(*) From affected_accounts Group By 1
NOTE: You may be familiar with this process if you've read our earlier article on snapshotting.
3. Create a new Alert
At the top right corner of the Report Viewer click an Alert icon. The Add Alert rules for pop-up opens. Click [+ New rule].
3.1. Tie the Metric to the Report
Connect the Report to the Metric by assigning it as an “alert metric.” Alternatively, scroll down to the bottom of the drop-down list and click Define a new Metric from <report name>.
Create an Alert with a threshold of, say, 50 affected accounts. Now if the Metric issues an alert, it will be as if the report had issued an alert as well.
4. Finally, subscribe to the report alert
Now we have what we want: A list of affected accounts, which will be delivered only when it contains more than 50 entries. This is much more useful than a daily collection of charts, which you may quickly learn to ignore.
The key point here is not to avoid dimensions but to apply them thoughtfully. Let's see how we might do that in the current example.
Let's say that your 10,000 accounts are divided among ten local branches. In other words, each branch represents 10% of your accounts. If a computer glitch occurs at only one of them, the number of affected accounts might fail to exceed your system-wide 50-account threshold. You can avoid that outcome by dimensioning everything by Branch and setting a lower threshold. You'll still have a manageable number of dimension values (10 instead of 10,000). You'll also have a more sensitive alerting mechanism that can detect branch-level failures without issuing system-wide false alerts. Finally, you'll have a mechanism for distributing alerts to the people who should act on them, namely, the branch managers.
The method described in this article is quite general. The next time you're tempted to create a dimension with a large number of values, consider this approach first.