Write a SQL Statement for a Dimensioned Metric
This article outlines the requirements for a SQL statement that fetches data for a Dimensioned Metric.
The screens below are from the Metric Editor. For a step-by-step guide for defining a new Metric using the Metric Editor, see Create a Dimensioned Metric
- This article assumes that the Dimension has already been defined using the Dimension Editor. For details on how to define a Dimension, see the articles in Creating Dimensions
- This article also assumes that the Dimension is configured in the Dimension Editor to fetch all values in a single SQL Statement (rather than one SQL statement issued per Dimension Value).
1. SQL Statement Requirements
The SQL statement for a Dimensioned Metric must return the following columns in the result set:
- A Key Value of the Dimension
- A measurement date/time
- A measurement value
Measurement date/time must be returned in ANSI date format, with or without the time: "YYYY-MM-DD" OR "YYYY-MM-DD HH:MM:SS". These values in the data set are automatically normalized based on the Measurement Interval set for the metric. For the most commonly used Intervals:
- Daily: Set to the beginning of the day. e.g. '2012-12-28 12:22:33' is truncated to '2012-12-29 00:00:00'
- Weekly: Set to the last day of the calendar week based on the calendar week table in the Metric Insights database. The default week is Sunday through Saturday.
- Monthly: Converted to fall on the first day of the calendar month. e.g. '2012-12-28' is converted to '2012-12-01'
It is also important to ensure that only a single row is returned in the data set for a specific normalized date and Dimension key value pair. If the data set contains multiple rows that are normalized to fall on the same date, the last value collected is retained and all other values are discarded. For example, if SQL for a Daily Metric Dimensioned by Country returned the following 2 rows:
USA, '2012-12-28 12:32:00', 25
USA, '2012-12-28 08:12:00', 44
Since this is an example of a Daily Metric, both of the above date/time values would be normalized to '2012-12-28 00:00:00' and the first row in the result set would be discarded (only the value of 44 would be recorded for the specified date).
2. Collecting data incrementally
If you would like to fetch data incrementally, you can include the ':last_measurement_time' variable in your SQL statement. This variable is set at run-time to be the measurement time of the most recent metric value collected in the Metric Insights repository for the metric.
If you do not include :last_measurement_time in your SQL Statement, Metric Insights will re-fetch ALL historical values. In this case, new metric values will be added and existing values will be updated based on the result set of your query.
2.1. Setting limits on incremental data collection
In dimensioned metrics that are batch-collected, :last_measurement_time is set to be the earliest 'last measurement time' value across all dimension values.
The following additional controls can be set on the value assigned to :last_measurement_time
- You can set an offset to use in computing :last_measurement_time and this offset will be used to collect additional historical data. For example, if you set up the On data collection also re-run last '7' days, the system will subtract 7 days from the last measurement time to set the :last_measurement_time variable.
- You can set a maximum number of days that the system will go back to collect data. This is useful in situations where you have some Dimension values that are no longer accumulating values.
For example, if you have the following measurement times for various 'Country' Dimension values:
The value of :last_measurement_time in this scenario would normally be set to '2011-03-01' since that is the earliest 'last measurement time' across all Dimension values. If your organization is no longer doing business in Australia, that Dimension may never collect additional data and you may not want to re-collect all data from that point in time. By instructing the system to Batch Load Data Going back no more than 14 days, :last_measurement_time would be set to be at most 14 days before the latest 'last measurement time' across all Dimension values. In this example, this would be 14 days before 2012-10-02 which will result in :last_measurement_time being set to '2012-09-18'.
3. Recollecting all history for New Dimension Values
Normally, when a new Dimension Value is created for an existing Dimension, data for an existing Metric will be collected from the earliest ':last measurement time' across all Dimension Values. You can specify that ALL available history be collected for a new Dimension value by including the '<single-fetch></single-fetch>' tag in your data fetch command. This tag will be removed from regular data collection and will ONLY be used for data collection for new dimension values.
For example, if your SQL statement for a metric dimensioned by 'product_subcategory' is written as follows:
SELECT product_subcategory, calendar_date, sum(total_amount) FROM daily_order_summary s, product p WHERE p.product_id = s.product_id <single-fetch> AND product_subcategory = :product_subcategory </single-fetch> AND s.calendar_date > :last_measurement_time group by 1,2
The system will first process the above command for any NEW Dimension Values that have been added since data collection was last performed for the metric:
SELECT product_subcategory, calendar_date, sum(total_amount) FROM daily_order_summary s, product p WHERE p.product_id = s.product_id AND product_subcategory = :product_subcategory AND s.calendar_date > :last_measurement_time group by 1,2
The above statement will be run once for every new product category value (with the key value of the product subcategory substituted for ':product_subcategory' in SQL).
Once all new Dimensions have been processed, another fetch command will be issued to collect new measurements across all Dimension Values by running the SQL statement without the contents of the <single-fetch></single-fetch> tag. The following SQL statement will be run:
SELECT product_subcategory, calendar_date, sum(total_amount) FROM daily_order_summary s, product p WHERE p.product_id = s.product_id AND s.calendar_date > :last_measurement_time group by 1,2
4. Writing a SQL Statement for Single and Bulk Fetch Commands
Dimensions use fetch commands to tell Metric Insights how to collect data for elements. There are two methods: single ('Each Individual Dimension Value') and bulk ('All Dimension Values at Once'). A bulk fetch attempts to retrieve all the data in one request whereas a single fetch command makes multiple requests to retrieve the data.
In most circumstances, bulk fetch is sufficient so it is the default setting for new Dimensions. Although, there may be conditions where a single fetch is required. An example is when the SQL query requests three columns but the data source can only return two columns at a time. This would take more than a bulk request to gather the data, so using the single option is necessary. Imagine that a SQL table has columns for all the information you want: Sales Rep, Date, Amount. A bulk fetch can accomplish this. Now imagine that you have distinct tables named after each Sales rep, and only have columns Date and Amount in those tables. A single fetch is needed to query all the tables to get the same information.
The SQL format for using a Dimension's bind variable depends on the fetch command. If a Dimension uses a single fetch, the bind variable (:product_category here) is included in the 'where' clause:
Select calendar_date, sum(total_amount) From daily_order_summary a, product b Where a.product_id = b.product_id and b.product_category = :product_category group by 1
If the dimension uses bulk fetching, then you need to select the dimensioned column ('wine_vintage' in this example) and not include the bind variable in the 'where' clause:
Select wine_vintage, calendar_date, sum(total_amount) from daily_order_summary a, product b where a.product_id = b.product_id group by 1, 2
NOTE: Metric Insights will tell you what you need to return in the SQL query. Look at the notes to the right of the SQL Statement box and it will define the columns to include or to use the bind variable in the clause.