Create a Dataset from any Data Source
Datasets serve as a buffer level between Data Sources and Metric Insights elements (Metrics and Reports).
The ultimate goal of Dataset functionality is to simplify the process of Metrics and Reports creation and make it intuitive and consistent.
1. Access New or Content menu > Datasets
The Dataset List page may not have a grid if there are no Datasets or display the list of already created Datasets.
Below the grid, click [+ New dataset].
2. [Info tab] Define the basics
- Measured: select the measurement interval that applies to the level of aggregation that you want in your result set.
- Collecting: new Datasets are always disabled by default to make sure that you can take time to configure them properly before enabling. This setting is duplicated at the top of the screen.
- Name: provide a unique name for your dataset. Preferably, the dataset name should explain what kind of data it contains.
Move to the Data tab to define the source of data and how often it should be updated.
3. [Data tab] Configure data collection
- Data Source: Choose a data fetch method from drop-down list. A SQL-based Data Source is used in this example. For more details on other available data sources and specific fetch method requirements, see Understanding Data Sources
- Data Collection Schedule: select how often data should be recollected for this Dataset to ensure that it contains relevant data.
- Write a SQL Statement defining the data to be extracted from the database. You may use :measurement_time in your statement to bind in a date or series of date values. See also: How do I use the :measurement_time bind variable?
- Validate your statement. If you have included :measurement_time parameter in your statement, it is required to specify Measurement Time for Parameter Substitution.
- If the statement is valid, a Validation Rows Preview section pops-up at the bottom of the screen. You can also open it by clicking Show validation rows under the SQL Statement box.
3.1. [New in 5.0.4] Optionally, customize the display of your Dataset values
- Click Save to display the Dataset Columns table.
- Dataset Columns: This table can be used to rename column names and define precision for numeric fields, if needed. In the given example, there are a few fields that could use better Column Names and let's say we would like the Total_unit_count to be a whole number rather than the default value of 2 decimal points.
- Click the Gear for each field you would like to rename
- On the resulting pop-up:
- Type in a new Column Name
- If data element is a floating point integer, you can also change the number of decimals to display using the drop-down list on the Precision field.
See the screen below to examine the changes made
Special and accented characters (Éé Êê Èè Ëë Ââ Àà Ää Ïï Îî Üü Ûû Ùù Ÿÿ Çç Ôô Öö ẞß Ææ Œœ)
Regardless of the Dataset's Data Source, Metric Insights supports special and accented characters. Please note, that after the command is validated, and data is ditributed to columns and the Dataset Columns table is shown, the special characters may be converted to the underscore symbol [ _ ] in the Reference Name column. This behaviour is only characteristic for the Dataset Editor and doesn't cause issues in the Viewer.
- Snapshot Datasets are associated with keeping Dataset history and having the ability to compare Dataset instances over time.
- If this the Dataset is not defined as a Snapshot dataset (this field is set to 'no'), then only the most recent instance of the Dataset will be retained
- If it is a Snapshot Dataset (the history is going to be kept, for more details refer to: Snapshot Datasets: Comparing Instances), then an additional setting will be exposed below, namely, Can historical instances be backfilled? (see below)
This field is shown for Snapshot Datasets only.
- If the Can historical instances be backfilled? field is set to 'no', then only one instance of the Dataset will be computed at run time and it is required to set the value for the ':measurement_time' variable in the field below. It's important to note that while only one instance of the report will be computed at run time, a new instance of the report will be computed at each succeeding refresh interval. Since history is kept (the Snapshot Report? field is set to 'yes'), all instances will be retained. This technique can be used to create 'snapshots' of your underlying data at fixed time intervals.
- If this field is set to 'yes', then multiple instances of the report can be computed at run time and the ':measurement_time' variable is defined automatically be the system.
At the upper right corner of the screen click Enable and Publish.
4. You are redirected to the Dataset Viewer
Your Dataset is now ready for use.