Help & Documentation

Advanced Search Tuning: Understanding Rate Calculation

The total rate of the search result comprises three types of score:

Token score: the weight of a word and its relation to a phrase
Field score: e.g., the Element name field is rated higher than the Element description field
Table-relation score: e.g., an Element found by its Dimension Value is rated higher than one found by a Tag

DEFINITION OF "TOKEN": A token is a unique word or combination of characters in a text.

During processing, text is segmented into words, punctuation, and unique character sets by applying rules specific to each language. For example, punctuation at the end of a sentence is ignored, whereas "U.K." remains one token.

Token Score

Each lemma (a basic form of a word) receives 100 points by default, unless it is a stop word.

Stop words receive 50 points. However, if a phrase consists only of stop words, each of them is assigned 100 points. In addition, 50 points are divided by the number of words in the phrase, and the result is added to each word's points to give its final token score.

Consider the following example:

There is an Element and a related Dataset:
- Element(name="Sales metric", description="Canada sales")
- Dataset(name="Daily sales in Canada")
The token score is calculated during indexing and is as follows:
- Element name [Sales metric] →
  - sales[100 + 50 / 2], metric[100 + 50 / 2] →
  - sales[125], metric[125]
- Element description [Canada sales] →
  - canada[100 + 50 / 2], sales[100 + 50 / 2] →
  - canada[125], sales[125]
- Dataset name [Daily sales in Canada] →
  - daily[100 + 50 / 4], sales[100 + 50 / 4], in[50 + 50 / 4], canada[100 + 50 / 4] →
  - daily[112.5], sales[112.5], in[62.5], canada[112.5]
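
The token score rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the product's actual indexing code: the function name token_scores is hypothetical, the text is split on whitespace instead of being lemmatized, and the stop word set is a small stand-in (only "in" is confirmed as a stop word by the example above).

```python
# Hypothetical stand-in for the real stop word list (see MI_stop_word_list.txt).
# Only "in" is confirmed by the article's example; the rest are assumptions.
STOP_WORDS = {"in", "the", "of"}


def token_scores(phrase, stop_words=STOP_WORDS):
    """Return {token: score} for one indexed phrase.

    Simplification: tokens are lowercased, whitespace-separated words;
    real indexing applies language-specific segmentation and lemmatization.
    """
    words = phrase.lower().split()
    if not words:
        return {}

    # A phrase made up only of stop words is scored as if its words were regular words.
    only_stop_words = all(word in stop_words for word in words)
    # 50 extra points are shared equally among the words of the phrase.
    share = 50 / len(words)

    scores = {}
    for word in words:
        is_stop = word in stop_words and not only_stop_words
        scores[word] = (50 if is_stop else 100) + share
    return scores


print(token_scores("Sales metric"))           # {'sales': 125.0, 'metric': 125.0}
print(token_scores("Daily sales in Canada"))  # {'daily': 112.5, 'sales': 112.5, 'in': 62.5, 'canada': 112.5}
```

The printed values reproduce the figures from the example above.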

Stop word list: MI_stop_word_list.txt

Result Score

Tokens that satisfy the search argument are identified during the search. After that, the program:

Sums the token scores
Calculates points based on the order of words (tokens) in the phrase and adds them to the result
Multiplies the result by the field score and divides it by 100 (field scores are used as coefficients)
Multiplies the result by the table-relation score and divides it by 100

Steps 1-4 below illustrate how the score is calculated.

1. Sum Token Scores

The query “Canada daily sales” returns the following tokens:

Token score for Element name [Sales metric] = sales[125] = 125
Token score for Element description [Canada sales] = canada[125] + sales[125] = 250
Token score for Dataset name [Daily sales in Canada] = daily[112.5] + sales[112.5] + canada[112.5] = 337.5
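
As a rough sketch of this step, the summation can be written as follows. The per-token scores are copied from the Token Score example; the function name summed_token_score is hypothetical, and matching is simplified to exact, case-insensitive word equality.

```python
# Per-token scores taken from the Token Score example in this article.
ELEMENT_NAME = {"sales": 125.0, "metric": 125.0}
ELEMENT_DESCRIPTION = {"canada": 125.0, "sales": 125.0}
DATASET_NAME = {"daily": 112.5, "sales": 112.5, "in": 62.5, "canada": 112.5}


def summed_token_score(query, field_token_scores):
    """Add up the scores of the field's tokens that also occur in the query.

    Assumption: a token matches only if it equals a query word exactly.
    """
    query_tokens = set(query.lower().split())
    return sum(
        score
        for token, score in field_token_scores.items()
        if token in query_tokens
    )


query = "Canada daily sales"
print(summed_token_score(query, ELEMENT_NAME))         # 125.0
print(summed_token_score(query, ELEMENT_DESCRIPTION))  # 250.0
print(summed_token_score(query, DATASET_NAME))         # 337.5
```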

2. Calculate the Rate Based on Word Order

Element name: "Canada daily sales" vs. "Sales metric" → 0 points (only one match, "sales"): 125 + 0 = 125
Element description: "Canada daily sales" vs. "Canada sales" → 1 point ("canada" goes before "sales", but there are other words between them): 250 + 1 = 251
Dataset name: "Canada daily sales" vs. "Daily sales in Canada" → 2 points ("canada" is at the end of the phrase, but "sales" goes directly after "daily"): 337.5 + 2 = 339.5

3. Multiply by Field Score

Field rates can be changed under Admin > System > Search Setup > Advanced Search Tuning to adjust rankings according to the needs of your organization.

For the purpose of this article, default values will be used.

Score for Element name [100] = 125 * 100 / 100 = 125
Score for Element description [50] = 251 * 50 / 100 = 125.5
Score for Dataset name [100] = 339.5 * 100 / 100 = 339.5

4. Multiply by Table-Relation Score

Score for Element → Element (for Element name) [100] = 125 * 100 / 100 = 125
Score for Element → Element (for Element description) [100] = 125.5 * 100 / 100 = 125.5
Score for Dataset → Element (for Dataset name) [50] = 339.5 * 50 / 100 = 169.75
Score for Dataset → Dataset (for Dataset name) [100] = 339.5 * 100 / 100 = 339.5
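
The arithmetic of Steps 2-4 can be combined into one small function. This is a sketch of the calculation as described in this article, with a hypothetical function name (result_score); the word-order points are passed in as given values because the exact ordering rules are not spelled out here.

```python
def result_score(token_sum, order_points, field_score, relation_score):
    """Combine the four steps: sum, word-order points, field and relation coefficients."""
    result = token_sum + order_points       # Steps 1 and 2
    result = result * field_score / 100     # Step 3: field score as a coefficient
    result = result * relation_score / 100  # Step 4: table-relation score as a coefficient
    return result


# (token sum, word-order points, field score, table-relation score) from the worked example
print(result_score(125, 0, 100, 100))    # Element found by Element name        -> 125.0
print(result_score(250, 1, 50, 100))     # Element found by Element description -> 125.5
print(result_score(337.5, 2, 100, 50))   # Element found by Dataset name        -> 169.75
print(result_score(337.5, 2, 100, 100))  # Dataset found by Dataset name        -> 339.5
```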

The search returns two entities: the Element and the Dataset. The Element has three scores: 125, 125.5, and 169.75. The program therefore treats the Element as found by Dataset name, since that match has the highest score (169.75). The Dataset is found by its name.

Search results are as follows:

Dataset [339.5]
Element [169.75]
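
One way to picture how the final list is derived: keep the best-scoring match for each entity, then sort the entities by that score. The sketch below uses the scores from the worked example; the tuple layout and the function name rank_entities are illustrative assumptions, not the product's data model.

```python
# (entity, field the match was found by, score) from Step 4 of the worked example
CANDIDATES = [
    ("Element", "Element name", 125.0),
    ("Element", "Element description", 125.5),
    ("Element", "Dataset name", 169.75),
    ("Dataset", "Dataset name", 339.5),
]


def rank_entities(candidates):
    """Keep the highest-scoring match per entity and sort entities by score, descending."""
    best = {}
    for entity, found_by, score in candidates:
        if entity not in best or score > best[entity][1]:
            best[entity] = (found_by, score)
    rows = [(entity, found_by, score) for entity, (found_by, score) in best.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


for entity, found_by, score in rank_entities(CANDIDATES):
    print(f"{entity} [{score}] found by {found_by}")
# Dataset [339.5] found by Dataset name
# Element [169.75] found by Dataset name
```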

Additional Rules

When entities have the same score, additional ordering rules are applied:

Certified elements have higher rankings than non-certified ones
“Metric” and “Multi-Metric” element types have more weight than other types
Elements with higher engagement rates have higher rankings
Elements are ranked based on their internal ID (in ascending order)
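
These tie-breaking rules can be expressed as a composite sort key. The sketch below assumes the rules are applied in the order listed, each one breaking ties left by the previous one; the article does not state this priority explicitly. The Element fields (certified, element_type, engagement) and the example type "Report" are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Element:
    id: int
    score: float
    certified: bool
    element_type: str
    engagement: float


# Element types that outrank the others when scores are equal
PREFERRED_TYPES = {"Metric", "Multi-Metric"}


def ranking_key(element):
    """Sort key: score first, then the additional rules in the listed order (assumed)."""
    return (
        -element.score,                               # higher score first
        not element.certified,                        # certified before non-certified
        element.element_type not in PREFERRED_TYPES,  # Metric / Multi-Metric first
        -element.engagement,                          # higher engagement first
        element.id,                                   # lower internal ID first
    )


elements = [
    Element(id=3, score=100, certified=False, element_type="Report", engagement=0.9),
    Element(id=2, score=100, certified=True, element_type="Report", engagement=0.1),
    Element(id=4, score=100, certified=True, element_type="Metric", engagement=0.1),
    Element(id=1, score=100, certified=True, element_type="Metric", engagement=0.1),
]
print([e.id for e in sorted(elements, key=ranking_key)])  # [1, 4, 2, 3]
```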

[6.4.2] In partial search, sublemmas are used. If several sublemmas appear in the same word, they do not add extra score to the search result. For example, in a search for "sal der", the results "salamander", "sale", and "commander" will have the same score.
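
The effect of this rule can be illustrated with a minimal sketch. It assumes that a sublemma matches a word when it occurs anywhere inside it, and it uses a placeholder score of 100 per matched word; both are assumptions made for illustration, and the function name partial_match_score is hypothetical.

```python
def partial_match_score(word, sublemmas, word_score=100.0):
    """Score a word once if any sublemma occurs in it, no matter how many do.

    Assumptions: substring matching, and a placeholder score of 100 per word.
    """
    return word_score if any(sub in word for sub in sublemmas) else 0.0


sublemmas = "sal der".split()
for word in ("salamander", "sale", "commander"):
    print(word, partial_match_score(word, sublemmas))
# salamander 100.0  (contains both "sal" and "der", counted once)
# sale 100.0
# commander 100.0
```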
