Data freshness is an essential part of data quality. Freshness refers to the degree to which your data is up to date and accurate. Stale data may be outdated, incomplete, or inaccurate. You can configure a freshness rule to generate alerts when your data fails to update at the expected frequency.
For example, imagine you expect your sales data to update every three days. If an update fails and the data misses the three-day mark, the system alerts you to the failure. You then have an opportunity to refresh or update the data before anyone uses it for decision making.
You can configure a confidence-based or a threshold-based freshness alert.
Considerations for generating freshness alerts
| Using Databricks datasource |
| Using Redshift datasource |
| Using Snowflake datasource |
| In Snowflake, you use the LAST_ALTERED column to get freshness at the table level. This supports the Add, Delete, and Update operations. You can execute a SQL query against this column to get information on freshness. The query, when executed successfully, returns the LAST_ALTERED date. This date is useful for tracking the Freshness metric. |
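One way to read the LAST_ALTERED value in Snowflake is through the INFORMATION_SCHEMA.TABLES view. The following is a minimal sketch under that assumption and is not necessarily the query the product runs. The function name `build_last_altered_query` and the example database, schema, and table names are hypothetical.

```python
import re

# Unquoted Snowflake identifiers: a letter or underscore, followed by
# letters, digits, underscores, or dollar signs.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def build_last_altered_query(database: str, schema: str, table: str) -> str:
    """Build a SQL statement that returns LAST_ALTERED for one table.

    Assumption: the objects were created with unquoted identifiers, which
    Snowflake stores in upper case, so the names are upper-cased here.
    """
    for name in (database, schema, table):
        # Reject anything that is not a plain identifier, because the
        # names are interpolated directly into the SQL text.
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    return (
        "SELECT TABLE_SCHEMA, TABLE_NAME, LAST_ALTERED\n"
        f"FROM {database.upper()}.INFORMATION_SCHEMA.TABLES\n"
        f"WHERE TABLE_SCHEMA = '{schema.upper()}'\n"
        f"  AND TABLE_NAME = '{table.upper()}'"
    )


# Hypothetical names, for illustration only.
print(build_last_altered_query("sales_db", "public", "orders"))
```

The returned statement can be run in a Snowflake worksheet or through any client. The LAST_ALTERED value it returns is the timestamp to compare against the expected update frequency.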
Configure a confidence-based alert
Set a confidence-based alert if you want to get alerts based on how certain the system is that the data is stale. You set the confidence range you want, and the Observability Adaptive Model alerts you when its certainty that the table has failed to update at the expected frequency reaches that range.
- After you select the data assets to observe on the Create Observer page, click Next.
- In the Freshness row, click the gear icon in the Configure column.
- Select the confidence level for generating an alert:
  - Drag the lower value slider and higher value slider to set a range of percentages, or
  - Enter the lower and higher values in the text boxes to set the range of percentages.
- Click Save.
For example, set the warning range to 60% to 80% to get warning alerts when the system's certainty that the table failed to update falls within that range. You get critical alerts when the certainty is above 80%, and no alerts are generated when the certainty is below 60%.
If you set both values to the same percentage, for example 80% to 80%, no alerts are generated below 80%. Alerts are generated only above 80%. If you set the range to 100% to 100%, the range reverts to the default range of 60% to 80%.
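The range behavior described above can be sketched as a small classifier. This is an illustration only, not the Observability Adaptive Model itself. The function names are hypothetical, and the handling of values that fall exactly on a boundary is an assumption.

```python
DEFAULT_RANGE = (60.0, 80.0)  # default warning range described above


def normalize_range(lower: float, upper: float) -> tuple[float, float]:
    """Validate a warning range; 100% to 100% reverts to the default."""
    if lower == 100 and upper == 100:
        return DEFAULT_RANGE
    if not 0 <= lower <= upper <= 100:
        raise ValueError("expected 0 <= lower <= upper <= 100")
    return (float(lower), float(upper))


def classify_confidence(confidence: float,
                        lower: float = 60.0,
                        upper: float = 80.0) -> str:
    """Map a failure-to-update certainty (percent) to an alert level."""
    lower, upper = normalize_range(lower, upper)
    if confidence > upper:
        return "critical"
    # Assumption: both boundaries belong to the warning band, and the
    # band is empty when lower == upper (for example, 80% to 80%).
    if lower < upper and lower <= confidence <= upper:
        return "warning"
    return "none"


print(classify_confidence(70))           # inside the 60% to 80% range
print(classify_confidence(85))           # above the range
print(classify_confidence(81, 80, 80))   # 80% to 80%: only above 80% alerts
```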
Configure a threshold-based alert
Set a threshold-based alert if you want to manually set limits to trigger alerts. You set a custom threshold to define the expected frequency of updates.
For example, if you set the Freshness Frequency to three days, alerts will be generated when the table fails to update every three days.
Use these steps as you create an Observer to configure a freshness rule with a threshold-based alert. You can also edit existing Observer rules.
- After you select the data assets to observe on the Create Observer page, click Next.
- In the Freshness row, click the gear icon in the Configure column.
- Click the Threshold-Based Alerts radio button.
- In the Freshness Frequency field, type a number or use the incremental arrows to set the expected schedule for the table update.
- Click Save.
Make sure the Observer runs at least as often as the freshness frequency. For example, if your freshness frequency is set to four days but your Observer runs only every seven days, the Observer will not detect whether the table failed to update at the expected frequency.
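The relationship between the two schedules can be checked before you save a rule. This sketch assumes that detection requires the Observer to run at least once per freshness period. The function name `observer_can_detect` is hypothetical.

```python
def observer_can_detect(freshness_days: int, observer_interval_days: int) -> bool:
    """Return True when the Observer runs often enough to notice a
    missed update within the expected frequency.

    Assumption: the Observer must run at least once per freshness period.
    """
    if freshness_days <= 0 or observer_interval_days <= 0:
        raise ValueError("both intervals must be positive")
    return observer_interval_days <= freshness_days


# The case described above: four-day frequency, Observer every seven days.
print(observer_can_detect(freshness_days=4, observer_interval_days=7))
# A daily Observer run is frequent enough for the same rule.
print(observer_can_detect(freshness_days=4, observer_interval_days=1))
```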