The Data Observability service in the Data Integrity Suite builds trust in data and analytics by using machine learning to detect anomalies and outliers, helping to ensure data reliability and support data-driven decision-making.
Key features of Data Observability include:
- Comprehensive data visibility: Provides a complete view of the data environment with automated alerts and a unified monitoring dashboard.
- Scheduled and on-demand data monitoring: Supports both scheduled and on-demand data monitoring across platforms such as Databricks, Snowflake, and Amazon Redshift.
- Automated data profiling: Automatically profiles data with semantic tagging for easier exploration and analysis.
- User-driven data exploration: Enables users to analyze data trends and statistics, enhancing insights with intuitive exploration tools.
- Advanced data insights: Uses machine learning to detect data irregularities and anomalies, with customizable criteria.
- Customizable alerting system: Offers alerts for detected anomalies with adjustable sensitivity settings.
- Enhanced data catalog integration: Allows direct access and modification of data monitoring tools from the catalog, enhancing governance and control.
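The ML-driven anomaly detection and adjustable sensitivity described above can be illustrated with a minimal statistical sketch. This is not the Data Integrity Suite's API; the function, the z-score approach, and the thresholds are all illustrative assumptions:

```python
from statistics import mean, stdev

def detect_anomalies(daily_row_counts, sensitivity=3.0):
    """Flag days whose row count deviates from the mean by more than
    `sensitivity` standard deviations. The sensitivity parameter is a
    stand-in for the product's adjustable alert sensitivity."""
    if len(daily_row_counts) < 2:
        return []
    mu = mean(daily_row_counts)
    sigma = stdev(daily_row_counts)
    return [
        (day, count)
        for day, count in enumerate(daily_row_counts)
        if sigma > 0 and abs(count - mu) / sigma > sensitivity
    ]

counts = [1000, 1012, 998, 1005, 4000, 1003]  # day 4 is an outlier
print(detect_anomalies(counts, sensitivity=1.5))  # [(4, 4000)]
```

A lower sensitivity value flags more deviations; a higher one surfaces only extreme outliers, mirroring the trade-off behind the customizable alerting described above.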
Data Observability configuration guide
By following this checklist, you can ensure continuous monitoring and profiling of your data, maintaining its quality and reliability. Start your data observability journey today to stay ahead of potential data issues.
- Identify the data source: Determine the data source you want to observe and profile.
- Establish and catalog a connection: Create a connection to the data source and register it in the catalog.
- Create an Observer or a Data Profile: Set up an Observer to monitor data or a Data Profile to analyze it, depending on your requirements.
- Choose cataloged data assets: Select the data assets you wish to observe or profile.
- Define rules: Determine the types of rules to apply.
  - Observer rules: Establish rules to detect anomalies such as freshness, volume, data drift, and schema drift.
  - Profiler rules: Establish rules to gather statistics, such as the default analysis rule.
- Choose a schedule: Decide on a schedule for running the Observer and Profiler.
- Identify alert recipients: Specify who should receive alerts when anomalies are detected.
- Run the Observer and Profiler:
  - Automatic: The Observer runs automatically.
  - Manual or scheduled: Run the Profiler manually or on a predefined schedule.
- Receive notifications and act on alerts: Receive alerts via email, review the details to understand the issue, and take the necessary actions to resolve it.
- View profiling results: Examine profiling details to understand your data.
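The observer rules (freshness, volume, schema drift) and the default-analysis-style profiling statistics from the checklist above can be sketched as plain checks over dataset metadata. This is a minimal illustration under assumed inputs (timestamps, row counts, column lists), not the product's rule engine:

```python
from datetime import datetime, timedelta

def check_freshness(last_updated, max_age_hours=24):
    """Freshness rule: data must have been updated within the window."""
    return datetime.now() - last_updated <= timedelta(hours=max_age_hours)

def check_volume(row_count, expected, tolerance=0.2):
    """Volume rule: row count must be within +/- tolerance of expected."""
    return abs(row_count - expected) <= tolerance * expected

def check_schema_drift(current_columns, baseline_columns):
    """Schema-drift rule: report columns added or removed vs. a baseline."""
    added = set(current_columns) - set(baseline_columns)
    removed = set(baseline_columns) - set(current_columns)
    return added, removed

def profile_column(values):
    """Profiler-style statistics for one column (counts, nulls, range)."""
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }

added, removed = check_schema_drift(
    ["id", "name", "email"], ["id", "name", "phone"]
)
print(added, removed)  # {'email'} {'phone'}
print(profile_column([1, None, 2, 2]))
```

In practice each rule's verdict would feed the alerting step of the checklist: a failed freshness or volume check, or a non-empty drift set, is what triggers a notification to the designated recipients.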
To fully leverage this feature, consider asking yourself the following questions:
- How can continuous data monitoring improve my current analytics process?
- What insights can I gain by understanding the dependencies within my data landscape?
- In what ways can proactive anomaly detection help mitigate risks in my data operations?
- How can focusing on frequently used data assets increase overall operational efficiency?
- What strategies can I implement to ensure that data-driven decisions are based on the most reliable information?