Dataset lineage shows the upstream sources and downstream usage of a dataset. It traces how data flows from its source to its final destination across its life cycle within an organization, which helps you maintain data quality, meet compliance requirements, and manage data effectively.
Viewing dataset lineage helps you:
- Understand dataset construction: See where the data originates and which transformations it passes through, which also supports documentation.
- Identify dependent datasets: See which datasets depend on this one, which aids impact analysis and integration planning.
- Monitor data quality issues: Trace quality problems back to their upstream root causes so they can be fixed at the source.
- Manage compliance with data regulations: Show where regulated data comes from and where it is used, which supports audits and reduces the risk of breaches.
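Conceptually, lineage is a directed graph: each dataset is a node, and each edge points from a source to a dataset built from it. The sketch below illustrates the upstream and downstream lookups behind the use cases above. It is a minimal model for illustration only, not the product's API; the dataset names and the helper functions `upstream_of` and `downstream_of` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical lineage edges: each key feeds the datasets in its list.
DOWNSTREAM: Dict[str, List[str]] = {
    "raw_orders": ["clean_orders"],
    "raw_customers": ["clean_customers"],
    "clean_orders": ["sales_summary"],
    "clean_customers": ["sales_summary"],
    "sales_summary": ["exec_dashboard"],
}


def invert(edges: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Flip every edge so that targets map back to their sources."""
    inverted: Dict[str, List[str]] = {}
    for source, targets in edges.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return inverted


def reachable(start: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Breadth-first search: every node reachable from start."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def upstream_of(dataset: str) -> Set[str]:
    """All direct and indirect sources of a dataset."""
    return reachable(dataset, invert(DOWNSTREAM))


def downstream_of(dataset: str) -> Set[str]:
    """All datasets that directly or indirectly depend on a dataset."""
    return reachable(dataset, DOWNSTREAM)
```

In this model, `upstream_of("sales_summary")` lists the datasets to check when tracing a quality issue, and `downstream_of("clean_orders")` lists the datasets affected by a change, which is the basis of impact analysis.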
View lineage for datasets
- Navigate to the dataset whose lineage you want to analyze and select it.
- Click the Lineage tab to view the lineage diagram.
- The diagram visually represents the complete lineage of the selected dataset, identified as HOME. It is displayed at a default zoom level of 100%, which you can adjust to fit your viewing preferences.
- To explore further, expand the dataset to reveal the fields it contains. By default, only five fields are visible; scroll within the dataset to load more (infinite scrolling).
- Fields are organized by two properties: name and data quality score. This lets you inspect both the fields themselves and the connections between them.
- For a broader or more detailed view, use the canvas controls ("+" or "-") to expand or collapse all fields. Alternatively, adjust the zoom level using the drop-down percentage (%) selector.
- To switch to the dataset-level lineage view, right-click the dataset to open the context menu and choose View Info.
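The field display described in the steps above can be pictured as an ordered list that is revealed five fields at a time. The sketch below is only an illustration of that behavior, not the product's implementation. The `Field` class, the `field_pages` function, the 0 to 100 score scale, and the choice to list the highest quality score first are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterator, List

PAGE_SIZE = 5  # number of fields visible by default, per the steps above


@dataclass(frozen=True)
class Field:
    name: str
    quality_score: float  # assumed 0-100 scale


def field_pages(fields: List[Field], sort_by: str = "name") -> Iterator[List[Field]]:
    """Yield fields in batches of PAGE_SIZE, ordered by the chosen property."""
    if sort_by == "name":
        ordered = sorted(fields, key=lambda f: f.name.lower())
    elif sort_by == "quality_score":
        # Assumption: the highest-scoring fields are listed first.
        ordered = sorted(fields, key=lambda f: f.quality_score, reverse=True)
    else:
        raise ValueError(f"unsupported sort property: {sort_by}")
    for start in range(0, len(ordered), PAGE_SIZE):
        yield ordered[start:start + PAGE_SIZE]


# Hypothetical fields for a single dataset.
FIELDS = [
    Field("order_id", 98.0),
    Field("customer_id", 91.5),
    Field("amount", 87.0),
    Field("currency", 99.0),
    Field("created_at", 76.0),
    Field("status", 64.5),
    Field("region", 82.0),
]
```

Each batch from `field_pages(FIELDS)` stands in for one scroll step: the first batch holds the five fields shown by default, and later batches hold the fields that load as you keep scrolling.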
Explore dataset details through the information panel in Lineage diagrams
- Click on a dataset card in the Lineage diagram to open the Information panel associated with that dataset.
- If the Information panel is not visible, click the Information Panel button to reopen it.
- Inside the Information panel, navigate to the Details tab to view the dataset's properties and a comprehensive list of related business contexts.
- To examine specific fields within the dataset, switch to the Fields tab. Here, you can view each field separately from the main diagram, select fields for detailed inspection, or explore a field's specific lineage diagram.