Dataset lineage shows the upstream sources and downstream usage of a dataset. Viewing it helps you understand the life cycle of data within an organization: you can trace how data flows from its source to its final destination, which supports data quality, compliance, and effective data management.
Viewing dataset lineage enables you to:
- Understand dataset construction: Clarifies data origins and transformations, supports documentation.
- Identify dependent datasets: Recognizes dependencies, aids in impact analysis, and enhances integration planning.
- Monitor data quality issues: Detects upstream quality issues, improves accuracy, and identifies root causes.
- Manage compliance with data regulations: Supports regulatory compliance, reduces breach risks, and builds trust.
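The ideas of upstream sources, dependent datasets, and impact analysis can be illustrated by treating lineage as a directed graph. The following is a minimal sketch, not the product's implementation: the dataset names, the `EDGES` list, and the `upstream` and `downstream` helpers are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each pair is (source dataset, derived dataset).
EDGES = [
    ("raw_orders", "clean_orders"),
    ("raw_customers", "clean_customers"),
    ("clean_orders", "order_summary"),
    ("clean_customers", "order_summary"),
    ("order_summary", "revenue_report"),
]


def build_adjacency(edges, reverse=False):
    """Map each dataset to its direct neighbors (reversed for upstream lookups)."""
    adjacency = {}
    for source, target in edges:
        if reverse:
            source, target = target, source
        adjacency.setdefault(source, []).append(target)
    return adjacency


def reachable(start, adjacency):
    """Breadth-first search: every dataset reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def upstream(dataset, edges=EDGES):
    """All datasets that feed into the given dataset, directly or indirectly."""
    return reachable(dataset, build_adjacency(edges, reverse=True))


def downstream(dataset, edges=EDGES):
    """All datasets that depend on the given dataset (impact analysis)."""
    return reachable(dataset, build_adjacency(edges))
```

In this sketch, `upstream("order_summary")` answers "where could a quality issue have come from?", while `downstream("clean_orders")` answers "what is affected if this dataset changes?".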
View lineage for datasets
- Navigate to the dataset whose lineage you want to view and select it.
- Click on the Lineage tab to view the lineage diagram.
- The diagram shows the complete lineage of the selected dataset, which is labeled HOME. It opens at a default zoom level of 100%, which you can adjust to suit your viewing preferences.
- To explore further, expand the dataset to reveal its fields. By default, only five fields are visible; use the Show More control to see the complete list.
- Fields are organized according to two properties: name and data quality score. This arrangement facilitates a detailed view of both the fields themselves and their interconnections.
- When you hover over a dataset card, a tooltip shows location information that uniquely identifies the asset by its full path, including the datasource, schema, and asset name.
- To show or hide the fields of all datasets at once, use the Expand All or Collapse All controls at the bottom of the diagram. To change the zoom level, use the drop-down percentage (%) selector.
- Right-click on a dataset card to open a context menu with options including Show Information (opens the information panel), Open (opens the asset in a new tab within the current view), and Open in New Tab (opens the asset in a separate browser tab).
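The display rules described in the steps above (five fields visible by default, a Show More control, and a full-path tooltip) can be sketched in a few lines. This is an illustrative model only, not the product's code: the `Field` class, the `/` path separator, the 0-100 score scale, and the choice to sort by name or by quality score are assumptions made for the example.

```python
from dataclasses import dataclass

# The diagram shows five fields per dataset until Show More is used.
DEFAULT_VISIBLE_FIELDS = 5


@dataclass
class Field:
    name: str
    quality_score: float  # assumed 0-100 scale, for illustration only


def location_path(datasource, schema, asset):
    """Build the full path shown in the tooltip (the "/" separator is an assumption)."""
    return "/".join([datasource, schema, asset])


def visible_fields(fields, show_more=False, sort_by="name"):
    """Return the fields a dataset card would list, under the assumed sort order."""
    if sort_by == "name":
        ordered = sorted(fields, key=lambda f: f.name.lower())
    elif sort_by == "quality_score":
        ordered = sorted(fields, key=lambda f: f.quality_score, reverse=True)
    else:
        raise ValueError(f"unsupported sort key: {sort_by}")
    # Without Show More, only the first five fields are listed.
    return ordered if show_more else ordered[:DEFAULT_VISIBLE_FIELDS]
```

For example, a dataset with seven fields would list five of them until `show_more=True` is passed, which mirrors clicking Show More in the diagram.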
Explore dataset details through the Information panel in lineage diagrams
- Click on a dataset card in the Lineage diagram to open the Information panel associated with that dataset.
- If the Information panel is not visible, click the Information Panel button to show it again.
- Inside the Information panel, navigate to the Details tab to view the dataset's properties and a comprehensive list of related business contexts.
- To examine specific fields within the dataset, switch to the Fields tab. Here, you can view each field separately from the main diagram, select fields for detailed inspection, or explore a field's specific lineage diagram.