Datasource lineage shows you where a particular datasource originates and how it flows into downstream datasets or processes. It gives you a view of how data flows and is transformed across the top-level datasources in your organization.
By viewing datasource lineage, you can:
- Understand data flow: Get a high-level view of how data moves across the major sources in your organization.
- Assess data quality impact: Identify how data quality issues affect critical assets such as reports and dashboards.
- Evaluate change impact: Analyze how changes to data at the source affect downstream business tools.
- Trace data transformation: Visualize how data is transformed so that you can maintain its integrity and consistency.
- Answer critical questions: Determine where data quality issues exist, which to prioritize for resolution, and where to focus remediation.
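The impact questions above amount to a reachability query over the lineage graph. As a minimal sketch, assuming lineage is stored as a mapping from each asset to the assets that read from it directly, a breadth-first traversal lists everything a change at a source could affect. The asset names and the `downstream_of` helper are hypothetical and are not part of the product.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that read from it.
LINEAGE = {
    "crm_db": ["customers_raw"],
    "customers_raw": ["customers_clean"],
    "customers_clean": ["churn_report", "sales_dashboard"],
    "billing_db": ["invoices"],
    "invoices": ["sales_dashboard"],
}


def downstream_of(asset, edges):
    """Return every asset reachable downstream of `asset`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


print(downstream_of("crm_db", LINEAGE))
# ['customers_raw', 'customers_clean', 'churn_report', 'sales_dashboard']
```

Reversing the edges and running the same traversal would give the upstream sources of an asset, which is the direction the lineage diagram displays for a selected datasource.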
View lineage for datasources
- Navigate to the datasource whose lineage you want to view and select it.
- Click the asset whose lineage you want to view, then go to the Lineage tab to open the lineage diagram. The diagram shows the upstream sources from which the selected datasource inherits data. You can expand the datasource to reveal its associated datasets.
- Initially, only a limited number of datasets are displayed. You can progressively show more or fewer datasets as needed.
- Sort datasets by either of the two properties in the Dataset Sort picker at the top of the diagram: Name or Data Quality Score.
- Use Expand All or Collapse All to show or hide datasets across all datasources.
- In the expanded state, every card shows the relationships for each of its datasets. Select Show more on any card to display five more items.
- Items in a card appear faded when they have no relationships.
- Click, hover over, or right-click any item in a card list to open it in the same tab or a new tab. The item opens in its Lineage view.
- To reorder the items in a card list, change the sorting from the default, 'Name', to 'Data Quality'.
The lineage diagram provides a visual map of how the datasource fits into the broader data ecosystem. It shows the flow of the datasource through the pipeline, including any datasets created from the datasource.
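The sorting and Show more behavior described in the steps above can be sketched in a few lines. This is an illustration only: the `Dataset` record, the `visible_items` helper, the initial count of five, and the descending order for scores are assumptions. Only the two sort properties and the step of five items come from the text.

```python
from dataclasses import dataclass

SHOW_MORE_STEP = 5  # "Show more" reveals five additional items (from the text)


@dataclass
class Dataset:
    name: str
    quality_score: float  # hypothetical field backing the Data Quality Score


def visible_items(datasets, sort_by="name", show_more_clicks=0, initial=5):
    """Return the datasets a card would list after sorting and paging.

    `initial` is an assumed starting count; the text says only that a
    limited number is shown at first.
    """
    if sort_by == "name":
        ordered = sorted(datasets, key=lambda d: d.name.lower())
    elif sort_by == "quality":
        # Assumption: highest score first; the product may order differently.
        ordered = sorted(datasets, key=lambda d: d.quality_score, reverse=True)
    else:
        raise ValueError(f"unknown sort property: {sort_by}")
    return ordered[: initial + show_more_clicks * SHOW_MORE_STEP]


datasets = [Dataset(f"table_{i:02d}", quality_score=100 - i) for i in range(12)]
print(len(visible_items(datasets)))                      # 5
print(len(visible_items(datasets, show_more_clicks=1)))  # 10
print(len(visible_items(datasets, show_more_clicks=2)))  # 12 (all items shown)
```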
Navigate the Information panel for datasources in lineage diagrams
- Click a card in the lineage diagram to open the Information panel for that datasource.
- Select an item in a card to show its Information panel. To reopen a closed Information panel, click the Information Panel button.
- The Details tab displays the properties of the selected datasource and its related business context.
- The Datasets tab lists all datasets and lets you select a dataset for further inspection or add datasets to the diagram.
- The Fields tab lists all fields and lets you inspect a field in detail or view the lineage diagram for a specific field.