The filtering feature reduces noise in lineage diagrams so that you can focus on relevant data elements. It preserves the context of the diagram: applying a filter does not remove any nodes. All nodes remain visible, giving a complete view of the data flow, while nodes that match the filter criteria are highlighted.
You can navigate to the Catalog from the main navigation menu and explore lineage at the following levels:
To perform filtering, follow these steps:
- On the Lineage tab for an asset, select the Filter option. A popup window appears.
- In the popup, choose an operator (is, is not, is populated, or is not populated) to define the conditions for your filter.
- Additionally, you can filter by datasource types (for example, Snowflake, Databricks, Salesforce) and Data Quality scores. Note: When you apply filters based on scoring, the filtering functionality considers both parent datasets and their child assets. Therefore, the displayed datasets may include fields with low scores, even if the overall dataset score is medium or good.
- Once you have made your selections, confirm your filter choices by clicking the Apply button.
- The results display all assets that match the criteria you set and pinpoint the Datasources, Datasets, or Fields that meet them. Note: A Datasource is deemed a match if it satisfies the criteria directly or if any of its associated datasets or fields meet the conditions. Non-matching elements are dimmed in the diagram, and counts show how many datasets match the criteria.
- To reset the filter and return to the original view, select either the Clear All option in the popup or the Reset Filters button.
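The matching behavior described in the steps above can be sketched in a few lines of Python. This is an illustrative model only, not the product's implementation: the `Node` class, the `dq_score` and `type` attribute names, and the functions `evaluate`, `matches`, and `render_states` are all hypothetical. It assumes that a parent matches when it or any descendant satisfies the condition, and that non-matching nodes are dimmed rather than removed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical lineage node: a Datasource, Dataset, or Field."""
    name: str
    attrs: dict = field(default_factory=dict)
    children: list["Node"] = field(default_factory=list)


def evaluate(value, operator, expected=None):
    """Apply one of the four operators offered in the filter popup."""
    if operator == "is":
        return value == expected
    if operator == "is not":
        return value != expected
    if operator == "is populated":
        return value not in (None, "")
    if operator == "is not populated":
        return value in (None, "")
    raise ValueError(f"unknown operator: {operator}")


def matches(node, attr, operator, expected=None):
    """A node matches directly, or because any descendant matches."""
    if evaluate(node.attrs.get(attr), operator, expected):
        return True
    return any(matches(child, attr, operator, expected) for child in node.children)


def render_states(root, attr, operator, expected=None):
    """Map every node name to 'highlighted' or 'dimmed'; no node is dropped."""
    states = {}
    stack = [root]
    while stack:
        node = stack.pop()
        hit = matches(node, attr, operator, expected)
        states[node.name] = "highlighted" if hit else "dimmed"
        stack.extend(node.children)
    return states


# Made-up example data: one datasource, two datasets, two fields.
email = Node("email", {"dq_score": "poor"})
user_id = Node("user_id", {"dq_score": "good"})
customers = Node("customers", {"dq_score": "medium"}, [email, user_id])
orders = Node("orders", {"dq_score": "good"})
warehouse = Node("warehouse", {"type": "Snowflake"}, [customers, orders])

states = render_states(warehouse, "dq_score", "is", "poor")
```

In this sketch, `customers` is highlighted even though its own score is medium, because its `email` field has a poor score. That mirrors the note about scoring filters considering both parent datasets and their child assets.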
Key points to remember:
- When filtering for poor data quality scores on a datasource, the system displays datasets with poor scores or those linked to fields with poor scores. If the displayed datasets show medium or good scores, you should drill down into the datasets to find specific fields with lower scores. This helps in thoroughly evaluating the quality across related datasets and fields.
- With multiple filters applied, the system uses AND logic, so only cards that meet all selected criteria remain highlighted.
- Filters continue to be active as you navigate, helping you concentrate on specific data elements.
- If nodes appear grayed out due to filter criteria, selecting a field for tracing will bypass the filtering constraints.
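As a rough illustration of the AND logic mentioned in the key points, the following Python sketch combines two filters and keeps only the cards that satisfy both. The card dictionaries, the attribute names, and the `passes` and `matches_all` helpers are hypothetical and exist only for this example; only the is and is not operators are modeled.

```python
# Hypothetical cards: flat attribute dictionaries, one per lineage card.
cards = [
    {"name": "sales_db", "type": "Snowflake", "dq_score": "poor"},
    {"name": "crm", "type": "Salesforce", "dq_score": "poor"},
    {"name": "lake", "type": "Snowflake", "dq_score": "good"},
]

# Each filter is (attribute, operator, expected value).
filters = [
    ("type", "is", "Snowflake"),
    ("dq_score", "is", "poor"),
]


def passes(card, attr, operator, expected):
    """Check a single filter against a single card."""
    if operator == "is":
        return card.get(attr) == expected
    if operator == "is not":
        return card.get(attr) != expected
    raise ValueError(f"unsupported operator in this sketch: {operator}")


def matches_all(card, active_filters):
    """AND logic: a card matches only if every active filter passes."""
    return all(passes(card, *f) for f in active_filters)


matching = [card["name"] for card in cards if matches_all(card, filters)]
```

Here only `sales_db` satisfies both conditions: `crm` fails the type filter and `lake` fails the score filter.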
Use search to quickly locate assets, or use trace to map their lineage. For more details, see the Search and Trace topics.