Required Snowflake roles and permissions for data quality rules and profiling

Data Integrity Suite

Precisely Data Integrity Suite product guide · en-US · © 2000–2025

This section outlines the structured process for setting up roles and permissions in Snowflake to facilitate the execution of Data Quality rules. By defining a custom role and assigning appropriate permissions, users can manage staging schemas, access the necessary databases and tables, and use designated warehouses. The steps include creating roles, granting permissions, and establishing connections, ensuring a streamlined and secure environment for data profiling and activities related to Data Quality rule execution. This guide is essential for administrators and users involved in setting up and maintaining data cataloging processes within Snowflake.

Key permissions include CREATE STAGE, CREATE FILE FORMAT, and CREATE FUNCTION, which facilitate the uploading of components, storage of intermediate data, and creation of user-defined functions to enhance processing efficiency. Temporary stages, file formats, and functions are created to support data processing but are not retained post-operation, ensuring no permanent data storage. An automated cleanup process is in place to remove all temporary objects after processing, maintaining a clean and compliant environment. This ensures efficient operations while adhering to data governance standards.

For more information on why these permissions are needed, refer to the note below the table.

1. Create role
   Create a custom role:
   create role PRECISELY_DIS_INSIGHTS;

2. Create an empty staging schema
   create schema EXAMPLE_DB.STAGING_SCHEMA;
   If the staging schema already exists, drop any previously created stage:
   drop stage if exists EXAMPLE_DB.STAGING_SCHEMA.PRECISELY_DQ_STAGE;

3. Grant staging permissions
   grant CREATE STAGE, CREATE FILE FORMAT, CREATE FUNCTION on schema EXAMPLE_DB.STAGING_SCHEMA to role PRECISELY_DIS_INSIGHTS;

4. Grant warehouse usage permission
   Note: Grant the same role usage on the warehouse that will be used for processing.
   grant usage on warehouse EXAMPLE_WH to role PRECISELY_DIS_INSIGHTS;

5. Grant database and schema access
   Note: Grant the same role usage on the database and the desired schemas, along with permission to read data from the tables.
   • grant usage on database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;
   • grant usage on all schemas in database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;
   • grant usage on future schemas in database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;
   • grant select on all tables in database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;
   • grant select on future tables in database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;

6. Assign the role to the user who performs the cataloging
   grant role PRECISELY_DIS_INSIGHTS to user PRECISELY_DIS_USER;

7. Grant user permissions for internal stages
   alter user PRECISELY_DIS_USER set PREVENT_UNLOAD_TO_INTERNAL_STAGES=FALSE;

8. Create Snowflake connection
   Establish a connection with the following details:
   User: PRECISELY_DIS_USER, Role: PRECISELY_DIS_INSIGHTS, Warehouse: EXAMPLE_WH.

9. Pipeline Engine setup
   Create a pipeline engine and specify the staging schema (STAGING_SCHEMA) as its schema.
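For convenience, the statements from steps 1–7 above can be combined into a single script. This is a sketch using the same placeholder names as the table (EXAMPLE_DB, EXAMPLE_WH, PRECISELY_DIS_USER); substitute your own database, warehouse, and user names before running it.

```sql
-- Placeholder names (EXAMPLE_DB, EXAMPLE_WH, PRECISELY_DIS_USER) as used above;
-- replace them with your own object names.
create role if not exists PRECISELY_DIS_INSIGHTS;

-- Staging schema that will hold temporary stages, file formats, and functions
create schema if not exists EXAMPLE_DB.STAGING_SCHEMA;
drop stage if exists EXAMPLE_DB.STAGING_SCHEMA.PRECISELY_DQ_STAGE;

-- Staging permissions
grant CREATE STAGE, CREATE FILE FORMAT, CREATE FUNCTION
  on schema EXAMPLE_DB.STAGING_SCHEMA to role PRECISELY_DIS_INSIGHTS;

-- Warehouse usage
grant usage on warehouse EXAMPLE_WH to role PRECISELY_DIS_INSIGHTS;

-- Database, schema, and table access (current and future objects)
grant usage on database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;
grant usage on all schemas in database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;
grant usage on future schemas in database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;
grant select on all tables in database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;
grant select on future tables in database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;

-- Assign the role to the connecting user and allow unloads to internal stages
grant role PRECISELY_DIS_INSIGHTS to user PRECISELY_DIS_USER;
alter user PRECISELY_DIS_USER set PREVENT_UNLOAD_TO_INTERNAL_STAGES = FALSE;
```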
Note:
  • For specific operations in Snowflake, a temporary staging area is set up where necessary files are uploaded to run processes. This includes creating temporary storage areas, file formats, and custom functions to facilitate quicker data processing.
  • The data is stored in Snowflake in a compressed JSON format, which includes details about records that did not meet the specified criteria. Special functions are also set up to manage this data efficiently.
  • No temporary tables are created; however, temporary stages, file formats, and user-defined functions are established for the duration of the processing tasks.
  • Data is not permanently stored in the staging objects. All data is deleted from these temporary areas after the pipeline run and the process is complete.
  • After the tasks complete, the temporary items are permanently deleted, ensuring no residual data or objects remain.
  • The cleanup process is fully automated and clears temporary items such as data files and functions after each run. The temporary staging area itself, however, is not deleted, as it is reused by pipeline runs and other tasks within the same database.
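To confirm that the setup succeeded, the granted privileges and any remaining staging objects can be inspected with Snowflake's SHOW commands. This is a sketch using the placeholder names from this section; run it as a role with sufficient privileges (for example, SECURITYADMIN).

```sql
-- List the privileges granted to the custom role
show grants to role PRECISELY_DIS_INSIGHTS;

-- Confirm the role has been assigned to the user
show grants to user PRECISELY_DIS_USER;

-- Inspect stages remaining in the staging schema after a run
show stages in schema EXAMPLE_DB.STAGING_SCHEMA;
```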