This section outlines the structured process for setting up roles and permissions in Snowflake to support the execution of Data Quality rules. By defining a custom role and assigning the appropriate permissions, users can manage staging schemas, access the necessary databases and tables, and use designated warehouses. The steps cover creating roles, granting permissions, and establishing connections, ensuring a streamlined and secure environment for data profiling and Data Quality rules execution. This guide is intended for administrators and users who set up and maintain data cataloging processes within Snowflake.
Key permissions include CREATE STAGE, CREATE FILE FORMAT, and CREATE FUNCTION, which facilitate the uploading of components, storage of intermediate data, and creation of user-defined functions to enhance processing efficiency. Temporary stages, file formats, and functions are created to support data processing but are not retained post-operation, ensuring no permanent data storage. An automated cleanup process is in place to remove all temporary objects after processing, maintaining a clean and compliant environment. This ensures efficient operations while adhering to data governance standards.
For more information on why these permissions are needed, refer to the notes below the table.
| Step | Action | Description |
|---|---|---|
| 1 | Create role | Create a custom role, for example `PRECISELY_DIS_INSIGHTS`. |
| 2 | Create an empty staging schema | `create schema EXAMPLE_DB.STAGING_SCHEMA;`<br>If the staging schema already exists, drop any previously created stage: `drop stage if exists EXAMPLE_DB.STAGING_SCHEMA.PRECISELY_DQ_STAGE;` |
| 3 | Grant staging permissions | `grant CREATE STAGE, CREATE FILE FORMAT, CREATE FUNCTION on schema EXAMPLE_DB.STAGING_SCHEMA to role PRECISELY_DIS_INSIGHTS;` |
| 4 | Grant warehouse usage permission<br>Note: Grant the same role usage on the warehouse used for processing. | `grant usage on warehouse EXAMPLE_WH to role PRECISELY_DIS_INSIGHTS;` |
| 5 | Grant database and schema access<br>Note: Grant the same role permission to use the database and the desired schemas, and to read data from the tables. | |
| 6 | Assign the role to the user who performs the cataloging | `grant role PRECISELY_DIS_INSIGHTS to user PRECISELY_DIS_USER;` |
| 7 | Grant user permissions for internal stages | `alter user PRECISELY_DIS_USER set PREVENT_UNLOAD_TO_INTERNAL_STAGES=FALSE;` |
| 8 | Create Snowflake connection | Establish a connection with the following details. User: `PRECISELY_DIS_USER`, Role: `PRECISELY_DIS_INSIGHTS`, Warehouse: `EXAMPLE_WH`. |
| 9 | Pipeline Engine setup | Create a pipeline engine and specify its schema as the staging schema, `STAGING_SCHEMA`. |
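Step 5 does not list exact statements; a minimal sketch of the grants it describes might look like the following, assuming the example database `EXAMPLE_DB`, a hypothetical schema `EXAMPLE_SCHEMA`, and the role `PRECISELY_DIS_INSIGHTS`. Adjust object names to your environment; the `future tables` grant is optional and only needed if tables are created after the grant.

```sql
-- Allow the role to see and use the database and schema (names are examples)
grant usage on database EXAMPLE_DB to role PRECISELY_DIS_INSIGHTS;
grant usage on schema EXAMPLE_DB.EXAMPLE_SCHEMA to role PRECISELY_DIS_INSIGHTS;

-- Allow the role to read data from existing tables in the schema
grant select on all tables in schema EXAMPLE_DB.EXAMPLE_SCHEMA to role PRECISELY_DIS_INSIGHTS;

-- Optional: also cover tables created later
grant select on future tables in schema EXAMPLE_DB.EXAMPLE_SCHEMA to role PRECISELY_DIS_INSIGHTS;
```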
- For specific operations in Snowflake, a temporary staging area is set up where necessary files are uploaded to run processes. This includes creating temporary storage areas, file formats, and custom functions to facilitate quicker data processing.
- The data is stored in Snowflake in a compressed JSON format, which includes details about records that did not meet the specified criteria. Special functions are also set up to manage this data efficiently.
- No temporary tables are created; however, temporary stages, file formats, and user-defined functions are established for the duration of the processing tasks.
- Data is not permanently stored in the staging objects. All data is deleted from these temporary areas after the pipeline run and the process is complete.
- After the tasks complete, the temporary data files and functions are permanently deleted, ensuring no residual data remains.
- The cleanup process is fully automated and clears these temporary items after each run. The temporary staging area itself, however, is not deleted, as it is reused by subsequent pipeline runs and other tasks within the same database.
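To illustrate the lifecycle of the temporary objects described above, the statements involved resemble the following sketch. All object names here are hypothetical; the product issues its own equivalents automatically, so these commands are for understanding only and should not need to be run by hand.

```sql
-- Stage that holds intermediate files for the run (retained across runs)
create stage if not exists EXAMPLE_DB.STAGING_SCHEMA.PRECISELY_DQ_STAGE;

-- Temporary file format for the compressed JSON output (hypothetical name)
create temporary file format EXAMPLE_DB.STAGING_SCHEMA.DQ_JSON_FORMAT
  type = json compression = gzip;

-- Automated cleanup after the run: files and formats are removed,
-- but the stage itself is kept for subsequent pipeline runs
remove @EXAMPLE_DB.STAGING_SCHEMA.PRECISELY_DQ_STAGE;
drop file format if exists EXAMPLE_DB.STAGING_SCHEMA.DQ_JSON_FORMAT;
```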