- Pipeline engine name: Enter a meaningful name for the pipeline engine.
- Type: Specifies Snowflake as the datasource type for this pipeline engine.
- Connection: Lists the available connections for the selected datasource.
- Schema: Select a schema within the Snowflake datasource where the pipeline engine has write permissions. This schema is used by the pipeline engine for reading and writing temporary staging data during processing, not just for accessing the data to be processed.
- Session query timeout in seconds (optional): Specifies the session-level Snowflake query timeout, which is the maximum amount of time a query can run before it is automatically stopped. Note: The session-level query timeout takes precedence over timeouts set at the User, Warehouse, and Account levels, so the session-level value overrides a timeout set at any of those levels.
- Enrich datasets database (optional): Specifies the name of the data share database in the Snowflake environment. Use the exact data share database name to access these datasets when running a pipeline that includes the Enrich step. Because you supply the name, you can save the data share under a name of your choice.
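The session-level timeout described above appears to correspond to Snowflake's `STATEMENT_TIMEOUT_IN_SECONDS` session parameter. The following is a minimal sketch of how that setting could be expressed and checked in SQL, not a description of what the pipeline engine runs internally. The helper names and the 604800-second upper bound (7 days) are assumptions for illustration; see the Snowflake parameter reference for the authoritative limits and for how values set at different levels interact.

```python
# Assumed upper bound for STATEMENT_TIMEOUT_IN_SECONDS (7 days); verify against Snowflake docs.
MAX_TIMEOUT_SECONDS = 604800


def session_timeout_statement(seconds: int) -> str:
    """Build the ALTER SESSION statement that sets the session-level query timeout."""
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise TypeError("timeout must be an integer number of seconds")
    if not 0 <= seconds <= MAX_TIMEOUT_SECONDS:
        raise ValueError(f"timeout must be between 0 and {MAX_TIMEOUT_SECONDS} seconds")
    return f"ALTER SESSION SET STATEMENT_TIMEOUT_IN_SECONDS = {seconds}"


def show_session_timeout_statement() -> str:
    """Build the statement that displays the timeout currently in effect for the session."""
    return "SHOW PARAMETERS LIKE 'STATEMENT_TIMEOUT_IN_SECONDS' IN SESSION"


if __name__ == "__main__":
    # Hypothetical value: stop any query that runs longer than 10 minutes.
    print(session_timeout_statement(600))
    print(show_session_timeout_statement())
```

Running the generated statements in a Snowflake worksheet before entering the value in the pipeline engine form is one way to confirm which timeout is in effect for a session.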
Before using "Enrich datasets database" for Snowflake
Note:
- To subscribe to data or to create data shares in the Data Integrity Suite workspace, contact your Precisely support representative.
- Depending on the subscribed platform, customers can send an email to the Databricks Partnership (databricks.partnership@precisely.com) or to the Snowflake Partnership (snowflake.partnership@precisely.com) to provision subscribed data.
- For more information about how to view data shares in the Databricks environment, see Read data shared using Databricks-to-Databricks Delta Sharing (Databricks documentation).
- For more information about how to view data shares in the Snowflake environment, see Data Consumers (Snowflake documentation).
- Set up data share: Ensure you have set up a data share that contains the datasets you intend to use for data enrichment. This share should include all the relevant datasets required for the Enrich step.
- Create database from a data share: If you are a first-time user, you must create a database from the data share within your workspace. To do this:
- Navigate to the workspace's section and locate the data share containing the enrich datasets.
- Select Get Data associated with the data share. This creates a database that includes the share within your workspace, making it accessible for future enrichment steps.
- Name your data share: While creating a database, you'll have the option to provide a name for the database. This name helps you identify the specific dataset collection associated with the Enrich step.
- Access the database: Once the data share database is created, you can access the datasets within your workspace's section. The datasets will be organized under the database name you provided earlier.
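The steps above use the Get Data option in the user interface. For readers who prefer SQL, a roughly equivalent flow in Snowflake can be sketched as follows. This is an illustration under stated assumptions, not the documented procedure: the helper name and every identifier in the usage example (`ENRICH_DATA`, `PROVIDER_ACCT`, `ENRICH_SHARE`, `PIPELINE_ROLE`) are hypothetical, and whether the role used by the pipeline engine needs the `GRANT IMPORTED PRIVILEGES` step depends on your setup.

```python
import re

# Plain unquoted identifiers only; quoted identifiers are out of scope for this sketch.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def _checked(name: str) -> str:
    """Return the name unchanged if it is a plain unquoted identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a plain unquoted identifier: {name!r}")
    return name


def share_database_statements(database: str, provider_account: str,
                              share: str, role: str) -> list[str]:
    """Build SQL that lists shares, creates a database from one, and grants access to a role."""
    # A provider account may be written as organization.account, so check each part.
    provider = ".".join(_checked(part) for part in provider_account.split("."))
    db = _checked(database)
    return [
        "SHOW SHARES",
        f"CREATE DATABASE {db} FROM SHARE {provider}.{_checked(share)}",
        f"GRANT IMPORTED PRIVILEGES ON DATABASE {db} TO ROLE {_checked(role)}",
    ]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    for statement in share_database_statements(
        "ENRICH_DATA", "PROVIDER_ACCT", "ENRICH_SHARE", "PIPELINE_ROLE"
    ):
        print(statement + ";")
```

The database name chosen here (`ENRICH_DATA` in the example) is the value you would then enter exactly in the Enrich datasets database field.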