Agent
A critical component of the Data Integrity Suite that enables secure communication between on-premises infrastructure and the Precisely Cloud. It facilitates data processing within the user's own environment and must be installed and configured on the user's infrastructure. Agents work alongside engines to support data operations and require regular maintenance, including periodic updates that must be downloaded and installed by users to ensure continued compatibility and performance.
Alert
Alert level
A classification system that indicates the severity of an alert, typically displayed as Warning (yellow) or Critical (red), along with a confidence percentage. It helps users prioritize and respond appropriately to data anomalies based on their impact level.
API key
API secret
Asset
A structured data object that represents a specific instance of an asset type, containing detailed rules, configurations, or information about organizational resources such as policies, technical components, or infrastructure elements. Assets are created from predefined asset type templates and serve as the actual operational entities within the Data Integrity Suite.
Broker
A Kafka server component used in Data Integration pipelines when Kafka is the target. It handles message storage and delivery, acting as a middle layer between producers (which send data) and consumers (which receive data). Identified by a hostname or IP address and a port, the broker ensures reliable message flow and is essential for replication pipelines that write to Kafka.
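For illustration only, the Python sketch below shows a producer addressing a broker by hostname and port; the broker address, topic name, and use of the kafka-python library are assumptions for this example, not part of the Suite's configuration.

    # Minimal sketch: a producer writing to a Kafka broker (hypothetical host and topic).
    from kafka import KafkaProducer  # assumes the kafka-python package is installed

    producer = KafkaProducer(bootstrap_servers="broker.example.com:9092")
    # The broker stores the message and later delivers it to any subscribed consumer.
    producer.send("replication-target-topic", b'{"op": "insert", "id": 42}')
    producer.flush()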
Cataloging
Connections
Interfaces between the Data Integrity Suite and external datasources that enable cataloging and accessing associated databases or data warehouses. These connections serve as the foundation for organizing and documenting data assets in a centralized repository for improved data management and accessibility.
Completeness
A metric that represents the percentage of complete and incomplete rows detected in profiled data. It is used to assess data quality and to sort data profiles based on how much of the data contains all required values versus missing or null values.
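As a rough illustration of the metric (a Python sketch, not the Suite's scoring logic), a row can be counted as complete when every required field has a value:

    # Hypothetical completeness calculation: percentage of rows with no missing required values.
    def completeness(rows, required_fields):
        complete = sum(
            1 for row in rows
            if all(row.get(f) not in (None, "") for f in required_fields)
        )
        return 100.0 * complete / len(rows) if rows else 0.0

    rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": None}]
    print(completeness(rows, ["id", "name"]))  # 50.0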
Consolidation rule
Continuous replication
A data integration process that facilitates real-time or near-real-time synchronization of data between source and target systems through automated pipelines. It automatically replicates updates, deletions, and insertions while maintaining data integrity and availability throughout the replication process.
Continuous replication pipeline
Data Catalog
Data Enrichment
A service that helps you enhance the value and usability of your data by adding information from many expertly curated datasets for locations in the United States of America. It adds attributes such as risk scores, demographics, or property details to improve data context and usability. It also enhances address-level data by joining it with licensed, domain-specific datasets using a unique PreciselyID.
Data Governance
A service that allows you to define, track, and manage your data assets within the Data Integrity Suite. It provides a flexible set of repeatable, scalable strategies and technologies that keep important data assets in compliance with organizational policies and government regulations, and it includes a comprehensive framework of policies, processes, responsibilities, and tools for managing data access, security, quality, and consistency across an organization.
Data Integration
Data Integrity Suite
Data Observability
Data profile
A configuration within the Data Observability service that analyzes data quality across sources by scanning for completeness, validity, and correctness issues. It generates statistics and scores for selected data assets to help identify and address data quality problems.
Data Quality
Data Quality pipelines
Data sample storage
A dedicated space within the Data Integrity Suite where sample datasets are securely stored for testing, validation, and building data quality pipelines. It can be configured to use either Precisely Cloud storage or an AWS S3 bucket.
Data volume
A Data Observability rule in which the number of rows in a dataset is used as a metric to monitor unexpected changes, such as additions or deletions of data records.
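A simplified Python sketch of such a check is shown below; the 10% tolerance and the comparison logic are illustrative assumptions, not the rule's actual configuration.

    # Hypothetical data volume rule: flag row counts that deviate beyond a tolerance.
    def data_volume_alert(previous_count, current_count, tolerance=0.10):
        if previous_count == 0:
            return current_count != 0
        change = abs(current_count - previous_count) / previous_count
        return change > tolerance

    print(data_volume_alert(10_000, 8_500))  # True: a 15% drop exceeds the 10% tolerance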
Dataset entities
Defined structures within a Data Quality pipeline that represent meaningful objects, such as a Location composed of multiple related fields, and include configurable field mappings used primarily in record matching operations.
Datasource
A defined reference to an external data system such as a database, file store, cloud service, or API. It is typically the first configuration step and identifies what data the Data Integrity Suite will work with. Each datasource is linked to a connection, which defines how the suite accesses the data. This structure supports flexible integration and reuse across services.
Datastore
An entity defined using the DATASTORE command that must be associated with one or more descriptions that specify its record structure and layout.
Diagnostic bundle
Distribution key
A key mechanism used in the Data Integration service to determine how data is distributed across different nodes or partitions. It ensures efficient data placement and retrieval in distributed database environments.
Drift
A significant change or deviation in data characteristics, structure, or patterns that triggers alerts in data monitoring systems. It encompasses changes in data values beyond specified ranges (data drift) or modifications to database schema elements like tables and columns (schema drift).
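The two kinds of drift can be pictured with a simplified Python sketch (illustrative only; the monitoring system's detection logic is more involved):

    # Data drift: values falling outside an expected range.
    def data_drift(values, lower, upper):
        return [v for v in values if not (lower <= v <= upper)]

    # Schema drift: columns added to or removed from a table.
    def schema_drift(expected_columns, observed_columns):
        expected, observed = set(expected_columns), set(observed_columns)
        return {"added": observed - expected, "removed": expected - observed}

    print(data_drift([10, 55, 300], lower=0, upper=100))          # [300]
    print(schema_drift(["id", "name"], ["id", "name", "email"]))  # {'added': {'email'}, 'removed': set()}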
End index
A parameter that specifies the last position in a string for the Get Substring and Replace Between actions; the position count starts at 1. If the end index exceeds the string length, the operation extends to the end of the string.
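The 1-based positions and end-of-string behavior can be approximated as follows (a Python sketch, not the Suite's implementation):

    # Approximation of Get Substring semantics: 1-based start and end positions,
    # with an end index past the string length clamped to the end of the string.
    def get_substring(text, start_index, end_index):
        end_index = min(end_index, len(text))
        return text[start_index - 1:end_index]

    print(get_substring("Precisely", 1, 4))    # "Prec"
    print(get_substring("Precisely", 6, 999))  # "sely" - end index beyond the length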
Enrich
A transformation step in a Data Quality pipeline that enhances existing records by adding information from external sources or reference datasets, such as location-based risk data, weather insights, or other attributes. This process makes data more valuable and useful for analysis and decision-making.
Enrichment pipeline
Exact match
A matching algorithm used in the Match and Group transformation step that determines whether two text strings are identical, including case, returning a score of 100 for perfect matches and 0 for non-matches. It compares fields character by character to identify records with precisely matching values.
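The scoring described above reduces to a case-sensitive string comparison, as in this illustrative Python sketch:

    # Illustrative exact-match scoring: 100 for identical strings (case-sensitive), 0 otherwise.
    def exact_match_score(a, b):
        return 100 if a == b else 0

    print(exact_match_score("Main St", "Main St"))  # 100
    print(exact_match_score("Main St", "main st"))  # 0 - case differs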
Field mapping
Freshness frequency
The expected update frequency for a data table, used in threshold-based freshness alerts to trigger notifications when the table is not updated within the defined time interval.
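A threshold-based freshness check can be sketched as follows (illustrative Python, assuming the table's last update time is known):

    # Hypothetical freshness check: alert when the table has not been updated
    # within the expected interval.
    from datetime import datetime, timedelta, timezone

    def is_stale(last_updated, freshness_frequency):
        return datetime.now(timezone.utc) - last_updated > freshness_frequency

    last_updated = datetime.now(timezone.utc) - timedelta(hours=30)
    print(is_stale(last_updated, timedelta(hours=24)))  # True - exceeds the 24-hour frequency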
Geo Addressing
A comprehensive solution for address data management that provides global address verification and geocoding. It ensures address data is accurate, complete, and ready for use in a wide range of applications, improving data accuracy, operational efficiency, and decision-making.
Geocode Address
A step in a Data Quality pipeline that converts address information into geographic coordinates (latitude and longitude) using geocoding services. It requires a data subscription and returns coordinates in decimal degrees to 7 decimal places for precise location mapping.
Group condition
A set of conditions that are combined using the AND operator to create the overall condition for execution.
Mainframe replication
Mainframe replication pipeline
Metabase
Metadata
Essential information about a dataset that describes its structure and characteristics, such as dataset names, field names, and field types. Metadata enables organizations to understand their data assets and establish a standard data architecture across different modules without disclosing or storing the actual data.
Noise
Observation
Observer
Observer rule
Parsing
Pipeline editor
An editor that allows users to create, modify, and manage data transformation pipelines by adding, editing, and configuring transformation steps. It provides a visual environment with multiple panes for viewing sample data, transformation steps, and editing options for pipeline configuration.
Pipeline engine
A processing engine that provides the computational resources and processing capabilities necessary for running Data Quality pipelines. It is created based on a connection and supports various connection types, such as Databricks, Precisely Agent, and Snowflake.
Popularity
A metric that shows the number of times a table or column is used in database queries within a specified time period, typically the last 24 hours. It includes both manual queries and queries run during Profile or Observer runs, and is available for datasources like Snowflake, Redshift, and BigQuery.
Precisely Cloud
A cloud-based platform that provides data integrity services and requires secure communication with on-premises environments through installed agents. It serves as the hosting environment for Precisely's Data Integrity Suite services and components.
Registration key
A unique key generated in the Precisely Data Integrity Suite and used during Agent installation. It authenticates the Agent with the Suite, allowing secure registration and communication between the on-premises environment and the Precisely Cloud.
Replication designer role
Replication engine
The core component that performs data replication from a source system to a target system. It works with agents that manage communication and data transfer, while the engine handles the actual replication. Used across Data Integrity Suite services such as Data Integration, it ensures real-time synchronization, data consistency, and automation, reducing manual effort and improving reliability.
Replication environment
Replication operator role
A user role that allows users to start and stop quality and replication pipelines within the Data Integrity Suite. This role is assigned to users in the Operators and Replication Operators user groups to provide operational control over replication processes.
Replication options
Replication pipeline
A collection of mappings that define the relationship between source and target data, used to transfer data between systems either continuously, on schedule, or for mainframe-specific processing. It can move source data to a target database in bulk or initiate data capture and replication processes for efficient data management across different systems.
Replication user ID
Runtime server
A server component in the Data Integrity Suite that hosts and executes agents and engines, identified by a host name and port number. It provides the runtime environment where these components operate and can be managed through APIs for operations such as starting, stopping, and retrieving details.
Sample data
A subset of records extracted from a larger dataset, used for testing, validation, and quality assurance purposes in data processing workflows. Sample data can be uploaded from external files or generated directly from cataloged datasets, and is stored securely in encrypted format for analysis and pipeline creation.
Schema registry
Security policy
A rule-based configuration that specifies who may access which data assets within a workspace, under what conditions, and what actions are permitted or denied. It expresses access control by linking subjects (user groups, individual users, and roles) to resources and operations, and may include constraints such as time, location, or device context. Security policies can be deployed as default (predefined templates automatically created to cover common scenarios) or custom (created by users from scratch to meet specific requirements).
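A much-simplified Python sketch of the subject/resource/operation structure is shown below; the field names and evaluation logic are illustrative assumptions, not the Suite's policy model.

    # Simplified rule-based access policy: subjects, resources, and permitted operations.
    from dataclasses import dataclass

    @dataclass
    class SecurityPolicy:
        subjects: set    # user groups, individual users, or roles
        resources: set   # data assets the policy covers
        operations: set  # permitted actions, e.g. {"read", "write"}

    def is_allowed(policy, subject, resource, operation):
        return (subject in policy.subjects
                and resource in policy.resources
                and operation in policy.operations)

    policy = SecurityPolicy({"data-stewards"}, {"customer_table"}, {"read"})
    print(is_allowed(policy, "data-stewards", "customer_table", "read"))   # True
    print(is_allowed(policy, "data-stewards", "customer_table", "write"))  # False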
Source dataset
A collection of data within a datasource that serves as the input to a pipeline, whether in Data Quality or Data Integration, and must conform to the pipeline’s input schema for successful execution.
Source server
A server that hosts the original data or resources to be accessed, replicated, or processed by another system or application. In a pipeline, it acts as the origin from which data is extracted.
Spatial Analytics
Start index
A parameter that specifies the starting position in a string where the replacement should begin; the position count starts at 1. It marks the beginning of the substring that will be replaced with new characters. This parameter applies only to the Replace Between action in the Cleanse Data transformation step.
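The Replace Between behavior can be approximated as follows (a Python sketch under the 1-based indexing described above):

    # Approximation of Replace Between: replace the characters from the 1-based
    # start index through the end index with the new text.
    def replace_between(text, start_index, end_index, replacement):
        end_index = min(end_index, len(text))
        return text[:start_index - 1] + replacement + text[end_index:]

    print(replace_between("555-0100", 1, 3, "800"))  # "800-0100"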
Substitution variable
A variable that is referenced in pipeline scripts using the syntax %(<parm_name>) and can store string or encrypted string values for use within scripts. These variables enable dynamic configuration and parameterization of pipeline execution by allowing values to be defined once and referenced throughout the pipeline scripts.
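For illustration, a resolver for the %(<parm_name>) syntax might look like the Python sketch below; the resolver and the sample script are assumptions for this example, not product behavior.

    # Illustrative substitution: expand %(name) references in a script with defined values.
    import re

    def substitute(script, variables):
        return re.sub(r"%\((\w+)\)", lambda m: variables[m.group(1)], script)

    script = "COPY FROM %(source_table) TO %(target_table);"
    print(substitute(script, {"source_table": "orders", "target_table": "orders_replica"}))
    # COPY FROM orders TO orders_replica;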
Tablespace
A logical storage unit in an Oracle database that groups related data files together and provides a way to organize and manage database storage. It serves as a container for database objects such as tables and indexes, with each tablespace having associated data files that physically store the data on the database server.
Target dataset
A dataset that serves as the destination for processed data output from a Data Quality pipeline run configuration. It must have a schema that matches the pipeline output schema for the job to execute successfully.
Target field name
The new name assigned to a column or field during a data replication process, such as when renaming fields in a pipeline or mapping fields between datasets.