RRN mapping assigns a unique identifier to each record in the source system. These identifiers, called Record-Reference Numbers (RRNs), are used to identify the records that were updated, inserted, or deleted since the last replication cycle. This makes change detection efficient and allows the changes to be propagated to the target dataset during replication.
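The idea of RRN-based change detection can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the product implements it: the function name `diff_by_rrn`, the snapshot dictionaries, and the sample rows are all hypothetical, and a real log-based pipeline would read changes from the journal rather than compare snapshots.

```python
from typing import Dict, Set, Tuple

# Hypothetical snapshot shape: RRN -> row values.
Snapshot = Dict[int, tuple]


def diff_by_rrn(previous: Snapshot, current: Snapshot) -> Tuple[Set[int], Set[int], Set[int]]:
    """Classify records as inserted, updated, or deleted by comparing RRNs."""
    prev_rrns = set(previous)
    curr_rrns = set(current)

    inserted = curr_rrns - prev_rrns   # RRNs that exist only in the new snapshot
    deleted = prev_rrns - curr_rrns    # RRNs that disappeared since the last cycle
    # RRNs present in both snapshots whose row values differ
    updated = {rrn for rrn in prev_rrns & curr_rrns if previous[rrn] != current[rrn]}
    return inserted, updated, deleted


# Illustrative data only.
previous = {1: ("Alice", 100), 2: ("Bob", 200), 3: ("Cara", 300)}
current = {1: ("Alice", 100), 2: ("Bob", 250), 4: ("Dan", 400)}

inserted, updated, deleted = diff_by_rrn(previous, current)
print("inserted:", inserted, "updated:", updated, "deleted:", deleted)
```

Because each record is keyed by its RRN, only the keys and the changed rows need to be examined, which is what keeps the detection step cheap.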
To configure RRN mapping in a continuous replication pipeline:
- Select Synchronize when setting up your replication pipeline.
- After selecting the source and target connections, provide the required information for the following options in the Staging section:
  - Service account to use: Choose the option 'Service account specified in the connection' from the drop-down.
  - Google cloud storage directory: Specify the folder path in Google Cloud Storage where data will be stored or retrieved.
  - Staging Schema: Define the temporary storage area used to organize and structure the data during processing.
- When mapping fields, you have the option to perform custom mapping. Ensure that the primary key is defined on the Target table, not on the Source table. The Target table primary key should use RRN_Field_Data, with its type set to Numeric.
- On the summary page, click Validate Configuration.
In the log reader configuration, the Replicate clear physical file member field should be set to Yes if the pipeline uses the RRN field as the distribution key for one or more datasets.
If the Replicate clear physical file member field is set to No in this case, a validation warning is shown. It indicates that if you proceed and a CLRPFM is performed, the data may become inconsistent and may need to be resynchronized.
Warning: The following problems were found while validating configuration changes.
This warning message is displayed when a project configuration has validation issues but can still be committed.
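The validation rule described above can be sketched as a small check. This is a minimal illustration under stated assumptions, not the product's actual validator: the `Dataset` and `PipelineConfig` classes, their field names, and the wording of the detail message are hypothetical. Only the header text is taken from the warning quoted above.

```python
from dataclasses import dataclass, field
from typing import List

# Header text as quoted in the documentation above.
WARNING_HEADER = (
    "Warning: The following problems were found while validating configuration changes."
)


@dataclass
class Dataset:
    name: str
    rrn_is_distribution_key: bool  # hypothetical flag: RRN field used as distribution key


@dataclass
class PipelineConfig:
    replicate_clear_pfm: bool  # hypothetical name for "Replicate clear physical file member"
    datasets: List[Dataset] = field(default_factory=list)


def validate(config: PipelineConfig) -> List[str]:
    """Return warning lines; an empty list means no issues were found."""
    rrn_datasets = [d.name for d in config.datasets if d.rrn_is_distribution_key]
    if rrn_datasets and not config.replicate_clear_pfm:
        detail = (
            "Replicate clear physical file member is set to No, but the RRN field is the "
            "distribution key for: " + ", ".join(rrn_datasets) + ". If a CLRPFM is performed, "
            "the data may be inconsistent and may need to be resynchronized."
        )
        return [WARNING_HEADER, detail]
    return []


# Warnings do not block the commit; they are only reported.
config = PipelineConfig(
    replicate_clear_pfm=False,
    datasets=[Dataset("ORDERS", True), Dataset("ITEMS", False)],
)
for line in validate(config):
    print(line)
```

Returning warnings as a list rather than raising an exception mirrors the behavior described above: the configuration has validation issues but can still be committed.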