The Verify Address step parses, standardizes, verifies, cleanses, and formats address data against reference postal data. The step requires a data subscription to process addresses. You can add the step to a pipeline without a data subscription entitlement, but the pipeline will not run on a runtime engine without a subscription entitlement for the workspace. Separate entitlements are required for different countries.
- To use the Verify Address step in Data Quality pipelines with a connected Snowpark data source, you must set up an environment for the Snowpark data source that defines verify external functions in Snowflake.
- The name of the external function for the Verify operator must be precisely_verify.
- When the Verify operator is added to a pipeline, address data will be sent to the Precisely Cloud through the Verify external function.
- The Verify Address operator currently supports only the string data type for user-defined functions in Snowflake.
- Related Information: UDF in Snowflake for Address Verification and Geocoding in Data Quality
Key Features
- Cleansing: Fixes typos in street names or removes non-existent street or company names.
- Supplementation: Adds missing address components, such as state or postal code.
- Standardization: Parses and formats addresses (e.g., placing the house number before or after the street name, normalizing abbreviations).
During address matching and standardization, the address components are compared to a
country’s reference database (where available). Any information not used for
matching is considered dropped address information. If a match is found, the address
is standardized accordingly. You may choose to retain the original data. The step
adds a confidence column labeled address-set-name_Confidence, which
shows the confidence level (0–100) that the address is valid. A confidence level
below 100 can be used to flag records for manual review. A confidence value of -1
indicates that the Verify Address step could not process the address.
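The confidence rules above can be sketched as a small triage function. This is a minimal illustration, not part of the product: the function name, the label strings, the `review_below` parameter, and the `Home` address set name are all assumptions made for the example.

```python
def triage_by_confidence(confidence: int, review_below: int = 100) -> str:
    """Classify one record by its Verify Address confidence value.

    -1 means the step could not process the address; values from 0 to 100
    are confidence levels. The label names are hypothetical.
    """
    if confidence == -1:
        return "not_processed"
    if not 0 <= confidence <= 100:
        raise ValueError(f"confidence out of range: {confidence}")
    # Anything below the threshold is flagged for manual review.
    return "manual_review" if confidence < review_below else "accepted"


# Hypothetical rows; "Home" stands in for the address set name.
rows = [
    {"id": 1, "Home_Confidence": 100},
    {"id": 2, "Home_Confidence": 87},
    {"id": 3, "Home_Confidence": -1},
]
labels = {row["id"]: triage_by_confidence(row["Home_Confidence"]) for row in rows}
print(labels)
# {1: 'accepted', 2: 'manual_review', 3: 'not_processed'}
```

Lowering `review_below` narrows the set of records sent to manual review; the right threshold depends on how much risk a given dataset can tolerate.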
Configure Step Properties and Output Field Settings
Configure the step properties and output field settings on two tabs:
- Step Properties Tab: Specifies the behavior and output of the Verify Address step.
- Output Configuration Tab: Defines the output schema for the step.
Step Properties
Verified, cleansed, and standardized address field values replace the original
values. The output columns are labeled by the original input-field names. If schema
names are unmapped, the columns are prefixed with the address set name (e.g.,
address-set-name_SchemaName). Generated columns are populated
with reference database data where possible. A confidence column is added to the
output dataset, showing the confidence level for each verified record or -1 when a
record cannot be verified. Here are the key verify fields:
| Field | Description |
|---|---|
| Step Name | Defines the name for the step. Use a meaningful name for easier identification. |
| Subscription | Data processing is restricted to specifically subscribed regions. Select one or more available regions from the drop-down menu. The regions are divided into AMER, EMEA, and APAC, and each region is divided into three tiers. |
| Default Country Code | Specifies the default country if no country is provided in the input data. The runtime engine will only run the pipeline if a subscription entitlement exists for the country dataset. |
| Preserve Input Field Values | Select this option to retain input field values. Retained values are labeled by the input field name prefixed by the address set name and Input (e.g., address-set-name_Input_input-field-name). |
Mapping address sets
An address set schema contains standard address field types. Here are the key schema fields:
| Field | Description | Example |
|---|---|---|
| FirmName | Organization, place, or building | United Nations Headquarters |
| AddressLine1 | Street portion of the address | 10 Downing Street |
| AddressLine2 | Unit, suite, or apartment | Apt 534 |
| City | City or town name | London |
| CitySubdivision | Neighborhood, borough, or other subdivisions | Brooklyn |
| StateProvince | Primary division of the country | New York |
| StateProvinceSubdivision | Secondary division, such as a county | Nassau County |
| PostalCode | Postal code of the address | 10032-3725 |
| Country | Country name or ISO code | USA or GBR |
Output configuration
Selections on the Output Configuration tab define the output schema for the Verify Address step. All fields are selected by default. You can clear any fields you do not want to include in the step output. Address verification outputs verified and standardized address components to the mapped fields, replacing the original values. For unmapped schema names, address verification generates columns and populates fields where possible from the reference database for a country. Output fields are left empty when no value can be created in a column.
- Schema names mapped to input fields: Address verification outputs verified and standardized values to columns with the original mapped input field names: input_field_name
- Schema names that are not mapped to an input field: Address verification generates columns and populates them where possible from the reference dataset for a country. Generated column names include the schema name prefixed by the address set name and the underscore character: address-set-name_SchemaName
- When the Preserve input field values checkbox is selected, the original input values from each mapped column are preserved in columns labeled by the address set name, the _Input_ moniker, and the input field name: address-set-name_Input_input-field-name
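The naming rules above can be sketched as a helper that predicts output column names. This is only an illustration under assumptions: the function name, the `Home` address set, the input field names, and the column ordering are hypothetical, and the step itself may order columns differently.

```python
from typing import Dict, List


def output_columns(
    address_set: str,
    mapping: Dict[str, str],
    schema_names: List[str],
    preserve_input: bool = False,
) -> List[str]:
    """Predict output column names for one address set.

    mapping: schema name -> input field name, for mapped schema names only.
    schema_names: the schema names selected for output.
    """
    columns: List[str] = []
    for schema_name in schema_names:
        input_field = mapping.get(schema_name)
        if input_field is not None:
            # Mapped: the verified value keeps the original input field name.
            columns.append(input_field)
            if preserve_input:
                # The original value is kept under <set>_Input_<input field>.
                columns.append(f"{address_set}_Input_{input_field}")
        else:
            # Unmapped: a generated column named <set>_<SchemaName>.
            columns.append(f"{address_set}_{schema_name}")
    return columns


cols = output_columns(
    "Home",
    {"AddressLine1": "street", "City": "town"},
    ["AddressLine1", "City", "PostalCode"],
    preserve_input=True,
)
print(cols)
# ['street', 'Home_Input_street', 'town', 'Home_Input_town', 'Home_PostalCode']
```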
Unmapped schema names: For any schema name that is not mapped to an input field, an output field is generated and labeled with the address set name and schema name.
| Field | Description |
|---|---|
| address-set-name_AddressLine1 | Contains the first line of a standardized address. This may include PO Box, unit number, unit type, house number, and street name. |
| address-set-name_AddressLine2 | Contains the second line of a standardized address. For US addresses, typically city or town, state, and postal code. |
| address-set-name_City | Contains the city or town name. |
| address-set-name_CitySubdivision | Contains the neighborhood, borough, or other subdivision (if available). |
| address-set-name_StateProvince | Contains the state, province, department, or similar primary subdivision of a country. |
| address-set-name_StateProvinceSubdivision | Contains the county, district, municipality, or similar secondary subdivision of a country (if available). |
| address-set-name_PostalCode | Contains the postal code associated with an address. |
| address-set-name_VerifyLevel | For each address set, address verification adds several fields that characterize the address verification. This value specifies the level at which an address has been verified. |
| address-set-name_Confidence | The level of confidence assigned to the address being returned. Range is from 0 to 100. Zero indicates that no match was found. 100 indicates a very high level of confidence that the match results are correct. The step returns -1 when it is unable to perform verification. |
| address-set-name_PrecisionCode | The precision code is a string that describes the precision of the address match for the input address. |
| address-set-name_PreciselyID | PreciselyID is a unique identifier. It can serve as a lookup key to add attributes to an address from Precisely Enrichment datasets. |
Precision code
The precision code is a string that describes the precision of the address match for the input address. The following tables list the precision codes by match category.
Postal Code Match (Z-category): Matches in the Z category indicate that a match was made at the postal code level.
| Code | Description |
|---|---|
| Z1 | Match to ZIP Code™ or postal code 1. |
| Z2 | Match to ZIP + 2 or partial match to postal code 2. |
| Z3 | Match to ZIP + 4® or postal code 2. |
Area Name Match (G-category): Matches in the G category indicate that the record was matched to an area name.
| Code | Description |
|---|---|
| G1 | Match to state/province (area name 1). |
| G2 | Match to country/region (area name 2). |
| G3 | Match to city/town (area name 3). |
| G4 | Match to suburb/village (area name 4). |
PO Box Match (B-category): Matches in the B category indicate that the record was matched to a PO Box.
| Code | Description |
|---|---|
| B1 | Matched to an unvalidated PO Box. |
| B2 | Matched to a validated PO Box. |
Single Address Match (S-category): Matches in the S category indicate that the record was matched to a single address candidate.
| Code | Description |
|---|---|
| S0 | Single match, coordinates unavailable. |
| S1 | Single match to a ZIP Code™ or postal code 1 level. |
| S2 | Single match to a ZIP + 2 or partial match to postal code 2 level. |
| S3 | Single match to a ZIP + 4® or postal code 2 level. |
| S4 | Single match at the street level. |
| S5 | Single match to the street address. |
| S6 | Single match to a point located at a ZIP centroid. |
| S7 | Single match to a street address that was interpolated between houses. |
| S8 | Single match to the street address or house number. |
| SC | Single match at the house-level projected from the nearest segment. |
| SG | Single match with point at the center of a locality (areaName3) or Locality level geocode derived from topographic feature. (Australia addresses only.) |
| SL | Single match to a sublocality (block or sector) street level match. (India addresses only.) |
| SX | Single match to a point located at a street intersection. |
Street Matched Precision Codes: For S (street matched) precision codes, additional characters describe how closely the address matches an address in the database. The characters appear in the order shown.
| Character | Description |
|---|---|
| H | House number match. |
| P | Street prefix (pre-directional). |
| N | Street name match. |
| T | Street/thoroughfare type match. |
| S | Street suffix (post-directional). |
| C | City or town name. |
| Z | Postal code match. |
| A | Addressing dataset match. |
| U | Custom user dictionary match. |
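As a rough sketch, a precision code can be split into its two-character match code and, for S codes, the trailing match characters from the table above. The function name and return shape are hypothetical. The sketch also assumes that any trailing character not listed in the table (for example a hyphen in a position that did not match) is a placeholder and can be ignored; that placeholder convention is an assumption, not something stated here.

```python
CATEGORIES = {
    "Z": "Postal code match",
    "G": "Area name match",
    "B": "PO Box match",
    "S": "Single address match",
}

# Match characters that may follow an S code, in the order shown above.
MATCH_CHARACTERS = {
    "H": "House number",
    "P": "Street prefix (pre-directional)",
    "N": "Street name",
    "T": "Street/thoroughfare type",
    "S": "Street suffix (post-directional)",
    "C": "City or town name",
    "Z": "Postal code",
    "A": "Addressing dataset",
    "U": "Custom user dictionary",
}


def describe_precision_code(code: str) -> dict:
    """Split a precision code into match code, category, and matched parts."""
    code = code.strip()
    if len(code) < 2 or code[0] not in CATEGORIES:
        raise ValueError(f"unrecognized precision code: {code!r}")
    base = code[:2]
    matched = []
    if base[0] == "S":
        # Characters not in the table are treated as placeholders (assumption).
        matched = [MATCH_CHARACTERS[ch] for ch in code[2:] if ch in MATCH_CHARACTERS]
    return {"code": base, "category": CATEGORIES[base[0]], "matched": matched}


print(describe_precision_code("Z1"))
# {'code': 'Z1', 'category': 'Postal code match', 'matched': []}
print(describe_precision_code("S5H-N---Z-")["matched"])
# ['House number', 'Street name', 'Postal code']
```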