Identify country

Data Integrity Suite

Type: Addressing step

The Identify Country step uses available address field values to identify the country.

Step Properties

Step name: Defines the name for a step. Provide a meaningful name so that anyone who edits steps in a pipeline will be able to identify the purpose of a step.

Country Output Format: Specifies the format used to display country names. The available options are ISO2, ISO3, and Country Name.
  • ISO2: represents countries by their two-letter ISO country codes. For example:

    United States: US
    Canada: CA
    United Kingdom: GB

  • ISO3: represents countries by their three-letter ISO country codes. For example:

    United States: USA
    Canada: CAN
    United Kingdom: GBR

  • Country Name: displays the full name of the country in a human-readable form. For example:

    US: United States
    CA: Canada
    GB: United Kingdom
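
The relationship between the three Country Output Format options can be sketched in a few lines of Python. This is only an illustration: the lookup table contains just the three countries used as examples above, and the `format_country` helper is a hypothetical name, not part of the product.

```python
# Illustrative lookup table built only from the examples in this section.
# It is an assumption for demonstration, not the product's internal data.
COUNTRIES = [
    {"ISO2": "US", "ISO3": "USA", "Country Name": "United States"},
    {"ISO2": "CA", "ISO3": "CAN", "Country Name": "Canada"},
    {"ISO2": "GB", "ISO3": "GBR", "Country Name": "United Kingdom"},
]

OUTPUT_FORMATS = ("ISO2", "ISO3", "Country Name")


def format_country(iso2, output_format):
    """Return the country identified by its two-letter code in the requested format."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"Unknown output format: {output_format}")
    for entry in COUNTRIES:
        if entry["ISO2"] == iso2:
            return entry[output_format]
    return None  # country is not in the illustrative table


print(format_country("GB", "ISO3"))
print(format_country("US", "Country Name"))
```

Whichever option is selected, the same country is identified; only its representation in the output changes.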

Enhance with AI: In some cases, Geocode SDK may fail to identify the country for an address because the address information is incomplete or lacks sufficient detail. In such cases, the output fields for those addresses remain blank. To resolve this, select the Enhance with AI checkbox, which enables AI to identify the country based on the available address information. When you choose this option, an additional output column (Address_set_1_Source, Address_set_2_Source, …) is displayed, indicating the source of identification (AI or Geocode SDK, as applicable).

Note: To access the AI functionality in Data Quality pipelines, you must first enable it at the workspace level under the AI tab.
Warning: To run a data quality pipeline on Snowflake that uses the Identify Country step, you must create an external user-defined function (UDF). See Create User Defined Functions (UDFs) in Snowflake.
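
The Enhance with AI behavior described above amounts to a fallback: the AI is consulted only when the primary lookup returns nothing, and the Source column records which one produced the result. The sketch below illustrates that flow under stated assumptions. The `identify_country` function and both stub lookups are hypothetical stand-ins, not the product's API, and leaving Source blank when neither lookup succeeds is also an assumption.

```python
def identify_country(record, geocode_lookup, ai_lookup=None):
    """Return (country, source) for one address record.

    geocode_lookup and ai_lookup are hypothetical callables that take a
    record and return a country code, or an empty string when they cannot
    identify one. ai_lookup=None models Enhance with AI being unchecked.
    """
    country = geocode_lookup(record)
    if country:
        return country, "Geocode SDK"
    if ai_lookup is not None:
        country = ai_lookup(record)
        if country:
            return country, "AI"
    # Assumption: both values stay blank when no country is identified.
    return "", ""


# Stub lookups for demonstration only; real lookups would call a service.
def stub_geocode(record):
    return "GB" if record.get("PostalCode") else ""


def stub_ai(record):
    return "GB" if "Downing" in record.get("AddressLine1", "") else ""


sparse = {"AddressLine1": "10 Downing Street"}
print(identify_country(sparse, stub_geocode))           # Enhance with AI off
print(identify_country(sparse, stub_geocode, stub_ai))  # Enhance with AI on
```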

Address_set_1, Address_set_2, …: One or more address sets map schema fields to input dataset fields. When an input dataset contains more than one set of address fields (such as home address, business address, corporate headquarters, and so forth), the address set schema is mapped to each set. The Schema column specifies schema names for standard address fields. The Map input field column maps input dataset fields to the Schema names.

To map a dataset field to a schema name, click the entry in the Map input field column that corresponds to the name in the Schema column. When there is no input field corresponding to a schema name, leave the Map input field value set to none for that schema name.

An address set schema contains the following standard address field types.
  • FirmName: Specifies an organization name, place, or building. Examples:
    • PRIME MINISTER & FIRST LORD OF THE TREASURY
    • ARAMARK Ltd.
    • UNITED NATIONS HEADQUARTERS
    • CHRYSLER BUILDING
  • AddressLine1: The street portion of the address, including the directional and street suffix, formatted to the country's standard. If no other address field is populated, then the AddressLine1 entry is treated as a single line input. Single line input can consist of multiple input address fields, such as 10 Downing St, London SW1A 2AB, GBR. This field is required. Examples:
    • 10 Downing Street
    • 630 W 168th St
    • PO Box 3554
  • AddressLine2: The unit, suite or apartment portion of an address that specifically locates an addressee at a street address. This is typically optional. Examples:
    • Suite 214
    • Apt 534
    • Maildrop A25
  • City: Specifies a city or town name. Examples:
    • London
    • New York
  • CitySubdivision: Specifies a city subdivision or locality. Typically a division such as neighborhood, hamlet, or borough. Examples:
    • Bromley
    • Brooklyn
  • StateProvince: The first-level administrative division in a country. Typically a division such as state, province, department, region, or territory. Examples:
    • New York
    • West Midlands
  • StateProvinceSubdivision: The second-level administrative division in a country. Typically a division such as county, municipality, parish, prefecture, or district. Examples:
    • Nassau County
    • Coventry
  • PostalCode: The postal code used by a country. Examples:
    • 10032-3725
    • SW1A 2AA
  • Country: The country name or the ISO alpha-3 code. Examples:
    • United States of America or USA
    • United Kingdom or GBR
    • France or FRA
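
As a rough illustration of what an address set mapping expresses, the sketch below projects an input record onto the schema names listed above. The schema names and the requirement that AddressLine1 be mapped come from this section; the `build_address` helper and the input column names (`home_street`, `home_city`, `home_zip`) are hypothetical, and `None` stands in for a Map input field value left at none.

```python
SCHEMA_FIELDS = [
    "FirmName", "AddressLine1", "AddressLine2", "City", "CitySubdivision",
    "StateProvince", "StateProvinceSubdivision", "PostalCode", "Country",
]


def build_address(record, mapping):
    """Project an input record onto the address set schema.

    mapping maps schema names to input field names; None (or a missing
    key) means no input field corresponds to that schema name.
    """
    unknown = set(mapping) - set(SCHEMA_FIELDS)
    if unknown:
        raise ValueError(f"Not schema names: {sorted(unknown)}")
    if not mapping.get("AddressLine1"):
        raise ValueError("AddressLine1 is required and must be mapped")
    return {
        name: record.get(mapping[name], "") if mapping.get(name) else ""
        for name in SCHEMA_FIELDS
    }


# Hypothetical input record and mapping for a single address set.
record = {"home_street": "10 Downing Street", "home_city": "London", "home_zip": "SW1A 2AA"}
address_set_1 = {
    "AddressLine1": "home_street",
    "City": "home_city",
    "PostalCode": "home_zip",
    "Country": None,  # no corresponding input field
}
print(build_address(record, address_set_1))
```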

Add address dataset: Click this button to add an additional set of address mappings. Address set mappings map schema names to input fields. There is initially one set of mappings, titled Address_set_1 by default. If the data includes more than one address (such as shipping and billing addresses, or home and business addresses), you can click this button to add an additional address set and map its schema names to the input fields for the additional address. The default name for each added address set increments the index (Address_set_2, Address_set_3, and so forth). You can rename any address set by editing the name that shows above the address set. Alphanumeric and underscore characters are allowed in an address set name. When there is more than one address set, you can click the delete button that appears next to a name to delete that address set.
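
The naming rules for address sets can be checked with a short sketch like the one below. Both helpers are hypothetical, and treating "alphanumeric" as ASCII letters and digits only is an assumption; the product may accept a wider character set.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_address_set_name(name):
    """True when the name uses only alphanumeric and underscore characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


def default_address_set_name(index):
    """Default name for the address set at a 1-based index."""
    return f"Address_set_{index}"


print(default_address_set_name(2))
print(is_valid_address_set_name("Billing_Address"))
print(is_valid_address_set_name("Billing Address"))
```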

Save: Click this button to close settings and save changes to the transformation settings.

Preview: Click this button to preview the results of the transformation settings.

Cancel: Click this button to close settings for this transformation without saving any changes.

Output fields

This pipeline step populates a single IdentifiedCountry output field for each address set.

Address_set_1_IdentifiedCountry, Address_set_2_IdentifiedCountry, ...: The output field is populated with the ISO 3166-1 alpha-2 country code that best matches the address information in a record. Incomplete address information may result in an incorrect country code. For example, a record that includes a value for AddressLine1 but omits values for all other fields may return an incorrect country code. You may therefore want to check for incomplete data.
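
One way to act on the advice to check for incomplete data is to flag records where AddressLine1 is the only populated field before trusting the identified country. The sketch below assumes each address is available as a dictionary keyed by schema name; `output_field_name` and `only_address_line1` are hypothetical helpers. A flagged record is not necessarily wrong, because AddressLine1 may legitimately hold a complete single-line address, so the flag is a prompt for review rather than an error.

```python
def output_field_name(address_set_name):
    """Name of the output field populated for a given address set."""
    return f"{address_set_name}_IdentifiedCountry"


def only_address_line1(address):
    """True when AddressLine1 is the only schema field with a value."""
    populated = [
        name for name, value in address.items()
        if value is not None and str(value).strip()
    ]
    return populated == ["AddressLine1"]


addresses = [
    {"AddressLine1": "630 W 168th St", "City": "New York", "PostalCode": "10032-3725"},
    {"AddressLine1": "PO Box 3554", "City": "", "PostalCode": None},
]
for address in addresses:
    print(output_field_name("Address_set_1"), only_address_line1(address))
```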

Create User Defined Functions (UDFs) in Snowflake

In a Data Quality pipeline, using the Identify Country step requires you to create external UDFs in Snowflake. This is a one-time activity. Two separate UDFs are required to run your pipeline successfully: one for Identify Country without AI and one for Identify Country with AI.

Define the Identify Country external functions in Snowflake

The Identify Country operator in Data Quality and Enrichment uses a Format 1 type user-defined external function for Geocode.

This task is only required to use the Identify Country pipeline operator.

The name of the external function for the Identify Country operator must be precisely_geocode.

Note: When the Identify Country operator is added to a pipeline, address data is sent to the Precisely Cloud through the Identify Country external function.