The Identify Country step uses available address field values to identify the country.
Step Properties
Step name: Defines the name for a step. Provide a meaningful name so that anyone who edits steps in a pipeline will be able to identify the purpose of a step.
- ISO2: represents countries by their two-letter ISO country codes. For example:
United States: US
Canada: CA
United Kingdom: GB
- ISO3: represents countries by their three-letter ISO country codes. For example:
United States: USA
Canada: CAN
United Kingdom: GBR
- Country Name: displays the full name of the country in a human-readable form. For example:
US: United States
CA: Canada
GB: United Kingdom
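The relationship between the three formats can be sketched as a small lookup. This is only an illustration built from the three example countries listed above; the `COUNTRY_FORMATS` table and the `format_country` helper are hypothetical names and are not part of the product.

```python
# Illustrative lookup covering only the three example countries from the list above.
# COUNTRY_FORMATS and format_country are hypothetical names, not a product API.
COUNTRY_FORMATS = {
    "US": {"ISO2": "US", "ISO3": "USA", "Country Name": "United States"},
    "CA": {"ISO2": "CA", "ISO3": "CAN", "Country Name": "Canada"},
    "GB": {"ISO2": "GB", "ISO3": "GBR", "Country Name": "United Kingdom"},
}


def format_country(iso2_code: str, output_format: str) -> str:
    """Return the country identified by iso2_code in the requested output format."""
    return COUNTRY_FORMATS[iso2_code][output_format]


print(format_country("GB", "ISO3"))          # GBR
print(format_country("CA", "Country Name"))  # Canada
```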
Enhance with AI: The Geocode SDK may fail to identify the country for an address when the address information is incomplete or lacks sufficient detail. In such cases, the output fields for those addresses remain blank. To resolve this, select the Enhance with AI checkbox, which enables AI to identify the country from the available address information. When you select this option, an additional output column (Address_set_1_Source, Address_set_2_Source, …) is displayed, indicating the source of the identification (AI or Geocode SDK, as applicable).
Address_set_1, Address_set_2, …: One or more address sets may map schema fields to input dataset fields. The address set schema is mapped to each set of address fields in an input dataset when there is more than one set (such as home address, business address, corporate headquarters, and so forth). The Schema column specifies schema names for standard address fields. The Map input field column maps input dataset fields to the Schema names.
To map a dataset field to a schema name, click the column heading in the Map input field column entry that corresponds to the name in the Schema column. When no input field corresponds to a schema name, leave the Map input field value set to none for that schema name.
- FirmName: Specifies an organization name, place, or building.
Examples:
PRIME MINISTER & FIRST LORD OF THE TREASURY
ARAMARK Ltd.
UNITED NATIONS HEADQUARTERS
CHRYSLER BUILDING
- AddressLine1: The street portion of the address, including the
directional and street suffix, formatted to the country's standard. If no other address
field is populated, then the AddressLine1 entry is treated as a
single line input. Single line input can consist of multiple input address fields, such
as 10 Downing St, London SW1A 2AB, GBR. This field is required.
Examples:
10 Downing Street
630 W 168th St
PO Box 3554
- AddressLine2: The unit, suite or apartment portion of an
address that specifically locates an addressee at a street address. This is typically
optional. Examples:
Suite 214
Apt 534
Maildrop A25
- City: Specifies a city or town name. Examples:
London
New York
- CitySubdivision: Specifies a city subdivision or locality.
Typically a division such as neighborhood, hamlet, or borough. Examples:
Bromley
Brooklyn
- StateProvince: The first-level administrative division in a
country. Typically a division such as state, province, department, region, or territory.
Examples:
New York
West Midlands
- StateProvinceSubdivision: The second-level administrative
division in a country. Typically a division such as county, municipality, parish,
prefecture, or district. Examples:
Nassau County
Coventry
- PostalCode: The postal code used by a country. Examples:
10032-3725
SW1A 2AA
- Country: The country name or the ISO alpha-3 code. Examples:
United States of America or USA
United Kingdom or GBR
France or FRA
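As a rough illustration of the mapping described above, an address set can be thought of as a table from schema names to input dataset fields. The sketch below is not the product's storage format: the input column names (`company_name`, `street`, and so on) and the `check_mapping` helper are hypothetical, and `None` stands in for a Map input field value left at none.

```python
# Schema names taken from the list above.
SCHEMA_FIELDS = (
    "FirmName", "AddressLine1", "AddressLine2", "City", "CitySubdivision",
    "StateProvince", "StateProvinceSubdivision", "PostalCode", "Country",
)

# Hypothetical input column names; None represents a mapping left at "none".
address_set_1 = {
    "FirmName": "company_name",
    "AddressLine1": "street",
    "AddressLine2": None,
    "City": "city",
    "CitySubdivision": None,
    "StateProvince": "state",
    "StateProvinceSubdivision": None,
    "PostalCode": "zip",
    "Country": None,
}


def check_mapping(mapping: dict) -> list:
    """Return a list of problems found in a schema-to-input-field mapping."""
    problems = []
    for name in mapping:
        if name not in SCHEMA_FIELDS:
            problems.append(f"unknown schema name: {name}")
    # AddressLine1 is the only field the schema description marks as required.
    if not mapping.get("AddressLine1"):
        problems.append("AddressLine1 is required but not mapped")
    return problems


print(check_mapping(address_set_1))  # []
```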
Add address dataset: Click this button to add an additional set of address mappings. Address set mappings map schema names to input fields. There is initially one set of mappings, titled Address_set_1 by default. If the data includes more than one address (such as shipping and billing addresses, or home and business addresses), you can click this button to add an address set and map its schema names to the input fields for the additional address. The default name for each added address set increments the index (Address_set_2, Address_set_3, and so forth). You can rename any address set by editing the name shown above it. Alphanumeric and underscore characters are allowed in an address set name. When there is more than one address set, you can click the delete button that appears next to a name to delete that address set.
Save: Click this button to close settings and save changes to the transformation settings.
Preview: Click this button to preview the results of the transformation settings.
Cancel: Click this button to close settings for this transformation without saving any changes.
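The address set naming rules described for the Add address dataset button can be sketched as follows. This is a minimal sketch under stated assumptions: "alphanumeric" is taken to mean ASCII letters and digits, and the next default index is assumed to be one more than the highest existing default index. Both helper names are hypothetical, and the product's actual behavior (for example, after an address set is deleted) is not specified in this documentation.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")
_DEFAULT_PATTERN = re.compile(r"Address_set_(\d+)")


def is_valid_address_set_name(name: str) -> bool:
    """Check that a name contains only alphanumeric and underscore characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


def next_default_name(existing_names: list) -> str:
    """Return the next default name (assumed: highest existing default index + 1)."""
    indexes = []
    for name in existing_names:
        match = _DEFAULT_PATTERN.fullmatch(name)
        if match:
            indexes.append(int(match.group(1)))
    return f"Address_set_{max(indexes, default=0) + 1}"


print(is_valid_address_set_name("Billing_address"))   # True
print(is_valid_address_set_name("Billing address"))   # False
print(next_default_name(["Address_set_1", "Home"]))   # Address_set_2
```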
Output fields
This pipeline step populates a single IdentifiedCountry output field for each address set.
Address_set_1_IdentifiedCountry, Address_set_2_IdentifiedCountry, ...: The output field is populated with the ISO 3166-1 alpha-2 country code that best matches the address information in a record. Incomplete address information may result in an incorrect country code. For example, a record that includes a value for AddressLine1 but omits values for all other fields may return an incorrect country code. You may therefore want to check for incomplete data.
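One way to check for incomplete data before or after running the step is to flag records where AddressLine1 is the only populated address field. The sketch below is a heuristic, not a product feature: it assumes records are available as dictionaries keyed by schema name, and `only_address_line1` is a hypothetical helper.

```python
# Every schema field other than AddressLine1.
OTHER_FIELDS = (
    "FirmName", "AddressLine2", "City", "CitySubdivision",
    "StateProvince", "StateProvinceSubdivision", "PostalCode", "Country",
)


def _is_filled(record: dict, field: str) -> bool:
    """True when the field is present and not blank."""
    value = record.get(field)
    return value is not None and str(value).strip() != ""


def only_address_line1(record: dict) -> bool:
    """True when AddressLine1 has a value and every other address field is empty."""
    if not _is_filled(record, "AddressLine1"):
        return False
    return not any(_is_filled(record, field) for field in OTHER_FIELDS)


records = [
    {"AddressLine1": "10 Downing Street"},
    {"AddressLine1": "630 W 168th St", "City": "New York", "PostalCode": "10032-3725"},
]
to_review = [r for r in records if only_address_line1(r)]
print(len(to_review))  # 1
```

Flagged records are candidates for review rather than confirmed errors, since a single-line AddressLine1 value may still contain enough detail to identify the country.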
Create User Defined Functions (UDFs) in Snowflake
In a Data Quality pipeline, using the Identify Country step requires you to create external UDFs in Snowflake. This is a one-time activity. Two separate UDFs are required, one for Identify Country without AI and one for Identify Country with AI, for your pipeline to run successfully.
Define the Identify Country external functions in Snowflake
The Identify Country operator in Data Quality and Enrichment uses a Format 1 type user-defined external function for Geocode.
This task is required only if you use the Identify Country pipeline operator.
The name of the external function for the Identify Country operator must be precisely_geocode.