Type:
Standardization step
The Standardize Date operator in the Data Integrity Suite is used to convert values in a date column to a consistent format. This is useful when input data contains multiple date formats across records.
Configuration Options
- Step name: Defines a name for the step. Provide a meaningful name so that users editing the pipeline can easily identify the purpose of this step.
-
Columns: Specifies which columns to transform.
- Only columns with string data types can be selected for this transformation.
- The listed columns correspond to those in the dataset inspection table.
- Use the dropdown to select or clear checkboxes next to column names.
- Language: Selects the language used for month and day names. This affects parsing and formatting behaviour.
- Country: Determines the locale for formatting the date. It affects the way dates are interpreted and standardized, especially for formats that differ between regions (e.g., MM/DD/YYYY vs DD/MM/YYYY).
- Format: Specifies the standardized date format to apply to the selected columns. The format you choose should align with your selected language and country settings. Example: EEEE, MMMM d, y → Thursday, July 3, 2025
-
Year Cut-off for Two-Digit Year: Defines how to interpret years
represented with only two digits (e.g., 55). When parsing dates where
the year is specified with two digits (e.g., 01/01/55), the year
cut-off determines whether this is interpreted as 1955 or 2055. You
can configure this behaviour using one of two methods.
- Current Year Method: Uses the current year’s last two digits
as the cut-off value.
- Any two-digit year less than or equal to the cut-off is interpreted as 21st century (20XX).
- Any two-digit year greater than the cut-off is interpreted as 20th century (19XX).
- Example: If the current year is 2025: 00–25 → 20XX (e.g., 25 → 2025) and 26–99 → 19XX (e.g., 26 → 1926)
- Custom Method:You manually define a custom cut-off value between 0 and 99 using a slider. This gives precise control over how two-digit years are interpreted. Example: If you set the cut-off to 58: 00–58 → 20XX (e.g., 55 → 2055) and 59–99 → 19XX (e.g., 60 → 1960)
- Current Year Method: Uses the current year’s last two digits
as the cut-off value.
Note:
- If dates in your input data include years in two-digit format, it’s important to configure the Year cut-off correctly to avoid misinterpretation across centuries.
- Only string-type columns are allowed for this transformation.
- Be consistent with language, country, and format settings to avoid parsing errors.