Data Manager Functions

Fuzzy Matching Functions

Overview of Fuzzy Matching The concept of Fuzzy Matching when working with data is the idea that you may want to consider two records a "match" even if they do not match perfectly. Mis-spellings in user-entered data are a common example. For example, two records containing the name "George Washington" and "George Washnigton" likely represent a match, but an approach that only looks for exact matches would miss it, and Regular Expressions, while quite powerful, would not easily identify all pos...

Date/Time Formatting Strings for str_to_date

The functions str_to_date and str_to_datetime (documented at https://support.lityxiq.com/507515-Type-Conversion-Functions) provide a nice way of converting strings into date or date/time variables, in a way that produces NULL values appropriately if the string does not match the provided format. These two functions accept format strings that are structured differently from those accepted by the TO_DATE function. The following table provides a list of the format codes that can be used in these fu...

Numeric Formatting Strings

This is a listing of format strings available for numeric formatting arguments in the TO_CHAR and TO_NUMBER functions: - 9=Numeric value with the specified number of digits - 0=Numeric value with leading zeros - . (period), or D=Decimal point - , (comma)=Thousands separator - CC=Century code. For example, the 21st century started on 2001-01-01 (supported for TO_CHAR only) - FM=Fill mode. Suppress padding blanks and zeroes - PR=Negative value in angle brackets - S=Sign anchored...

Date/Time Formatting Strings

These are formatting strings for use in the TO_DATE function. - CC=Two-digit century number - YYYY, YYY, YY, or Y=4-digit, 3-digit, 2-digit, or 1-digit year number - Y,YYY=4-digit year number with comma - Q=Quarter number (1 to 4) - MONTH, Month, or month=Month name (uppercase, mixed-case, lowercase, blank-padded to 9 characters) - MON, Mon, or mon=Abbreviated month name (uppercase, mixed-case, lowercase, blank-padded to 9 characters) - MM=Month number (01-12) - W=Week of mont...

Logical and Comparison Operators

Logical and comparison operators can be used for making comparisons between values (often used in CASE/WHEN statements) that result in logical true or false for each comparison,. Function Usage and description Equal Usage: = Greater Than Or Equal Usage: >= Greater Usage: > Between Usage: BETWEEN Example: expression [NOT] BETWEEN value1 AND value2 Less Than Or Equal Usage: <= Less Than Usage: < Not Equal Usage: != or <> ...

Pattern Matching / REGEX Functions

Pattern matching functions make use of "Regular Expression" processing language to find simple or very complex patterns in strings. These patterns can be identified, counted, replaced, or extracted once found. Regular Expressions are a very complex and difficult concept and language to grasp. Please see websites such as http://www.regex101.com to get more guidance and examples, and additional documentation here: https://support.lityxiq.com/692230-Regular-Expressions-and-Pattern-Matching. ...

Hashing Functions

Hashing functions provide a way to convert an arbitrary string into a (nearly) unique string of a known length, consisting of (nearly) random characters. Function Usage and description SHA1 Usage: FUNC_SHA1(string) This function uses the SHA1 cryptographic hash function to convert a variable-length string into a 40-character string. The new string is a text representation of the hexadecimal value of a 160-bit checksum. MD5 Usage: MD5(string) Returns a string of length 3...

New Field Aggregations - Function List

Below is a list of functions currently available when creating a New Field Aggregation. These are grouped into categories: Basic Aggregation Functions - Average - the average of the variable for all rows in the window. - Weighted Average - the weighted average of the variable for all rows in the window. This requires also selecting a weighting variable. - Sum - the sum of the variable for all rows in the window. - Count - the count of all rows in the window (regardless of null or non...

New Field Aggregations - Simple Conceptual Example

Here is a simple example of how new field aggregations work. This will give a few different possibilities for defining new fields this way. Let's say we have a simple dataset with three variables: Name, State, and Age. The State "MD" has 5 rows in the dataset, while the State "VA" has 3 rows in the dataset. See below: We want to create four new fields on this dataset using the concepts of New Field Aggregation. Conceptually, we want the following new fields: - "Average Age By State" - f...

New Field Aggregations - Concept and Comparisons

The concept of creating "New Field Aggregations" in LityxIQ is similar to partition and windowing functions that you may be familiar with in some SQL languages. The functionality provides a way to add new variables to a dataset that have a value for every row, and are computed using summary functions over partitions and ordered windows of the dataset. It is a very powerful set of functionality that typically requires complex SQL coding in a SQL language, but that LityxIQ makes easily available t...

Type Conversion Functions

Type conversion functions help change the data type of variables from one to another. Common examples are converting strings to dates or numbers, or converting dates into strings. See the sections on Date/time formatting strings (https://support.lityxiq.com/413446-DateTime-Formatting-Strings) and Numeric formatting strings (https://support.lityxiq.com/300999-Numeric-Formatting-Strings) for more information on using the format or formatstr parameters mentioned below. This subsection may produce e...

Conditional and Null Handling Functions

Conditional functions are functions that return a value depending on the values for other variable. Null Handling functions are specialized functions for dealing with NULL values in the data. Function Usage and description CASE/WHEN Conditional Expressions Usage: CASE value WHEN [compare_value] THEN result [WHEN [compare_value] THEN result ...] [ELSE result] END CASE WHEN [condition] THEN result [WHEN [condition] THEN result ...] [ELSE result] END The first version above retu...

Date & Time Functions

Date and time functions allow you to manipulate variables that have date or datetime data types. This includes performing date addition and subtraction and extracting parts of dates, among other operations. For converting from date to strings or vice versa, see the article: https://support.lityxiq.com/507515-Type-Conversion-Functions Function Usage and description Month Addition Usage: ADD_MONTHS(date, number) This function returns a date that is computed by adding the provide...

Mathematical Functions

Mathematical functions let you perform math operations on data, including rounding, trigonometry, logging and exponentiation, and random number generation (among other operations). Function Usage and description Absolute Value Usage: ABS(X) Returns the absolute value of X. Arc Cosine Usage: ACOS(X) Returns the arc cosine of X, that is, the value whose cosine is X. Returns NULL if X is not in the range -1 to 1. Arc Sine Usage: ASIN(X) Returns the arc sine of ...

Math Operators

Mathematical operators can be used for specifying simple arithmetic formulas when creating new fields. Division Usage: / Addition Usage: + Multiplication Usage: * Subtraction Usage: - Modulo Usage: N % M Modulo operation. Returns the remainder of N divided by M. See also the MOD function. Exponentiation Usage: X ^ Y Returns X raised to the Y power.

String Functions

String functions allow you to manipulate string (character) data with functionality such as concatenating strings together or finding substrings within strings. Function Usage and description Length Usage: LENGTH(str) OR CHAR_LENGTH(str) OR LEN(STR) Returns the length of the string str measured in number of characters (even for multi-byte characters). Find Substring Usage: CHARINDEX(substr,str), STRPOS(str,substr) Returns the position of the first occurrence of substring...

Amazon Redshift Functions

https://docs.aws.amazon.com/redshift/latest/dg/c_SQL_functions.html