Edit Settings

Edit a Manual Dataset

Video showing this: https://youtu.be/SI4DFSG-3Aw Most datasets in LityxIQ are sourced from large files or database tables. However, sometimes there is a need to create a dataset that is small and editable directly in the LityxIQ UI. The Manual Dataset option allows you to enter data by hand or cut and paste it from an Excel spreadsheet. 0) If you haven't already created the new manual dataset, see https://support.lityxiq.com/319229-Create-New-Dataset 1) Select the dataset in the dataset list,...

Derived Dataset New Fields Editing Window

The Define New Field dialog appears after clicking Create New Field or when editing an existing field within a derived dataset. The options are explained below. Single/Multi - Select whether you are defining a single new field, or multiple fields at once. If you select Multi, you will also see the Base Fields option made available. See the document https://support.lityxiq.com/502396-Create-Multiple-New-Fields-at-Once for more help on creating multiple new fields. Field Name - Enter the nam...

Editing/Defining a Derived Dataset

A Derived Dataset in LityxIQ is one in which one or more other datasets are brought together and modified (joined, aggregated, filtered, transformed, etc.) to create a new dataset. To define a derived dataset, follow these steps: 1) Select the dataset from the available datasets list, then select Edit Settings from the Selected Dataset menu. Alternatively, you can right-click on the dataset to see the menu options. If the dataset was just created using the Create New Dataset bu...

Defining QC (Quality Control) Rules for a Derived Dataset

QC, or Quality Control, rules provide a way to do error checking on a dataset after it finishes executing. When no QC rules are set, the dataset execution is considered a success when it completes (of course, other errors may have happened along the way, such as an invalid field definition). If QC rules are defined, each rule is checked when the dataset is otherwise finished executing. If any of the QC rules are evaluated to be valid, the result of the execution is considered an "error" inste...
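The rule-checking logic described above can be sketched in plain Python. This is only an illustration, not LityxIQ's implementation: the function `run_qc`, the rule names, and the sample rows are all hypothetical, and each rule is assumed to describe an error condition that is checked once the dataset has been built.

```python
def run_qc(rows, rules):
    """Return ("error", triggered rule names) if any rule holds, else ("success", [])."""
    triggered = [name for name, rule in rules.items() if rule(rows)]
    return ("error" if triggered else "success", triggered)

# Hypothetical finished dataset, represented as a list of row dictionaries.
rows = [{"CustID": 1, "Age": 34}, {"CustID": 2, "Age": None}]

# Hypothetical QC rules: each one is True when the error condition is met.
rules = {
    "dataset is empty": lambda rows: len(rows) == 0,
    "Age has missing values": lambda rows: any(r["Age"] is None for r in rows),
}

status, triggered = run_qc(rows, rules)
print(status, triggered)
```

With no rules defined, the same function reports "success", which mirrors the behavior described for datasets without QC rules.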

Defining Incoming Data for a Derived Dataset

The Incoming Dataset(s) in a derived dataset are those that are stacked together (a SQL-experienced person would call this a UNION) to begin the process of creating the new dataset. After opening the Settings window for the derived dataset (see https://support.lityxiq.com/241959-EditingDefining-a-Dataset-View), follow these steps to define the incoming dataset(s). If the Incoming Data panel is not already opened, click on that panel header to open it. Then click the Edit Incoming Data button...
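As a rough illustration of what stacking means, the following Python sketch appends the rows of two small datasets. The dataset names, the fields, and the choice to fill fields that a row lacks with `None` are assumptions made for the example, not a description of how LityxIQ resolves mismatched fields.

```python
# Hypothetical incoming datasets, each a list of row dictionaries.
stores_east = [{"StoreID": 1, "Sales": 100}, {"StoreID": 2, "Sales": 250}]
stores_west = [{"StoreID": 3, "Sales": 175, "Region": "West"}]

def stack_datasets(*datasets):
    """Stack the rows of all datasets; fields a row lacks are filled with None."""
    fields = []
    for dataset in datasets:
        for row in dataset:
            for name in row:
                if name not in fields:
                    fields.append(name)
    return [
        {name: row.get(name) for name in fields}
        for dataset in datasets
        for row in dataset
    ]

stacked = stack_datasets(stores_east, stores_west)
for row in stacked:
    print(row)
```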

Settings for a Single Incoming Dataset

When you click the Add Dataset button in the Incoming Dataset section of a derived dataset, or edit an existing Incoming Dataset, you will have a number of options related to how it is used. Dataset: A list of all datasets available to you appears in the topmost drop down menu. Find the dataset you wish to use and click it. Variables to Keep: Select the variable(s) from the dataset that you want to bring into the derived dataset processing. Select All: Checking this box will keep all variab...

Define Multiple Incoming Datasets for a Derived Dataset

When defining the Incoming Datasets for a derived dataset, you can define multiple datasets at once using the Multiple Datasets/Automate button. Clicking this button, or editing an existing Multiple Datasets definition, provides options described below: All variables from all datasets will be included in the resulting derived dataset. An example of this process is shown here. Here is an example of an Incoming Datasets area where the multiple datasets button has been used: If we click the Edi...

Defining Joins for a Derived Dataset

In the LityxIQ Derived Dataset processing, a Join is equivalent to a join or merge process in languages like SQL or Python. The objective is to join or merge together the data from rows that cross over multiple datasets. The rows are "matched" together across the multiple datasets using what are referred to as "join keys", which are variable(s) across the datasets whose values will be compared for matching. Note that you can also perform what is referred to as a "cross-join" in which all rows fr...
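The matching idea can be sketched in plain Python. The example below is a minimal sketch of an inner join on a join key and of a cross-join; the datasets, the field names, and the helper functions `inner_join` and `cross_join` are hypothetical, and other join types are not shown.

```python
from itertools import product

# Hypothetical datasets.
customers = [{"CustID": 1, "Name": "Ann"}, {"CustID": 2, "Name": "Bob"}]
orders = [
    {"CustID": 1, "Amount": 40},
    {"CustID": 1, "Amount": 15},
    {"CustID": 3, "Amount": 99},
]
months = [{"Month": "Jan"}, {"Month": "Feb"}]

def inner_join(left, right, keys):
    """Keep only row pairs whose join-key values match in both datasets."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined

def cross_join(left, right):
    """Pair every row of the left dataset with every row of the right one."""
    return [{**l, **r} for l, r in product(left, right)]

matched = inner_join(customers, orders, ["CustID"])
crossed = cross_join(customers, months)
print(matched)
print(len(crossed))
```

Customer 2 has no orders and order customer 3 has no customer record, so neither appears in `matched`; the cross-join ignores keys entirely and returns 2 x 2 rows.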

Defining a Transpose Operation in a Derived Dataset

A transpose operation within a derived dataset allows you to put data that spans multiple columns into data in multiple rows. This is sometimes also referred to as a pivot on a dataset. For example, suppose you have a dataset that has a unique field such as StoreID, and has a number of fields containing zip codes. You can think of this perhaps as a set of zip codes that define a footprint for a store. The dataset is currently structured as in the screenshot below, with one unique row per store, ...
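A small Python sketch may help picture the reshaping. The field names `Zip1` to `Zip3`, the output field `Zip`, and the choice to skip empty values are assumptions for illustration only.

```python
# Hypothetical dataset: one row per store, zip codes spread across columns.
stores = [
    {"StoreID": "A", "Zip1": "10001", "Zip2": "10002", "Zip3": None},
    {"StoreID": "B", "Zip1": "60601", "Zip2": None, "Zip3": None},
]

def transpose_to_rows(rows, id_field, value_fields, new_field):
    """Turn each non-empty value column into its own row, keyed by id_field."""
    result = []
    for row in rows:
        for field in value_fields:
            if row[field] is not None:
                result.append({id_field: row[id_field], new_field: row[field]})
    return result

long_rows = transpose_to_rows(stores, "StoreID", ["Zip1", "Zip2", "Zip3"], "Zip")
for row in long_rows:
    print(row)
```

After the operation, StoreID is no longer unique: each store appears once per zip code in its footprint.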

Defining New Fields for a Derived Dataset

The New Fields area of a derived dataset in LityxIQ allows you to add new fields (also called "variables" or "features") to the dataset. The functionality is very robust, allowing for creating simple new variables such as the sum or ratio of two others, or complex variables that do things such as search/replace text or figure out the day of week from a date variable. After opening the settings dialog for the derived dataset (see https://support.lityxiq.com/241959-EditingDefining-a-Dataset-View)...
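To make the kinds of new fields mentioned above concrete, here is a minimal Python sketch that derives a sum, a ratio, and a day of week. The field names and the handling of a zero denominator are hypothetical; LityxIQ's own field-definition syntax is not shown here.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical dataset rows.
orders = [
    {"Revenue": 120.0, "Cost": 80.0, "OrderDate": date(2024, 1, 1)},
    {"Revenue": 50.0, "Cost": 0.0, "OrderDate": date(2024, 1, 6)},
]

def add_new_fields(row):
    """Return a copy of the row with three derived fields added."""
    new_row = dict(row)
    new_row["Total"] = row["Revenue"] + row["Cost"]
    # Assumption: leave the ratio empty when the denominator is zero.
    new_row["Ratio"] = row["Revenue"] / row["Cost"] if row["Cost"] else None
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    new_row["DayOfWeek"] = DAY_NAMES[row["OrderDate"].weekday()]
    return new_row

with_fields = [add_new_fields(row) for row in orders]
for row in with_fields:
    print(row["Total"], row["Ratio"], row["DayOfWeek"])
```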

Create Multiple New Fields at Once

The new fields area of a derived dataset (defined in detail here https://support.lityxiq.com/608058-Defining-New-Fields-for-a-View) has the option to create multiple new fields at once. This functionality is a powerful way to apply the same basic operation to many variables at the same time. Two use case examples are shown below. Example 1 - Take Absolute Value of Many Fields Suppose you have many numeric variables, and you want to create a new field related to each of them that contains the...
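The idea behind Example 1, applying one operation to a list of base fields, can be sketched in Python as follows. The field names and the `abs_` naming prefix are hypothetical and are not LityxIQ's naming convention.

```python
# Hypothetical dataset and list of base fields.
accounts = [
    {"ID": 1, "Balance": -20.5, "Change": 3.0},
    {"ID": 2, "Balance": 10.0, "Change": -7.25},
]
base_fields = ["Balance", "Change"]

def add_abs_fields(rows, base_fields, prefix="abs_"):
    """Add one new absolute-value field per base field, keeping the originals."""
    return [
        {**row, **{prefix + field: abs(row[field]) for field in base_fields}}
        for row in rows
    ]

with_abs = add_abs_fields(accounts, base_fields)
for row in with_abs:
    print(row)
```

The same pattern works for any single-field operation: one definition is written once and repeated over every base field.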

Defining a Filter for a Derived Dataset

The Filter area of a Derived Dataset definition allows you to subset the records based on a condition you specify. After opening the settings dialog for a derived dataset (see https://support.lityxiq.com/241959-EditingDefining-a-Dataset-View), follow these steps to define a filter. Note that it is not required that a derived dataset have a filter defined. 1) If the Filter panel is not already opened, click on that panel header to open it. Then, click the Edit button. Note that if this view ...
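Conceptually, a filter keeps only the records for which a condition is true. The Python sketch below illustrates this with a hypothetical dataset and condition; it does not reflect the filter syntax used in the LityxIQ interface.

```python
# Hypothetical dataset.
sales = [
    {"State": "OH", "Amount": 10},
    {"State": "OH", "Amount": 5},
    {"State": "MI", "Amount": 12},
]

def apply_filter(rows, condition):
    """Keep only the rows for which the condition returns True."""
    return [row for row in rows if condition(row)]

# Hypothetical condition: Ohio records with an amount of at least 7.
kept = apply_filter(sales, lambda row: row["State"] == "OH" and row["Amount"] >= 7)
print(kept)
```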

Defining an Aggregation for a Derived Dataset

After opening the settings dialog for a derived dataset (see https://support.lityxiq.com/241959-EditingDefining-a-Dataset-View), follow these steps to define an aggregation step. It is not required that a derived dataset have an aggregation defined. If an aggregation is defined, the resulting dataset will typically have fewer (usually many fewer) records than the original datasets. The number of records will be the total number of different combinations of the selected aggregation variables (see...
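The relationship between aggregation variables and the number of resulting records can be illustrated with a short Python sketch. The dataset, the field names, and the `Sum_` output naming are hypothetical.

```python
from collections import defaultdict

# Hypothetical dataset with four records.
sales = [
    {"State": "OH", "Year": 2022, "Amount": 10},
    {"State": "OH", "Year": 2022, "Amount": 5},
    {"State": "OH", "Year": 2023, "Amount": 7},
    {"State": "MI", "Year": 2022, "Amount": 3},
]

def aggregate_sum(rows, group_fields, value_field):
    """Sum value_field for each distinct combination of the group fields."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[field] for field in group_fields)
        totals[key] += row[value_field]
    return [
        {**dict(zip(group_fields, key)), f"Sum_{value_field}": total}
        for key, total in totals.items()
    ]

by_state_year = aggregate_sum(sales, ["State", "Year"], "Amount")
for row in by_state_year:
    print(row)
```

Four input records collapse to three output records because there are three distinct State and Year combinations.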

Create a New Aggregation Field with Examples

The new aggregation field function will provide an aggregated value for every record in an existing dataset. This is a distinct feature as opposed to the “aggregate” function, which can change the number of fields and number of records in an existing dataset. See https://support.lityxiq.com/192161-New-Field-Aggregations---Concept-and-Comparisons for a more complete overview of New Field Aggregations. 1) In Data Manager select the dataset you wish to edit, and under the Selected Dataset tab clic...
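The difference from a regular aggregation is that every original record is kept and simply gains the aggregated value. A minimal Python sketch, using a hypothetical dataset and field names:

```python
from collections import defaultdict

# Hypothetical dataset.
sales = [
    {"State": "OH", "Amount": 10},
    {"State": "OH", "Amount": 5},
    {"State": "MI", "Amount": 3},
]

def add_group_total(rows, group_field, value_field, new_field):
    """Attach the group-level sum of value_field to every row of that group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_field]] += row[value_field]
    return [{**row, new_field: totals[row[group_field]]} for row in rows]

with_totals = add_group_total(sales, "State", "Amount", "StateTotal")
for row in with_totals:
    print(row)
```

The record count is unchanged (three in, three out); only a new field is added.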

Comparison of Aggregation Types

In LityxIQ, two different types of data aggregations are available, which can be used in conjunction with each other. The two types can be described as: 1) aggregating over all combinations of the levels of the selected variables, and 2) aggregating over all individual levels of the selected variables. This document provides a comparison of these two techniques. Both are available in the Aggregation area when defining a View. Assume we have a very small and simple dataset with three variables: A...
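One way to picture the two types is the Python sketch below. It is an interpretation under stated assumptions: the dataset and variable names are hypothetical, and "individual levels" is taken to mean that each selected variable is summarized separately, one result per level of each variable.

```python
from collections import Counter

# Hypothetical dataset with two grouping variables and one value.
rows = [
    {"A": "x", "B": "p", "Value": 1},
    {"A": "x", "B": "q", "Value": 2},
    {"A": "y", "B": "p", "Value": 3},
    {"A": "x", "B": "p", "Value": 4},
]

# Type 1: one result per combination of levels of A and B.
by_combination = Counter()
for row in rows:
    by_combination[(row["A"], row["B"])] += row["Value"]

# Type 2 (assumed meaning): one result per level of each variable on its own.
by_level = Counter()
for row in rows:
    for field in ("A", "B"):
        by_level[(field, row[field])] += row["Value"]

print(dict(by_combination))
print(dict(by_level))
```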

Defining a Pivot within the Aggregation Step

This video demonstrates the steps listed below: https://www.youtube.com/watch?v=fAr6mu81N2g The Aggregation step of a Derived Dataset includes an option to "pivot" (or "transpose") data from rows into columns of the result set. An Aggregation definition may include any number of pivoted variables, restricted only by the limit of 1600 total variables in the result set. As an example, consider the following dataset having variables STATE, domain_name, and ZIP. This is a small snapshot of a da...
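A pivot of this kind can be sketched in Python. The example counts records per STATE with one output column per domain_name level; the sample values, the `count_` column naming, and the use of a count as the aggregate are assumptions for illustration.

```python
# Hypothetical snapshot using the variable names from the example.
records = [
    {"STATE": "OH", "domain_name": "gmail.com", "ZIP": "43004"},
    {"STATE": "OH", "domain_name": "yahoo.com", "ZIP": "43004"},
    {"STATE": "OH", "domain_name": "gmail.com", "ZIP": "44101"},
    {"STATE": "MI", "domain_name": "gmail.com", "ZIP": "48201"},
]

def pivot_count(rows, row_field, pivot_field):
    """Count rows per row_field value, with one column per pivot_field level."""
    levels = sorted({row[pivot_field] for row in rows})
    table = {}
    for row in rows:
        out = table.setdefault(
            row[row_field],
            {row_field: row[row_field], **{f"count_{lvl}": 0 for lvl in levels}},
        )
        out[f"count_{row[pivot_field]}"] += 1
    return list(table.values())

pivoted = pivot_count(records, "STATE", "domain_name")
for row in pivoted:
    print(row)
```

Each distinct level of the pivoted variable becomes a column, which is why a variable with many levels can approach the limit on total variables in the result set.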

Define Finalization Settings for a Derived Dataset

The Finalize and QC step in a derived dataset is the last step in the derived dataset processing. It is not required, but provides a wide range of functionality to complete the processing steps. It includes the ability to: - Drop or Re-order the variables in the dataset before it is saved. - Add QC rules that are checked prior to the dataset processing being finalized. - Add a row number field to the dataset. - Add special settings related to configuration variables. - Other advance...

Configuration Dataset Settings Options

Configuration Variables (described in more detail here https://support.lityxiq.com/084396-Create-and-Utilize-a-Configuration-Variable) can be created dynamically as the result of either a data import process or the execution of a derived dataset. In either case, the same interface options are available, and these are described here. The screenshot below is from the Finalization dialog when editing a derived dataset, but the same options are available in the Define Dataset Source dialog for a Raw Data...