Defining Incoming Data for a Derived Dataset

The incoming dataset(s) of a derived dataset are the datasets that are stacked together (a SQL-experienced person would call this a UNION) as the first step in creating the new dataset.

After opening the Settings window for the derived dataset, follow these steps to define the incoming dataset(s).

If the Incoming Data panel is not already open, click the panel header to open it.  Then click the Edit Incoming Data button.  If this view has previously been defined, as in the example below, you will see information on the currently defined incoming datasets.  If this is a brand-new dataset, the panel will be empty (no datasets shown).

The interface elements available are described here:

Add Dataset Button - click this button to add a single dataset to the list of incoming datasets.  See this link for more information:

Multiple Datasets/Automate Button - click this button to add multiple datasets at once, either by selecting them individually, or by specifying a pattern their names must match.  See for more details.

Dataset List - datasets are listed in the order in which they were added, unless they have been reordered using the Move Up or Move Down links.  For most purposes, the order of the incoming datasets does not matter, since all data from those datasets is stacked together.  However, the first dataset in the list has an additional option that allows you to define a new variable in the resulting dataset containing the name of the dataset from which each row originated.  Note that incoming datasets that have been inactivated are shown greyed out with the "Inactive" label (these datasets play no role in the processing steps, but can be activated again later).


Icons - Edit | Copy | Move Up | Move Down | Delete

Note: Not all icons will be available or shown, depending on the situation.  For example, you cannot move the first dataset in the list upwards.

Edit - Allows you to edit the settings for that dataset.  See this link for more information:

Copy - clicking this link makes an exact copy of the dataset definition and adds it to the bottom of the incoming dataset list.  This is especially useful when you want to use one incoming dataset definition as a template for another.

Move Up - if more than one dataset is listed, this link moves the dataset up in the list.

Move Down - if more than one dataset is listed, this link moves the dataset down in the list.

Delete - if there is more than one dataset listed, you will have the option to delete it using this link.  You cannot remove all incoming datasets from the list since at least one incoming dataset is required to define a derived dataset.


When Variables are of Different Types from Different Datasets

When you execute the derived dataset, if there is more than one incoming dataset, the datasets are stacked on top of each other (rows from each incoming dataset are appended in turn to the resulting dataset).  Variables are reconciled across incoming datasets as follows:

  • If a variable appears in some of the incoming datasets but not all, it will be assigned a NULL value for those rows coming from datasets where it does not appear.  
  • If the same variable has a different type in different incoming datasets, its type in the resulting dataset is determined in the following way:
    • If all of the mismatched types are numeric (e.g., integer and decimal), then the type assigned in the result will be decimal.
    • Otherwise, the type assigned will be string.  Non-string data will be converted to string.
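
The rules above can be illustrated with a small Python sketch.  This is not the product's implementation; it only mimics the behavior described in this section.  The type names ("integer", "decimal", "string"), the dataset layout, the function names, and the example datasets are all assumptions made for illustration, and decimal values are approximated with Python floats.

```python
# Illustrative sketch only: mimics the stacking rules described above.
# Type names, dataset layout, and function names are hypothetical.

NUMERIC_TYPES = {"integer", "decimal"}


def resolve_type(types):
    """Choose the result type of one variable from its types in the incoming datasets."""
    distinct = set(types)
    if len(distinct) == 1:
        return distinct.pop()      # no mismatch: keep the common type
    if distinct <= NUMERIC_TYPES:
        return "decimal"           # all mismatched types are numeric
    return "string"                # any other mismatch


def convert(value, target_type):
    """Convert a value to the result type; NULL (None) stays NULL."""
    if value is None:
        return None
    if target_type == "decimal":
        return float(value)        # float stands in for a decimal type
    if target_type == "string":
        return str(value)
    return value


def stack(datasets):
    """Stack the rows of all incoming datasets into one result."""
    # Collect every variable and the types it has across the datasets.
    seen = {}
    for ds in datasets:
        for var, typ in ds["schema"].items():
            seen.setdefault(var, []).append(typ)
    schema = {var: resolve_type(types) for var, types in seen.items()}

    # Append rows dataset by dataset; variables missing from a row become None.
    rows = []
    for ds in datasets:
        for row in ds["rows"]:
            rows.append({var: convert(row.get(var), typ)
                         for var, typ in schema.items()})
    return schema, rows


# Two hypothetical incoming datasets with overlapping variables.
sales_2023 = {
    "schema": {"id": "integer", "amount": "integer", "region": "string"},
    "rows": [{"id": 1, "amount": 100, "region": "East"}],
}
sales_2024 = {
    "schema": {"id": "string", "amount": "decimal", "channel": "string"},
    "rows": [{"id": "A7", "amount": 12.5, "channel": "web"}],
}

schema, rows = stack([sales_2023, sales_2024])
```

In this sketch, "amount" (integer in one dataset, decimal in the other) becomes decimal, "id" (integer versus string) becomes string, and "region" and "channel" are NULL in the rows that come from the dataset where they do not appear.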