Create New Dataset

Batch Import Files

LityxIQ can auto-load external data in batch. For example, it can be set up so that any file placed in a particular folder on an FTP site is loaded automatically, and so that files added later continue to be loaded on an ongoing basis. This document describes the settings specific to setting up a batch loading process. Other settings for a Raw Dataset are described here: https://support.lityxiq.com/261835-Define-the-Source-Settings-for-a-Raw-Dataset. This will help...
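The batch-loading idea described above (pick up whatever files appear in a watched folder, and keep doing so over time) can be illustrated with a minimal Python sketch. This is not LityxIQ's implementation: a local temporary directory stands in for the FTP folder, and `find_new_files`, the `*.csv` pattern, and the file names are all hypothetical.

```python
import tempfile
from pathlib import Path

def find_new_files(folder, already_loaded, pattern="*.csv"):
    """Return names of files in `folder` matching `pattern` that are not yet loaded."""
    names = sorted(p.name for p in Path(folder).glob(pattern) if p.is_file())
    return [n for n in names if n not in already_loaded]

# A temporary local directory stands in for the watched FTP folder (assumption).
with tempfile.TemporaryDirectory() as drop_folder:
    for name in ("sales_01.csv", "sales_02.csv", "notes.txt"):
        (Path(drop_folder) / name).write_text("id,amount\n1,10\n")

    loaded = {"sales_01.csv"}                            # files imported on an earlier run
    pending = find_new_files(drop_folder, loaded)        # files this run should import
    loaded.update(pending)                               # remember them for the next run
    pending_again = find_new_files(drop_folder, loaded)  # nothing new on the next pass

print(pending, pending_again)
```

Tracking the set of already-loaded names is what makes repeated runs safe: each pass only picks up files that arrived since the previous one.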

Define the Source Settings for a Raw Dataset

To import a file or other external data into LityxIQ (to create a LityxIQ dataset), you must define the source settings for the dataset. The dataset in LityxIQ is referred to as a Raw Dataset because it points to and imports data from an external raw data source. That data source may be a file on an FTP site or S3, a database connection, a file uploaded directly into LityxIQ using the File Manager (see https://support.lityxiq.com/125826-Uploading-a-File), or data in a CRM system (among other pos...

Import Data into a Raw Dataset

To import or load data into a raw dataset manually, follow these steps. To import data on a schedule or automatically, see https://support.lityxiq.com/373978-Execute-Raw-Data-Import-or-Views-on-a-Schedule. 1) Select the dataset in the Available Datasets list. 2) Click Load Data -> Load Now from the Selected Dataset menu. 3) Click Yes to confirm or No to cancel. 4) The import process...

Create New Dataset

To create a new dataset in LityxIQ, follow these steps: 1) Select the dataset library in which you want to create the new dataset. 2) Click the Create New Dataset dropdown and select either Raw Data Source or View, depending on how you want to create the dataset. Each option is described in more detail separately. 3) In either case, a dialog is displayed that lets you give a name to the dataset as well as a description. The name must not be the name of an already existing dataset in...

Creating a Derived Dataset

To create a new derived dataset in LityxIQ, follow these steps: 1) In the Data Manager, click the Create New Dataset dropdown menu and select Derived Dataset. 2) The New Derived Dataset dialog box will appear. Enter the name you would like to give to the dataset, and optionally provide a description of the dataset. These will both appear in the list of datasets. The name you enter must not be the name of an already existing dataset in the active project. You can also specify the library ...

Creating Datasets - Advanced Tab

The Advanced tab contains options that are similar whether you are importing raw data (in the Define Dataset Source dialog) or executing a derived dataset (in the Finalize and QC dialog). The options are described here. Sort and Join Keys - Select the field(s) that are most likely to be used in joins or aggregations in future operations with this dataset. The most common of these can be dragged to the top of the list. Making these selections will help with performance of data operations. Fil...

Previewing a Raw Dataset Source

The Preview area in a raw dataset source definition allows you to take a look at the underlying dataset before importing it. To generate a data preview, first set up all of the settings correctly, then click the Preview Data Source button. This process will retrieve the first 100 rows of data and display them in the preview window. You can scroll up/down and right/left to see more data. When Excel or Google Sheets is the selected file type, the View Worksheet Names button becomes availab...
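As a rough illustration of what a 100-row preview involves, the sketch below reads only the header and the first 100 data rows from a delimited source and stops there. It is an assumption-laden stand-in, not LityxIQ code: `preview_rows` is a hypothetical helper, and the in-memory sample replaces a real data source.

```python
import csv
import io
from itertools import islice

def preview_rows(text_stream, limit=100):
    """Return the header and at most `limit` data rows without reading the rest."""
    reader = csv.reader(text_stream)
    header = next(reader, [])           # first line is treated as the header
    rows = list(islice(reader, limit))  # stop after `limit` data rows
    return header, rows

# Hypothetical 250-row source; only the first 100 data rows are retrieved.
sample = io.StringIO("id,name\n" + "".join(f"{i},row{i}\n" for i in range(250)))
header, rows = preview_rows(sample)
print(header, len(rows))
```

Because `islice` stops consuming the reader once the limit is reached, the cost of a preview does not grow with the size of the underlying file.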

View and Edit a Dataset Dictionary

A dataset's dictionary in LityxIQ is a list of all the variables (fields) in the dataset, their names, and for each variable, the type of data it contains. For datasets created from raw data sources, the dictionary is most often created automatically by LityxIQ by analyzing the data source and automatically determining the variables and data types. However, you always have the ability to change the dataset dictionary. This includes changing variable names and data types. For Derived Datasets,...
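To make the idea of an automatically generated dictionary concrete, here is a minimal sketch that scans each column's values and guesses a data type. The rules and the type names (`integer`, `number`, `string`) are assumptions chosen for illustration; they are not LityxIQ's actual detection logic or type names.

```python
def infer_type(values):
    """Guess a column type from its string values: integer, number, or string."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    for caster, type_name in ((int, "integer"), (float, "number")):
        try:
            for v in non_empty:
                caster(v)       # raises ValueError if any value does not fit
            return type_name
        except ValueError:
            continue
    return "string"

def build_dictionary(header, rows):
    """Map each variable name to its inferred type."""
    columns = list(zip(*rows)) if rows else [()] * len(header)
    return {name: infer_type(col) for name, col in zip(header, columns)}

dictionary = build_dictionary(
    ["id", "amount", "city"],
    [["1", "2.5", "Austin"], ["2", "3", "Boston"]],
)
print(dictionary)
```

Automatic inference like this can guess wrong (for example, ZIP codes that look like integers), which is one reason it is useful to be able to edit the dictionary afterwards.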

Import Data from an Excel File or Google Sheets

An Excel file is a common format for holding data in rows and columns. The data is held in a spreadsheet, often within a workbook of multiple spreadsheets. Typically, this spreadsheet format is only used for relatively small datasets. This document will explain the options available for importing data from Excel sheets into LityxIQ. Note that this same approach and discussion holds for Google Sheets data as well. A spreadsheet is a file, so it can be stored and retrieved from a variety of types...

Import Data from a SQL Database Table

A SQL database table is a structured dataset with rows and columns. Many vendors provide SQL-style database compatibility, including SQL Server, Oracle, PostgreSQL, and Amazon Redshift. Data contained in these databases can be wide-ranging, including transactional data, prospect or CRM data, or Big Data. This document will explain how to import data from a SQL database table into LityxIQ. In order to access an external database, you first need to set up a Data Connection in LityxIQ. See https://...

Import a Delimited Text File

A delimited text file, sometimes called a "csv" file or a flat file, is a very common method for encapsulating a dataset. Almost all systems or other file formats have a way to export data into delimited files, making it an easy and simple way to exchange files between systems. However, despite there being fairly strict standards in place for how a delimited file is to be created, there are differences from one system to the next that require options to be set and sometimes cause issues. This do...
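The kind of system-to-system differences mentioned above usually come down to the delimiter and quote characters. The sketch below, using Python's standard `csv` module, shows why those options matter: a pipe-delimited file with a quoted field that itself contains the delimiter. `read_delimited` and the sample data are hypothetical and are not part of LityxIQ.

```python
import csv
import io

def read_delimited(text, delimiter=",", quotechar='"', has_header=True):
    """Parse delimited text using an explicit delimiter and quote character."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    rows = list(reader)
    if has_header and rows:
        return rows[0], rows[1:]
    return [], rows

# Pipe-delimited sample; the quoted field contains the delimiter itself.
pipe_text = 'id|city\n1|"Portland|OR"\n2|Austin\n'
header, rows = read_delimited(pipe_text, delimiter="|")
print(header, rows)

# Parsing the same text with the default comma delimiter yields one field per row.
_, wrong_rows = read_delimited(pipe_text)
print(wrong_rows)
```

With the wrong delimiter the file still parses without error, but every row collapses into a single field, which is why the delimiter setting is worth checking in a preview before importing.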

Import Data from Snowflake

Snowflake is a cloud-based, highly scalable database (see http://www.snowflake.com for more information). Snowflake acts much like a standard SQL database with respect to how one interacts with it, and therefore importing Snowflake data is much like importing data from other SQL databases in LityxIQ. See https://support.lityxiq.com/096794-Import-Data-from-a-SQL-Database-Table for more information. Note that when setting the connection settings for Snowflake connections, you are required to enter...

Import a SAS, SPSS, or Stata Dataset File

Users of LityxIQ may have datasets that were originally created or stored by an outside statistical processing system, such as SAS, SPSS, or Stata, and may want to use this data directly in LityxIQ without having to use an outside tool to convert it to an appropriate format. LityxIQ allows you to import files in these formats directly into a LityxIQ dataset. The file types supported are: - SAS datasets (.sas7bdat files) - SAS Xport Files (.xpt files) - SPSS datasets (.sav files) - Stata...