Create New Dataset

Importing Data Using a Custom SQL Query

LityxIQ provides the ability to import data from SQL databases using arbitrary queries. If you are familiar with the SQL query language, this gives you the opportunity to import data using any query, including those with complex clauses such as JOIN, WHERE, or GROUP BY. This can help reduce the number of data processing steps that you perform directly in LityxIQ. To get started, you must have a valid data connection to a SQL database (such as SQL Server, Redshift, Oracle, or MySQL). Within...
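
As an illustration of the kind of query meant here, the following sketch combines JOIN, WHERE, and GROUP BY in one statement. It runs against an in-memory SQLite database from the Python standard library purely so the example is self-contained. The tables `customers` and `orders` and all of their columns are hypothetical, and SQL dialects differ between vendors, so treat this as a sketch rather than a query to paste into LityxIQ as-is.

```python
import sqlite3

# Hypothetical tables, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount REAL,
        status TEXT
    );
    INSERT INTO customers VALUES (1, 'East'), (2, 'West'), (3, 'East');
    INSERT INTO orders VALUES
        (10, 1, 50.0, 'complete'),
        (11, 1, 25.0, 'cancelled'),
        (12, 2, 40.0, 'complete'),
        (13, 3, 10.0, 'complete');
""")

# One query joins, filters, and aggregates, so the imported dataset
# already holds one summary row per region.
CUSTOM_QUERY = """
    SELECT c.region, COUNT(*) AS order_count, SUM(o.amount) AS total_amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE o.status = 'complete'
    GROUP BY c.region
    ORDER BY c.region
"""

rows = conn.execute(CUSTOM_QUERY).fetchall()
conn.close()

for region, order_count, total_amount in rows:
    print(region, order_count, total_amount)
```

Doing the join, filter, and aggregation in the query means only the summarized rows are imported, which is the reduction in downstream processing steps described above.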

Import Data from Microsoft OneDrive

Microsoft OneDrive is a cloud environment for sharing and storing files in a secure manner. See https://support.microsoft.com/en-us/onedrive for more information. If you have a OneDrive account and have files stored there which you would like to analyze in LityxIQ, it is simple to import them and begin working with the data. The first step to import data from OneDrive, as with any external connection, is to create the data connection in LityxIQ. See https://support.lityxiq.com/277108-Data-Conne...

Import Data from Amazon Redshift

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools (see https://docs.aws.amazon.com/redshift/index.html for more information). The first step, as with any external connection, is to create the data connection in LityxIQ. See https://support.lityxiq.com/277108-Data-Connections for more information on creating and managing external data connect...

Import Data from Dropbox

Dropbox is a platform for sharing and storing files in a secure manner. See https://www.dropbox.com/ for more information. If you have a Dropbox account and have files stored there which you would like to analyze in LityxIQ, it is simple to import them and begin working with the data. The first step to import data from Dropbox, as with any external connection, is to create the data connection in LityxIQ. See https://support.lityxiq.com/277108-Data-Connections for more information on creating an...

Import Data from a JSON File

JSON files provide a platform-independent way of specifying complex structured or unstructured data. You can learn more about the JSON file structure in many places on the web, including https://towardsdatascience.com/an-introduction-to-json-c9acb464f43e. A JSON file can be imported into LityxIQ in much the same way as other types of files. Whether creating a new Raw Dataset (https://support.lityxiq.com/319229-Create-New-Dataset) or editing an existing one (https://support.lityxiq.com/261835-Define-the-Source...
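
To see why nested JSON needs some mapping before it fits a rows-and-columns dataset, here is a minimal Python sketch that flattens nested objects into dotted column names. This is not a description of how LityxIQ parses JSON. The `flatten` helper, the dotted naming scheme, and the sample records are all assumptions made for illustration.

```python
import json

def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            # Lists and scalars are kept as-is in this sketch.
            flat[name] = value
    return flat

# Hypothetical sample data: the second record lacks some fields.
raw = """
[
  {"id": 1, "name": {"first": "Ana", "last": "Diaz"}, "tags": ["a", "b"]},
  {"id": 2, "name": {"first": "Bo"}}
]
"""

records = json.loads(raw)
rows = [flatten(record) for record in records]

# The column set is the union of keys seen across all records.
columns = sorted({column for row in rows for column in row})
print(columns)
```

Records in a JSON file do not have to share the same fields, so a tabular import generally has to decide how to treat missing keys and nested lists.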

Import Data from an XML File

XML files provide an unstructured and potentially very complex way to represent data. You can learn about XML files in many places on the web, including https://www.w3schools.com/xml/. An XML file can be imported into LityxIQ in much the same way as other types of files. Whether creating a new Raw Dataset (https://support.lityxiq.com/319229-Create-New-Dataset) or editing an existing one (https://support.lityxiq.com/261835-Define-the-Source-Settings-for-a-Raw-Dataset), the options you will use are on the Data ...
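
The sketch below shows one simple way repeated XML elements can be read as rows, using Python's standard `xml.etree.ElementTree` module. It is an illustration only, not LityxIQ's import logic. The `xml_to_rows` helper, the `customer` record tag, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: one <customer> element per record.
XML_TEXT = """
<customers>
  <customer id="1"><name>Ana</name><city>Austin</city></customer>
  <customer id="2"><name>Bo</name></customer>
</customers>
"""

def xml_to_rows(xml_text, record_tag):
    """Turn each `record_tag` element into a dict of attributes and child text."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for element in root.iter(record_tag):
        row = dict(element.attrib)
        for child in element:
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows

rows = xml_to_rows(XML_TEXT, "customer")
for row in rows:
    print(row)
```

This sketch only handles one level of child elements. Deeper nesting or repeated children would need an explicit rule for how they map to columns.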

Import Data from Google BigQuery

BigQuery is a scalable, managed data warehouse tool available from Google (see https://cloud.google.com/bigquery for more information). The first step to import data from BigQuery, as with any external connection, is to create the data connection in LityxIQ. See https://support.lityxiq.com/277108-Data-Connections for more information on creating and managing external data connections. Note that when setting the connection settings for BigQuery connections, you will need to know the BigQuer...

Batch Import Files

LityxIQ has the ability to auto-load external data in batch. For example, it can be set up so that any file placed in a particular folder on an FTP site is loaded once it arrives, and so that this continues in an automated, ongoing manner. This document describes the settings specific to setting up a batch loading process. Other settings for a Raw Dataset are described here: https://support.lityxiq.com/261835-Define-the-Source-Settings-for-a-Raw-Dataset. This will help...
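
Conceptually, an ongoing batch load has to work out which files in the folder are new since the last pass. The Python sketch below illustrates that idea only. The `files_to_load` helper, the `*.csv` pattern, and the file names are hypothetical, and nothing here reflects how LityxIQ actually tracks loaded files.

```python
from fnmatch import fnmatch

def files_to_load(listing, already_loaded, pattern="*.csv"):
    """Return names matching `pattern` that have not been loaded yet."""
    return sorted(
        name
        for name in listing
        if fnmatch(name, pattern) and name not in already_loaded
    )

loaded = set()

# First pass: one matching file is present in the folder.
first_pass = files_to_load(["sales_01.csv", "notes.txt"], loaded)
loaded.update(first_pass)

# Second pass: a new file has arrived; the earlier one is skipped.
second_pass = files_to_load(["sales_01.csv", "sales_02.csv", "notes.txt"], loaded)
loaded.update(second_pass)

print(first_pass, second_pass)
```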

Define the Source Settings for a Raw Dataset

To import a file or other external data into LityxIQ (to create a LityxIQ dataset), you must define the source settings for the dataset. The dataset in LityxIQ is referred to as a Raw Dataset because it points to and imports data from an external raw data source. That data source may be a file on an FTP site or S3, a database connection, a file uploaded directly into LityxIQ using the File Manager (see https://support.lityxiq.com/125826-Uploading-a-File), or data in a CRM system (among other pos...

Import Data into a Raw Dataset

To import or load data into a raw dataset manually, follow these steps after defining the settings (see https://support.lityxiq.com/261835-Define-the-Source-Settings-for-a-Raw-Dataset). 1) Select the dataset in the Available Datasets list. 2) Click Load Data -> Load Now from the Selected Dataset menu. 3) You will be asked to confirm. Click Yes to confirm or No to change your mind. 4) The import process will begin shortly. To watch the loading process and any related messages, o...

Create New Dataset

To create a new dataset in LityxIQ, follow these steps: 1) Within Data Manager, navigate to the Manage Data section, select the dataset library in which you want to create the new dataset, and then click the Create New Dataset dropdown. Select Raw Data Source, Derived Dataset, or Manual Dataset, depending on the type you want to create. Each option is described in more detail separately. 2) Regardless of the type of dataset, a dialog is displayed that lets you give a name to the datase...

Creating a Derived Dataset

To create a new derived dataset in LityxIQ, follow these steps: 1) In the Data Manager, click the Create New Dataset dropdown menu and select Derived Dataset. 2) The New Derived Dataset dialog box will appear. Enter the name you would like to give to the dataset, and optionally provide a description of the dataset. These will both appear in the list of datasets. The name you enter must not be the name of an already existing dataset in the active project. You can also specify the library ...

Creating Datasets - Advanced Tab

The Advanced tab contains options that are similar whether you are importing raw data (in the Define Dataset Source dialog) or executing a derived dataset (in the Finalize and QC dialog). The options are described here. Sort and Join Keys - Select the field(s) that are most likely to be used in joins or aggregations in future operations with this dataset. The most common of these can be dragged to the top of the list. Making these selections will help with performance of data operations. Fil...

Previewing a Raw Dataset Source

The Preview area in a raw dataset source definition allows you to take a look at the underlying dataset before importing it. To generate a data preview, first set up all of the settings correctly, then click the Preview Data Source button. This process will retrieve the first 100 rows of data and display them in the preview window. You can scroll up/down and right/left to see more data. When Excel or Google Sheets is the selected file type, the View Worksheet Names button becomes availab...
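
If you want to inspect a delimited file outside LityxIQ before defining the source, a comparable preview can be sketched in Python. The 100-row limit comes from the text above. Everything else, including the `preview` helper and the assumption that a header row is present and does not count toward the limit, is illustrative.

```python
import csv
import io
from itertools import islice

PREVIEW_ROWS = 100  # matches the preview size described in the text

def preview(text_stream, limit=PREVIEW_ROWS):
    """Return the header row and at most `limit` data rows."""
    reader = csv.reader(text_stream)
    header = next(reader, [])
    # islice stops after `limit` rows, so the rest of the file is never read.
    return header, list(islice(reader, limit))

# Hypothetical 250-row sample standing in for a real file.
sample = io.StringIO("id,value\n" + "".join(f"{i},{i * i}\n" for i in range(250)))
header, rows = preview(sample)

print(header, len(rows))
```

With a real file, pass an open file handle (for example `open(path, newline="")`) in place of the `io.StringIO` sample.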

View and Edit a Dataset Dictionary

A dataset's dictionary in LityxIQ is a list of all the variables (fields) in the dataset, their names, and for each variable, the type of data it contains. For datasets created from raw data sources, the dictionary is most often created automatically by LityxIQ by analyzing the data source and automatically determining the variables and data types. However, you always have the ability to change the dataset dictionary. This includes changing variable names and data types. For Derived Datasets,...
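
The idea of deriving a dictionary by analyzing the data can be sketched as follows. The type names (`integer`, `number`, `string`) and the inference rule are assumptions made for this example and are not LityxIQ's actual data types or algorithm.

```python
import csv
import io

def infer_type(values):
    """Pick the narrowest type that every non-empty value can be cast to."""
    non_empty = [value for value in values if value != ""]
    if not non_empty:
        return "string"
    for type_name, caster in (("integer", int), ("number", float)):
        try:
            for value in non_empty:
                caster(value)
            return type_name
        except ValueError:
            continue
    return "string"

def build_dictionary(text_stream):
    """Map each column name to an inferred type name."""
    reader = csv.DictReader(text_stream)
    rows = list(reader)
    return {
        name: infer_type([(row[name] or "") for row in rows])
        for name in reader.fieldnames
    }

# Hypothetical sample data with one missing value in the last column.
sample = io.StringIO("id,price,city\n1,9.99,Austin\n2,12,\n3,7.5,Boston\n")
dictionary = build_dictionary(sample)
print(dictionary)
```

Automatic inference of this kind can guess wrong, for example by treating ZIP codes as integers, which is one reason it is useful to be able to edit variable names and types afterwards.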

Import Data from an Excel File or Google Sheets

An Excel file is a common format for holding data in rows and columns. The data is held in a spreadsheet, often within a workbook of multiple spreadsheets. Typically, this spreadsheet format is only used for relatively small datasets. This document will explain the options available for importing data from Excel sheets into LityxIQ. Note that this same approach and discussion holds for Google Sheets data as well. A spreadsheet is a file, so it can be stored and retrieved from a variety of types...

Import Data from a SQL Database Table

A SQL database table is a structured dataset with rows and columns. Many vendors provide SQL-style database compatibility, including SQL Server, Oracle, PostgreSQL, and Amazon Redshift. Data contained in these databases can be wide-ranging, including transactional data, prospect or CRM data, or Big Data. This document will explain how to import data from a SQL database table into LityxIQ. In order to access an external database, you first need to set up a Data Connection in LityxIQ. See https://...

Import a Delimited Text File

A delimited text file, sometimes called a "csv" file or a flat file, is a very common method for encapsulating a dataset. Almost all systems or other file formats have a way to export data into delimited files, making it an easy and simple way to exchange files between systems. However, despite there being fairly strict standards in place for how a delimited file is to be created, there are differences from one system to the next that require options to be set and sometimes cause issues. This do...
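
The sketch below shows why options such as the delimiter and quote character matter, using Python's standard `csv` module. The `read_delimited` helper, its parameter names, and the sample text are hypothetical and do not correspond to LityxIQ's own import settings.

```python
import csv
import io

def read_delimited(text, delimiter=",", quotechar='"', has_header=True):
    """Parse delimited text and return (header, data_rows)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    rows = list(reader)
    if has_header and rows:
        return rows[0], rows[1:]
    return [], rows

# The same record written by two hypothetical systems.
comma_text = 'id,name\n1,"Diaz, Ana"\n'   # comma-delimited, quoted field
pipe_text = "id|name\n1|Diaz, Ana\n"       # pipe-delimited, no quoting needed

header_a, rows_a = read_delimited(comma_text)
header_b, rows_b = read_delimited(pipe_text, delimiter="|")

# Reading the pipe file with the default comma delimiter splits it wrongly.
header_bad, rows_bad = read_delimited(pipe_text)

print(rows_a, rows_b, rows_bad)
```

Both correctly configured reads yield the same record, while the mismatched delimiter produces a single-column header and a broken data row.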

Import Data from Snowflake

If you are having trouble with Snowflake, see https://support.lityxiq.com/189302-Trouble-with-Snowflake. Snowflake is a cloud-based, highly scalable database (see http://www.snowflake.com for more information). The first step, as with any external connection, is to create the data connection in LityxIQ. See https://support.lityxiq.com/277108-Data-Connections for more information on creating and managing external data connections. Note that when setting the connection settings for Snowflake connect...

Import a SAS, SPSS, or Stata Dataset File

Users of LityxIQ may have datasets that were originally created or stored by an outside statistical processing system, such as SAS, SPSS, or Stata, and may want to use this data directly in LityxIQ without having to use an outside tool to convert it to an appropriate format. LityxIQ allows you to import files in these formats directly into a LityxIQ dataset. The file types supported are: - SAS datasets (.sas7bdat files) - SAS Xport Files (.xpt files) - SPSS datasets (.sav files) - Stata...