In LityxIQ, data is stored in datasets. Like most analytic applications, LityxIQ stores data in its own format, which is fully maintained on the backend. This format is a high-speed data structure that supports big data and fast parallel operations. As in other applications, a LityxIQ dataset can be thought of as a table with rows and columns, where rows generally represent observations (e.g., customers, transactions, products) and columns generally represent the pieces of information available for each observation (e.g., gender, zip code, last purchase date).
Datasets come in different flavors in LityxIQ. In all cases, however, they are represented the same way, as rows and columns.
- A Raw dataset is one created directly from an external source, such as a file uploaded to the File Manager (see https://support.lityxiq.com/374367-Using-the-File-Manager), a file from an FTP site, or a database connection.
- A Derived dataset is one built from one or more other datasets using techniques such as joining, creating new fields, or aggregating.
- An Internal dataset is one created by LityxIQ. An example of an internal dataset is a Scoring Catalog created by Predict for storing results of scoring jobs. Generally, you cannot edit internal datasets, but you can create other datasets based on them.
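To make the idea of a derived dataset concrete, here is a minimal sketch in plain Python. It is only an illustration of the three techniques named above (aggregating, joining, and creating a new field) and does not use or represent LityxIQ's own interface. The table names, column names, and the `high_value` threshold are all made up for the example.

```python
from collections import defaultdict

# Two small "raw" tables: each dict is a row, each key is a column.
# All names and values here are hypothetical.
customers = [
    {"customer_id": 1, "zip_code": "60601"},
    {"customer_id": 2, "zip_code": "10001"},
]
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.5},
    {"customer_id": 2, "amount": 12.0},
]

# Aggregate: total spend per customer.
totals = defaultdict(float)
for t in transactions:
    totals[t["customer_id"]] += t["amount"]

# Join the totals back onto customers and create a new field.
derived = [
    {
        **c,
        "total_spend": totals[c["customer_id"]],
        "high_value": totals[c["customer_id"]] > 50,  # assumed threshold
    }
    for c in customers
]

for row in derived:
    print(row)
```

The result is again a table of rows and columns, which is why a derived dataset can itself serve as the input to further datasets.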
There are three steps to creating a new (raw) dataset:
- Upload a file using the File Manager, or create an external connection, such as to an FTPS/SFTP site, a SQL or Big Data database, Amazon S3, a CRM or analytics platform such as Google Analytics, or another source.
- Create and name the new dataset in LityxIQ and define its settings and options.
- Load data into the dataset by executing the dataset. This can be done manually or automatically Upon Data Refresh.
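The three steps can be pictured with a small conceptual sketch in plain Python. This is not LityxIQ code: in LityxIQ these steps are carried out through the application itself, and the `dataset` dictionary, the `column_types` setting, and the `execute` function below are hypothetical stand-ins chosen only to show how a source, a dataset definition, and a load step relate to one another.

```python
import csv
import io

# Step 1: a data source. This in-memory CSV stands in for an uploaded file
# or an external connection.
uploaded_file = io.StringIO(
    "customer_id,gender,zip_code\n"
    "1,F,60601\n"
    "2,M,10001\n"
)

# Step 2: create and name the dataset and define its settings.
# The structure and setting names are hypothetical.
dataset = {
    "name": "customers_raw",
    "column_types": {"customer_id": int},  # other columns stay as text
    "rows": [],
}

# Step 3: "execute" the dataset, which loads data from the source into it.
def execute(dataset, source):
    source.seek(0)  # re-read from the start so a refresh reloads everything
    rows = []
    for row in csv.DictReader(source):
        for column, cast in dataset["column_types"].items():
            row[column] = cast(row[column])
        rows.append(row)
    dataset["rows"] = rows  # replace previous contents with the fresh load
    return len(rows)

loaded = execute(dataset, uploaded_file)
print(f"Loaded {loaded} rows into {dataset['name']}")
```

Keeping the definition (step 2) separate from the load (step 3) is what allows the same dataset to be executed again, manually or on a schedule, whenever the underlying source changes.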