Viewing Summary Statistics for a Dataset

When data is loaded into a dataset, LityxIQ will automatically create summary statistics for each variable.  The summary statistics can be viewed by following these steps:

1) Select the dataset in the available datasets list (it will be highlighted blue-gray), then click Summary Statistics in the Selected Dataset menu or in the right-click menu.



2) The summary statistics will appear in a new window.  If the statistics are still being computed (for example, if the dataset just finished executing), you will see a partial listing of the statistics currently available, along with a warning message.  See below for descriptions of the information provided.  Click Done when finished.

The statistics provided include:

  • Minimum Value - The minimum value in the dataset.  For character-based fields, this is the minimum value based on alphabetical ordering.
  • Mean Value - The average value in the dataset.  This is only provided for numeric fields.
  • Maximum Value - The maximum value in the dataset.  For character-based fields, this is the maximum value based on alphabetical ordering.
  • Q1, Median, Q3 - The first quartile, median, and third quartile.  These are shown for numeric fields, and only if this option is chosen in the dataset settings.  The option is on the Advanced tab (for raw datasets, this is in the main settings dialog; for derived datasets, it is in the Finalize/QC area).
  • St. Dev. - The standard deviation for the full dataset.  This is only provided for numeric fields.
  • Sum - The sum of all values in the dataset.  This is only provided for numeric fields.
  • Valid Values - The number of non-null values in the field.
  • Null Values - The number of missing values in the field.
  • Zero Count - The number of values that are exactly zero in the dataset.  This is only provided for numeric fields.
  • Min/Max String Length - For non-numeric fields, the smallest and largest string lengths in the dataset.  Note that older datasets may not show this statistic.
  • # Unique Values - The number of unique values across the dataset (missing values are not counted).
  • Unique Values - A listing of the unique values.  For integer, date, and string fields, this is only shown if there are 1000 or fewer unique values; for decimal fields, only if there are 255 or fewer unique values.  In addition, this is not shown for string fields whose maximum string length is more than 255.
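
To make the definitions above concrete, the following is a minimal Python sketch that computes comparable statistics for a single field outside of LityxIQ.  It is an illustration only and does not represent how LityxIQ computes its statistics.  The function names (summarize, show_unique_values), the dictionary keys, and the field type labels are hypothetical, and several details are assumptions: missing values are represented as None, the standard deviation uses the population formula, quartiles use the default method of Python's statistics module, and string ordering follows Python's case-sensitive comparison, any of which may differ from LityxIQ's results.

```python
import statistics
from typing import Any, Optional, Sequence


def summarize(values: Sequence[Optional[Any]], numeric: bool) -> dict:
    """Summarize one field; None marks a missing value (an assumption)."""
    valid = [v for v in values if v is not None]
    stats = {
        "valid_values": len(valid),
        "null_values": len(values) - len(valid),
        "unique_count": len(set(valid)),  # missing values are not counted
        # For strings, min/max follow Python's case-sensitive ordering
        "min": min(valid) if valid else None,
        "max": max(valid) if valid else None,
    }
    if numeric:
        stats["mean"] = statistics.fmean(valid) if valid else None
        stats["sum"] = sum(valid)
        stats["zero_count"] = sum(1 for v in valid if v == 0)
        # Population standard deviation is an assumption of this sketch
        stats["st_dev"] = statistics.pstdev(valid) if valid else None
        if len(valid) >= 2:
            # Three cut points: first quartile, median, third quartile
            q1, median, q3 = statistics.quantiles(valid, n=4)
            stats.update({"q1": q1, "median": median, "q3": q3})
    else:
        lengths = [len(v) for v in valid]
        stats["min_len"] = min(lengths) if lengths else None
        stats["max_len"] = max(lengths) if lengths else None
    return stats


def show_unique_values(field_type: str, unique_count: int,
                       max_len: Optional[int] = None) -> bool:
    """Apply the display rule described for the Unique Values listing."""
    limit = 255 if field_type == "decimal" else 1000
    if unique_count > limit:
        return False
    if field_type == "string" and max_len is not None and max_len > 255:
        return False
    return True


if __name__ == "__main__":
    print(summarize([3, 0, None, 5, 0], numeric=True))
    print(summarize(["b", "apple", None, "b"], numeric=False))
    print(show_unique_values("decimal", 256))
```

Comparing output like this against the summary statistics window can be a quick sanity check on a small exported sample, keeping in mind that the quartile and standard deviation conventions assumed here may not match.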

You can also click a column header to sort the summary statistics by that column, and use the actions button to export, refresh, or resize the window.

Click the X in the upper-right corner to close the window.