Home
Discover high variance
We've developed a unique way to group data together to show which rows or columns have similar variance.

Script Overview

The script in the following section was recorded while using Author mode as explained in the first tutorial section.  The following notes describe what each important command was doing.

  • AddData allows you to load a CSV or nextanalytics optimized data file.  As an alternative, you can always load data from a SQL query.
  • ShowAsNumberStandardDeviations convert the cells into their distance from average.  You can choose whether the average is calculated down the column, across the row, or from the whole page.
  • SwapCellValues is what groups all the cells of the same value together. You can choose whether or not you want to group the row labels or column labels.
  • HidePage is handy for when you have an intermediate page which you don't think would be useful if it were visible to an end user. 
  • Select is a way to select rows or columns. In this example, the first and last row were selected. 
  • KeepSelected is a technique for removing anything that wasn't selected. You have the choice of having it perform only on rows, or on columns, or both.
In the following script, PageCaption and PageDescription commands provide "business context" for each page that being displayed.
Script
  1. adddata , actuals.csv
  2. PageCaption , "This is sample sales data"
  3. PageDescription , "We're going to show a very powerful technique for identifying outliers."
  4. ;--------------
  5. ShowAsNumberStandardDeviations,Column
  6. PageCaption , Distance from Column Average
  7. PageDescription , "Negative values are below average, positive are above. Although we chose down the columns, we could have also measured it across the row, or for the whole page."
  8. ;--------------
  9. SwapCellValues,Row
  10. PageCaption , All Outliers
  11. PageDescription , "This is too verbose to be shown by default so we're hiding it"
  12. Select,row,first
  13. Select,row,last
  14. hidepage
  15. ;--------------
  16. Keepselected,true,false
  17. PageCaption , Top and Bottom Outliers
  18. PageDescription , We used a simple technique to keep only the first and last row.  These are the most extreme outliers. Negative outliers are below average, and positive outliers were above average. Only the top and bottom were kept. The resulting page shows who they are and in which period their high variance occurred.
  19. newrowlabel , "Lowest and Highest Outlier Values"

Results

This is sample sales data
We're going to show a very powerful technique for identifying outliers.
Name Region Position 01/06 02/06 03/06 04/06 05/06 06/06 07/06 08/06 09/06 10/06 11/06 12/06
Ted Burton Central Account Manager 33,900 62,100 106,200 37,100 115,600 151,700 58,500 102,400 177,700 83,000 133,600 210,400
Nell Gordon Central Sr. Account Manager 87,200 172,700 300,100 85,700 49,400 47,500 68,700 11,400 123,200 109,200 202,100 152,400
Bonnie Mills Central Account Manager 25,800 75,100 77,300 71,500 32,900 14,700 44,000 13,400 227,000 98,400 120,000 173,800
Juan Perez Central Account Manager 25,000 32,900 117,600 67,900 104,600 62,700 55,600 54,800 18,000 18,500 109,600 233,000
Jonathan Ryan Central Account Representative 0 0 0 0 0 0 16,600 32,000 34,800 6,700 4,800 84,800
Marcus Saul Central Account Manager 25,000 46,000 22,500 25,800 76,400 199,200 11,100 67,800 203,300 44,100 65,300 193,500
Jerry Wong Central Sr. Account Manager 99,000 159,700 63,200 7,200 169,700 286,200 83,700 231,600 251,700 155,600 208,000 314,900
Will Hill Federal Account Representative 16,200 15,700 22,300 13,500 43,100 29,900 11,900 46,600 44,200 32,000 6,100 122,600
Gil Jacobs Federal Account Manager 57,600 42,500 88,200 23,600 74,700 186,400 66,400 113,700 214,900 18,300 142,700 198,700
Larry Kelley Federal Account Manager 1,200 22,900 97,400 68,700 99,200 38,100 86,900 107,600 31,700 63,900 109,500 203,800
Distance from Column Average
Negative values are below average, positive are above. Although we chose down the columns, we could have also measured it across the row, or for the whole page.
Name Region Position 01/06 02/06 03/06 04/06 05/06 06/06 07/06 08/06 09/06 10/06 11/06 12/06
Ted Burton Central Account Manager 0 0 0 0 1 0 0 0 1 1 1 1
Nell Gordon Central Sr. Account Manager 2 2 3 1 -1 -1 1 -1 0 1 2 0
Bonnie Mills Central Account Manager 0 0 0 1 -1 -1 0 -1 1 1 1 0
Juan Perez Central Account Manager 0 -1 0 1 1 -1 0 0 -1 -1 1 1
Jonathan Ryan Central Account Representative -1 -1 -1 -1 -1 -1 -1 -1 -1 -2 -1 -1
Marcus Saul Central Account Manager 0 0 -1 0 0 1 -1 0 1 0 0 1
Jerry Wong Central Sr. Account Manager 2 2 0 -1 2 2 1 2 2 3 2 2
Will Hill Federal Account Representative -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0
Gil Jacobs Federal Account Manager 1 0 0 0 0 1 1 0 1 -1 1 1
Larry Kelley Federal Account Manager -1 -1 0 1 0 -1 1 0 -1 0 1 1
Top and Bottom Outliers
We used a simple technique to keep only the first and last row. These are the most extreme outliers. Negative outliers are below average, and positive outliers were above average. Only the top and bottom were kept. The resulting page shows who they are and in which period their high variance occurred.
Lowest and Highest Outlier Values 01/06 02/06 03/06 04/06 05/06 06/06 07/06 08/06 09/06 10/06 11/06 12/06
-2 Jonathan Ryan; Ashok Chopra Larry Jones; Mike Li
3 Stephan Lauzon Serge Roy Nell Gordon; Ricki Murphy; Mike Li Priscilla Morris; Anh Nguyen Larry Jones Jerry Wong Stephan Lauzon
 
< Prev