| Iterative sorting, filtering and more |
|
In this example, we'll extract data from a 540,000-row database of U.S. city populations over a 44-year period, looking for the cities that have grown over the previous 20 years, and with populations over 1 million people, in states with more than 5 such cities. None of these criteria are complex concepts, and many business analytics scenarios have similar combinations of filtering, growth calculations, and sorting but many reporting solutions are incapable of performing the sequential operations required. With nextanalytics, we generate a simple script to solve a simple problem. The entire nextanalytics script, complete with database query, is included at the bottom of the page Querying the databaseFrom a 540,000-record database of U.S. city population statistics, we extract the state and city populations for each year available. Note that the year column is only two digits, and for 2000 through 2004, a simple SQL query returns single digits. While we could write a longer SQL query to reformat the years to 4-digits, we can also perform this type of change within nextanalytics and reduce the need for skilled SQL coding.
Swap the year to the columns to make a crosstabWith one command (script line 6), we swap the text column "year" to the columns. This is effectively a "pivot operation" because it creates a crosstab that has a column for every unique value contained in the year column. Please take note that this crosstab was created without a cube and there was no need to predefine metadata or or even know what the values were in the original column. Script lines 9 through 17 convert the column captions into 4-digit dates, ensure they are in the right order, and set a page title.
Filter 1 million plus and calculate 20-year growthIn one script command (line 20), we filter the data to keep only the cities with populations over 1 million in 2004. To calculate the 20-year growth, we compare each cell to one 20-columns to the left (line 24) and express the result as a percentage. As you can see by looking at the number of parameters in these two script lines, there are plenty of other filtering and comparison options available within nextanalytics, but have no fear -- our open source authoring tool porvides a wizard-like interface to make sense of it all. We give this page a title because we will come back to it. All intermediate steps within nextanalytics are kept as 'pages' in memory, and can be referenced freely from anywhere else in the script sequence.
Large, growing cities by stateAnother quick filter operation to keep only growing cities (with growth numbers above 100 percent), we then combine similar rows by state to get a count of the number of cities and, in this case, an average growth figure (line 33).
States with the most large growing citiesOne more filter operation to keep the counts of 5 or higher, and we have the states with the most large (>1 million population) growing (compared with 20-years ago) cities. To find out what those cities are, we go back to a previous page
Large growing cities in the states with the mostTo go back to a previous page, we simple reference the page title or caption (line 40). Then we perform another type of filtering operation (line 41) - filtering by the rows of another page, in this case, the Winning States page we just created. We keep only those rows that start with one of the states in the Winning States list. To make the results a little more useful, we apply some alert colors to the final page, showing red/green for numbers below/above the average of the entire page, showing the larger and smaller numbers on the page and making a few patterns more evident.
Summary and complete script listingAs you have seen here, it is not difficult to perform successive analytics operations with nextanalytics. The scenario here occurs daily within business - filtering sales growth figures, looking for shrinking manufacturing volumes, calculating the month-over-month expenses for cost centers that are over budget. These analytic concepts are not complicated, but typical SQL query and OLAP reporting tools make it difficult to get the information. Spreadsheets are frequently used, but they are labor-intensive and error-prone. With nextanalytics, sequential analytic steps are easily handled in simple scripts like this one.
Complete script listing
|
| < Prev | Next > |
|---|