Calculate statistics

Summary Statistics

Quickly access key metrics of your data using the Statistics agent in the right-side drawer. Select the numeric node(s) you want and add them to the agent. A preview instantly lists the following:

  • count: calculates the number of successors your selected node has
  • sum: calculates the total of all the outgoing edge weights
  • mean: calculates the average of the outgoing edge weights
  • std: calculates the standard deviation of the outgoing edge weights
  • min: returns the smallest value amongst the outgoing edge weights
  • max: returns the largest value amongst the outgoing edge weights
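
As a rough illustration of what these metrics compute, the sketch below derives them from a plain list of numbers standing in for a node's outgoing edge weights. It is not the agent's implementation: the function name `summarize` and the sample values are hypothetical, and it assumes the sample standard deviation (dividing by n - 1), since the convention the agent uses is not stated here.

```python
import statistics


def summarize(weights):
    """Return the six preview metrics for a list of numeric values."""
    return {
        "count": len(weights),
        "sum": sum(weights),
        "mean": statistics.mean(weights),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        # statistics.stdev needs at least two values.
        "std": statistics.stdev(weights),
        "min": min(weights),
        "max": max(weights),
    }


# Hypothetical edge weights, e.g. a handful of customer ages.
ages = [19, 35, 46, 80]
summary = summarize(ages)

for name, value in summary.items():
    print(f"{name}: {value}")
```

If the population standard deviation is wanted instead, `statistics.pstdev` can be swapped in for `statistics.stdev`.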

Select ‘Open As View’ to see the statistics for your node(s) in a table view.

For instance, simply by selecting a numeric feature, say “Age” from a customer-related dataset, the resulting preview tells us:

  • from count: that 7043 customer ages are recorded
  • from sum: that the customers’ ages add up to 327568
  • from mean: that the average age of customers is 46.51
  • from std: that the ages are fairly spread out, with a standard deviation of 16.75
  • from min and max: that the youngest customer is 19 and the oldest is 80

Group By Statistics

The aggregate option allows you to group a numerical feature by a categorical one using one of the following functions: Count, Percentage, Sum, Mean, Std, Min, Max. A new table view is generated with the calculation results.

Example 1

We have a dataset containing airline reviews. To find how many reviews come from each type of traveller, we select Calculate: Count and Group By Category: Type_of_Travellers. This will return a table of the number of successors for each traveller category.
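
Outside the tool, the same count-per-category idea can be sketched in a few lines of Python. The rows and category values below are hypothetical, not taken from the airline dataset, and treating Percentage as each group's share of all rows is an assumption rather than documented behaviour.

```python
from collections import Counter

# Hypothetical review rows; only the grouping column matters for a count.
reviews = [
    {"Type_of_Travellers": "Business"},
    {"Type_of_Travellers": "Solo Leisure"},
    {"Type_of_Travellers": "Business"},
    {"Type_of_Travellers": "Family Leisure"},
]

# Count: number of rows in each category.
counts = Counter(row["Type_of_Travellers"] for row in reviews)

# Percentage (assumed meaning): each category's share of all rows.
total = sum(counts.values())
percentages = {group: 100 * n / total for group, n in counts.items()}

for group, n in counts.items():
    print(f"{group}: {n} reviews ({percentages[group]:.1f}%)")
```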

Example 2

Using the same dataset, if we’d like to examine the total seat comfort rating for each type of traveller, we select Calculate: Sum, Of Numeric Feature Node: Seat Comfort and Group By Category: Type_of_Travellers. This will return the total comfort rating for each traveller category.
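
For comparison, a grouped sum like this one can be approximated in plain Python as follows. This is only a sketch of the calculation: the rows and ratings are made up for illustration, and the dictionary keys simply reuse the feature names mentioned in the example.

```python
from collections import defaultdict

# Hypothetical review rows with a category and a numeric rating.
reviews = [
    {"Type_of_Travellers": "Business", "Seat Comfort": 4},
    {"Type_of_Travellers": "Solo Leisure", "Seat Comfort": 2},
    {"Type_of_Travellers": "Business", "Seat Comfort": 3},
    {"Type_of_Travellers": "Family Leisure", "Seat Comfort": 5},
]

# Sum the numeric feature within each category.
totals = defaultdict(int)
for row in reviews:
    totals[row["Type_of_Travellers"]] += row["Seat Comfort"]

for group, total in totals.items():
    print(f"{group}: {total}")
```

Replacing the running sum with a list per category would let the other functions (Mean, Std, Min, Max) be computed over the same groups.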

The Transformations agent can handle more complicated statistics such as applying operators on features.


Last update: 2024-12-20