Clean your data
Transform Numeric Features
When working with numeric variables, you can manipulate the values of their outgoing edges with operations such as Add, Multiply, Subtract, and Divide, using either another numeric feature or a numeric value.
Example 1
Consider a dataset detailing product features, including a column labeled "box width [m]".
To convert these units to centimeters, select the corresponding node and apply the Multiply operation with a Numeric Value of 100.
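The arithmetic behind this example can be sketched in plain Python. This is only an illustration under assumptions: the dictionary of edge weights, the OPERATIONS table, and the transform_with_value function are hypothetical names, not part of the product.

```python
import operator

# Hypothetical mapping from the operation names in the text to Python operators.
OPERATIONS = {
    "Add": operator.add,
    "Multiply": operator.mul,
    "Subtract": operator.sub,
    "Divide": operator.truediv,
}

def transform_with_value(edge_weights, operation, value):
    """Apply one operation with a fixed numeric value to every edge weight."""
    op = OPERATIONS[operation]
    # Build a new mapping so the original weights are left untouched.
    return {node: op(weight, value) for node, weight in edge_weights.items()}

# Assumed sample data: data node -> weight of the edge from "box width [m]".
box_width_m = {"product_a": 0.25, "product_b": 1.5}
box_width_cm = transform_with_value(box_width_m, "Multiply", 100)
```

Here box_width_cm holds the widths in centimeters, while box_width_m keeps the original values in meters.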
Example 2
Suppose you've just imported a dataset containing customer orders, with two numeric features
in your taxonomy labeled "orders of product 1" and "orders of product 2".
By selecting these nodes and applying the Add transformation with Headers, you can create a new node indicating the total number of product orders per customer.
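As a rough sketch of what adding two numeric features amounts to, the snippet below sums the edge weights that two headers assign to the same data node. The add_headers function and the sample customers are assumptions made for illustration. How the product treats a data node that is missing under one header is not stated in this guide, so the sketch simply skips such nodes.

```python
def add_headers(first, second):
    """Sum the edge weights that two numeric features give to each data node."""
    # Assumption: only data nodes present under both headers receive a total.
    shared = first.keys() & second.keys()
    return {node: first[node] + second[node] for node in shared}

# Assumed sample data: customer node -> edge weight under each header.
orders_product_1 = {"customer_a": 3, "customer_b": 0}
orders_product_2 = {"customer_a": 2, "customer_b": 7}
total_orders = add_headers(orders_product_1, orders_product_2)
```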
Transform Node Labels
You can modify node labels using simple pattern matching in the following transformations:
Replace
The Replace agent lets you update portions of the labels of all your selected nodes using pattern matching.
From
: Specify the segment of the label you wish to update.
To
: Define the replacement for the segment specified in "From".
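The effect of From and To can be approximated with a plain substring replacement. This is a minimal sketch that assumes literal matching, since this guide does not say whether the pattern supports wildcards or regular expressions. The replace_in_labels function and the sample labels are hypothetical.

```python
def replace_in_labels(labels, from_text, to_text):
    """Replace every occurrence of from_text with to_text in each label."""
    # Assumption: the pattern is treated as a literal substring.
    return [label.replace(from_text, to_text) for label in labels]

# Assumed sample labels for the selected nodes.
selected = ["box_width_[m]", "box_height_[m]"]
renamed = replace_in_labels(selected, "_", " ")
```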
Split
The Split agent generates a new set of nodes, each containing a portion of the labels from your current selection, split based on the input provided in the Value field.
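A comparable sketch for Split, assuming the Value field acts as a literal separator. The split_labels function is hypothetical, and trimming whitespace around each portion is a choice made here for readability, not documented behavior.

```python
def split_labels(labels, value):
    """Map each label to the portions obtained by splitting it on value."""
    # Assumption: surrounding whitespace is trimmed from each portion.
    return {label: [part.strip() for part in label.split(value)] for label in labels}

# Assumed sample labels for the selected nodes.
selected = ["Paris, France", "Lyon, France"]
parts = split_labels(selected, ",")
```

Each portion would then become the label of a new node, leaving the selected nodes unchanged.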
Transform Feature Types
You can modify the data types of features within your taxonomy using the Convert Nodes agent located in the Transformations drawer.
String Transformation:
To convert a numeric or datetime variable into a categorical variable, select the corresponding feature node and apply a string transformation. This action converts the outgoing edges from the header node into intermediate nodes, with labels representing the weight of the edges.
For instance, imagine transforming a metric like "maximum vehicle speed," which is directly connected to nodes representing collision events, with edge weights equal to the vehicle's speed during those events. After the transformation, the same header node "maximum vehicle speed" links to a new set of nodes whose labels are the vehicle speeds. These nodes, in turn, connect to the event nodes.
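The restructuring described above can be sketched with dictionaries standing in for the graph. The to_categorical function and the sample speeds are hypothetical. The sketch also assumes that events with the same speed share a single intermediate node, which this guide does not confirm.

```python
def to_categorical(header, edges):
    """Turn weighted header-to-event edges into labeled intermediate nodes."""
    intermediate = {}
    for event, weight in edges.items():
        # The edge weight becomes the label of an intermediate node.
        # Assumption: equal weights are grouped under one intermediate node.
        intermediate.setdefault(str(weight), []).append(event)
    return {header: intermediate}

# Assumed sample data: collision event -> vehicle speed stored as edge weight.
speeds = {"collision_1": 88, "collision_2": 120, "collision_3": 88}
graph = to_categorical("maximum vehicle speed", speeds)
```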
Numerical Transformation:
You can convert a categorical variable into a numerical variable when all categorical values stored in node labels are numbers.
For example, consider a categorical header "number of items purchased," linked to three intermediate nodes labeled "10," "5," and "8". These nodes connect to data nodes representing rows in your database. Upon applying a numerical transformation, the "number of items purchased" header node will directly connect to the data nodes with edge weights of 10, 5, and 8 respectively.
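The reverse direction can be sketched the same way. The to_numeric function and the row names are assumptions for illustration. The sketch raises an error when a label is not a number, mirroring the requirement that all categorical values be numeric. How the product itself reports that case is not described here.

```python
def to_numeric(intermediate):
    """Collapse labeled intermediate nodes into weighted header-to-data edges."""
    edges = {}
    for label, data_nodes in intermediate.items():
        # float() raises ValueError if the label is not a number.
        weight = float(label)
        for node in data_nodes:
            edges[node] = weight
    return edges

# Assumed sample data: intermediate node label -> connected data nodes.
items_purchased = {"10": ["row_1"], "5": ["row_2"], "8": ["row_3"]}
edge_weights = to_numeric(items_purchased)
```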
Note that for all transformations, enabling "In Place" applies changes directly to your current selection of nodes.
When the "In Place" box is unchecked, a new set of nodes will be created for each transformation and the new header node will be stored in the Home view under a "transformed_columns" header.
Cleaning using the Feature Extractor Agent
Note that the Feature Extractor Agent can also be used to clean your taxonomy via natural language queries such as "Remove the punctuation from the labels of my selected nodes". New feature nodes will always be created in this case.
Propagate
The Propagate tool can be used to transfer labels, edges, and URLs from one group of nodes to another, provided they are connected by edges and intermediary nodes.
Source
: Nodes containing the URL label or the edge intended for relocation.
Target
: Nodes set to receive the updated label, URL, or edges.
Header
: Required for edge propagation, these nodes possess outgoing edges intended for transfer.
Note that Propagate can only transfer properties between nodes that are at most four hops apart in the graph, where one hop is an edge plus node pair.
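To check ahead of time whether a source and a target fall within this limit, a breadth-first search over an exported adjacency list is one option. The sketch below is an assumption-laden illustration: the within_hops function, the adjacency dictionary format, and the choice to treat edges as undirected are all hypothetical and not taken from the product.

```python
from collections import deque

def within_hops(adjacency, source, max_hops=4):
    """Return each node reachable from source in at most max_hops, with its distance."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances

# Assumed sample graph: a chain a-b-c-d-e-f, listed in both directions.
chain = ["a", "b", "c", "d", "e", "f"]
adjacency = {name: [] for name in chain}
for left, right in zip(chain, chain[1:]):
    adjacency[left].append(right)
    adjacency[right].append(left)

reachable = within_hops(adjacency, "a")
```

In this chain, node e is exactly four hops from a and so is reachable, while f is five hops away and is excluded.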
Delete Nodes
To delete nodes, select the nodes you wish to delete, then use the Delete Nodes option, located in the Node menu along the top toolbar or towards the bottom of your context menu.