Fuse your data

Data fusion is the process of connecting siloed, complex and heterogeneous datasets to uncover valuable relationships and insights. Conode's Data Fusion Agent can connect all types of data in a unified, navigable structure.

How does Conode achieve seamless data fusion?

Enterprise data often resides in siloed systems, comes in heterogeneous structures, and contains semantic mismatches that traditional integration methods struggle to address. Our AI agent overcomes these challenges by identifying overlaps between different datasets, and sometimes even within the same dataset, resolving the inconsistencies, and linking the relationships between them to produce a unified graph on which to begin analyses.

Auto Schema Fusion

To speed up the fusing of structured tabular data sources, Conode can automatically fuse data that is imported through a Postgres database connection. Once the data is uploaded, Conode examines all the nodes and merges those with identical labels, eliminating duplicates among their successors. The result is one unified dataset with no redundant variables.

Auto-Fuse Example

In the video below we see a dataset of personal-injury collisions reported to the police, made up of 3 separate tables: casualties, vehicles, and collisions. A few features (the headers in the tables), such as accident_year, are shared between the tables, and Conode has simplified this for us by identifying the shared features and fusing them so that they are the same across all 3 tables.
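The fusing of shared headers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Conode's implementation: each table is modelled as a predecessor node whose successors are its column headers, the function name auto_fuse is hypothetical, and every column name other than accident_year is an invented placeholder.

```python
# Hypothetical model: table name -> list of column headers (its successors).
# This is an illustrative sketch, not Conode's actual data structure or API.
def auto_fuse(tables):
    """Map each distinct column label to the sorted list of tables that use it."""
    fused = {}
    for table, columns in tables.items():
        for column in columns:
            # Identical labels land on the same key, so duplicates collapse.
            fused.setdefault(column, set()).add(table)
    return {label: sorted(owners) for label, owners in fused.items()}


tables = {
    "casualties": ["accident_year", "casualty_class"],  # placeholder columns
    "vehicles": ["accident_year", "vehicle_type"],
    "collisions": ["accident_year", "weather"],
}
fused = auto_fuse(tables)

# Labels that more than one table shares, i.e. the fused features.
shared = [label for label, owners in fused.items() if len(owners) > 1]
```

Here the six column entries collapse to four distinct features, and accident_year ends up as a single node connected to all three tables.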

Group vs Merge

Group creates additional predecessor nodes that are used to categorise the input nodes.
Merge only acts on the nodes in your selection, and reduces the number of nodes after having combined similar ones together.

In short, grouping organises nodes based on shared characteristics, while merging combines distinct nodes into a single representation, eliminating redundancies.
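The difference in node counts can be made concrete with a small Python sketch. It assumes a toy model in which a selection is a dictionary of node id to label and an edge is a (predecessor, successor) pair; the function names, the "group:" id prefix, and the sample labels are all hypothetical and say nothing about how Conode stores its graph.

```python
# Toy model (an assumption): selection maps node id -> label.
def group_by_label(labels):
    """Keep every node and add one new predecessor node per distinct label."""
    nodes = dict(labels)
    edges = set()
    for node_id, label in labels.items():
        group_id = f"group:{label}"  # hypothetical id for the new predecessor
        nodes[group_id] = label
        edges.add((group_id, node_id))
    return nodes, edges


def merge_by_label(labels):
    """Collapse nodes that share a label into the first node seen with it."""
    representative = {}
    for node_id, label in labels.items():
        representative.setdefault(label, node_id)
    return {node_id: label for label, node_id in representative.items()}


selection = {"n1": "Paris", "n2": "Paris", "n3": "Berlin"}
grouped_nodes, grouped_edges = group_by_label(selection)
merged_nodes = merge_by_label(selection)
```

Starting from 3 nodes, grouping ends with 5 (the 3 originals plus 2 new predecessors), while merging ends with 2.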

Group your Data

Data can be grouped in three ways:

Group By Meaning: Finds common themes between node labels and groups them into proposed categories.

For example, given the nodes “Apple”, “Orange”, “Oak” and “Maple”, it would likely identify two groups, “Fruit Types” and “Tree Types”, and add these 2 groups as features to a new view.

Group By Identical Label: Finds and groups nodes with exactly the same labels.

String matching takes punctuation, numerical characters, capitalisation and white space into account, so “Conode” and “conode” will not be grouped together.
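The strictness of this matching can be illustrated with a short Python sketch, assuming labels are compared as raw strings with no normalisation. The function name group_identical and the sample node ids are hypothetical.

```python
from collections import defaultdict


def group_identical(labels):
    """Group node ids by their exact label string."""
    groups = defaultdict(list)
    for node_id, label in labels.items():
        # No lowercasing, trimming, or punctuation stripping is applied,
        # so any difference in the string yields a separate group.
        groups[label].append(node_id)
    return dict(groups)


labels = {"a": "Conode", "b": "conode", "c": "Conode", "d": "Conode "}
groups = group_identical(labels)
```

Only nodes a and c share a group here; the lowercase label and the label with a trailing space each form a group of their own.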

Group By Position: Clusters nodes based on their spatial position in the view.

When the number of groups is specified in the Advanced Settings section, the clustering agent uses a K-Means approach; if no number is specified, it uses Density-Based Spatial Clustering of Applications with Noise (DBSCAN).
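The choice between the two algorithms can be sketched as below. This is a minimal pure-Python illustration, not Conode's clustering agent: the function names, the eps and min_samples defaults, and the evenly spaced K-Means seeding are assumptions made so the example is deterministic and self-contained.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm, seeded with evenly spaced points (a simplification)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = points[:: len(points) // k][:k]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels


def dbscan(points, eps, min_samples):
    """Density-based clustering; points in no dense region are labelled -1."""
    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


def group_by_position(points, n_groups=None, eps=1.5, min_samples=2):
    """K-Means when a group count is given, DBSCAN otherwise."""
    if n_groups is not None:
        return kmeans(points, n_groups)
    return dbscan(points, eps, min_samples)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 20)]
fixed = group_by_position(points, n_groups=2)
auto = group_by_position(points)
```

With 2 groups requested, K-Means places the isolated point at (20, 20) in the nearer cluster, because every point must belong to a group. With no number given, DBSCAN finds the same two dense clusters but labels the isolated point as noise (-1).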

Merge your Data

Merging reduces the nodes in your selection to a single node that contains the combined properties of your selection. Imagine 2 car lanes merging into 1: all the cars from both lanes are now in the single lane.

Merge All Nodes: Combines all selected nodes into a single node.

The label for the representative node becomes a combination of the labels of all the selected nodes. Note that the successors of the selected nodes get combined under this new single representative node.
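A minimal Python sketch of this behaviour follows, using a toy graph of node id to label plus (predecessor, successor) edges. The function name merge_all, the " / " separator used to combine labels, the "+"-joined id of the new node, and the sample data are all assumptions; the source does not say how the combined label is formatted.

```python
def merge_all(labels, edges, selection, separator=" / "):
    """Collapse the selected nodes into one representative node."""
    merged_id = "+".join(selection)  # hypothetical id for the new node
    merged_label = separator.join(labels[node] for node in selection)

    new_labels = {node: label for node, label in labels.items()
                  if node not in selection}
    new_labels[merged_id] = merged_label

    def redirect(node):
        # Any edge end that pointed at a selected node now points at the merge.
        return merged_id if node in selection else node

    new_edges = {(redirect(p), redirect(s)) for p, s in edges}
    new_edges.discard((merged_id, merged_id))  # drop self-loops, if any
    return new_labels, new_edges


labels = {"a": "Cats", "b": "Dogs", "x": "Tabby", "y": "Beagle"}
edges = {("a", "x"), ("b", "y")}
merged_labels, merged_edges = merge_all(labels, edges, ["a", "b"])
```

After the merge, the single node labelled "Cats / Dogs" has both "Tabby" and "Beagle" as successors.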

Merge By Label: Combines all nodes with identical labels.

Remember that merging only takes place on the nodes that you select, so if there are any hierarchical relations that you want to preserve, ensure that you do not select the predecessor node. The video below shows how merging results differ with slightly different selections.

Merge By Group: Combines all successor nodes into their predecessor.

Even if the nodes in the group do not have the same labels, as long as they are connected to a predecessor node, they will be ‘absorbed’ into it with a merge-by-group action, and the representative node will no longer have its successor nodes.
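The absorption can be sketched in Python on a toy graph. The function name merge_by_group and the graph representation are hypothetical, and the sketch assumes the successors being absorbed have no successors of their own; how deeper levels are handled is not covered here.

```python
def merge_by_group(labels, edges, predecessor):
    """Absorb the direct successors of `predecessor` into it.

    Assumes the absorbed nodes are leaves; deeper hierarchies are out of scope.
    """
    absorbed = {s for p, s in edges if p == predecessor}
    new_labels = {node: label for node, label in labels.items()
                  if node not in absorbed}
    # Edges touching an absorbed node disappear with it.
    new_edges = {(p, s) for p, s in edges
                 if p not in absorbed and s not in absorbed}
    absorbed_labels = sorted(labels[node] for node in absorbed)
    return new_labels, new_edges, absorbed_labels


labels = {"fruit": "Fruit Types", "n1": "Apple", "n2": "Orange",
          "tree": "Tree Types", "n3": "Oak"}
edges = {("fruit", "n1"), ("fruit", "n2"), ("tree", "n3")}
new_labels, new_edges, absorbed_labels = merge_by_group(labels, edges, "fruit")
```

The labels "Apple" and "Orange" differ, yet both nodes are absorbed because they share the predecessor "Fruit Types", which is left with no successors. The untouched "Tree Types" group keeps its successor.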

Fusing in Action

You might find it useful to use both Group and Merge in succession, especially when there are duplicates in your data. In this example we’ve managed to reduce the number of scientific paper topics from 79 to 22 with this method.

Last update: 2025-04-15