Extract features

Feature Extraction Agent

Think of the Extract Agent as your go-to solution for extracting insights from your unstructured data. This powerful agent pulls structured features from many types of data, whether it's textual, JSON, image, or audio data (mining information from videos is in the works).

With the Extract Agent, you can expect to:

  • Conduct sentiment analysis
  • Perform ontology extraction
  • Have full control over entity resolution

Featurizer

With the Extract Agent being so popular among Conode users, we've coined a flashy name for it: the Featurizer!

How to effectively use the Featurizer

The Featurizer enriches your knowledge graph, helping you get more out of your data. Using it effectively follows three main steps: Input, Prompt, Extract.

1. Input

Supply the nodes you want to analyse to the agent by highlighting them, then either dragging and dropping them or clicking the green button.

  • The nodes you have directly input will be used by the Featurizer.
  • If the Consider the successors box is checked, the added nodes are interpreted as header/feature nodes, and the agent instead looks to act on their successors.
  • If your nodes contain image or audio data and you'd like features to be extracted from them, check the Use node content box.
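
The effect of the Consider the successors box can be pictured with a small sketch. This is only an illustration of the rule described above, not Conode's implementation: the adjacency map, the node names, and the `resolve_targets` function are all hypothetical.

```python
from typing import Dict, List

# Hypothetical adjacency map: each node maps to its list of successors.
Graph = Dict[str, List[str]]


def resolve_targets(graph: Graph, inputs: List[str], consider_successors: bool) -> List[str]:
    """Return the nodes the agent would act on for a given input selection."""
    if not consider_successors:
        # Box unchecked: the input nodes themselves are used.
        return list(inputs)
    # Box checked: inputs are treated as header/feature nodes,
    # so the agent acts on their successors instead (deduplicated, order kept).
    targets: List[str] = []
    seen = set()
    for header in inputs:
        for successor in graph.get(header, []):
            if successor not in seen:
                seen.add(successor)
                targets.append(successor)
    return targets


# Made-up example: sender addresses are header nodes, emails are their successors.
graph = {
    "alice@example.com": ["email_1", "email_2"],
    "bob@example.com": ["email_2", "email_3"],
}

direct = resolve_targets(graph, ["email_1"], consider_successors=False)
via_headers = resolve_targets(
    graph, ["alice@example.com", "bob@example.com"], consider_successors=True
)
print(direct)
print(via_headers)
```

In this sketch, passing two sender nodes with the box checked yields every email they point to, each listed once.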

Inputting Examples

In this video example, we select emails (represented as nodes) from a specific time period and use them as input. If we instead want to look at all of the senders’ email content, we simply input the header nodes (the senders’ email addresses) and check the Consider the successors box. In the latter part of the video, we have nodes on a spatial view that contain images, and we instruct the agent to use the node contents by checking the Use node content box.

2. Prompt

Simply tell the Featurizer what you want. Type in what you’d like to see, and the agent will think, then reply with a preview of the features it thinks best answer your prompt.

You can either accept the preview as is, or continue to chat to refine certain features until you’re satisfied with what will be extracted.

Conode’s AI Agents give you complete transparency and control over what is extracted from your data and how.

3. Extract

Hit the Extract Features button and let the Featurizer handle the rest.

Understanding the results of the Featurizer

Upon a successful extraction, a view of the results will appear. A No Features returned node means that the nodes connected to it did not satisfy the given prompt, in which case you can remove them from view.

Extracted schema structure

The graph schema of the results may vary slightly but generally follows this structure from left to right: the header of the extracted features, the extracted features themselves, and the nodes from which the Featurizer extracted information (the nodes you input).
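
As a rough mental model, the three layers can be written down as a list of directed edges. The sketch below is an assumption-laden illustration, not an export format or API of Conode: the edge list, the node names, and the `feature_table` helper are invented for the example. It flattens the header → feature → source node structure into rows, similar in spirit to what a table view shows.

```python
from collections import defaultdict

# Hypothetical result graph as (predecessor, successor) pairs:
# header -> extracted feature -> source node.
edges = [
    ("Sentiment", "positive"),
    ("Sentiment", "negative"),
    ("positive", "email_1"),
    ("positive", "email_3"),
    ("negative", "email_2"),
]

# Build a successor lookup from the edge list.
successors = defaultdict(list)
for predecessor, successor in edges:
    successors[predecessor].append(successor)


def feature_table(header):
    """Flatten header -> feature -> source node into (feature, source_node) rows."""
    return [
        (feature, node)
        for feature in successors[header]
        for node in successors[feature]
    ]


rows = feature_table("Sentiment")
for feature, node in rows:
    print(f"{feature}\t{node}")
```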

There are a variety of ways to visualise the results:

  • Horizontal layout. Highlight the extracted features and their successors, then select Layout > Horizontal.

  • In a table. Highlight the extracted feature nodes, then select New View > Table.

  • Force-directed layout. Highlight the extracted features and their successors, then select Layout > Force-directed.

Tidying up the results view

You can either remove the header node or add the feature and data nodes to a new view. Selecting a layout function then acts on all the nodes in that view, eliminating the need to highlight specific nodes every time you’d like to visualise the results.

Examples of the Featurizer in action

Featurizing textual data from Enron Corpus emails to extract sentiment

The agent returned a sentiment feature that only classifies the emails as ‘positive’, ‘negative’ or ‘neutral’, but we’d like it to pull out more emotions, so we refine the prompt further.

See how we used the agent to conduct a forensic investigation on the Enron Corpus to identify evidence of fraud here ↗.

Featurizing textual data from airline reviews to extract reasons for poor experience

Featurizing visual data from manual image collection to extract road infrastructure features

Here we collected images from two areas in Cambridge, prompted the agent to look at them through the node contents, and extracted the features we’re interested in for AV safety. We then viewed the results in a table.


Feature Describer

The Describer tells you the predecessor nodes that best “describe” your selection of nodes. By default, this description is by comparison to the other “background” nodes in the active view; open the settings by clicking the button in the bottom right for more advanced options.

Describing predecessors are presented in a table alongside their importance. Large values of importance, whether positive or negative, indicate that a given predecessor is a good description of the selection. Positive importance values indicate that the predecessor is more likely to connect to the selection of nodes than to the background nodes, or does so with a greater edge weight; negative values indicate the converse.
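
To build intuition for the sign and size of an importance value, here is a deliberately simple stand-in score: the share of selected nodes a predecessor connects to, minus the share of background nodes it connects to. This is an assumption made for illustration only. It is not the Describer's actual formula, it ignores edge weights, and the `importance` function and all node names are hypothetical.

```python
from typing import Dict, Set


def importance(connected: Set[str], selection: Set[str], background: Set[str]) -> float:
    """Toy score: connection rate in the selection minus rate in the background."""
    if not selection or not background:
        raise ValueError("selection and background must both be non-empty")
    selection_rate = len(connected & selection) / len(selection)
    background_rate = len(connected & background) / len(background)
    return selection_rate - background_rate


selection = {"e1", "e2"}
background = {"e3", "e4", "e5", "e6"}

# Hypothetical predecessors and the nodes each one connects to.
predecessors: Dict[str, Set[str]] = {
    "angry": {"e1", "e2", "e3"},
    "neutral": {"e3", "e4", "e5"},
    "misc": {"e1", "e3", "e4"},
}

scores = {
    name: importance(nodes, selection, background)
    for name, nodes in predecessors.items()
}

# Rank by magnitude: large positive or negative values are both informative.
ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
for name in ranked:
    print(name, scores[name])
```

Under this toy score, "angry" comes out strongly positive (it connects mostly to the selection), "neutral" strongly negative (it connects mostly to the background), and "misc" scores zero because it connects to both at the same rate.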

To see all of the describing predecessors and their importances in a view, click “Create View”.


Ontology Extraction

An ontology is a formal definition of categories of entities and the relations between them. Conode’s Ontology Extraction Agent emulates this by creating a knowledge graph structure from unstructured text in just minutes. By matching a pre-defined schema to your input data, the agent organises entities for subsequent structured analysis.

The Ontology Extraction Agent is dynamic and follows a schema-on-read approach to let you build, refine, and extend ontologies. This works with PDFs, emails, and any other textual data.

Defining the Ontology

  1. Open a new view and name it Ontology.

    This is where you will construct a subgraph that defines the schema of what you want extracted from any data you input.

  2. Create an ontology header node.

    This node serves as the header and predecessor for all features.

  3. Add however many nodes you’d like and rename them to reflect the features you’d like extracted from the data.

    Draw a connecting edge from the ontology header node to each node you created in this step, ensuring that they are all successors of that one node.

  4. Encode relationships between the node entities in step 3 as successors or predecessors by drawing connecting edges.
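
The subgraph built in these steps can be pictured as a small set of directed edges. The sketch below is illustrative only: the entity names (Person, Company, Transaction) and the helper functions are hypothetical, and in Conode the ontology is drawn in a view rather than written as code.

```python
# Hypothetical ontology as (predecessor, successor) pairs.
HEADER = "Ontology"
ontology_edges = [
    # Step 3: every feature node is a successor of the header node.
    (HEADER, "Person"),
    (HEADER, "Company"),
    (HEADER, "Transaction"),
    # Step 4: relationships between the feature nodes themselves.
    ("Person", "Company"),
    ("Company", "Transaction"),
]


def features_of(header, edges):
    """Feature nodes: the direct successors of the header node."""
    return [succ for pred, succ in edges if pred == header]


def relationships(header, edges):
    """Edges between feature nodes, i.e. everything not starting at the header."""
    return [(pred, succ) for pred, succ in edges if pred != header]


def unattached(header, edges):
    """Nodes that appear in the ontology but are not successors of the header."""
    all_nodes = {node for edge in edges for node in edge} - {header}
    return sorted(all_nodes - set(features_of(header, edges)))


print(features_of(HEADER, ontology_edges))
print(relationships(HEADER, ontology_edges))
print(unattached(HEADER, ontology_edges))
```

A check like `unattached` mirrors the requirement in step 3: an empty result means every feature node hangs off the single header node.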

How is Ontology Extraction (OE) different from Feature Extraction (FE)?

→ In OE, you define the features you want extracted first, and these will always be the features extracted unless you change the ontology. In FE, you specify the features you want extracted in natural language, refine them as many times as you’d like until satisfied, and then extract.

  • Think of it this way: you’ve got some dough, and you only need to make one thing, a bunny-shaped cookie. You create a bunny-shaped stencil, and whatever dough you input, you get the same result. Come December, you want to add a Santa hat to this bunny, so you simply update your stencil to include it. That is OE, where stencil = ontology, dough = data, cookie = extracted features.

  • Using the same analogy for FE, you don’t have a prepared stencil. Whatever order comes through (gingerbread, star, heart), the agent makes the stencil on the spot. You check that you’re happy with it before inputting the dough.

→ Another key difference between OE and FE lies in the relationships between the extracted nodes.

  • In FE, each extracted feature stands alone, linked only to its connected successors (data points). In OE, because the relationships between the features have been defined beforehand, you can also explore the connections between features.

Extracting Features using the Ontology

  1. Input the ontology header node into the Ontology header box. This tells the agent which defined structure to follow when extracting.

  2. Input the nodes that contain unstructured textual data into the Nodes to consider box.

  3. Hit Extract and let the Ontology Agent handle the rest.

  4. View the results in a force-directed layout to identify relationships between data points and features.


Last update: 2025-04-23