Creating a network from a table of entities and their attributes

2017-02-22

last modified: 2023-04-10



   

Gephi workshops

I organize online workshops and personalized trainings for Gephi, for beginners and experts. To schedule one or to get more information: analysis@exploreyourdata.com.

Update

This plugin has been replaced by a new and improved function available on the web, developed by the same author. It is free, point-and-click, and requires no registration: https://nocodefunctions.com

Presentation of the plugin

This plugin was created by Clement Levallois.

It converts a spreadsheet or a csv file into a network.

This plugin enables you to:

  • Start from a data table in Excel or csv format

  • In the data table, nodes are the entities listed in column A

  • Nodes' attributes must be listed in columns B, C, D, etc.

  • Connections are created between nodes when they have identical attributes.

  • Attributes can have values, stored in columns right next to the attribute.
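The idea behind this list can be sketched in a few lines of Python. This is only an illustration, not the plugin's code: the table contents, the `build_edges` helper, and the choice to compare attributes column by column and count the matches are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical table: column A is the entity, columns B and C are attributes.
rows = [
    ["Anna", "tennis", "Paris"],
    ["David", "tennis", "Lyon"],
    ["Maria", "chess", "Paris"],
    ["Tom", "golf", "Rome"],
]

def build_edges(rows):
    """Link two entities when they share an identical attribute in the same column.

    The edge weight here is simply the number of shared attributes
    (an assumption for this sketch; the plugin computes a similarity instead).
    """
    edges = {}
    for a, b in combinations(rows, 2):
        shared = sum(1 for x, y in zip(a[1:], b[1:]) if x == y)
        if shared > 0:
            edges[(a[0], b[0])] = shared
    return edges

edges = build_edges(rows)
print(edges)
```

In this made-up table, Anna is linked to David (both play tennis) and to Maria (both are in Paris), while Tom shares nothing with anyone and stays unconnected.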

1. The input

Figure 1. An Excel file

   

2. The output

Figure 2. Resulting network

   

Installing the plugin

Figure 3. Choose the menu Tools then Plugins

   

Figure 4. Click on the tab Available Plugins

   

Figure 5. Install the plugin Similarity Computer then restart Gephi

   

Opening the plugin

Figure 6. Open the plugin via the menu File - Import

   

Using the plugin

First panel

Figure 7. Select a file

   

Figure 8. A file without headers

   

Figure 9. A file with headers

   

Second panel

Figure 10. Parameter for weight

   

Third panel

Figure 11. Confirmation panel

   

How is the similarity computed, exactly?

We use cosine similarity. It sounds complicated, but it is not. Check here.

The source code for the cosine calculation is in this file, at this place.
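For readers who want to see the formula at work, here is a minimal sketch of cosine similarity in Python. It is not the plugin's source code: the `cosine_similarity` function, the representation of each entity as a dictionary of attribute weights, and the sample entities are assumptions chosen for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    # Dot product over the attributes that both entities have.
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an entity without attributes is similar to nothing
    return dot / (norm_u * norm_v)

# Hypothetical entities: each attribute gets a weight of 1.
anna = {"tennis": 1.0, "Paris": 1.0}
david = {"tennis": 1.0, "Lyon": 1.0}
tom = {"golf": 1.0}

print(cosine_similarity(anna, david))  # close to 0.5: one shared attribute out of two
print(cosine_similarity(anna, tom))    # 0.0: nothing in common
```

The result is 0 when two entities share no attribute and approaches 1 when their attributes are the same, which makes it a natural candidate for an edge weight.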

FAQ / special notes on the plugin

1. Excel files should be .xlsx, not .xls

These are two slightly different file formats, and the plugin supports only .xlsx.

2. csv files are ok.

If you select a csv file, you will be asked to indicate the field delimiter and optionally the text delimiter.

Figure 12. When a csv file is selected
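To make the two delimiters concrete, here is a small sketch using Python's standard `csv` module. The file content is invented for the example, and the semicolon and double quote are just one possible choice of field delimiter and text delimiter.

```python
import csv
import io

# Hypothetical file content: ";" separates fields, '"' wraps text containing a ";".
raw = 'Anna;"tennis; doubles";Paris\nDavid;tennis;Lyon\n'

reader = csv.reader(io.StringIO(raw), delimiter=";", quotechar='"')
rows = list(reader)
print(rows)
```

Thanks to the text delimiter, `tennis; doubles` is read as a single attribute instead of being split into two columns.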

   

3. You can’t use numerical values in the attributes

Figure 13. Age is a numerical attribute

   

This is too bad. If there is enough demand for it, I’ll add this feature, which is not trivial.

4. Each entity should appear only on one line

Figure 14. An entity appearing twice

   

David appears on lines 2 and 5 (because he made two purchases). Only the latest line where David appears (line 5) will be taken into account.
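This "last line wins" behavior is what you get when entities are stored by name, as in the sketch below. The rows are hypothetical (a header would be line 1, so the data starts at line 2), and storing entities in a dictionary is an assumption made for illustration, not a description of the plugin's internals.

```python
# Hypothetical data rows, corresponding to lines 2 to 5 of the file.
rows = [
    ["David", "laptop"],   # line 2
    ["Anna", "book"],      # line 3
    ["Maria", "phone"],    # line 4
    ["David", "camera"],   # line 5
]

entities = {}
for row in rows:
    # A later row with the same name overwrites the earlier one.
    entities[row[0]] = row[1:]

print(entities["David"])
```

If both purchases matter, one way to keep them is to merge them onto a single line for David before importing, using one column per attribute.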

To go further

Visit the Gephi group on Facebook to get help.

Try nocodefunctions.com, the web application I develop to create networks for Gephi. It is point-and-click, free, and requires no registration.
