Once you load or create a network in SocNetV, you may use the options in the
Analysis menu to analyse it.
The first option in the Analysis menu is (Symmetry Test).
It reports whether the network is symmetric or not.
A network is called "symmetric" if for every edge (i,j) in the set E of the
corresponding graph G(V,E) , the 'opposite' (j,i) edge also exists in E.
In other words, when the adjacency matrix is symmetric.
The next options in the Analysis menu (Distance, Average Distance, Distances Matrix, Diameter) etc focus on basic network/graph measures, such as the geodesic distance between nodes, the mean distance between all nodes, the diameter of the graph, the number of geodesics between nodes and the eccentricity of each node. Each option is explained below.
In graph theory, the shortest path between two vertices of the graph
is called "geodesic".
The distance (or geodesic distance) of two nodes in a social network is the
length of the shortest path between the corresponding vertices in the graph G(V,E).
By clicking on the "Distance" option (or pressing Ctrl+G) you will be asked for
source and target nodes.
Then their distance will be calculated and displayed.
The average distance in a social network is the average length of all shortest paths (geodesics) between the connected pairs of vertices in the corresponding graph.
The 'Distances Matrix' option calculates and displays a matrix of geodesic distances between all possible pair of nodes in the social network. A distances matrix is a n x n square matrix, in which the (i,j) element is the distance from node i to node j.
This option calculates and displays a n x n square matrix, where the (i,j) element is the number of geodesics between node i and node j. The produced matrix, called sigma matrix, is used in Centralities calculation (see below).
The eccentricity or association number of each node i is the largest geodesic
distance between node i and every other node in the graph.
Therefore, in social network analysis, the eccentricity reflects how far, at
most, is each node from every other node.
This index can be calculated in both graphs and digraphs but is usually best
suited for undirected graphs.
It can also be calculated in weighted graphs although the weight of each
edge (v,u) in E is always considered to be 1.
The diameter of a social network is the maximum eccentricity of any vertex in the corresponding graph G(V,E), that is the maximum distance between any two connected nodes.
Checks whether the network is a connected graph, a weakly connected digraph or
a disconnected graph/digraph.
In graph theory, a graph is connected if there is a path between every pair of nodes.
A digraph is strongly connected if there the a path from i to j and
from j to i for all pair of nodes (i,j).
A digraph is weakly connected if at least a pair of nodes are joined by a semipath.
A digraph or a graph is disconnected if at least one node is isolate.
Clicking this option asks for a desired walk length (max: n-1).
Then SocNetV calculates and displays a square matrix where each element (i,j)
is the number of walks of the given length between the corresponding pair of
nodes i and j.
A walk is a sequence of alternating vertices and edges such as
v0e1, v1e2, v2e3, …, ekvk,
where each edge, ei is defined as ei = {vi-1, vi}.
This function calculates the number of walks of the given length between
each pair of nodes, by studying the powers of the sociomatrix.
Calculates and displays a (n x n) square matrix whose elements denote the
number of walks of any length between each pair of nodes.
The algorithm is based on the powers of the sociomatrix.
Please note that this function is VERY SLOW on large networks (n > 50),
since it will calculate all powers of the sociomatrix up to (n-1) in order to
find out all possible walks.
If you need to make a simple reachability test, we advise to use the
Reachability Matrix function instead.
Calculates the reachability matrix XR of the graph where each (i,j)
element is 1 if nodes i and j are reachable, otherwise is 0.
This function is based on the Distances Matrix; it checks whether the
corresponding element of the Distances matrix is not zero.
If it is not zero, then the nodes (i,j) are reachable and the XR
element is 1.
A triangle or a clique is a complete subgraph of three nodes of G. It is defined as delta(v) = |{{u, w} in E : {v, u} in E and {v, w} in E}|.
This method computes the number of triangles each node v belongs to and the total number of cliques present in the network.
Please note this method can be very slow. Its computational cost increases very rapidly in large and / or dense networks, it can run for hours or even days :)
The Clustering Coefficient of a node quantifies how close the node and its
neighbors are to being a clique.
This is used to determine whether a network is a small-world or not.
This option calculates and displays the clustering coefficients of all nodes.
For the ring lattice the clustering coeffient is $$ C(0)=\frac{3(K-2)}{4(K-1)} $$ tending to \( 3/4 \) as \( K\) grows, where K the mean degree.
Tip: All the basic network statistics, such as nodes, edges and density are displayed and automatically updated in the Analysis tab of the left dock in SocNetV main window.
By clicking the "Triad Census" menu option, SocNetV will examine each of the
triads present in the current network, and count how many of these belong to a
certain triad type.
Some background:
In any network of N actors, there are C(N,3) triads.
For instance, in a network of 6 actors there are C(4,3)=20 triads,
whereas in a network of 10 actors there are C(10,3)=60 triads.
In any case, though, there can be only sixteen different triad types
(isomophism classes).
Every one of the C(N,3) triads of a network must belong (be isomorphic) to one
of these sixteen types.
A Triad Census is a method which counts all the different types (classes) of
observed triads within a network.
The triad types are coded and labeled according to their number of mutual,
asymmetric and non-existent (null) dyads.
SocNetV follows the M-A-N labeling scheme, as described by Holland, Leinhardt
and Davis in their studies.
In the M-A-N scheme, each triad type has a label with four characters:
In the seven rows below, you can see all the sixteen triad types (classes).
Within each row, all the triad types have the same number of arcs present:
003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210 300
So, when you click on Triad Census menu option, SocNetV calculates and displays
a vector T of length 16.
Each vector element (Tu) is the frequency of each one triad type inside the
active network, i.e. T003 = 3.
Furthermore, the order of the elements of vector T is the same as the
aforementioned ordering of the triad types:
T = [ T003, T012, T102, T021D, T021U, T021C, T111D, T111U, T030T, T030C, T201, T120D, T120U, T120C, T210, T300 ]
Apparently, the sum of all these frequencies Tu is C(N,3).
The last option in the Analysis menu opens the "Centrality and Prestige" sub-menu.
Social network analysts use various metrics (measures or indices) to calculate how 'prominent' or important each actor (node) is inside a network (graph). For instance, we might want to know how important is a person inside her friendship network or how critical is a power station inside the power company grid...
Although there are various metrics, focusing on different graph notions and applying to different graph types, they are usually refered to as 'centralities' collectively.
SocNetV follows the conceptualization of prominence that Wasserman and Faust as well as Knoke and Burt use in their essays. To our understanding, all indeces attempt to measure the visibility, the importance or the "prominence" of each node. But we distinguish two types of prominence: Centrality and Prestige.
Centrality metrics attempt to quantify how central is each node inside the network and usually examine the ties attached to that node and its geodesic distances (shortest path lengths) to other nodes. Most Centrality indices were designed for undirected graphs (symmetric), where the relations are non-directional. For instance, SocNetV can calculate Betweenness, Closeness, Degree, Stress, Graph and Eccentricity centrality indices.
For digraphs, where the relations are directional, most centrality indices can also be calculated by focusing on "choices made" (or outEdges). But due to the nature of the directional relations in digraphs, the social networks researcher usually needs to measure the "prestige" (as known as status, rank or popularity) of each node, focusing on "choices received" by other nodes rather than "choices made" by that node. Prestige indices focus exactly on "choices received" to a node. These indices measure the nominations or ties to each node from all others (or inLinks). Thus, Prestige indices can only be calculated on directed graphs..
Centrality indices are calculated for each node (node Centrality) and for the whole network (group Centrality). Thus, when you click on a centrality option, SocNetV will calculate the corresponding index of every node and the whole network and it will display them in a new window (a small text editor). From there you can save the analysis into a text file of your choice. By default, analysis files are saved on bin/ subfolder.
In undirected graphs, the DC index of each node v is the number of edges attached to it.
In digraphs, the DC is the total number of arcs (outEdges) starting from v (outDegree).
In social network theory, DC is often considered a measure of actor activity.
Along with other metrics which are based on the notion of distance
(closeness, eccentricity etc), DC falls in the category of reachability measures.
This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs.
It can also be calculated in weighted graphs. In weighted relations, ODC is the sum of weights of all edges/outEdges attached to v.
To compute Degree Centralization (or Group Degree Centrality in SocNetV-speak),
we use the Freeman's formula for unvalued graphs.
Note: For valued or weighted graphs, we cannot calculate Group DC using Freeman's formula.
You can use DC variance or mean instead.
For each node u, the CC score is the inverse sum of geodesic distances from that node to
every other node.
This measure focuses on how close each node is to all other nodes in the network.
Nodes with high closeness centrality are those who can reach many other nodes in few steps.
The idea is that a node is more central if it can quickly interact with more of the others.
CC is also interpreted as the ability to access information through the "grapevine" of network members.
This index can be calculated in graphs and strongly connected digraphs
(that is, if there is a directed path from v to u for all nodes v and u in the graph).
If there are isolate nodes in the network, they are dropped by default.
In not strongly connected digraphs, the ordinary CC is undefined.
In that case, you can use the Influence Range Closeness Centrality index.
CC can also be calculated in weighted graphs although the value of each
edge (u,v) is always considered to be 1.
The maximum value of CC is 1/(N-1), when the node is adjacent to all others.
Thus the standardized CC index (CC') is calculated by (N-1) * CC.
Group CC is calculated using Freeman's general formula, in undirected graphs:
Note: As with all centrality indices in directed graphs, CC considers only outbound edges. If you want to analyze inbound edges, use Prestige indices, i.e. Proximity Prestige.
For each node u, IRCC is the standardized inverse average distance
between u and every other node reachable from it.
This improved CC index is optimized for graphs and directed graphs which
are not strongly connected. Unlike the ordinary CC, which is the inverted
sum of distances from node v to all others (thus undefined if a node is isolated
or the digraph is not strongly connected), the IRCC considers only distances from
node u to nodes in its influence range J (nodes reachable from u).
The IRCC formula used is the ratio of the fraction of nodes reachable by u
(|J|/(n-1)) to the average distance of these nodes from u sum(d(u,j))/|J|
For each node u, BC is the ratio of all geodesics between pairs of nodes which run through u.
It reflects how often that node lies on the geodesics between the other nodes of the network.
The BC score of each actor can be interpreted as a measure of potential control as
it quantifies just how much that actor acts as an intermediary to others.
An actor which lies between many others is assumed to have a higher likelihood
of being able to control information flow in the network.
In essence, BC assumes that communication in a network occurs along the shortest possible path, the geodesic.
It totally neglects the possibility of communication along non-geodesic paths between actors.
If you need a centrality index which considers all possible paths, use the Information Centrality (IC).
Note that betweenness centrality assumes that all geodesics have equal weight
or are equally likely to be chosen for the flow of information between any two nodes.
This is reasonable only on "regular" networks where all nodes have similar degrees.
On networks with significant degree variance you might want to try using IC instead.
Also, BC is very sensitive to network dynamics, i.e. it changes a lot when we add or remove
a vertex or an edge.
This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs.
It can also be calculated in weighted graphs although the weight of each edge (u,v) in E is always considered to be 1.
The SC of each node u is the total number of geodesics between all other nodes which run through u.
When one node falls on all geodesics between all the remaining (N-1) nodes,
then we have a star graph with maximum Stress Centrality.
This index reflects how often a node lies on the geodesics between other nodes.
This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs.
It can also be calculated in weighted graphs although the weight of each edge (v,u) in E is always considered to be 1
This index was introduced by Shimbel (1953).
For each node u, the Eccentricity Centrality is the inverse of the largest geodesic distance (u,v) to every other node v in the network.
Therefore, the EC score reflects how close is each node to every other node and therefore to the middle of the network.
Nodes with high EC score have short distances to other nodes in the graph and therefore are likely to be near the middle of the network.
Nodes with low EC score have longer distances to some other nodes in the graph, and therefore are most likely towards the "edge" of the network.
This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs.
It can also be calculated in weighted graphs although the weight of each edge (v,u) in E is always considered to be 1.
The EC is also known as Graph Centrality (Hage and Harary, 1995).
The Power Centrality (PC) is a a generalised degree centrality measure suggested by Gil and Schmidt.
For each node u, this index sums its degree (with weight 1), with the size of the
2nd-order neighbourhood (with weight 2), and in general, with the size of the kth
order neighbourhood (with weight k).
Thus, for each node u the most important other nodes are its immediate
neighbours and then in decreasing importance the nodes of the 2nd-order neighbourhood,
3rd-order neighbourhood etc.
For each node, the sum obtained is normalised by the total numbers of nodes in the same component minus 1.
This index can be calculated in both graphs and digraphs but is usually best suited
for undirected graphs.
It can also be calculated in weighted graphs although the weight of each edge
(u,v) in E is always considered to be 1 (therefore not considered).
The Information Centrality (IC) is an index suggested by Stephenson and Zelen (1989)
which focuses on how information might flow through many different paths.
Unlike SC and BC, the IC metric uses all paths between actors weighted by strength of tie and distance.
The IC' score is the standardized IC (IC divided by the sumIC) and can be seen as
the proportion of total information flow that is controlled by each actor.
Note that standard IC' values sum to unity, unlike most other centrality indices.
Since there is no known generalization of Stephenson & Zelen's theory for information centrality
to directional relations, the index should be calculated only for graphs and is
more meaningful in weighted graphs/networks.
Note: To compute this index, SocNetV drops all isolated nodes and symmetrizes
(if needed) the adjacency matrix even when the graph is directed (Wasserman & Faust, p. 196).
In order to calculate the IC index of each actor, we create a N x N matrix A from the (symmetrized) sociomatrix with:
Aii=1+weighted_degree_ni
Aij=1 if (i,j)=0
Aij=1-wij if (i,j)=wij
Next, we compute the inverse matrix of A, for instance C. Note that we can always compute C since the matrix A is always a diagonally strong matrix, hence it is always invertible.
Finally, IC is computed by the formula:
IC(i) - 1 / [ Cii + (T-2R)/ N]
where:
T is the trace of matrix C (the sum of diagonal elements) and
R is the sum of the elements of any row (since all rows of C have the same sum)
IC has a minimum value but not a maximum.
For each node u, this metric counts the number of inbound arcs at u.
The index is meaningful in directed graphs as a measure of the prestige of each node.
Thus it is called Degree Prestige (it is also known as InDegree Centrality).
Note that in undirected graphs, the DP index is identical to Degree Centrality.
Actors with higher DP are considered more prominent among others because
they receive more nominations or choices (they have larger inDegree).
The largest the index is, the more prestigious is the node/actor.
This index can be calculated only for unvalued or valued digraphs.
In weighted relations, DP is the sum of weights of all arcs/inLinks ending at node v.
In unvalued graphs, SocNetV computes Group Degree Prestige using the Freeman's formula .
Note: For valued or weighted graphs, we cannot calculate Group DC using Freeman's formula.
You can use DP variance or mean instead.
The PageRank Prestige is an importance ranking for each node based on the structure of its in-link.
The original PageRank algorithm, developed by Page and Brin (1997),
focuses on how nodes are connected to each other, treating each link from a node
as a citation/backlink/vote to another.
In essence, for each node PageRank counts all backlinks to it, but it does so
by not counting all links equally while it normalizes each link from a node by
the total number of links from it.
The PRP score of each node is calculated iteratively and corresponds to the
principal eigenvector of the normalized link matrix.
This index can be calculated in both graphs and digraphs but is usually best suited for directed graphs since it is a prestige measure.
It can also be calculated in weighted graphs.
In weighted relations, each backlink to a node u from another node v is considered
to have weight=1 but it is normalized by the sum of outEdges weights (outDegree) of v.
Therefore, nodes with high outLink weights give smaller percentage of their PR to node u.
The Proximity Prestige index measures how proximate a node u is to the nodes in its
influence domain I - the influence domain I of a node is the number of other nodes that can reach it.
Apparently, in this metric the proximity of each node u is based on distances to rather than distances from it.
To put it simply, in PP what matters is how close are all the other nodes to node u.
The algorithm takes the average distance to node u of all nodes in its influence domain,
standardizes it by multiplying with (N-1)/|I| and takes its reciprocal.
In essence, the formula SocNetV uses to calculate PP for every node u is the ratio
of the fraction of nodes that can reach node u, to the average distance of that nodes to u
(Wasserman & Faust, formula 5.25, p. 204):
PP = (I/(N-1))/(sum{d(v,u)}/I)
where the sum is over every node v in I.