I have a graph with many nodes and weighted edges.
The graph is generated from a pandas DataFrame. Here's a sample input DataFrame:
A B count
x y 3
x z 2
y x 5
y z 1
z x 1
The rule for ordering the nodes is: if the count for A going to B is bigger than the count for B going to A, then A is bigger than B. Otherwise, B is bigger than A. E.g. x to y is 3 and y to x is 5, so y is bigger than x.
Here's how I do it:
I create a list to store the nodes that need to be ranked. This is the output list, e.g. {X, Y, Z}.
Then I create a dict to store all the relationships from the dataframe.
For example: {'X': {'Y': 3, 'Z': 2}, 'Y': {'X': 5, 'Z': 1}, 'Z': {'X': 1, 'Y': 0}}.
I loop through these relations in turn: whenever a higher-ranked node sits after a lower-ranked node, it is inserted directly in front of the lower-ranked node, and the nodes in between are moved back one position.
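A minimal sketch of that loop, to make the procedure concrete. It assumes every node appears as a top-level key of the relations dict and that a missing reverse count is 0. The function name `rank_by_insertion` is only illustrative, and skipping ties is an assumption, since the rule above does not say what to do with equal counts.

```python
def rank_by_insertion(relations):
    """Order nodes from highest to lowest by repeated pairwise insertion."""
    order = list(relations)  # start from the dict's key order
    for a, targets in relations.items():
        for b, a_to_b in targets.items():
            b_to_a = relations.get(b, {}).get(a, 0)
            if a_to_b == b_to_a:
                continue  # assumption: a tie leaves the order untouched
            hi, lo = (a, b) if a_to_b > b_to_a else (b, a)
            i_hi, i_lo = order.index(hi), order.index(lo)
            if i_hi > i_lo:
                # move the higher node to the slot just before the lower one
                order.insert(i_lo, order.pop(i_hi))
    return order


relations = {
    'x': {'y': 3, 'z': 2},
    'y': {'x': 5, 'z': 1},
    'z': {'x': 1, 'y': 0},
}
print(rank_by_insertion(relations))  # ['y', 'x', 'z']
```

The result depends on the order in which relations are visited, which is part of why the approach feels heuristic.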
How the output list changes while the program runs:
{X, Y, Z}
->{Y, X, Z}
->{Y, X, Z}
->{Y, X, Z}
->{Y, X, Z}
->{Y, X, Z}
Hence the output is {Y, X, Z}. This is the only way I could come up with, but it feels too heuristic. Is there a more elegant method? Can it derive a global order A>B>C from partial orders like A>B and B>C? Is this related to any well-known problem?
IIUC, you can use pivot_table to get what you expect:
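# pivot to an A x B matrix of counts (missing pairs become 0),
# then turn each row into a dict that leaves out the row's own label
out = (df.pivot_table(index='A', columns='B', values='count', fill_value=0)
         .agg(lambda s: s.drop(s.name).to_dict(), axis=1).to_dict())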
print(out)
# Output
{'x': {'y': 3, 'z': 2},
'y': {'x': 5, 'z': 1},
'z': {'x': 1, 'y': 0}}
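As for getting a global order out of pairwise results like A>B and B>C: that is what topological sorting of a directed graph does. Below is a rough sketch using the standard-library `graphlib.TopologicalSorter` (Python 3.9+) on the `out` dict from above. The helper name `global_order` is made up for illustration, and treating equal counts as "no constraint" is an assumption on my part. If the pairwise comparisons form a cycle (A>B, B>C, C>A), no consistent order exists and `static_order()` raises `graphlib.CycleError`.

```python
from graphlib import TopologicalSorter


def global_order(relations):
    """Return nodes from biggest to smallest, given pairwise counts."""
    # preds[n] holds every node that must come before n (i.e. is bigger than n)
    preds = {node: set() for node in relations}
    for a, targets in relations.items():
        for b, a_to_b in targets.items():
            b_to_a = relations.get(b, {}).get(a, 0)
            if a_to_b > b_to_a:
                preds.setdefault(b, set()).add(a)  # a is bigger than b
            elif b_to_a > a_to_b:
                preds.setdefault(a, set()).add(b)  # b is bigger than a
            # assumption: equal counts add no constraint
    return list(TopologicalSorter(preds).static_order())


out = {'x': {'y': 3, 'z': 2},
       'y': {'x': 5, 'z': 1},
       'z': {'x': 1, 'y': 0}}
print(global_order(out))  # ['y', 'x', 'z']
```

When several orders are consistent with the comparisons, `static_order()` returns just one of them, so the result is only unique if every pair is (directly or transitively) comparable.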