I have paired values in a csv file. None of the values is unique; each one can appear in more than one pair. I would like to break this large list into independent, complete sets for further analysis.
To illustrate, my "megalist" looks like:
megalist = [['a', 'b'], ['a', 'd'], ['b', 'd'],['b', 'f'], ['r', 's'], ['t', 'r']...]
Most importantly, the output must keep the list of pair values (i.e., it must not consolidate the values). Ideally, the result will eventually be written to separate csv files for later analysis. For example, this megalist would become:
completeset1 = [['a', 'b'], ['a', 'd'], ['b', 'd'], ['b', 'f']]
completeset2 = [['r', 's'], ['t', 'r']]
...
In graph theory terms, I am trying to take a giant graph made up of mutually exclusive (disconnected) subgraphs, where the paired values are vertices connected by an edge, and divide it into independent graphs that are more manageable. Thanks for any input!
Answer 1:
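import csv
import networkx as nx

# Read the tab-delimited pairs; each row becomes one edge of the graph.
with open('megalistfile.csv', newline='') as input_file:
    megalist = csv.reader(input_file, delimiter='\t')
    G = nx.Graph()
    G.add_edges_from(megalist)

# Each connected component is one independent set of pairs.
with open('subgraphs.txt', 'w') as output_file:
    for component in nx.connected_components(G):
        # G.edges(component) yields the pairs (edges) inside this component.
        output_line = str(list(G.edges(component))) + '\n'
        output_file.write(output_line)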
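
If the pairs need to come out exactly as they were read (same order within each pair, repeated pairs kept), a plain-Python alternative may be easier to control. The following is only a minimal sketch: it swaps in a union-find (disjoint-set) structure in place of networkx, and the names `split_into_components` and `write_components`, as well as the `completeset` file prefix, are assumptions made for illustration.

```python
import csv


def split_into_components(pairs):
    """Group pairs so that pairs sharing any value land in the same list."""
    pairs = list(pairs)  # allow generators such as csv.reader
    parent = {}

    def find(x):
        # Walk up to the representative of x, then compress the path.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    # Union the two values of every pair.
    for a, b in pairs:
        parent[find(a)] = find(b)

    # Collect the original pairs, untouched, under their representative.
    groups = {}
    for pair in pairs:
        groups.setdefault(find(pair[0]), []).append(pair)
    return list(groups.values())


def write_components(components, prefix='completeset'):
    """Write each group to its own tab-delimited file (hypothetical naming)."""
    for i, component in enumerate(components, start=1):
        with open(f'{prefix}{i}.csv', 'w', newline='') as f:
            csv.writer(f, delimiter='\t').writerows(component)


megalist = [['a', 'b'], ['a', 'd'], ['b', 'd'], ['b', 'f'], ['r', 's'], ['t', 'r']]
components = split_into_components(megalist)
```

Calling `write_components(components)` would then produce one file per group (`completeset1.csv`, `completeset2.csv`, and so on). Groups are returned in the order in which their first pair appears in the input.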