in reply to How to compare two undirected graphs?

Your description is confusing.

Comparing the "shape" sound like a isomorphism test. In this case be aware that the "graph isomorphism problem" is NP-complete.

Then you say your nodes are already named and have "children". General graphs have no children they have neighbours. So maybe you are speaking of trees or lattices (which have a bottom and a top element).

But names and other properties should simplify the comparison it can be easier to normalize both graph according those weights and consequently ordering the nodes.

This depends on the memory representation, e.g. when taking an incidence matrix of both graphs you can order rows and columns according to the associated weights (e.g. first row/column representing the "top node" and so on). Graphs with the same incidence matrix after permutation are isomorphic.

Of course this becomes more complicated when having a nontrivial automorphism group in large graphs which is the reason for the mentioned NP-completeness.

In short there is no general fast solution for arbitrary graphs when they get big, but "a number of important special cases of the graph isomorphism problem have efficient, polynomial-time solutions"

Cheers Rolf

UPDATE:added links

  • Comment on Re: How to compare two undirected graphs?

Replies are listed 'Best First'.
Re^2: How to compare two undirected graphs?
by Anonymous Monk on Oct 18, 2010 at 17:06 UTC

    Thank you, I didn't even know how to really properly describe the algorithm trouble I was having, but yes, I am trying to find the isomorphism between two undirected graphs. Apologies about my poor description of the data structure; let me try to give a little more information.

    Essentially, what I'm trying to do is to see if two transistor schematics are identical. I have a large, several thousand device netlist that I'm trying to simplify by identifying structures (the majority of which are called "static gates", which are like NAND gates, XOR gates, etc) and then abstracting those away into a sub-component. I guess a conceptually similar way to think of the problem is if you were working on a several-thousand line procedural Perl script, and say you had some incredible code to identify bits of code that, say, "this bit reads in data from a file" or "this bit encodes a string", and you were able to pull out those chunks of code and replace them with a call to a function.

    Now, I have code that can identify a group of devices as a static gate, but now I'd like now is to make sure that I don't have a dozen instantiations of, say, a dozen different 2-input NAND gates that are only separate because their outputs are named N1, N2, N3, etc. What I'd like is to be able to say, "Oh, these two static gates are both NAND gates, so I'm gonna throw out the second one and make sure they're both instantiations of the same gate."

    I can separate each static gate into two halves, so that, say in a really easy case, the top half of one might look like this:

    VDD | |------| | | M1| | | M3| X | | | M2| | | | |------| | OUT

    That is to say, the edges are the transistors themselves, and the vertices are the source/drain of the transistor. Unfortunately, although this looks like a tree shape, it actually isn't because two paths may be in parallel, e.g. that they would have the same parent and child nodes. The actual names of the edges and vertices don't actually matter to me, as they could be anything and arbitrary and certainly won't be the same - what does matter though is that the properties of all the edges (the W/L ratio) between two different gates match, and that the structure of two different gates match.

    Through your help on the wikipedia page, I was able to find this paper:

    Although I'm not really sure if my structure qualifies as a planar graph. It still *feels* like there should be a simpler solution here... maybe if I treated my structure as directed rather than undirected it would help cut down on complexity?

      Don't know much about transistor schematics.

      So your talking about identifying isomorphic subgraphs which are only connected via two nodes (IN and OUT) to the rest of the graph?

      As long as those subgraphs are not too big you can simply use a brute force normalization of the incidence matrix, i.e. trying all permutations until certain criterias are optimized.

      Sorry anything else is IMHO far too complicated for perlmonks, this doesn't only fill bookshelves it fills international conferences.

      Cheers Rolf

      If your structures can be etched in a single layer onto the surface of a silicon wafer, it is almost certainly planer.