Graph Isomorphism

Bijection between the Vertex Set of Two Graphs: Understanding Graph Isomorphism

In the often-overlooked yet fundamentally crucial realm of graph theory, the concept of an isomorphism of graphs G and H serves as a cornerstone, defining when two seemingly distinct graphical representations are, in fact, structurally identical. It's not about how they're drawn, how their vertices are labeled, or even what cosmic alignment brought them into being; it's about their intrinsic relational architecture.

Formally, an isomorphism between two graphs, G and H, is meticulously defined as a bijection (a one-to-one and onto mapping) between their respective vertex sets. This mapping, denoted as:

$f\colon V(G)\to V(H)$

ensures that for any two vertices, u and v, within the graph G, they are considered adjacent (connected by an edge) in G if and only if their corresponding images under the function f, specifically f(u) and f(v), are also adjacent in graph H. This stringent condition implies a perfect preservation of the adjacency relationship across the two graphs. This type of bijective mapping is frequently, and quite accurately, characterized as an "edge-preserving bijection," a descriptor that resonates deeply with the broader, more general understanding of an isomorphism as a function that preserves the fundamental structure between two mathematical objects. It's less about the superficial differences and more about the underlying blueprint.

Isomorphic Graphs, Automorphisms, and the Unsolved Problem

When such an isomorphism can be demonstrably established between two graphs, G and H, these graphs are then declared to be isomorphic. This structural equivalence is commonly symbolized by the notation:

$G\simeq H$

This symbol, a tilde, quietly asserts that G and H are, for all intents and purposes relevant to their graph-theoretic properties, the same graph. They might look different on paper, their vertices might have different names, but their internal connectivity is identical.

A particularly intriguing special case arises when the isomorphism is a mapping of a graph onto itself; that is, when G and H are, in fact, one and the same graph. In this scenario, the isomorphism takes on the specific designation of an automorphism of G. An automorphism reveals the symmetries inherent within a single graph, showing how its vertices can be rearranged while maintaining all original adjacencies. It's a self-reflection, a glimpse into the graph's internal balance.

The concept of graph isomorphism fundamentally establishes an equivalence relation among graphs. Like any respectable equivalence relation, it diligently partitions the vast class of all possible graphs into distinct equivalence classes. Each of these equivalence classes comprises a collection of graphs that are all isomorphic to one another, hence earning the moniker of an isomorphism class of graphs. Within any given isomorphism class, every graph shares the exact same structural properties.

Despite the seemingly straightforward definition, the computational challenge of determining whether graph isomorphism can be identified in polynomial time remains one of the most significant and persistently unsolved problems in the sprawling landscape of computer science. This enduring enigma is famously known as the graph isomorphism problem. It’s a recurring thorn in the side of theoretical computer scientists, a testament to the fact that sometimes, even what appears simple can hide profound complexities. [1] [2]

To illustrate this point, consider the two graphs presented below. They are undeniably isomorphic, a fact that often eludes the casual observer due to their strikingly different visual drawings. Their superficial arrangements might differ, but the underlying connections are preserved.

Graph G	Graph H	An isomorphism between G and H
!Graph G	!Graph H	`f(a) = 1` `f(b) = 6` `f(c) = 8` `f(d) = 3` `f(g) = 5` `f(h) = 2` `f(i) = 4` `f(j) = 7`

One can trace the connections in G and see that they perfectly mirror the connections in H, even if the spatial layout is entirely different. This mapping f demonstrates that for every edge in G, there is a corresponding edge in H between the mapped vertices, and vice-versa.

Variations: Expanding the Scope of Isomorphism

The foundational definition of graph isomorphism typically presumes that we are dealing with undirected, non-labeled, and non-weighted graphs. However, the elegance of the concept lies in its adaptability. The notion of isomorphism is not confined to these basic structures; it can be readily extended and applied to virtually all other variants of graphs. This is achieved by simply layering additional requirements onto the initial definition: the isomorphism must then also preserve the corresponding supplementary elements of structure that define these more complex graph types. For instance, when dealing with directed graphs (or digraphs), the bijection must also preserve the direction of arcs. For weighted graphs, the weights associated with edges must be maintained. And for labeled graphs, the labels themselves become part of the structure that must be preserved, though with an important nuance.

Isomorphism of Labeled Graphs: A Tale of Two Definitions

When venturing into the territory of labeled graphs, the definition of isomorphism becomes slightly more nuanced, leading to the co-existence of two distinct, yet valid, interpretations. It's almost as if the universe decided to offer a multiple-choice question where both 'A' and 'B' have merit, depending on the context.

Under the first, and perhaps more intuitive, definition, an isomorphism is understood as a vertex bijection that not only preserves edges (as in the standard definition) but also rigorously preserves the specific labels assigned to vertices and/or edges. [3] [4] This means that a vertex with a particular label in graph G must be mapped to a vertex with the exact same label in graph H. There's no room for interpretation here; labels are treated as integral, immutable parts of the structure.

The second definition offers a slightly more flexible perspective. Here, an isomorphism is still an edge-preserving vertex bijection, but its requirement for label preservation is relaxed to preserving equivalence classes of labels. [5] This implies that vertices with equivalent labels (for example, simply having the same label value) in graph G must be mapped onto vertices with equivalent labels in graph H, and vice versa. The same principle applies to edge labels. This distinction is subtle but carries significant implications.

To illustrate this semantic divergence, consider the graph known as $K_2$ , which is essentially a single edge connecting two vertices. If these two vertices are uniquely labeled, say with '1' and '2', the implications of each definition become clear. Under the first definition, which demands exact label preservation, $K_2$ possesses only a single automorphism: the identity mapping where vertex '1' maps to '1' and '2' maps to '2'. Swapping them would violate the label preservation. However, under the second definition, which merely requires preservation of equivalence classes (and if '1' and '2' are considered equivalent in some abstract sense, or if their unique numerical identity is merely for distinction, not intrinsic property), there would be two automorphisms: the identity mapping and the mapping where '1' maps to '2' and '2' maps to '1'. This highlights how the precise interpretation of "label-preserving" can drastically alter the perceived symmetries or equivalences.

This second definition is frequently adopted in scenarios where graphs are furnished with unique labels, often drawn from a simple integer range like 1, ..., n, where n represents the total number of vertices in the graph. In these cases, the labels serve primarily as unique identifiers for the vertices rather than carrying intrinsic structural meaning. Consequently, when applying this second definition, two labeled graphs are sometimes considered isomorphic if their underlying unlabeled graphs are isomorphic. Otherwise, if exact label matching were always required for such arbitrary identifiers, the definition of isomorphism would become trivially restrictive, effectively stating that two labeled graphs are isomorphic only if they are identical in every single detail, including their arbitrary vertex enumeration – a rather uninteresting conclusion.

Motivation: Why Structure Matters More Than Appearance

The formal concept of "isomorphism," and specifically "graph isomorphism," is not merely an academic exercise. It serves a profound purpose: it meticulously captures the informal, yet deeply intuitive, notion that certain objects possess "the same structure," even when one deliberately chooses to disregard the individual distinctions of their "atomic" components. In the context of graphs, these "atomic" components are the vertices and edges. It's about seeing past the superficial to the inherent relational framework.

Whenever the individual identity of these "atomic" components (vertices and edges for graphs) becomes critical for accurately representing whatever real-world phenomenon the graphs are modeling, the model itself must be refined. This refinement involves imposing additional, more stringent restrictions on the graph's structure, leading to the utilization of other specialized mathematical objects. This includes, but is not limited to, digraphs (for directional relationships), labeled graphs (for distinct properties of nodes or connections), colored graphs (for categorizing elements), and rooted trees (for hierarchical structures with a defined origin). For each of these sophisticated generalizations of graphs, the isomorphism relation must be redefined to ensure that the bijective mapping preserves all the elements of structure that specifically define that object type. This includes arcs for digraphs, labels for labeled graphs, vertex or edge colors for colored graphs, and the designated root for a rooted tree. The principle remains constant: if a feature defines the object, it must be preserved under isomorphism.

The notion of "graph isomorphism" grants us the invaluable ability to cleanly differentiate between intrinsic graph properties – those characteristics that are fundamental to the very structure of the graphs themselves – and properties that are merely artifacts or consequences of a particular graph representation. These representational properties might arise from how a graph is physically drawn (graph drawings), how its data is organized (data structures for graphs), or how its vertices are arbitrarily numbered or assigned (graph labelings).

For instance, if a graph possesses the inherent structural property of having exactly one cycle, then it is an absolute certainty that all other graphs residing within its isomorphism class will also exhibit this exact same property – they too will have precisely one cycle. This is a structural invariant, unaffected by how we choose to depict or name the vertices.

On the other hand, consider a common scenario where the vertices of a graph are, for convenience or necessity, represented by the integers 1, 2, ..., N. Now, imagine calculating a sum such as:

$\sum _{v\in V(G)}v\cdot {\text{deg }}v$

where deg v denotes the degree of vertex v. This expression, while mathematically sound, can yield entirely different results for two graphs that are, in fact, isomorphic. Why? Because the specific numerical labels (1, 2, ..., N) assigned to the vertices are arbitrary and do not inherently reflect the graph's structure. If we simply re-label the vertices of an isomorphic graph, the sum will change, even though the graph's underlying connectivity remains identical. This starkly illustrates the distinction between an intrinsic structural property (like the number of cycles) and a property that is dependent on a specific, non-canonical representation (like a sum involving arbitrary vertex labels). The former is preserved by isomorphism; the latter is not.

Whitney Theorem: A Glimmer of Structural Correspondence

The Whitney graph isomorphism theorem, a notable contribution from Hassler Whitney, offers a powerful statement about the relationship between a graph and its derived structures. [6] It posits that two connected graphs are isomorphic if and only if their respective line graphs are isomorphic. This theorem provides a remarkable bridge, suggesting that the structure of a graph is often uniquely encoded in the graph formed by its edges.

However, as is often the case in mathematics, there is a single, rather famous, exception to this rule. The theorem does not hold for the pair consisting of $K_3$ (the complete graph on three vertices, a triangle) and the complete bipartite graph $K_{1,3}$ (a star graph with one central vertex connected to three others). These two graphs are demonstrably not isomorphic to each other, yet both of them yield $K_3$ as their line graph. This exception serves as a cautionary tale, reminding us that even elegant theorems have their boundaries.

!Whitney's Theorem Exception The exception to Whitney's theorem: these two graphs are not isomorphic but have isomorphic line graphs.

Beyond standard graphs, the Whitney graph theorem has even found extensions into the more generalized domain of hypergraphs, demonstrating its underlying robustness as a principle. [7]

Recognition of Graph Isomorphism: The Algorithmic Pursuit

While the study of graph isomorphism can certainly be approached through traditional mathematical avenues, as beautifully demonstrated by theorems like Whitney's, it is perhaps even more prominently recognized as a problem demanding an algorithmic solution. The computational challenge of definitively determining whether two finite graphs are isomorphic is formally known as the graph isomorphism problem. This is where theoretical elegance clashes with practical computational limits.

The practical applications stemming from the efficient recognition of graph isomorphism are diverse and impactful. They include, but are by no means limited to:

Cheminformatics and mathematical chemistry: Here, graph isomorphism is indispensable for the unambiguous identification of chemical compounds. Molecules can be represented as graphs where atoms are vertices and chemical bonds are edges. Determining if two molecular structures are identical, despite different orientations or drawing conventions, relies entirely on graph isomorphism.
Electronic design automation (EDA): In the complex world of integrated circuits, graph isomorphism is crucial for verifying the functional equivalence of various representations of an electronic circuit design. Whether comparing a high-level logical design to a physical layout, ensuring that two different circuit implementations perform the same function hinges on their underlying graph structures being isomorphic.

Computational Complexity: A Persistent Enigma

From the perspective of computational complexity theory, the graph isomorphism problem holds a rather unique and vexing position. It is one of a select few standard problems that definitively belong to the complexity class NP, meaning that if a proposed isomorphism exists, it can be verified in polynomial time. However, crucially, it is not known to belong to either of NP's two most prominent (and, if P ≠ NP, disjoint) subsets: P (problems solvable in polynomial time) or NP-complete (the "hardest" problems in NP). This makes it a fascinating outlier.

Indeed, in the seminal 1979 work by Garey & Johnson, which cataloged 12 foundational problems, the graph isomorphism problem stood as one of only two whose complexity remained stubbornly unresolved, the other being the notoriously difficult integer factorization. The enduring uncertainty surrounding its precise classification continues to fuel intense research. It is, however, theoretically significant that if the graph isomorphism problem were to be proven NP-complete, it would lead to a profound consequence: the entire polynomial hierarchy would collapse to a finite level, a scenario that most complexity theorists view as highly unlikely. [8]

A significant development in this ongoing saga occurred in November 2015, when László Babai, a distinguished mathematician and computer scientist at the University of Chicago, announced a monumental claim: he had, he stated, proven that the graph isomorphism problem is solvable in quasi-polynomial time. [9] [10] This would place it squarely outside the realm of NP-complete problems, suggesting it is significantly easier than many had feared, though not quite in P. He subsequently published preliminary versions of these groundbreaking results in the proceedings of the 2016 Symposium on Theory of Computing [11] and the 2018 International Congress of Mathematicians. [12] The path to scientific consensus, however, is rarely linear. In January 2017, Babai briefly retracted his quasi-polynomiality claim, instead positing a sub-exponential time complexity bound. Yet, with a swift turn, he restored his original claim just five days later. [13] As of 2024 [update], the complete, peer-reviewed journal version of Babai's intricate paper is still eagerly awaited by the wider scientific community, a testament to the meticulous scrutiny required for such a profound result.

In contrast to the general graph isomorphism problem, its generalization, known as the subgraph isomorphism problem (determining if a smaller graph is isomorphic to a subgraph of a larger one), is unequivocally known to be NP-complete. This distinction is vital, as the added constraint of finding a specific pattern makes the problem significantly harder.

The primary avenues of research concerning this problem continue to revolve around two core objectives: the design and refinement of increasingly fast algorithms for practical applications, and rigorous theoretical investigations into its computational complexity, both for the general problem and for specific, more constrained classes of graphs.

Algorithms for Isomorphism Recognition

While a definitive polynomial-time algorithm for the general graph isomorphism problem remains elusive, several powerful algorithms and heuristic tests have been developed to tackle it in practice.

One such method is the Weisfeiler-Leman graph isomorphism test. [14] This test operates heuristically, meaning it's a good guide but not an absolute guarantee. If the Weisfeiler-Leman test fails for two input graphs, it provides a strong guarantee: the graphs are definitively non-isomorphic. However, if the test "succeeds," indicating that the graphs appear similar structurally, it does not provide a definitive proof of isomorphism. The graphs may or may not be isomorphic. This ambiguity highlights the challenge of the problem. There exist generalizations of the Weisfeiler-Leman test algorithm that are guaranteed to detect isomorphisms, but these enhanced versions invariably come with the significant computational cost of an exponential runtime, rendering them impractical for very large graphs.

Another widely recognized and practically effective algorithm for graph isomorphism is the vf2 algorithm, which was meticulously developed by Cordella et al. in 2001. [15] The vf2 algorithm employs a depth-first search strategy, systematically attempting to construct an isomorphism between two graphs in an incremental fashion. Its efficiency stems from the clever use of a set of "feasibility rules" that intelligently prune the vast search space. These rules allow the algorithm to discard branches of the search tree that cannot possibly lead to a valid isomorphism, thereby enabling it to handle graphs with thousands of nodes relatively efficiently in many real-world scenarios. The vf2 algorithm has found extensive application across diverse fields, including pattern recognition, computer vision, and bioinformatics. While its theoretical worst-case time complexity remains exponential, its empirical performance for a wide range of practical graph instances is remarkably good, making it a go-to choice for many researchers and developers.

Notes

^ Grohe, Martin (2020-11-01). "The Graph Isomorphism Problem". Communications of the ACM. 63 (11): 128–134. doi:10.1145/3372123. Retrieved 2023-03-06.
^ Klarreich, Erica (2015-12-14). "Landmark Algorithm Breaks 30-Year Impasse". Quanta Magazine. Retrieved 2023-03-06.
^ p.424
^ Hsieh, Shu-Ming; Hsu, Chiun-Chieh; Hsu, Li-Fu (2006). "Efficient Method to Perform Isomorphism Testing of Labeled Graphs". Computational Science and Its Applications - ICCSA 2006. Lecture Notes in Computer Science. Vol. 3984. pp. 422–431. doi:10.1007/11751649_46. ISBN 978-3-540-34079-9.
^ Pierre-Antoine Champin, Christine Solnon, "Measuring the Similarity of Labeled Graphs" in: Lecture Notes in Computer Science, vol. 2689, pp 80–95
^ Whitney, Hassler (January 1932). "Congruent Graphs and the Connectivity of Graphs". American Journal of Mathematics. 54 (1): 150–168. doi:10.2307/2371086. hdl:10338.dmlcz/101067. JSTOR 2371086.
^ Dirk L. Vertigan, Geoffrey P. Whittle: A 2-Isomorphism Theorem for Hypergraphs. J. Comb. Theory, Ser. B 71(2): 215–230. 1997.
^ Schöning, Uwe (1988). "Graph isomorphism is in the low hierarchy". Journal of Computer and System Sciences. 37 (3): 312–323. doi:10.1016/0022-0000(88)90010-4.
^ Cho, Adrian (November 10, 2015), "Mathematician claims breakthrough in complexity theory", Science, doi:10.1126/science.aad7416.
^ Klarreich, Erica (December 14, 2015), "Landmark Algorithm Breaks 30-Year Impasse", Quanta Magazine
^ Babai, László (2016), "Graph isomorphism in quasipolynomial time [extended abstract]", STOC'16—Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, ACM, New York, pp. 684–697, doi:10.1145/2897518.2897542, ISBN 978-1-4503-4132-5, MR 3536606, S2CID 17118954
^ Babai, László (2018), "Group, graphs, algorithms: the graph isomorphism problem", Proceedings of the International Congress of Mathematicians—Rio de Janeiro 2018. Vol. IV. Invited lectures, World Sci. Publ., Hackensack, NJ, pp. 3319–3336, MR 3966534
^ Babai, László (January 9, 2017), Graph isomorphism update
^ Huang, Ningyuan Teresa; Villar, Soledad (2021). "A Short Tutorial on the Weisfeiler-Lehman Test and Its Variants". ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 8533–8537. arXiv:2201.07083. doi:10.1109/ICASSP39728.2021.9413523. ISBN 978-1-7281-7605-5. S2CID 235780517.
^ Cordella, L. P.; Foggia, P.; Sansone, C.; Vento, M. (2001). "An Improved Algorithm for Matching Large Graphs". 3rd IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition: 149–159.