Multiple Self-Joins to Find Transitive Subsets

by Escherichia   Last Updated July 11, 2019 19:26 PM - source

In simplest terms, I have a table representing a relation that is both reflexive and symmetric, although that information will never be in the table.

+-----+-----+  
| id1 | id2 |  
+-----+-----+  
|   1 |   4 |  
|   3 |   1 |  
|   2 |   1 |  
|   2 |   3 |  
|   2 |   4 |  
|   5 |   1 |  
+-----+-----+

EDIT This table is meant to concisely show the following relation:
{(1,1), (2,2), (3,3), (4,4), (5,5), (1,4), (4,1), (3,1), (1,3), (2,1), (1,2), (2,3), (3,2), (2,4), (4,2), (5,1), (1,5)}. This can be visualized by the following undirected graph, where loops are not shown. Picture of Relation represented by Test table.

CREATE TABLE Test (
id1 int not null,
id2 int not null);

INSERT INTO Test
VALUES
(1,4),
(3,1),
(2,1),
(2,3),
(2,4),
(5,1);

I would like to identify transitive subsets in my table.

I've tried doing the following but the result set is larger than I want it to be. I hope that there is an easier way.

select t1.id1, t1.id2, t2.id1, t2.id2, t3.id1, t3.id2
from test as t1
    join test as t2
        on t1.id1 = t2.id1
        or t1.id2 = t2.id2
        or t1.id1 = t2.id2
        or t1.id2 = t2.id1
    join test as t3
        on t2.id1 = t3.id1
        or t2.id2 = t3.id2
        or t2.id1 = t3.id2
        or t2.id2 = t3.id1
where
    not
    (
        t1.id1 = t2.id1
        and
        t1.id2 = t2.id2
    )
    and not
    (
        t2.id1 = t3.id1
        and
        t2.id2 = t3.id2
    )
    and not
    (
        t1.id1 = t3.id1
        and
        t1.id2 = t3.id2
    )
    and
    (
        (
            t3.id1 = t1.id1
            or
            t3.id1 = t1.id2
            or
            t3.id1 = t2.id1
            or
            t3.id1 = t2.id2
        )
        and
        (
            t3.id2 = t1.id1
            or
            t3.id2 = t1.id2
            or
            t3.id2 = t2.id1
            or
            t3.id2 = t2.id2
        )
    );

Actual Output:

+-----+-----+-----+-----+-----+-----+
| id1 | id2 | id1 | id2 | id1 | id2 |
+-----+-----+-----+-----+-----+-----+
|   1 |   4 |   2 |   4 |   2 |   1 |
|   1 |   4 |   2 |   1 |   2 |   4 |
|   3 |   1 |   2 |   3 |   2 |   1 |
|   3 |   1 |   2 |   1 |   2 |   3 |
|   2 |   1 |   2 |   4 |   1 |   4 |
|   2 |   1 |   2 |   3 |   3 |   1 |
|   2 |   1 |   3 |   1 |   2 |   3 |
|   2 |   1 |   1 |   4 |   2 |   4 |
|   2 |   3 |   2 |   1 |   3 |   1 |
|   2 |   3 |   3 |   1 |   2 |   1 |
|   2 |   4 |   2 |   1 |   1 |   4 |
|   2 |   4 |   1 |   4 |   2 |   1 |
+-----+-----+-----+-----+-----+-----+

The expected result set would only have two rows. Each row would represent a transitive relation that is a subset of the original relation.

╔═════╦═════╦═════╦═════╦═════╦═════╗
║ id1 ║ id2 ║ id1 ║ id2 ║ id1 ║ id2 ║
╠═════╬═════╬═════╬═════╬═════╬═════╣
║   1 ║   4 ║   2 ║   4 ║   2 ║   1 ║
║   3 ║   1 ║   2 ║   1 ║   2 ║   3 ║
╚═════╩═════╩═════╩═════╩═════╩═════╝

EDIT The expected output could also look like,

╔═════╦═════╦═════╗
║ id1 ║ id2 ║ id3 ║
╠═════╬═════╬═════╣
║   1 ║   4 ║   2 ║
║   3 ║   1 ║   2 ║
╚═════╩═════╩═════╝,

whatever is easier. I just need to display the fact that the sets
{(1,4), (4,1), (2,4), (4,2), (2,1), (1,2)}
and
{(3,1), (1,3), (2,1), (1,2), (2,3), (3,2)}
are proper subsets of the original relation and are themselves transitive relations. I am using the definition that a relation R is transitive if and only if ∀a∀b∀c((a,b)∈R ∧ (b,c)∈R → (a,c)∈R). In other words, I am trying to find all subgraphs that are also complete graphs.



Answers 1


Times ago I needed to create clusters of data using transitive closure. The best way to do that was using SQLCLR. Here's the GitHub code (article to detailed link is there too)

https://github.com/yorek/non-scalar-uda-transitive-closure

That could be a good starting point. Can you also be more precise on what is the expected result with the input data you have in the sample?

mauridb
mauridb
July 10, 2019 23:40 PM

Related Questions


LEFT JOIN vs. LEFT OUTER JOIN in SQL Server

Updated June 17, 2019 02:26 AM

SQL Full Outer Join duplicate Issue

Updated February 01, 2018 21:26 PM



How to prevent insertion of cyclic reference in SQL

Updated October 11, 2017 13:26 PM