ColA1 corresponds to ColA2 if:
Count (ColA1) = Count (ColA2) = Count (ColA1 x ColA2)
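To make the criterion concrete, here is a minimal sketch in Python. The sample rows are hypothetical (not data from the question), and it assumes each (ColA, ColB) row is distinct; with duplicate rows the counts could match by accident.

```python
from collections import Counter

# Hypothetical sample rows of (ColA, ColB); assumed to contain no duplicates.
rows = [("x", 1), ("x", 2), ("y", 1), ("y", 2), ("z", 1), ("z", 3)]

# Count(ColA): number of rows for each ColA value.
raw_count = Counter(a for a, _ in rows)

def match_count(a1, a2):
    """Count(ColA1 x ColA2): pairs of rows from a1 and a2 that share a ColB."""
    return sum(1 for p, pb in rows for q, qb in rows
               if p == a1 and q == a2 and pb == qb)

def corresponds(a1, a2):
    """True when both raw counts and the match count are all equal."""
    return raw_count[a1] == raw_count[a2] == match_count(a1, a2)

print(corresponds("x", "y"))  # x and y have the same ColB set {1, 2}
print(corresponds("x", "z"))  # same row count, but only ColB = 1 is shared
```

Here "x" and "y" correspond because all three counts are 2, while "x" and "z" do not because their match count is only 1.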
This approach attempts to optimize the query speed.
Materialize the raw counts into a table, since they are used more than once and a table can declare a PK.
(A CTE is just syntax and is evaluated each time it is referenced.)
The condition where RA.rawcount = RB.rawcount means the join only has to be evaluated when the counts are equal, and the query plan shows that this condition is executed first.
create table
(ColA varchar(50) not null primary key, rawcount int not null)
insert into
select [ColA], COUNT(*) as [rawCount]
from [tbl]
group by [ColA]
order by [ColA]
select a.ColA as ColA1, b.ColA as ColA2, COUNT(*) [matchcount]
from tbl A
join tbl B
on a.ColB = b.ColB
and a.ColA < b.ColA
join
on RA.ColA = A.ColA
join
on RB.ColA = B.ColA
where RA.rawcount = RB.rawcount
group by a.ColA, b.ColA, RA.rawcount
having COUNT(*) = RA.rawcount
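The same idea can be tried end to end with a small, self-contained sketch. This is an illustration only: it uses SQLite through Python's sqlite3 module rather than SQL Server, the table name raw_count is a hypothetical stand-in for the materialized count table, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical sample data: x and y share the same ColB set, z does not.
cur.execute("create table tbl (ColA varchar(50) not null, ColB int not null)")
cur.executemany(
    "insert into tbl values (?, ?)",
    [("x", 1), ("x", 2), ("y", 1), ("y", 2), ("z", 1), ("z", 3)],
)

# Materialize the per-ColA row counts; raw_count is an assumed name.
cur.execute(
    "create table raw_count "
    "(ColA varchar(50) not null primary key, rawcount int not null)"
)
cur.execute(
    "insert into raw_count select ColA, count(*) from tbl group by ColA"
)

# Self-join on ColB, keep pairs whose raw counts are equal, and require
# the number of matching rows to equal that raw count.
pairs = cur.execute(
    """
    select a.ColA as ColA1, b.ColA as ColA2, count(*) as matchcount
    from tbl a
    join tbl b
      on a.ColB = b.ColB
     and a.ColA < b.ColA
    join raw_count ra on ra.ColA = a.ColA
    join raw_count rb on rb.ColA = b.ColA
    where ra.rawcount = rb.rawcount
    group by a.ColA, b.ColA, ra.rawcount
    having count(*) = ra.rawcount
    """
).fetchall()

print(pairs)
conn.close()
```

With this sample data only the pair ("x", "y") survives the having clause. SQLite's planner will not necessarily match the SQL Server plan described above, so this sketch checks the logic, not the performance claim.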