I have an anonymized table with two columns: UserId and PhoneNumber.
It was extracted from a table of call detail records. Now I would like to build a network based on similarity between users: two users should be connected if they have called at least 3 of the same numbers.
The table has over 20 million rows. A simple program I wrote in C# takes more than 4 days to complete this task. Is it possible to write an SQL query that gives the same result and, for each pair of similar users, either inserts a row into a new table with two columns, user1 and user2, or simply returns the pair as output?
Or is there perhaps another good approach to this task?
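
To make the question concrete, here is a minimal sketch of the kind of query I have in mind: a self-join on PhoneNumber, grouped by user pair, keeping pairs with at least 3 shared numbers. It is wrapped in Python with an in-memory SQLite database only so that it is self-contained. The table names `Calls` and `DistinctCalls`, the function `similar_user_pairs`, and the sample data are made up for illustration; only the UserId and PhoneNumber columns come from my real table. I have not measured this on the full data set.

```python
import sqlite3


def similar_user_pairs(rows, min_shared=3):
    """Return (user1, user2) pairs that called at least `min_shared` identical numbers.

    `rows` is an iterable of (UserId, PhoneNumber) tuples. Table and function
    names here are hypothetical, chosen only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Calls (UserId INTEGER, PhoneNumber TEXT)")
    conn.executemany("INSERT INTO Calls VALUES (?, ?)", rows)

    # Call records can repeat the same (user, number) pair many times.
    # Deduplicate first so a number called twice is counted only once.
    conn.execute(
        "CREATE TABLE DistinctCalls AS "
        "SELECT DISTINCT UserId, PhoneNumber FROM Calls"
    )
    # Index the join column so matching rows are found without a full scan.
    conn.execute(
        "CREATE INDEX idx_phone_user ON DistinctCalls (PhoneNumber, UserId)"
    )

    query = """
        SELECT a.UserId AS user1, b.UserId AS user2
        FROM DistinctCalls AS a
        JOIN DistinctCalls AS b
          ON a.PhoneNumber = b.PhoneNumber
         AND a.UserId < b.UserId          -- each pair once, no self-pairs
        GROUP BY a.UserId, b.UserId
        HAVING COUNT(*) >= ?              -- at least min_shared common numbers
        ORDER BY user1, user2
    """
    pairs = conn.execute(query, (min_shared,)).fetchall()
    conn.close()
    return pairs


# Made-up sample: users 1 and 2 share three numbers, user 3 shares only two.
sample_rows = [
    (1, "555-0001"), (1, "555-0002"), (1, "555-0003"), (1, "555-0003"),
    (2, "555-0001"), (2, "555-0002"), (2, "555-0003"),
    (3, "555-0001"), (3, "555-0002"), (3, "555-0002"),
]
print(similar_user_pairs(sample_rows))
```

To store the result instead of returning it, the same SELECT could follow an `INSERT INTO ... (user1, user2)` on a new table. My concern is that the join produces one row per pair of users for every shared number, so a few very popular numbers could make the intermediate result huge.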