Finding words that can be made from a set of letters?

I am trying to write some SQL that takes a set of letters and returns all the words that can be made from them. My first thought was to create a basic database with three tables:

Words -- contains 200k words in real life
------
1 | act
2 | cat

Letters -- contains the whole alphabet in real life
--------
1  | a
3  | c
20 | t

WordLetters --First column is the WordId and the second column is the LetterId
------------
1  | 1
1  | 3
1  | 20
2  | 3
2  | 1
2  | 20

But I'm a bit stuck on how to write a query that returns the words that have a WordLetters entry for each letter passed in. It also needs to handle words that contain the same letter twice. I started with this query, but it obviously does not work:

SELECT DISTINCT w.Word 
FROM Words w
INNER JOIN WordLetters wl
ON wl.LetterId = 20 AND wl.LetterId = 3 AND wl.LetterId = 1

How can I write a query that returns only the words containing all of the letters passed in, taking duplicate letters into account?


Additional Information:

My Words table contains 200,000 words. I am using the enable1 word list.

+3
3

SQL , , , : , , .

For example, store each word's letters in sorted order next to the word's id:

sorted_text word_id
act         123    /* we'll assume `act` was word number 123 in the original list */
act         321    /* we'll assume 'cat' was word number 321 in the original list */

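Then, for a given set of letters (say, "tac"), sort the letters to get "act" and look up every row whose sorted_text matches.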
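A minimal Python sketch of the sorted_text lookup shown above, using an in-memory SQLite database so it can be run as is. The table name SortedWords and the helpers `signature`, `build_index` and `anagrams` are made up for this illustration and do not come from the answer.

```python
import sqlite3

def signature(text: str) -> str:
    """Return the letters of `text` in alphabetical order, e.g. 'cat' -> 'act'."""
    return "".join(sorted(text.lower()))

def build_index(words):
    """Create an in-memory table mapping each word's sorted letters to the word.

    The table and index names are assumptions made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE SortedWords (sorted_text TEXT, word TEXT)")
    conn.execute("CREATE INDEX ix_sorted_text ON SortedWords (sorted_text)")
    conn.executemany(
        "INSERT INTO SortedWords (sorted_text, word) VALUES (?, ?)",
        [(signature(w), w) for w in words],
    )
    return conn

def anagrams(conn, letters: str):
    """Return every stored word that uses exactly the given letters."""
    rows = conn.execute(
        "SELECT word FROM SortedWords WHERE sorted_text = ? ORDER BY word",
        (signature(letters),),
    )
    return [row[0] for row in rows]
```

Note that this lookup only finds words that use every supplied letter exactly once each (full anagrams). Finding shorter words would need one lookup per subset of the letters.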

, SQL, , , - . , , , , front-end, SQL , , : .

+5

, WordLetters. , , , , .

, , . , , , . , . , ( ). , . , . :

WITH letters
     AS (SELECT Cast('a' AS VARCHAR) AS Letter,
                1                    AS LetterValue,
                1                    AS LetterNumber
         UNION ALL
         SELECT Cast(Char(97 + LetterNumber) AS VARCHAR),
                Power(2, LetterNumber),
                LetterNumber + 1
         FROM   letters
         WHERE  LetterNumber < 26),
     words
     AS (SELECT 1 AS wordid, 'act' AS word
         UNION ALL SELECT 2, 'cat'
         UNION ALL SELECT 3, 'tom'
         UNION ALL SELECT 4, 'moot'
         UNION ALL SELECT 5, 'mote')
SELECT wordid,
       word,
       Sum(distinct LetterValue) as WordValue
FROM   letters
       JOIN words
         ON word LIKE '%' + letter + '%'
GROUP  BY wordid, word

As you can see, "act" and "cat" have the same WordValue, as do "tom" and "moot", even though "moot" has one more letter.

, ? -, . , , .
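The WordValue computed by the query above is a bitmask of the distinct letters in a word (a = 1, b = 2, c = 4, and so on). Here is a rough Python sketch of the same idea. Using the mask as a pre-filter for candidate words is an assumption made for this illustration, and `letter_mask` and `candidates` are hypothetical names.

```python
def letter_mask(word: str) -> int:
    """Return a bitmask with one bit per distinct letter (a = 1, b = 2, c = 4, ...)."""
    mask = 0
    for ch in word.lower():
        if "a" <= ch <= "z":
            mask |= 1 << (ord(ch) - ord("a"))
    return mask

def candidates(words, letters: str):
    """Keep the words whose letters all appear somewhere in `letters`.

    This ignores how many times each letter occurs, so it is only a
    pre-filter: duplicate letters still need a separate check.
    """
    available = letter_mask(letters)
    return [w for w in words if letter_mask(w) & ~available == 0]
```

Because the mask throws away letter counts, "moot" still passes the filter for the letters "tom". A second step that compares counts is needed to reject it.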

0

For example, this counts the occurrences of the letter "a" in a word:

select len(word) - len(replace(word, 'a', ''))

, , :

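-- wordletters holds the letters passed in, one row per letter
select wls.word, (LEN(wls.word) - SUM(wls.LettersInWord)) as UnmatchedLetters
from 
(
  -- occurrences of each supplied letter in each word
  select w.word, (LEN(w.word) - LEN(replace(w.word, wl.letter, ''))) as LettersInWord
  from word w 
  cross join wordletters wl
) wls
group by wls.word
having (LEN(wls.word) = SUM(wls.LettersInWord))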

. , . , :

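-- Cap each letter's count in the word at the number of times that letter was supplied
select wls.word, (LEN(wls.word) - SUM(wls.LettersInWord)) as UnmatchedLetters
from 
(
   select w.word,
     (case when (LEN(w.word) - LEN(replace(w.word, wl.letter, ''))) <= wl.maxcount 
         then (LEN(w.word) - LEN(replace(w.word, wl.letter, ''))) 
         else wl.maxcount end) as LettersInWord
   from word w 
   cross join
   (
      -- one row per distinct supplied letter, with how many times it was supplied
      select letter, count(*) as maxcount
      from wordletters wl
      group by letter
   ) wl
) wls
group by wls.word
having (LEN(wls.word) = SUM(wls.LettersInWord))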

, case " = maxcount" " <= maxcount".
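The capped-count check in the query above can be sketched in Python as follows. This is only an illustration of the logic, not a replacement for the SQL, and `can_make` is a hypothetical helper name.

```python
from collections import Counter

def can_make(word: str, letters: str) -> bool:
    """Return True if `word` can be spelled using only the supplied letters.

    For each distinct letter in the word, count its occurrences but cap the
    count at the number of times that letter was supplied (the "maxcount"
    in the query). The word qualifies when the capped counts add up to its
    full length.
    """
    available = Counter(letters)
    needed = Counter(word)
    matched = sum(min(count, available[ch]) for ch, count in needed.items())
    return matched == len(word)
```

With the letters "tom", the word "moot" is rejected because only one "o" is available, while with the letters "moot" both "tom" and "moot" qualify.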

In my experience, I have seen decent performance with small cross joins. This may work well on the server side. There are two big advantages to doing this work on the server. First, it takes advantage of the parallelism available on the server. Second, much less data needs to be sent over the network.

0