Shortest sentence matching pattern matching algorithm

I have a problem that I am trying to solve.

The program is provided with a text file containing the following characters: a - z, A - Z, 0 - 9, fullstop (.) And a space. The words in the text file consist solely of az, AZ, and 0-9. The program receives several requests. Each request consists of a set of complete words already present in the file. The program should return the smallest phrase from the file where all the words are present (in any order). If there are many such phrases, return the first.

Here is an example. Let's say the file contains:

Bar is doing a computer science degree. Bar has a computer at home. Bar  is now at home.

Request 1:

Bar computer a

Answer:

Bar has a computer

Request 2:

Bar home

Answer:

home. Bar

. 1 Bar, Bar . node . ,

1- node ", 0, 1" [, , ]. 2- 3- node.

. .

1st node "Bar Computer", 0, 5

2- node "-", 7, 4 ..

"a". , node, , , . .

? , , , , .

, TSP?

+3
2

TSP - . n - , m - ; , n > m.

best = infinity
for i = 1 to n
  for j = i to n
    all_found = true
    for k = 1 to m
      found = false
      for l = i to j
        if text[l] == query[k]
          found = true
      all_found = all_found || found
    if all_found && j - i < best
      best = j - i
      best_i = i
      best_j = j

O (n 3 m) . .

-, -.

best = infinity
for i = 1 to n
  for j = i to n
    subtext_set = {}
    for l = i to j
      subtext_set = subtext_set union {text[l]}
    all_found = true
    for k = 1 to m
      all_found = all_found && query[k] in subtext_set
    if all_found && j - i < best
      best = j - i
      best_i = i
      best_j = j

O (n 3), O (n 3 log n), .

, subtext_set, .

best = infinity
for i = 1 to n
  subtext_set = {}
  for j = i to n
    subtext_set = subtext_set union {text[l]}
    all_found = true
    for k = 1 to m
      all_found = all_found && query[k] in subtext_set
    if all_found && j - i < best
      best = j - i
      best_i = i
      best_j = j

O (n 2 m). , subtext_set : , ?

query_set = {}
for k = 1 to m
  query_set = query_set union {query[k]}
best = infinity
for i = 1 to n
  subtext_set = {}
  num_found = 0
  for j = i to n
    if text[l] in query_set && text[l] not in subtext_set
      subtext_set = subtext_set union {text[l]}
      num_found += 1
    if num_found == m && j - i < best
      best = j - i
      best_i = i
      best_j = j

O (n 2). O (n) . -, ,

text = Bar has a computer at home. Bar
       1   2   3 4        5  6     7
query = Bar computer a

# j 1 2 3 4 5 6 7
i +--------------
1 | 1 1 2 3 3 3 3
2 | 0 0 1 2 2 2 3
3 | 0 0 1 2 2 2 3
4 | 0 0 0 1 1 1 2
5 | 0 0 0 0 0 0 1
6 | 0 0 0 0 0 0 1
7 | 0 0 0 0 0 0 1

, . m, . . i, j , i; j.

j , , . num_found, .

best = infinity
count = hash table whose entries are zero by default
for k = 1 to m
  count[query[k]] = -1
num_found = 0
i = 1
j = 0
while true
  if num_found == m
    if j - i < best
      best = j - i
      best_i = i
      best_j = j
    count[text[i]] -= 1
    if count[text[i]] == -1
      num_found -= 1
    i += 1
  else
    j += 1
    if j > n
        break
    if count[text[j]] == -1
      num_found += 1
    count[text[j]] += 1

O (n). O (n) O (m) . . ( , , .)

+4

.

0

All Articles