Building a tree using Python

I am trying to implement an undefiled boolean search. To do this, I need to build a tree and execute DFS to retrieve the documents. I have leaf nodes, but it's hard for me to build a tree.

For example: query = OR (AND (maria sharapova) tennis)

Result:

      OR
     | |
     AND tennis
     | |
  maria sharapova

I traverse the tree using DFS and compute the logical equivalent of certain document identifiers to identify the required document from the corpus. Can someone help me design this with python? I analyzed the request and received the nodes of the sheet at the moment.

EDIT : I'm new here, so I apologize for the lack of clarity. I am basically trying to build a very naive search engine. Thus, the user enters any logical queries of type OR (AND (maria sharapova)). I have a document from Wikipedia documents that is displayed to the user, depending on the type of request.

So far, I have analyzed the request for individual statements (for example, OR, AND, etc.). And, individual search terms (maria, tennis, etc.). The code for parsing is just a function that basically groups all the operators and query conditions as input. (Maria Sharapova), (tennis), OR, AND. I analyzed this function in such a way as to create a tree from the bottom up. Now, using inverted lists for relevant keywords such as tennis, maria, sharapova, etc. I perform a logical operation with an inverted list to get a specific "document". This documentid is then passed to the API, which will then retrieve the correct wikipedia page.

, : http://www.ccs.neu.edu/home/jaa/CSG339.06F/Lectures/boolean.pdf

+5
2

, , , , lex/yacc, .

-, , , , , , python. . , , :

"OR ( AND ( maria sharapova ) tennis )"

(AND/OR) /. ( DFS ) .

(AND/OR) (, maria, tennis). / . , ).

.

, . .

1. "OR" .

+               +
+               +
+    OR         +
+ + + + + + + + +

2. (, .

3. "AND" . :

+               +
+    AND        +
+    OR         +
+ + + + + + + + +

4. (.

5. "maria" .

6. "" . :

+   sharapova   +
+    maria      +
+    AND        +
+    OR         +
+ + + + + + + + +

7. ). . , . , . "" "" "". , "maria" 3 doc: [1, 2, 3]. "" 5 : [2, 3, 8, 9, 10]. "" [2,3] , . : .

+               +           +         +
+               +           +         +
+               +           +         +
+    OR         +           +  [2,3]  +
+ + + + + + + + +           + + + + + +

8. .

+               +           +         +
+               +           +         +
+    tennis     +           +         +
+    OR         +           +  [2,3]  +
+ + + + + + + + +           + + + + + +

9. ). , , "". , "", , : [3, 5, 7]. "OR", , , doc: [2,3,5,7].

. . doc len(word) .

, (1- ), (2- ), ( ) (4- ).

+4

- Python ( ):

>>> query = ['OR', ['AND', 'maria', 'sharapova'], 'tennis']
+2

All Articles