Parsing xdot draw attributes using pyparsing

New to pyparsing. I am trying to work out how to parse the draw attributes (and similar) in xdot files. Several items begin with an integer that gives the number of elements that follow, similar to netstrings. I looked at some sample code for handling netstring-like constructs, but I could not get it to work for me.

Here are some examples:

Polygon with 3 points (the 3 after P indicates the number of points that follow):
P 3 811 190 815 180 806 185 should parse to 'P', [[811, 190], [815, 180], [806, 185]]

Polygon with 2 points:
P 2 811 190 815 180 806 185 should parse to 'P', [[811, 190], [815, 180]] (with unparsed text left over at the end)

Pen fill color (the 4 after C indicates the number of characters to read after the "-"):
C 4 -blue should parse to 'C', 'blue'


Update:
I think I was misleading by putting my examples on lines of their own, without any extra context. Here is a real example:

S 5 -solid S 15 -setlinewidth(1) c 5 -black C 5 -black P 3 690 181 680 179 687 187

See http://www.graphviz.org/doc/info/output.html#d:xdot for the actual specification.

Please note that the text fields may contain significant spaces. The "setlinewidth(1)" value above could instead be text with spaces, along the lines of "abcd efgh hijk", and as long as it is exactly 15 characters long, it must be associated with the "S" tag. There must be exactly 7 numbers (the initial count + 3 pairs) after the "P" tag, and anything else should raise a parse error: more tags may follow on the same line, but bare numbers on their own are not valid.
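As a rough illustration of the rule just described (plain Python, not a pyparsing solution), the counted fields can be read with a small hand-written scanner. The names parse_draw, TEXT_OPS and POINT_OPS are hypothetical, and the sketch assumes integer coordinates as in the examples, counts characters rather than bytes, and handles only the S, C, c and P, p, L, B, b operations.

```python
import re

_INT = re.compile(r"\s*(-?\d+)")   # optional whitespace, then an integer
_DASH = re.compile(r"\s*-")        # the "-" that introduces a text payload
TEXT_OPS = set("SCc")              # op n -<exactly n characters>
POINT_OPS = set("PpLBb")           # op n x1 y1 ... xn yn


def _read_int(data, pos):
    """Read one integer starting at pos; return (value, new_pos)."""
    m = _INT.match(data, pos)
    if m is None:
        raise ValueError("expected an integer at offset %d" % pos)
    return int(m.group(1)), m.end()


def parse_draw(data):
    """Split a draw string into (op, payload) tuples (hypothetical helper)."""
    results, pos = [], 0
    while True:
        # Skip whitespace between operations.
        while pos < len(data) and data[pos].isspace():
            pos += 1
        if pos == len(data):
            return results
        op, pos = data[pos], pos + 1
        if op in TEXT_OPS:
            n, pos = _read_int(data, pos)
            m = _DASH.match(data, pos)
            if m is None or len(data) - m.end() < n:
                raise ValueError("bad text field for %r" % op)
            # Take exactly n characters, spaces included.
            results.append((op, data[m.end():m.end() + n]))
            pos = m.end() + n
        elif op in POINT_OPS:
            n, pos = _read_int(data, pos)
            points = []
            for _ in range(n):
                x, pos = _read_int(data, pos)
                y, pos = _read_int(data, pos)
                points.append((x, y))
            results.append((op, points))
        else:
            # Anything else, including a bare number, is an error.
            raise ValueError("unexpected %r at offset %d" % (op, pos - 1))


print(parse_draw("S 5 -solid S 15 -setlinewidth(1) c 5 -black "
                 "C 5 -black P 3 690 181 680 179 687 187"))
```

Because the text payload is sliced by length instead of being tokenized, embedded spaces need no special handling, and leftover numbers after a complete point list fall through to the error branch.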

Hope this clarifies things a bit.


Here is what I came up with, using scanString.

from pyparsing import Combine, Group, Literal, Optional, Word, nums

# Numeric tokens, converted to Python numbers as they are parsed.
int_ = Word(nums).setParseAction(lambda t: int(t[0]))
float_ = Combine(Word(nums) + Optional('.' + Optional(Word(nums)))).setParseAction(lambda t: float(t[0]))
point = Group(int_ * 2).setParseAction(lambda t: tuple(t[0]))

# Fixed-size attribute, plus the *prefixes* of the counted attributes.
# The counted payload itself is read by hand in the function below.
ellipse = ((Literal('E') ^ 'e') + point + int_ + int_).setResultsName('ellipse')
n_points_start = (Word('PpLBb', exact=1) + int_).setResultsName('n_points')
text_start = ((('T' + point + int_ * 3) ^ ('F' + float_ + int_) ^ (Word('CcS') + int_)) + '-').setResultsName('text')
xdot_attr_parser = ellipse ^ n_points_start ^ text_start

def parse_xdot_extended_attributes(data):
    results = []
    while True:
        try:
            # Find the next attribute prefix and drop everything up to its end.
            tokens, start, end = next(xdot_attr_parser.scanString(data, maxMatches=1))
            data = data[end:]
            name = tokens.getName()
            if name == 'n_points':
                # The last token is the count: read exactly that many points.
                number_to_get = int(tokens[-1])
                points, start, end = next((point * number_to_get).scanString(data, maxMatches=1))
                result = tokens[:1]
                result.append(points[:])
                results.append(result)
                data = data[end:]
            elif name == 'text':
                # The tokens end with the count and '-': slice the text by length.
                number_to_get = int(tokens[-2])
                text, data = data[:number_to_get], data[number_to_get:]
                result = tokens[:-2]
                result.append(text)
                results.append(result)
            else:
                results.append(tokens)
        except StopIteration:
            # No further attributes found.
            break
    return results

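Note for the OP: given the update to the question, the line-based approach below may no longer be sufficient.

To illustrate it, take the first two examples, each on a line of its own: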

P 3 811 190 815 180 806 185
P 2 811 190 815 180 806 185

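What should happen when more points follow than the count specifies? Here the extra points are simply truncated. A grammar for these lines: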

from pyparsing import *

EOL = LineEnd().suppress()

number = Word(nums).setParseAction(lambda x: int(x[0]))
point_pair = Group(number + number)

poly_flag  = Group(Literal("P") + number("length"))("flag")
poly_type  = poly_flag + Group(OneOrMore(point_pair))("data")

xdot_line = Group(poly_type) + EOL
grammar   = OneOrMore(xdot_line)

Note the result names data, flag and length, which are used after parsing:

S = "P 3 811 190 815 180 806 185\nP 2 811 190 815 180 806 185\n"
P = grammar.parseString(S)

for line in P:
    L = line["flag"]["length"]
    # Drop any points beyond the declared count.
    while len(line["data"]) > L:
        line["data"].pop()
    print(line)

The truncated lines are then:

[['P', 3], [[811, 190], [815, 180], [806, 185]]]
[['P', 2], [[811, 190], [815, 180]]]

To handle other tags, add them to the grammar at xdot_line, e.g.:

xdot_line = Group(poly_type | pen_fill_type) + EOL
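As an alternative to truncating after the parse, the count can be enforced while parsing by building the point list from a Forward that is filled in once the count has been read. This is only a sketch of that idea, not what the answer above does; the names points, set_point_count, count and polygon are hypothetical, and it uses only Forward, Group, Empty, copy and addParseAction from pyparsing.

```python
import pyparsing as pp

number = pp.Word(pp.nums).setParseAction(lambda t: int(t[0]))
point_pair = pp.Group(number + number)

# Placeholder for "exactly n point pairs"; defined when the count is parsed.
points = pp.Forward()


def set_point_count(tokens):
    """Parse action: define `points` from the count that was just read."""
    n = tokens[0]
    points << (pp.Group(point_pair * n) if n else pp.Group(pp.Empty()))
    return []  # drop the count itself from the results


# Copy `number` so the extra parse action does not affect point_pair.
count = number.copy().addParseAction(set_point_count)
polygon = pp.Literal("P") + count + points

# Exactly three pairs are consumed here.
print(polygon.parseString("P 3 811 190 815 180 806 185").asList())
# Only two pairs are consumed; the trailing numbers are left unparsed.
print(polygon.parseString("P 2 811 190 815 180 806 185").asList())
```

With this shape a line that declares more points than it contains fails with a ParseException instead of being accepted silently. Note that the parse action has to run for the Forward to be defined, so this sketch should not be placed directly inside an alternation that only tries its branches without running actions.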