PHP: Split a string on commas, but NOT inside brackets or quotation marks?

In PHP, I have the following string:

$str = "AAA, BBB, (CCC,DDD), 'EEE', 'FFF,GGG', ('HHH','III'), (('JJJ','KKK'), LLL, (MMM,NNN)) , OOO"; 

I need to split this string into the following parts:

AAA
BBB
(CCC,DDD)
'EEE'
'FFF,GGG'
('HHH','III')
(('JJJ','KKK'), LLL, (MMM,NNN))
OOO

I tried a few regexes but couldn't find a solution. Any ideas?

UPDATE

I decided that a regex is actually not the best solution when dealing with garbled data, hidden quotes, etc.

Thanks to the suggestions made here, I found a function that uses parsing, which I rewrote according to my needs. It can handle various types of brackets, and the delimiter and quotation mark are also parameters.

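 function explode_brackets($str, $separator=",", $leftbracket="(", $rightbracket=")", $quote="'", $ignore_escaped_quotes=true ) {

    $buffer = '';            // characters of the part currently being built
    $stack = array();        // finished parts
    $depth = 0;              // current bracket nesting level
    $betweenquotes = false;  // true while inside a quoted section
    $char = '';              // initialised so $previouschar is defined on the first pass
    $len = strlen($str);
    for ($i=0; $i<$len; $i++) {
      $previouschar = $char;
      $char = $str[$i];
      switch ($char) {
        case $separator:
          // Split only on separators outside quotes and outside brackets.
          if (!$betweenquotes) {
            if (!$depth) {
              if ($buffer !== '') {
                $stack[] = $buffer;
                $buffer = '';
              }
              continue 2;
            }
          }
          break;
        case $quote:
          // Toggle the quote state, optionally skipping backslash-escaped quotes.
          if ($ignore_escaped_quotes) {
            if ($previouschar!="\\") {
              $betweenquotes = !$betweenquotes;
            }
          } else {
            $betweenquotes = !$betweenquotes;
          }
          break;
        case $leftbracket:
          if (!$betweenquotes) {
            $depth++;
          }
          break;
        case $rightbracket:
          if (!$betweenquotes) {
            if ($depth) {
              $depth--;
            } else {
              // Unmatched closing bracket: it ends the current part.
              $stack[] = $buffer.$char;
              $buffer = '';
              continue 2;
            }
          }
          break;
      }
      $buffer .= $char;
    }
    // Flush whatever is left after the last separator.
    if ($buffer !== '') {
      $stack[] = $buffer;
    }

    return $stack;
  }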
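
The core idea of the function above (a bracket depth counter plus a quote flag) can be reduced to a minimal sketch. The name split_top_level is hypothetical, and this version is deliberately simpler than the original: the delimiters are fixed to a comma, round brackets and single quotes, escaped quotes are not handled, and each part is trimmed.

```php
<?php
// Hypothetical helper, not the author's function: split on commas that are
// outside single quotes and outside round brackets, trimming each part.
function split_top_level(string $str): array
{
    $parts = [];
    $buffer = '';
    $depth = 0;        // current bracket nesting level
    $inQuotes = false; // true while inside '...'

    for ($i = 0, $len = strlen($str); $i < $len; $i++) {
        $char = $str[$i];
        if ($char === "'") {
            $inQuotes = !$inQuotes;
        } elseif (!$inQuotes && $char === '(') {
            $depth++;
        } elseif (!$inQuotes && $char === ')' && $depth > 0) {
            $depth--;
        } elseif (!$inQuotes && $depth === 0 && $char === ',') {
            // Top-level separator: close the current part, skipping empty ones.
            if (trim($buffer) !== '') {
                $parts[] = trim($buffer);
            }
            $buffer = '';
            continue;
        }
        $buffer .= $char;
    }
    // Flush the last part.
    if (trim($buffer) !== '') {
        $parts[] = trim($buffer);
    }
    return $parts;
}

$str = "AAA, BBB, (CCC,DDD), 'EEE', 'FFF,GGG', ('HHH','III'), (('JJJ','KKK'), LLL, (MMM,NNN)) , OOO";
$parts = split_top_level($str);
print_r($parts);
```

Because the parts are trimmed, the result lines up with the eight parts listed in the question, without the leading spaces that remain after each separator when the input is split character by character.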
2 answers

Instead of preg_split, do a preg_match_all:

$str = "AAA, BBB, (CCC,DDD), 'EEE', 'FFF,GGG', ('HHH','III'), (('JJJ','KKK'), LLL, (MMM,NNN)) , OOO"; 

preg_match_all("/\((?:[^()]|(?R))+\)|'[^']*'|[^(),\s]+/", $str, $matches);

print_r($matches);

will print:

Array
(
    [0] => Array
        (
            [0] => AAA
            [1] => BBB
            [2] => (CCC,DDD)
            [3] => 'EEE'
            [4] => 'FFF,GGG'
            [5] => ('HHH','III')
            [6] => (('JJJ','KKK'), LLL, (MMM,NNN))
            [7] => OOO
        )

)

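A quick explanation of \((?:[^()]|(?R))+\)|'[^']*'|[^(),\s]+ :

  • \((?:[^()]|(?R))+\) matches a balanced pair of parentheses; (?R) recurses into the whole pattern to handle nested pairs,
  • '[^']*' matches a single-quoted string,
  • [^(),\s]+ matches any run of characters other than '(', ')', ',' and whitespace.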
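
To see each alternative at work, here is a small sketch that runs the same pattern over three short inputs. The helper name tokens and the sample inputs are made up for illustration.

```php
<?php
// The pattern from the answer above, unchanged.
$pattern = "/\((?:[^()]|(?R))+\)|'[^']*'|[^(),\s]+/";

// Hypothetical helper: return every full match of $pattern in $input.
function tokens(string $pattern, string $input): array
{
    preg_match_all($pattern, $input, $matches);
    return $matches[0];
}

// First alternative: a balanced group, with the nested group handled by (?R).
print_r(tokens($pattern, "(a,(b,c)),d"));

// Second alternative: a quoted string keeps its inner comma.
print_r(tokens($pattern, "'x,y', z"));

// Third alternative: whitespace is excluded, so it also ends an unquoted token.
print_r(tokens($pattern, "foo bar,baz"));
```

The last call is worth noting: because the third alternative excludes \s, an unquoted value containing a space comes back as two tokens, so values with spaces need to be quoted or bracketed for this pattern to keep them together.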

Spartan regex:

\G\s*+((\((?:\s*+(?2)\s*+(?(?!\)),)|\s*+[^()',\s]++\s*+(?(?!\)),)|\s*+'[^'\r\n]*+'\s*+(?(?!\)),))++\))|[^()',\s]++|'[^'\r\n]*+')\s*+(?:,|$)

Regex101

As a PHP string literal:

'/\G\s*+((\((?:\s*+(?2)\s*+(?(?!\)),)|\s*+[^()\',\s]++\s*+(?(?!\)),)|\s*+\'[^\'\r\n]*+\'\s*+(?(?!\)),))++\))|[^()\',\s]++|\'[^\'\r\n]*+\')\s*+(?:,|$)/'

ideone

1. ideone PREG_OFFSET_CAPTURE, 0 ( ), .

  • , \s. , .
  • (, ), ' ,.
  • 1 .
  • .
  • . , '.
  • .
  • - : , .
  • 2 ,
  • ( ).
  • , , () .
  • : , . ,. , .
  • ( \s, ) , (s) , , (s) (, ) .

\G\s*+
(
  (
    \(
    (?:
        \s*+
        (?2)
        \s*+
        (?(?!\)),)
      |
        \s*+
        [^()',\s]++
        \s*+
        (?(?!\)),)
      |
        \s*+
        '[^'\r\n]*+'
        \s*+
        (?(?!\)),)
    )++
    \)
  )
  |
  [^()',\s]++
  |
  '[^'\r\n]*+'
)
\s*+(?:,|$)
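
A minimal sketch of how this pattern might be used from PHP. Capture group 1 holds each token, and because \G forces every match to start where the previous one ended, the full matches joined together reproduce the input only when the whole string was consumed. The helper name split_strict and that completeness check are assumptions made for illustration, not necessarily what the linked demo does.

```php
<?php
// The single-line pattern from above, as a PHP string literal.
$pattern = '/\G\s*+((\((?:\s*+(?2)\s*+(?(?!\)),)|\s*+[^()\',\s]++\s*+(?(?!\)),)|\s*+\'[^\'\r\n]*+\'\s*+(?(?!\)),))++\))|[^()\',\s]++|\'[^\'\r\n]*+\')\s*+(?:,|$)/';

// Hypothetical helper: return the tokens, or null when the pattern could not
// consume the entire input (for example, an unclosed bracket).
function split_strict(string $pattern, string $input): ?array
{
    preg_match_all($pattern, $input, $m);
    // Matches are contiguous from offset 0, so this compares consumed text
    // against the full input.
    if (implode('', $m[0]) !== $input) {
        return null;
    }
    return $m[1]; // group 1: the token without surrounding whitespace or comma
}

$str = "AAA, BBB, (CCC,DDD), 'EEE', 'FFF,GGG', ('HHH','III'), (('JJJ','KKK'), LLL, (MMM,NNN)) , OOO";
print_r(split_strict($pattern, $str));

// An unclosed bracket stops the matching early, so the helper reports null.
var_dump(split_strict($pattern, "AAA, (BBB"));
```

Compared with the shorter pattern in the other answer, this one is stricter: it refuses to continue past malformed input instead of silently skipping over it.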