What is the best way to extract values between parentheses in bash / awk?

I have output in this format:

Infosome - infotwo: (29333) - data-info-ids: (33389, 94934)

I want to extract the numbers from the last pair of parentheses. Usually there are two of them, but sometimes the last pair contains only one number.

This is the code I used.

echo "Infosome - infotwo: (29333) - data-info-ids: (33389, 94934)" | \
  tr "," " " | tr "(" " " | tr ")" " " | awk -F: '{print $3}'

Is there a cleaner or more efficient way to extract the values?

+5
3 answers

Try the following:

awk -F '[()]' '{print $(NF-1)}' input | tr -d ,

This is essentially a refactoring of your command.
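
The question mentions that the last pair of parentheses sometimes holds only one number, so it may be worth checking the command above against both input shapes. The following is a minimal sketch, assuming every line ends with the closing parenthesis of the last group; extract_last_group is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# extract_last_group (hypothetical helper): reads lines on stdin and prints
# the contents of the last (...) group with the commas removed.
extract_last_group() {
  # Splitting on "(" or ")" leaves an empty field after the final ")",
  # so the contents of the last group sit in field NF-1.
  awk -F '[()]' '{print $(NF-1)}' | tr -d ,
}

two=$(printf '%s\n' 'Infosome - infotwo: (29333) - data-info-ids: (33389, 94934)' | extract_last_group)
one=$(printf '%s\n' 'Infosome - infotwo: (29333) - data-info-ids: (33389)' | extract_last_group)

echo "$two"   # 33389 94934
echo "$one"   # 33389
```

If a line had trailing text after the last ")", field NF-1 would no longer be the group you want, so this relies on the input format shown in the question.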

+13
 awk -F\( '{gsub("[,)]", " ", $NF); print $NF}' input

will give

 33389  94934 

I'm not sure what "optimal" or "professional" means in this context, but this uses only one command/tool; I'm not sure whether that qualifies.

A variation on @kev's approach (without tr):

awk -F'[(,)]' '{print $4, $5}' input

will give:

33389  94934
+3

You can do this in bash alone:

$ text="Infosome - infotwo: (29333) - data-info-ids: (33389, 94934)"
$ result="${text/*(}"
$ echo ${result//[,)]}
33389 94934

This uses the shell's "parameter expansion" (documented in the bash man page) to strip characters from the string in much the same way as tr would. Strictly speaking, the quotation marks on the second line are not needed, but they help with StackOverflow syntax highlighting. :-)
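
As a further sketch of the same idea, the prefix and suffix removal forms of parameter expansion (## and %%) can isolate the last group, and the shell's own word splitting can then separate the numbers, which also covers the case where only one number is present. The variable names inner, count, first and last are chosen for this example and do not come from the answer above.

```shell
text="Infosome - infotwo: (29333) - data-info-ids: (33389, 94934)"

inner=${text##*\(}    # remove everything up to and including the last "("
inner=${inner%%\)*}   # remove the first ")" and everything after it

# Split on commas and spaces; set -f turns off globbing during the split.
old_ifs=$IFS
set -f; IFS=', '
set -- $inner
set +f; IFS=$old_ifs

count=$#      # how many numbers the last group contained
first=$1
last=$2       # empty when the group holds a single number
echo "$count value(s): $*"   # 2 value(s): 33389 94934
```

Note that set -- overwrites the script's positional parameters, so in a larger script this would be better placed inside a function.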

Alternatively, you can make this a bit more flexible by looking up the field you are interested in. If you use GNU awk, you can set RS to a multi-character string:

$ gawk -vRS=" - " -vFS=": *" '
  { f[$1]=$2; }
  END {
    print f["data-info-ids"];
    # Or you could strip the non-numeric characters to get just numbers. 
    #print gensub(/[^0-9 ]/,"","g",f["data-info-ids"]);
  }' <<<"$text"

I prefer this method because it actually interprets the input as structured text representing some kind of array.
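
Where GNU awk is not available, the same key/value idea can be approximated with the standard split() function instead of a multi-character RS. This is only a sketch under the assumption that pairs are separated by " - " and that each key ends at the first colon; lookup_field is a hypothetical helper name, not part of any tool.

```shell
text="Infosome - infotwo: (29333) - data-info-ids: (33389, 94934)"

# lookup_field KEY LINE (hypothetical helper): prints the digits stored
# under KEY, separated by spaces.
lookup_field() {
  printf '%s\n' "$2" | awk -v key="$1" '
    {
      n = split($0, parts, / - /)        # one "key: value" pair per element
      for (i = 1; i <= n; i++) {
        sep = index(parts[i], ":")
        if (sep == 0) continue           # no colon, so not a key/value pair
        k = substr(parts[i], 1, sep - 1)
        v = substr(parts[i], sep + 1)
        gsub(/[^0-9 ]/, "", v)           # keep only digits and spaces
        sub(/^ +/, "", v)                # trim leading spaces
        if (k == key) print v
      }
    }'
}

ids=$(lookup_field data-info-ids "$text")
echo "$ids"   # 33389 94934
```

Looking the value up by key rather than by position means the command keeps working if the order of the fields changes.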

+1