Inserting line breaks using sed in bash, problems with regular expressions

Hello everyone, my data looks like this:

  samplename 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 ...
  samplename2 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 ...

and I want it to look like this:

  >samplename
  0 1 1 1 1 1 1 1 1 1 
  1 0 0 0 0 0 0 0 0 ...
  >samplename2 
  0 0 0 0 0 1 1 1 1 1 
  1 1 1 1 1 1 0 0 0 ...

[Note: I am showing a line break after every 10 digits; I really want one after every 200, but a line that long would not be useful to show here.]

I could do this using a regular expression in a text editor, but I want to use the sed command in bash, because I have to do this several times, and I need 200 characters per line.

I tried this but got an error:

sed -e "s/\(>\w+\)\s\([0-9]+\)/\1\n\2" < myfile > myfile2

sed: 1: "s/\(>\w+\)\s\([0-9]+\)/\ ...": unescaped newline inside substitute pattern

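I am on a Mac; I have read that sed on the Mac differs from GNU sed. If you have a solution that works on the Mac, that would be great.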


To break after a fixed count such as 200, I would use awk:

echo "hello 1 2 3 4" | awk '{print ">"$1; for(i=2; i<=NF; i++) {printf("%d ",$i); if((i+1)%2 == 0) printf("\n");}}'

>hello
1 2 
3 4 

If you only want to process lines that start with hello, add a pattern:

echo "hello 1 2 3 4" | awk '/^hello / {print ">"$1; for(i=2; i<=NF; i++) {printf("%d ",$i); if((i+1)%2 == 0) printf("\n");}}'

(The expression between the slashes / / means: "do this only for lines that match this expression".)

Change if((i+1)%2 == 0) to if((i+1)%100 == 0) to get a line break after every 100 values instead of every 2.
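
The count can also be passed in as a variable rather than edited by hand. This is a minimal sketch, assuming the name is the first field and the values follow, separated by spaces. The function name wrap_fields and the awk variable n are illustrative and not part of the answer above. It prints with %s rather than %d so that non-numeric tokens pass through unchanged, and it does not leave a trailing space at the end of a line:

```shell
# wrap_fields N: print ">name", then the remaining fields N per line.
# (wrap_fields and n are hypothetical names chosen for this sketch.)
wrap_fields() {
    awk -v n="$1" '{
        print ">" $1
        for (i = 2; i <= NF; i++) {
            # (i - 1) is the 1-based position of the value after the name;
            # end the line after every n-th value and after the last one.
            printf "%s%s", $i, (((i - 1) % n == 0 || i == NF) ? "\n" : " ")
        }
    }'
}

# Demo with 2 values per line.
printf 'hello 1 2 3 4 5\n' | wrap_fields 2
```

For the real data, the equivalent call would be along the lines of wrap_fields 200 < inputFile > outputFile.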

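The script can also be saved in a file, here called breakIt (it contains the /^hello/ pattern, a final print "" to end the last line, and braces {} around the action):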

/^hello/ { print ">"$1;
   for(i=2; i<=NF; i++)
   {
      printf("%d ",$i);
      if((i+1)%100 == 0) printf("\n");
   }
   print "";
}

awk -f breakIt inputFile > outputFile

This says: "use the commands in the file breakIt to process inputFile and write the result to outputFile".


You can also do this with sed. Save the following commands in a file called sedSplit:

s/^([A-Za-z]+ )/>\1\
/g
s/([0-9 ]{10})/\1\
/g
s/$/\
/g

These are sed commands; in case they are unfamiliar, here is what each one does:

s/^                  - substitute, starting from the beginning of the line
([A-Za-z]+ )/        - substitute the first word (letters only) plus space, replacing with 
>\1\
/g                   - the literal '>', then the first match, then a newline, as often as needed (g)

s/([0-9 ]{10})/      - substitute a run of 10 characters, each a digit or a space
\1\
/g                   - replace with itself, followed by newline, as often as needed

s/$/\
/g                   - append a newline at the end of the line

Run the sed script like this:

sed -E -f sedSplit < inputFile > outputFile

-E enables extended regular expressions (so ( ), { } and + need no backslashes).

-f reads the commands from the named file.

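The backslash followed by a real line break is the form that works with the sed on the Mac, which does not turn \n in the replacement into a newline.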
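
The same idea can be written as one command with the count as a parameter. This is a sketch rather than the script above: sed_split is a hypothetical name, and it assumes the name contains no spaces and every value is a single digit. It matches n-1 "digit space" pairs plus one more digit, so output lines do not end in a space. The replacement uses a backslash followed by a real newline, a form accepted by both BSD and GNU sed:

```shell
# sed_split N: put the name on its own line, then N single-digit values per line.
# (sed_split and reps are hypothetical names chosen for this sketch.)
sed_split() {
    # N values = (N - 1) "digit space" pairs plus one final digit.
    reps=$(( $1 - 1 ))
    # First expression: name -> ">name" + newline.
    # Second expression: each full group of N values followed by a space
    # is replaced by the group plus a newline.
    sed -E -e 's/^([^ ]+) />\1\
/' -e "s/(([0-9] ){$reps}[0-9]) /\\1\\
/g"
}

# Demo with 2 values per line.
printf 'samplename 0 1 1 0 1\nsamplename2 1 1 0 0\n' | sed_split 2
```

Inside the double-quoted expression the backslashes are doubled so that the shell passes a single backslash through to sed.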

$ awk '{print ">" $1; for (i=2;i<=NF;i++) printf "%s%s", $i, ((i-1)%10 ? FS : RS)}' file
>samplename
0 1 1 1 1 1 1 1 1 1
1 0 0 0 0 0 0 0 0 ...
>samplename2
0 0 0 0 0 1 1 1 1 1
1 1 1 1 1 1 0 0 0 ...

fold can do the line wrapping:

sed 's/\([^ ]*\) /\1\n/' input | fold -w 100
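
Since fold counts columns rather than values, the width has to be twice the number of single-digit values wanted per line. A small sketch of that arithmetic follows; fold_values is a hypothetical name, the sketch also adds the leading > that the one-liner omits, and it uses a backslash plus a real newline so that it does not depend on \n being understood in the replacement. Two caveats: each full line keeps a trailing space, and fold would also break the name line if the name were longer than the width.

```shell
# fold_values N: name on its own line, then N single-digit values per line.
# (fold_values is a hypothetical name chosen for this sketch.)
fold_values() {
    # Each value occupies 2 columns (digit + space), so the width is 2 * N.
    sed 's/\([^ ]*\) />\1\
/' | fold -w "$(( $1 * 2 ))"
}

# Demo with 2 values per line (width 4); the name is kept short on purpose.
printf 'ab 0 1 1 0 1\n' | fold_values 2
```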

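Two ways to write the command, with single quotes or with the backslashes doubled inside double quotes: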

sed -e 's/\(>\w+\)\s\([0-9]+\)/\1\n\2/' < myfile > myfile2
sed -e "s/\\(>\\w+\\)\\s\\([0-9]+\\)/\\1\\n\\2/" < myfile > myfile2

PS: I added the trailing slash. You had s/.../... instead of s/.../.../

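PS: looking at your regular expression, sed will not complain, but the pattern will not match your data: it requires a literal > that the input does not contain. Try this instead: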

sed -e 's/^\(\w\+\)\s\+/>\1\n/' < myfile > myfile2

Mac version, with 200 characters per line (100 single digits and 100 spaces):

sed -Ee 's/^([a-zA-Z0-9]+) />\1\
/' < myfile | sed -Ee 's/(([0-9] ){99}[0-9]) /\1\
/g' > myfile2

The first sed separates the name from the numbers; the second breaks the numbers into lines of 100 values.


plain bash:

while read -r name values; do
    printf ">%s\n%s\n" "$name" "$values"
done <<END
samplename 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 ...
samplename2 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 ...
END
>samplename
0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 ...
>samplename2
0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 0 ...

This assumes samplename contains no spaces.
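
The loop above puts the name on its own line but leaves all values on one line. A sketch of how the wrapping could be added in plain shell follows; wrap_plain and its variables are hypothetical names, and it assumes the values contain no glob characters such as *, because they are split by leaving the variable unquoted:

```shell
# wrap_plain N: print ">name", then the values N per line, using only the shell.
# (wrap_plain, width, count and sep are hypothetical names for this sketch.)
wrap_plain() {
    width=$1
    while read -r name values; do
        printf '>%s\n' "$name"
        count=0
        sep=''
        # Unquoted on purpose: split the values on whitespace.
        for v in $values; do
            printf '%s%s' "$sep" "$v"
            count=$((count + 1))
            if [ $((count % width)) -eq 0 ]; then
                printf '\n'   # line is full
                sep=''
            else
                sep=' '
            fi
        done
        # Finish a partial last line, if there is one.
        [ -z "$sep" ] || printf '\n'
    done
}

# Demo with 2 values per line.
printf 'hello 1 2 3 4 5\n' | wrap_plain 2
```

This is likely to be much slower than awk or sed on large files, since it runs one printf per value.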
