I am trying to write an awk command that selects rows in which the value in column 2 falls within the union of ranges defined by pairs of the other columns in the same row. The application is finding single nucleotide polymorphisms (SNPs) that are not within 50 nucleotides of an exon boundary. The file looks like this:
ID X start end start end start end start end
Fal1825_c6 802 2 62 62 239 239 362 362 934
Fal1821_c2 152 1 19 22 159 159 263 264 398
Fal18279_c7 41 1 177 177 598
Fal18376_c3 367 1 251 251 421
Fal18748_c2 601 1 152 152 489 489 499 499 677
Fal18748_c2 500 1 152 152 489 489 499 499 677
Fal18792_c3 750 1 234 234 459 459 762 762 83
Fal19487_c2 89 1 177 177 270 270 409 411 459
I only want to print lines in which the value in the second column falls between ("start" + 50) and ("end" - 50) for any "start"/"end" pair on that line, where a pair is a "start" column and the "end" column immediately after it. That is, the value must lie between $3+50 and $4-50, or between $5+50 and $6-50, or between $7+50 and $8-50, and so on, for however many pairs the row contains.
The result should look like this:
ID X start end start end start end start end
Fal1825_c6 802 2 62 62 239 239 362 362 934
Fal18376_c3 367 1 251 251 421
Fal18748_c2 601 1 152 152 489 489 499 499 677
Fal19487_c2 89 1 177 177 270 270 409 411 459
My attempt so far was:
awk '{a=3; b=4; while ($a > 0) do {if ($2 > ($a + 50) && $2 < ($b + 50)){print $0} else {a+2, b+2} }'
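For comparison, here is a minimal sketch of how the rule described above might be written in awk, wrapped in a shell snippet so it can be run as is. It assumes strict inequalities (as in the attempt), that the header is always the first line and should be passed through, and that the pairs start at column 3. The file name snps.txt, the function name filter_snps, and the margin variable are illustrative only.

```shell
# Sample data from the question, written to a scratch file
# (the name snps.txt is hypothetical).
cat > snps.txt <<'EOF'
ID X start end start end start end start end
Fal1825_c6 802 2 62 62 239 239 362 362 934
Fal1821_c2 152 1 19 22 159 159 263 264 398
Fal18279_c7 41 1 177 177 598
Fal18376_c3 367 1 251 251 421
Fal18748_c2 601 1 152 152 489 489 499 499 677
Fal18748_c2 500 1 152 152 489 489 499 499 677
Fal18792_c3 750 1 234 234 459 459 762 762 83
Fal19487_c2 89 1 177 177 270 270 409 411 459
EOF

# filter_snps FILE: print the header, plus every row whose column 2 lies
# strictly between start+margin and end-margin for at least one pair.
filter_snps() {
  awk -v margin=50 '
    NR == 1 { print; next }            # pass the header line through
    {
      # Walk the (start, end) pairs: ($3,$4), ($5,$6), ... up to NF.
      for (i = 3; i < NF; i += 2) {
        if ($2 > $i + margin && $2 < $(i + 1) - margin) {
          print
          next                         # one matching pair is enough
        }
      }
    }
  ' "$1"
}

filter_snps snps.txt
```

On the sample data this should print the header plus the four rows listed in the expected result. Compared with the attempt, the loop is bounded by NF rather than by the value of a field, the column index is actually incremented, and the margin is subtracted from the "end" column instead of added to it.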
Thanks.