Bash 'while read line' efficiency with a large file

I use a while loop to process records read from a large file of about 10 million lines.

I found that processing gets slower and slower over time.

I wrote a test script with 1 million lines, shown below, which reproduces the problem.

But I still don't know why. How does the read command work?

seq 1000000 > seq.dat
while read s;
do
    if [ `expr $s % 50000` -eq 0 ];then
        echo -n $( expr `date +%s` - $A) ' ';
        A=`date +%s`;
    fi
done < seq.dat

The terminal prints the following time intervals (in seconds):

98 98 98 98 98 97 97 98 97 98 101 106 112 121 121 127 132 135 134

After about 500,000 lines (the tenth interval), processing clearly becomes slower.

2 answers

Here is a bash version that does the modulo test with the shell's built-in arithmetic:

tabChar="   "  # put a real tab char here, of course
seq 1000000 > seq.dat
while read s;
do
    if (( ! ( s % 50000 ) )) ;then
        echo $s "${tabChar}" $( expr `date +%s` - $A) 
        A=$(date +%s);
    fi
done < seq.dat

Edit: fixed the test so that output is printed only on every 50000th line. Doh!

  if ((  s % 50000 )) ;then

is now

  if (( ! ( s % 50000 ) )) ;then
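
The difference between the two tests comes from how (( )) reports its result: the command succeeds when the expression is non-zero. A minimal sketch of that behaviour follows; the helper names count_multiples and count_non_multiples are made up for illustration and are not part of the answer.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: count how often each form of the test fires for 1..N.

count_multiples() {
    local n=$1 step=$2 s hits=0
    for (( s = 1; s <= n; s++ )); do
        # With "!", the test succeeds only when the remainder is 0.
        if (( ! ( s % step ) )); then
            hits=$(( hits + 1 ))
        fi
    done
    echo "$hits"
}

count_non_multiples() {
    local n=$1 step=$2 s hits=0
    for (( s = 1; s <= n; s++ )); do
        # Without "!", the test succeeds whenever the remainder is non-zero.
        if (( s % step )); then
            hits=$(( hits + 1 ))
        fi
    done
    echo "$hits"
}

# Usage:
#   count_multiples 1000000 50000
#   count_non_multiples 1000000 50000
```

So the first form fires on almost every line, while the corrected form fires only on exact multiples of the step.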

ksh, echo ${.sh.version} = JM 93t+ 2010-05-24

50000
100000   1
150000   0
200000   1
250000   0
300000   1
350000   0
400000   1
450000   0
500000   1
550000   0
600000   1
650000   0
700000   1
750000   0
800000   1
850000   0
900000   1
950000   0
1e+06    1

bash

50000    480
100000   3
150000   2
200000   3
250000   3
300000   2
350000   3
400000   3
450000   2
500000   2
550000   3
600000   2
650000   2
700000   3
750000   3
800000   2
850000   2
900000   3
950000   2

To find out what the shell is actually doing, trace the script's system calls with truss or strace (whichever your system provides).
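
As a sketch of what relying on shell built-ins can look like (this is not from the original answer), the loop below takes its timing from bash's SECONDS variable and does the modulo with (( )), so no expr or date process is started at the checkpoints. The function name report_intervals and its step argument are made up for illustration, and the input is assumed to contain one plain integer per line.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "<line value><TAB><seconds since last checkpoint>"
# for every line whose value is a multiple of STEP, using only shell built-ins.

report_intervals() {
    local file=$1 step=${2:-50000}
    local s last=$SECONDS          # SECONDS counts seconds since shell start
    while read -r s; do
        if (( s % step == 0 )); then
            printf '%s\t%s\n' "$s" "$(( SECONDS - last ))"
            last=$SECONDS
        fi
    done < "$file"
}

# Usage (assumes seq.dat was created with: seq 1000000 > seq.dat):
#   report_intervals seq.dat 50000
```

SECONDS only has one-second resolution, which matches the date +%s granularity used in the question.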


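read is a comparatively slow operation, as the author of "Learning the Korn Shell" points out*. (See just above section 7.2.2.1.) Other programs, such as awk or sed, are highly optimized for essentially the same job: reading a file one line at a time and performing some operation on each line.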

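On top of that, the script starts an external process every time it takes a modulus or does a subtraction, which gets expensive. awk has both operations built in.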
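
To get a feel for the cost of external commands in a loop like the one in the question, the sketch below computes the same modulus two ways and times each over many iterations. All three function names are hypothetical, and the timing uses bash's SECONDS variable, so it is only accurate to whole seconds.

```bash
#!/usr/bin/env bash
# Sketch: the same modulus via an external process and via shell arithmetic.

mod_expr() {
    # expr is an external program; it exits with status 1 when the
    # result is 0, hence the "|| true".
    expr "$1" % "$2" || true
}

mod_builtin() {
    # Arithmetic expansion is evaluated inside the shell, no fork needed.
    echo "$(( $1 % $2 ))"
}

# time_loop FUNC COUNT: call FUNC COUNT times, print elapsed whole seconds.
time_loop() {
    local fn=$1 count=$2 i start=$SECONDS
    for (( i = 1; i <= count; i++ )); do
        "$fn" "$i" 50000 > /dev/null
    done
    echo "$(( SECONDS - start ))"
}

# Usage:
#   time_loop mod_expr 20000
#   time_loop mod_builtin 20000
```

Both functions return the same values; only the way the result is computed differs, which is the point of the comparison.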

For example, here is an awk version:

#!/usr/bin/env bash

seq 1000000 | 
awk '
  BEGIN {
    command = "date +%s"
    prevTime = 0
  }
  $1 % 50000 == 0 {
    command | getline currentTime
    close(command)

    print currentTime - prevTime
    prevTime = currentTime
  }
'

Output:

1335629268
0   
0   
0   
0   
0   
0   
0   
0   
0   
0   
0   
0   
0   
0   
1   
0   
0   
0   
0

Note that the first number is simply the result of date +%s, because prevTime starts at 0.

* Yes, the book is about the Korn shell, while the OP is using bash, but bash and ksh are alike in this respect.
