I have a simple find command that has to go through millions of files on a server and find those with a given suffix. Files are created and deleted very frequently. Is there a way to make the search faster? Using locate is out of the question, because building the locate database would be very expensive.
find /myDirWithThausandsofDirectories/ -name "*.suffix"
On some servers, these commands take several days!
Any thoughts?
Thanks,
Divide and conquer? Assuming a multiprocessing OS and a multi-core processor, run a separate find command for each subfolder:
for dir in /myDirWithThausandsofDirectories/*; do
    find "$dir" -name "*.suffix" &
done
Each find runs in the background; in bash, $! holds the PID of the most recently started background job, which can be passed to wait.
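As a rough sketch of how the background jobs could be tracked with $!, the function below records each PID and waits for all of them. The name search_subdirs and its two arguments are hypothetical, not part of the original answer, and files sitting directly in the top-level directory are not searched.

```shell
#!/usr/bin/env bash
# search_subdirs ROOT SUFFIX  (hypothetical helper, for illustration only)
# Starts one background find per top-level subdirectory and waits for all.
search_subdirs() {
    local root=$1 suffix=$2 dir pid status=0
    local pids=()
    for dir in "$root"/*; do
        [ -d "$dir" ] || continue          # skip plain files and an unmatched glob
        find "$dir" -type f -name "*$suffix" &
        pids+=("$!")                       # $! is the PID of the job just started
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || status=1            # remember whether any find failed
    done
    return "$status"
}
```

Called as `search_subdirs /myDirWithThausandsofDirectories .suffix`, this starts one process per subdirectory with no upper limit, so with thousands of subdirectories it may be worth capping the number of concurrent jobs.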
Alternatively, Bash can do the recursive matching itself with globstar:
shopt -s globstar
for path in /etc/**/*.conf; do
    echo "$path"
done
With an older Bash that lacks globstar, each depth has to be spelled out explicitly:
for path in /etc/*/*.conf /etc/*/*/*.conf /etc/*/*/*/*.conf; do
    echo "$path"
done
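Two details of the glob loops are easy to miss: when nothing matches, Bash leaves the literal pattern in place unless nullglob is set, and globstar requires Bash 4.0 or later. A minimal sketch that handles the first point, assuming a hypothetical helper named list_by_suffix (hidden files are still skipped unless dotglob is also enabled):

```shell
#!/usr/bin/env bash
# list_by_suffix ROOT SUFFIX  (hypothetical helper, for illustration only)
# The function body is a subshell so the shopt changes do not leak out.
list_by_suffix() (
    shopt -s globstar nullglob
    # "**/" matches zero or more directory levels, so files directly
    # under ROOT are included as well.
    for path in "$1"/**/*"$2"; do
        if [ -f "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
)
```

The shell builds and sorts the complete list of matches before the loop starts, so on a tree with millions of files this is likely to use far more memory than find, which prints results as it goes.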
Another approach is to split the top-level directories into five lists and search them in parallel:
# List the top-level subdirectories (not the root itself), one per line.
find /myDirWithThausandsofDirectories/ -mindepth 1 -maxdepth 1 -type d > /tmp/input
# Read them into an array (assumes no newlines in directory names).
IFS=$'\n' read -r -d '' -a files < /tmp/input

do_it() {
    for f; do
        # Print each match with the trailing .suffix stripped.
        find "$f" -name "*.suffix" | sed -e 's/\.suffix$//'
    done
}

# Divide the list into 5 sub-lists, round-robin.
a=() b=() c=() d=() e=()
for ((i = 0; i < ${#files[@]}; i++)); do
    case $((i % 5)) in
        0) a+=("${files[i]}") ;;
        1) b+=("${files[i]}") ;;
        2) c+=("${files[i]}") ;;
        3) d+=("${files[i]}") ;;
        4) e+=("${files[i]}") ;;
    esac
done

# Process the sub-lists in parallel; append so the jobs do not truncate each other's logs.
do_it "${a[@]}" >> /tmp/f.unsorted 2>> /tmp/f.err &
do_it "${b[@]}" >> /tmp/f.unsorted 2>> /tmp/f.err &
do_it "${c[@]}" >> /tmp/f.unsorted 2>> /tmp/f.err &
do_it "${d[@]}" >> /tmp/f.unsorted 2>> /tmp/f.err &
do_it "${e[@]}" >> /tmp/f.unsorted 2>> /tmp/f.err &
wait
echo "Find is Done!"
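A different way to get the same fan-out without maintaining five arrays by hand is to let xargs schedule the jobs. This is a swapped-in technique, not the script above: a sketch assuming GNU or BSD find and xargs (the -maxdepth, -print0, -0 and -P options are extensions, not POSIX), with a hypothetical function name parallel_find and a hypothetical JOBS argument.

```shell
#!/usr/bin/env bash
# parallel_find ROOT SUFFIX [JOBS]  (hypothetical helper, for illustration only)
parallel_find() {
    local root=$1 suffix=$2 jobs=${3:-4}
    # Files sitting directly in ROOT are covered by one non-recursive pass.
    find "$root" -maxdepth 1 -type f -name "*$suffix"
    # Each top-level subdirectory is handed to its own find process,
    # with at most JOBS of them running at any time.
    find "$root" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$jobs" -I{} find {} -type f -name "*$suffix"
}
```

For example, `parallel_find /myDirWithThausandsofDirectories .suffix 5 > /tmp/f.unsorted` would mirror the five-way split. Because several find processes share one output stream, long listings may interleave, so the result should be treated as unordered. Whether this is actually faster depends on the storage: on a single spinning disk, parallel scans can end up competing for the same I/O.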