Extract files from tar.gz without touching the disk

Current process:

  • I have a tar.gz file. (In fact, I have about 2,000 of them, but that is another story.)
  • I create a temporary directory and extract the tar.gz file, which yields 100,000 tiny files (about 600 bytes each).
  • For each file, I feed it into a processing program, pipe the output of that loop into another analysis program, and save the result.

The temporary space on the machines I use can barely handle one of these processes at a time, never mind the 16 (hyperthreaded dual quad-core) that they get by default. I am looking for a way to do this processing without saving to disk. I believe the performance penalty for pulling files out individually with tar -xf $file -O <targetname> would be prohibitive, but that might be what I'm stuck with.

Is there any way to do this?

EDIT: Since two people have already made this mistake, I will clarify:

  • Each file represents one point in time.
  • Each file is processed separately.
  • After processing (in this case, a variant of Fourier analysis), each file gives one line of output.
  • These output lines can be combined to do things like autocorrelation across time.

EDIT2: Actual Code:

for f in posns/*; do
    ~/data_analysis/intermediate_scattering_function < "$f"
done | ~/data_analysis/complex_autocorrelation.awk limit=1000 > inter_autocorr.txt

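Depending on what the processing needs, you could do this from a script. Python has the tarfile module, which can read archive members without extracting them to disk (much like tar --to-stdout).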
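A minimal sketch of the tarfile route, assuming the archive is gzip-compressed and that each member is small enough to hold in memory (true for the roughly 600-byte files in the question). The names iter_members and build_demo_archive are hypothetical, made up for illustration; the demo archive is built in memory only so that the example is self-contained.

```python
import io
import tarfile


def iter_members(fileobj):
    """Yield (name, bytes) for each regular file in a gzip-compressed tar stream.

    Mode "r|gz" reads the archive sequentially, so nothing is written to disk.
    """
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other non-file entries
            data = archive.extractfile(member).read()
            yield member.name, data


def build_demo_archive():
    """Build a tiny tar.gz in memory (hypothetical stand-in for a real archive)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, payload in [("posns/t0", b"aaaa\n"), ("posns/t1", b"bbbb\n")]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, data in iter_members(build_demo_archive()):
        print(name, len(data))
```

For a real archive, pass an open binary file instead of the demo buffer, for example open(path, "rb"). Because the archive is decompressed only once, this avoids the cost of rescanning it for every member.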


You can use tar --to-stdout -xf $file to extract without writing any files; the contents go to stdout instead.

This assumes GNU tar, driven from bash.


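Alternatively, you could do the extraction inside a script.

Python has the tarfile module, and Perl has the Archive::Tar module. Either can read the tar file directly.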


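You can use the tar option --to-command=cmd to run a command for each file. tar pipes the file's contents to the command's standard input and sets environment variables with details about the file, such as TAR_FILENAME. See the tar documentation for more details.

For example: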

tar zxf file.tar.gz --to-command='./process.sh'

Note that OS X uses bsdtar by default, which does not have this option. You can call gnutar explicitly instead.
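Where GNU tar is not available, the behavior of --to-command can be approximated in Python: pipe each member to a command on stdin and pass the member name in TAR_FILENAME. This is only a sketch under those assumptions. The names make_archive, run_per_member, and DEMO_COMMAND are hypothetical, DEMO_COMMAND stands in for ./process.sh, and only TAR_FILENAME is set here, not the other variables GNU tar provides.

```python
import io
import os
import subprocess
import sys
import tarfile


def make_archive(files):
    """Build a small tar.gz in memory from {name: bytes} (demo helper only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


def run_per_member(fileobj, command):
    """Run `command` once per regular file, piping the file's bytes to its stdin.

    Returns the stdout of each run (as bytes), in archive order.
    """
    outputs = []
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # only regular files are passed to the command
            data = archive.extractfile(member).read()
            # Mimic GNU tar by exposing the member name to the child process.
            env = dict(os.environ, TAR_FILENAME=member.name)
            done = subprocess.run(command, input=data, env=env,
                                  stdout=subprocess.PIPE, check=True)
            outputs.append(done.stdout)
    return outputs


# Hypothetical stand-in for ./process.sh: prints "<name>:<stripped contents>".
DEMO_COMMAND = [
    sys.executable, "-c",
    "import os, sys; "
    "sys.stdout.write(os.environ['TAR_FILENAME'] + ':' + sys.stdin.read().strip())",
]


if __name__ == "__main__":
    demo = make_archive({"file_a": b"aaaa\n", "file_b": b"bbbb\n"})
    for output in run_per_member(demo, DEMO_COMMAND):
        print(output.decode())
```

Starting one process per member has a cost of its own, so with 100,000 tiny files it may be worth doing the per-file work inside the Python loop when that is possible.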


You can use a ramdisk ( http://www.vanemery.com/Linux/Ramdisk/ramdisk.html ) to extract and process the files. (I am boldly assuming that you are using Linux, but other UNIX systems should have similar facilities.)

tar zxvf <file.tar.gz> <path_to_extract> --to-command=cat

The above command prints the contents of the extracted file to the terminal only. Nothing is written to disk. The tar command must be GNU tar.

Example session:

$ cat file_a
aaaa
$ cat file_b
bbbb
$ cat file_c
cccc
$ tar zcvf file.tar.gz file_a file_b file_c
file_a
file_b
file_c
$ cd temp
$ ls <== no files in directory
$ tar zxvf ../file.tar.gz file_b --to-command=cat
file_b
bbbb
$ tar zxvf ../file.tar.gz file_a --to-command=cat
file_a
aaaa
$ ls  <== Even after tar extract - no files in directory. So, no changes to disk
$ tar --version
tar (GNU tar) 1.25
...
$