Is there an open source tool to automatically search for patterns in log files?

I have been working on a cluster system for many years and have decided that the time has come for us to have a tool that lets us easily query our text log files (among other things). I have dumped all the log files onto an old test machine, where they take up about 20 GB compressed but would take 550 GB uncompressed (partly due to many stack traces). We have different "topics" maintained by different people, and our log formats have changed over the years. But let us assume that I could somehow convert everything into a single, consistent format across all topics.

My question is: is there any free, open-source tool that I can simply point at these files and that will automatically recognize recurring, similar log messages? As an example, take a message like:

User John Smith has logged in from IP aaa.bbb.ccc.ddd. Duration: zzz ms.

Given many instances of such a message, the tool would derive a template such as:

User * has logged in from IP *. Duration: * ms.

Here * is a placeholder for the data that varies. Once we have these templates (which would need to be updated regularly), we could match each new message against them and produce useful statistics.
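To make the idea concrete, here is a minimal Python sketch of both steps: deriving templates and counting matches. It is a naive positional approach, not an established log-mining algorithm, and it rests on a loud assumption: two messages belong to the same message type if they have the same number of tokens and the same first token. The function names (`derive_templates`, `template_to_regex`, `count_matches`) and the sample messages are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Words (keeping dotted values such as IP addresses whole), whitespace runs,
# or single punctuation characters. Joining the tokens restores the message.
TOKEN_RE = re.compile(r"\w+(?:\.\w+)*|\s+|[^\w\s]")


def derive_templates(messages):
    """Hypothetical helper: wildcard the token positions that vary in a group."""
    groups = defaultdict(list)
    for message in messages:
        tokens = TOKEN_RE.findall(message)
        if not tokens:
            continue
        # Assumption: same token count and same first token => same message type.
        groups[(len(tokens), tokens[0])].append(tokens)

    templates = []
    for token_lists in groups.values():
        if len(token_lists) < 2:
            continue  # a single example cannot show which parts vary
        merged = "".join(
            column[0] if len(set(column)) == 1 else "*"
            for column in zip(*token_lists)
        )
        # Collapse runs such as "* *" (first name, last name) into one "*".
        templates.append(re.sub(r"\*(?:\s+\*)+", "*", merged))
    return templates


def template_to_regex(template):
    """Turn a template into a regex where each '*' matches any non-empty text."""
    literal_parts = (re.escape(part) for part in template.split("*"))
    return re.compile("(.+?)".join(literal_parts))


def count_matches(templates, messages):
    """Count how many messages fully match each template (first match wins)."""
    compiled = [(template, template_to_regex(template)) for template in templates]
    counts = Counter()
    for message in messages:
        for template, pattern in compiled:
            if pattern.fullmatch(message):
                counts[template] += 1
                break
    return counts


# Made-up sample data, for illustration only.
training = [
    "User John Smith has logged in from IP 10.0.0.1. Duration: 35 ms.",
    "User Alice Jones has logged in from IP 10.0.0.2. Duration: 120 ms.",
]
new_messages = [
    "User Bob Ray Lee has logged in from IP 192.168.1.7. Duration: 8 ms.",
    "Disk full on node7",
    "User Carol White has logged in from IP 10.0.0.9. Duration: 51 ms.",
]

templates = derive_templates(training)
counts = count_matches(templates, new_messages)
for template, n in counts.items():
    print(n, template)  # prints: 2 User * has logged in from IP *. Duration: * ms.
```

The grouping rule is the weak point: unrelated messages that happen to share a token count and a first word would be merged into one overly generic template, and messages that already contain a literal * are not handled. A real tool would need a smarter similarity measure, but the match-and-count step would look much the same.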

Ideally, the tool would be written in Java, Python, or Perl, since those are what we use, and it would need to work in a mixed Windows/Linux environment.
