# logparser

**Repository Path**: bluew11/logparser

## Basic Information

- **Project Name**: logparser
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2024-02-26
- **Last Updated**: 2024-02-26

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Logparser

Logparser provides a machine learning toolkit and benchmarks for automated log parsing, which is a crucial step for structured log analytics. By applying logparser, users can automatically extract event templates from unstructured logs and convert raw log messages into a sequence of structured events. The process of log parsing is also known as message template extraction, log key extraction, or log message clustering in the literature.
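
To make the idea of an event template concrete, here is a minimal sketch of what a parsed result looks like. It does not use logparser itself: the log message, the template, the `<*>` wildcard notation, and the helper names `template_to_regex` and `extract_parameters` are all assumptions chosen for illustration. A real parser has to discover the template automatically, whereas this sketch takes it as given and only shows how a template splits a message into a constant part and its variable parts.

```python
import re

# Hypothetical raw log message and event template, used only for illustration.
raw_message = "Received block blk_-562725280853087685 of size 67108864 from /10.251.91.84"
event_template = "Received block <*> of size <*> from <*>"


def template_to_regex(template: str) -> "re.Pattern":
    """Turn a template with <*> wildcards into an anchored regular expression."""
    # Escape the constant pieces so they match literally, then join them
    # with a non-greedy capture group standing in for each wildcard.
    parts = [re.escape(part) for part in template.split("<*>")]
    return re.compile("^" + "(.*?)".join(parts) + "$")


def extract_parameters(message: str, template: str) -> list:
    """Return the variable parts of `message`, or [] if it does not match."""
    match = template_to_regex(template).match(message)
    return list(match.groups()) if match else []


params = extract_parameters(raw_message, event_template)
print(params)
```

Under these assumptions, the message is reduced to one event (the template) plus three parameters: the block ID, the size, and the source address. A message that does not fit the template yields an empty list.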


An example of log parsing

### 🌈 New updates

+ Since the first release of logparser, many PRs and issues have been submitted due to incompatibility with Python 3. We have now released logparser v1.0.0 with support for Python 3. Thanks for all the contributions ([#PR86](https://github.com/logpai/logparser/pull/86), [#PR85](https://github.com/logpai/logparser/pull/85), [#PR83](https://github.com/logpai/logparser/pull/83), [#PR80](https://github.com/logpai/logparser/pull/80), [#PR65](https://github.com/logpai/logparser/pull/65), [#PR57](https://github.com/logpai/logparser/pull/57), [#PR53](https://github.com/logpai/logparser/pull/53), [#PR52](https://github.com/logpai/logparser/pull/52), [#PR51](https://github.com/logpai/logparser/pull/51), [#PR49](https://github.com/logpai/logparser/pull/49), [#PR18](https://github.com/logpai/logparser/pull/18), [#PR22](https://github.com/logpai/logparser/pull/22))!
+ We built the package wheel logparser3 and released it on PyPI. Please install via `pip install logparser3`.
+ We refactored the code structure and reformatted the code with the Python code formatter black.

### Log parsers available:

| Publication | Parser | Paper Title | Benchmark |
|:-----------:|:------:|:------------|:---------:|
| IPOM'03 | [SLCT](https://github.com/logpai/logparser/tree/main/logparser/SLCT#slct) | [A Data Clustering Algorithm for Mining Patterns from Event Logs](https://ristov.github.io/publications/slct-ipom03-web.pdf), by Risto Vaarandi. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/SLCT#benchmark) |
| QSIC'08 | [AEL](https://github.com/logpai/logparser/tree/main/logparser/AEL#ael) | [Abstracting Execution Logs to Execution Events for Enterprise Applications](https://www.researchgate.net/publication/4366728_Abstracting_Execution_Logs_to_Execution_Events_for_Enterprise_Applications_Short_Paper), by Zhen Ming Jiang, Ahmed E. Hassan, Parminder Flora, Gilbert Hamann. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/AEL#benchmark) |
| KDD'09 | [IPLoM](https://github.com/logpai/logparser/tree/main/logparser/IPLoM#iplom) | [Clustering Event Logs Using Iterative Partitioning](https://web.cs.dal.ca/~makanju/publications/paper/kdd09.pdf), by Adetokunbo Makanju, A. Nur Zincir-Heywood, Evangelos E. Milios. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/IPLoM#benchmark) |
| ICDM'09 | [LKE](https://github.com/logpai/logparser/tree/main/logparser/LKE#lke) | [Execution Anomaly Detection in Distributed Systems through Unstructured Log Analysis](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/DM790-CR.pdf), by Qiang Fu, Jian-Guang Lou, Yi Wang, Jiang Li. [**Microsoft**] | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/LKE#benchmark) |
| MSR'10 | [LFA](https://github.com/logpai/logparser/tree/main/logparser/LFA#lfa) | [Abstracting Log Lines to Log Event Types for Mining Software System Logs](http://www.se.rit.edu/~mei/publications/pdfs/Abstracting-Log-Lines-to-Log-Event-Types-for-Mining-Software-System-Logs.pdf), by Meiyappan Nagappan, Mladen A. Vouk. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/LFA#benchmark) |
| CIKM'11 | [LogSig](https://github.com/logpai/logparser/tree/main/logparser/LogSig#logsig) | [LogSig: Generating System Events from Raw Textual Logs](https://users.cs.fiu.edu/~taoli/pub/liang-cikm2011.pdf), by Liang Tang, Tao Li, Chang-Shing Perng. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/LogSig#benchmark) |
| SCC'13 | [SHISO](https://github.com/logpai/logparser/tree/main/logparser/SHISO#shiso) | [Incremental Mining of System Log Format](http://ieeexplore.ieee.org/document/6649746/), by Masayoshi Mizutani. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/SHISO#benchmark) |
| CNSM'15 | [LogCluster](https://github.com/logpai/logparser/tree/main/logparser/LogCluster#logcluster) | [LogCluster - A Data Clustering and Pattern Mining Algorithm for Event Logs](http://dl.ifip.org/db/conf/cnsm/cnsm2015/1570161213.pdf), by Risto Vaarandi, Mauno Pihelgas. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/LogCluster#benchmark) |
| CNSM'15 | [LenMa](https://github.com/logpai/logparser/tree/main/logparser/LenMa#lenma) | [Length Matters: Clustering System Log Messages using Length of Words](https://arxiv.org/pdf/1611.03213.pdf), by Keiichi Shima. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/LenMa#benchmark) |
| CIKM'16 | [LogMine](https://github.com/logpai/logparser/tree/main/logparser/LogMine#logmine) | [LogMine: Fast Pattern Recognition for Log Analytics](http://www.cs.unm.edu/~mueen/Papers/LogMine.pdf), by Hossein Hamooni, Biplob Debnath, Jianwu Xu, Hui Zhang, Geoff Jiang, Abdullah Mueen. [**NEC**] | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/LogMine#benchmark) |
| ICDM'16 | [Spell](https://github.com/logpai/logparser/tree/main/logparser/Spell#spell) | [Spell: Streaming Parsing of System Event Logs](https://www.cs.utah.edu/~lifeifei/papers/spell.pdf), by Min Du, Feifei Li. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/Spell#benchmark) |
| ICWS'17 | [Drain](https://github.com/logpai/logparser/tree/main/logparser/Drain#drain) | [Drain: An Online Log Parsing Approach with Fixed Depth Tree](https://jiemingzhu.github.io/pub/pjhe_icws2017.pdf), by Pinjia He, Jieming Zhu, Zibin Zheng, and Michael R. Lyu. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/Drain#benchmark) |
| ICPC'18 | [MoLFI](https://github.com/logpai/logparser/tree/main/logparser/MoLFI#molfi) | [A Search-based Approach for Accurate Identification of Log Message Formats](http://publications.uni.lu/bitstream/10993/35286/1/ICPC-2018.pdf), by Salma Messaoudi, Annibale Panichella, Domenico Bianculli, Lionel Briand, Raimondas Sasnauskas. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/MoLFI#benchmark) |
| TSE'20 | [Logram](https://github.com/logpai/logparser/tree/main/logparser/Logram#logram) | [Logram: Efficient Log Parsing Using n-Gram Dictionaries](https://arxiv.org/pdf/2001.03038.pdf), by Hetong Dai, Heng Li, Che-Shao Chen, Weiyi Shang, and Tse-Hsun (Peter) Chen. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/Logram#benchmark) |
| ECML-PKDD'20 | [NuLog](https://github.com/logpai/logparser/tree/main/logparser/NuLog#NuLog) | [Self-Supervised Log Parsing](https://arxiv.org/abs/2003.07905), by Sasho Nedelkoski, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, Odej Kao. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/NuLog#benchmark) |
| ICSME'22 | [ULP](https://github.com/logpai/logparser/tree/main/logparser/ULP#ULP) | [An Effective Approach for Parsing Large Log Files](https://users.encs.concordia.ca/~abdelw/papers/ICSME2022_ULP.pdf), by Issam Sedki, Abdelwahab Hamou-Lhadj, Otmane Ait-Mohamed, Mohammed A. Shehab. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/ULP#benchmark) |
| TSC'23 | [Brain](https://github.com/logpai/logparser/tree/main/logparser/Brain#Brain) | [Brain: Log Parsing with Bidirectional Parallel Tree](https://ieeexplore.ieee.org/abstract/document/10109145), by Siyu Yu, Pinjia He, Ningjiang Chen, Yifan Wu. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/Brain#benchmark) |
| ICSE'24 | [DivLog](https://github.com/logpai/logparser/tree/main/logparser/DivLog#DivLog) | [DivLog: Log Parsing with Prompt Enhanced In-Context Learning](https://doi.org/10.1145/3597503.3639155), by Junjielong Xu, Ruichun Yang, Yintong Huo, Chengyu Zhang, and Pinjia He. | [:arrow_upper_right:](https://github.com/logpai/logparser/tree/main/logparser/DivLog#benchmark) |

:bulb: You are welcome to submit a PR that adds your parser code to logparser and your paper to the table.

### Installation

We recommend installing the logparser package and its requirements via pip.

```
pip install logparser3
```

The package depends on the following requirements. Note that regex matching in Python is brittle, so we recommend pinning the regex library to version 2022.3.2.

+ python 3.6+
+ regex 2022.3.2
+ numpy
+ pandas
+ scipy
+ scikit-learn

Conditional requirements:

+ If using MoLFI: `deap`
+ If using SHISO: `nltk`
+ If using SLCT: `gcc`
+ If using LogCluster: `perl`
+ If using NuLog: `torch`, `torchvision`, `keras_preprocessing`
+ If using DivLog: `openai`, `tiktoken` (requires Python 3.8+)

### Get started

1. Run the demo:

    For each log parser, we provide a demo to help you get started. Each demo shows the basic usage of a target log parser and the hyper-parameters to configure. For example, the following command shows how to run the demo for Drain.

    ```
    cd logparser/Drain
    python demo.py
    ```

2. Run the benchmark:

    For each log parser, we provide a benchmark script that runs log parsing on the [loghub_2k datasets](https://github.com/logpai/logparser/tree/main/data#loghub_2k) to evaluate parsing accuracy. You can also use [other benchmark datasets for log parsing](https://github.com/logpai/logparser/tree/main/data#datasets).

    ```
    cd logparser/Drain
    python benchmark.py
    ```

    The benchmarking results can be found in the README file of each parser, e.g., https://github.com/logpai/logparser/tree/main/logparser/Drain#benchmark.

3. Parse your own logs:

    It is easy to apply logparser to parse your own log data. To do so, install the logparser3 package first. Then write your own script, following the code snippet below, to start log parsing. See the full example code at [example/parse_your_own_logs.py](https://github.com/logpai/logparser/blob/main/example/parse_your_own_logs.py).

```python
from logparser.Drain import LogParser

input_dir = 'PATH_TO_LOGS/' # The input directory of log file
output_dir = 'result/' # The output directory of parsing results
log_file = 'unknow.log' # The input log file name
log_format = '