From d6c4697bf3826c05e9bc76446c835134acd045e6 Mon Sep 17 00:00:00 2001
From: Ubuntu <zbzscript@zbgpu01.ndj4anicnnlexeer5gqrjnt1hc.ax.internal.cloudapp.net>
Date: Sat, 1 Feb 2025 20:24:37 +0000
Subject: [PATCH] :art: add syntax highlighting to README

---
 README.md | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 5fec553..ec6e78a 100644
--- a/README.md
+++ b/README.md
@@ -21,18 +21,17 @@ To calculate the pseudo-perplexities, any huggingface model from the BERT family
 ### Getting Started
 To calcualte the pseudo-perplexity per word, run:
-```
+```bash
 # Installing the dependencies
->>> pip install transformers, tqdm
-
->>> python3 compute_pppl.py -m your-model-name -i path/to/your/data -o path/to/output/directory --window-size 11
+pip install transformers, tqdm
+python3 compute_pppl.py -m your-model-name -i path/to/your/data -o path/to/output/directory --window-size 11
 ```
 As input, the script expects a json file with the following structure:
-```
+```json
 [
 {
 "page_id": "ocr_27812752_p1.json",
@@ -54,7 +53,7 @@ As input, the script expects a json file with the following structure:
 The output are json files containing the pseudo-perplexity scores:
-```
+```json
 [
 {
 "page_id": "ocr_27812752_p1.json",
@@ -85,18 +84,18 @@ To calculate the pseudo-perplexity per sentence, we use the [Language Model Perp
 Install the Language Model Perplexity (LM-PPL) repository:
-```
->>> pip install lmppl
+```bash
+pip install lmppl
 ```
 To use the repositroy, run:
-```
->>> python3 run_lmppl.py -m your-model-name -i path/to/your/data -o path/to/output/directory
+```bash
+python3 run_lmppl.py -m your-model-name -i path/to/your/data -o path/to/output/directory
 ```
 As input, the script expects a json file with the following structure:
-```
+```json
 [
 {
 "sent_id": "ocr_26843985_p4_6",
@@ -117,7 +116,7 @@ As input, the script expects a json file with the following structure:
 The output are json files containing the pseudo-perplexity scores:
-```
+```json
 [
 {
 "sent_id": "ocr_26843985_p4_6",
-- 
GitLab