A sample Python utility that uses LLMs to summarise code files, producing a short summary of the important aspects of each file.
The tool is packaged as a Python CLI (command-line interface): install it first, then run its commands from the terminal.
To create a virtual environment and install the tool, run the following commands from the project directory:

- Windows:

  ```shell
  python -m venv ./venv       # Create the virtual environment in the current directory
  .\venv\Scripts\activate     # Activate the virtual environment
  pip install .               # Install the codesummariser Python package
  ```

- Linux:

  ```shell
  python -m venv ./venv       # Create the virtual environment in the current directory
  source ./venv/bin/activate  # Activate the virtual environment
  pip install .               # Install the codesummariser Python package
  ```

To create the environment used to develop the tools, use:
```shell
pip install -r requirements-dev.txt
```

Once you are in an activated environment (see setup), you can run the tool using:

```shell
codesummariser
```

To get help on the available options, use:

```shell
codesummariser --help
```

The output of this is also included below:
```text
> codesummariser --help
usage: codesummariser [-h] [--search-dirs SEARCH_DIRS [SEARCH_DIRS ...]] [--code-exts CODE_EXTS [CODE_EXTS ...]]
                      [--summary-store SUMMARY_STORE] [--model MODEL] [--max-tokens MAX_TOKENS] [--model-temperature MODEL_TEMPERATURE]
                      [--cost-per-1k-tokens COST_PER_1K_TOKENS] [--always-check-existing-summaries] [--recursive]

options:
  -h, --help            show this help message and exit
  --search-dirs SEARCH_DIRS [SEARCH_DIRS ...]
                        Which directories to search for code files. Can be multiple, will default to:
                        [WindowsPath('C:/Users/michael.walshe/source/katalyze-data/code-summariser/data/inputs')]
  --code-exts CODE_EXTS [CODE_EXTS ...]
                        Which code extensions to search for. Can be multiple, will default to: {'.sas'}
  --summary-store SUMMARY_STORE
                        Where to store the code summary CSV, defaults to C:\Users\michael.walshe\source\katalyze-data\code-
                        summariser\data\summaries\summary.csv
  --model MODEL         Which OpenAI model to use, defaults to: gpt-3.5-turbo
  --max-tokens MAX_TOKENS
                        Max tokens to pass to the LLM in one chunk, defaults to: 4096
  --model-temperature MODEL_TEMPERATURE
                        How deterministic is the model?
  --cost-per-1k-tokens COST_PER_1K_TOKENS
                        How much does the model cost?
  --always-check-existing-summaries
                        codesummariser will check if there is an existing CSV in the summary-store location. If this flag is set, it will
                        error if that store does not exist.
  --recursive           Whether to search --search-dirs recursively
```
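As a rough illustration of how `--max-tokens` and `--cost-per-1k-tokens` relate, large files have to be split into chunks under the token budget, and spend scales with total tokens. The sketch below is hypothetical (the tool's real implementation would use a proper tokeniser such as tiktoken; whitespace words are only a stand-in):

```python
def split_into_chunks(words: list[str], max_tokens: int) -> list[list[str]]:
    """Greedily pack words into chunks of at most max_tokens items each.

    NOTE: real tokenisers count sub-word tokens; whitespace-split words
    are used here purely as an approximation for illustration.
    """
    return [words[i:i + max_tokens] for i in range(0, len(words), max_tokens)]


def estimate_cost(total_tokens: int, cost_per_1k_tokens: float) -> float:
    """Approximate spend: LLM usage is typically billed per 1,000 tokens."""
    return total_tokens / 1000 * cost_per_1k_tokens


# Pretend source file of ~10,000 "tokens", default 4096-token chunks
words = ["token"] * 10_000
chunks = split_into_chunks(words, max_tokens=4096)
print(len(chunks))                      # number of separate LLM calls needed
print(estimate_cost(len(words), 0.002))
```

Each chunk becomes one LLM call, so lowering `--max-tokens` increases the number of calls without changing the total token count (and hence the estimated cost) much.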
All notable changes to this project will be documented here. This project adheres to Semantic Versioning.

Unreleased:

- No unreleased changes yet

Initial solution release:

- All features