Add NLP metrics package #513

Merged
jhnwu3 merged 6 commits into sunlabuiuc:master from plandes:master
Jun 7, 2025

plandes commented May 22, 2025

This pull request:

  • Adds a pyhealth.nlp.metrics package with a common API for computing edit distance (Levenshtein), Rouge {1..9,L}, and Bleu metrics.
  • Adds unit test cases and a simple base class to control logging.
  • Adds a tighter requirements-nlp.txt file, which was needed to run the tests (this should be folded into requirements.txt).
  • Adds a very simple GNU makefile for build automation.
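
For readers unfamiliar with the first metric in the list, the following is a minimal sketch of a Levenshtein edit distance and one possible way to normalize it into a score. It is not the code added by this pull request: the function names `levenshtein` and `similarity` and the normalization by the longer string's length are assumptions made purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 for identical strings, 0.0 when every position differs."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

Under this assumed normalization, two identical strings score 1.0, which is the kind of value a test comparing a sentence with itself would expect.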

jhnwu3 requested a review from Copilot June 1, 2025 18:45

Copilot AI left a comment

Pull Request Overview

This pull request adds a new NLP metrics package along with its associated unit tests, updated dependency files for NLP, and build/CI automation.

  • Introduces the pyhealth.nlp.metrics package with edit distance, Rouge, and Bleu metrics.
  • Adds unit tests and a base test class to control logging and output expectations.
  • Updates dependency requirements and adds build/CI automation via a makefile and GitHub workflows.
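
As a rough illustration of what a Rouge-style metric measures, here is a sketch of a unigram-overlap (Rouge-1) F1 score over whitespace tokens. This is an assumption-laden simplification, not the package's implementation: the name `rouge1_f1`, the lowercasing, and the whitespace tokenization are all hypothetical choices, and real Rouge implementations typically add stemming and other preprocessing.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate text (illustrative only)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

A candidate that copies only part of the reference gets perfect precision but low recall, so the F1 score lands between the two.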

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/nlp/test_metrics.py | New unit tests for the NLP metrics package with logging setup. |
| tests/base.py | Added a base test class with logging helpers for tests. |
| test-resources/nlp/metrics.csv | Provides baseline CSV data for testing metric outputs. |
| requirements-nlp.txt | Lists the new NLP dependencies required for testing and execution. |
| makefile | Introduces a build automation file to install deps and run tests. |
| README.rst | Updates documentation with an added CI badge. |
| .github/workflows/test.yml | Configures CI for pull requests and pushes on master. |
| .github/workflows/run-unit-tests.yml | Removes an outdated workflow configuration. |

    self.assertEqual(1., res1.scores['editdistance'].value)

def test_pandas(self):
    WRITE: bool = 0

Copilot AI Jun 1, 2025

Consider initializing the boolean variable WRITE with 'False' instead of 0 for clarity.

Suggested change
WRITE: bool = 0
WRITE: bool = False
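
A short sketch of why the suggestion is about clarity rather than behavior: Python does not enforce annotations at runtime, so the original line runs fine but stores an `int` under a `bool` annotation. The variable names below are hypothetical and only mirror the pattern in the reviewed test.

```python
# Annotations are not checked at runtime, so both assignments succeed.
flag_from_int: bool = 0
flag_from_bool: bool = False


def type_name(value: object) -> str:
    """Return the runtime type name of a value."""
    return type(value).__name__


# The two flags compare equal and are both falsy, but their runtime types differ.
same_truthiness = (not flag_from_int) and (not flag_from_bool)
int_type = type_name(flag_from_int)
bool_type = type_name(flag_from_bool)
```

A static type checker would likely flag the first assignment, while the second matches its annotation.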

scorer = Scorer()
ss: ScoreSet = scorer.score(ScoreContext(self.pairs))
df: pd.DataFrame = ss.as_dataframe()
# give tolarance for arch high sig digits that might be off by epsilon

Copilot AI Jun 1, 2025

Typo in comment: 'tolarance' should be corrected to 'tolerance'.

Suggested change
# give tolarance for arch high sig digits that might be off by epsilon
# give tolerance for arch high sig digits that might be off by epsilon
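
The code comment under review refers to comparing floating-point scores with a tolerance, since low-order digits can differ across architectures. A minimal sketch of that idea using only the standard library follows. The helper name `all_close` and the default tolerances are assumptions, and the actual test may compare DataFrames in a different way.

```python
import math
from typing import Sequence


def all_close(
    expected: Sequence[float],
    actual: Sequence[float],
    rel_tol: float = 1e-9,
    abs_tol: float = 1e-12,
) -> bool:
    """Return True when both sequences have equal length and match element-wise within tolerance."""
    if len(expected) != len(actual):
        return False
    return all(
        math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for e, a in zip(expected, actual)
    )
```

For example, `0.1 + 0.2` is not exactly equal to `0.3` in binary floating point, yet `all_close([0.3], [0.1 + 0.2])` accepts the pair.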

matrix:
python-version: ['3.11']
steps:
- name: Checkout reposistory

Copilot AI Jun 1, 2025

Typo in workflow step: 'reposistory' should be corrected to 'repository'.

Suggested change
- name: Checkout reposistory
- name: Checkout repository

jhnwu3 left a comment

Looks good to me.

jhnwu3 merged commit 7a0a86c into sunlabuiuc:master Jun 7, 2025
1 check passed
dalloliogm pushed a commit to dalloliogm/PyHealth that referenced this pull request Nov 26, 2025
* fix file mode of a non-executable text file

* add basic NLP scoring methods package, unit tests, and automation

* fix unit tests

* fix unit tests

* add test resources; fix tests

* add CI bulid status badge