Add NLP metrics package by plandes · Pull Request #513 · sunlabuiuc/PyHealth

plandes · 2025-05-22T20:52:52Z

This pull request:

Adds a pyhealth.nlp.metrics package with a common API for computing edit distance (Levenshtein), Rouge {1..9, L}, and Bleu metrics.
Adds unit test cases and a simple base class to control logging.
Adds a tighter requirements-nlp.txt file that was needed to run the tests (this should be folded into requirements.txt).
Adds a very simple GNU makefile for build automation.
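
For readers unfamiliar with the edit distance metric named above, here is a minimal standalone sketch of Levenshtein distance and one possible normalization to a 0..1 similarity. It is not the code in pyhealth.nlp.metrics; the function names `levenshtein` and `normalized_similarity` and the normalization rule are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match)
            ))
        prev = curr
    return prev[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 for identical strings, 0.0 when every position differs."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

Under this assumed normalization, two identical strings score 1.0, which is the kind of value the `editdistance` assertion in the unit test checks for.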

Copilot

Pull Request Overview

This pull request adds a new NLP metrics package along with its associated unit tests, updated dependency files for NLP, and build/CI automation.

Introduces the pyhealth.nlp.metrics package with edit distance, Rouge, and Bleu metrics.
Adds unit tests and a base test class to control logging and output expectations.
Updates dependency requirements and adds build/CI automation via a makefile and GitHub workflows.

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

File	Description
tests/nlp/test_metrics.py	New unit tests for the NLP metrics package with logging setup.
tests/base.py	Added a base test class with logging helpers for tests.
test-resources/nlp/metrics.csv	Provides baseline CSV data for testing metric outputs.
requirements-nlp.txt	Lists the new NLP dependencies required for testing and execution.
makefile	Introduces a build automation file to install deps and run tests.
README.rst	Updates documentation with an added CI badge.
.github/workflows/test.yml	Configures CI for pull requests and pushes on master.
.github/workflows/run-unit-tests.yml	Removes an outdated workflow configuration.

Copilot · 2025-06-01T18:45:44Z

tests/nlp/test_metrics.py

+        self.assertEqual(1., res1.scores['editdistance'].value)
+
+    def test_pandas(self):
+        WRITE: bool = 0


Consider initializing the boolean variable WRITE with 'False' instead of 0 for clarity.

Suggested change

WRITE: bool = 0

WRITE: bool = False
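
A short sketch of why the suggestion matters: Python does not enforce annotations at runtime, so `WRITE: bool = 0` runs without error but stores an `int`, and a static type checker such as mypy would typically flag the mismatch. The variable names below are illustrative, not taken from the test file.

```python
# Annotations are not checked at runtime, so both assignments succeed.
WRITE_INT: bool = 0       # annotated as bool, but the stored value is an int
WRITE_BOOL: bool = False  # annotation and value agree

print(type(WRITE_INT).__name__)   # the runtime type is int, not bool
print(type(WRITE_BOOL).__name__)
print(WRITE_INT == WRITE_BOOL)    # equal by value, since bool is a subclass of int
print(WRITE_INT is WRITE_BOOL)    # but they are not the same object
```

Both values behave the same in an `if` test, so the change is about readability and type correctness rather than behavior.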

Copilot · 2025-06-01T18:45:44Z

tests/nlp/test_metrics.py

+        scorer = Scorer()
+        ss: ScoreSet = scorer.score(ScoreContext(self.pairs))
+        df: pd.DataFrame = ss.as_dataframe()
+        # give tolarance for arch high sig digits that might be off by epsilon


Typo in comment: 'tolarance' should be corrected to 'tolerance'.

Suggested change

# give tolarance for arch high sig digits that might be off by epsilon

# give tolerance for arch high sig digits that might be off by epsilon
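
The comment under review refers to comparing computed scores against a stored baseline while allowing for tiny floating-point differences across architectures. A minimal standard-library sketch of that idea follows; the helper `rows_close` and the tolerance values are assumptions, and the actual test works on a pandas DataFrame, for which `pandas.testing.assert_frame_equal` with its `rtol` and `atol` arguments would be the closer equivalent.

```python
import math
from typing import Sequence


def rows_close(actual: Sequence[float], expected: Sequence[float],
               rel_tol: float = 1e-9, abs_tol: float = 1e-12) -> bool:
    """Return True when both rows have equal length and every pair is within tolerance."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )


# 0.1 + 0.2 is not exactly 0.3 in binary floating point, so an exact
# comparison fails while a tolerance-based comparison succeeds.
computed = [0.1 + 0.2, 1.0 / 3.0]
baseline = [0.3, 0.3333333333333333]

print(computed == baseline)            # exact equality
print(rows_close(computed, baseline))  # equality within tolerance
```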

Copilot · 2025-06-01T18:45:44Z

.github/workflows/test.yml

+      matrix:
+        python-version: ['3.11']
+    steps:
+      - name: Checkout reposistory


Typo in workflow step: 'reposistory' should be corrected to 'repository'.

Suggested change

- name: Checkout reposistory

- name: Checkout repository

jhnwu3

Looks good to me.

* fix file mode of a non-executable text file
* add basic NLP scoring methods package, unit tests, and automation
* fix unit tests
* fix unit tests
* add test resources; fix tests
* add CI bulid status badge

plandes added 6 commits May 22, 2025 15:38

fix file mode of a non-executable text file

50907c9

add basic NLP scoring methods package, unit tests, and automation

b831a8b

fix unit tests

1afdce1

fix unit tests

787fe48

add test resources; fix tests

da40b62

add CI bulid status badge

2e26dd3

jhnwu3 requested a review from Copilot June 1, 2025 18:45

Copilot AI reviewed Jun 1, 2025

jhnwu3 approved these changes Jun 7, 2025

jhnwu3 merged commit 7a0a86c into sunlabuiuc:master Jun 7, 2025
1 check passed

jhnwu3 merged 6 commits into sunlabuiuc:master from plandes:master