Add NLP metrics package #513
Merged
jhnwu3 merged 6 commits into sunlabuiuc:master from plandes:master on Jun 7, 2025
Pull Request Overview
This pull request adds a new NLP metrics package along with its associated unit tests, updated dependency files for NLP, and build/CI automation.
- Introduces the pyhealth.nlp.metrics package with edit distance, Rouge, and Bleu metrics.
- Adds unit tests and a base test class to control logging and output expectations.
- Updates dependency requirements and adds build/CI automation via a makefile and GitHub workflows.
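
For readers unfamiliar with the first metric listed above, the following is a minimal sketch of Levenshtein edit distance in plain Python. It is not the code added in this pull request; the function names `levenshtein` and `normalized_similarity` are hypothetical, and the normalization (1.0 for identical strings) is an assumption chosen for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 means identical, 0.0 means nothing shared."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

Only two rows of the dynamic-programming table are kept at a time, so memory grows with the shorter string rather than with the product of both lengths.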
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/nlp/test_metrics.py | New unit tests for the NLP metrics package with logging setup. |
| tests/base.py | Added a base test class with logging helpers for tests. |
| test-resources/nlp/metrics.csv | Provides baseline CSV data for testing metric outputs. |
| requirements-nlp.txt | Lists the new NLP dependencies required for testing and execution. |
| makefile | Introduces a build automation file to install deps and run tests. |
| README.rst | Updates documentation with an added CI badge. |
| .github/workflows/test.yml | Configures CI for pull requests and pushes to master. |
| .github/workflows/run-unit-tests.yml | Removes an outdated workflow configuration. |

            self.assertEqual(1., res1.scores['editdistance'].value)

        def test_pandas(self):
            WRITE: bool = 0

Consider initializing the boolean variable `WRITE` with `False` instead of `0` for clarity.
Suggested change:

    -        WRITE: bool = 0
    +        WRITE: bool = False


            scorer = Scorer()
            ss: ScoreSet = scorer.score(ScoreContext(self.pairs))
            df: pd.DataFrame = ss.as_dataframe()
            # give tolarance for arch high sig digits that might be off by epsilon

Typo in comment: 'tolarance' should be corrected to 'tolerance'.
Suggested change:

    -        # give tolarance for arch high sig digits that might be off by epsilon
    +        # give tolerance for arch high sig digits that might be off by epsilon

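
The comment under review concerns comparing computed scores against a stored baseline while allowing for tiny floating-point differences across architectures. The sketch below shows one way such a tolerant comparison could look using only the standard library. The helper name `rows_close` and the tolerance values are assumptions for illustration; the actual test in this pull request works on a pandas DataFrame and may compare values differently.

```python
import math


def rows_close(expected, actual, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two tables (lists of rows), allowing small float differences."""
    if len(expected) != len(actual):
        return False
    for expected_row, actual_row in zip(expected, actual):
        if len(expected_row) != len(actual_row):
            return False
        for e, a in zip(expected_row, actual_row):
            if isinstance(e, float) and isinstance(a, float):
                # Floats only need to agree within the given tolerances.
                if not math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol):
                    return False
            elif e != a:
                # Everything else (labels, ints) must match exactly.
                return False
    return True
```

For example, `rows_close([["bleu", 0.1 + 0.2]], [["bleu", 0.3]])` holds even though `0.1 + 0.2 == 0.3` is false in binary floating point.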

          matrix:
            python-version: ['3.11']
        steps:
          - name: Checkout reposistory

Typo in workflow step: 'reposistory' should be corrected to 'repository'.
Suggested change:

    -      - name: Checkout reposistory
    +      - name: Checkout repository

dalloliogm pushed a commit to dalloliogm/PyHealth that referenced this pull request on Nov 26, 2025:
* fix file mode of a non-executable text file
* add basic NLP scoring methods package, unit tests, and automation
* fix unit tests
* fix unit tests
* add test resources; fix tests
* add CI build status badge
This pull request:
- Adds the `pyhealth.nlp.metrics` package, which provides a common API for edit distance (Levenshtein), Rouge {1..9,L}, and Bleu metrics.
- Adds a `requirements-nlp.txt` file that was needed to run the tests (this should be folded into `requirements.txt`).
- Adds a `makefile`.
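
The notation Rouge {1..9,L} refers to the n-gram variants ROUGE-1 through ROUGE-9 plus the longest-common-subsequence variant ROUGE-L. As a rough illustration of the n-gram family only, here is a minimal recall-only sketch with whitespace tokenization. The names `ngrams` and `rouge_n_recall` are hypothetical, and the package in this pull request may tokenize, stem, and aggregate scores differently.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous run of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(reference: str, candidate: str, n: int = 1) -> float:
    """Fraction of reference n-grams that also appear in the candidate."""
    ref_counts = ngrams(reference.split(), n)
    cand_counts = ngrams(candidate.split(), n)
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0  # reference is shorter than n tokens
    # Counter intersection clips each n-gram to the smaller of the two counts.
    overlap = sum((ref_counts & cand_counts).values())
    return overlap / total
```

With the reference "the cat sat on the mat" and the candidate "the cat lay on the mat", five of six unigrams and three of five bigrams overlap.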