feat: stabilize Table class #979

Merged: lars-reimann merged 78 commits into main from 877-improve-tests-for-table on Jan 12, 2025
lars-reimann (Member) commented Jan 12, 2025

Closes #875
Closes #877
Closes partially #977

Summary of Changes

Stabilize the API of the Table class. This PR introduces several breaking changes to this class:

  • All optional parameters are now keyword-only, so we can reposition them later.
  • The data parameter of __init__ is now required.
  • remove_columns_except is renamed to select_columns.
    • The new method can also be called with a callback that determines which columns to select.
  • add_table_as_columns is renamed to add_tables_as_columns.
    • Multiple tables can now be passed at once.
  • add_table_as_rows is renamed to add_tables_as_rows.
    • Multiple tables can now be passed at once.
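
For readers unfamiliar with these patterns, here is a minimal sketch of the breaking changes listed above. It uses a toy class, `MiniTable`, which is a hypothetical stand-in and not the Safe-DS `Table` implementation. The callback signature `(name, values)` and passing multiple tables as a list are assumptions made for illustration only.

```python
import random
from collections.abc import Callable
from typing import Union


class MiniTable:
    """Toy column-oriented table; a hypothetical stand-in, not the Safe-DS Table."""

    def __init__(self, data: dict[str, list]) -> None:
        # `data` has no default value, so it must always be supplied.
        self._data = {name: list(values) for name, values in data.items()}

    def to_dict(self) -> dict[str, list]:
        # Return copies so callers cannot mutate the table's internal lists.
        return {name: list(values) for name, values in self._data.items()}

    def select_columns(
        self,
        selector: Union[list[str], Callable[[str, list], bool]],
    ) -> "MiniTable":
        # Accept either an explicit list of names or a callback that decides
        # per column. The (name, values) signature is an assumption of this sketch.
        if callable(selector):
            names = [name for name, values in self._data.items() if selector(name, values)]
        else:
            names = list(selector)
        return MiniTable({name: self._data[name] for name in names})

    def add_tables_as_rows(self, others: list["MiniTable"]) -> "MiniTable":
        # Append the rows of every other table; all tables must share column names.
        merged = self.to_dict()
        for other in others:
            for name in merged:
                merged[name].extend(other._data[name])
        return MiniTable(merged)

    def shuffle_rows(self, *, random_seed: int = 0) -> "MiniTable":
        # The bare `*` makes `random_seed` keyword-only, so the parameter can be
        # repositioned later without breaking callers.
        row_count = len(next(iter(self._data.values()), []))
        order = list(range(row_count))
        random.Random(random_seed).shuffle(order)
        return MiniTable({name: [values[i] for i in order] for name, values in self._data.items()})


table = MiniTable({"a": [1, 2], "b": ["x", "y"]})
numeric = table.select_columns(lambda name, values: all(isinstance(v, int) for v in values))
stacked = table.add_tables_as_rows([table, table])
shuffled = table.shuffle_rows(random_seed=42)
```

In this sketch, `table.shuffle_rows(42)` raises a `TypeError` because the seed can only be passed by keyword, and `MiniTable()` fails because `data` is required.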

It also adds new functionality throughout the library:

  • New method Table.add_index_column to add a new column with auto-incrementing integer values to a table.
  • New method Table.filter_rows to keep only the rows matched by some predicate.
  • New method Table.filter_rows_by_column to keep only the rows that have a value in a specific column that matches some predicate.
  • New parameter random_seed for Table.shuffle_rows and Table.split_rows to control the pseudorandom number generator. Previously, both methods were deterministic but used a hidden, fixed seed.
  • New parameter missing_value_ratio_threshold of Table.remove_columns_with_missing_values to be able to keep columns with only a few missing values.
  • Various static factory methods under ColumnType to instantiate column types. This prepares for #754 (Overwrite specified schema (selectively)).
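
To make the semantics of the new row and column operations concrete, here is a rough sketch written as plain functions over a dict of columns. These functions only borrow the method names from the list above; they are hypothetical illustrations, not the Safe-DS implementation. The `first_index` parameter, the placement of the index column at the front, and treating the threshold as inclusive are all assumptions of this sketch.

```python
from collections.abc import Callable

Columns = dict[str, list]  # column name -> values; None marks a missing value


def add_index_column(data: Columns, name: str, *, first_index: int = 0) -> Columns:
    # Prepend a column of auto-incrementing integers (placement is an assumption).
    row_count = len(next(iter(data.values()), []))
    return {name: list(range(first_index, first_index + row_count)), **data}


def filter_rows(data: Columns, predicate: Callable[[dict], bool]) -> Columns:
    # Keep only the rows for which the predicate returns True.
    names = list(data)
    rows = [dict(zip(names, values)) for values in zip(*data.values())]
    kept = [row for row in rows if predicate(row)]
    return {name: [row[name] for row in kept] for name in names}


def filter_rows_by_column(data: Columns, name: str, predicate: Callable[[object], bool]) -> Columns:
    # Same as filter_rows, but the predicate only sees the value of one column.
    return filter_rows(data, lambda row: predicate(row[name]))


def remove_columns_with_missing_values(
    data: Columns,
    *,
    missing_value_ratio_threshold: float = 0.0,
) -> Columns:
    # Keep a column if its share of missing values does not exceed the threshold
    # (inclusive comparison is an assumption of this sketch).
    def missing_ratio(values: list) -> float:
        return sum(v is None for v in values) / len(values) if values else 0.0

    return {
        name: values
        for name, values in data.items()
        if missing_ratio(values) <= missing_value_ratio_threshold
    }


data = {
    "id": [10, 20, 30, 40],
    "score": [0.5, None, 0.9, 0.7],
    "note": [None, None, None, "ok"],
}
indexed = add_index_column(data, "idx")
large_ids = filter_rows_by_column(data, "id", lambda value: value > 15)
scored = filter_rows(data, lambda row: row["score"] is not None and row["id"] < 40)
strict = remove_columns_with_missing_values(data)
lenient = remove_columns_with_missing_values(data, missing_value_ratio_threshold=0.25)
```

With the default threshold of 0, only the fully populated `id` column survives. Raising the threshold to 0.25 also keeps `score`, where one of four values is missing.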

Finally, the methods Table.summarize_statistics and Column.summarize_statistics are now considerably faster.

@lars-reimann lars-reimann requested a review from a team as a code owner January 12, 2025 18:17
@lars-reimann lars-reimann linked an issue Jan 12, 2025 that may be closed by this pull request
56 tasks
@lars-reimann lars-reimann changed the title from "feat; stabilize Table class" to "feat: stabilize Table class" Jan 12, 2025
github-actions bot (Contributor) commented Jan 12, 2025

🦙 MegaLinter status: ✅ SUCCESS

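| Descriptor | Linter | Files | Fixed | Errors | Elapsed time |
|---|---|---|---|---|---|
| ✅ JSON | jsonlint | 3 | | 0 | 0.17s |
| ✅ JSON | npm-package-json-lint | yes | | no | 0.34s |
| ✅ JSON | prettier | 3 | 0 | 0 | 1.22s |
| ✅ JSON | v8r | 3 | | 0 | 2.94s |
| ✅ PYTHON | black | 277 | 0 | 0 | 5.96s |
| ✅ PYTHON | mypy | 277 | | 0 | 6.33s |
| ✅ PYTHON | ruff | 277 | 0 | 0 | 0.33s |
| ✅ REPOSITORY | git_diff | yes | | no | 0.38s |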

See detailed report in MegaLinter reports
Set VALIDATE_ALL_CODEBASE: true in mega-linter.yml to validate all sources, not only the diff


codecov bot commented Jan 12, 2025

Codecov Report

Attention: Patch coverage is 96.08939% with 21 lines in your changes missing coverage. Please review.

Project coverage is 94.99%. Comparing base (29fdefa) to head (c973ead).
Report is 23 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/safeds/ml/classical/regression/_regressor.py | 0.00% | 6 Missing ⚠️ |
| .../safeds/ml/classical/classification/_classifier.py | 16.66% | 5 Missing ⚠️ |
| src/safeds/ml/classical/_supervised_model.py | 50.00% | 4 Missing ⚠️ |
| src/safeds/ml/nn/_model.py | 50.00% | 2 Missing ⚠️ |
| src/safeds/_validation/_check_schema_module.py | 98.03% | 1 Missing ⚠️ |
| ...ds/data/tabular/transformation/_one_hot_encoder.py | 80.00% | 1 Missing ⚠️ |
| src/safeds/ml/metrics/_classification_metrics.py | 50.00% | 1 Missing ⚠️ |
| src/safeds/ml/metrics/_regression_metrics.py | 50.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #979      +/-   ##
==========================================
+ Coverage   94.42%   94.99%   +0.57%     
==========================================
  Files         121      123       +2     
  Lines        7459     7696     +237     
==========================================
+ Hits         7043     7311     +268     
+ Misses        416      385      -31     


@lars-reimann lars-reimann merged commit db85617 into main Jan 12, 2025
@lars-reimann lars-reimann deleted the 877-improve-tests-for-table branch January 12, 2025 19:48
lars-reimann pushed a commit that referenced this pull request Feb 26, 2025
## [0.30.0](v0.29.0...v0.30.0) (2025-02-26)

### Features

* add more mathematical operations ([#986](#986)) ([2539a20](2539a20)), closes [#977](#977)
* add more string operations ([#993](#993)) ([9bc5673](9bc5673)), closes [#977](#977)
* consistent `selector` parameters ([#983](#983)) ([dc4640b](dc4640b))
* improved operations on cells ([#985](#985)) ([7396c94](7396c94)), closes [#977](#977)
* make `data` parameter of `Table` and `Column` required ([#978](#978)) ([29fdefa](29fdefa))
* stabilize `Cell` class ([#984](#984)) ([96be911](96be911)), closes [#977](#977)
* stabilize `Column` ([#981](#981)) ([38dc89c](38dc89c)), closes [#754](#754) [#977](#977)
* stabilize `Row` class ([#980](#980)) ([ca1ce3d](ca1ce3d)), closes [#977](#977)
* stabilize `Table` class ([#979](#979)) ([db85617](db85617)), closes [#875](#875) [#877](#877) [#977](#977) [#754](#754)
* transform multiple columns of `Table` at once ([#982](#982)) ([2db9069](2db9069))
lars-reimann (Member, Author) commented:

🎉 This PR is included in version 0.30.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@lars-reimann lars-reimann added the released Included in a release label Feb 26, 2025
Development

Successfully merging this pull request may close these issues.

  • Improve tests for Table
  • Improve tests for classes related to typing of tabular data
