Implements pandas missing-value utilities as standalone exported functions:

- `isna` / `notna` / `isnull` / `notnull`: detect missing values in scalars, Series, and DataFrames (mirrors pd.isna / pd.notna)
- `ffillSeries` / `bfillSeries`: forward/backward fill for Series with optional `limit` parameter
- `dataFrameFfill` / `dataFrameBfill`: column-wise or row-wise fill for DataFrames with optional `limit` and `axis` parameters

Metric: 28 → 29 pandas_features_ported
Run: https://github.com/githubnext/tsessebe/actions/runs/24263385922
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
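The forward-fill-with-`limit` behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration on plain arrays, not the tsb implementation: `isMissing` and `forwardFill` are hypothetical names, and the real `ffillSeries` operates on Series objects.

```typescript
// Hypothetical standalone helpers for illustration; not the tsb API.
type Scalar = number | string | boolean | null | undefined;

/** Treats null, undefined, and NaN as missing, similar to pd.isna on scalars. */
function isMissing(value: Scalar): boolean {
  return (
    value === null ||
    value === undefined ||
    (typeof value === "number" && Number.isNaN(value))
  );
}

/** Forward-fills missing entries, filling at most `limit` consecutive gaps. */
function forwardFill(values: readonly Scalar[], limit: number = Infinity): Scalar[] {
  const out: Scalar[] = [];
  let last: Scalar = null;
  let haveLast = false;
  let run = 0; // consecutive missing values since the last valid one
  for (const v of values) {
    if (!isMissing(v)) {
      last = v;
      haveLast = true;
      run = 0;
      out.push(v);
    } else {
      run += 1;
      // Leading gaps (no valid value yet) and gaps beyond the limit stay missing.
      out.push(haveLast && run <= limit ? last : v);
    }
  }
  return out;
}
```

A backward fill under the same assumptions would be the same loop run over the reversed array, with the result reversed again.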
Implements pctChangeSeries() and pctChangeDataFrame() mirroring pandas.Series.pct_change() / pandas.DataFrame.pct_change().

- periods: configurable lag (positive = backward, negative = forward)
- fillMethod: "pad" (default), "bfill", or null (no fill)
- limit: cap consecutive fills
- axis: column-wise (default) or row-wise for DataFrame

Full test coverage: unit tests, edge cases, and fast-check property tests.
Interactive playground page at playground/pct_change.html.

Run: https://github.com/githubnext/tsessebe/actions/runs/24266545401
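As a rough sketch of the `periods` semantics listed above, the core computation is `current / lagged - 1`. The `pctChange` helper below is hypothetical, works on plain arrays, and deliberately omits the `fillMethod` and `limit` handling (it behaves like `fillMethod: null`).

```typescript
// Hypothetical helper for illustration; no fill step is applied before the division.
function pctChange(
  values: readonly (number | null)[],
  periods = 1,
): (number | null)[] {
  return values.map((cur, i) => {
    const j = i - periods; // positive periods look backward, negative look forward
    if (j < 0 || j >= values.length) return null;
    const prev = values[j] ?? null;
    if (cur === null || prev === null) return null;
    return cur / prev - 1;
  });
}

// Usage: the first element has no predecessor, so it is null.
const changes = pctChange([100, 110, 99]);
```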
Run: https://github.com/githubnext/tsessebe/actions/runs/24281202174 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Run: https://github.com/githubnext/tsessebe/actions/runs/24282208612 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Run: https://github.com/githubnext/tsessebe/actions/runs/24282791339 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…g for Series and DataFrame Run: https://github.com/githubnext/tsessebe/actions/runs/24283807306 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- stats/duplicated.ts: duplicatedSeries, duplicatedDataFrame, dropDuplicatesSeries, dropDuplicatesDataFrame with keep='first'/'last'/false and subset support
- core/sample.ts: sampleSeries, sampleDataFrame with n/frac, replace, weighted sampling, and seeded RNG (randomState)
- 35 tests each (unit + fast-check properties)
- Playground pages: duplicated.html, sample.html

Run: https://github.com/githubnext/tsessebe/actions/runs/24285279820
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
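The three `keep` modes can be illustrated with a small sketch on plain arrays. The `duplicated` function and `Keep` type here are hypothetical stand-ins, not the signatures in stats/duplicated.ts, and `subset` support is left out.

```typescript
// Hypothetical sketch of the keep='first' / 'last' / false semantics.
type Keep = "first" | "last" | false;

function duplicated<T>(values: readonly T[], keep: Keep = "first"): boolean[] {
  if (keep === false) {
    // Mark every member of a group that occurs more than once.
    const counts = new Map<T, number>();
    for (const v of values) counts.set(v, (counts.get(v) ?? 0) + 1);
    return values.map((v) => (counts.get(v) ?? 0) > 1);
  }
  const seen = new Set<T>();
  const flags = new Array<boolean>(values.length).fill(false);
  const order = values.map((_, i) => i);
  if (keep === "last") order.reverse(); // scan from the end so the last occurrence is kept
  for (const i of order) {
    const v = values[i] as T;
    flags[i] = seen.has(v);
    seen.add(v);
  }
  return flags;
}
```

Dropping duplicates under this sketch is then a filter on the negated flags, for example `values.filter((_, i) => !flags[i])`.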
- stats/clip_advanced.ts: clipAdvancedSeries, clipAdvancedDataFrame with per-element bounds from scalar, array, Series (positional), or DataFrame (element-wise). DataFrame bounds support axis=0/1 for Series broadcasting.
- stats/apply.ts: applySeries, mapSeries (function/dict/Map), applyDataFrame (reduce per col/row), applyExpandDataFrame (transform per col/row → DataFrame), mapDataFrame (element-wise). Helper decomposition satisfies Biome complexity rules.
- 25+ unit + property-based tests each (fast-check)
- Playground pages: clip_advanced.html, apply.html
- Creates canonical branch autoloop/build-tsb-pandas-typescript-migration from iter 199

Run: https://github.com/githubnext/tsessebe/actions/runs/24287426738
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- stats/cut.ts: cut() for equal-width or user-defined bins, qcut() for quantile bins
- cutCodes() returns integer bin codes; cutCategories() returns label arrays
- CutOptions: right, labels, retbins, precision, includeLowest, ordered
- QcutOptions: labels, retbins, precision, duplicates (raise/drop)
- 30+ unit tests + fast-check property tests
- Playground page: cut.html (8 interactive demos)
- Export from stats/index.ts and src/index.ts

Run: https://github.com/githubnext/tsessebe/actions/runs/24288003426
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
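To make the `right` and `includeLowest` options concrete, here is a minimal sketch of assigning integer bin codes from user-defined edges. `binCodes` is a hypothetical helper, not the `cutCodes()` exported by stats/cut.ts; it assumes sorted edges and returns -1 for values outside every bin, following the pandas convention for missing codes.

```typescript
// Hypothetical sketch: bin codes for user-defined, sorted edges.
function binCodes(
  values: readonly number[],
  edges: readonly number[],
  right = true,
  includeLowest = false,
): number[] {
  return values.map((x) => {
    for (let i = 0; i < edges.length - 1; i++) {
      const lo = edges[i] as number;
      const hi = edges[i + 1] as number;
      // right=true gives (lo, hi]; right=false gives [lo, hi).
      const inside = right ? x > lo && x <= hi : x >= lo && x < hi;
      // includeLowest closes the left edge of the first right-closed bin.
      const atLowest = includeLowest && right && i === 0 && x === lo;
      if (inside || atLowest) return i;
    }
    return -1; // out of range or NaN
  });
}
```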
…range

Add `stats/interval.ts` with:

- `Interval` class: single bounded interval with all four closed types (left/right/both/neither)
- `IntervalIndex`: ordered array of intervals with fromBreaks, fromArrays, fromIntervals factories
- `intervalRange()`: equal-length interval ranges by period count or step size
- Lookup: indexOf, overlapping, append, isMonotonic
- 60+ unit tests + fast-check property tests
- Playground page interval.html (8 interactive demos)

Run: https://github.com/githubnext/tsessebe/actions/runs/24288493950
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Run: https://github.com/githubnext/tsessebe/actions/runs/24289114918

Add `stats/get_dummies.ts` with:

- `getDummies(data, options?)`: one-hot encode a Series or DataFrame (unified API)
- `getDummiesSeries`: encode a single Series into binary indicator columns
- `getDummiesDataFrame`: encode categorical columns in a DataFrame
- `fromDummies(df, options?)`: reverse one-hot encoding back to a categorical Series

Options: prefix, prefixSep, dummyNa, columns (DataFrame), dropFirst, dtype

45+ unit + fast-check tests. Playground page get_dummies.html (8 interactive demos).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add stats/crosstab.ts with crosstab() and crosstabSeries():

- Frequency count of co-occurrences of two factor arrays/Series
- Custom aggfunc (count/sum/mean/min/max) with values parameter
- margins: adds All row/column with totals
- normalize: all/index/columns proportion tables
- dropna: exclude/include null factor values

21 tests (unit + property-based) all pass. Lint clean.

Metric: 43 (previous best: 42, delta: +1).
Run: https://github.com/githubnext/tsessebe/actions/runs/24290127464
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements reshape/pivot_table.ts with full pandas.pivot_table() parity:

- All aggfuncs: mean, sum, min, max, count, first, last
- margins=true adds All row/column using raw data (not cell aggregates)
- margins_name to customize the All label
- sort option (default true) for lexicographic row/column ordering
- fill_value and dropna support
- Multiple index/column columns supported

Tests: 25 unit tests + 4 property-based tests (fast-check)
Playground: playground/pivot_table.html with 8 interactive demos

Metric: 44 (previous best: 43, delta: +1)
Run: https://github.com/githubnext/tsessebe/actions/runs/24290574060
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements pandas.DataFrame.explode / Series.explode:

- explodeSeries: expand array-valued cells into individual rows
- explodeDataFrame: explode one or more columns, repeating other columns
- ignore_index option to reset to RangeIndex
- Handles null, empty arrays, scalars, multi-column explosion (zip-longest)
- 27 unit tests + property-based tests (fast-check)
- Playground page with 8 interactive demos

Run: https://github.com/githubnext/tsessebe/actions/runs/24291234244
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add two new pandas features:

- src/stats/factorize.ts: factorize() and factorizeSeries() for integer encoding of categorical values. First-seen or sorted order, configurable NA sentinel. 30 unit tests + 4 property-based tests. Playground: factorize.html.
- src/reshape/wide_to_long.ts: wideToLong() to reshape wide-format DataFrames to long format by gathering stub-prefixed columns. Supports multiple stubs, custom separator/suffix, multiple id columns. 14 unit tests + 3 property-based tests. Playground: wide_to_long.html.

Metric: 47 pandas_features_ported (previous best: 46, delta: +1)
Run: https://github.com/githubnext/tsessebe/actions/runs/24292269871
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
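Integer encoding in first-seen order with a configurable NA sentinel can be sketched as below. `factorizeSketch` is a hypothetical name and signature, not the `factorize()` exported by src/stats/factorize.ts, and the sorted-order variant is not shown.

```typescript
// Hypothetical sketch: first-seen integer codes with an NA sentinel.
function factorizeSketch<T>(
  values: readonly (T | null | undefined)[],
  naSentinel = -1,
): { codes: number[]; uniques: T[] } {
  const positions = new Map<T, number>();
  const uniques: T[] = [];
  const codes = values.map((v) => {
    if (v === null || v === undefined) return naSentinel;
    let code = positions.get(v);
    if (code === undefined) {
      // First time this value is seen: assign the next free code.
      code = uniques.length;
      positions.set(v, code);
      uniques.push(v);
    }
    return code;
  });
  return { codes, uniques };
}
```

The original values can be recovered from this result with `codes.map((c) => (c === naSentinel ? null : uniques[c]))`, which is the round-trip property such an encoding is usually tested against.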
Implements pandas Series.interpolate() and DataFrame.interpolate():

- interpolateSeries: linear, pad/ffill, backfill/bfill, nearest methods
- interpolateDataFrame: axis=0 (column-wise) and axis=1 (row-wise)
- limit: max consecutive NaN values to fill
- limitDirection: forward, backward, both
- limitArea: inside (interior gaps only) or outside (edge values only)
- 35 unit tests + 4 property-based tests
- Playground page with 8 interactive demos

Run: https://github.com/githubnext/tsessebe/actions/runs/24292676836
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
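The linear method can be sketched on a plain array as follows. `interpolateLinear` is a hypothetical helper that fills interior gaps only (comparable to `limitArea: inside`) and assumes evenly spaced positions; it ignores `limit` and `limitDirection`.

```typescript
// Hypothetical sketch: linear interpolation of interior gaps only.
function interpolateLinear(values: readonly (number | null)[]): (number | null)[] {
  const out = [...values];
  let prev = -1; // index of the most recent valid value, or -1 if none yet
  for (let i = 0; i < out.length; i++) {
    const cur = out[i] ?? null;
    if (cur === null || Number.isNaN(cur)) continue;
    if (prev >= 0 && i - prev > 1) {
      // Fill the gap between two valid neighbours with equal steps.
      const start = out[prev] as number;
      const step = (cur - start) / (i - prev);
      for (let k = prev + 1; k < i; k++) out[k] = start + step * (k - prev);
    }
    prev = i;
  }
  return out;
}
```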
Port pandas DataFrame.select_dtypes(include, exclude) to TypeScript. Accepts exact dtype names and generic aliases (number, integer, floating, bool, string, datetime, timedelta, category, object, signed, unsigned). Run: https://github.com/githubnext/tsessebe/actions/runs/24293279696 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement io/read_excel.ts: a dependency-free XLSX reader built on a ZIP binary parser (EOCD + central directory + local headers), raw DEFLATE decompression via node:zlib inflateRawSync, and XML parsing via regex generators. Returns a full DataFrame with dtype inference, header/skipRows/nrows/naValues/indexCol options. Also exposes xlsxSheetNames() for metadata-only access. 26 passing tests, playground page added. Metric: 49 → 50 pandas features ported. Run: https://github.com/githubnext/tsessebe/actions/runs/24294236300 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements pandas.json_normalize() for tsb:

- Flatten nested objects using configurable sep (default '.')
- Unpack nested arrays of records via recordPath (string or path array)
- Include parent-level fields as metadata via meta + metaPrefix
- Apply recordPrefix to avoid column collisions
- maxLevel to cap flattening depth
- errors='raise'|'ignore' for complex meta values
- 26 tests: unit + property-based (fast-check)
- playground/json_normalize.html interactive tutorial

Metric: 51 (+1 from 50)
Run: https://github.com/githubnext/tsessebe/actions/runs/24294949963
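The flattening step with `sep` and `maxLevel` might look roughly like the sketch below. `flattenRecord` and the `Json` type are hypothetical and cover only nested-object flattening of a single record; `recordPath`, `meta`, and the prefix options are out of scope here.

```typescript
// Hypothetical sketch: flatten one nested record into separator-joined keys.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function flattenRecord(
  record: { [key: string]: Json },
  sep = ".",
  maxLevel = Infinity,
): Record<string, Json> {
  const out: Record<string, Json> = {};
  const walk = (obj: { [key: string]: Json }, prefix: string, level: number): void => {
    for (const [key, value] of Object.entries(obj)) {
      const name = prefix === "" ? key : `${prefix}${sep}${key}`;
      const isPlainObject =
        value !== null && typeof value === "object" && !Array.isArray(value);
      if (isPlainObject && level < maxLevel) {
        // Descend one level; arrays and scalars are kept as leaf values.
        walk(value as { [key: string]: Json }, name, level + 1);
      } else {
        out[name] = value;
      }
    }
  };
  walk(record, "", 0);
  return out;
}
```

With `maxLevel = 0` nothing is flattened, and with `maxLevel = 1` only the first level of nesting is expanded, which is how the pandas `max_level` argument behaves.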
Warning The 🤖 Iteration 217 — ✅ Accepted — Run
- src/stats/mode.ts: modeSeries/modeDataFrame - all tied modes sorted ascending; axis=0 (column-wise, null-padded) and axis=1 (row-wise); dropna and numericOnly options
- src/stats/skew_kurt.ts: skewSeries/kurtSeries/skewDataFrame/kurtDataFrame - adjusted Fisher-Pearson skewness and bias-corrected excess kurtosis; skipna, axis, numericOnly options
- Tests: mode.test.ts (16 unit + 3 property), skew_kurt.test.ts (18 unit + 3 property)
- Playground: mode.html, skew_kurt.html
- Metric: 53 (+2, from 51 → 53)

Run: https://github.com/githubnext/tsessebe/actions/runs/24296661989
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
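For reference, the adjusted Fisher-Pearson skewness and the bias-corrected excess kurtosis named above can be computed from biased central moments as sketched here. The function names are hypothetical, inputs are assumed to be free of missing values, and returning 0 for zero-variance input is an assumption about the convention rather than something this commit message states.

```typescript
// Hypothetical sketch of the two sample statistics on a clean numeric array.
function centralMoment(xs: readonly number[], k: number): number {
  const mean = xs.reduce((s, x) => s + x, 0) / xs.length;
  return xs.reduce((s, x) => s + (x - mean) ** k, 0) / xs.length;
}

/** G1 = sqrt(n(n-1)) / (n-2) * m3 / m2^1.5 */
function sampleSkew(xs: readonly number[]): number {
  const n = xs.length;
  if (n < 3) return Number.NaN;
  const m2 = centralMoment(xs, 2);
  if (m2 === 0) return 0; // assumed convention for constant input
  const g1 = centralMoment(xs, 3) / m2 ** 1.5;
  return (Math.sqrt(n * (n - 1)) / (n - 2)) * g1;
}

/** G2 = (n-1) / ((n-2)(n-3)) * ((n+1) * g2 + 6), with g2 = m4 / m2^2 - 3 */
function sampleExcessKurtosis(xs: readonly number[]): number {
  const n = xs.length;
  if (n < 4) return Number.NaN;
  const m2 = centralMoment(xs, 2);
  if (m2 === 0) return 0; // assumed convention for constant input
  const g2 = centralMoment(xs, 4) / m2 ** 2 - 3;
  return ((n - 1) / ((n - 2) * (n - 3))) * ((n + 1) * g2 + 6);
}
```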
Add stats/sem_var.ts: varSeries/varDataFrame (sample/population variance, configurable ddof/skipna/minCount/axis/numericOnly) and semSeries/semDataFrame (SEM = sqrt(var/n)). StatFn type alias for clean reducer callbacks. 25 unit tests + 3 property tests.

Add stats/nunique.ts: nuniqueSeries/nuniqueDataFrame (count unique values, dropna), anySeries/allSeries (boolean reductions, skipna, vacuous all), anyDataFrame/allDataFrame (axis, skipna, boolOnly). Extract anyInSlice/allInSlice/rowValues helpers to keep complexity under 15. 31 unit tests + 2 property tests.

Playground: sem_var.html, nunique.html. Update playground/index.html.

Metric: 55 (+2 from 53 actual baseline, beats best_metric 54).
Run: https://github.com/githubnext/tsessebe/actions/runs/24299079452
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
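The relationship between `ddof`, variance, and SEM = sqrt(var/n) can be sketched as below. `variance` and `standardError` are hypothetical helpers on plain arrays; the skipna, minCount, and axis handling from the commit is not modelled.

```typescript
// Hypothetical sketch: variance with a configurable ddof, and SEM built on it.
function variance(xs: readonly number[], ddof = 1): number {
  const n = xs.length;
  if (n - ddof <= 0) return Number.NaN; // not enough observations
  const mean = xs.reduce((s, x) => s + x, 0) / n;
  const sumSq = xs.reduce((s, x) => s + (x - mean) ** 2, 0);
  return sumSq / (n - ddof); // ddof=1: sample variance, ddof=0: population variance
}

function standardError(xs: readonly number[], ddof = 1): number {
  return Math.sqrt(variance(xs, ddof) / xs.length);
}
```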
🤖 Iteration 220 — ✅ Accepted — Run Add two pandas features:
Metric: 55 (+1 vs best 54) | Commit: bb3f8f3 Previous metric 54 was from iter 219 (sem_var) which was lost in a push failure — this iteration recovers it and adds nunique/any/all. |
…aFrame

Implements pandas Series.quantile() and DataFrame.quantile() with full feature parity:

- quantileSeries(series, options): number | Series<Scalar>
  - q: scalar or array of quantile levels in [0, 1]
  - interpolation: linear, lower, higher, midpoint, nearest
  - skipna: true (default) ignores null/NaN
- quantileDataFrame(df, options): Series<Scalar> | DataFrame
  - axis=0 (default): per-column quantiles
  - axis=1: per-row quantiles
  - numericOnly: true (default) skips non-numeric columns
  - multi-q returns DataFrame; scalar q returns Series
- 46 unit + 4 property-based tests (fast-check)
  - monotonicity, q=0/1 min/max invariants, lower<=linear<=higher
- playground/quantile.html: 7 interactive demos

Metric: 56 (+1 vs 55)
Run: https://github.com/githubnext/tsessebe/actions/runs/24299797044
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
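The five interpolation modes differ only in how they combine the two order statistics around the fractional position `q * (n - 1)`. The sketch below uses a hypothetical `quantile` helper on a plain array; its tie-breaking for `nearest` (ties go to the lower neighbour) is a simplifying assumption and may not match pandas exactly.

```typescript
// Hypothetical sketch of a single-quantile computation with NaN values skipped.
type Interpolation = "linear" | "lower" | "higher" | "midpoint" | "nearest";

function quantile(
  values: readonly number[],
  q: number,
  interpolation: Interpolation = "linear",
): number {
  if (q < 0 || q > 1) throw new RangeError("q must be in [0, 1]");
  const sorted = values.filter((v) => !Number.isNaN(v)).sort((a, b) => a - b);
  if (sorted.length === 0) return Number.NaN;
  const pos = q * (sorted.length - 1); // fractional index into the sorted data
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  const a = sorted[lo] as number;
  const b = sorted[hi] as number;
  const frac = pos - lo;
  switch (interpolation) {
    case "lower":
      return a;
    case "higher":
      return b;
    case "midpoint":
      return (a + b) / 2;
    case "nearest":
      return frac <= 0.5 ? a : b; // assumption: ties go to the lower neighbour
    default:
      return a + (b - a) * frac;
  }
}
```

The invariant `lower <= linear <= higher` from the test list follows directly, since `frac` lies in [0, 1) and `a <= b`.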
@copilot resolve merge conflicts |
…lidated features + unique branch additions Agent-Logs-Url: https://github.com/githubnext/tsessebe/sessions/95f77242-1e2f-4067-b5e3-4cea9a9ba1f8 Co-authored-by: mrjf <180956+mrjf@users.noreply.github.com>
Merge conflicts resolved in commit origin/main had advanced by 331 commits since the previous merge. This round's conflicts:
Result: 4593/4669 tests pass. The 76 failures are pre-existing API incompatibilities between this branch's diverged implementations. |
@copilot fix conflicts |
…s and playground index Co-authored-by: mrjf <180956+mrjf@users.noreply.github.com>
Conflicts fixed in commit The 3 remaining content conflicts were all the same pattern (our branch-unique exports vs. main's empty end-of-file):
- Add TimedeltaLike interface to Scalar type for stats Timedelta support
- Export stats Timedelta (public ctor, totalMs) from src/index.ts instead of core
- Fix diffSeries/shiftSeries exports to use new options-based API from diff_shift.ts
- Fix sample.ts: use null instead of undefined for name fields; use ** operator
- Fix na_ops.ts: use null instead of undefined for name fields
- Fix read_excel.ts: handle TimedeltaLike in toLabel switch
- Fix explode.ts formatting (biome auto-fix)
- Fix SampleDataFrameOptions.axis to accept string forms 'index'/'columns'
- Fix tests/stats/interval.test.ts: import Interval/IntervalIndex from stats module
- Fix tests/stats/to_timedelta.test.ts: use optional chaining for nullable results
- Fix tests/stats/shift_diff.test.ts: update to new options-based API, null expectations
- Fix tests/reshape/explode.test.ts: use ignoreIndex instead of ignore_index
- Fix tests/core/sample.test.ts: correct index construction and type annotations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Commit pushed:
For W/W-MON etc. frequencies, genFromStart and genBetween now call
snapToAnchor() to advance the cursor to the first occurrence of the
anchor weekday on or after the provided start date, matching pandas
behaviour where date_range('2024-01-01', periods=4, freq='W') returns
four Sundays rather than a Monday followed by three Sundays.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
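The snapping behaviour described in this commit message can be sketched as follows. `snapToWeekday` and `weeklyRange` are hypothetical stand-ins for the `snapToAnchor()` / `genFromStart` logic, whose real signatures are not shown in this thread; the sketch works in UTC and treats weekday 0 as Sunday, as `Date.prototype.getUTCDay` does.

```typescript
// Hypothetical sketch: advance to the first anchor weekday on or after `start`.
function snapToWeekday(start: Date, anchorDay: number): Date {
  const offset = (anchorDay - start.getUTCDay() + 7) % 7; // 0 if already on the anchor
  const snapped = new Date(start.getTime());
  snapped.setUTCDate(snapped.getUTCDate() + offset);
  return snapped;
}

/** Generates `periods` weekly dates, all falling on the anchor weekday. */
function weeklyRange(start: Date, periods: number, anchorDay = 0): Date[] {
  const first = snapToWeekday(start, anchorDay);
  return Array.from({ length: periods }, (_, i) => {
    const d = new Date(first.getTime());
    d.setUTCDate(d.getUTCDate() + 7 * i);
    return d;
  });
}

// 2024-01-01 is a Monday, so the range starts on the following Sunday.
const sundays = weeklyRange(new Date(Date.UTC(2024, 0, 1)), 4).map((d) =>
  d.toISOString().slice(0, 10),
);
// sundays: ["2024-01-07", "2024-01-14", "2024-01-21", "2024-01-28"]
```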
Commit pushed:
🤖 Evergreen fix applied Issue: The "Test & Lint" CI check was failing due to a bug in Root cause: Fix: Added a TypeScript type-check (
@copilot fix test failures |
…l, explode, and sample Agent-Logs-Url: https://github.com/githubnext/tsessebe/sessions/29d9c670-7033-40c6-baa5-3f651bf0be32 Co-authored-by: mrjf <180956+mrjf@users.noreply.github.com>
@copilot fix ci |
…obal regex state Agent-Logs-Url: https://github.com/githubnext/tsessebe/sessions/29d9c670-7033-40c6-baa5-3f651bf0be32 Co-authored-by: mrjf <180956+mrjf@users.noreply.github.com>
Fixed all 12 failing tests in commit
Result: 4669/4669 tests pass (was 4657/4669). |
@copilot fix failing tests |
Agent-Logs-Url: https://github.com/githubnext/tsessebe/sessions/1986d026-3aa4-43dd-aff2-d6fe777651bc Co-authored-by: mrjf <180956+mrjf@users.noreply.github.com>
Fixed in commit
All test failures and the CI lint error were resolved in commits |
🤖 This PR is maintained by Autoloop. Each accepted iteration adds a commit to this branch.
Program Goal
Build tsb, a complete TypeScript port of pandas, one feature at a time. See Steering Issue #107 for coordination.
Current Best Metric
51 pandas_features_ported
Latest Iteration (216)
Added jsonNormalize() - pandas.json_normalize() port:
Metric: 51 (+1 from 50)