FEAT: streaming support in fetchone for nvarcharmax data type by gargsaumya · Pull Request #220 · microsoft/mssql-python

gargsaumya · 2025-09-03T15:50:52Z

Work Item / Issue Reference

AB#38110
AB#34162

GitHub Issue: #<ISSUE_NUMBER>

Summary

This pull request improves NVARCHAR data handling in the SQL Server Python bindings and adds comprehensive tests for NVARCHAR(MAX) scenarios. The main changes include switching to streaming for large NVARCHAR values, optimizing direct fetch for smaller values, and adding tests for edge cases and boundaries to ensure correctness.

NVARCHAR data handling improvements:

Updated the logic in ddbc_bindings.cpp to use streaming for large NVARCHAR/NCHAR columns (over 4000 characters or unknown size) and direct fetch for smaller values, optimizing performance and reliability.
Refactored data conversion for NVARCHAR fetches, using std::wstring for conversion and simplifying platform-specific handling for both macOS/Linux and Windows.
Improved handling of empty strings and NULLs for NVARCHAR columns, ensuring correct Python types are returned and logging is more descriptive.
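
To make the stream-versus-direct split concrete, here is a minimal Python sketch of the idea, not the C++ in ddbc_bindings.cpp. The names `should_stream` and `assemble_utf16_chunks` are hypothetical. The 4000-character cutoff comes from the summary above; treating a size of 0 or None as "unknown", and modelling the streamed data as UTF-16LE chunks that each end in a 2-byte terminator, are assumptions made for illustration.

```python
from typing import Iterable, Optional

# Cutoff named in the summary: columns over 4000 characters are streamed.
DIRECT_FETCH_MAX_CHARS = 4000


def should_stream(column_size: Optional[int]) -> bool:
    """Return True when an NVARCHAR/NCHAR column should be streamed.

    Assumption: None or 0 stands in for "unknown size".
    """
    if not column_size:
        return True
    return column_size > DIRECT_FETCH_MAX_CHARS


def assemble_utf16_chunks(chunks: Iterable[bytes]) -> str:
    """Join streamed UTF-16LE chunks into one Python str.

    Assumption: every chunk carries a trailing 2-byte terminator, which is
    dropped before joining. Decoding happens once at the end so that a
    surrogate pair split across two chunks is still decoded correctly.
    """
    payload = bytearray()
    for chunk in chunks:
        if len(chunk) < 2 or chunk[-2:] != b"\x00\x00":
            raise ValueError("chunk is missing its 2-byte terminator")
        payload += chunk[:-2]
    return payload.decode("utf-16-le")
```

Decoding each chunk separately would fail whenever a chunk boundary falls in the middle of a surrogate pair, which is why the sketch accumulates raw bytes first.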

Testing enhancements:

Added new tests in test_004_cursor.py for NVARCHAR(MAX) covering short strings, boundary conditions (4000 chars), streaming (4100+ chars), large values (100,000 chars), empty strings, NULLs, and transaction rollback scenarios to verify correct behavior across all edge cases.
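
The cases listed above can be sketched as a small table of payloads plus a checker. This is an illustration only and not the code in test_004_cursor.py: `nvarchar_max_cases` and `check_round_trip` are hypothetical names, and the `round_trip` callable stands in for "insert the value into an NVARCHAR(MAX) column and read it back with fetchone".

```python
from typing import Callable, Dict, List, Optional


def nvarchar_max_cases() -> Dict[str, Optional[str]]:
    """Payloads matching the sizes named in the PR summary."""
    return {
        "short": "hello",
        "boundary_4000": "a" * 4000,    # largest value on the direct-fetch path
        "streaming_4100": "b" * 4100,   # just past the cutoff, forces streaming
        "large_100000": "c" * 100_000,  # many chunks
        "empty": "",                    # must come back as "", not None
        "null": None,                   # must come back as None, not ""
    }


def check_round_trip(
    round_trip: Callable[[Optional[str]], Optional[str]]
) -> List[str]:
    """Return the names of cases whose fetched value differs from the input."""
    failures = []
    for name, value in nvarchar_max_cases().items():
        if round_trip(value) != value:
            failures.append(name)
    return failures
```

Keeping the empty string and NULL as separate cases matters because a fetch path that collapses one into the other would otherwise go unnoticed.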

VARCHAR/CHAR fetch improvements:

Improved direct fetch logic for small VARCHAR/CHAR columns and fixed string conversion to use the actual data length, so the result no longer depends on null termination or on the size of the buffer.
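
A short Python sketch of why converting by actual data length is safer than scanning for a terminator. Both function names are hypothetical, and UTF-8 is an assumed encoding chosen only for the example; the encoding the binding really uses is not stated in this PR.

```python
def decode_by_length(buffer: bytes, data_len: int, encoding: str = "utf-8") -> str:
    """Decode exactly data_len bytes and ignore whatever follows in the buffer."""
    if data_len < 0 or data_len > len(buffer):
        raise ValueError("data_len is outside the buffer")
    return buffer[:data_len].decode(encoding)


def decode_until_nul(buffer: bytes, encoding: str = "utf-8") -> str:
    """Fragile alternative: stop at the first NUL byte, or read the whole buffer."""
    end = buffer.find(b"\x00")
    return (buffer if end == -1 else buffer[:end]).decode(encoding)


# A reused buffer with stale bytes after a 3-byte value and no terminator.
stale = b"abcXYZ"
print(decode_by_length(stale, 3))   # abc
print(decode_until_nul(stale))      # abcXYZ
```

The terminator-based version goes wrong in two ways: it reads stale bytes when no terminator is present, and it truncates a value that legitimately contains a NUL byte.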

mssql_python/pybind/ddbc_bindings.cpp

sumitmsft

Left a few comments

bewithgaurav

will wait for latest main branch merge, will refactor this PR to a great extent

.coveragerc

.github/workflows/pr-format-check.yml

mssql_python/pybind/ddbc_bindings.cpp

gargsaumya · 2025-09-11T06:56:50Z

> will wait for latest main branch merge, will refactor this PR to a great extent

The conflicts are now resolved. You can go ahead and re-review.

### Work Item / Issue Reference

> [AB#34162](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/34162)
> [AB#38110](https://sqlclientdrivers.visualstudio.com/c6d89619-62de-46a0-8b46-70b92a84d85e/_workitems/edit/38110)

> GitHub Issue: #<ISSUE_NUMBER>

-------------------------------------------------------------------

### Summary

This pull request improves the handling of large object (LOB) columns when fetching data from SQL Server in the `mssql_python/pybind/ddbc_bindings.cpp` file. The main change is to detect LOB columns (such as large strings or binary data) and switch to a per-row fetching strategy using `SQLGetData_wrap` for those columns, ensuring correct streaming and memory usage. For non-LOB columns, the batch fetching logic remains unchanged, but now includes logic to handle LOBs appropriately during batch fetches.

**LOB detection and fetch strategy:**

* Added logic in both `FetchMany_wrap` and `FetchAll_wrap` to detect LOB columns by checking column data types and sizes, and to fall back to per-row fetch using `SQLGetData_wrap` when LOBs are present. This avoids buffer overflows and streams LOBs correctly.
* Updated calls to `FetchBatchData` in both wrappers to pass the list of detected LOB columns, so batch fetches can handle them appropriately.

**Batch fetch improvements:**

* Modified the signature of `FetchBatchData` to accept the LOB columns list and updated its logic to handle LOB columns differently: instead of throwing exceptions when buffer sizes are insufficient, it now streams LOB data using `FetchLobColumnData`.
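
The detection step described above could look roughly like the following Python sketch. It is an assumption-laden illustration, not the logic in `FetchMany_wrap` or `FetchAll_wrap`: the `Column` tuple, both function names, and the size thresholds are hypothetical (4000 for wide types echoes the NVARCHAR cutoff mentioned earlier in this PR, 8000 for the others is a guess), and a size of 0 is assumed to mean "unknown or MAX". The numeric type codes are the standard ODBC header values.

```python
from typing import List, NamedTuple, Sequence

# ODBC SQL type codes from the standard ODBC headers.
SQL_LONGVARCHAR = -1
SQL_VARBINARY = -3
SQL_LONGVARBINARY = -4
SQL_WVARCHAR = -9
SQL_WLONGVARCHAR = -10
SQL_VARCHAR = 12

WIDE_TYPES = {SQL_WVARCHAR, SQL_WLONGVARCHAR}
NARROW_OR_BINARY_TYPES = {SQL_VARCHAR, SQL_LONGVARCHAR, SQL_VARBINARY, SQL_LONGVARBINARY}

# Hypothetical thresholds, see the lead-in.
WIDE_THRESHOLD = 4000
NARROW_THRESHOLD = 8000


class Column(NamedTuple):
    name: str
    sql_type: int
    size: int  # assumption: 0 means unknown or MAX


def detect_lob_columns(columns: Sequence[Column]) -> List[int]:
    """Return the 1-based indexes of columns that should be streamed."""
    lob_indexes = []
    for index, col in enumerate(columns, start=1):
        if col.sql_type in WIDE_TYPES:
            limit = WIDE_THRESHOLD
        elif col.sql_type in NARROW_OR_BINARY_TYPES:
            limit = NARROW_THRESHOLD
        else:
            continue  # fixed-size types are never treated as LOBs here
        if col.size == 0 or col.size > limit:
            lob_indexes.append(index)
    return lob_indexes


def choose_fetch_strategy(columns: Sequence[Column]) -> str:
    """Fall back to per-row fetching as soon as any LOB column is present."""
    return "per_row" if detect_lob_columns(columns) else "batch"
```

Indexes are 1-based to match ODBC column numbering, so the returned list can be handed to whatever performs the per-column streaming.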

github-actions bot added the pr-size: medium (Moderate update size) label Sep 3, 2025

gargsaumya changed the title ~~FEAT: streaming support in fetch flow for nvarcharmax data type~~ FEAT: streaming support in fetchone for nvarcharmax data type Sep 3, 2025

github-actions bot added pr-size: medium (Moderate update size) and removed pr-size: medium (Moderate update size) labels Sep 3, 2025

sumitmsft reviewed Sep 9, 2025

mssql_python/pybind/ddbc_bindings.cpp (outdated, resolved)

sumitmsft reviewed Sep 9, 2025

mssql_python/pybind/ddbc_bindings.cpp (resolved)

sumitmsft reviewed Sep 9, 2025

mssql_python/pybind/ddbc_bindings.cpp (outdated, resolved)

sumitmsft reviewed Sep 9, 2025

mssql_python/pybind/ddbc_bindings.cpp (resolved)

sumitmsft requested changes Sep 9, 2025

gargsaumya force-pushed the saumya/streaming-fetchone branch from f6b7389 to e21b47e on September 10, 2025 07:25

bewithgaurav reviewed Sep 11, 2025

.coveragerc (resolved)

.github/workflows/pr-format-check.yml (resolved)

mssql_python/pybind/ddbc_bindings.cpp (outdated, resolved)

github-actions bot added pr-size: medium (Moderate update size) and removed pr-size: medium (Moderate update size) labels Sep 11, 2025

gargsaumya force-pushed the saumya/streaming-fetchone branch 2 times, most recently from 598a6be to 7f67326 on September 11, 2025 05:53

gargsaumya added 2 commits September 11, 2025 11:39

fd99431: adding streaming support in fetch for varcharmax type

d0ccd44: added streaming support for fetch for nvarcharmax type

gargsaumya force-pushed the saumya/fetchone-nvarcharmax branch from 63834f3 to d0ccd44 on September 11, 2025 06:20

github-actions bot added pr-size: medium (Moderate update size) and removed pr-size: medium (Moderate update size) labels Sep 11, 2025

b6e7e4d: resolving merge conflicts

github-actions bot added pr-size: medium (Moderate update size) and removed pr-size: medium (Moderate update size) labels Sep 11, 2025

9b9a226: adding test comments

github-actions bot added pr-size: medium (Moderate update size) and removed pr-size: medium (Moderate update size) labels Sep 11, 2025

76aa2a9: addressing review comments

github-actions bot removed the pr-size: medium (Moderate update size) label Sep 11, 2025

github-actions bot added the pr-size: medium (Moderate update size) label Sep 11, 2025

gargsaumya requested review from bewithgaurav and sumitmsft September 11, 2025 06:57

sumitmsft approved these changes Sep 12, 2025

bewithgaurav approved these changes Sep 15, 2025

github-actions bot added pr-size: large (Substantial code update) and removed pr-size: medium (Moderate update size) labels Sep 15, 2025

gargsaumya merged commit fba171c into saumya/streaming-fetchone Sep 15, 2025
3 checks passed

gargsaumya merged 6 commits into saumya/streaming-fetchone from saumya/fetchone-nvarcharmax
