[Repo Assist] Add StrictBooleans static parameter to CsvProvider #1621

Merged
dsyme merged 2 commits into main from
repo-assist/feature-strict-booleans-1417-80161456360f4ff8
Feb 22, 2026

@github-actions
Contributor

🤖 This is a draft PR from Repo Assist, an automated AI assistant.

Closes #1417

Summary

Adds a new StrictBooleans static parameter to CsvProvider (default false, fully backward-compatible).

When StrictBooleans=true:

  • Columns with only 0/1 values are inferred as int instead of bool
  • Columns with only yes/no values are inferred as string instead of bool
  • Only columns with exclusively true/false values become bool
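The three rules above can be modelled with a small self-contained sketch. This is not the code in CsvInference.fs: the `ColumnType` union and the `allIn`, `isInt` and `inferColumn` functions are hypothetical names, and the model ignores culture, missing values, and columns that mix spellings (for example "true" alongside "0").

```fsharp
// Simplified model of the StrictBooleans rules; NOT FSharp.Data's inference code.
// ColumnType, allIn, isInt and inferColumn are hypothetical names for illustration.
type ColumnType =
    | BoolColumn
    | IntColumn
    | StringColumn

// True when the column is non-empty and every cell is one of the allowed spellings.
let allIn (allowed: string list) (values: string list) =
    not (List.isEmpty values)
    && values |> List.forall (fun v -> List.contains (v.Trim().ToLowerInvariant()) allowed)

// True when the cell parses as a 32-bit integer.
let isInt (v: string) =
    match System.Int32.TryParse v with
    | true, _ -> true
    | _ -> false

let inferColumn (strictBooleans: bool) (values: string list) : ColumnType =
    if List.isEmpty values then StringColumn
    elif allIn [ "true"; "false" ] values then BoolColumn
    // 0/1 columns: bool by default, int in strict mode.
    elif allIn [ "0"; "1" ] values then (if strictBooleans then IntColumn else BoolColumn)
    // yes/no columns: bool by default, string in strict mode.
    elif allIn [ "yes"; "no" ] values then (if strictBooleans then StringColumn else BoolColumn)
    elif List.forall isInt values then IntColumn
    else StringColumn

printfn "%A" (inferColumn false [ "0"; "1"; "1" ])
printfn "%A" (inferColumn true [ "0"; "1"; "1" ])
printfn "%A" (inferColumn true [ "yes"; "no" ])
printfn "%A" (inferColumn true [ "true"; "false" ])
```

In this model only the 0/1 and yes/no columns change type when the flag is switched on; a true/false column is a bool either way.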

This was explicitly endorsed by @dsyme in issue #1417.

Root Cause

FSharp.Data's boolean inference is intentionally broad: "0", "1", "yes", "no", "true", "false" all parse as booleans via TextConversions.AsBoolean. For CSVs where 0/1 represent numeric IDs or flags, this produces surprising bool columns.

Implementation

Changes are minimal and surgical:

  • CsvInference.fs: Added strictBooleans: bool param to inferCellType, inferType, and inferColumnTypes. When true, post-processes inferred types: Bit0/Bit1 → int; non-true/false boolean values → string. The CsvFile.InferColumnTypes method gets an F# optional ?strictBooleans parameter (backward-compatible).
  • HtmlInference.fs: Updated call to pass false for the new required positional param.
  • CsvProvider.fs: Added StrictBooleans as static parameter at index 17 (after PreferDateOnly).
  • tests/FSharp.Data.Tests/CsvProvider.fs: Added 3 new test cases verifying the feature.
  • tests/FSharp.Data.DesignTime.Tests/TypeProviderInstantiation.fs and InferenceTests.fs: Updated test infrastructure to include StrictBooleans=false (no behavior change for existing tests).
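The backward-compatibility note for `CsvFile.InferColumnTypes` relies on F# optional method parameters: callers that omit the argument still compile and receive the default. A minimal sketch of that pattern follows, using a hypothetical `Inference` type rather than the real `CsvFile` signature, which is not shown in this PR description.

```fsharp
// Hypothetical stand-in for a method that gains an optional parameter.
// The name Inference and the (rows, flag) return value are illustrative only.
type Inference =
    static member InferColumnTypes(rows: string list, ?strictBooleans: bool) =
        // Callers that omit the argument fall back to the old behaviour (false).
        let strictBooleans = defaultArg strictBooleans false
        // Return the row count and the effective flag so the default is visible.
        List.length rows, strictBooleans

// An existing call site keeps compiling unchanged.
let oldCall = Inference.InferColumnTypes([ "0"; "1" ])

// A new call site opts in by name.
let newCall = Inference.InferColumnTypes([ "0"; "1" ], strictBooleans = true)

printfn "%A %A" oldCall newCall
```

By contrast, a required positional parameter on a let-bound function, as with the HtmlInference.fs call mentioned above, has to be supplied explicitly at every call site.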

Trade-offs

  • Post-processing in inferCellType keeps the change isolated to CsvProvider without touching shared inference code.
  • StrictBooleans=false default preserves 100% backward compatibility.

Test Status

Linux build: All projects build with 0 errors
FSharp.Data.Tests: 251/251 tests pass (includes 3 new StrictBooleans tests)
FSharp.Data.DesignTime.Tests CSV: 20/20 CSV tests pass
⚠️ FSharp.Data.DesignTime.Tests XML: 18 pre-existing failures (network tests blocked by sandbox firewall + known IncludeFromWeb.xsd bug tracked in PR #1619) — not caused by this PR

Generated by Repo Assist

To install this workflow, run gh aw add githubnext/agentics/workflows/repo-assist.md@ee50a3b7d1d3eb4a8c409ac9409fd61c9a66b0f5. View source at https://github.com/githubnext/agentics/tree/ee50a3b7d1d3eb4a8c409ac9409fd61c9a66b0f5/workflows/repo-assist.md.

Warning

⚠️ Firewall blocked 2 domains

The following domains were blocked by the firewall during workflow execution:

  • schemas.microsoft.com
  • tomasp.net

When StrictBooleans=true:
- Columns with only 0/1 values are inferred as int (not bool)
- Columns with only yes/no values are inferred as string (not bool)
- Only columns with exclusively true/false values become bool

This implements the feature requested in issue #1417, endorsed by @dsyme.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor Author

✅ Pull request created: #1621

@dsyme dsyme marked this pull request as ready for review February 22, 2026 19:31
preferOptionals
missingValues
inferenceMode
strictBooleans
Contributor

@copilot You'll need to fix up the code in TypeProviderInstantiation.fs too for Csv, since you've added an argument

        match args.[0] with
        | "Csv" ->
            Csv { Sample = args.[1]
                  Separators = args.[2]
                  InferRows = Int32.MaxValue
                  Schema = args.[3].Replace(';', ',')
                  HasHeaders = args.[4] |> bool.Parse
                  IgnoreErrors = false
                  SkipRows = 0
                  AssumeMissingValues = args.[5] |> bool.Parse
                  PreferOptionals = args.[6] |> bool.Parse
                  Quote = '"'
                  MissingValues = args.[7]
                  Culture = args.[8]
                  Encoding = args.[9]
                  CacheRows = false
                  ResolutionFolder = ""
                  EmbeddedResource = ""
                  PreferDateOnly = false }

Check whole file carefully

Contributor

Copilot AI commented Feb 22, 2026

@dsyme I've opened a new pull request, #1622, to work on those changes. Once the pull request is ready, I'll request review from you.

@github-actions
Contributor Author

🤖 This is an automated response from Repo Assist.

Thanks for the review, @dsyme! Just to note — TypeProviderInstantiation.fs was already updated in this PR. The diff includes:

  1. Added StrictBooleans : bool to the CsvProviderArgs record type (line 30)
  2. Added box x.StrictBooleans to the args array in toProviderArgs (line 103)
  3. Added StrictBooleans = false to the Csv { ... } instantiation in the parseFromFile function (line 238)

This follows the same pattern as existing fields like PreferDateOnly, CacheRows, etc. which are also hardcoded to their defaults in that block (since the test CSV fixture files don't include these newer parameters).
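The three touch points listed above can be sketched with a trimmed-down record. This is an illustration only: the real `CsvProviderArgs` has many more fields, the real `toProviderArgs` may be shaped differently, and the `parsed` value and the sample file name here are hypothetical.

```fsharp
// Hypothetical, trimmed-down version of the pattern; not the real test infrastructure.
type CsvProviderArgs =
    { Sample: string
      PreferDateOnly: bool
      StrictBooleans: bool } // 1. new field on the record

// 2. The boxed array must list values in the same order as the static parameters.
let toProviderArgs (x: CsvProviderArgs) : obj[] =
    [| box x.Sample
       box x.PreferDateOnly
       box x.StrictBooleans |]

// 3. The instantiation hardcodes defaults for fields the fixture files do not carry.
let parsed =
    { Sample = "data.csv" // hypothetical file name
      PreferDateOnly = false
      StrictBooleans = false }

printfn "%d" (toProviderArgs parsed).Length
```

Because F# record construction requires every field, leaving `StrictBooleans` out of the instantiation would be a compile error rather than a silent default.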

The Copilot SWE agent created PR #1622 to address your feedback — but I believe the fix is already present. Happy to make further adjustments if there's something I've missed!

@dsyme dsyme merged commit eb28d36 into main Feb 22, 2026
2 checks passed
@dsyme dsyme deleted the repo-assist/feature-strict-booleans-1417-80161456360f4ff8 branch February 22, 2026 22:15
github-actions bot added a commit that referenced this pull request Feb 23, 2026
Add entries for:
- #1613: CSS pseudo-class NotSupportedException fix (#1383)
- #1617: ConvertDateTimeOffset xs:dateTime fallback fix (#1437)
- #1618: Microsoft.Build security bump
- #1619: XmlProvider EmbeddedResource GetSchema fix (#1310)
- #1621: StrictBooleans parameter for CsvProvider
- #1625: CsvProvider.InferRows multiline quoted field fix (#1439)
- #1626: XSD group reference cycle guard (#1419)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Csv: possibility to treat 0/1 column as number