Contributor

@koic koic commented Jun 3, 2020

This PR adds `invalid: :replace` for `CSV.open`. It is similar to #129.

field = String(field) # Stringify fields
# represent empty fields as empty quoted fields
if (@quote_empty and field.empty?) or @quotable_pattern.match?(field)
if (@quote_empty and field.empty?) or @quotable_pattern.match?(field.scrub)
Member


I don't think that this is a good approach.
If @quotable_pattern contains \uFFFD, it matches every invalid character after scrubbing.

We need to handle this case carefully, with something like:

Suggested change
if (@quote_empty and field.empty?) or @quotable_pattern.match?(field.scrub)
if (@quote_empty and field.empty?) or (field.valid_encoding? and @quotable_pattern.match?(field))

Could you create a separate pull request for this case? Then we can come back to this pull request.
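To illustrate the concern, here is a minimal sketch in plain Ruby. The pattern below is hypothetical (it is not the actual `@quotable_pattern` from the csv library); it simply assumes that U+FFFD appears among the quotable characters, and the variable names are made up for this example.

```ruby
# Hypothetical pattern for illustration: it lists U+FFFD among the quotable characters.
quotable_pattern = /[,"\uFFFD]/

# Shift_JIS bytes held in a UTF-8 string, so the encoding is invalid.
invalid_field = "\x82\xA0"

# Scrubbing first replaces the invalid bytes with U+FFFD, which the pattern then matches.
scrubbed_match = quotable_pattern.match?(invalid_field.scrub)

# Checking valid_encoding? first short-circuits, so the pattern never sees the invalid bytes.
guarded_match = invalid_field.valid_encoding? && quotable_pattern.match?(invalid_field)

puts "scrub then match: #{scrubbed_match}"
puts "guard then match: #{guarded_match}"
```

With the scrub-based check, the invalid field would be treated as quotable only because of the replacement character; the guarded check leaves that decision to the caller.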

Contributor Author

@koic koic Jun 4, 2020


Thanks for your suggestion. I think that is definitely a safe and suitable approach. I've opened #131.

Contributor Author


I rebased this PR onto the latest master branch.

@koic koic force-pushed the add_invalid_replace_option_for_csv_open branch from 76bdf53 to 59b4305 Compare June 4, 2020 00:04
This PR adds `invalid: :replace` for `CSV.open`. It is a PR similar to ruby#129.

And this PR uses `String#scrub` to prevent the following `ArgumentError`.

```ruby
/[,"]/.match?("\x82\xA0")       #=> ArgumentError (invalid byte sequence in UTF-8)
/[,"]/.match?("\x82\xA0".scrub) #=> false
```
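The behavior in the snippet above can be checked with a small standalone script. This is only a sketch using core `String` methods; the variable names are illustrative and nothing here depends on the csv library.

```ruby
# Shift_JIS bytes held in a UTF-8 string, so the encoding is invalid.
bytes = "\x82\xA0"

# Matching a regexp against the invalid string raises ArgumentError.
raised = begin
  /[,"]/.match?(bytes)
  false
rescue ArgumentError
  true
end

# String#scrub returns a copy with the invalid bytes replaced.
# For UTF-8 strings the default replacement is U+FFFD.
scrubbed = bytes.scrub

# A custom replacement string can be passed instead.
scrubbed_custom = bytes.scrub("?")

puts "raised ArgumentError: #{raised}"
puts "scrubbed is valid:    #{scrubbed.valid_encoding?}"
puts "custom replacement:   #{scrubbed_custom.inspect}"
```

The scrubbed copy is valid UTF-8, so it can be matched safely, while the original string is left unchanged.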
@koic koic force-pushed the add_invalid_replace_option_for_csv_open branch from 59b4305 to d1f9bc0 Compare June 4, 2020 03:20
@kou kou merged commit 5bf6873 into ruby:master Jun 4, 2020
Member

kou commented Jun 4, 2020

Thanks!

@koic koic deleted the add_invalid_replace_option_for_csv_open branch June 4, 2020 03:44