Skip to content

Throw parse exceptions on schema get errors for SchemaRegistryBasedAvroBytesDecoder#12080

Merged
jon-wei merged 2 commits intoapache:masterfrom
jon-wei:avro_schema_error_handle
Jan 13, 2022
Merged

Throw parse exceptions on schema get errors for SchemaRegistryBasedAvroBytesDecoder#12080
jon-wei merged 2 commits intoapache:masterfrom
jon-wei:avro_schema_error_handle

Conversation

@jon-wei
Copy link
Copy Markdown
Contributor

@jon-wei jon-wei commented Dec 18, 2021

This PR adjusts SchemaRegistryBasedAvroBytesDecoder to throw ParseException instead of RE when failure to retrieve a schema occurs.

Throwing an RE can be problematic when this decoder is used in streaming ingestion: if a corrupt record enters the stream with an invalid schema ID, the user cannot easily recover from this as the streaming ingestion tasks will continue to fail and attempt to read the corrupt record.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@jon-wei jon-wei changed the title Add option to throw parse exceptions on schema get errors for SchemaRegistryBasedAvroBytesDecoder Throw parse exceptions on schema get errors for SchemaRegistryBasedAvroBytesDecoder Dec 21, 2021
Copy link
Copy Markdown
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this change makes sense 👍

my intentions were good when i made this change originally - a bad configuration non-intuitively seemed to run tasks that just couldn't process the data and threw everything out, so if the configuration would never work it made sense to fail fast.

However, I didn't consider the case of a bad record, where throwing a RE means there is no way to process such streams without skipping the offset past such a record, while a parse exception handles this case nicely.

@jon-wei jon-wei merged commit 74c876e into apache:master Jan 13, 2022
@abhishekagarwal87 abhishekagarwal87 added this to the 0.23.0 milestone May 11, 2022
sachinsagare pushed a commit to sachinsagare/druid that referenced this pull request Nov 3, 2022
…roBytesDecoder (apache#12080)

* Add option to throw parse exceptions on schema get errors for SchemaRegistryBasedAvroBytesDecoder

* Remove option

(cherry picked from commit 74c876e)
gianm added a commit to gianm/druid that referenced this pull request Nov 21, 2022
…stry.

The change in apache#12080 lost the original exception context. This patch
adds it back.
gianm added a commit to gianm/druid that referenced this pull request Nov 21, 2022
…stry.

The change in apache#12080 lost the original exception context. This patch
adds it back.
gianm added a commit that referenced this pull request Nov 22, 2022
…stry. (#13403)

* Attach IO error to parse error when we can't contact Avro schema registry.

The change in #12080 lost the original exception context. This patch
adds it back.

* Add hamcrest-core.

* Fix format string.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants