@singhpk234
Contributor

@singhpk234 singhpk234 commented Mar 27, 2022

We faced a concurrent modification issue in ORC when the ORC write builder tried to modify the existing Hadoop conf obtained from HadoopFileIO. Ref: Issue #4383, PR #4384.

Based on the discussion in #3810 (comment), it looks like the same code snippet is present in Parquet#write as well.

As per my understanding, we should make the same change in Parquet that we made in ORC. The issue we had in ORC (assuming our PR fixed it) could also happen in Parquet#write under the same circumstances. I have added a test for this case.
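
For readers unfamiliar with the failure mode, here is a minimal sketch of why mutating a shared conf in place is risky and why working on a copy avoids it. It is not the code from this PR: a plain `HashMap` stands in for the Hadoop conf, the class name `ConfCopySketch`, its methods, and the key `parquet.example.key` are all hypothetical, and the interleaving is simulated on a single thread so the result is deterministic. With a real Hadoop `Configuration`, the analogous step would be its copy constructor, `new Configuration(conf)`.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class ConfCopySketch {
  // Hypothetical stand-in for a shared Hadoop conf (a plain HashMap, not the real Configuration class).
  static Map<String, String> sharedConf() {
    Map<String, String> conf = new HashMap<>();
    conf.put("fs.defaultFS", "file:///");
    conf.put("io.file.buffer.size", "4096");
    return conf;
  }

  // Iterates over `observed` (as another user of the conf might) and, part-way through,
  // lets the "write builder" set a property on `target`.
  // Returns true if the iteration fails with ConcurrentModificationException.
  static boolean iterationFails(Map<String, String> observed, Map<String, String> target) {
    try {
      boolean first = true;
      for (Map.Entry<String, String> entry : observed.entrySet()) {
        if (first) {
          // Hypothetical writer property; adding a new key is a structural modification.
          target.put("parquet.example.key", "value");
          first = false;
        }
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  // The builder writes into the same map that is being iterated.
  static boolean mutateInPlace() {
    Map<String, String> shared = sharedConf();
    return iterationFails(shared, shared);
  }

  // The builder writes into its own copy; the shared map is left untouched.
  static boolean mutateCopy() {
    Map<String, String> shared = sharedConf();
    Map<String, String> copy = new HashMap<>(shared);
    return iterationFails(shared, copy);
  }

  public static void main(String[] args) {
    System.out.println("in-place mutation breaks iteration: " + mutateInPlace());
    System.out.println("copy mutation breaks iteration: " + mutateCopy());
  }
}
```

In this sketch the in-place variant fails and the copy variant does not. The copy costs one extra map per writer, which seems a reasonable trade for not touching state that other readers and writers share.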


cc @openinx, @hililiwei, @rdblue

@hililiwei
Contributor

Thanks for the fix.

@singhpk234 singhpk234 changed the title from "Parquet : Avoid modifying existing conf of HadoopOutputFile rather create new one" to "Parquet: Avoid modifying existing conf of HadoopOutputFile rather create new one" Mar 28, 2022
@singhpk234 singhpk234 requested a review from jackye1995 March 30, 2022 04:17
@singhpk234
Contributor Author

@openinx, can you please take a pass when you have some time?

@openinx openinx merged commit 023934f into apache:master Apr 2, 2022
@openinx
Member

openinx commented Apr 2, 2022

Got this merged. Thanks @singhpk234 for the contribution!

@singhpk234 singhpk234 deleted the fix/concurrent-modification-parquet branch April 2, 2022 03:19
kbendick pushed a commit to kbendick/iceberg that referenced this pull request Apr 4, 2022
