Shared ArrayList leads to occasional VerificationException #112
.settings/
.classpath
.gradle/
.DS_Store
Not related, but a nice to have for anyone contributing who uses IntelliJ.
Hi @janvanmansum, thanks for submitting this. I have an error when I try to run your example project. Also, what is the performance of this code versus using
I did a small test and using
Hi @johnscancella, sorry I forgot to mention that The synchronized ArrayList will also work. However, this will probably lead to some performance loss. There was a comment in the code warning about the performance hit. I don't know how bad it will get in practice, but I guess you are using multiple threads to increase performance, so I would probably still prefer to avoid writing to shared data. In my opinion, getting all the errors instead of just the first one would be a great benefit, especially if they are returned in a structured way, like a list. The library user can still choose to ignore the other errors anyway, so I don't really see a downside. On the other hand, if only the first error is reported while there are several, it is an extra burden on the user to have to discover them one by one.
I am ok with a slight performance loss if it prevents an error that you are seeing; that note was from another issue that wasn't causing an error. Basically it is smaller than doing two loops, which is what your pull request is doing. In this case it is the difference of O(n+c) vs O(n^2). Using a built-in JDK class over custom code is also preferred because it is less of a maintenance burden (i.e., we will never have the same resources that the makers of the JDK do). I released a new version with the synchronized update, 5.1.1. Please let me know if you find any issues. I will add an issue for improving the verifier by returning all the errors.
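For readers following along, here is a minimal sketch of what the synchronized-list approach looks like in general. It is not the code from the 5.1.1 release: the class `SynchronizedListSketch`, the method `collect`, and the dummy exceptions are all hypothetical and exist only to illustrate wrapping a shared `ArrayList` with `Collections.synchronizedList`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; names are not taken from bagit-java.
class SynchronizedListSketch {

    // Runs taskCount tasks on a pool; every task adds one exception to a shared list.
    static List<Exception> collect(int taskCount, int threads) throws InterruptedException {
        // The wrapper serializes each add() call, so concurrent writers are safe.
        List<Exception> exceptions = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            pool.execute(() -> exceptions.add(new IllegalStateException("task " + id)));
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("tasks did not finish in time");
        }
        return exceptions;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Exception> result = collect(10_000, 8);
        System.out.println("collected " + result.size() + " exceptions");
    }
}
```

The trade-off discussed in this thread is visible here: every `add` takes the list's lock, which costs a little under contention but keeps the change to a single line.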
Please ensure you have completed the following before submitting:
Note: you can complete both boxes by running and fixing warnings/errors with
`gradle clean check`

To reproduce the bug
`mvn install`
`mvn exec:java`
Expected output is at least a couple of times the word `FAIL` and then `DONE`.

To check this PR
`project.version = "5.0.0-${now}-SNAPSHOT"` to `project.version = "5.0.0-SNAPSHOT"` in `bagit.gradle`
`gradle build`
`gradle publishToMavenLocal`
`bagit-java-bug` project's `pom.xml` change the dependency on `bagit` to `5.0.0-SNAPSHOT` and run `mvn exec:java` again.
Expected output is no `FAIL`, just the word `DONE`.

Explanation
The problem is in the existing code in `BagVerifier`:

The `ArrayList` `exceptions` is shared between all the threads working on the verification. However, this class is not thread safe. The javadocs specifically warn against structural modifications of the ArrayList from multiple threads. In `CheckManifestHashesTask` this is exactly what is happening, as exceptions are added to the list by multiple threads.

This seems to lead to a `null` entry in the list occasionally, which leads to `e` being `null`, thus `e instanceof CorruptChecksumException` evaluating to `false` and then to the `VerificationException`.

The fix, instead of making the list synchronized, gives every `CheckManifestHashesTask` its own exceptions list and only combines them after the tasks are finished. (Btw, I realize now that this is not really necessary, as the subsequent code just throws on the first exception, so you might want to optimize this a bit, or - even better - return all the corrupt checksums. I'll leave that up to you.)
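The idea of per-task lists that are merged only after the tasks finish can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `PerTaskErrorsSketch`, `checkTask`, and `verifyAll` are hypothetical names, and the "corrupt" flag stands in for a real checksum comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; names are not taken from bagit-java.
class PerTaskErrorsSketch {

    // Each task owns a private list, so no list is ever written by two threads.
    static Callable<List<Exception>> checkTask(String file, boolean corrupt) {
        return () -> {
            List<Exception> local = new ArrayList<>();
            if (corrupt) {
                local.add(new IllegalStateException("checksum mismatch: " + file));
            }
            return local;
        };
    }

    // Submits one task per file and merges the per-task lists on the calling thread.
    static List<Exception> verifyAll(int fileCount, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<Exception>>> futures = new ArrayList<>();
            for (int i = 0; i < fileCount; i++) {
                // Assumption for the demo: every even-numbered file is "corrupt".
                futures.add(pool.submit(checkTask("file-" + i, i % 2 == 0)));
            }
            List<Exception> all = new ArrayList<>();
            for (Future<List<Exception>> future : futures) {
                all.addAll(future.get()); // get() waits for the task and makes its writes visible
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Exception> errors = verifyAll(1000, 8);
        System.out.println(errors.size() + " errors collected");
    }
}
```

Because the merged list contains every failure, a caller could report all of them at once instead of throwing on the first.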