Use regular expressions to parse image data text files. #1971
Open
erictzeng wants to merge 1 commit into BVLC:master
Member
@erictzeng this looks right -- thanks for fixing the brittle format -- but I think you need to update the Travis script to install Boost regex: https://github.com/BVLC/caffe/blob/master/scripts/travis/travis_install.sh.
Contributor
Any updates on this?
Fixes #1951.
This pull request consists of two changes:
1. Instead of the previous ifstream method of parsing image data files, this pull request uses regular expressions for more robust matching.
2. The parsing code was duplicated in tools/convert_imageset.cpp and src/caffe/layers/image_data_layer.cpp. This pull request pulls that common code out into a new function in src/caffe/util/io.cpp for ease of maintenance.
More details follow.
Each line of the input text file is matched against the following regular expression:
Feel free to play around with an interactive version so you can test it out and see what it matches. This regular expression handles a lot of cases that would've been difficult to handle using the previous naive approach. It captures whitespace within a filename, and enables quoting of filenames in case for some insane reason you have a space at the beginning of a file name.
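To make the difference concrete, here is a minimal sketch contrasting stream extraction with a regular expression. It is an illustration only: the pattern and the function names ParseWithStream and ParseWithRegex are hypothetical and are not the ones used in this pull request, and it uses std::regex from the standard library instead of boost::regex so that it compiles on its own. It assumes each line holds a filename followed by an integer label.

```cpp
#include <regex>
#include <sstream>
#include <string>

// Stream-based parsing (assumed shape of the naive approach): operator>>
// splits on whitespace, so a filename containing a space is cut short.
bool ParseWithStream(const std::string& line, std::string* filename,
                     int* label) {
  std::istringstream in(line);
  std::string name;
  int value = 0;
  if (!(in >> name >> value)) return false;
  *filename = name;
  *label = value;
  return true;
}

// Regex-based parsing with a hypothetical pattern:
//   group 1: a double-quoted filename (may start with whitespace), or
//   group 2: an unquoted filename (may contain inner whitespace),
//   group 3: the trailing integer label.
bool ParseWithRegex(const std::string& line, std::string* filename,
                    int* label) {
  static const std::regex kPattern(
      R"re(\s*(?:"([^"]+)"|(\S.*?))\s+(-?\d+)\s*)re");
  std::smatch match;
  if (!std::regex_match(line, match, kPattern)) return false;
  *filename = match[1].matched ? match[1].str() : match[2].str();
  // Note: std::stoi throws std::out_of_range if the label does not fit in int.
  *label = std::stoi(match[3].str());
  return true;
}
```

With this sketch, the line `my image.jpg 3` is rejected by ParseWithStream, because the stream tries to read `image.jpg` as the label, while ParseWithRegex returns the filename `my image.jpg` and the label 3. Quoting the filename keeps a leading space intact.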
Some concrete examples of really degenerate cases that will parse correctly:
One drawback is that this introduces boost_regex as an additional dependency. However, since we already require Boost, this seems like an acceptable tradeoff.
Implementation-wise, this pull request should be complete, though it's lacking tests, which I will get around to writing at some point in the near future.