Open
Description
The issue appears to be that MongoInputFormat doesn't work on large datasets (possibly because too many mappers are generated), whereas InfiniteMongoInputFormat does. However, InfiniteMongoInputFormat falls back to MongoInputFormat if there are more than 8 × 12,500 = 100,000 documents.
The short-term fix will be to force plugins to use InfiniteMongoInputFormat in local mode, even if the dataset is larger than the current limit (performance is not going to be amazing in local mode anyway).
The workaround (available from the August OSS release) is to set "$splits" and "$docsPerSplit" (which default to 8 and 12500, see above) in the query object so that their product is larger than the number of documents to be processed.
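As a rough illustration of the workaround, the sketch below computes a "$docsPerSplit" value whose product with "$splits" exceeds a given document count, and adds both keys to a query object. It is only a sketch under assumptions: the helper name `with_split_overrides`, the `sourceKey` field, and the placement of the two keys at the top level of the query are hypothetical and not confirmed by this issue. Only the key names and their defaults (8 and 12500) come from the description above.

```python
import json

# Defaults quoted in the issue text.
DEFAULT_SPLITS = 8
DEFAULT_DOCS_PER_SPLIT = 12500


def with_split_overrides(query, doc_count, splits=DEFAULT_SPLITS):
    """Return a copy of `query` with "$splits" and "$docsPerSplit" set so
    that their product is strictly larger than `doc_count`.

    Hypothetical helper: placing the keys at the top level of the query
    object is an assumption, not something the issue spells out.
    """
    if splits < 1:
        raise ValueError("splits must be at least 1")
    # Smallest per-split size whose product with `splits` exceeds doc_count.
    needed = doc_count // splits + 1
    # Never go below the documented default per-split size.
    docs_per_split = max(DEFAULT_DOCS_PER_SPLIT, needed)
    patched = dict(query)  # shallow copy; the caller's query is left untouched
    patched["$splits"] = splits
    patched["$docsPerSplit"] = docs_per_split
    return patched


if __name__ == "__main__":
    # Hypothetical query and document count, for illustration only.
    example = with_split_overrides({"sourceKey": "example.source"}, 250000)
    print(json.dumps(example))
```

Keeping "$splits" at its default and raising only "$docsPerSplit" is one possible choice; increasing "$splits" instead would satisfy the same condition.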
This issue is recorded internally as INF-2018.