Let's use this header here too:
/*
 * Licensed to Metamarkets Group Inc. (Metamarkets) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. Metamarkets licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
fieldPathMap and fieldSpecs could be final
You don't need to save an ObjectReader; you can call readValue directly on the ObjectMapper.
@gianm removed the reader, calling readValue on the mapper directly now
|
Good stuff besides the nits. It appears this class can replace JsonParser in terms of functionality; should we deprecate that one so it can be removed in the future? |
|
agree that deprecating the JSONParser is the way to go, assuming this new one is just as fast for already-flattened data. |
|
Hmm, let me revoke my +1 as I think this patch needs some more changes to be compatible with the old JSONParser and allow us to deprecate that. |
For Lists this should be applied to each element of the list. Please also include unit tests for stuff like {"toomany":[1234567890000000000000]} and {"funkyCoding":["foo\uD900"]}.
@gianm Conversion function is now applied to List entries, as well as Map entries
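A minimal sketch of what applying the conversion to List and Map entries could look like, using only the JDK. `ValueConversionSketch` and `convert` are hypothetical names, not the parser's actual code, and the specific conversions (widening `Integer` to `Long`, turning an integer too large for a `long` into a `double`, and round-tripping strings through UTF-8 so that an unpaired surrogate is replaced) are assumptions chosen to match the two test inputs requested above.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not the parser's real implementation.
final class ValueConversionSketch {
  private ValueConversionSketch() {}

  // Converts a parsed value, recursing into List elements and Map values.
  static Object convert(Object val) {
    if (val instanceof Integer) {
      // Assumption: integral values are normalized to Long.
      return Long.valueOf((Integer) val);
    }
    if (val instanceof BigInteger) {
      // Assumption: integers that do not fit in a long fall back to double (lossy).
      return ((BigInteger) val).doubleValue();
    }
    if (val instanceof String) {
      // Round-trip through UTF-8; getBytes substitutes '?' for unencodable chars
      // such as an unpaired surrogate.
      String s = (String) val;
      return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }
    if (val instanceof List) {
      List<Object> out = new ArrayList<>();
      for (Object element : (List<?>) val) {
        out.add(convert(element));
      }
      return out;
    }
    if (val instanceof Map) {
      Map<Object, Object> out = new LinkedHashMap<>();
      for (Map.Entry<?, ?> entry : ((Map<?, ?>) val).entrySet()) {
        out.put(entry.getKey(), convert(entry.getValue()));
      }
      return out;
    }
    return val;
  }

  public static void main(String[] args) {
    List<Object> input = Arrays.asList(
        1, "foo\uD900", new BigInteger("1234567890000000000000"));
    System.out.println(convert(input));
  }
}
```

In this sketch the same `convert` call handles a bare value, a list of values, and a map of values, so nested combinations need no extra code.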
|
@jon-wei Some recent PRs clobbered the merge, can you please rebase? |
|
@drcrallen Rebase is done |
|
@drcrallen yeah, that's worth mentioning |
|
@drcrallen [] is a parseable field value, I've added that to the unit test. Do you have an example use case for [{"type":"emptyObj"},{"type":"emptyObj2"}]? |
(non blocking) you can simply specify @Test(expected = ParseException.class) or something like that, I forget the exact name.
I like the ExpectedException Rule :)
I think this is ok the way it is though
@drcrallen @gianm Changed those tests to use ExpectedException
|
@jon-wei can you add the comments on the parser class as per the conversation in this thread? |
|
@drcrallen @gianm Can you clarify what it means to "support only objects"? E.g., {"field": [{"type":"emptyObj"},{"type":"emptyObj2"}]} could be parsed with this parser, using a fieldSpec like field[0].type |
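To make the `field[0].type` example concrete, here is a rough JDK-only sketch of resolving such a path against already-parsed nested Map/List values. It is not the JsonPath library and not the parser in this PR; `FieldPathSketch` and `read` are hypothetical names, and the path syntax handled (dot-separated keys with optional `[n]` index suffixes) is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mini path resolver; a stand-in for a real JsonPath evaluation.
final class FieldPathSketch {
  private FieldPathSketch() {}

  // Resolves a path such as "field[0].type". Returns null if any step is missing.
  static Object read(Map<String, Object> root, String path) {
    Object current = root;
    for (String step : path.split("\\.")) {
      int bracket = step.indexOf('[');
      String key = bracket < 0 ? step : step.substring(0, bracket);
      if (!key.isEmpty()) {
        if (!(current instanceof Map)) {
          return null;
        }
        current = ((Map<?, ?>) current).get(key);
      }
      // Apply each [n] suffix in order, e.g. "field[0]" or "grid[1][2]".
      while (bracket >= 0) {
        int close = step.indexOf(']', bracket);
        if (close < 0) {
          throw new IllegalArgumentException("Unclosed '[' in path: " + path);
        }
        int index = Integer.parseInt(step.substring(bracket + 1, close));
        if (!(current instanceof List)) {
          return null;
        }
        List<?> list = (List<?>) current;
        if (index < 0 || index >= list.size()) {
          return null;
        }
        current = list.get(index);
        bracket = step.indexOf('[', close);
      }
    }
    return current;
  }

  public static void main(String[] args) {
    Map<String, Object> first = new HashMap<>();
    first.put("type", "emptyObj");
    Map<String, Object> second = new HashMap<>();
    second.put("type", "emptyObj2");
    Map<String, Object> root = new HashMap<>();
    root.put("field", Arrays.asList(first, second));
    System.out.println(read(root, "field[0].type"));
  }
}
```

Note that the root passed to `read` is a Map, i.e. a top-level JSON object; the array appears only as a nested value.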
|
@jon-wei Ok, can you make sure that's called out in a unit test then? |
|
@drcrallen I have a field like that in the nestedJson test string in JSONPathParserTest |
|
@jon-wei I'm missing it, can you link to the line please? all the static strings look like objects |
|
@drcrallen It's at line 47 of JSONPathParserTest:
|
|
@jon-wei That's a nested object, I'm referring to the top-level thing being an array instead of an object. |
|
@drcrallen Ah, got it, let's just support objects only for now then, I'll document this |
|
@drcrallen @gianm I've noted that only JSON objects are supported in the parser javadocs as well as the druid flatten-json.md doc |
|
still 👍 from me |
|
Can someone please create a new maven version for java-util? I'll update the druid-api pom.xml to use the new version when that's ready. |
You may find considerable CPU savings from keeping a static ObjectMapper around for the default case rather than creating a new ObjectMapper in the constructor. ObjectMappers have significant startup costs associated with reflection. They're thread-safe, so re-use of a singleton in concurrent scenarios is fine.
So imagine a static field
private static final ObjectMapper DEFAULT_MAPPER = new ObjectMapper();
then the constructor below on line 63 can read
this.mapper = mapper == null ? DEFAULT_MAPPER : mapper;
or (since you're willing to use Guava)
this.mapper = Objects.firstNonNull(mapper, DEFAULT_MAPPER);
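Put together, the suggestion might look like the sketch below. To keep it compilable with the JDK alone, `CostlyMapper` is a hypothetical stand-in for Jackson's `ObjectMapper`, and `ParserSketch` is a hypothetical stand-in for the parser class; only the `DEFAULT_MAPPER` field and the null-fallback line come from the comment above.

```java
// Hypothetical stand-in for an expensive-to-construct, thread-safe object
// such as Jackson's ObjectMapper.
final class CostlyMapper {
  CostlyMapper() {
    // Imagine costly reflection-based setup happening here.
  }
}

// Hypothetical stand-in for the parser class under review.
final class ParserSketch {
  // Created once when the class is loaded and shared by every parser
  // that is not handed its own mapper.
  private static final CostlyMapper DEFAULT_MAPPER = new CostlyMapper();

  private final CostlyMapper mapper;

  ParserSketch(CostlyMapper mapper) {
    // Fall back to the shared default instead of allocating a new mapper.
    this.mapper = mapper == null ? DEFAULT_MAPPER : mapper;
  }

  CostlyMapper mapper() {
    return mapper;
  }

  public static void main(String[] args) {
    ParserSketch a = new ParserSketch(null);
    ParserSketch b = new ParserSketch(null);
    // Both parsers share the single default instance.
    System.out.println(a.mapper() == b.mapper());
  }
}
```

Sharing one instance is only safe because the real object is thread-safe, as noted above. On Java 9 and later, `java.util.Objects.requireNonNullElse(mapper, DEFAULT_MAPPER)` expresses the same fallback without Guava.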
@joshrose That's a good point, although the expected use case of the Parser is to be created once and then shared and re-used, so in practice I think the current code should be OK.
Makes sense. (I've encountered the CPU costs of repeatedly allocating identical ObjectMappers in a few past projects.) Thanks for the reply, @gianm
|
Fwiw, because of this, it is impossible to use anything that uses a different version of ASM than what the json path jar includes. |
|
Hrm, looking at the jar itself, it's not packaging ASM. I made these comments based on what others told me. Need to track that down some more... |
|
@cheddar it is brought in as a dep of net.minidev.json-smart |
|
The problem is in net.minidev.asm; we should probably upgrade to json-smart-2.2.1
Part of a set of 3 related pull requests, addressing Druid issue:
apache/druid#1839
#34 -- new JSON parser
druid-io/druid-api#65 -- ingestion spec modifications
apache/druid#1921 -- docs and benchmark
Adds a parser using the JsonPath library that allows the user to specify fields and accessor expressions, currently used for flattening JSON during Druid ingestion.