-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-534 - WIP #345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-534 - WIP #345
Conversation
cpp/src/arrow/ipc/json-internal.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timestamps before 1970 are negative, so: yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK :)
integration/data/simple.json
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add data generator fixtures to integration_test.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
|
This will have some minor rebase conflicts with #347 |
|
Small rebase required |
wesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to add date and time test cases for https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/ipc-adapter-test.cc#L178 and https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/ipc-file-test.cc#L183.
see e.g. https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/test-common.h#L283
Integration tests are blocked until ipc-file-test and ipc-adapter-test are passing , then we can add the data generators to integration_test.py. We'll also need to wait for ARROW-582 to get done
cpp/src/arrow/ipc/json-internal.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does rapidjson handle integers over 2^53? If it doesn't have any problems then this is OK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See so, the following code works:
#include <iostream>
#include <sstream>
#include "rapidjson/document.h"
#include "rapidjson/istreamwrapper.h"
using namespace rapidjson;
int
main() {
std::stringstream in;
in << "{";
// 4611686018427387917
in << "\"x\": " << ((uint64_t(1)<<62) + 13) << ",";
in << "\"y\": 2";
in << "}";
IStreamWrapper isw(in);
Document doc;
doc.ParseStream(isw);
int64_t x = doc["x"].GetInt64();
std::cout << "x = " << x << std::endl;
int64_t y = doc["y"].GetInt64();
std::cout << "y = " << y << std::endl;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know. Suppose we'll have the proof in the working integration tests anyway
cpp/src/arrow/ipc/json-internal.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timestamps are always signed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fixing
cpp/src/arrow/ipc/json-internal.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you combine the logic for TimeType and TimestampType into a single function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. WIll do
cpp/src/arrow/ipc/json-internal.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use the same parsing code as for integers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, will try to add is_base_of to the ReadArray method for PrimitiveCType and BooleanType.
cpp/src/arrow/type.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be deleted
cpp/src/arrow/type.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TimeUnit is a strongly typed enum, so these MIN/MAX fields aren't necessary. In Flatbuffers, the MIN/MAX values are these to help account for NULL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If unit is an invalid value, I don't believe it will have made it this far
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where will we fail? In the flatbuffer side of things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, i.e. it should not be possible to return an invalid TimeUnit from a flatbuffer
|
I note that a similar test fixture is missing in https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/ipc-json-test.cc |
|
@tebeka I opened https://issues.apache.org/jira/browse/ARROW-620 -- since you started on the JSON support in this patch, if you have time can you take this up in a new patch? |
Closes apache#345. I had mostly done this in apache#361 so this adds tests to `ipc-adapter-test` Author: Wes McKinney <wes.mckinney@twosigma.com> Closes apache#371 from wesm/ARROW-534 and squashes the following commits: cab6d4f [Wes McKinney] Add functions to make record batches for date, date32, timestamp, time. Fix bugs
Just for code review, not final code.