Conversation

@tebeka
Contributor

@tebeka tebeka commented Feb 20, 2017

Just for code review, not final code.

Member

Timestamps before 1970 are negative, so: yes.
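
As a rough illustration of this point (not code from the patch), the sketch below shows why a seconds-since-epoch value needs a signed 64-bit type. The helper names `SecondsFromEpochDays` and `ReinterpretAsUnsigned` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kSecondsPerDay = 86400;

// Seconds since 1970-01-01T00:00:00Z for midnight UTC, `days` days away
// from the epoch. Negative `days` means a date before 1970.
std::int64_t SecondsFromEpochDays(std::int64_t days) {
  // Guard against overflow in the multiplication below.
  assert(days > -100000000 && days < 100000000);
  return days * kSecondsPerDay;
}

// What the same value turns into if it is stored in an unsigned field:
// the conversion wraps modulo 2^64.
std::uint64_t ReinterpretAsUnsigned(std::int64_t value) {
  return static_cast<std::uint64_t>(value);
}
```

Here `SecondsFromEpochDays(-1)` (1969-12-31) is a small negative number, while the same value pushed through `ReinterpretAsUnsigned` becomes a huge positive one, which is the kind of silent corruption a signed type avoids.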

Contributor Author

OK :)

Member

Can you also add data generator fixtures to integration_test.py?

Contributor Author

OK

@wesm
Member

wesm commented Feb 23, 2017

This will have some minor rebase conflicts with #347

@wesm
Member

wesm commented Feb 24, 2017

Small rebase required

Member

@wesm wesm left a comment

You need to add date and time test cases for https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/ipc-adapter-test.cc#L178 and https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/ipc-file-test.cc#L183.

See e.g. https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/test-common.h#L283

Integration tests are blocked until ipc-file-test and ipc-adapter-test are passing; then we can add the data generators to integration_test.py. We'll also need to wait for ARROW-582 to get done.

Member

How does rapidjson handle integers over 2^53? If it doesn't have any problems, then this is OK.

Contributor Author

@tebeka tebeka Mar 1, 2017

Seems so; the following code works:

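#include <cstdint>
#include <iostream>
#include <sstream>
#include "rapidjson/document.h"
#include "rapidjson/istreamwrapper.h"

using namespace rapidjson;

int main() {
    // Build {"x": 4611686018427387917, "y": 2}.
    // x = 2^62 + 13, well above 2^53, so it cannot be represented exactly as a double.
    std::stringstream in;
    in << "{";
    in << "\"x\": " << ((uint64_t(1) << 62) + 13) << ",";
    in << "\"y\": 2";
    in << "}";
    IStreamWrapper isw(in);
    Document doc;

    doc.ParseStream(isw);
    if (doc.HasParseError()) {
        std::cerr << "JSON parse error" << std::endl;
        return 1;
    }
    // Read both members back as 64-bit integers and print them.
    int64_t x = doc["x"].GetInt64();
    std::cout << "x = " << x << std::endl;
    int64_t y = doc["y"].GetInt64();
    std::cout << "y = " << y << std::endl;
    return 0;
}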
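
For context on why 2^53 is the threshold in question: a JSON reader that keeps every number as a double can only represent integers exactly up to 2^53. The following is a minimal sketch of that limit, independent of rapidjson; the helper name `SurvivesDoubleRoundTrip` is made up for this example.

```cpp
#include <cstdint>

// Returns true if `value` is unchanged after being converted to double and
// back, i.e. if a double-only number representation would preserve it.
bool SurvivesDoubleRoundTrip(std::int64_t value) {
  const double as_double = static_cast<double>(value);
  // Values close to INT64_MAX round up to 2^63, which does not fit in
  // int64_t. Converting that back would be undefined behavior, so treat it
  // as a failed round trip.
  if (as_double >= 9223372036854775808.0) {
    return false;
  }
  return static_cast<std::int64_t>(as_double) == value;
}
```

With IEEE 754 doubles, 2^53 itself survives the round trip, but 2^53 + 1 and the 2^62 + 13 value used in the snippet above do not, which is why reading the value back through `GetInt64()` is the relevant check.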

Member

Good to know. Suppose we'll have the proof in the working integration tests anyway

Member

Timestamps are always signed

Contributor Author

OK, fixing

Member

Can you combine the logic for TimeType and TimestampType into a single function?

Contributor Author

Sure. Will do

Member

Can you use the same parsing code as for integers?

Contributor Author

Good idea, will try to add is_base_of to the ReadArray method for PrimitiveCType and BooleanType.
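
A sketch of what such an `is_base_of` dispatch could look like, so that time-like types reuse the integer parsing path. Every type here is a stand-in defined for the example (`IntegerLike` in particular is a hypothetical marker base); this is not the actual Arrow class hierarchy or the real `ReadArray` signature.

```cpp
#include <cstdint>
#include <type_traits>

// Stand-in type hierarchy, assumed for illustration only.
struct DataType {};
struct IntegerLike : DataType {};  // hypothetical marker base
struct Int64Type : IntegerLike { using c_type = std::int64_t; };
struct TimestampType : IntegerLike { using c_type = std::int64_t; };
struct TimeType : IntegerLike { using c_type = std::int64_t; };
struct BooleanType : DataType { using c_type = bool; };

// One overload for everything stored as an integer, selected via is_base_of.
template <typename T>
typename std::enable_if<std::is_base_of<IntegerLike, T>::value,
                        typename T::c_type>::type
ParseValue(std::int64_t raw) {
  return static_cast<typename T::c_type>(raw);
}

// A separate overload for booleans, which need different handling.
template <typename T>
typename std::enable_if<std::is_same<T, BooleanType>::value, bool>::type
ParseValue(std::int64_t raw) {
  return raw != 0;
}
```

With this shape, `ParseValue<TimestampType>` and `ParseValue<TimeType>` resolve to the same integer overload as `ParseValue<Int64Type>`, so no timestamp-specific parsing code is needed.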

Member

This can be deleted

Member

TimeUnit is a strongly typed enum, so these MIN/MAX fields aren't necessary. In Flatbuffers, the MIN/MAX values are there to help account for NULL.
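
To illustrate what "strongly typed" buys here, a minimal sketch follows. The enumerator names and values are assumptions made for the example and may not match the actual `TimeUnit` definition; `TimeUnitName` is a hypothetical helper.

```cpp
#include <string>
#include <type_traits>

// Assumed definition, for illustration only.
enum class TimeUnit : char { SECOND = 0, MILLI = 1, MICRO = 2, NANO = 3 };

// A scoped enum does not convert implicitly to or from int, so callers
// cannot pass an arbitrary integer where a TimeUnit is expected.
static_assert(!std::is_convertible<TimeUnit, int>::value,
              "scoped enums do not implicitly convert to int");
static_assert(!std::is_convertible<int, TimeUnit>::value,
              "ints do not implicitly convert to scoped enums");

// Switching over the named enumerators covers every valid unit without
// MIN/MAX sentinel values.
std::string TimeUnitName(TimeUnit unit) {
  switch (unit) {
    case TimeUnit::SECOND: return "SECOND";
    case TimeUnit::MILLI:  return "MILLI";
    case TimeUnit::MICRO:  return "MICRO";
    case TimeUnit::NANO:   return "NANO";
  }
  // Reachable only if a value was forced in with an explicit cast.
  return "UNKNOWN";
}
```

An out-of-range value can still be produced with an explicit `static_cast`, which is why the fallback return remains in the sketch.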

Member

If unit is an invalid value, I don't believe it will have made it this far

Contributor Author

Where will we fail? In the flatbuffer side of things?

Member

Right, i.e. it should not be possible to return an invalid TimeUnit from a flatbuffer

@wesm
Member

wesm commented Mar 1, 2017

@wesm
Member

wesm commented Mar 12, 2017

@tebeka I opened https://issues.apache.org/jira/browse/ARROW-620 -- since you started on the JSON support in this patch, if you have time can you take this up in a new patch?

@tebeka
Contributor Author

tebeka commented Mar 13, 2017

@wesm OK, will look into ARROW-620

jeffknupp pushed a commit to jeffknupp/arrow that referenced this pull request Mar 15, 2017
Closes apache#345. I had mostly done this in apache#361 so this adds tests to `ipc-adapter-test`

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes apache#371 from wesm/ARROW-534 and squashes the following commits:

cab6d4f [Wes McKinney] Add functions to make record batches for date, date32, timestamp, time. Fix bugs