[DataFrame] Implements df.as_matrix #2001
Conversation
Test FAILed.
python/ray/dataframe/dataframe.py
Outdated
__array__ does the same thing here. Would be better if this called that under the hood, so that it can be optimized in one place later.
Since as_matrix has the columns kwarg, do you think we should leave as_matrix as it is now and have __array__ use return self.as_matrix()? That way it is optimized in one place, and we can still deal with columns.
After review, it looks like they do similar things, but __array__ takes a dtype (our implementation currently disregards it for some reason), and as_matrix takes columns, so let's just keep them separate for now. I'll add a TODO here as well, though.
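For reference, the delegation option raised above could look roughly like the sketch below. It is only an illustration: `ToyFrame` is a hypothetical stand-in that stores plain Python lists, not the Ray DataFrame implementation, and the thread's decision was to keep the two methods separate for now. The sketch shows `as_matrix` handling `columns` while `__array__` reuses it and honors `dtype`.

```python
import numpy as np


class ToyFrame:
    """Hypothetical stand-in for a DataFrame; not Ray's implementation."""

    def __init__(self, data):
        # data maps column name -> list of values, all of the same length.
        self._data = dict(data)
        self.columns = list(self._data)

    def as_matrix(self, columns=None):
        # Use the requested subset of columns, or all columns by default.
        cols = self.columns if columns is None else list(columns)
        return np.column_stack([self._data[c] for c in cols])

    def __array__(self, dtype=None, copy=None):
        # Delegate to as_matrix so the conversion lives in one place,
        # then apply the requested dtype instead of ignoring it.
        mat = self.as_matrix()
        return mat if dtype is None else mat.astype(dtype)


frame = ToyFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
full = np.asarray(frame)                        # goes through __array__
subset = frame.as_matrix(columns=["b"])         # only as_matrix knows about columns
as_float = np.asarray(frame, dtype=np.float64)  # dtype is forwarded to __array__
```

With this layout, any later optimization of `as_matrix` would also benefit `__array__`, at the cost of coupling the two signatures.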
Can you test array(df) here too?
Please clarify what you mean by array(df).
Test the __array__ function here.
We already have a test___array__, why do we need to test that here also?
Sorry, forgot about that. In that case, this is not necessary.
Can you use the fixture model for testing here and define the numpy matrix in the tests to compare against?
Why does this need a fixture model?
Fixture models are more in line with what we have been using. The simplest way to do this test would be to run as_matrix or __array__ on both pd_df and ray_df and check equality.
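A minimal sketch of that comparison pattern follows. The helper `matrices_equal` and the two stand-in classes are hypothetical names introduced only so the example runs on its own; in the real test, `pd_df` would be a pandas DataFrame and `ray_df` the Ray DataFrame built from the same fixture data.

```python
import numpy as np


def matrices_equal(reference_df, candidate_df, columns=None):
    """Return True when both frames convert to the same matrix."""
    expected = reference_df.as_matrix(columns=columns)
    actual = candidate_df.as_matrix(columns=columns)
    return expected.shape == actual.shape and bool(np.array_equal(expected, actual))


class ColumnStore:
    """Hypothetical stand-in for pd_df: keeps data column by column."""

    def __init__(self, data):
        self._data = dict(data)

    def as_matrix(self, columns=None):
        cols = list(self._data) if columns is None else list(columns)
        return np.column_stack([self._data[c] for c in cols])


class RowStore:
    """Hypothetical stand-in for ray_df: keeps data row by row."""

    def __init__(self, names, rows):
        self._names = list(names)
        self._rows = np.array(rows)

    def as_matrix(self, columns=None):
        if columns is None:
            return self._rows
        # Map the requested column names to positions in each row.
        idx = [self._names.index(c) for c in columns]
        return self._rows[:, idx]


pd_df = ColumnStore({"a": [1, 2], "b": [3, 4]})
ray_df = RowStore(["a", "b"], [[1, 3], [2, 4]])
same_full = matrices_equal(pd_df, ray_df)
same_subset = matrices_equal(pd_df, ray_df, columns=["b"])
```

Comparing against the reference frame's own output avoids hand-writing the expected numpy matrix in every test.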
We will make a pass over the tests to unify them in a later PR.
e4a1624 to 490b889
Test FAILed.
    frame = rdf.DataFrame(test_data.frame)
    mat = frame.as_matrix()

    frameCols = frame.columns
frameCols seems a bit inconsistent with the naming convention, maybe frame_columns?
Good point, I'll make this change.
Test PASSed.
Merged, thanks @SaladRaider!
What do these changes do?
Implements df.as_matrix