@mroeschke
Member

I moved some benchmarks that were testing methods of (mostly) MultiIndex objects to index_object.py. Otherwise this is mostly cleanup, and the benchmark files whose names start with i are now linted.

$ asv dev -b ^indexing
· Discovering benchmarks
· Running 49 total benchmarks (1 commits * 1 environments * 49 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:211
[  2.04%] ··· Running indexing.IntervalIndexing.time_getitem_list         240μs
[  4.08%] ··· Running indexing.IntervalIndexing.time_getitem_scalar       136μs
[  6.12%] ··· Running indexing.IntervalIndexing.time_loc_list             208μs
[  8.16%] ··· Running indexing.IntervalIndexing.time_loc_scalar           222μs
[  8.16%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:245
[ 10.20%] ··· Running indexing.MethodLookup.time_lookup_iloc             10.4μs
[ 12.24%] ··· Running indexing.MethodLookup.time_lookup_ix               10.3μs
[ 14.29%] ··· Running indexing.MethodLookup.time_lookup_loc              10.1μs
[ 16.33%] ··· Running ...iesIndex.time_frame_assign_timeseries_index     6.08ms
[ 18.37%] ··· Running ....DataFrameNumericIndexing.time_bool_indexer     1.44ms
[ 20.41%] ··· Running indexing.DataFrameNumericIndexing.time_iloc         448μs
[ 22.45%] ··· Running ...ing.DataFrameNumericIndexing.time_iloc_dups      550μs
[ 24.49%] ··· Running indexing.DataFrameNumericIndexing.time_loc          827μs
[ 26.53%] ··· Running ...xing.DataFrameNumericIndexing.time_loc_dups     6.57ms
[ 28.57%] ··· Running ...g.DataFrameStringIndexing.time_boolean_rows      680μs
[ 30.61%] ··· Running ...rameStringIndexing.time_boolean_rows_object      671μs
[ 32.65%] ··· Running ...xing.DataFrameStringIndexing.time_get_value      215μs
[ 32.65%] ····· /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:115: FutureWarning: get_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
                self.df.get_value(self.idx_scalar, self.col_scalar)

[ 34.69%] ··· Running ...DataFrameStringIndexing.time_getitem_scalar      238μs
[ 36.73%] ··· Running indexing.DataFrameStringIndexing.time_ix            327μs
[ 38.78%] ··· Running indexing.DataFrameStringIndexing.time_loc           265μs
[ 40.82%] ··· Running ...Column.time_frame_getitem_single_column_int      167μs
[ 42.86%] ··· Running ...lumn.time_frame_getitem_single_column_label      158μs
[ 44.90%] ··· Running ...xing.InsertColumns.time_assign_with_setitem     55.0ms
[ 46.94%] ··· Running indexing.InsertColumns.time_insert                  103ms
[ 48.98%] ··· Running indexing.MultiIndexing.time_frame_ix               17.5ms
[ 51.02%] ··· Running indexing.MultiIndexing.time_index_slice            11.7ms
[ 53.06%] ··· Running indexing.MultiIndexing.time_series_ix              17.2ms
[ 55.10%] ··· Running ...ing.NonNumericSeriesIndexing.time_get_value         ok
[ 55.10%] ···· 
               ========== ========
                 index            
               ---------- --------
                 string    23.4ms 
                datetime   4.74ms 
               ========== ========

[ 55.10%] ····· 
                
                For parameters: 'string'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:94: FutureWarning: get_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
                  self.s.get_value(self.lbl)
                
                For parameters: 'datetime'
                /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/indexing.py:94: FutureWarning: get_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
                  self.s.get_value(self.lbl)

[ 57.14%] ··· Running ...ericSeriesIndexing.time_getitem_label_slice         ok
[ 57.14%] ···· 
               ========== ========
                 index            
               ---------- --------
                 string    26.0ms 
                datetime   5.32ms 
               ========== ========

[ 59.18%] ··· Running ...umericSeriesIndexing.time_getitem_pos_slice         ok
[ 59.18%] ···· 
               ========== ========
                 index            
               ---------- --------
                 string    2.89ms 
                datetime   474μs  
               ========== ========

[ 61.22%] ··· Running ...onNumericSeriesIndexing.time_getitem_scalar         ok
[ 61.22%] ···· 
               ========== ========
                 index            
               ---------- --------
                 string    24.6ms 
                datetime   4.69ms 
               ========== ========

[ 63.27%] ··· Running ...ng.NumericSeriesIndexing.time_getitem_array         ok
[ 63.27%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    58.4ms 
                pandas.core.indexes.numeric.Float64Index   269ms  
               ========================================== ========

[ 65.31%] ··· Running ...umericSeriesIndexing.time_getitem_list_like         ok
[ 65.31%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    56.7ms 
                pandas.core.indexes.numeric.Float64Index   265ms  
               ========================================== ========

[ 67.35%] ··· Running ...ng.NumericSeriesIndexing.time_getitem_lists         ok
[ 67.35%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    64.9ms 
                pandas.core.indexes.numeric.Float64Index   269ms  
               ========================================== ========

[ 69.39%] ··· Running ...g.NumericSeriesIndexing.time_getitem_scalar         ok
[ 69.39%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    2.68ms 
                pandas.core.indexes.numeric.Float64Index   3.40ms 
               ========================================== ========

[ 71.43%] ··· Running ...ng.NumericSeriesIndexing.time_getitem_slice         ok
[ 71.43%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    284μs  
                pandas.core.indexes.numeric.Float64Index   3.61ms 
               ========================================== ========

[ 73.47%] ··· Running indexing.NumericSeriesIndexing.time_iloc_array         ok
[ 73.47%] ···· 
               ========================================== =======
                                 param1                          
               ------------------------------------------ -------
                 pandas.core.indexes.numeric.Int64Index    363μs 
                pandas.core.indexes.numeric.Float64Index   328μs 
               ========================================== =======

[ 75.51%] ··· Running ...g.NumericSeriesIndexing.time_iloc_list_like         ok
[ 75.51%] ···· 
               ========================================== =======
                                 param1                          
               ------------------------------------------ -------
                 pandas.core.indexes.numeric.Int64Index    231μs 
                pandas.core.indexes.numeric.Float64Index   236μs 
               ========================================== =======

[ 77.55%] ··· Running ...xing.NumericSeriesIndexing.time_iloc_scalar         ok
[ 77.55%] ···· 
               ========================================== =======
                                 param1                          
               ------------------------------------------ -------
                 pandas.core.indexes.numeric.Int64Index    132μs 
                pandas.core.indexes.numeric.Float64Index   134μs 
               ========================================== =======

[ 79.59%] ··· Running indexing.NumericSeriesIndexing.time_iloc_slice         ok
[ 79.59%] ···· 
               ========================================== =======
                                 param1                          
               ------------------------------------------ -------
                 pandas.core.indexes.numeric.Int64Index    220μs 
                pandas.core.indexes.numeric.Float64Index   221μs 
               ========================================== =======

[ 81.63%] ··· Running indexing.NumericSeriesIndexing.time_ix_array           ok
[ 81.63%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    58.6ms 
                pandas.core.indexes.numeric.Float64Index   266ms  
               ========================================== ========

[ 83.67%] ··· Running ...ing.NumericSeriesIndexing.time_ix_list_like         ok
[ 83.67%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    57.0ms 
                pandas.core.indexes.numeric.Float64Index   264ms  
               ========================================== ========

[ 85.71%] ··· Running indexing.NumericSeriesIndexing.time_ix_scalar          ok
[ 85.71%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    3.46ms 
                pandas.core.indexes.numeric.Float64Index   3.60ms 
               ========================================== ========

[ 87.76%] ··· Running indexing.NumericSeriesIndexing.time_ix_slice           ok
[ 87.76%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    3.33ms 
                pandas.core.indexes.numeric.Float64Index   3.74ms 
               ========================================== ========

[ 89.80%] ··· Running indexing.NumericSeriesIndexing.time_loc_array          ok
[ 89.80%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    58.2ms 
                pandas.core.indexes.numeric.Float64Index   266ms  
               ========================================== ========

[ 91.84%] ··· Running ...ng.NumericSeriesIndexing.time_loc_list_like         ok
[ 91.84%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    57.4ms 
                pandas.core.indexes.numeric.Float64Index   266ms  
               ========================================== ========

[ 93.88%] ··· Running indexing.NumericSeriesIndexing.time_loc_scalar         ok
[ 93.88%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    58.6ms 
                pandas.core.indexes.numeric.Float64Index   112ms  
               ========================================== ========

[ 95.92%] ··· Running indexing.NumericSeriesIndexing.time_loc_slice          ok
[ 95.92%] ···· 
               ========================================== ========
                                 param1                           
               ------------------------------------------ --------
                 pandas.core.indexes.numeric.Int64Index    2.82ms 
                pandas.core.indexes.numeric.Float64Index   3.63ms 
               ========================================== ========

[ 97.96%] ··· Running indexing.PanelIndexing.time_subset                 5.09ms
[100.00%] ··· Running indexing.Take.time_take                                ok
[100.00%] ···· 
               ========== ========
                 index            
               ---------- --------
                  int      11.1ms 
                datetime   11.0ms 
               ========== ========
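
The FutureWarning lines in the log above point at .at[] and .iat[] as the replacements for get_value. A minimal sketch of that swap follows, assuming a small made-up frame (the labels and values are hypothetical and are not the benchmark's data):

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration only; not the data used in indexing.py.
df = pd.DataFrame(
    np.arange(12.0).reshape(4, 3),
    index=["r0", "r1", "r2", "r3"],
    columns=["a", "b", "c"],
)

# Label-based scalar access, in place of df.get_value("r2", "b").
by_label = df.at["r2", "b"]

# Position-based scalar access to the same cell (row 2, column 1).
by_position = df.iat[2, 1]

# The same accessor exists on a Series, in place of s.get_value("r2").
series_value = df["b"].at["r2"]

assert by_label == by_position == series_value
```

Both accessors only handle a single scalar lookup, which is why they are the suggested substitutes for get_value rather than .loc or .iloc.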

@mroeschke
Member Author

Here are the asv results for the benchmarks that were moved:

$ asv dev -b ^index_object.MultiIndex
· Discovering benchmarks
· Running 11 total benchmarks (1 commits * 1 environments * 11 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/index_object.py:135
[  9.09%] ··· Running ...IndexValues.time_datetime_level_values_copy     25.0ms
[ 18.18%] ··· Running ...dexValues.time_datetime_level_values_sliced      538μs
[ 27.27%] ··· Running ...tiIndexDuplicates.time_remove_unused_levels     1.12ms
[ 36.36%] ··· Running ...MultiIndexGet.time_multiindex_large_get_loc      423ms
[ 45.45%] ··· Running ...IndexGet.time_multiindex_large_get_loc_warm      895ms
[ 54.55%] ··· Running ...t.MultiIndexGet.time_multiindex_med_get_loc     4.14ms
[ 63.64%] ··· Running ...tiIndexGet.time_multiindex_med_get_loc_warm     19.1ms
[ 72.73%] ··· Running ...IndexGet.time_multiindex_small_get_loc_warm     14.7ms
[ 81.82%] ··· Running ...ultiIndexGet.time_multiindex_string_get_loc      691μs
[ 90.91%] ··· Running ...x_object.MultiIndexInteger.time_get_indexer      337ms
[100.00%] ··· Running ..._object.MultiIndexInteger.time_is_monotonic      227ms
(pandas_dev)matt@matt-Inspiron-1545:~/Projects/pandas-mroeschke/asv_bench$ asv dev -b ^index_object.Float64IndexMethod
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[100.00%] ··· Running index_object.Float64IndexMethod.time_get_loc       7.80ms

@mroeschke mroeschke changed the title from "Asv clean indexing" to "CLN: ASV indexing" Jan 2, 2018
@gfyoung
Member

gfyoung commented Jan 2, 2018

@mroeschke : Looks like some lint errors, but otherwise, Travis is happy.

[np.arange(1000), np.arange(20), list(string.ascii_letters)],
names=['one', 'two', 'three'])
self.mi_med = MultiIndex.from_product(
[np.arange(1000), np.arange(10), list('A')],
Contributor

I think we should use a sub-directory scheme here, in other words:

benchmarks/index/.....

and split out multi, numeric, datetime, etc.

@jreback jreback added the Benchmark (Performance (ASV) benchmarks) and Indexing (Related to indexing on series/frames, not to indexes themselves) labels Jan 2, 2018
@codecov

codecov bot commented Jan 3, 2018

Codecov Report

Merging #19031 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #19031      +/-   ##
==========================================
- Coverage   91.57%   91.56%   -0.01%     
==========================================
  Files         150      150              
  Lines       48942    48942              
==========================================
- Hits        44818    44816       -2     
- Misses       4124     4126       +2

Flag        Coverage Δ
#multiple   89.93% <ø> (-0.01%) ⬇️
#single     41.75% <ø> (ø) ⬆️

Impacted Files           Coverage Δ
pandas/util/testing.py   84.74% <0%> (-0.22%) ⬇️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ab000a9...207c797.

@mroeschke
Member Author

@jreback Some of the benchmarks in the file are parametrized over the various index types, or combine different indexes in their setup, which would make them difficult to split by index type.

Instead, I put the MultiIndex benchmarks in their own file.
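
To illustrate why a parametrized benchmark does not split cleanly by index type, here is a rough sketch of how asv parametrization works. The class name, the size n, and the use of dtype strings are assumptions made for this example; the actual benchmarks in this PR are parametrized over the Int64Index and Float64Index classes, as the param1 tables in the earlier output show.

```python
import numpy as np
import pandas as pd


class NumericSeriesIndexingSketch:
    # asv calls setup() and each time_* method once per entry in `params`
    # and uses `param_names` as the column header of the result table.
    params = ["int64", "float64"]
    param_names = ["index_dtype"]

    def setup(self, index_dtype):
        n = 10 ** 5  # hypothetical size, not taken from the PR
        index = pd.Index(np.arange(n), dtype=index_dtype)
        self.data = pd.Series(np.random.rand(n), index=index)

    def time_loc_scalar(self, index_dtype):
        # Label-based lookup of a single element.
        self.data.loc[800]

    def time_iloc_slice(self, index_dtype):
        # Position-based slice; independent of the index labels.
        self.data.iloc[:800]
```

Since one class body covers every index type, moving the "numeric" part into its own file would mean duplicating each time_* method per type.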

$ asv dev -b ^multiindex_object
· Discovering benchmarks
· Running 15 total benchmarks (1 commits * 1 environments * 15 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/multiindex_object.py:129
[  6.67%] ··· Running ...ject.Values.time_datetime_level_values_copy     24.9ms
[ 13.33%] ··· Running ...ct.Values.time_datetime_level_values_sliced      534μs
[ 20.00%] ··· Running multiindex_object.Duplicated.time_duplicated        223ms
[ 26.67%] ··· Running ...object.Duplicates.time_remove_unused_levels     1.18ms
[ 33.33%] ··· Running multiindex_object.GetLoc.time_large_get_loc         432ms
[ 40.00%] ··· Running ...index_object.GetLoc.time_large_get_loc_warm      890ms
[ 46.67%] ··· Running multiindex_object.GetLoc.time_med_get_loc          4.14ms
[ 53.33%] ··· Running multiindex_object.GetLoc.time_med_get_loc_warm     18.9ms
[ 60.00%] ··· Running ...index_object.GetLoc.time_small_get_loc_warm     15.0ms
[ 66.67%] ··· Running multiindex_object.GetLoc.time_string_get_loc        702μs
[ 73.33%] ··· Running multiindex_object.Integer.time_get_indexer          349ms
[ 80.00%] ··· Running multiindex_object.Integer.time_is_monotonic         245ms
[ 86.67%] ··· Running ...index_object.Sortlevel.time_sortlevel_int64      777ms
[ 93.33%] ··· Running multiindex_object.Sortlevel.time_sortlevel_one     18.2ms
[100.00%] ··· Running ...iindex_object.Sortlevel.time_sortlevel_zero     18.3ms
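
For readers unfamiliar with the layout, the GetLoc entries above could correspond to a class roughly like the sketch below. The two from_product calls follow the snippet quoted in the review comment; the class name, the lookup keys, and the omission of names on mi_med are assumptions for illustration and may not match the merged file.

```python
import string

import numpy as np
from pandas import MultiIndex


class GetLocSketch:
    def setup(self):
        # Level contents taken from the diff snippet quoted in the review.
        self.mi_large = MultiIndex.from_product(
            [np.arange(1000), np.arange(20), list(string.ascii_letters)],
            names=["one", "two", "three"],
        )
        # The quoted snippet is cut off after the levels, so no names are
        # passed here.
        self.mi_med = MultiIndex.from_product(
            [np.arange(1000), np.arange(10), list("A")]
        )

    def time_large_get_loc(self):
        # Hypothetical key: the last entry of the large index.
        self.mi_large.get_loc((999, 19, "Z"))

    def time_med_get_loc(self):
        # Hypothetical key: the last entry of the medium index.
        self.mi_med.get_loc((999, 9, "A"))
```

asv times each time_* method separately and reruns setup for each one, so the cost of building the indexes is not included in the reported numbers.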

@jreback jreback added this to the 0.23.0 milestone Jan 3, 2018
@jreback jreback merged commit c883128 into pandas-dev:master Jan 3, 2018
@jreback
Contributor

jreback commented Jan 3, 2018

thanks @mroeschke

@mroeschke mroeschke deleted the asv_clean_indexing branch January 3, 2018 17:40