Rounding #45

Merged
OliverSherouse merged 31 commits into dev from rounding
Apr 13, 2018
@jnelson16
Member

No description provided.

@jnelson16 requested a review from mgasvoda April 12, 2018 16:02
Contributor

@OliverSherouse left a comment

Small changes needed.

'--probability', action='store_true',
help='output probabilities instead of predictions')
estimate.add_argument(
'--precision', default=4,
Contributor

Set type to int here rather than cast in the function itself.
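
A minimal sketch of what this suggestion could look like, assuming the option is defined with the standard `argparse` module. The `build_parser` helper, the `prog` name, and the help text for `--precision` are illustrative and not taken from the PR.

```python
import argparse


def build_parser():
    """Build a small parser with a --precision option (illustrative only)."""
    parser = argparse.ArgumentParser(prog="estimate")
    parser.add_argument(
        "--probability", action="store_true",
        help="output probabilities instead of predictions")
    # type=int makes argparse convert (and validate) the value at parse time,
    # so downstream functions receive an int and need no int(...) cast.
    parser.add_argument(
        "--precision", default=4, type=int,
        help="number of decimal places to round to (assumed help text)")
    return parser


args = build_parser().parse_args(["--precision", "2"])
print(type(args.precision).__name__, args.precision)
```

With the conversion handled by the parser, a non-integer value such as `--precision abc` is rejected with a usage error before any prediction code runs.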

  texts = (doc.text for doc in streamer)
- yield from zip(streamer.index, pipeline.predict_proba(texts))
+ yield from zip(streamer.index, (i.round(int(precision))
+                                 for i in pipeline.predict_proba(texts)))
Contributor

Round at the array level, not the observation level. So:

yield from zip(streamer.index, pipeline.predict_proba(texts).round(precision))
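
To illustrate the difference between the two approaches, here is a small sketch using a hand-written NumPy array as a stand-in for the output of `predict_proba`. The values and variable names are made up for the example; only `ndarray.round` itself comes from NumPy.

```python
import numpy as np

# Stand-in for the 2-D array returned by predict_proba:
# one row per document, one column per class (values are made up).
probabilities = np.array([
    [0.123456, 0.876544],
    [0.500049, 0.499951],
    [0.999999, 0.000001],
])
precision = 4

# Observation level: one round() call per row, in a Python loop.
rounded_per_row = [row.round(precision) for row in probabilities]

# Array level: a single vectorized call over the whole array.
rounded_at_once = probabilities.round(precision)

# Both approaches give the same values; the second avoids the Python loop.
assert np.array_equal(np.vstack(rounded_per_row), rounded_at_once)
```

Because iterating over a 2-D array yields its rows, the array-level result can still be passed straight to `zip` alongside the document index.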

  predicted = pipeline.predict_proba(texts)
  for i, docidx in enumerate(streamer.index):
- yield docidx, tuple(label_predictions[i]
+ yield docidx, tuple(label_predictions[i].round(int(precision))
Contributor

See above.

@jnelson16
Member Author

@OliverSherouse fixed those changes

  texts = (doc.text for doc in streamer)
  truecol = list(int(i) for i in model.model.classes_).index(1)
- predicted = ((round(i[truecol], int(precision))
+ predicted = ((round(i[truecol], precision)
Contributor

Why aren't we selecting the whole column and rounding at once? I think that would be pipeline.predict_proba(texts)[:, truecol].round(precision).
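
The column-wise approach raised here can be sketched as follows. The `classes` and `probabilities` arrays are made-up stand-ins for a fitted classifier's `classes_` attribute and the output of `predict_proba`; no real model is involved.

```python
import numpy as np

# Made-up stand-ins: class labels as a fitted classifier might expose them
# via classes_, and a predict_proba-style array (rows = documents).
classes = np.array([0, 1])
probabilities = np.array([
    [0.912345, 0.087655],
    [0.250004, 0.749996],
    [0.333333, 0.666667],
])
precision = 3

# Locate the column that holds the probability of the positive class (1).
truecol = [int(c) for c in classes].index(1)

# Slice the whole column, then round it in one vectorized call.
predicted = probabilities[:, truecol].round(precision)
print(predicted.tolist())
```

Slicing first means only the positive-class column is rounded, and the per-row `round(i[truecol], precision)` generator is no longer needed.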

@OliverSherouse merged commit 6686e0d into dev Apr 13, 2018
@OliverSherouse deleted the rounding branch April 13, 2018 17:45
OliverSherouse added a commit that referenced this pull request Apr 17, 2018
* Inaugurated 0.4.0 dev series

* Sentiment analysis (#33)

Closes #11 #12  #13 and adds Sentiment analysis!

* complexity

* complexity builtins

* complexity builtins with tests

* code review updates

* option tests

* added nltk requirement

in setup.py

* add pip install to .travis.yml

* nltk fixes

* another nltk fix

* last nltk fix?

* you know the drill

* Update .travis.yml

* nltk troubles

* some final cleanup

* if it aint broke...

* textblob sentiment

* tests and error raising

* fixed install req

* pep8 fixes

* code review updates

* fix travis file

* import fixes

* small fix

* Test corpora (#35)

* complexity

* complexity builtins

* complexity builtins with tests

* code review updates

* option tests

* added nltk requirement

in setup.py

* add pip install to .travis.yml

* nltk fixes

* another nltk fix

* last nltk fix?

* you know the drill

* Update .travis.yml

* nltk troubles

* some final cleanup

* new corpora

in English!!

* hotfix to add timestamp as corpus identifier

* Skl compatibility (#41)

* Add sklearn 0.17 compatibility

Paper over library reorganization.

* renamed corpora to corpus, added deprecation warning (#42)

* renamed corpora to corpus, added deprecation warning
* moved load_driver and set up for future forcing of full imports of submodules

Closes #31

* S3 drivers (#44)

* initial working commit for s3 driver and database driver

* removing 3.6 formatting

* adding extra requirements list

* adding basic s3 driver test

* Removing unnecessary function

* This ain't 2007

* test updates

* adding s3driver to new corpus structure

* Rounding (#45)

* bumped version
OliverSherouse added a commit that referenced this pull request Apr 17, 2018
* Fix NLTK loading bug

Fix evaluation order when NLTK is not present
OliverSherouse added a commit that referenced this pull request May 21, 2018
* hotfix to add timestamp as corpus identifier (#39)

* bumped version

* Release 0.4 (#47)

* Fix NLTK loading bug

Fix evaluation order when NLTK is not present