Skip to content

Conversation

@zero323
Copy link
Member

@zero323 zero323 commented Sep 18, 2016

What changes were proposed in this pull request?

ReplacesValueError with IndexError when index passed to ml / mllib SparseVector.__getitem__ is out of range. This ensures correct iteration behavior.

Replaces ValueError with IndexError for DenseMatrix and SparkMatrix in ml / mllib.

How was this patch tested?

PySpark ml / mllib unit tests. Additional unit tests to prove that the problem has been resolved.

@SparkQA
Copy link

SparkQA commented Sep 18, 2016

Test build #65582 has finished for PR 15144 at commit 4b90ee5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member

LGTM!

@BryanCutler
Copy link
Member

It looks like SparseMatrix also raises a ValueError. Can you fix that here also?

@zero323
Copy link
Member Author

zero323 commented Sep 19, 2016

@BryanCutler Sure, here you are. Actually both Sparse and Dense had to be fixed.

Note: Technically speaking this can is a breaking change if someone expects ValueError.

On a side note it would be nice to support slice and iteration over rows as well but I still hope there is a chance for doing something with SPARK-16626.

@SparkQA
Copy link

SparkQA commented Sep 19, 2016

Test build #65608 has finished for PR 15144 at commit 4162b06.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Copy link
Member

test this please

@SparkQA
Copy link

SparkQA commented Oct 4, 2016

Test build #66292 has finished for PR 15144 at commit 4162b06.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Copy link
Member

LGTM, merging with master and branch-2.0
Thank you @zero323 for the PR and @BryanCutler for reviewing !

@asfgit asfgit closed this in d8399b6 Oct 4, 2016
asfgit pushed a commit that referenced this pull request Oct 4, 2016
…_getitem__ contract

## What changes were proposed in this pull request?

Replaces` ValueError` with `IndexError` when index passed to `ml` / `mllib` `SparseVector.__getitem__` is out of range. This ensures correct iteration behavior.

Replaces `ValueError` with `IndexError` for `DenseMatrix` and `SparkMatrix` in `ml` / `mllib`.

## How was this patch tested?

PySpark `ml` / `mllib` unit tests. Additional unit tests to prove that the problem has been resolved.

Author: zero323 <zero323@users.noreply.github.com>

Closes #15144 from zero323/SPARK-17587.

(cherry picked from commit d8399b6)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
@zero323 zero323 deleted the SPARK-17587 branch April 6, 2017 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants