Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 9 additions & 7 deletions R/pkg/R/mllib.R
Original file line number Diff line number Diff line change
Expand Up @@ -1398,20 +1398,22 @@ setMethod("summary", signature(object = "KSTest"),
distParams <- unlist(callJMethod(jobj, "distParams"))
degreesOfFreedom <- callJMethod(jobj, "degreesOfFreedom")

list(p.value = pValue, statistic = statistic, nullHypothesis = nullHypothesis,
nullHypothesis.name = distName, nullHypothesis.parameters = distParams,
degreesOfFreedom = degreesOfFreedom)
ans <- list(p.value = pValue, statistic = statistic, nullHypothesis = nullHypothesis,
nullHypothesis.name = distName, nullHypothesis.parameters = distParams,
degreesOfFreedom = degreesOfFreedom, jobj = jobj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does jobj show up in the summary? this seems to be different to our other implementation?

Copy link
Contributor Author

@yanboliang yanboliang Sep 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it's ok to show up. This implementation is a little different from others since we would like to reuse the summary information output string which was defined at Scala side, otherwise, we should reimplement a same one at R side. We implement the summary information output string at R side directly for other wrappers such as GLM, since we do not have appropriate toString implementation at Scala side. Maybe we may unify them later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly - I'm not sure if there is a better way to do this; but likely better if we have some consistent way to do this for later.

class(ans) <- "summary.KSTest"
ans
})

# Prints the summary of KSTest

#' @rdname spark.kstest
#' @param x test result object of KSTest by \code{spark.kstest}.
#' @param x summary object of KSTest returned by \code{summary}.
#' @export
#' @note print.summary.KSTest since 2.1.0
print.summary.KSTest <- function(x, ...) {
jobj <- x@jobj
jobj <- x$jobj
summaryStr <- callJMethod(jobj, "summary")
cat(summaryStr)
invisible(summaryStr)
cat(summaryStr, "\n")
invisible(x)
}
16 changes: 2 additions & 14 deletions R/pkg/inst/tests/testthat/test_mllib.R
Original file line number Diff line number Diff line change
Expand Up @@ -760,13 +760,7 @@ test_that("spark.kstest", {

expect_equal(stats$p.value, rStats$p.value, tolerance = 1e-4)
expect_equal(stats$statistic, unname(rStats$statistic), tolerance = 1e-4)

printStr <- print.summary.KSTest(testResult)
expect_match(printStr, paste0("Kolmogorov-Smirnov test summary:\\n",
"degrees of freedom = 0 \\n",
"statistic = 0.38208[0-9]* \\n",
"pValue = 0.19849[0-9]* \\n",
".*"), perl = TRUE)
expect_match(capture.output(stats)[1], "Kolmogorov-Smirnov test summary:")

Copy link
Member

@felixcheung felixcheung Sep 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have test for the function and its output?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's easy to turn this into capture.output to avoid polluting output.

Copy link
Contributor Author

@yanboliang yanboliang Sep 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output is relevant to the implementation of KolmogorovSmirnovTestResult.toString, if it was updated, this test will be broken which is not reasonable IMO. We have already checked the summary output value in unit test, I think it's not necessary to check another kinds of these value again. And I believe we will add more summary information later, which will require updating the expected printing summary again and again. In all other ML wrappers, we did not check print summary. Thanks.

testResult <- spark.kstest(df, "test", "norm", -0.5)
stats <- summary(testResult)
Expand All @@ -775,13 +769,7 @@ test_that("spark.kstest", {

expect_equal(stats$p.value, rStats$p.value, tolerance = 1e-4)
expect_equal(stats$statistic, unname(rStats$statistic), tolerance = 1e-4)

printStr <- print.summary.KSTest(testResult)
expect_match(printStr, paste0("Kolmogorov-Smirnov test summary:\\n",
"degrees of freedom = 0 \\n",
"statistic = 0.44003[0-9]* \\n",
"pValue = 0.09470[0-9]* \\n",
".*"), perl = TRUE)
expect_match(capture.output(stats)[1], "Kolmogorov-Smirnov test summary:")
})

sparkR.session.stop()