@WeichenXu123 WeichenXu123 commented Mar 27, 2023

What changes were proposed in this pull request?

Make mapInPandas / mapInArrow support "is_barrier" in Spark Connect.

Why are the changes needed?

Feature parity.

Does this PR introduce any user-facing change?

Yes. mapInPandas and mapInArrow accept a new "is_barrier" argument.

How was this patch tested?

Manually:

bin/pyspark --remote local:

import pyarrow

df = spark.createDataFrame([(1, 21), (2, 30)], ("id", "age"))

# mapInPandas: keep only the rows with id == 1
def filter_func(iterator):
    for pdf in iterator:
        yield pdf[pdf.id == 1]

df.mapInPandas(filter_func, df.schema, is_barrier=True).collect()

# mapInArrow: the same filter, applied to pyarrow.RecordBatch objects
def filter_func(iterator):
    for batch in iterator:
        pdf = batch.to_pandas()
        yield pyarrow.RecordBatch.from_pandas(pdf[pdf.id == 1])

df.mapInArrow(filter_func, df.schema, is_barrier=True).collect()

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
@zhengruifeng zhengruifeng left a comment

LGTM

@zhengruifeng

I guess you forgot to run dev/connect-gen-protos.sh?

@WeichenXu123 WeichenXu123 marked this pull request as ready for review March 27, 2023 08:01
@WeichenXu123

Merged to master

@WeichenXu123 WeichenXu123 changed the title from [SPARK-42929] make mapInPandas / mapInArrow support "is_barrier" to [SPARK-42929][CONNECT] make mapInPandas / mapInArrow support "is_barrier" on Mar 27, 2023
HyukjinKwon pushed a commit that referenced this pull request Mar 29, 2023
### What changes were proposed in this pull request?

This is a follow-up of #40559 and #40571.

Renames `isBarrier` to `barrier` in Spark Connect, too.

### Why are the changes needed?

#40571 changed the argument name from `isBarrier` to `barrier`, so Spark Connect should follow it.

### Does this PR introduce _any_ user-facing change?

Yes, it renames the parameter.
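
For illustration, the manual check from #40559 would presumably look as follows once the keyword is `barrier`. This is a minimal sketch and not code from the patch: `run_example` is a hypothetical helper, `spark` is assumed to be an existing SparkSession, and the installed PySpark is assumed to already include the rename.

```python
def filter_func(iterator):
    """Keep only the rows whose id equals 1, one pandas batch at a time."""
    for pdf in iterator:
        yield pdf[pdf["id"] == 1]


def run_example(spark):
    """Hypothetical helper: `spark` is an existing SparkSession."""
    df = spark.createDataFrame([(1, 21), (2, 30)], ("id", "age"))
    # Assumption: this PySpark build uses the renamed keyword `barrier`
    # rather than the earlier `is_barrier`.
    return df.mapInPandas(filter_func, df.schema, barrier=True).collect()
```

Passing the flag by keyword rather than by position keeps the call readable and makes a mismatch with an older build fail loudly with a `TypeError`.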

### How was this patch tested?

Existing tests.

Closes #40579 from ueshin/issues/SPARK-42929/barrier.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
HyukjinKwon pushed a commit that referenced this pull request Feb 28, 2024
…s/mapInArrow

### What changes were proposed in this pull request?

Add barrier mode tests for mapInPandas and mapInArrow.
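
A barrier mode check of this kind could be sketched roughly as below. This is not the test code added by the patch. The names `tag_barrier` and `run_checks` are hypothetical, `spark` is assumed to be an existing SparkSession, the keyword is assumed to be `barrier`, and the sketch assumes `TaskContext.get()` returns a `BarrierTaskContext` inside a barrier stage.

```python
def tag_barrier(iterator):
    """Pass batches through, adding a flag for whether the task is a barrier task."""
    # Imported inside the function so the import happens on the Python worker.
    from pyspark import BarrierTaskContext, TaskContext

    # Assumption: inside a barrier stage the task context is a BarrierTaskContext.
    in_barrier = isinstance(TaskContext.get(), BarrierTaskContext)
    for pdf in iterator:
        yield pdf.assign(in_barrier=in_barrier)


def run_checks(spark):
    """Hypothetical helper: `spark` is an existing SparkSession."""
    # A single partition, so the barrier stage needs only one free task slot.
    df = spark.range(0, 4, 1, 1)
    schema = "id long, in_barrier boolean"

    plain = df.mapInPandas(tag_barrier, schema).collect()
    barrier = df.mapInPandas(tag_barrier, schema, barrier=True).collect()

    assert all(not row.in_barrier for row in plain)
    assert all(row.in_barrier for row in barrier)
```

The same pattern should carry over to `mapInArrow` by yielding `pyarrow.RecordBatch` objects instead of pandas DataFrames.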

### Why are the changes needed?

This is a follow-up of #40559.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

The newly added tests pass CI.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #45310 from wbo4958/barrier-test.

Authored-by: Bobby Wang <wbo4958@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
TakawaAkirayo pushed a commit to TakawaAkirayo/spark that referenced this pull request Mar 4, 2024
ericm-db pushed a commit to ericm-db/spark that referenced this pull request Mar 5, 2024