forked from apache/spark
Commit
[SPARK-45130][CONNECT][ML][PYTHON] Avoid Spark connect ML model to change input pandas dataframe

### What changes were proposed in this pull request?

Currently, to avoid a data copy, a Spark Connect ML model directly modifies the input pandas dataframe when appending prediction columns. Instead, we can use `pandas_df.copy(deep=False)` to make a shallow copy and then append the prediction columns to the copied dataframe. This is easier for users to work with.

### Why are the changes needed?

This makes the `transform` method of `pyspark.ml.connect` models behave more like that of `pyspark.ml` models, i.e., the input dataframe is left intact after `transform` is called. Otherwise, users might be surprised by the new behavior and have to change more code to migrate their workloads to `pyspark.ml.connect`.

### Does this PR introduce _any_ user-facing change?

Yes.

Previous behavior: In `pyspark.ml.connect`, `model.transform` appends new columns to the input pandas dataframe and returns the input dataframe object.

Changed behavior: In `pyspark.ml.connect`, `model.transform` makes a shallow copy of the input pandas dataframe, appends new columns to the shallow copy, and returns the copied pandas dataframe.

### How was this patch tested?

Unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#42887 from WeichenXu123/spark-ml-connect-model-avoid-change-input-dataframe.

Authored-by: Weichen Xu <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
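
The behavior change described in this commit message can be illustrated with plain pandas. The following is a minimal sketch, not the code from this commit: `transform_sketch` is a hypothetical stand-in for `model.transform`, and the `feature` and `prediction` column names and the doubling "model" are assumptions made only for illustration. It shows that appending a column to a shallow copy leaves the input dataframe's columns untouched.

```python
import pandas as pd


def transform_sketch(pandas_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for a model's transform on pandas input."""
    # Shallow copy: a new DataFrame object is created, but the existing
    # column data is not duplicated.
    output = pandas_df.copy(deep=False)
    # Appending a new column only changes the copy's set of columns.
    # The "model" here (doubling the feature) is purely illustrative.
    output["prediction"] = output["feature"] * 2.0
    return output


input_df = pd.DataFrame({"feature": [1.0, 2.0, 3.0]})
result = transform_sketch(input_df)

# The input keeps its original columns; the result carries the new one.
print(list(input_df.columns))
print(list(result.columns))
print(result is input_df)
```

Because only a new column is added and no existing values are overwritten, the shallow copy avoids duplicating the input data while still keeping the caller's dataframe intact.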
1 parent d5ff04d, commit 99a979d
Showing 5 changed files with 25 additions and 8 deletions.