Commit
[SPARK-50674][PYTHON] Fix check for ‘terminate’ method existence in UDTF evaluation

### What changes were proposed in this pull request?

Fix check for ‘terminate’ method existence in UDTF evaluation

### Why are the changes needed?

To ensure that UDTFs without a terminate method can still be used with partitioning without causing an AttributeError.

Previously, a UDTF used with partitioning would raise an AttributeError if the terminate method was not defined, as shown below:

```py
>>> from pyspark.sql.functions import udtf
>>> from pyspark.sql import Row
>>>
>>> @udtf(returnType="a: int")
... class TestUDTF:
...     def eval(self, row: Row):
...         if row[0] > 5:
...             yield row[0],
...
>>> spark.udtf.register("test_udtf", TestUDTF)
<pyspark.sql.udtf.UserDefinedTableFunction object at 0x10298a1d0>
>>> spark.sql("SELECT * FROM test_udtf(TABLE (SELECT id FROM range(0, 8)) PARTITION BY id)").show()
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
...
  File "...pyspark/worker.py", line 1052, in eval
    if self._udtf.terminate is not None:
AttributeError: 'TestUDTF' object has no attribute 'terminate'
```

However, the terminate method is not required in such cases.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#49299 from xinrong-meng/udtf_terminate.

Authored-by: Xinrong Meng <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
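The diff itself is not shown here, so the following is only a minimal sketch of the kind of existence check the commit message describes, not the actual change to `worker.py`. The handler classes and the `unsafe_finish` / `safe_finish` helpers are hypothetical names used for illustration. The sketch runs without Spark. It contrasts the plain attribute access seen in the traceback with a `hasattr` guard.

```python
class NoTerminateUDTF:
    """Hypothetical handler that defines eval() but no terminate()."""

    def eval(self, value):
        if value > 5:
            yield (value,)


class WithTerminateUDTF:
    """Hypothetical handler that also defines terminate()."""

    def __init__(self):
        self._count = 0

    def eval(self, value):
        self._count += 1
        yield (value,)

    def terminate(self):
        # Emit one final row containing the number of eval() calls.
        yield (self._count,)


def unsafe_finish(handler):
    # Mirrors the failing pattern from the traceback: plain attribute
    # access raises AttributeError when terminate is not defined.
    if handler.terminate is not None:
        return list(handler.terminate())
    return []


def safe_finish(handler):
    # hasattr() returns False for a missing attribute instead of raising,
    # so handlers without terminate() are simply skipped.
    if hasattr(handler, "terminate"):
        return list(handler.terminate())
    return []


if __name__ == "__main__":
    print(safe_finish(NoTerminateUDTF()))
    try:
        unsafe_finish(NoTerminateUDTF())
    except AttributeError as exc:
        print("unsafe check failed:", exc)
```

An equivalent guard is `getattr(handler, "terminate", None) is not None`, which keeps the `is not None` shape of the original condition while tolerating a missing method.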