-
Notifications
You must be signed in to change notification settings - Fork 13.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[hotfix][clients] Refactor PackagedProgram#getJobJarAndDependencies to reduce some redundant lines. #26238
Conversation
|
||
private static List<URL> getJobJarAndDependencies( | ||
URL jarFile, List<File> extractedTempLibraries) { | ||
List<URL> libs = new ArrayList<>(extractedTempLibraries.size() + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I am curious why we initialize the list with a + 1. Is this in case we need to add the Python lib?
If so, I wonder if we should use isPython
to allocate the correct size - ie. without Python we would not have the + 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @davidradl for the comments, That sounds good to me.
I updated it. PTAL :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to hardcode it to extractedTempLibraries.size() + 2
instead of checking isPython or not for 2 reasons:
- The expected list size still needs to +1
if (jarFile != null)
, so I don't think checkisPython
is enough if we'd like to initialize a correct size. - If we'd like to initialize a proper list size, we need to consider both
jarFile
andisPython
.- However, I think it's pretty complicated if checking both.
- The code base is client related code base instead of data processing, so it's not performance sensitive code.
As performance-awareness part[1] mentioned in Flink code style document:
We can conceptually distinguish between code that “coordinates” and code that “processes data”.
Code that coordinates should always favor simplicity and cleanness.
Data processing code is highly performance critical and should optimize for performance.
I agree it, the simplicity and cleanness are more important than performance for coordinates related code. So a simple implementation is + 2 directly. It only requires a little little memory, and it's compatible with both jarFile
and isPython
.
[1] https://flink.apache.org/how-to-contribute/code-style-and-quality-common/#performance-awareness
Hi, @1996fanrui Could you help take a look at the refactor? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, and left a minor comment~
flink-clients/src/main/java/org/apache/flink/client/program/PackagedProgram.java
Outdated
Show resolved
Hide resolved
|
||
private static List<URL> getJobJarAndDependencies( | ||
URL jarFile, List<File> extractedTempLibraries) { | ||
List<URL> libs = new ArrayList<>(extractedTempLibraries.size() + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to hardcode it to extractedTempLibraries.size() + 2
instead of checking isPython or not for 2 reasons:
- The expected list size still needs to +1
if (jarFile != null)
, so I don't think checkisPython
is enough if we'd like to initialize a correct size. - If we'd like to initialize a proper list size, we need to consider both
jarFile
andisPython
.- However, I think it's pretty complicated if checking both.
- The code base is client related code base instead of data processing, so it's not performance sensitive code.
As performance-awareness part[1] mentioned in Flink code style document:
We can conceptually distinguish between code that “coordinates” and code that “processes data”.
Code that coordinates should always favor simplicity and cleanness.
Data processing code is highly performance critical and should optimize for performance.
I agree it, the simplicity and cleanness are more important than performance for coordinates related code. So a simple implementation is + 2 directly. It only requires a little little memory, and it's compatible with both jarFile
and isPython
.
[1] https://flink.apache.org/how-to-contribute/code-style-and-quality-common/#performance-awareness
Thank you @1996fanrui for your comments and explanations. |
…o reduce some redundant lines. Co-authored-by: Rui Fan <[email protected]> Co-authored-by: davidradl <[email protected]>
@flinkbot run azure |
Hi, @davidradl could you help have a double-check ? Because your suggestions are important & indispensable to the pr. 👍 Any input is appreciated! |
Let me merge it now as no more comments for over a week~ |
Thank you for your attention and review. @1996fanrui @davidradl |
What is the purpose of the change
[hotfix][clients] Refactor PackagedProgram#getJobJarAndDependencies to reduce some redundant lines.
Brief change log
[hotfix][clients] Refactor PackagedProgram#getJobJarAndDependencies to reduce some redundant lines.
Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (yes / no)Documentation