-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add methods #43
base: main
Are you sure you want to change the base?
Add methods #43
Conversation
This branch has not been rebased yet on master, as the initial goal is to create simple test cases to understand how methods should be represented in UML. My initial suggestion would be to sort methods by type as follow:
@lucsorel, what's your view on method representation? Should they be sorted by type or appear in the same order as in the code? Any other test cases that shall be added before looking into the AST approach? |
tests/data/with_methods.puml
Outdated
y: str | ||
void __init__(x: int, y: str, unit: str, u: float, z: List[int]) | ||
(float, float) get_coordinates() | ||
{static} Point from_values(x: int, y: str, u: float, z: List[int]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
different remarks about the type of the returned value:
-
with @jonykalavera, we thought of:
- annotating the type of the returned value "the Python way" (like
from_values(x: int, y: str, u: float, z: List[int]) -> Point
) because it would be pythonistas who are going to use the tool, not UML experts - not annotating anything if nothing is returned, instead of returning "void", because type annotations are optional in Python and can be used in an heterogeneous way across files. We thought that it would make more sense of specifying the type of the returned value only when it is specified
- annotating the type of the returned value "the Python way" (like
-
the type of the value by returned by Point.get_coordinates() could be more consistent with the way we would write it in Python:
tuple
,Tuple[float, float]
ortuple[float, float]
, for instance (that would be what we actually chose in the parsed source code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since py2puml produces UML class diagrams I'd rather stick to the original UML syntax with return type declared before the signature. But I kept the Python return type hinting syntax in the next commit to leave this discussion opened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of agree with you, let's follow the UML convention. Using the Python type annotation syntax here was an artefact of using signature=str(signature)
(see https://github.com/lucsorel/py2puml/pull/43/files#diff-adfc82ab9429d3c5fa93391a8044c10a783c3c77df333a23f8a5eaf31168f91fR87), but I didn't like relying on the str
representation of a signature, which can change over time 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted the expected results in the test cases to the UML convention in my latest commit .
First of all, thank you @grayfox88 for contributing to this issue 🙏 😃
I have answered partly to that on comment #43 (comment). I am not strongly opinionated about that and am looking for a consensus rather than giving directions. What do you think of sticking rather to Python type conventions rather than UML conventions?
In a pragmatic approach to reduce the workload involved in this feature request, I would say that the order should be what the parsing returns 😃 (which is likely to be the source code order). Depending on later feedback, we could consider sorting things.
I would suggest adding a test case with 2 classes, one extending the other, both of them having methods. So that we can make sure that inherited methods do not appear in the child class. What do you think? |
I can help rebasing on the |
tests/py2puml/test_py2puml.py
Outdated
expected_data = fref.read() | ||
puml_content = py2puml.py2puml('tests/modules/withmethods', 'tests.modules.withmethods') | ||
actual_data = ''.join(puml_content) | ||
assert actual_data == expected_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the main branch, there are now functions dedicated to comparing the output of a parsing command with the expected PlantUML contents. We should use them after rebasing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
This test case assumes that the methods will be sorted by type as follow: | ||
1a - instance methods (special methods aka "dunder") | ||
1b - all other instance methods | ||
2 - static methods """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in between, there can be class methods too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but as I said elsewhere, we can consider sorting methods later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Class methods are a bit tricky, because they are specific to Python (and maybe a dozen other languages...) and do not have a formal representation in the UML-language as such. One option could be to make use of "separators" to group class methods, but it would require to start sorting methods 😜 See code snippet below:
__Class methods__
first_class_method()
second_class_method()
Using "icons" to represent a class method is not an option since these are reserved for other use.
Any reflection on that matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the 2 links (for separators and icons) seem broken: they don't seem to refer to the things you said.
However:
- I see what separators are and they could be used
- I believe the "icons" you mentioned are the one dedicated to visibility (private / public); thus I agree that we shouldn't use them for something else
As a first step, I suggest that we don't sort methods for now: this PR is already going to be big. What do you think?
Second question: should we keep the self
and cls
used in the methods' signature? I think we should, as they carry the characteristics of being either an instance or a class method. Static methods would have neither of them; thus we would not need to find visual things to distinguish them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree both on both questions.
However for the static methods, I still think that we should stick to UML and use the {static}
modifier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok: let's use the {static}
modifier (as py2uml already uses it for class's attributes), let's sort later
New test case has been added with subclass inheriting methods. I kept the Python-like typing syntax in this commit in the output puml-files but would prefer to adopt the official UML syntax. It is the formal way to describe class diagram and is meant to be a standard way of representing object oriented systems ... no matter which programming language is used! The "U" in UML stands for "unified" after all 😉 |
|
||
assert point_umlitem.methods[0].name == 'from_values' | ||
assert point_umlitem.methods[1].name == 'get_coordinates' | ||
assert point_umlitem.methods[2].name == '__init__' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should also assert the return types of the methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but the return type is not available yet in the dataclass UmlMethod. I suggest to update the test case once this class has been fleshed out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed 👍
I agree with you, see #43 (comment) 👍 |
Hello @lucsorel, feel free to rebase this branch on top of |
I have rebased this branch on top of And now, the implementation can start 😃 |
@lucsorel, here comes a first fraft that implements methods using AST. It took me quite a while to get my head around the structure of this package and how the AST works but I do believe AST is the right way to go. Note that this is only a first implementation, the return type of the method is for example not included. I also noticed that I added to the astvisitors.py module a lot of unused import statement that will have to be removed in the next commit. Just wanted to get your input on this first attempt though ... |
Thank you @grayfox88 for proposing a first draft of the implementation ❤️ I have not had a look at it yet and I am really busy with the summer holiday deadlines. Please, give me some time and don't be offended 🙏 |
@lucsorel No worries. In the latest commit, I corrected some of the clumsy issues I mentioned in my earlier post (useless import statements, failing test cases) and attempted to implement the return_type representation. I'm still a bit puzzled when it comes to how AST visiting works so you might want to look into my code and suggests a better implementation since there is for example no proper support for compound type in the return type. Also, more of a general question related to return type: if no type hinting is available in the source code, are we supposed to infer it? Or should we just omit it for the puml representation as done today? |
@lucsorel, I noticed that one of the test cases is failing: test_py2puml_with_methods. The UML representation of the class ThreeDimensionalPoint contains wrongly the class variable PI that belongs to its parent class Point. The culprit seems to be the function inspectclass.inspect_static_attributes which relies on the dunder attribute annotations to retrieve class variables. See explanation from annotations official documentation:
AST would do some good here too 😉 Expected UML content:
Actual UML content:
|
Definitely! I started this project using inspection / reflexion because it was handy and easy (writing an AST parser is harder). But I now realize that I should give up the reflexion way because it cannot help separating inherited properties from the other ones. Removing reflexion from the library will be a huge step, but it will make the library failproof to parsing artefacts. Particularly when an execution part is not protected in a |
I agree that refactoring the whole code to embrace AST is a bigger nut to crack. Should this be handled in a separate pull-request? Will soon push a new commit that solves this bug without having to go over to AST right now. |
@@ -20,6 +20,7 @@ def assert_py2puml_is_stringio(domain_path: str, domain_module: str, expected_co | |||
def assert_multilines(actual_multilines: List[str], expected_multilines: Iterable[str]): | |||
line_index = 0 | |||
for line_index, (actual_line, expected_line) in enumerate(zip(actual_multilines, expected_multilines)): | |||
# print(actual_line[:-1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Superfluous print statement. Can be removed?
hello @grayfox88 and @stiebels, Forgive me for the long wait, I am ready to contribute to and review this branch. To ease the rebasing on You will need to delete your local branch and check it out again, afterwards. Forgive me again for this inconvenience. Have a nice day! |
Hello @lucsorel, yes go ahead and squash it. |
29dea7a
to
a2750b2
Compare
a2750b2
to
ccf983f
Compare
hello @grayfox88 and @stiebels, I hope you are all right 😃 Forgive me for the long wait. Yesterday night, I squashed the commits of the branch and applied the automatic formatting on the code in the perspective of rebasing the branch on In the mean time, the default version of python used to develop py2puml has moved on from 3.8 to 3.10 (to support the Also, please do not introduce unittest in this project, please 🙏 ; for the sake of consistency with the rest of the codebase: unittest tests are more verbose than the pytest approach with isolated test functions. 😃 |
This pull request is based on old PR "#30 add methods support" created by jonykalavera