Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add methods #43

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Add methods #43

wants to merge 1 commit into from

Conversation

grayfox88
Copy link

@grayfox88 grayfox88 commented Apr 9, 2023

This pull request is based on old PR "#30 add methods support" created by jonykalavera

@grayfox88
Copy link
Author

This branch has not been rebased yet on master, as the initial goal is to create simple test cases to understand how methods should be represented in UML. My initial suggestion would be to sort methods by type as follow:

  1. instance "special" methods defined by user (aka dunder method)
  2. all other instance methods
  3. static methods

@lucsorel, what's your view on method representation? Should they be sorted by type or appear in the same order as in the code? Any other test cases that shall be added before looking into the AST approach?
I agree that only method declared in the class shall be represented, not the one inherited.

@grayfox88 grayfox88 mentioned this pull request Apr 9, 2023
y: str
void __init__(x: int, y: str, unit: str, u: float, z: List[int])
(float, float) get_coordinates()
{static} Point from_values(x: int, y: str, u: float, z: List[int])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

different remarks about the type of the returned value:

  • with @jonykalavera, we thought of:

    • annotating the type of the returned value "the Python way" (like from_values(x: int, y: str, u: float, z: List[int]) -> Point) because it would be pythonistas who are going to use the tool, not UML experts
    • not annotating anything if nothing is returned, instead of returning "void", because type annotations are optional in Python and can be used in an heterogeneous way across files. We thought that it would make more sense of specifying the type of the returned value only when it is specified
  • the type of the value by returned by Point.get_coordinates() could be more consistent with the way we would write it in Python: tuple, Tuple[float, float] or tuple[float, float], for instance (that would be what we actually chose in the parsed source code)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since py2puml produces UML class diagrams I'd rather stick to the original UML syntax with return type declared before the signature. But I kept the Python return type hinting syntax in the next commit to leave this discussion opened.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of agree with you, let's follow the UML convention. Using the Python type annotation syntax here was an artefact of using signature=str(signature) (see https://github.com/lucsorel/py2puml/pull/43/files#diff-adfc82ab9429d3c5fa93391a8044c10a783c3c77df333a23f8a5eaf31168f91fR87), but I didn't like relying on the str representation of a signature, which can change over time 👍

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted the expected results in the test cases to the UML convention in my latest commit .

@lucsorel
Copy link
Owner

First of all, thank you @grayfox88 for contributing to this issue 🙏 😃

what's your view on method representation?

I have answered partly to that on comment #43 (comment). I am not strongly opinionated about that and am looking for a consensus rather than giving directions. What do you think of sticking rather to Python type conventions rather than UML conventions?

Should they be sorted by type or appear in the same order as in the code?

In a pragmatic approach to reduce the workload involved in this feature request, I would say that the order should be what the parsing returns 😃 (which is likely to be the source code order). Depending on later feedback, we could consider sorting things.

Any other test cases that shall be added before looking into the AST approach?

I would suggest adding a test case with 2 classes, one extending the other, both of them having methods. So that we can make sure that inherited methods do not appear in the child class. What do you think?

@lucsorel
Copy link
Owner

I can help rebasing on the main branch; because it would take less time to apply the changes for me than for anyone else, I believe.

expected_data = fref.read()
puml_content = py2puml.py2puml('tests/modules/withmethods', 'tests.modules.withmethods')
actual_data = ''.join(puml_content)
assert actual_data == expected_data
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the main branch, there are now functions dedicated to comparing the output of a parsing command with the expected PlantUML contents. We should use them after rebasing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

This test case assumes that the methods will be sorted by type as follow:
1a - instance methods (special methods aka "dunder")
1b - all other instance methods
2 - static methods """
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in between, there can be class methods too

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but as I said elsewhere, we can consider sorting methods later

Copy link
Author

@grayfox88 grayfox88 Apr 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class methods are a bit tricky, because they are specific to Python (and maybe a dozen other languages...) and do not have a formal representation in the UML-language as such. One option could be to make use of "separators" to group class methods, but it would require to start sorting methods 😜 See code snippet below:

__Class methods__
first_class_method()
second_class_method()

Using "icons" to represent a class method is not an option since these are reserved for other use.

Any reflection on that matter?

Copy link
Owner

@lucsorel lucsorel Apr 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 2 links (for separators and icons) seem broken: they don't seem to refer to the things you said.

However:

  • I see what separators are and they could be used
  • I believe the "icons" you mentioned are the one dedicated to visibility (private / public); thus I agree that we shouldn't use them for something else

As a first step, I suggest that we don't sort methods for now: this PR is already going to be big. What do you think?

Second question: should we keep the self and cls used in the methods' signature? I think we should, as they carry the characteristics of being either an instance or a class method. Static methods would have neither of them; thus we would not need to find visual things to distinguish them.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree both on both questions.
However for the static methods, I still think that we should stick to UML and use the {static} modifier.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok: let's use the {static} modifier (as py2uml already uses it for class's attributes), let's sort later

@grayfox88
Copy link
Author

grayfox88 commented Apr 12, 2023

New test case has been added with subclass inheriting methods. I kept the Python-like typing syntax in this commit in the output puml-files but would prefer to adopt the official UML syntax. It is the formal way to describe class diagram and is meant to be a standard way of representing object oriented systems ... no matter which programming language is used!

The "U" in UML stands for "unified" after all 😉


assert point_umlitem.methods[0].name == 'from_values'
assert point_umlitem.methods[1].name == 'get_coordinates'
assert point_umlitem.methods[2].name == '__init__'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also assert the return types of the methods

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the return type is not available yet in the dataclass UmlMethod. I suggest to update the test case once this class has been fleshed out.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed 👍

@lucsorel
Copy link
Owner

New test case has been added with subclass inheriting methods. I kept the Python-like typing syntax in this commit in the output puml-files but would prefer to adopt the official UML syntax. It is the formal way to describe class diagram and is meant to be a standard way of representing object oriented systems ... no matter which programming language is used!
The "U" in UML stands for "unified" after all wink

I agree with you, see #43 (comment) 👍

@grayfox88
Copy link
Author

Hello @lucsorel, feel free to rebase this branch on top of main when you have time 😃 Or let me know if you want to correct some of the changes made to the test case.

@lucsorel
Copy link
Owner

I have rebased this branch on top of main. This required to force-push the new commits history. You may need to delete your local add-methods branch and checkout this new one.

And now, the implementation can start 😃

@grayfox88
Copy link
Author

@lucsorel, here comes a first fraft that implements methods using AST. It took me quite a while to get my head around the structure of this package and how the AST works but I do believe AST is the right way to go. Note that this is only a first implementation, the return type of the method is for example not included. I also noticed that I added to the astvisitors.py module a lot of unused import statement that will have to be removed in the next commit. Just wanted to get your input on this first attempt though ...
In a more general manner, I do have suggestion that could improve the cohesion of certain modules and improve the readability too. But we can discuss these refactoring suggestions later.

@lucsorel
Copy link
Owner

Thank you @grayfox88 for proposing a first draft of the implementation ❤️

I have not had a look at it yet and I am really busy with the summer holiday deadlines. Please, give me some time and don't be offended 🙏

@grayfox88
Copy link
Author

@lucsorel No worries. In the latest commit, I corrected some of the clumsy issues I mentioned in my earlier post (useless import statements, failing test cases) and attempted to implement the return_type representation. I'm still a bit puzzled when it comes to how AST visiting works so you might want to look into my code and suggests a better implementation since there is for example no proper support for compound type in the return type.

Also, more of a general question related to return type: if no type hinting is available in the source code, are we supposed to infer it? Or should we just omit it for the puml representation as done today?

@grayfox88
Copy link
Author

grayfox88 commented Jun 1, 2023

@lucsorel, I noticed that one of the test cases is failing: test_py2puml_with_methods.

The UML representation of the class ThreeDimensionalPoint contains wrongly the class variable PI that belongs to its parent class Point. The culprit seems to be the function inspectclass.inspect_static_attributes which relies on the dunder attribute annotations to retrieve class variables. See explanation from annotations official documentation:

The problem is that, since annotations is optional on classes, and because classes can inherit attributes from their base classes, accessing the annotations attribute of a class may inadvertently return the annotations dict of a base class.

AST would do some good here too 😉

Expected UML content:

class tests.modules.withmethods.withinheritedmethods.ThreeDimensionalPoint {
  z: float
  __init__(self, int x, str y, float z)
  move(self, int offset)
  bool check_positive(self)
}

Actual UML content:

class tests.modules.withmethods.withinheritedmethods.ThreeDimensionalPoint {
  PI: float {static}
  z: float
  __init__(self, int x, str y, float z)
  move(self, int offset)
  bool check_positive(self)
}

@lucsorel
Copy link
Owner

lucsorel commented Jun 5, 2023

AST would do some good here too

Definitely! I started this project using inspection / reflexion because it was handy and easy (writing an AST parser is harder). But I now realize that I should give up the reflexion way because it cannot help separating inherited properties from the other ones.

Removing reflexion from the library will be a huge step, but it will make the library failproof to parsing artefacts. Particularly when an execution part is not protected in a if __name__ == '__main__': condition

@grayfox88
Copy link
Author

grayfox88 commented Jun 6, 2023

I agree that refactoring the whole code to embrace AST is a bigger nut to crack. Should this be handled in a separate pull-request?

Will soon push a new commit that solves this bug without having to go over to AST right now.

@@ -20,6 +20,7 @@ def assert_py2puml_is_stringio(domain_path: str, domain_module: str, expected_co
def assert_multilines(actual_multilines: List[str], expected_multilines: Iterable[str]):
line_index = 0
for line_index, (actual_line, expected_line) in enumerate(zip(actual_multilines, expected_multilines)):
# print(actual_line[:-1])
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Superfluous print statement. Can be removed?

@lucsorel
Copy link
Owner

hello @grayfox88 and @stiebels,

Forgive me for the long wait, I am ready to contribute to and review this branch. To ease the rebasing on main (and solving the conflicts), I would like to squash the commits of this branch; is that ok for you?

You will need to delete your local branch and check it out again, afterwards. Forgive me again for this inconvenience.

Have a nice day!

@lucsorel lucsorel added the enhancement New feature or request label Oct 16, 2023
@grayfox88
Copy link
Author

Hello @lucsorel, yes go ahead and squash it.

@lucsorel
Copy link
Owner

hello @grayfox88 and @stiebels, I hope you are all right 😃

Forgive me for the long wait. Yesterday night, I squashed the commits of the branch and applied the automatic formatting on the code in the perspective of rebasing the branch on main.

In the mean time, the default version of python used to develop py2puml has moved on from 3.8 to 3.10 (to support the int | float syntax for union types). That move broke the implementation of TypeVisitor which receives instances of ast.Tuple (for example) instead of ast.Index ones when handling types involving brackets (see the deprecation notes of https://docs.python.org/3/library/ast.html#ast.AST, https://greentreesnakes.readthedocs.io/en/latest/nodes.html#Index and https://greentreesnakes.readthedocs.io/en/latest/nodes.html#Tuple). This changes the way inner elements (component types, in this case) can be reached. I'll try to fix that, or maybe get rid of the whole visitor since the project already has the shorten_compound_type_annotation function to standardize the type representation of variables, which can be used for argument types and return types of method as well.

Also, please do not introduce unittest in this project, please 🙏 ; for the sake of consistency with the rest of the codebase: unittest tests are more verbose than the pytest approach with isolated test functions. 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants