Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Script to insert keyword links #410

Merged
merged 13 commits into from
Nov 15, 2024
Merged

Conversation

hakonhagland
Copy link
Collaborator

Builds on #408 and #409 which should be merged first.

The script replaces keyword names inside <text:p ...> tags with links to the corresponding keyword definition page. See #408 for more information. Since there are over 1000 keywords, and since I am not very confident that it will work correctly on every file, the plan is to apply the script to some keywords at a time. For example, starting with the 168 keywords in chapter 5. After it has been applied to all keyword files, we can consider applying it to the chapters and appendices also.

@hakonhagland hakonhagland changed the title Kw convert script Script to insert keyword links Nov 4, 2024
@hakonhagland
Copy link
Collaborator Author

There is currently an issue if the text contains fixed width fonts like for the H2STORE keyword. For example, at page 288 in the manual:

image

Do we want to insert links in the fixed width font part in the bottom of the screenshot? Currently, the script will insert links here too.

@hakonhagland
Copy link
Collaborator Author

The script also needs to handle recursive <text:p> tags. This is an issue if a paragraph contains footnotes that are also marked up with <text:p>.

@gdfldm
Copy link
Collaborator

gdfldm commented Nov 4, 2024

Do we want to insert links in the fixed width font part in the bottom of the screenshot?

I would be inclined not to add links here. I think if we would like to have a link then the keyword could be mentioned in the text surrounding the example input.

@gdfldm
Copy link
Collaborator

gdfldm commented Nov 4, 2024

Where a "keyword" appears multiple times in a KEYWORD.fodt file do we want add links to every occurrence or perhaps just the first?

@hakonhagland
Copy link
Collaborator Author

hakonhagland commented Nov 5, 2024

do we want add links to every occurrence or perhaps just the first?

@gdfldm We could also think of the links as a special markup for the keywords. Currently, the user recognizes a keyword by being an uppercase word, but there are upper case words that are not a keyword. However, if the keyword is also displayed in in e.g. italics and in a different color, the user will more easily recognize a keyword. Here is an example for how it will look for the H2STORE keyword:

image

Note that currently the keyword itself H2STORE is not marked up, but we could change that such that (it could link to itself) to avoid confusion (whether the H2STORE is a keyword or not).

@blattms
Copy link
Member

blattms commented Nov 5, 2024

Thanks a lot. I'll review this (with my limited Python knowledge) after the others are merged. That hopefully saves time.

@hakonhagland
Copy link
Collaborator Author

hakonhagland commented Nov 5, 2024

Currently, the script misses keywords that are split by span tags, for example if CO2STORE occurs like this within the text <text:span text:style-name="T6">C</text:span>O2STORE

@hakonhagland
Copy link
Collaborator Author

hakonhagland commented Nov 5, 2024

Rebased this onto #408

@hakonhagland
Copy link
Collaborator Author

Rebased this and added a commit that will generate the mapping from keywords to URIs on the fly, see discussion in #409 (comment) for more information

@hakonhagland
Copy link
Collaborator Author

Added three more commits to address the review comments in #411 .

@hakonhagland hakonhagland force-pushed the kw_convert_script branch 2 times, most recently from f0889b5 to be6564a Compare November 9, 2024 10:24
@hakonhagland
Copy link
Collaborator Author

Updated script to handle aliases, see #411 (comment) for more information

@blattms
Copy link
Member

blattms commented Nov 11, 2024

Seems like this needs a rebase, sorry.

Hopefully, the number changed files is smaller after this. 15 seems a bit high.

@hakonhagland
Copy link
Collaborator Author

hakonhagland commented Nov 11, 2024

Hopefully, the number changed files is smaller after this.

@blattms Actually this PR includes #417, so maybe we should merge that first? @gdfldm can you have a look at this: #417 (comment)

Then, I can rebase this one after that one has been merged.

@blattms
Copy link
Member

blattms commented Nov 13, 2024

Ok the other PR is merged. Please rebase this and I hope that @lisajulia will take another look at the Python code.

@hakonhagland
Copy link
Collaborator Author

Ok the other PR is merged. Please rebase this

@blattms Thanks! I rebased this now.

@@ -201,8 +201,20 @@ ENDNUM __RefHeading___Toc123125_83452205
ENDSCALE __RefHeading___Toc68146_2267116897
ENDSKIP __RefHeading___Toc605472_3199477706
ENKRVD __RefHeading___Toc69787_621662414
ENKRVDX __RefHeading___Toc69787_621662414
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The changes to parts/meta/kw_uri_map.txt are intended?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. These are inserted by the alias functions. See https://github.com/OPM/opm-reference-manual/pull/410/files#diff-45a6fd9dbaf527c01ddb0d629af97c5b4ecf940c546aa9f3e834044d1985c6dbR135. They create aliases to keywords that are included in another keyword. See #411 (comment) for more information

Copy link
Collaborator

@lisajulia lisajulia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed and I tested the commands fodt-gen-kw-uri-map and fodt-link-keywords --subsection=5.3 - everything seems fine :).
Yet I suggest to name the files differently:
keyword_linker.py -> keyword_uri_map_generator.py
keyword_linker2.py -> keyword_linker.py

To provide a consistent markup experience for the keywords, we should
also link the keyword to itself
Generate the keyword to URI map on the fly, if the user
does not request to load it from a file instead
The "_40_Example" style is used indirectly by other styles, but
it also used by itself so we need to add it to the example styles.
Handle manually inserted span tags that indicate that a word is not
a keyword
Add aliases for certain keywords that are included in another
keyword file
Fix keyword regex to avoid matching shorter keyword as part of a longer
keyword
@hakonhagland
Copy link
Collaborator Author

I suggest to name the files differently

@lisajulia Thanks for the review :) I have renamed the files in the last update.

@lisajulia lisajulia merged commit bbf01d4 into OPM:main Nov 15, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants