Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Upgrade sqlalchemy #5

Merged
merged 40 commits into from
Oct 10, 2023
Merged

RFC: Upgrade sqlalchemy #5

merged 40 commits into from
Oct 10, 2023

Conversation

leo-mansini
Copy link
Contributor

No description provided.

@leo-mansini leo-mansini changed the base branch from master to migration-from-gitlab December 29, 2022 23:04
@@ -126,11 +126,14 @@ def __setattr__(self, attr, value):
elif attr == 'driver':
value = format_drivername(self.dialect, value)
attr = 'drivername'
self.conn_url = self.conn_url.set(**{attr: value})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quickly understand the change, mind explaining it to me if you remember? :)

If it's a breaking change with sqlalchemy 1.3 we could add a version check to remain compatible with both 1.3 and 1.4.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a breaking change. Previously one could just do

connection.url = "my_url"

But now the attribute is protected from changes like that, and requires to be changed with its set method.

Regarding double compatibility, I'm answering you in another comment.

numpy = ">=1.22.0,<2.0"
jinjasql = "^0.1.8"
ipython = "^7.29.0"
ipython = ">=8.0.1"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@MartinCura MartinCura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi 👋🏼 @leo-mansini.

Nice PR you've got there. Is this ready for action? Do you want me to spam someone so it's code reviewed? It looks fairly inoffensive, with the caveat that the PR name is not complete as you're modifying several package, aren't you?

I was just looking to upgrade project-lisa to mlflow 1.28.0 which requires sqlalchemy 1.4... but this lovely package muttlib doesn't let me :) (because of the current sqlalchemy <1.4 limitation) ... unless you merge with the change i'm requesting.

pyproject.toml Outdated
@@ -58,28 +58,27 @@ progressbar2 = "^3.55.0"
PyYAML = ">=5.1,<7.0"
scikit-learn = "^1.0.1"
scipy = "^1.7.2"
SQLAlchemy = "<1.4.0,>=1.3.0"
SQLAlchemy = "^1.4,<1.5"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be changed to >=1.3,<2.0? This would retain compatibility with many more projects already using muttlib. Or is the 1.3 -> 1.4 upgrade breaking?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<2.0 and <1.5 is the same since there aren't any versions in >=1.5, <2.0, but I think <2.0 is better.

Regarding >=1.3, I can try and check!

However I'm not sure how to manage this with MLFlow or other libraries not compatible with 1.3, would muttlib need compatibility for the older version of those libraries too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just started a poetry project with sqlalchemy 1.3 and this branch to test this. In the last commit I attempted the double compatibility and no issues with poetry or pytest 👍 , just need to test it on sqlalchemy 1.3 since poetry.lock set it to 1.4.

@@ -126,11 +126,14 @@ def __setattr__(self, attr, value):
elif attr == 'driver':
value = format_drivername(self.dialect, value)
attr = 'drivername'
self.conn_url = self.conn_url.set(**{attr: value})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quickly understand the change, mind explaining it to me if you remember? :)

If it's a breaking change with sqlalchemy 1.3 we could add a version check to remain compatible with both 1.3 and 1.4.

@leo-mansini
Copy link
Contributor Author

@MartinCura answering this. The changes in dependencies are, as far as I can remember, all motivated by sqlalchemy version bump, except for IPython, where I added the vulnerability comment.

Other cases that I remember were particular are:

  • sqlalchemy-trino not only had problems with sqlalchemy but is also deprecated as it was added as an extra to Trino.
  • wrapt was not used, and also had conflicts with sqlalchemy.

@MartinCura
Copy link
Contributor

OK great, and what do you think about retaining compatibility for both sqlalchemy 1.3 and 1.4?

@MartinCura
Copy link
Contributor

@CrossNox do i bother you for a Code Review? 😅

psycopg2-binary = { version = "^2.9.2", optional = true }
PyMySQL = { version = "^1.0.2", optional = true }
pymssql = { version = "^2.2.2", optional = true }
pymongo = { version = "^3.12.1", optional = true }
snowflake-connector-python = { version = "^2.7.1", optional = true }
cryptography = { version = "3.4.7", optional = true }
snowflake-sqlalchemy = { version = "1.2.4", optional = true }
snowflake-sqlalchemy = { version = ">=1.2.4, <=1.4.4, !=1.2.5", optional = true } # https://github.com/snowflakedb/snowflake-sqlalchemy/issues/234
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding 1.2.5 because of incompatibility with sqlalchemy 1.3 (source).

@leo-mansini
Copy link
Contributor Author

leo-mansini commented Mar 27, 2023

@MartinCura Added compatibility with SQLAlchemy >1.3. Keep in mind the changes were tested only in the project included tests and in my particular uses, which are Google Sheets and BigQuery.

pyproject.toml Outdated
matplotlib = "^3.5.0"
pyarrow = { version = "6.0.0", optional = true }
cx-Oracle = { version = "^8.3.0", optional = true }
PyHive = { version = ">=0.6.1", optional = true }
thrift = { version = "^0.15.0", optional = true }
thrift = { version = "0.16.0", optional = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pinning to a specific version for same reason? i.e. missing ^?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question remains

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving it with ^ works anyways as latest is 0.16.0 so no reason to not have it 🤷

@MartinCura
Copy link
Contributor

@MartinCura Added compatibility with SQLAlchemy >1.3. Keep in mind the changes were tested only in the project included tests and in my particular uses, which are Google Sheets and BigQuery.

Thanks! Looking good!

Base automatically changed from migration-from-gitlab to master April 5, 2023 19:42
@leo-mansini
Copy link
Contributor Author

Merged this to master, just have an issue with bandit, fails due to a pd.read_pickle I didn't modify in this PR 😢

Copy link
Contributor Author

@leo-mansini leo-mansini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI passes @aoelvp94 @MartinCura

We can merge if it lgty

@@ -183,7 +183,7 @@ def clean_quotes(s):
elif ext == 'feather':
return pd.read_feather(fn)
elif ext in ['pickle', 'pkl']:
return pd.read_pickle(fn)
return pd.read_pickle(fn) # nosec
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For bandit to pass it's either this or removing the feature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a comment to entrypoint function cautioning about passing unsafe pickle files because it's a security issue; apart from that i can't think of anything else you can do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to docstring (if by entrypoint you mean the df_read_multi util).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and maybe local_df_cache()

muttlib/dbconn/base.py Show resolved Hide resolved
@@ -183,7 +183,7 @@ def clean_quotes(s):
elif ext == 'feather':
return pd.read_feather(fn)
elif ext in ['pickle', 'pkl']:
return pd.read_pickle(fn)
return pd.read_pickle(fn) # nosec
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a comment to entrypoint function cautioning about passing unsafe pickle files because it's a security issue; apart from that i can't think of anything else you can do?

pyproject.toml Outdated
matplotlib = "^3.5.0"
pyarrow = { version = "6.0.0", optional = true }
cx-Oracle = { version = "^8.3.0", optional = true }
PyHive = { version = ">=0.6.1", optional = true }
thrift = { version = "^0.15.0", optional = true }
thrift = { version = "0.16.0", optional = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question remains

muttlib/utils.py Outdated Show resolved Hide resolved
Copy link
Contributor

@MartinCura MartinCura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat

i'm checking whether there's some project that is endangered because of lack of lockfile, but after that we can merge

@MartinCura
Copy link
Contributor

i'm checking whether there's some project that is endangered because of lack of lockfile, but after that we can merge

Seems like there's none :)

@leo-mansini
Copy link
Contributor Author

TY @MartinCura for the review!

@leo-mansini leo-mansini merged commit 753f2a1 into master Oct 10, 2023
@MartinCura MartinCura deleted the upgrade-sqlalchemy branch October 18, 2023 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants