This library provides a q method for easy querying of pandas.DataFrame.
Internally, q generates the query string for pandas.DataFrame.query.
The goal of pandaq is to save time when querying.
pip install pandaq
pandaq provides two ways to select pandas.DataFrame rows by query string.
# Way 1: build the query string with Q, then pass it to DataFrame.query
from pandaq import Q
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv')
qstr = Q().q(PassengerId=1) # -> "PassengerId==1"
df.query(qstr)
# Way 2: import pandaq.patch so that the DataFrame itself has a q method
import pandaq.patch
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv')
df.q(PassengerId=1)
Query Type | How It Works |
---|---|
q("str") | Equivalent to pd.DataFrame.query("str") |
q(dict) | Works like q(**dict) |
q(k=v) | Where the column named k is equal to v |
q(k=[v1, ...]) | Where the column named k is in [v1, ...] |
q(k=(op1, v1, ...)) | Where the column named k meets the condition k op1 v1 and ... |
q(k="!str") | Where the column named k is NOT str (see Advance) |
q(k="?str") | Where the column named k contains str (see Advance) |
q(k="!?str") | Where the column named k does NOT contain str (see Advance) |
q(k="/str") | Where the column named k matches the regex str (see Advance) |
Type | Kind | pandaq.q | Equivalent to query-string |
---|---|---|---|
Number | Equal | q(a=1) | "a==1" |
Number | Contain | q(a=[1, 2]) | "a==1 or a==2" |
Number | Inequality | q(a=(">", 1)) | "a>1" |
Number | Inequality | q(a=(">", 1, "<=", 3)) | "a>1 and a<=3" |
Bool | Equal | q(a=True) | "a==True" |
Str | Full-Match | q(a="text") | 'a=="text"' |
Str | Not Full-Match | q(a="!text") | 'a!="text"' |
Str | Partial-Match | q(a="?text") | 'a.str.contains("text", regex=False, na=False)' |
Str | Not Partial-Match | q(a="!?text") | 'not a.str.contains("text", regex=False, na=False)' |
Str | Regex-Match | q(a="/textA|textB") | 'a.str.contains("textA|textB", regex=True, na=False)' |
Datetime | Equal | q(a=dt.date(1970,1,1)) | 'a=="1970-01-01"' |
Datetime | Inequality | q(a=(">", dt.date(1970,1,1))) | 'a>"1970-01-01"' |
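The number, bool, and datetime rows above can be reproduced with a few lines of plain Python. This is only a sketch of the mapping the table describes, not pandaq's actual implementation; the function name `condition` is hypothetical, and string values with prefixes are left out here.

```python
import datetime as dt


def condition(column, value):
    """Build one query-string condition for `column` (illustrative only)."""

    def literal(v):
        # Dates are rendered as quoted ISO strings, as in the table above.
        if isinstance(v, dt.date):
            return f'"{v.isoformat()}"'
        return repr(v)

    if isinstance(value, list):
        # A list means "equal to any of these values".
        return " or ".join(f"{column}=={literal(v)}" for v in value)
    if isinstance(value, tuple):
        # A tuple alternates operators and operands: (op1, v1, op2, v2, ...).
        pairs = zip(value[0::2], value[1::2])
        return " and ".join(f"{column}{op}{literal(v)}" for op, v in pairs)
    # Any other value is treated as a plain equality test.
    return f"{column}=={literal(value)}"


range_query = condition("a", (">", 1, "<=", 3))
date_query = condition("a", (">", dt.date(1970, 1, 1)))
```

The resulting strings can be passed to pandas.DataFrame.query in the same way as the output of Q().q(...).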
pandaq.q | Equivalent to query-string |
---|---|
q(a=1, b=1) | "(a==1 & b==1)" |
q(a=1).q(b=1) | "a==1 & b==1" |
q({"a": 1, "b": 1}) | "(`a`==1 & `b`==1)" |
To change each string prefix individually, configure the following settings.
from pandaq import config
# Default settings
config.str_prefix = StringPrefix(
    neq="!",  # The prefix for not equal
    regex="/",  # The prefix for regular expressions
    partial="?",  # The prefix for partial match
    neq_partial="!?",  # The prefix for not equal partial match
)
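To illustrate what these four settings control, here is a rough sketch of how a string value could be turned into a condition based on its prefix. It is an assumption-laden stand-in rather than pandaq's code: the `Prefixes` dataclass and `str_condition` function are hypothetical, and only the output strings are taken from the table of string queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prefixes:
    """Hypothetical stand-in holding the four prefix settings."""
    neq: str = "!"
    regex: str = "/"
    partial: str = "?"
    neq_partial: str = "!?"


def str_condition(column, text, prefixes=Prefixes()):
    """Turn a prefixed string value into a query-string condition."""

    def contains(pattern, regex):
        return f'{column}.str.contains("{pattern}", regex={regex}, na=False)'

    # Check neq_partial before neq: with the defaults, "!?" also starts with "!".
    if text.startswith(prefixes.neq_partial):
        return "not " + contains(text[len(prefixes.neq_partial):], False)
    if text.startswith(prefixes.neq):
        return f'{column}!="{text[len(prefixes.neq):]}"'
    if text.startswith(prefixes.partial):
        return contains(text[len(prefixes.partial):], False)
    if text.startswith(prefixes.regex):
        return contains(text[len(prefixes.regex):], True)
    # No prefix: a full-match equality test.
    return f'{column}=="{text}"'


# With custom prefixes, "!" is no longer special and is matched literally.
custom = Prefixes(neq="~", neq_partial="~?")
literal_bang = str_condition("a", "!text", custom)
```

Changing a prefix is useful when real data values start with one of the default characters, such as a column whose values begin with "?" or "/".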
pandaq is distributed under the terms of the MIT license.