Partition size is less than overlapping window size. Try using df.repartition to increase the partition size. #180

Closed
Vildnex opened this issue Apr 17, 2022 · 1 comment

Vildnex commented Apr 17, 2022

I am trying to compute the slope of a time series using pandas with swifter by doing this.

The full code with this problem can be found in this repository

After running test.py from that repository, I get the following error, which, as far as I understand, is caused by swifter:

dask.multiprocessing.NotImplementedError: Partition size is less than overlapping window size. Try using ``df.repartition`` to increase the partition size.

Full traceback:

Traceback (most recent call last):
  File "/home/vlad/Crypto_15m_data/test.py", line 51, in <module>
    add_slop_indicator(dicts[name_pair], f"EMA_{val}", days)
  File "/home/vlad/Crypto_15m_data/test.py", line 27, in add_slop_indicator
    ohlc[f'slope_{val}_{days}'] = ohlc[ind].swifter.rolling(window=candles_back, min_periods=candles_back).apply(
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/swifter/swifter.py", line 521, in apply
    return self._dask_apply(func, *args, **kwds)
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/swifter/swifter.py", line 562, in _dask_apply
    dd.from_pandas(self._comparison_pd, npartitions=self._npartitions)
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/dask/base.py", line 292, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/dask/base.py", line 575, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/dask/multiprocessing.py", line 220, in get
    result = get_async(
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/dask/local.py", line 508, in get_async
    raise_exception(exc, tb)
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/dask/multiprocessing.py", line 110, in reraise
    raise exc
dask.multiprocessing.NotImplementedError: Partition size is less than overlapping window size. Try using ``df.repartition`` to increase the partition size.

Traceback
---------
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/dask/local.py", line 221, in execute_task
    result = _execute_task(task, data)
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/vlad/Crypto_15m_data/venv/lib/python3.10/site-packages/dask/dataframe/rolling.py", line 29, in overlap_chunk
    raise NotImplementedError(msg)


Process finished with exit code 1

Can anyone explain whether this is a bug or whether I did something wrong? If it is the latter, what did I do wrong and how can I fix it?

jmcarpenter2 (Owner) commented

Per the Dask GitHub issue, this error comes down to how the partition size is specified relative to the window size. Please read the discussion on that page for more information. As this is not an issue with swifter, I will close this issue now.
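For anyone landing here with the same error, a rough sketch of the constraint may help. It assumes that a rolling window of `window` rows needs each partition to hold at least `window - 1` rows, and that rows are spread roughly evenly across partitions; both are assumptions about Dask's behavior rather than documented guarantees. The helper name `max_safe_npartitions` is hypothetical and is not part of swifter or Dask.

```python
def max_safe_npartitions(n_rows: int, window: int) -> int:
    """Conservative upper bound on the partition count for a rolling window.

    Assumption (not taken from the Dask docs): every partition must hold at
    least ``window - 1`` rows so the overlap borrowed from the previous
    partition is complete, and rows are split roughly evenly.
    """
    if n_rows < 1 or window < 1:
        raise ValueError("n_rows and window must be positive integers")
    overlap = window - 1
    if overlap == 0:
        # A window of one row needs no overlap, so any split works.
        return n_rows
    # Largest p such that n_rows // p >= overlap; never fewer than 1.
    return max(1, n_rows // overlap)


# Hypothetical figures: 1000 rows of 15-minute candles, a 96-candle window.
safe = max_safe_npartitions(n_rows=1000, window=96)
print(safe)
```

Under these assumptions, the result could be passed as `npartitions` to `dd.from_pandas` when calling Dask directly. If the bound comes out as 1, the data is too small relative to the window for partitioning to help, and a plain pandas `rolling(...).apply(...)` is likely the simpler choice.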
