
DuplicatePreparedStatementError not solved for pgbouncer behind HA setup #239

Open
danigosa opened this issue Jan 1, 2018 · 4 comments

Comments

danigosa commented Jan 1, 2018

  • asyncpg version: 0.14.0
  • PostgreSQL version: 9.6.6
  • Python version: 3.6.4
  • Platform: Alpine Linux
  • Do you use pgbouncer?: Yes, in pool_mode=session
  • Did you install asyncpg with pip?: Yes
  • If you built asyncpg locally, which version of Cython did you use?: 0.27.3
  • Can the issue be reproduced under both asyncio and uvloop?: I don't think so

The problem persists in a cluster setup (stolon-ha on Kubernetes). This is the configuration:

  • 5 nodes (1 master, 4 slaves as hot replicas)
  • 2 proxies: one always pointing at the elected master, one distributing reads across all 5 nodes
  • When we tried to use the asyncpg pool directly against the Kubernetes ClusterIP proxies, it was very unreliable, mostly because asyncpg does not sanitize, heartbeat, or reconnect to those proxies; about 25% of queries failed with connection errors of all types (stopped in the middle of an operation, connection refused, connection does not exist, etc.)
  • So we put 2 pgbouncer instances in front, just to keep the pools healthy, with an aggressive configuration for connection checks, sanitization and reconnecting. That wiped out 99% of the connection problems.
  • The problem arises when using cursors (statement_cache_size=0, pgbouncer pool_mode=session); it does not happen with a single deployed instance

My guess: the proxies (above all the round-robin one over the slaves) look like a "single connection" that round-robins over TCP to all the nodes. I don't know exactly what is going on, but this error started to appear in about 25% of read queries, which is pretty annoying; all other environments with a single node and a single-server pgbouncer work fine. Should there be a more robust statement name generation so the names have a real random component? That would avoid two processes clashing on the same name (even if, under ideal conditions, this should not be a problem because you can detect it). I think random statement names would solve most of these issues: a scheme like asyncpg_stmt_N_XXXX would prevent two asyncpg_stmt_8 statements from clashing in these edge cases.
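
For illustration, here is a minimal sketch of the kind of naming I mean (the _gen_stmt_name helper is hypothetical, not asyncpg's actual internals):

    import secrets

    def _gen_stmt_name(counter: int) -> str:
        # Hypothetical scheme: keep the sequential counter for readability,
        # but append a random suffix so two clients multiplexed through the
        # same proxy cannot both end up holding e.g. "__asyncpg_stmt_8__".
        return '__asyncpg_stmt_{}_{}__'.format(counter, secrets.token_hex(4))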

[23:46:06][ERROR] robbie.http.microservice.middleware.errors api.py:__call__:242 | ('Error during SQL operation: prepared statement "__asyncpg_stmt_8__" already exists\nHINT:  \nNOTE: pgbouncer with pool_mode set to "transaction" or\n"statement" does not support prepared statements properly.\nYou have two options:\n\n* if you are using pgbouncer for connection pooling to a\n  single server, switch to the connection pool functionality\n  provided by asyncpg, it is a much better option for this\n  purpose;\n\n* if you have no option of avoiding the use of pgbouncer,\n  then you must switch pgbouncer\'s pool_mode to "session".\n', DuplicatePreparedStatementError('prepared statement "__asyncpg_stmt_8__" already exists',))
Traceback (most recent call last):
  File "robbie/sql/postgresql/utils.py", line 67, in robbie.sql.postgresql.utils.sql_exceptions_handler_asyncgen.wrapper
  File "robbie/sql/postgresql/client.py", line 112, in fetch
  File "robbie/sql/postgresql/client.py", line 113, in robbie.sql.postgresql.client.PgSQLClient.fetch
  File "robbie/sql/postgresql/client.py", line 114, in robbie.sql.postgresql.client.PgSQLClient.fetch
  File "/usr/local/lib/python3.6/site-packages/asyncpg/cursor.py", line 174, in __anext__
    self._query, self._timeout, named=True)
  File "/usr/local/lib/python3.6/site-packages/asyncpg/connection.py", line 286, in _get_statement
    statement = await self._protocol.prepare(stmt_name, query, timeout)
  File "asyncpg/protocol/protocol.pyx", line 168, in prepare
asyncpg.exceptions.DuplicatePreparedStatementError: prepared statement "__asyncpg_stmt_8__" already exists
HINT:
NOTE: pgbouncer with pool_mode set to "transaction" or
"statement" does not support prepared statements properly.
You have two options:

* if you are using pgbouncer for connection pooling to a
  single server, switch to the connection pool functionality
  provided by asyncpg, it is a much better option for this
  purpose;

* if you have no option of avoiding the use of pgbouncer,
  then you must switch pgbouncer's pool_mode to "session".

danigosa commented Jan 1, 2018

Another option would be to do cursor.fetch() without prepared statements. Is that possible?
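
For context, this is roughly how we use cursors today with the cache disabled (a sketch; statement_cache_size is the real asyncpg connect parameter, the DSN and table name are placeholders):

    import asyncio
    import asyncpg

    async def main():
        # statement_cache_size=0 disables the cache, but asyncpg still
        # creates *named* prepared statements for cursors internally
        # (see named=True in the traceback above), which is where the
        # duplicate-name error comes from.
        conn = await asyncpg.connect(
            'postgresql://user@pgbouncer-host/db',
            statement_cache_size=0)
        async with conn.transaction():
            # cursors must be used inside a transaction
            async for row in conn.cursor('SELECT * FROM some_table'):
                print(row)
        await conn.close()

    asyncio.get_event_loop().run_until_complete(main())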

elprans commented Jan 1, 2018

Prepared statements are per-session, so theoretically there should not be a collision when pgbouncer is in session mode.
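
For what it's worth, the per-session scoping is easy to demonstrate against a plain Postgres server (a sketch; the DSN is a placeholder):

    import asyncio
    import asyncpg

    async def main():
        # Two independent sessions against the same server.
        conn1 = await asyncpg.connect('postgresql://user@host/db')
        conn2 = await asyncpg.connect('postgresql://user@host/db')

        # The statement name is scoped to the session that PREPAREd it...
        await conn1.execute('PREPARE stmt_demo AS SELECT 1')
        # ...so the same name is still free in the second session. A
        # collision can only happen if two clients end up sharing one
        # server session, which session-mode pgbouncer should prevent.
        await conn2.execute('PREPARE stmt_demo AS SELECT 1')

        await conn1.close()
        await conn2.close()

    asyncio.get_event_loop().run_until_complete(main())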

Is there an easy way to replicate your setup?

danigosa commented Jan 2, 2018

Unfortunately I cannot think of an easy way to replicate it. The architecture is very tied to Kubernetes, so to get something similar you have to set up a local K8s cluster (minikube, etc.); then installing this chart should do the full setup and create the proxies:
https://github.com/lwolf/stolon-chart

Besides that, the scenario we are facing is a pgbouncer instance targeting a TCP proxy balancer in front of the 5 nodes for reads, through a single IP. Pgbouncer has no reset-query parameters set other than the defaults. We are investigating internally the fact that pgbouncer reports more client connections/sessions than real server connections/sessions; this may be the cause.
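
For reference, these are the kinds of pgbouncer knobs involved (values are illustrative, not our exact config; in session mode pgbouncer's default server_reset_query is DISCARD ALL, which drops prepared statements whenever a server connection is reassigned to another client):

    [pgbouncer]
    pool_mode = session
    ; default in session mode; clears prepared statements between clients
    server_reset_query = DISCARD ALL
    ; aggressive health checking, as described above
    server_check_query = select 1
    server_check_delay = 10
    server_lifetime = 300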

elprans commented Jun 6, 2018

a pgbouncer instance targeting a TCP proxy balancer in front of the 5 nodes

So you have another balancer behind pgbouncer?
