-
-
Notifications
You must be signed in to change notification settings - Fork 33.1k
Fixed #35629 -- Added support for async database connections and cursors. #18408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ed91e56
to
83099c9
Compare
3a3fc57
to
e6bbcd1
Compare
1b81214
to
b42cc45
Compare
b42cc45
to
86158fb
Compare
86158fb
to
3be837e
Compare
3be837e
to
288b3c9
Compare
288b3c9
to
df58a82
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sufficiently familiar with the internals of the ORM/database layer to comment on much of the internal logic here, but the general shape of this code and approach seems to make sense to me
(left a few questions, nothing terribly blocking)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps out of scope for this PR, but I don't think it's clear where this Cursor
comes from. I think it's defined in psycopg3
, but I'm not confident in that.
This should also be clarified in the ServerSideCursor
docstring above too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
anything under Database.
comes from psycopg2
or psycopg3
.
In this specific case, ClientCursonMixin
and AsyncClientCursor
come from psycopg3
ServerSideCursor
Is our custom class that derives from postgres' ClientCursorMixin
and AsyncServerCursor
/ServerCursor
django/db/backends/utils.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might this need to be async?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked across all backends, and last_executed_query
is a non-blocking method that basically only does string interpolation / decoding of a string that's already stored on the DBAPI cursor.
259d10c
to
53fae35
Compare
efafc58
to
49141bb
Compare
126f15a
to
fd3053b
Compare
|
||
return self.conn | ||
|
||
async def __aexit__(self, exc_type, exc_value, traceback): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I’ve found an issue with an async context manager — everything works fine until an asyncio.CancelledError
is raised.
Here's a short example to reproduce it.
import asyncio
class AsyncResource:
async def __aenter__(self):
print("Enter context")
return self
async def __aexit__(self, exc_type, exc, tb):
print("Exiting context: start")
await asyncio.sleep(2)
print("Exiting context: end")
async def task_that_uses_resource():
async with AsyncResource():
await asyncio.sleep(1)
async def main():
task = asyncio.create_task(task_that_uses_resource())
# Let the task run a bit and then cancel it
await asyncio.sleep(1.5)
print("Cancelling the task...")
task.cancel()
try:
await task
except asyncio.CancelledError:
print("Task was cancelled!")
asyncio.run(main())
results
Enter context
Exiting context: start
Cancelling the task...
Task was cancelled!
It’s clear why we don’t see Exiting context: end
, but I’m not sure how to fix it quickly. This problem can lead to a leak of connections and incorrect nesting of connections due to pop_connection
.
I'm trying to do something like that
# use sync context manager instead
@contextmanager
def independent_connection(self):
active_connection = self.all(initialized_only=True)
try:
for conn in active_connection:
del self[conn.alias]
yield
finally:
# run async code in a separate coroutine without await
asyncio.create_task(self.close_all())
for conn in active_connection:
self[conn.alias] = conn
But it still doesn't cover every case. @fcurella, any thoughts on how we can to tackle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if shielding the exit logic would fix it? something like:
from asyncio import shield
class new_connection:
...
async def __aexit__(self, exc_type, exc_value, traceback):
await shield(self._aexit(exc_type, exc_value, traceback))
async def _aexit(self, exc_type, exc_value, traceback):
autocommit = await self.conn.aget_autocommit()
if autocommit is False:
if exc_type is None and self.force_rollback is False:
await self.conn.acommit()
else:
await self.conn.arollback()
await self.conn.aclose()
await async_connections.pop_connection(self.using)
Could you write a test case to expose the bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure. Shielding seems to help with closing the connection, but I'm concerned it might not fix issues with broken nested ordering caused by pop_connection.
I'll try to write a unit test when I have some free time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import asyncio
from django.db import new_connection, async_connections, DEFAULT_DB_ALIAS
from django.test import TransactionTestCase, skipUnlessDBFeature
@skipUnlessDBFeature("supports_async")
class NestedConnectionTest(TransactionTestCase):
available_apps = ["async"]
async def test_nesting_error1(self):
async def aget_autocommit_new():
raise asyncio.CancelledError
async with new_connection() as conn1:
assert async_connections.count == 1
main_connection = async_connections.get_connection(DEFAULT_DB_ALIAS)
aget_autocommit_old = conn1.aget_autocommit
try:
async with new_connection() as conn2:
assert async_connections.count == 2
conn2.aget_autocommit = aget_autocommit_new
except:
pass
conn1.aget_autocommit = aget_autocommit_old
assert main_connection == async_connections.get_connection(DEFAULT_DB_ALIAS)
assert async_connections.count == 1
This test contains monkey_patch, but it's clearly demonstrating the root of the problems. Is it helpful for you?
======================================================================
FAIL: test_nesting_error1 (async.test_async_connections.NestedConnectionTest.test_nesting_error1)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/m.havelya/work/django/env/lib/python3.13/site-packages/asgiref/sync.py", line 325, in __call__
return call_result.result()
~~~~~~~~~~~~~~~~~~^^
File "/opt/homebrew/Cellar/[email protected]/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
~~~~~~~~~~~~~~~~~^^
File "/opt/homebrew/Cellar/[email protected]/3.13.1/Frameworks/Python.framework/Versions/3.13/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/Users/m.havelya/work/django/env/lib/python3.13/site-packages/asgiref/sync.py", line 365, in main_wrap
result = await awaitable
^^^^^^^^^^^^^^^
File "/Users/m.havelya/work/django/tests/async/test_async_connections.py", line 31, in test_nesting_error1
assert main_connection == async_connections.get_connection(DEFAULT_DB_ALIAS)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, that's really helpful!
We could wrap the __exit__
in a try ... finally
block:
async def __aexit__(self, exc_type, exc_value, traceback):
try:
autocommit = await self.conn.aget_autocommit()
if autocommit is False:
if exc_type is None and self.force_rollback is False:
await self.conn.acommit()
else:
await self.conn.arollback()
finally:
await self.conn.aclose()
await async_connections.pop_connection(self.using)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can get this problem during aclose
call as well
When this is finalized and merged to master will it also be added to the subsequent Django 5.2.x release? |
Dear django team, what's your decision on this? Kindly reminder, a lot of people waiting for low level async db connection |
@dima-kov I’ve been swamped at work and couldn’t dedicate much time to it. I would definitely appreciate some help! |
@dima-kov I think the way forward here is to pull this work into a third-party backend so we can all get trying it out. This is on my list but I, too, haven't had spare capacity, yet. |
Co-authored-by: Mykhailo Havelia <[email protected]>
fd3053b
to
2f3dbce
Compare
20104d6
to
da20298
Compare
Right now, it’s just a cursor implementation. To test it properly in a real project, we’ll need the ORM parts (transactions, get, update, delete, etc.), otherwise, it doesn’t make much sense. 😌 We already have contributors willing to finish it up, but it will need input from the core team for validation and review. @carltongibson would it make sense to form a small group with 3-4 maintainers familiar with the ORM, together with contributors (django forum, discord, etc)? That way, if one person is busy, others could still provide guidance and keep things moving. What do you think? |
@Arfey yep, I think coordinating effort is good. (I've wanted to pick this up, but haven't had capacity. I'm totally happy for others to do so.) My concern here is that it needs to be properly tested in production prior to merging. That means we need to give people an easy way to add it to their existing projects. Which means we need to extract it to a package. (I don't see another way there. Maybe there is one.) There are guaranteed to be weird deadlocking issues (and similar) that folks run into when they start using this, and we need to fix issues and document the usage patterns, so that it's received well. |
@carltongibson Got it. I'll put together a brief overview of this ticket for the Discord chat. Hopefully, we can form a working group 😌
I've been thinking about it. It's a really tricky part. Even an async connection required significant changes within Django. Extracting it into a separate extension would involve a lot of copy-pasting, and after testing it in production, merging it back into Django could be risky and might introduce bugs. I see two options:
|
I don't really see this as a problem. Yes, there'll be some duplication. (But there's lots here that wouldn't be needed in a proof of concept IIRC. E.g. can we subclass the Postgres backend just to add the new api? Etc. It doesn't seem intractable.) If we put it in an external package we can ask people to pick it up as soon as it's available. (Rather than waiting until Sept 26 for Django 6.1.) We can then test and fix issues on a much quicker cycle than waiting for changes in Django. I'm pretty sure we can correctly merge again once the edge cases are worked out. (We'll then be doing so knowing we have a battle-tested approach.) (Just personally, I could easily install and define an alias to try this on the work project, but to wait for Django 6.1 is going to be an age.) |
@carltongibson, probably I'm overthinking too much, excuse me 😌 Maybe you’re right. There are a lot of places that we can’t reuse from Django without rewriting the entire code base. A quick example is async def aget_new_connection(self, conn_params):
# copy code ...
if self.apool:
await self.apool.open()
connection = await self.apool.getconn()
else:
connection = await self.Database.AsyncConnection.connect(**conn_params)
# copy code ...
return connection Very often, we can’t reuse code by extending a class. So we will have a lot of duplication. We need to avoid any changes inside duplication code and sync Django’s changes with our extension. I think it’s tricky 🙂 Anyway, your proposal has a lot of benefits (especially when we are talking about speed). Let’s try. I’m gonna prepare POC. Is it possible to add to Django yield statements to the DB backend class? from this def _close(self):
if self.connection is not None:
with self.wrap_database_errors:
self.connection.close() to this @sync_async_method_adapter # <---- sync_async_method_adapter
def _close(self):
if self.connection is not None:
with self.wrap_database_errors:
yield self.connection.close() # <---- yield django/django/utils/sync_async.py Lines 25 to 76 in 5235448
|
Super! 🏅 Just to be totally clear, I really want us to progress this feature. It's the last main bit of the async story. They'll be more to (refinements) follow but if we had this in place, we'd have a whole DB-to-wire story folks could use.
Gotcha. It's still the case that pulling them out (as duplicates) lets us get going, and we can say "Can we merge this small part here?", which is easier to review and check for correctness. I know this all feels like more work. But I've seen subtle problems with async come up again and again over these last number of years — the discussion about cancellation above is exactly of that kind — and I can't see a way for us to be really confident that we're have a solid (Django Quality™) approach without getting people to put it under stress in production and then report back. I hope that makes sense. |
Agree. Just to confirm. Am I right that changing |
🤷 :) I haven't had a chance to sit down with #19576 and really think it through. Maybe the In general, I think "Here's the simplest thing we can do" is going to be easier to get moving. |
Understood. I'll create a simple MR adding |
I might be missing something, but doesn't creating an external package feel like unnecessary overhead? Why not just publish it as a separate Git tag - e.g. |
@Arfey even if such a patch is accepted immediately (which is unlikely) it wouldn't be available until Django 6.1 next September. The whole point here of our discussion is to get people being able to use async cursors now. I was talking about this with the other Steering Council members today. We agreed this needs to be tested in production before we can merge it. (The example of being shown to handle high numbers of connections in production without causing issues was part of the discussion.) We can't get there without an external implementation first, as a pathway to getting this resolved. |
@carltongibson got it. I'll prepare a fully separated version.
Could u share some thoughts about testing it in production? Let's say we have a ready external implementation of an asynchronous cursor. What are my next steps? Do I need to install it and rewrite some parts of my project using raw SQL and execute it with an asynchronous cursor? |
Trac ticket number
ticket-35629
Branch description
This would cover the new new_connection context managers, and provide an async cursor that the user could use.
The goal of this first phase is to give users a low-level async cursor that they can use for raw SQL queries
Manual transaction management, such as
acommit
andarollback
is in scope, buttransaction.atomic
is out of scope and will be addressed in a future Pull Request.Checklist
main
branch.