
Conversation

Contributor

@adzshaf adzshaf commented Aug 17, 2024

Trac ticket number

ticket-32519

Branch description

This PR introduces support for JSONField key transforms in update() so we can do partial updates of a JSONField more efficiently. Consider the following example of updating a JSONField.

>>> user_preferences = UserPreferences.objects.create(
...     settings={
...         "theme": "dark",
...         "notifications": True,
...         "font": {"size": 10, "name": "Arial"},
...     }
... )
>>> UserPreferences.objects.update(
...     settings__theme="light",
...     settings__font__size=20,
... )
>>> user_preferences = UserPreferences.objects.get(pk=user_preferences.pk)
>>> print(user_preferences.settings)
{'theme': 'light', 'notifications': True, 'font': {'size': 20, 'name': 'Arial'}}

You can also remove a key by passing NOT_PROVIDED as the value.

>>> from django.db.models import NOT_PROVIDED
>>> user_preferences = UserPreferences.objects.create(
...     settings={
...         "theme": "dark",
...         "notifications": True,
...         "font": {"size": 10, "name": "Arial"},
...     }
... )
>>> UserPreferences.objects.update(
...     settings__theme=NOT_PROVIDED,
...     settings__font__size=20,
... )
>>> user_preferences = UserPreferences.objects.get(pk=user_preferences.pk)
>>> print(user_preferences.settings)
{'notifications': True, 'font': {'size': 20, 'name': 'Arial'}}

This feature is implemented using the newly added database functions JSONSet and JSONRemove, which serve as abstractions on top of the suitable functions on each database backend.

This PR is a part of the Google Summer of Code program.

Checklist

  • This PR targets the main branch.
  • The commit message is written in past tense, mentions the ticket number, and ends with a period.
  • I have checked the "Has patch" ticket flag in the Trac system.
  • I have added or updated relevant tests.
  • I have added or updated relevant docs, including release notes if applicable.
  • I have attached screenshots in both light and dark modes for any UI changes.

Contributor

@laymonage laymonage left a comment


Really great work here! I haven't fully reviewed this, but could we squash the commit history, then break it down into smaller commits to make it easier for others to review?

I can see this broken down into at least these commits:

  1. Adding JSONField handling for Cast() on SQLite, and the corresponding tests.
  2. Extracting KeyTransform.unwrap_transforms() from KeyTransform.preprocess_lhs()
  3. JSONSet + tests
  4. JSONRemove + tests
  5. Changes to UpdateQuery.add_update_values() in subqueries.py and addition of Transform.get_update_expression() in lookups.py + tests for the NotImplementedError and FieldError
  6. Addition of KeyTransform.get_update_expression() + tests for using key transforms in .update()

Thanks!

Contributor

This seems correct according to https://apex.oracle.com/database-features/

However, the function seems to be documented for 19c as well: https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/json_transform.html

According to https://stackoverflow.com/a/75286939 it was backported to 19c in 19.10. Could we update this and see if it works? According to https://code.djangoproject.com/wiki/CI we use v19.3.0.0.0 on the CI though, so that may need to be upgraded, or we can test it ourselves.

Contributor Author

We agreed on following the https://apex.oracle.com/database-features/ website.

Comment on lines +40 to +44
Contributor

Could we extract the changes in Cast(), along with the corresponding test, into a separate commit please?

Contributor Author

Done ✔️

Comment on lines 100 to 107
Contributor

This logic seems similar to the one that already exists in Query.try_transform(). We've discussed this before and encountered a few issues in doing so, but I still wonder if it's possible to reuse that method. I'll do some experiments later.

Contributor

Could we extract the changes in this file and the Transform.get_update_expression() method in lookups.py, along with the tests (albeit only testing for the NotImplementedError and FieldError) into a separate commit please?

Contributor Author

Done ✔️

Contributor

Let's create a separate commit for JSONSet+tests, and another commit for JSONRemove+tests. Then, create a separate commit for adding KeyTransform.get_update_expression() and the corresponding tests.

Contributor Author

Done ✔️

@adzshaf adzshaf force-pushed the ticket_32519 branch 3 times, most recently from 6eb6c8f to 7200b96 on September 14, 2024 at 15:39
Contributor

@laymonage laymonage left a comment


Thanks!

@adzshaf and I have been iterating on this together for quite a while. I have yet to fully review the tests and docs, but in general we're pretty happy with the implementation's approach.

There are bits I'm still not sure about, mostly around whether the update expression for e.g. .update(json_field__key2__key2a=123) should be implemented in the KeyTransform class vs. the JSONField itself. There are implications toward future extensibility for other transforms, so I'd like to get the community's opinions on this.

I've left some comments to add a bit of context that hopefully helps explain the intentions.

Comment on lines +40 to +44
elif output_type == "JSONField":
    template = "JSON(%(expressions)s)"
    return super().as_sql(
        compiler, connection, template=template, **extra_context
    )
Contributor

Reason why we need this:

https://www.sqlite.org/json1.html#jset

If the value of a path/value pair is an SQLite TEXT value, then it is normally inserted as a quoted JSON string, even if the string looks like valid JSON. However, if the value is the result of another json function (such as json() or json_array() or json_object()) or if it is the result of the -> operator, then it is interpreted as JSON and is inserted as JSON retaining all of its substructure. Values that are the result of the ->> operator are always interpreted as TEXT and are inserted as a JSON string even if they look like valid JSON.

It's very similar to the other workarounds in as_mysql for MariaDB and in as_oracle. Instead of putting the workarounds directly in Cast.as_vendor(), do we want to introduce a new database operation specifically for casting JSONField instead? We have something similar for the datetime transforms:

def datetime_cast_date_sql(self, sql, params, tzname):
    """
    Return the SQL to cast a datetime value to date value.
    """
    raise NotImplementedError(
        "subclasses of BaseDatabaseOperations may require a "
        "datetime_cast_date_sql() method."
    )

def datetime_cast_time_sql(self, sql, params, tzname):
    """
    Return the SQL to cast a datetime value to time value.
    """
    raise NotImplementedError(
        "subclasses of BaseDatabaseOperations may require a "
        "datetime_cast_time_sql() method"
    )
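A backend-specific operation along those lines might look like the sketch below. The `json_cast_sql` name and signature are assumptions for illustration only, not existing Django API:

```python
class BaseDatabaseOperations:
    def json_cast_sql(self, sql):
        """Return SQL that makes the backend interpret `sql` as JSON."""
        return sql  # most backends need no extra wrapping


class DatabaseOperations(BaseDatabaseOperations):  # hypothetical SQLite override
    def json_cast_sql(self, sql):
        # Wrap in json() so JSON_SET() retains the value's substructure
        # instead of inserting it as a quoted JSON string.
        return "JSON(%s)" % sql


print(DatabaseOperations().json_cast_sql("%(expressions)s"))  # JSON(%(expressions)s)
```

This would let Cast() stay vendor-agnostic and delegate the JSON-specific workaround to each backend's operations class, mirroring how the datetime casts are handled.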

Comment on lines +51 to +56
# Use Value to serialize the data to string,
# then use Cast to ensure the string is treated as JSON.
value = Cast(
    Value(value, output_field=self.output_field),
    output_field=self.output_field,
)
Contributor

Value() with JSONField() would allow us to reuse its get_db_prep_value() which does the serialization of Python objects based on the DB vendor. Then, we wrap it in Cast() because this value will be used as a parameter to the DB's JSON_SET function, and some DBs (e.g. SQLite) will treat the value as a plain SQL string (even if it's valid JSON) if we don't use Cast.

This is unlike saving values directly to the column, in which case the DBs will happily accept a JSON-formatted string and do any conversion to the actual JSON data type (if they have one).

Comment on lines +94 to +99
class ToJSONB(Func):
    function = "TO_JSONB"

new_source_expressions.append(
    ToJSONB(value, output_field=self.output_field),
)
Contributor

Casting to JSONB with ::jsonb on Postgres only works for strings (text) that are JSON-formatted. If you have an expression that resolves to a non-string SQL value, e.g. when you refer to an integer column with F(), using Cast() doesn't work. Postgres has a dedicated to_jsonb function for this: https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-JSON-CREATION-TABLE

Comment on lines +90 to +91
if not hasattr(value, "resolve_expression"):
    new_source_expressions.append(Value(value, output_field=self.output_field))
Contributor

We don't need Cast() because Value() with a JSONField is correctly handled as JSONB by Postgres.

Comment on lines +136 to +141
if not hasattr(value, "resolve_expression"):
    # Interpolate the JSON path directly to the query string, because
    # Oracle does not support passing the JSON path using parameter
    # binding.
    return f"{args[0]}, SET '{key_paths_join}' = {args[-1]} FORMAT JSON"
return f"{args[0]}, SET '{key_paths_join}' = {args[-1]}"
Contributor

Interpolating the JSON path directly is nothing new:

# Add paths directly into SQL because path expressions cannot be passed
# as bind variables on Oracle.
return sql % tuple(params), []

Though I think we should move the comment to be outside the if, because this applies to both cases.

Suggested change
if not hasattr(value, "resolve_expression"):
    # Interpolate the JSON path directly to the query string, because
    # Oracle does not support passing the JSON path using parameter
    # binding.
    return f"{args[0]}, SET '{key_paths_join}' = {args[-1]} FORMAT JSON"
return f"{args[0]}, SET '{key_paths_join}' = {args[-1]}"
# Interpolate the JSON path directly to the query string, because
# Oracle does not support passing the JSON path using parameter
# binding.
if not hasattr(value, "resolve_expression"):
    return f"{args[0]}, SET '{key_paths_join}' = {args[-1]} FORMAT JSON"
return f"{args[0]}, SET '{key_paths_join}' = {args[-1]}"

Using FORMAT JSON is similar to using Cast() in other DBs but this is specific to Oracle's JSON_TRANSFORM.

https://docs.oracle.com/en/database/oracle/oracle-database/21/adjsn/oracle-sql-function-json_transform.html

If the result expression evaluates to a SQL value that is not JSON type, you can convert it to JSON data by following the expression immediately with keywords FORMAT JSON. This is particularly useful to convert the SQL string 'true' or 'false' to the corresponding JSON-language value true or false. Example 11-7 illustrates this.

self.assertEqual(related.name, "new")


class TransformUpdateTests(TestCase):
Contributor

At the moment, these tests are not particularly useful for real-life cases. However, they ensure that if you try to use update() with transforms that do not support it yet, a helpful error message is raised (instead of some unexpected exception/undefined behaviour).

def get_update_expression(self, value, lhs=None):
    from ..functions import JSONRemove, JSONSet

    field, key_transforms = self.unwrap_transforms()
Contributor

There's a bit of a round trip around the transforms, so let me explain the flow.

  • We do MyModel.objects.update(json_field__key1="foo", json_field__key2__key2a=123)
  • QuerySet.update() will construct an UpdateQuery which calls add_update_values() to prepare the expressions for each field that we are updating.
  • In add_update_values(), we'll split the kwargs using LOOKUP_SEP.
  • Then, we ensure each name in the resulting split is a valid transform by doing a get_transform() on the field with each string to get the transform class. If it is found, we'll construct the transform. In JSONField's case, it will always give you KeyTransform (because keys can be any string).
    • If the transform is chained, we keep on building the transforms by composing them, e.g. KeyTransform(KeyTransform(...)). For other fields, this may not be of the same class, e.g. TransformA(TransformB(...)). There's no concrete example for this right now, though.
    • After constructing the final Transform instance (the last one from the result of .split(LOOKUP_SEP)), we call its get_update_expression() method.
  • In KeyTransform, we use unwrap_transforms() so that we can "unwrap" the transform composition e.g. KeyTransform("key2a", KeyTransform("key2", "json_field")) back into field, ["key2", "key2a"].
    • Then, we join the keys with LOOKUP_SEP again, so we can pass it to JSONSet()/JSONRemove() as JSONSet(field, key2__key2a=value).
  • Then JSONSet()/JSONRemove() is going to – yet again – split the kwargs with LOOKUP_SEP, because it needs to build the final JSON path string e.g. $.key2.key2a.
    • We can probably optimize the last bits by introducing a classmethod on JSONSet/JSONRemove that accepts a pair of a KeyTransform instance and the value...

Alternatively, we could also make it so that the field itself, rather than the transform, defines the expression. This lets us skip the round trip of splitting/joining on LOOKUP_SEP entirely.

For example, if this get_update_expression is implemented in JSONField (instead of KeyTransform), we could just pass the kwargs from .update() directly to the JSONSet() and JSONRemove() functions. This skips KeyTransform entirely.

This approach will result in a much more straightforward code flow. However, it would mean that any use of LOOKUP_SEP in update() is handled entirely by the field itself, rather than going through the transforms mechanism like in .filter().

I think, to make this extensible and more consistent with .filter(), we should do the transforms approach, even if it means doing a bit of round trips in the case of KeyTransform.
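The unwrap/re-join round trip described above can be sketched in plain Python. These are toy classes purely for illustration; the real KeyTransform lives in django.db.models.fields.json and operates on resolved expressions:

```python
LOOKUP_SEP = "__"  # as in django.db.models.constants


class KeyTransform:
    """Toy model of the KeyTransform composition described above."""

    def __init__(self, key, lhs):
        self.key, self.lhs = key, lhs

    def unwrap_transforms(self):
        # Walk down the nested transforms to recover the field name and
        # the ordered list of keys.
        keys = [self.key]
        lhs = self.lhs
        while isinstance(lhs, KeyTransform):
            keys.insert(0, lhs.key)
            lhs = lhs.lhs
        return lhs, keys


# KeyTransform("key2a", KeyTransform("key2", "json_field")) unwraps to
# ("json_field", ["key2", "key2a"]), which is then re-joined so it can be
# passed along as JSONSet(field, key2__key2a=value).
t = KeyTransform("key2a", KeyTransform("key2", "json_field"))
field, keys = t.unwrap_transforms()
print(field, LOOKUP_SEP.join(keys))  # json_field key2__key2a
```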

values_seq = []
field_dict = dict()
for name, val in values.items():
    name, *transforms = name.split(LOOKUP_SEP)
Contributor

This method is used by QuerySet.update(), as it constructs an UpdateQuery to build the SQL UPDATE statement. Previously, Django never split the kwargs on LOOKUP_SEP (unlike in .filter()) because LOOKUP_SEP is normally used for:

  • referring to a field in a related model
  • doing lookups and transforms

and both of these are currently unsupported in update().

AFAIK, .update() can only update the table of the model itself, and not the table of a related model (because a single SQL UPDATE statement can only update one table) – so this is not something we need to consider.

Lookups only make sense in the context of making conditionals, e.g. gte, startswith. It doesn't make sense to use them in the context of update(), e.g. what does it mean to do MyModel.objects.update(some_integerfield__gte=42)?

For transforms, they make sense in both cases. You can already do MyModel.objects.filter(some_datefield__year=2023). Adding support for something like .update(some_datefield__year=2024) doesn't seem too far-fetched.

So, in here, we add .split(LOOKUP_SEP) and assume that everything after the first element is a transform.
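Concretely, the splitting behaves like this (a minimal sketch of the kwargs handling, not the actual add_update_values() body):

```python
LOOKUP_SEP = "__"  # as defined in django.db.models.constants

# The first element names the field; the rest are assumed to be transforms.
name, *transforms = "settings__font__size".split(LOOKUP_SEP)
print(name, transforms)  # settings ['font', 'size']
```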



@skipUnlessDBFeature("supports_partial_json_update")
class JSONSetTests(TestCase):
Contributor

I think many of the test methods here can be combined into a single method with subTest.
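A combined test along those lines might look like the sketch below. The set_key helper is a toy stand-in so the sketch runs without a database; the real tests would exercise JSONSet through the ORM:

```python
import unittest


def set_key(data, key, value):
    # Toy stand-in for a top-level JSONSet, kept pure Python so the
    # sketch is runnable standalone.
    return {**data, key: value}


class JSONSetSubTests(unittest.TestCase):
    def test_set_values(self):
        cases = [
            ({"theme": "dark"}, "theme", "light", {"theme": "light"}),
            ({"a": 1}, "b", 2, {"a": 1, "b": 2}),
        ]
        for initial, key, value, expected in cases:
            # subTest reports each failing case individually instead of
            # stopping the whole method at the first failure.
            with self.subTest(key=key, value=value):
                self.assertEqual(set_key(initial, key, value), expected)
```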


# It should append the value to the array if the database supports it.
if not connection.features.json_set_array_append_requires_length_as_index:
    if connection.vendor == "oracle":
Contributor

What happens here is that we're testing the behaviour of JSON_SET when updating an array with an index that's out-of-range. For example, with JSON {"items": ["a","b","c"]}.

If we do JSONSet(field, items__3='d'), this results in {"items": ["a","b","c","d"]} on all databases.

If we do JSONSet(field, items__4='d'), however, this gives different results:

  • On SQLite, {"items": ["a","b","c"]}. The append does not happen.
  • On Oracle, {"items": ["a","b","c",null,"d"]}. It fills out the "gap" with null, before appending "d" to the correct index.
  • On other databases, {"items": ["a","b","c","d"]}. It just appends to the last index of the array.

We weren't sure if this is worth adding another feature flag, so we kind of combined Oracle and SQLite's behaviour into a single json_set_array_append_requires_length_as_index flag, with the base assumption being that the append does not happen, and Oracle being the outlier.

Perhaps we should only do the full array assertion inside a vendor check for both oracle and sqlite. Then, add a general assertion that field["items"][3] does not equal "d" – this is true for both Oracle and SQLite, and we still leave room for external DB backends which may also have the flag set to true but whose exact behaviour we don't know.
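The three out-of-range behaviours described above can be modelled in plain Python. This is a toy simulation of the documented results, not actual backend code:

```python
def json_set_index(arr, index, value, vendor):
    """Toy simulation of JSON_SET on an array index, per the behaviours above."""
    arr = list(arr)
    if index < len(arr):
        arr[index] = value  # in-range: plain replacement on every backend
    elif vendor == "sqlite":
        pass  # out-of-range set is silently a no-op
    elif vendor == "oracle":
        arr.extend([None] * (index - len(arr)))  # fill the gap with null
        arr.append(value)
    else:
        arr.append(value)  # other backends append at the end of the array
    return arr


assert json_set_index(["a", "b", "c"], 4, "d", "sqlite") == ["a", "b", "c"]
assert json_set_index(["a", "b", "c"], 4, "d", "oracle") == ["a", "b", "c", None, "d"]
assert json_set_index(["a", "b", "c"], 4, "d", "postgresql") == ["a", "b", "c", "d"]
```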

@laymonage
Contributor

@bmispelon :shipit:

Contributor

@sarahboyce sarahboyce left a comment


Thank you @adzshaf 🚀 this looks really strong

I had a good catch up with @laymonage so I am a little up to speed

I have added a couple of notes 👍


Updating ``JSONField``
======================

Contributor

Suggested change
.. versionadded:: 5.2


``JSONRemove``
--------------

Contributor

Suggested change
.. versionadded:: 5.2

I also think it's worth writing a release note for JSONRemove in the minor features section as part of b90bf8f


``JSONSet``
-----------

Contributor

Suggested change
.. versionadded:: 5.2

I also think it might be worth adding a release note for JSONSet in the "Minor features" section as part of this commit (3600f05)

"notifications": {"email": False, "sms": True},
},
)

Contributor

So I was curious what happens when we do update_or_create using this syntax, and the answer is: it doesn't fail, but it doesn't update either (the following test fails, expecting it would update).

Suggested change
def test_update_or_create_json_key_transform(self):
    user_preferences = UserPreferences.objects.create(
        settings={
            "theme": {"color": "black", "font": "Arial"},
            "notifications": {"email": False, "sms": True},
        }
    )
    updated_user_preferences, created = UserPreferences.objects.update_or_create(
        defaults={"settings__theme__color": "white"},
        id=user_preferences.id,
    )
    self.assertEqual(created, False)
    self.assertEqual(
        updated_user_preferences.settings,
        {
            "theme": {"color": "white", "font": "Arial"},
            "notifications": {"email": False, "sms": True},
        },
    )

Can you investigate whether adding support for update_or_create is feasible?

Contributor

Thanks Sarah, that's a good point! We haven't investigated adding transforms support for update_or_create, but thanks to your review we looked into how JSONSet and JSONRemove behave with update_or_create().

We're going to open a separate PR for adding those two functions. In the meantime here are our findings for using update_or_create with the functions.

With the following example:

updated_user_preferences, created = UserPreferences.objects.update_or_create(
    defaults={"settings": JSONSet("settings", theme__color="white")},
    id=user_preferences.id,
)
  • In the case where it does an update (created is False), it worked well – the test passed.
  • In the case where it does an insert (created is True), it fails because we'd be referring to the column for the row which hasn't been inserted yet. It makes sense that this would fail, but I think it would be nice if JSONSet() and JSONRemove() can fall back to an empty JSON object as the starting point in the insert case.
    • The main blocker we found in this case is that the for_save parameter in BaseExpression.resolve_expression() doesn't let you differentiate whether it's an INSERT or an UPDATE query – it's set to True in both cases.
    • We hacked the places where we pass for_save=True and changed it to pass the strings "create" and "update" depending on the query, and we were able to make JSONSet() and JSONRemove() resolve the expression accordingly. See adzshaf@207598f for more details.
    • I'm not sure the best course of action on that, though. That being said, I just noticed this forum thread about atomic upserts that would also benefit from the ability to differentiate between INSERT vs UPDATE in resolve_expression(). I'll ask there and see if others have an opinion.

Contributor

@laymonage laymonage left a comment


Just a quick update: @adzshaf and I are still working on this regularly every week (we just haven't pushed the updates to the PR branch). We're going to split this PR into two:

  • Adding JSONSet() and JSONRemove()
  • Adding support for using key transforms in QuerySet.update() and QuerySet.update_or_create() for JSONFields (which uses JSONSet and JSONRemove under the hood)

Along with a few more tweaks based on @sarahboyce's feedback at the DjangoCon US sprints. We found a few more edge cases so it's taking us quite a bit of time, but hopefully we'll open a new PR soon.

"notifications": {"email": False, "sms": True},
},
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Sarah, that's a good point! We haven't investigated into adding transforms support for update_or_create, but thanks to your review we looked into how JSONSet and JSONRemove behaves with update_or_create().

We're going to open a separate PR for adding those two functions. In the meantime here are our findings for using update_or_create with the functions.

With the following example:

updated_user_preferences, created = UserPreferences.objects.update_or_create(
    defaults={"settings": JSONSet("settings", theme__color="white")},
    id=user_preferences.id,
)
  • In the case where it does an update (created is False), it worked well – the test passed.
  • In the case where it does an insert (created is True), it fails because we'd be referring to the column for the row which hasn't been inserted yet. It makes sense that this would fail, but I think it would be nice if JSONSet() and JSONRemove() can fall back to an empty JSON object as the starting point in the insert case.
    • The main blocker we found in this case is that the for_save parameter in BaseExpression.resolve_expression() doesn't let you differentiate whether it's an INSERT or an UPDATE query – it's set to True in both cases.
    • We hacked the places where we pass for_save=True and changed it to pass the strings "create" and "update" depending on the query, and we were able to make JSONSet() and JSONRemove() resolve the expression accordingly. See adzshaf@207598f for more details.
    • I'm not sure the best course of action on that, though. That being said, I just noticed this forum thread about atomic upserts that would also benefit from the ability to differentiate between INSERT vs UPDATE in resolve_expression(). I'll ask there and see if others have an opinion.

Member

@charettes charettes left a comment


Following the ping from this forum thread from @laymonage

I think the idea of making for_save an (enum.Enum, str) to differentiate between different scenarios could work, as most usages out there are doing if for_save and most likely don't need to differentiate between both.

My immediate concerns with these changes are that they are possibly trying to do too much at once?

Is there a reason why we can't first land support for JSONSet / JSONRemove in a first set of changes targeted for 5.2, let them simmer for a release, and then consider adding support for the transform combining feature later on?

I like the ideas put forward by this patch and the separation of concern between field and transform logic, but these are more fundamental changes to how the ORM works and bring larger questions. For example, should we support save(update_fields) with transforms as well? Is it time to introduce a specialized type of transform that denotes nested data structure access, instead of trying to push the current abstraction in this space?

IMO these questions can likely be more easily answered while users are allowed to play with JSONSet and JSONRemove and these primitives are iterated on to iron out bugs first.

if lhs is None:
    lhs = field

if value is NOT_PROVIDED:
Member

I like the idea of using a sentinel to differentiate between null and key deletions (it's something that I believe is missing for JSON null vs SQL NULL), but I'm not sure we should piggyback on that one, since it was meant for a completely different purpose.

Contributor

@laymonage laymonage Oct 26, 2024


I agree @charettes! This was a temporary solution as suggested by Markus the other day, and I linked to your exact comment: https://fosstodon.org/@laymonage/112949402890177435 (Like I said, big fan 😂)

Happy to introduce a new sentinel for this, although if we want to also support that sentinel in JSONField at the same time, that sounds like a separate work in itself.

I think the idea of making for_save an (enum.Enum, str) to differentiate between different scenarios could work, as most usages out there are doing if for_save and most likely don't need to differentiate between both.

That's what I thought too – though I suppose it could still break if people check for the boolean explicitly e.g. if for_save is True. Still doable, just need to be more careful when we do the transition.
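A (str, Enum) for for_save along those lines might look like the sketch below. ForSave is a hypothetical name for illustration, not proposed API:

```python
from enum import Enum


class ForSave(str, Enum):
    """Hypothetical replacement for the boolean for_save flag."""
    CREATE = "create"
    UPDATE = "update"


# Truthiness-based checks like `if for_save:` keep working,
# because enum members are truthy...
assert ForSave.CREATE and ForSave.UPDATE
# ...and the str mixin keeps string comparisons intact.
assert ForSave.UPDATE == "update"
# But explicit identity checks like `for_save is True` would break.
assert ForSave.UPDATE is not True
```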

My immediate concerns with these changes are that they are possibly trying to do too much at once?

Is there a reason why we can't first land support for JSONSet / JSONRemove in a first set of changes targeted for 5.2, let them simmer for a release, and then consider adding support for the transform combining feature later on?

No particular reason, really. When we implemented the first pass for the JSONSet and JSONRemove functions, we were curious how much work it would take to implement the transforms support in update() directly. Turns out it wasn't much, but indeed it's a bit more "controversial".

Sarah and I discussed this at the DjangoCon US sprints and we decided to split this PR into two. One for the DB functions, and one for the nicer syntactic sugar via update() directly – just like you said. I noted this in my comment #18489 (review) just yesterday.

@adzshaf and I are still working on polishing the DB functions to be ready for a new, smaller PR. We found a few more edge cases so it's taking us a bit longer than we thought. We'll post an update on the ticket when we put the PR up 😄

I like the ideas put forward by this patch and the separation of concern between field and transform logic, but these are more fundamental changes to how the ORM works and bring larger questions. For example, should we support save(update_fields) with transforms as well? Is it time to introduce a specialized type of transform that denotes nested data structure access, instead of trying to push the current abstraction in this space?

IMO these questions can likely be more easily answered while users are allowed to play with JSONSet and JSONRemove and these primitives are iterated on to iron out bugs first.

Thank you, and yes there are definitely more areas of the ORM that we haven't considered like bulk_{create,update}, update_or_create, etc. and save(update_fields) is one of them. I'd be happy to focus on getting JSONSet and JSONRemove in first, then discuss the more uncertain bits after.

Member

Thanks for the update, it's great to see you push all of these JSONField improvements forward. It has been a game changer to be able to use JSONField on non-Postgres databases, and a lot of that is thanks to you 🙇

I think that the two-commit approach has a lot of merit considering how hard it is to get these things right the first time. Let me know when things have taken shape if you'd like more feedback. As I said already, I think the proposed approach is great in the sense that it demonstrates that it can be done without too many invasive changes.

@adzshaf
Contributor Author

adzshaf commented Nov 3, 2024

Just like what @laymonage mentioned, I have created a separate PR for adding JSONSet and JSONRemove:

Once that PR is merged, we will revisit the support for using key and path transforms in update() in a new PR.

Thank you @sarahboyce @charettes for the reviews ❤️
