Skip to content

Conversation

abbasidaniyal
Copy link
Contributor

@abbasidaniyal abbasidaniyal commented May 24, 2021

ticket-33012

This PR is in accordance with this GSoC project
The detailed proposal can be found here

This PR aims at adding support for Redis to be used as a caching backend with Django. As redis is the most popular caching backend, adding it to django.core.cache module would be a great addition for developers who previously had to rely on the use of third party packages.

Major Tasks :

Remaining Tasks :

Future Tasks :

  • Add support for providing username and password in OPTIONS. This will be possible in the upcoming version of redis-py
  • Milli second support (To be discussed)

@abbasidaniyal
Copy link
Contributor Author

Hey @carltongibson
So I was able to get a quick and simple implementation up and running. However, there are several decisions that I made for simplicity, like using pickle for serializing data.
I am still not sure how shall be handle multiple servers. Do we setup a sharding based client (like memcached) or should it cater to a replication based setup with a "Primary and replica" based setup?

Do let me know what improvement I should make and how I shall proceed!

@carltongibson carltongibson self-requested a review May 25, 2021 12:44
Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @abbasidaniyal — good speedy start! 🏎

The initial thing is tests. Can you look at the coverage for the existing backends and adapt the tests for the new backend? Then, that immediately gives us a todo list in terms of API.

@WisdomPill
Copy link

Sorry for stepping in.

I have some questions, isn't this what jazzband/django-redis is doing?

Will there be some code copying, what will be the faith of it?
By the way django-redis is already somewhat compatible with django api and widely used by the community.

Should there be any discussion about it? Has it ever been one?

@carltongibson
Copy link
Member

Hey @WisdomPill — The proposal is to add a Redis backend to core.

It will be simpler than that provided by django-redis, for instance customising the serialiser is out-of-scope for the initial pass.

It's been discussed quite a few times on django-developers. The swinger was last year's survey showing approx twice as many users opting for Redis (where we have no backend) over memcached (where we have several).

@abbasidaniyal
Copy link
Contributor Author

Hi @abbasidaniyal — good speedy start!

The initial thing is tests. Can you look at the coverage for the existing backends and adapt the tests for the new backend? Then, that immediately gives us a todo list in terms of API.

Sure. I'll start working on the tests. Will be sharing coverage reports soon!

@abbasidaniyal
Copy link
Contributor Author

abbasidaniyal commented May 26, 2021

Sorry for stepping in.

I have some questions, isn't this what jazzband/django-redis is doing?

Will there be some code copying, what will be the faith of it?
By the way django-redis is already somewhat compatible with django api and widely used by the community.

Should there be any discussion about it? Has it ever been one?

Hey @WisdomPill!
You can find a link to my proposal here. A lot of inspiration is taken from the django-redis package as mentioned in the proposal. There might be some code similarities with django-redis as the end-goal of remains the same, that is, connecting redis-py client with the django caching framework.

@abbasidaniyal
Copy link
Contributor Author

abbasidaniyal commented May 31, 2021

Hey @carltongibson!
So I was going trying to work on the coverages of each backend. However, one tests keeps failing.
cache.tests.CreateCacheTableForDBCacheTests.test_createcachetable_observes_database_router

======================================================================
FAIL: test_createcachetable_observes_database_router (cache.tests.CreateCacheTableForDBCacheTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniyal/Desktop/django/django/test/utils.py", line 430, in inner
    return func(*args, **kwargs)
  File "/home/daniyal/Desktop/django/tests/cache/tests.py", line 1238, in test_createcachetable_observes_database_router
    management.call_command('createcachetable', database='other', verbosity=0)
  File "/home/daniyal/Desktop/django/django/test/testcases.py", line 86, in __exit__
    self.test_case.assertEqual(
AssertionError: 1 != 5 : 1 queries executed, 5 expected
Captured queries were:
1. 
            SELECT c.relname,
            CASE WHEN c.relispartition THEN 'p' WHEN c.relkind IN ('m', 'v') THEN 'v' ELSE 't' END
            FROM pg_catalog.pg_class c
            LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
            WHERE c.relkind IN ('f', 'm', 'p', 'r', 'v')
                AND n.nspname NOT IN ('pg_catalog', 'pg_toast')
                AND pg_catalog.pg_table_is_visible(c.oid)
        

----------------------------------------------------------------------

I did not completely understand the reason for this.
My test settings were

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb_default',
        'USER': 'myuser',
        'PASSWORD': 'password',
        'HOST': 'localhost',
        'PORT': '5432',
    },
    'other': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb_other',
        'USER': 'myuser',
        'PASSWORD': 'password',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.db.DatabaseCache",
        "LOCATION": "my_cache_table",
    },
}

Tried and tested with SQLite as well, and got the same results.
Error with SQLite

======================================================================
FAIL: test_createcachetable_observes_database_router (cache.tests.CreateCacheTableForDBCacheTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniyal/Desktop/django/django/test/utils.py", line 430, in inner
    return func(*args, **kwargs)
  File "/home/daniyal/Desktop/django/tests/cache/tests.py", line 1238, in test_createcachetable_observes_database_router
    management.call_command('createcachetable', database='other', verbosity=0)
  File "/home/daniyal/Desktop/django/django/test/testcases.py", line 86, in __exit__
    self.test_case.assertEqual(
AssertionError: 1 != 5 : 1 queries executed, 5 expected
Captured queries were:
1. 
            SELECT name, type FROM sqlite_master
            WHERE type in ('table', 'view') AND NOT name='sqlite_sequence'
            ORDER BY name

----------------------------------------------------------------------

However, when I comment out this line, which call createcachetable, the test passes.

call_command('createcachetable', database=self.connection.alias)

@carltongibson
Copy link
Member

Hey @abbasidaniyal — super. That's what we like to see. 😃

Can you push your latest and I will take a look hopefully tomorrow.

@abbasidaniyal
Copy link
Contributor Author

Hey @abbasidaniyal — super. That's what we like to see.

Can you push your latest and I will take a look hopefully tomorrow.

I'm currently running the tests against the main branch.
Sharing the coverage results as screenshots here

I'll start adapting these existing tests for the new backend now!

@abbasidaniyal abbasidaniyal force-pushed the redis-cache-backend branch from 0f724ec to 9239713 Compare June 2, 2021 22:48
@abbasidaniyal
Copy link
Contributor Author

Hey @carltongibson !
I've just pushed the lastest update that I have. I've adapted the existsing tests for the new backend. The tests are failing at two instance

  • Culling
  • zero and negative timeout handling : Redis-py does not support 0 or negative timeouts. I have implemented the get_backend_timeout similar to the memcache backend but I'm still not sure about how to handle 0 timeout. Ideally it should not store the key in the first place.

@abbasidaniyal
Copy link
Contributor Author

Hey @carltongibson!
So I was going trying to work on the coverages of each backend. However, one tests keeps failing.
cache.tests.CreateCacheTableForDBCacheTests.test_createcachetable_observes_database_router

======================================================================
FAIL: test_createcachetable_observes_database_router (cache.tests.CreateCacheTableForDBCacheTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniyal/Desktop/django/django/test/utils.py", line 430, in inner
    return func(*args, **kwargs)
  File "/home/daniyal/Desktop/django/tests/cache/tests.py", line 1238, in test_createcachetable_observes_database_router
    management.call_command('createcachetable', database='other', verbosity=0)
  File "/home/daniyal/Desktop/django/django/test/testcases.py", line 86, in __exit__
    self.test_case.assertEqual(
AssertionError: 1 != 5 : 1 queries executed, 5 expected
Captured queries were:
1. 
            SELECT c.relname,
            CASE WHEN c.relispartition THEN 'p' WHEN c.relkind IN ('m', 'v') THEN 'v' ELSE 't' END
            FROM pg_catalog.pg_class c
            LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
            WHERE c.relkind IN ('f', 'm', 'p', 'r', 'v')
                AND n.nspname NOT IN ('pg_catalog', 'pg_toast')
                AND pg_catalog.pg_table_is_visible(c.oid)
        

----------------------------------------------------------------------

I did not completely understand the reason for this.
My test settings were

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb_default',
        'USER': 'myuser',
        'PASSWORD': 'password',
        'HOST': 'localhost',
        'PORT': '5432',
    },
    'other': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb_other',
        'USER': 'myuser',
        'PASSWORD': 'password',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.db.DatabaseCache",
        "LOCATION": "my_cache_table",
    },
}

Tried and tested with SQLite as well, and got the same results.
Error with SQLite

======================================================================
FAIL: test_createcachetable_observes_database_router (cache.tests.CreateCacheTableForDBCacheTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/daniyal/Desktop/django/django/test/utils.py", line 430, in inner
    return func(*args, **kwargs)
  File "/home/daniyal/Desktop/django/tests/cache/tests.py", line 1238, in test_createcachetable_observes_database_router
    management.call_command('createcachetable', database='other', verbosity=0)
  File "/home/daniyal/Desktop/django/django/test/testcases.py", line 86, in __exit__
    self.test_case.assertEqual(
AssertionError: 1 != 5 : 1 queries executed, 5 expected
Captured queries were:
1. 
            SELECT name, type FROM sqlite_master
            WHERE type in ('table', 'view') AND NOT name='sqlite_sequence'
            ORDER BY name

----------------------------------------------------------------------

However, when I comment out this line, which call createcachetable, the test passes.

call_command('createcachetable', database=self.connection.alias)

@carltongibson I was able to figure this one out. I was following the documentation to setup the DatabaseCache.

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.db.DatabaseCache",
        "LOCATION": "my_cache_table",
    },
}

However, I believe my_cache_table was conflicting with this

django/tests/cache/tests.py

Lines 1213 to 1220 in 225d965

@override_settings(
CACHES={
'default': {
'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
'LOCATION': 'my_cache_table',
},
},
)

Setting the "LOCATION" to some other table name leads to the test passing.

Maybe we could mention it in the docs somewhere or update the test to check if duplicate table names exists? This is a little off-topic from this PR. Should I create a separate ticket for this? Or should we let it be for now?

Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @abbasidaniyal.

Good stuff. Nice progress.

OK, so... I think time to start on the docs.

If redis-py isn't installed, or if Redis isn't running we get a couple of errors:

django.core.cache.backends.base.InvalidCacheBackendError: Could not find backend 'django.core.cache.backends.redis.RedisCache': No module named 'redis'

and:

redis.exceptions.ConnectionError: Error 61 connecting to None:6379. Connection refused.

We should skip the tests unless Redis is set up. See the same for PyLibMCCache:

@unittest.skipUnless(PyLibMCCache_params, "PyLibMCCache backend not configured")

Then updating the Running all the tests section of the contributing guide would be the next step — let's make sure that's right.

Then I think we can start adding bullet points (or better) in docs/topics/cache.txt.

#14437 (comment):

The tests are failing at two instance

  • Culling
  • zero and negative timeout handling : Redis-py does not support 0 or negative timeouts. I have implemented the get_backend_timeout similar to the memcache backend but I'm still not sure about how to handle 0 timeout. Ideally it should not store the key in the first place.

Noted:

3 Current failures
======================================================================
FAIL: test_cull (cache.tests.RedisCacheTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/carlton/Documents/Django-Stack/django/tests/cache/tests.py", line 650, in test_cull
    self._perform_cull_test('cull', 50, 29)
  File "/Users/carlton/Documents/Django-Stack/django/tests/cache/tests.py", line 647, in _perform_cull_test
    self.assertEqual(count, final_count)
AssertionError: 49 != 29

======================================================================
FAIL: test_zero_cull (cache.tests.RedisCacheTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/carlton/Documents/Django-Stack/django/tests/cache/tests.py", line 653, in test_zero_cull
    self._perform_cull_test('zero_cull', 50, 19)
  File "/Users/carlton/Documents/Django-Stack/django/tests/cache/tests.py", line 647, in _perform_cull_test
    self.assertEqual(count, final_count)
AssertionError: 49 != 19

======================================================================
FAIL: test_zero_timeout (cache.tests.RedisCacheTests)
Passing in zero into timeout results in a value that is not cached
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/carlton/Documents/Django-Stack/django/tests/cache/tests.py", line 614, in test_zero_timeout
    self.assertIsNone(cache.get('key1'))
AssertionError: 'eggs' is not None

----------------------------------------------------------------------

Let's have a think about those; need to look at them. If you can make progress on the above points meanwhile. 👍

I wouldn't worry about the 'my_cache_table' failure — you kind of have to opt
into it — it doesn't come up unless you explicitly define a DB cache setting.

Copy link
Member

@ngnpope ngnpope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had an initial pass through this.

I've not looking into too much detail yet at the option handling and pool/client instantiation stuff. Will look into that later.

@abbasidaniyal
Copy link
Contributor Author

Hey!
I've made the changes suggested by @pope1ni. I've updated the tests and added

redis_excluded_caches = {'cull', 'zero_cull'}
...
@override_settings(CACHES=caches_setting_for_tests(
    base=RedisCache_params,
    exclude=redis_excluded_caches
))
class RedisCacheTests(BaseCacheTests, TestCase):
    ...

Now only on test fails. Handling zero timeout. Redis-py does not support it natively and django expects to no set a key with zero timeout. I'm not sure at which level should this be handled. I was wondering if we perform the check in get_backend_timeout method and return a value (eg: None) or raise a suitable exception.

@abbasidaniyal
Copy link
Contributor Author

@carltongibson I'm starting off with the documentation now. Should post updates in a few days!

Copy link
Member

@ngnpope ngnpope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've still got more to look into around the connection pooling and option handling, but this should be enough to get on with for now.

In addition, I think that you can easily implement the following in RedisCacheClient and RedisCache to provide better performance over the default implementations inherited from BaseCache:

  1. Implement .has_key() using Redis.exists() which uses the EXISTS command.
  2. Implement .incr() using Redis.incr() which uses the INCRBY command.
  3. Implement .decr() using Redis.decr() which uses the DECRBY command.
  4. Implement .delete_many() using Redis.delete() which uses the DEL command. (Note that this is the same as the normal delete, but passing multiple keys at the same time. You could change RedisCacheClient.delete() to take *keys instead of key, but it might need some work around the handling of .get_client()...)

I also thought that .get_or_set() could be enhanced to use the SET command with NX and GET options. I see that using these together requires Redis >= 7.0 and GET requires >= 6.2. Also redis-py seems to lack support for GET in Redis.set(). See