Skip to content

Conversation

orf
Copy link
Contributor

@orf orf commented May 27, 2017

Ticket: #28236

This PR integrates dj-database-url into Django, allowing users to configure their caches and databases via URLs:

# settings.py
from django.conf.service_url import configure_db, configure_cache, register_db_backend

register_db_backend('customDB', 'my.custom.db.adapter')
register_cache_backend('customCache', 'my.custom.cache.adapter')

# These values would come from environment variables rater than be hard-coded like this

DATABASES = {
    'default': configure_db('postgres://user:password@localhost/myDb?sslmode=verify-full'),
    'sqlite': configure_db('sqlite://:memory:'),
    'custom': configure_db('customDB://10.10.10.10/something?else=1')
}

CACHES = {
    'default': confgure_cache('memcached://127.0.0.1:11211'),
    'file': configure_cache('file:///tmp/cache'),
    'custom': configure_db('customCache://10.10.10.10/something?else=1')
}

It's extensible, allowing third party adapters to plug in and extend the parsing. It does this by using a config_from_url static method to the base cache adapters and DB wrappers, which allows subclasses to adapt the result - i.e mysql parses some options from the querystring into specific configuration options.

The test suite is copied/adapted from dj-database-url as are a lot of the quirks parsing, so thanks to @kennethreitz for those.

To-Do:

  • Documentation
  • More full-suite tests, ensuring the resulting config actually works
  • Add support for easy overriding config options via kwargs

@kennethreitz
Copy link

✨🍰✨

@kennethreitz
Copy link

Let's get some docs added to this, @orf!

@orf
Copy link
Contributor Author

orf commented Jun 9, 2017 via email

@kennethreitz
Copy link

👍

@kennethreitz
Copy link

If there's anything I can do to help, let me know!

@orf orf changed the title Fixed #28236 - Integrate dj-database-url into Django WIP: Fixed #28236 - Integrate dj-database-url into Django Jun 18, 2017
@orf orf changed the title WIP: Fixed #28236 - Integrate dj-database-url into Django WIP: Fixed #28236 - Integrate dj-database-url into Django Jun 18, 2017
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer need the "_psycopg2" part.

@orf
Copy link
Contributor Author

orf commented Aug 20, 2017

I've finally got around to pushing what I've got so far - sorry about the delay.

So my idea is to add a configure_from_url classmethod to BaseDatabaseWrapper (and in future to the base cache class). This gets passed a scheme and url argument.

I think a class based solution is best here, and keeping it inside the database class itself is a good idea. I'll keep working on it, but please let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these tests don't actually connect to a database, is it possible to remove the skipUnless (here and below)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside of this approach is that the DB module needs to be imported, which will fail unless the DB-specific adapters are installed (psycopg2 etc).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could register_db_backend be a decorator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above, these need to be available pre-import

@auvipy
Copy link
Contributor

auvipy commented Sep 14, 2017

is this going to be make in django 2.0?

@orf
Copy link
Contributor Author

orf commented Sep 14, 2017

Unfortunately not. There is a fair bit to do on the PR to get it up to scratch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be classmethod?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good catch 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps TypeError would be more appropriate here and in the next place below? Having a method is usually a property of a type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user who writes his own backend makes a typo for backend_type, this function will return without error but the registration will mysteriously not have worked. The two options I see are to make a whitelist of allowed backend_types, or make this function private (by adding an underscore in front of the name) and just expose the convenience functions such as register_db_backend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've made them private.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function's name starts with "add" but the convenience functions' names start with "register". Would it help to keep these uniform?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the base class's method, it checks whether URL is already parsed. Does this class need to do the same check before calling urlparse?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also the SQLite backend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit iffy, but I added the check to only the base class so that if the method was over-ridden and the url was parsed the result could be passed in without needing to re-call urlparse. config_from_url, when over-ridden, is always going to parse the URL, but the public API should expect only strings to be passed in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if someone wants to subclass the Postgres backend directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. :<

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_ is used elsewhere for gettext. Maybe better to just take the [0] slice instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you define dsn once and do url += dsn to reduce duplication?

@orf
Copy link
Contributor Author

orf commented Sep 14, 2017

Thank you for your review @wkschwartz! All your comments seemed very reasonable so I've made the changes requested. If anyone else has any time to review this it would be awesome. Documentation still needs to be added, but this could be ready for 2.0.

@orf
Copy link
Contributor Author

orf commented Sep 15, 2017

I've added support for configuring caches as well. It's a little bit more tricky than database configurations.

I ended up creating a parse_url method to deal with common (in db/caches) quirks in the standard library urlparse, namely case-insensitive hostnames. Still a few more tests to run, but seems to work quite well.

Copy link
Contributor

@wkschwartz wkschwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know much about caches, so someone else should review those parts of the code.

Thanks for working on this @orf! I'm looking forward to this feature being baked into Django.


BACKENDS = defaultdict(dict)

BaseParsedURL = namedtuple('ParsedURL', 'scheme username password hostname port path options')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you inherit from urllib.parse.ParseResult to make the class hierarchy easier for people already familiar with the stdlib's data types? Then you could do away with the namedtuple import and simply do

from urllib import parse
class ParseResult(parse.ParseResult):
    @property
    def netloc(self):
        # ...what you had written below...

This should override ParseResult.netloc correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea! 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh yes. I remember why I did all this, it's because really the ParseResult is pretty unsuited for this due to a lot of annoying quirks. I think it should be a class entirely distinct from ParseResult.

@@ -0,0 +1,135 @@
import inspect
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to have url_config go in django/conf/? If you do that, url_config might be a confusing name. Maybe something like services_url_config? I'm trying to think of something (like "services") that captures both DBs and caches, and the fact that they might be hosted somewhere else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think you're right, I think it should be under conf. service_url sounds like a good module name I think?


class ParsedURL(BaseParsedURL):
@property
def netloc(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Helpful if this should be a public class: A docstring for either ParsedURL, netloc, or both, explaining what netloc does and how ParsedURL differs from urllib.parse.ParseResult

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should definitely have a doc-string, because I've completely forgotten how it differs....!

return BACKENDS[backend_type][scheme]


def configure(backend_type, value):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should configure be private in the same way/for the same reason that _register_backend is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep 👍



def register_db_backend(scheme, path):
if not path.endswith('.base.DatabaseWrapper'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find the documentation for how to make a database or cache backend. I'm guessing this is not the only place that explicitly requires this path, but if I'm wrong, should we document this for anyone looking to create their own backends?

@wkschwartz
Copy link
Contributor

The other thing I just noticed is that dj-database-url allows you to pass in option overrides such as conn_max_age to the config function. Perhaps simple **kwargs forwarding from configure_db to configure to Database.config_from_url would do the trick. Then config_from_url would simply append kwargs to the OPTIONS value, making the key in the kwargs capitalized. I think this will make configure_db a drop-in replacement for dj-database-url's config.

Thoughts? Is something similar necessary for the cache configurator (at least for uniformity's sake)? (I don't know me much caches.)

@orf orf changed the title WIP: Fixed #28236 - Integrate dj-database-url into Django Fixed #28236 - Integrate dj-database-url into Django Feb 4, 2018
@orf
Copy link
Contributor Author

orf commented Feb 4, 2018

I think that needs some thought @wkschwartz, but I agree it's a good feature to have. Implementing overriding individual attributes would be easy, i.e `password='abc', but how we handle nested options (like under 'OPTIONS' etc) would be tricky.

Also, thanks for the review, I've fixed up the branch, rebased it and got the tests to pass. I've made a lot of changes but I think it's a lot neater now. The biggest one was removing the named tuple and instead returning a simple, mutable dictionary from parse_url. I've also moved all the code under django.conf as it does seem like the right place to have this.

I'll spend some more time fixing up a few small things, any feedback on the code would be much appreciated.


def close(self, **kwargs):
# libmemcached manages its own connections. Don't call disconnect_all()
# libmemcached managepip install pipenvs its own connections. Don't call disconnect_all()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"managepip"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh nuts, I accidentally pasted that there and didn't spot it, I'll remove it.

@wkschwartz
Copy link
Contributor

Maybe we don't even need to support all possible option overrides, just the most common, such as conn_max_age. For more complicated things, we could document this pattern:

DATABASES = {'default': configure_db(os.environ['DATABASE_URL'])}
DATABASES['default']['OPTIONS'] = {'FOO': 'bar'}

@LuRsT
Copy link
Contributor

LuRsT commented Sep 17, 2019

@orf Need help here?

@wlonk
Copy link
Contributor

wlonk commented Jun 5, 2020

@LuRsT and @orf I'm also willing to provide help. I'm interested in reducing the distance from startproject to deployable, and I think this would help.

@orf
Copy link
Contributor Author

orf commented Jul 7, 2020

Hey @wlonk, feel free to take the ticket and have a go. This is on my to-do list but it’s pretty far down, so please anyone who wants to build on this go for it!

@felixxm
Copy link
Member

felixxm commented Jul 8, 2020

Closing due to inactivity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants