-
Notifications
You must be signed in to change notification settings - Fork 270
core.where: If cacert.pem does not exist, extract from zip #82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This will make certifi work if packed in a zip, such as when running in an egg, or packaged using Pex or Subpar. tests/test_zip: Test referencing certifi packaged in an egg.
The unit test works when run with |
tests/test_zip.py
Outdated
_TEST_SCRIPT='; '.join('''import certifi | ||
import os.path | ||
assert os.path.exists(certifi.where()), "certs do not exist" | ||
print "PASS"'''.split('\n')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The python
binary could just as easily be linked to python3
in a virtualenv. Please use print("PASS")
I applied this and confirmed that it fixes the AWS Glue use case that I mentioned, as well. Thanks! I'm not sure it fixes the original issue that #66 was filed for, since it's still using |
I'm still not wild about this, but in any event, there's a race condition between when |
I did say on bug #66 that I would accept "no thanks we won't support zip packaged libraries" (since although Python does support this, it isn't that widely used). Great catch on the race condition! Oops! I've submitted a change that should be python3 friendly, and fixes that issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what’s the plan for addressing the race?
I believe I did: The solution is not to assign to the global variable until after the certificate content has been written out to disk. Due to the Python GIL, this should be thread safe I believe? |
There's a second race condition :-)
|
Amazing, you are completely correct. This means we should probably do this at import time since otherwise we need a lock to protect the initialization. Updating the PR now. Thanks. |
Since Python has a single import lock, this ensures a single thread will perform this initialization once per Python process, and calling where() is still efficient.
Is there more development that needs to be done on this before it is merged in? I'd be happy to help if so, I'd love to see this merged. |
Same here, would love to see this make it to the main line. |
def where(): | ||
f = os.path.dirname(__file__) | ||
_cert_tempfile = None | ||
_cert_path = os.path.join(os.path.dirname(__file__), 'cacert.pem') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line will fail in the environments that don't have __file__
defined, such as cx_Freeze or PyOxidizer.
The fix is IMHO quite simple - to catch NameError
from this line. The rest should work fine - pkgutil.get_data
works cross platform.
It would be great to get this into a release. At the moment, it's quite hard to package a single executable depending on e. g. requests
library.
# assume we are in a zip: attempt to extract to a temp file | ||
cert_data = pkgutil.get_data(__package__, 'cacert.pem') | ||
if cert_data is not None: | ||
_cert_tempfile = tempfile.NamedTemporaryFile(suffix='.pem') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NamedTemporaryFile
delete
defaults to True
, and this algorithm relies on an open file handle for the duration of the runtime invocation, and breaks if the certifi
module is reloaded, as other modules will have the old where()
result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be a fix for this? Making _cert_path
a subclass of str
and storing a reference to _cert_tempfile
there?
Maybe this edge case doesn't need to be solved together with this issue. If the module works for you now, it will continue to work. The only case that breaks is if you're in an evironment where the packages are installed outside of standard filesystem AND you reload certifi
.
What is the use case anyway? If the app runs from a zip file or from a compiled EXE, I don't see a need for reloading certifi
- the module will likely not change during runtime.
I think at this point this has been subsumed by the PRs to use the |
I opened this years ago, but have since stopped working on stuff that uses PEX/eggs, so my answer is: I don't know. I assume that enough things related to Python packaging have changed since I started this, that this is likely not the right approach any more anyway? There has been no activity in two years, so I'm going to take the extreme approach of closing this. If there are people with more recent experience who know that this is still a problem, please comment and we can attempt to figure out the right new solution. |
This will make certifi work if packed in a zip, such as when running
in an egg, or packaged using Pex or Subpar.
tests/test_zip: Test referencing certifi packaged in an egg.
Fixes #66