-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
PEP 544: Protocols #224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PEP 544: Protocols #224
Conversation
I'm excited by this proposal but couldn't help but make some minor changes while reading through it. I hope they'll help make it more readable.
Copyediting changes
JukkaL
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments based on a quick read-through.
pep-0544.txt
Outdated
| provide sophisticated runtime instance and class checks against protocol | ||
| classes. This would be difficult and error-prone and will contradict the logic | ||
| of PEP 484. As well, following PEP 484 and PEP 526 we state that protocols are | ||
| **completely optional** and there is no intent to make them required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little unclear what 'completely optional' means here and whether the following sentences are related to this. This could perhaps mean some of these things:
- They are optional like everything defined in PEP 484: They don't affect Python runtime semantics and they are merely a feature in
typing. Programmers are free to not use them. - The standard library doesn't use protocols outside
typing. - Programmers are free to not use them even if they use type annotations.
- Python implementations don't need to support them.
This could rephrased like this to avoid the ambiguity:
... are completely optional:
- No runtime semantics will be imposed for variables or parameters annotated with a protocol class.
- Any checks will be performed only by third-party type checkers and other tools.
- ... maybe more
There is no intent to make protocols non-optional in the future.
| a compatible type signature. | ||
|
|
||
|
|
||
| Protocol members |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to imply that sequences no longer support iteration, since Sequence.__iter__ is not abstract and thus is not a protocol member. Is this intentional?
My view is that protocols should not have non-protocol attributes or methods. For example, none of the ABCs in collections.abc seem to have what I'd consider non-protocol members. Protocols should describe an interface, not an implementation. Default implementations are okay, but they should probably only use other members defined in the protocol.
If we want to support non-protocol members, I think we should restrict them to names with a double underscore prefix to make this super explicit. However, even this brings in some arguably unneeded complexity: self will now have a special type, since accessing non-protocol members is only safe through self. Example:
class C(Protocol):
...
def f(self, other: 'C') -> None:
self.__g() # ok
other.__g() # not ok
def __g(self) -> None:
<something>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to imply that sequences no longer support iteration, since
Sequence.__iter__is not abstract and thus is not a protocol member. Is this intentional?
Sequence implements __iter__ therefore it will be accepted wherever Iterable is expected. But asking for __iter__ on Sequence indeed will fail.
This indeed looks problematic. Here is a solution that I could propose: we do not have such thing as non-protocol members. However, there will be two kinds of members, abstract and non-abstract. All of them should be implemented in order to consider a class as structural (implicit) subtype.
The reason to have non-abstract methods is that one can get them "for free" using explicit subclassing provided abstract methods are implemented in subclass. So that for larger protocols like Sequence or Mapping one doesn't need to do
def method(self):
return super().method()for every method.
This coincides with how ABCs work at runtime, so that this will be intuitive for someone who already worked with ABCs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Protocols should describe an interface, not an implementation.
👍
Default implementations are okay, but they should probably only use other members defined in the protocol.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a solution that I could propose: we do not have such thing as non-protocol members.
That's now specified below in the PEP right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's now specified below in the PEP right?
Yes.
pep-0544.txt
Outdated
| Generic protocol types follow the same rules of variance as non-protocol | ||
| types. Protocols can be used in all special type constructors provided | ||
| by the ``typing`` module and follow the corresponding subtyping rules. | ||
| For example, protocol ``PInt`` is a subtype of ``Union[PInt, int]``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is unclear. What is PInt?
pep-0544.txt
Outdated
|
|
||
| Generic protocol types follow the same rules of variance as non-protocol | ||
| types. Protocols can be used in all special type constructors provided | ||
| by the ``typing`` module and follow the corresponding subtyping rules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be rephrased as something like "Protocol types can be used in all contexts where any other types can used, such as in (examples). Generic protocols follow the rules for generic abstract classes, except for using structural compatibility instead of compatibility defined by inheritance relationships."
pep-0544.txt
Outdated
| finish(GoodJob()) # OK | ||
|
|
||
| In addition, we propose to add another special type construct | ||
| ``All`` that represents intersection types. Although for normal types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd consider leaving out All from this proposal, at least initially. This is something we can add later if there is a need, but currently I don't see evidence for this being important. Effectively the same behavior can be achieved through multiple inheritance. Besides, it would be a bit arbitrary if All only worked with protocols, and making All work with arbitrary types would be a much more complex feature, and not as directly related to this PEP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could include this for three reasons:
- I could easily imagine a function parameter that requires more than one protocol, especially if the preferred style is to have several simple protocols, rather than few complex. By the way, some classes in
collections.abcare simple intersections, for exampleCollection == All[Sized, Iterable, Container]. - It could be easily implemented for protocols. I also propose to use
Allinstead ofIntersectionto emphasize it only works for protocols (mnemonics: variable implements all these protocols). On the contrary, I think that intersections of nominal classes would be very rarely used. - If we decide to include this later, then people might simply be not aware of this feature (this is a minor reason, but still we need to take this into account).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Jukka -- I think it's confusing for All to work only with protocols, extending All to work with all types would add considerable complexity, and I'm not convinced there's a pressing need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was pushing for Intersection for the original PEP because I saw the need for being able to intersect ABCs. Given that protocols themselves solve this, and we can use multiple inheritance as Jukka is suggesting, I agree that it's wiser to leave this out.
I would still like to have it, but that can be a separate PEP. I agree it's complex to implement for concrete types, and better yet, for combinations of both.
pep-0544.txt
Outdated
| (``Iterable`` is statically equivalent to ``Iterable[Any]`). | ||
| The metaclass of ``typing.Protocol`` will automatically add | ||
| a ``__subclasshook__()`` that provides the same semantics | ||
| for class and instance checks as for ``collections.abc`` classes:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the automatic isinstance check support, even though that's how the mypy Protocol class used to work. I don't think that they can be made to work reliably for arbitrary ABCs. Consider this example:
class P(Protocol):
@abstractmethod
def common_method_name(self, x: int) -> int: ...
class X:
<a bunch of methods>
def common_method_name(self) -> None: ...
def do_stuff(o: Union[P, X]) -> int:
if isinstance(o, P):
return o.common_method_name(1) # oops what if it's an X instance instead?
else:
return (do something with an X)Maybe a type checker can warn about this example, but more generally the isinstance check could be unreliable, except for things where there is a common signature convention such as Iterable.
Another potentially problematic case:
class P(Protocol):
x: int
class C:
def initialize(self) -> None:
self.x = 0
c1 = C()
c2 = C()
c2.initialize()
isinstance(c1, P) # False
isinstance(c2, P) # True
def f(x: Union[P, int]) -> None:
if isinstance(x, P):
# type of x is P here
...
else:
# type of x is int here?
print(x + 1)
f(C()) # oopsAgain, maybe we can live with this, but this could be pretty surprising. I'd argue that requiring an explicit class decorator would be better, since we can then attach warnings about problems like this in the documentation. The programmer would be able to evaluate whether the benefits outweigh the potential for confusion for each protocol and explicitly opt in -- but the default behavior would be safer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that requiring an explicit class decorator would be better, since we can then attach warnings about problems like this in the documentation.
This is how I initially wrote this, since I intuitively felt that this could not be 100% reliable. Guido wanted to make it default, by now we have good examples to make this opt-in, so that I will revert this to class decorator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand how a class decorator would solve the problem. Wouldn't that mean that some classes that implement the protocol (i.e. the decorated ones) will work properly with isinstance at runtime but others will not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
classes that implement the protocol (i.e. the decorated ones)
The decorator is applied to protocol, not to implementation. This was in a previous draft. At runtime the decorator will add same __subclasshook__() to a protocol, that collections.abc classes have now. As Jukka pointed out, there are pros and cons of such approach, and a user could decide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would the __subclasshook__ work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, if a class has all necessary attributes, then it is a subclass. This is why I mention below that "type checkers will narrow types after such checks by the type erased Proto (i.e. with all variables having type Any and all methods having type Callable[..., Any])" because we cannot reliably check the types of attributes at runtime.
This is how it is done in collections.abc see https://github.com/python/cpython/blob/master/Lib/_collections_abc.py#L72 and e.g. https://github.com/python/cpython/blob/master/Lib/_collections_abc.py#L93
pep-0544.txt
Outdated
| # Error! Can't use 'isinstance()' with subscripted protocols | ||
|
|
||
|
|
||
| Using Protocols in Python 3.5 and Earlier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about Python 2.7?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be used the same way for 2.7 - 3.5, modulo the fact that function type comments should be used in 2.7.
pep-0544.txt
Outdated
| reasons: | ||
|
|
||
| * Protocols don't have some properties of regular classes. In particular, | ||
| ``isinstance()`` is not always well-defined for protocols, whereas it is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better argument would be something like this: isinstance, as defined for classes that don't explicitly extend Protocol, is based on the nominal hierarchy. In order to make everything a protocol by default and have isinstance work would require changing the semantics of isinstance, which won't happen.
| mainly because their interfaces are too large, complex or | ||
| implementation-oriented (for example, they may include de facto | ||
| private attributes and methods without a ``__`` prefix). | ||
| * Most actually useful protocols in existing Python code seem to be implicit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional reason: Many built-in functions only accept concrete instances of int (and subclass instances), and similarly for other built-in classes. Making int a structural type wouldn't be safe without major changes to the Python runtime, which won't happen.
| References | ||
| ========== | ||
|
|
||
| .. [typing] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it customary to include references to discussions that have happened about the PEP, such as in the typing issue tracker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could do more deep linking to specific comments or subthreads? (No big deal if you think it's too much effort, but it might save some readers some time finding the right part of the long discussion in e.g. python/typing#11 -- including myself :-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will do this.
|
@JukkaL Thank you for helpful comments! I agree with almost all things. I will make corresponding changes in the text soon. The only thing where we do not agree is Also we probably need to find a way how to indicate whether an abstract method has a default implementation or not in stubs. |
|
As a general comment, I don't think it's a good idea to standardize this before having an implementation. An implementation would allow us to understand what's difficult to implement, what sorts of unexpected interactions it has with other type-system features, what parts are confusing, and what parts are practically useful. I think it will be very difficult to get an understanding of those things otherwise. |
pep-0544.txt
Outdated
| ----------------------------------- | ||
|
|
||
| To explicitly declare that a certain class implements the given protocols, they | ||
| can be used as regular base classes. In this case: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried this could be pretty confusing: you could have a class that explicitly implements some protocols and implicitly implements others. I don't know of any other language that allows this.
pep-0544.txt
Outdated
| finish(GoodJob()) # OK | ||
|
|
||
| In addition, we propose to add another special type construct | ||
| ``All`` that represents intersection types. Although for normal types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Jukka -- I think it's confusing for All to work only with protocols, extending All to work with all types would add considerable complexity, and I'm not convinced there's a pressing need.
|
|
||
| Protocols are essentially anonymous. To emphasize this point, static type | ||
| checkers might refuse protocol classes inside ``NewType()`` to avoid an | ||
| illusion that a distinct type is provided:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this restriction makes sense, actually -- trying to construct a NewType from a protocol isn't cogent. You probably want an alias instead. Having NewType create a non-protocol subtype is surprising behavior and is not consistent with how NewType works in general, in my opinion. We shouldn't support this without a clear need.
pep-0544.txt
Outdated
| (``Iterable`` is statically equivalent to ``Iterable[Any]`). | ||
| The metaclass of ``typing.Protocol`` will automatically add | ||
| a ``__subclasshook__()`` that provides the same semantics | ||
| for class and instance checks as for ``collections.abc`` classes:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand how a class decorator would solve the problem. Wouldn't that mean that some classes that implement the protocol (i.e. the decorated ones) will work properly with isinstance at runtime but others will not?
| be provided at runtime. | ||
|
|
||
|
|
||
| Changes in the typing module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you propose to handle the backwards-incompatibility problem for all currently released versions of Python?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is that there should be no backward incompatibility. The classes in typing module (as well as in collections.abc) already behave like protocols at runtime:
class Whatever:
def __iter__(self): return []
assert isinstance(Whatever(), Iterable)as I mention below "Practically, few changes will be needed in typing since some of these classes already behave the necessary way at runtime", most of these will be changed only in typeshed.
Also, after I implement the changes proposed by Jukka, all code that passes mypy now, will still pass (unless I don't see some subtleties).
|
@ddfisher Of course I am not going to wait until this PEP is accepted :-) |
|
@ilevkivskyi when you're ready for the first version to go in just let us know and we will commit it. Then you can have discussions go to your personal fork and just send PRs for updates as needed. |
This adds static support for structural subtyping. Previous discussion is here python/typing#11
Fixes #222