@sweeneyde
Member

No description provided.

@sweeneyde sweeneyde closed this Mar 28, 2022
@sweeneyde sweeneyde reopened this Mar 28, 2022
@markshannon
Member

How did you choose the categories? What do the stats look like?

They look quite tailored to the benchmark suite. Would it be possible to generalize them a bit?
Certain builtin classes, like zip and enumerate, are worth checking for. But beyond that, it is probably only worth sorting iterators into broader categories like: "iterator for underlying sequence", "computed iterator", "implemented in C", "implemented in Python".
That sort of thing.
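
To make the suggestion concrete, here is a minimal standalone sketch of coarse-grained classification. It is not the code from this PR: the enum, the rule table, and `classify_iterator_name` are hypothetical, and it matches on type-name strings only so that it compiles without the CPython headers (a real classifier would more likely compare type pointers). The assignment of each name to a category is also just one possible reading of the categories listed above.

```c
#include <string.h>

/* Hypothetical broad categories, loosely following the list above. */
enum iter_category {
    ITER_OVER_SEQUENCE,    /* iterator for an underlying sequence */
    ITER_BUILTIN_WRAPPER,  /* builtins singled out as worth checking: zip, enumerate */
    ITER_PYTHON_LEVEL,     /* implemented in Python, e.g. a generator */
    ITER_OTHER             /* everything else */
};

struct iter_rule {
    const char *type_name;
    enum iter_category category;
};

/* Illustrative table; the type names are the ones CPython reports
   for these iterator objects. */
static const struct iter_rule RULES[] = {
    {"list_iterator",  ITER_OVER_SEQUENCE},
    {"tuple_iterator", ITER_OVER_SEQUENCE},
    {"zip",            ITER_BUILTIN_WRAPPER},
    {"enumerate",      ITER_BUILTIN_WRAPPER},
    {"generator",      ITER_PYTHON_LEVEL},
};

/* Return the broad category for a type name, or ITER_OTHER if unknown. */
enum iter_category classify_iterator_name(const char *type_name)
{
    for (size_t i = 0; i < sizeof(RULES) / sizeof(RULES[0]); i++) {
        if (strcmp(RULES[i].type_name, type_name) == 0) {
            return RULES[i].category;
        }
    }
    return ITER_OTHER;
}
```

With a table like this, adding or merging categories later only means editing `RULES`, which fits the idea of starting broad and refining once stats are available.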

@sweeneyde
Member Author

I chose the categories by running the test suite and pyperformance with a printf at the end of _PySpecialization_ClassifyIterator, adding cases until the prints mostly stopped. I can pare down the number of cases to the ones we care about, but here are some of the results with all of those details:

@markshannon
Member

Ok, we can always change the categories later, if we need to.
In the meantime, this is useful information.

#define SPEC_FAIL_COMPARE_OP_EXTENDED_ARG 24

/* FOR_ITER */
#define SPEC_FAIL_FOR_ITER_REVERSED 4
Member

The values below 8 are the common failure kinds, defined in the section marked /* Common */ above.
You can always raise SPECIALIZATION_FAILURE_KINDS if you need to.
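
A rough sketch of the numbering scheme this comment describes: kinds 0 to 7 are shared by all opcodes, opcode-specific kinds start at 8, and every kind must stay below `SPECIALIZATION_FAILURE_KINDS`. The value 32, the `EXAMPLE_*` names, and the helper functions are assumptions made for illustration; only `SPECIALIZATION_FAILURE_KINDS` and the threshold of 8 come from the comment.

```c
#include <stdint.h>

/* Assumed size, for illustration only; raise it if more kinds are needed. */
#define SPECIALIZATION_FAILURE_KINDS 32

/* Kinds 0..7 are the shared "common" kinds; per-opcode kinds start here. */
#define FIRST_OPCODE_SPECIFIC_KIND 8

/* Hypothetical FOR_ITER kinds, numbered from 8 so they do not collide
   with the common kinds. */
#define EXAMPLE_FAIL_FOR_ITER_LIST     8
#define EXAMPLE_FAIL_FOR_ITER_REVERSED 9

/* Compile-time check that the highest kind still fits in the table. */
_Static_assert(EXAMPLE_FAIL_FOR_ITER_REVERSED < SPECIALIZATION_FAILURE_KINDS,
               "failure kind does not fit; raise SPECIALIZATION_FAILURE_KINDS");

/* One counter per failure kind. */
uint64_t failure_counts[SPECIALIZATION_FAILURE_KINDS];

/* Count one failure of the given kind; returns -1 if the kind is out of range. */
int record_failure(int kind)
{
    if (kind < 0 || kind >= SPECIALIZATION_FAILURE_KINDS) {
        return -1;
    }
    failure_counts[kind]++;
    return 0;
}

/* True for kinds in the shared range below 8. */
int is_common_kind(int kind)
{
    return kind >= 0 && kind < FIRST_OPCODE_SPECIFIC_KIND;
}
```

Under this scheme, a per-opcode kind defined as 4 would be counted in the same slot as whichever common kind already uses 4, which is the collision the comment points at.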

@sweeneyde
Member Author

Results from python -m test, now with the new ascii string iterator:

| Failure kind | Count | Ratio |
| --- | ---: | ---: |
| list | 40313806 | 30.7% |
| range | 37450044 | 28.5% |
| itertools | 10394906 | 7.9% |
| map | 9850377 | 7.5% |
| tuple | 7643483 | 5.8% |
| generator | 7477618 | 5.7% |
| enumerate | 3882317 | 3.0% |
| ascii string | 3266112 | 2.5% |
| dict items | 3198088 | 2.4% |
| callable | 1707168 | 1.3% |
| dict keys | 1115718 | 0.8% |
| bytes | 1105824 | 0.8% |
| zip | 984525 | 0.7% |
| seq iter | 974430 | 0.7% |
| other | 801038 | 0.6% |
| set | 678627 | 0.5% |
| dict values | 222521 | 0.2% |
| reversed list | 184092 | 0.1% |
| string | 72383 | 0.1% |
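
For reading the table: the Ratio column appears consistent with each count divided by the sum of all listed counts (131,323,077), rounded to one decimal place. The sketch below reproduces that arithmetic in integer math; `total_count` and `ratio_tenths` are hypothetical helper names, not functions from CPython's stats tooling.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum all per-kind counts. */
uint64_t total_count(const uint64_t *counts, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += counts[i];
    }
    return total;
}

/* Share of `count` in `total`, in tenths of a percent, rounded to nearest.
   For example, a return value of 307 means 30.7%. Returns 0 if total is 0. */
uint64_t ratio_tenths(uint64_t count, uint64_t total)
{
    if (total == 0) {
        return 0;
    }
    return (count * 1000 + total / 2) / total;
}
```

For the `list` row, `ratio_tenths(40313806, 131323077)` gives 307, matching the 30.7% shown.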

I think this is useful information, so I'll go ahead and merge. As was said, we can always adjust things more later.