Commit
I was looking at some profiles and noticed `canonicalize_shape` showing up as a noticeable overhead in certain cases. That makes sense: we carefully check all the other possible cases before considering plain integers as plausible elements, even though integers are the most popular _by far_. This function is also pretty hot, because it gets called any time we create a new `ShapedArray`.

I wrote a small benchmark that repeatedly calls `canonicalize_shape` on a 4-element tuple of integers.

Before: 7.62µs ± 8%
After: 1.42µs ± 2%

So a pretty easy 5x improvement overall. In a more realistic case, resharding an array onto 8 TPUs, 50% of the time was spent creating shapes for the avals of device buffers.

PiperOrigin-RevId: 516795311
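The idea of checking the common integer case first can be illustrated with a minimal sketch. This is not the actual JAX change: `SymbolicDim`, `canonicalize_shape_general`, and `canonicalize_shape_fast` are hypothetical names made up for illustration, and the general path here only approximates "check every other case before integers". Timings depend on the machine and Python version, so no numbers are claimed.

```python
import operator
import timeit


class SymbolicDim:
    """Hypothetical stand-in for a non-integer dimension type (an assumption)."""


def canonicalize_shape_general(shape):
    """General path: test the rarer cases first, then coerce via __index__."""
    out = []
    for d in shape:
        if isinstance(d, SymbolicDim):
            out.append(d)  # rare case, kept unchanged
        else:
            # operator.index raises TypeError for floats, strings, etc.
            out.append(operator.index(d))
    return tuple(out)


def canonicalize_shape_fast(shape):
    """Fast path: plain Python ints need no conversion at all."""
    # An exact type check, so bool and other int subclasses take the general path.
    if all(type(d) is int for d in shape):
        return tuple(shape)
    return canonicalize_shape_general(shape)


def bench(fn, shape=(2, 3, 4, 5), number=200_000):
    """Return the mean time per call in microseconds."""
    total = timeit.timeit(lambda: fn(shape), number=number)
    return total / number * 1e6


if __name__ == "__main__":
    print(f"general: {bench(canonicalize_shape_general):.2f} µs per call")
    print(f"fast:    {bench(canonicalize_shape_fast):.2f} µs per call")
```

Both functions return the same result for any valid shape. The fast version only changes the order in which cases are tested, which is the kind of reordering the commit message describes.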