Adding a sharding address generator #17375

jvegaTT · 2025-01-30T17:03:28Z

We require a sharding address generator that is fully compatible with its interleaved counterpart's API, is easy for op writers to set up, and enables native support of sharding in all cores.

### Ticket

#17375

### Sales Pitch

This is a big one to review, but hopefully very helpful to a lot of you. This PR introduces the sharding address generator, which makes navigating sharded tensors as easy as navigating interleaved ones. In fact, once your address generator is initialized you can use the exact same API as the interleaved address generators, so no more `#ifdef`s to differentiate between interleaved and sharded.

### Problem description

This PR introduces the sharded address generator accessor, which lets you get NoC addresses for sharded tensors. It is meant to be as easy to use as possible. The instructions are below.

`ShardedAddrGen` requires the type definition of a `Sharded_Info` class object whose template parameters hold the compile-time (CT) information, e.g.

    typedef Sharded_Info<
        SHARD_TYPE,
        NUMBER_OF_CORES,
        PAGE_SIZE_JUMP,
        PAGES_PER_TENSOR_ROW,
        CONTIGUITY,
        PAGES_PER_SHARD_WIDTH,
        ROWS_PER_SHARD_HEIGHT> tensor_1_shard_info;

The above parameters are usually obtained using `get_compile_time_arg_val`. In the program factory you can create a vector containing the above parameters, in order, using the function `shard_pf_builder::sharding_ct_table_builder(const tt::tt_metal::IDevice* device, const tt::tt_metal::Tensor& t)` defined in `ttnn/cpp/ttnn/operations/ccl/sharding_addrgen_pf_helper.cpp`.

It also needs a shard array map, which can be extracted from the runtime (RT) args using the `shard_addr_gen_utils::parse_map` function. This function requires the `Sharded_Info` class object, e.g.

    auto mapping = parse_map<tensor_1_shard_info>(rt_index);
    const uint32_t* const shard_array_map = mapping.first;  // contains the shard array map
    rt_index = mapping.second;                              // contains the new runtime index

In the program factory you can create a vector containing the runtime arguments extracted by this function using `shard_pf_builder::get_linear_shard_list(const tt::tt_metal::IDevice* device, const tt::tt_metal::Tensor& t)`, defined in `ttnn/cpp/ttnn/operations/ccl/sharding_addrgen_pf_helper.cpp`.

Lastly, you need the `bank_base_address` from the Tensor object, just like the interleaved address generator.

You can then create a sharded address generator as follows:

    s = ShardedAddrGen<tensor_1_shard_info>{.bank_base_address = bank_base_address, .shard_array = shard_array_map};

This object can then be used by the `get_noc_addr` API:

    uint64_t noc_address = get_noc_addr(page_num, s);

I also introduce `get_contiguous_noc_addr`, which returns both the NoC address and the number of contiguous pages after that point, to speed up operations.

### Checklist

- [x] Post commit CI passes - https://github.com/tenstorrent/tt-metal/actions/runs/13058740954 (just a clang-format failure on a file I didn't write; should be fixed after rebase)
- [ ] Blackhole Post commit (if applicable)
- [ ] Model regression CI testing passes (if applicable)
- [ ] Device performance regression CI testing passes (if applicable)
- [ ] **(For models and ops writers)** Full [new models](https://github.com/tenstorrent/tt-metal/actions/workflows/full-new-models-suite.yaml) tests passes
- [ ] New/Existing tests provide coverage for changes
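To make the page-to-shard mapping in the problem description more concrete, here is a minimal host-side sketch of the arithmetic a sharded address generator has to do for a block-sharded tensor. It is not the code in this PR: `ToyShardInfo`, `ToyPageLocation`, and `locate_page` are hypothetical names, the row-major shard ordering is an assumption, and the real `ShardedAddrGen` additionally turns the shard index into core coordinates through the shard array map before forming a NoC address.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the compile-time shard description.
// Field names mirror three of the Sharded_Info template parameters.
struct ToyShardInfo {
    uint32_t pages_per_tensor_row;   // pages in one full row of the tensor
    uint32_t pages_per_shard_width;  // pages in one row of a shard
    uint32_t rows_per_shard_height;  // rows held by one shard
};

// Where a page lives, plus how far a contiguous read could extend.
struct ToyPageLocation {
    uint32_t shard_index;       // shard number, row-major over the shard grid (assumption)
    uint32_t page_in_shard;     // page offset inside that shard
    uint32_t contiguous_pages;  // pages left in this shard row, counting this one
};

constexpr ToyPageLocation locate_page(const ToyShardInfo& info, uint32_t page_id) {
    // Position of the page in the full tensor.
    const uint32_t row = page_id / info.pages_per_tensor_row;
    const uint32_t col = page_id % info.pages_per_tensor_row;

    // Number of shards across one tensor row (rounded up for a ragged last shard).
    const uint32_t shards_per_row =
        (info.pages_per_tensor_row + info.pages_per_shard_width - 1) / info.pages_per_shard_width;

    // Which shard holds the page, and where inside that shard.
    const uint32_t shard_row = row / info.rows_per_shard_height;
    const uint32_t shard_col = col / info.pages_per_shard_width;
    const uint32_t row_in_shard = row % info.rows_per_shard_height;
    const uint32_t col_in_shard = col % info.pages_per_shard_width;

    // The shard row ends at the shard boundary or at the end of the tensor row,
    // whichever comes first.
    const uint32_t nominal_end = (shard_col + 1) * info.pages_per_shard_width;
    const uint32_t shard_end =
        nominal_end < info.pages_per_tensor_row ? nominal_end : info.pages_per_tensor_row;

    return ToyPageLocation{
        shard_row * shards_per_row + shard_col,
        row_in_shard * info.pages_per_shard_width + col_in_shard,
        shard_end - col};
}
```

In this toy model, a tensor that is 8 pages wide and split into shards 4 pages wide and 2 rows tall places page 13 in shard 1 at offset 5, with 3 pages remaining in that shard row. The remaining-pages count is the kind of information `get_contiguous_noc_addr` is described as returning alongside the address, so a kernel can issue one larger read instead of several single-page reads.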

jvegaTT added the bug label Jan 30, 2025

jvegaTT self-assigned this Jan 30, 2025

jvegaTT mentioned this issue Jan 30, 2025

#17375: Adding sharded addrgen to ccl folders #17052

Merged

6 tasks

jvegaTT added the feature label and removed the bug label Jan 30, 2025

jvegaTT closed this as completed Feb 6, 2025
