LightMetal - Initial Replay infra/library for LightMetalBinary and standalone runner (#17039) #17524

Merged
kmabeeTT merged 1 commit into main from kmabee/light_metal_replay_and_runner
Feb 5, 2025

Conversation

kmabeeTT
Contributor

@kmabeeTT kmabeeTT commented Feb 4, 2025

Ticket

I am breaking up the original PR I put up recently into smaller, bite-sized chunks, as @omilyutin-tt suggested. This is round 6.

Note: This builds on the round 5 PR (#17514) that I opened earlier today. Since the changes were fairly minimal, ready, and cleaned up, I opened round 6 now. The target branch is set to the previous PR's branch to reduce the diff until that PR is merged; I will then retarget to main.

[Feature Request] Add Light Metal capture/replay initial changes to tt-metal for some workloads #17039

Problem description

These are the initial/bootstrapping changes for the "Light Metal" capture/replay feature, which uses Flatbuffers as its serialization/deserialization library. See my previous, larger PR (#16573) for context, or to see how this will be used by follow-up merges.

What's changed

  • This is round 6/6 for now. It builds on the previous 5 LightMetal merges from the past week and enables e2e capture + replay in unit tests, now that replay is supported.
  • Adds the replay library/executor for a LightMetalBinary, which replays all the commands and traces that a workload captured to the binary. As at capture time, complex objects are stored in a map after creation and referenced by global_id by the functions that reuse them.
  • Adds initial infra for the Light Metal standalone CLI runner, which just loads an existing binary from disk and executes it using the replay library's ExecuteLightMetalBinary().
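
To illustrate the global_id bookkeeping described in the second bullet, here is a minimal, self-contained sketch. It is not the code in lightmetal_replay.cpp: `FakeBuffer`, `ReplayObjectMap`, `CommandType`, `Command`, and `replay` are hypothetical names invented for this example, and the real replay library deserializes its commands from Flatbuffers rather than from a plain struct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a complex object created during replay.
struct FakeBuffer {
    std::vector<std::uint8_t> data;
};

// Hypothetical registry: objects are stored by global_id after creation,
// then looked up by later commands that reuse them.
class ReplayObjectMap {
public:
    void add(std::uint32_t global_id, std::shared_ptr<FakeBuffer> obj) {
        assert(obj != nullptr);  // the map never stores null objects
        const bool inserted = objects_.emplace(global_id, std::move(obj)).second;
        if (!inserted) {
            throw std::runtime_error("duplicate global_id " + std::to_string(global_id));
        }
    }

    std::shared_ptr<FakeBuffer> get(std::uint32_t global_id) const {
        const auto it = objects_.find(global_id);
        if (it == objects_.end()) {
            throw std::runtime_error("unknown global_id " + std::to_string(global_id));
        }
        return it->second;
    }

    void remove(std::uint32_t global_id) {
        if (objects_.erase(global_id) == 0) {
            throw std::runtime_error("unknown global_id " + std::to_string(global_id));
        }
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::unordered_map<std::uint32_t, std::shared_ptr<FakeBuffer>> objects_;
};

// Hypothetical command set; every command names its target by global_id.
enum class CommandType { CreateBuffer, FillBuffer, ReleaseBuffer };

struct Command {
    CommandType type;
    std::uint32_t global_id;
    std::uint32_t size_bytes;  // used by CreateBuffer only
    std::uint8_t fill;         // used by FillBuffer only
};

// Replays commands in order. The switch has no default case, so a compiler
// with enum-switch warnings enabled can flag an unhandled command type.
void replay(const std::vector<Command>& commands, ReplayObjectMap& objects) {
    for (const Command& cmd : commands) {
        switch (cmd.type) {
            case CommandType::CreateBuffer: {
                auto buf = std::make_shared<FakeBuffer>();
                buf->data.resize(cmd.size_bytes);
                objects.add(cmd.global_id, std::move(buf));
                break;
            }
            case CommandType::FillBuffer: {
                auto buf = objects.get(cmd.global_id);
                std::fill(buf->data.begin(), buf->data.end(), cmd.fill);
                break;
            }
            case CommandType::ReleaseBuffer:
                objects.remove(cmd.global_id);
                break;
        }
    }
}
```

In this sketch, a command that references a global_id that was never created (or was already released) fails loudly instead of silently creating a new object, which is one reasonable way to surface a mismatch between capture and replay.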

Incorporates a bunch of feedback from the big original parent PR.
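
The load-then-execute flow of the standalone runner can be sketched roughly as follows. This is only an assumption about its shape: `read_binary_file`, `Executor`, and `run_binary` are hypothetical names, and the injected `Executor` callback stands in for the replay library's entry point, whose actual signature is not shown in this PR description.

```cpp
#include <cstdint>
#include <exception>
#include <fstream>
#include <functional>
#include <iostream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Reads a whole file into memory as raw bytes; throws if it cannot be opened.
std::vector<std::uint8_t> read_binary_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}

// Hypothetical executor type: takes the loaded bytes, returns true on success.
using Executor = std::function<bool(const std::vector<std::uint8_t>&)>;

// Loads the binary at `path`, hands it to `execute`, and maps the outcome to
// a process-style exit code (0 = success, 1 = failure).
int run_binary(const std::string& path, const Executor& execute) {
    try {
        const std::vector<std::uint8_t> blob = read_binary_file(path);
        if (blob.empty()) {
            std::cerr << "empty binary: " << path << "\n";
            return 1;
        }
        return execute(blob) ? 0 : 1;
    } catch (const std::exception& e) {
        std::cerr << "replay failed: " << e.what() << "\n";
        return 1;
    }
}
```

Passing the executor in as a callback keeps the sketch independent of the real library; a `main` would only need to forward `argv[1]` and a wrapper around the actual replay call to `run_binary`.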

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • (For models and ops writers) Full new models tests passes
  • New/Existing tests provide coverage for changes

tt_metal/impl/lightmetal/lightmetal_replay.cpp (2 review threads, outdated and resolved)
tt_metal/impl/lightmetal/lightmetal_replay.hpp (2 review threads, outdated and resolved)
@kmabeeTT kmabeeTT force-pushed the kmabee/light_metal_replay_and_runner branch from 2d0cf2f to fa33198 Compare February 5, 2025 05:14
@kmabeeTT kmabeeTT force-pushed the kmabee/light_metal_capture_and_tests branch 2 times, most recently from b936169 to bea47ea Compare February 5, 2025 17:49
Base automatically changed from kmabee/light_metal_capture_and_tests to main February 5, 2025 18:25
@kmabeeTT kmabeeTT force-pushed the kmabee/light_metal_replay_and_runner branch from fa33198 to 8770752 Compare February 5, 2025 19:58
…andalone runner (#17039)

 - This is round 6/6 for now, builds upon previous 5 merges for LightMetal in past week
   and enables e2e capture + replay in unit tests now that replay is supported.

 - This brings the replay library/executor for a LightMetalBinary which handles
   replaying all the commands and traces captured by workload to binary. Like
   capture time, complex objects are stored in map after creation,
   and referenced by global_id by functions that re-use them.

 - Light Metal standalone CLI runner initial infra which just loads an existing
   binary on disk and executes it using the replay library's ExecuteLightMetalBinary()

 - Some PR feedback: update asserts, remove default cases, add more comments, etc.
@kmabeeTT kmabeeTT force-pushed the kmabee/light_metal_replay_and_runner branch from 8770752 to 7a6efdc Compare February 5, 2025 22:07
@kmabeeTT
Contributor Author

kmabeeTT commented Feb 5, 2025

Tagging @omilyutin-tt for re-review, please. Rebased on latest, resolved all conversations, squashed the fixups into a single commit, and force-pushed. Checking that the build and tests pass locally (Edit: yes); will merge once approved. Edit 2: Also reran code-analysis and all-build-configs (passed).

@kmabeeTT kmabeeTT requested a review from omilyutin-tt February 5, 2025 22:09
@kmabeeTT kmabeeTT merged commit cca6d4f into main Feb 5, 2025
29 checks passed
@kmabeeTT kmabeeTT deleted the kmabee/light_metal_replay_and_runner branch February 5, 2025 22:58
2 participants