Use swf Tag parsing in core #3174

madsmtm · 2021-02-09T21:09:31Z

I noticed that a few parts in core was accessing swf::TagCode and swf::Reader methods directly, and doing a lot of the parsing that swf already does.

This means that core and swf are very tightly coupled, and makes it hard to change swf to be more streaming (see e.g. #74), since a lot of internal details are "leaked".

Would like feedback whether I should continue this work?

Herschel · 2021-02-09T23:47:24Z

Thanks for your work! Pinging @adrian17 since he's profiled most of the code quite heavily and can weigh in.

Ruffle used to use Tag directly as this PR does, but I specifically moved away from it to avoid excessive allocations/copies. This is less of an issue now that most things deal with slices, but I still worry about changing back.

Specifically, for speed, the ideal code would compile down to code equal to the SWF/AVM parsing being all inline:

match tag_code {
  TagCode::FOO_TAG => {
	let num_foos = stream.read_u16();
	for _ in 0..num_foos { // No need for a `Vec` here if we don't need it.
		let foo_id = stream.read_u16();
		println!("{}", foo_id);
	}
   TagCode::OTHER_TAG => (), // noop
}

vs.

match read_tag() {
  Tag::Foo(foos) => {
	for foo_id in foos { 
                // We may have allocated a Vec even though we don't need it.
		println!("{}", foo_id);
	}
   }
   Tag::Other(other) => {
          // Who knows what we may have parsed, but we don't care about it.
    }
}

With reading a Tag, this is scary, because it relies on the optimizer being smart enough and everything getting inlined. Using the parsing functions directly, it is explicit that we are skipping unwanted tags, or it even allows us parse partially (for example, if we wanted to skip definitions of character IDs we'd already defined).

I guess for preload using Tag isn't too bad, because it runs once on startup and we want to own most of the resulting types. But, for example, if you modified run_frame_internal to use Tag, it may waste time reading and parsing a chunky DefineShape that it doesn't care about. We'd have to verify that this gets optimized out.

swf is at odds with this, as it was originally a convenience library for my day job. Tag is quite large (120 bytes currently!), and memcpys are slow in Wasm. Most variants of Tag need to be changed to be boxed, but even then this is a heap-allocation that shouldn't be necessary. Further, many of the tags contain Vecs that may be used once and then thrown away. In a perfect world, we'd read as a stream instead of collecting it into a Vec. This would allow the calling code to collect into a Vec or other structure as it pleases.

This is much more important in the AVM, where we may parse thousands of ops per frame. For example, we should definitely not collect into a vec in Ruffle for ActionPush, as we do now.

Basically, I'm wary of relying too much on the compiler optimizing everything. Doing the above in a way that a) avoids copying and allocs, b) keeps the parsing code in swf with minimal duplication, and c) allows swf to still have convenient methods and owned types for other uses, is an open question. "a)" is the most important. I know the code is bad with all of this currently -- hopefully we can figure out the best path forward.

(There are some other things in here that could be good as their own PRs, like keeping the list of actions for Call. Thank you!)

madsmtm · 2021-02-10T15:05:27Z

I think my overall plan was to move the parsing in run_frame_internal into preload, so that all SWF parsing and other work that can be done beforehand is "cached" somewhere (this will fundamentally require a heap allocation), a bit like I did with actions_for_frame or like all characters being inserted into the library beforehand.

Later, we would move the entire preload code into run_frame_internal, where we first would check whether the current frame has been parsed or not, and if not, wait for the data to arrive and parse it there. (This should be possible, only thing that is missing now is that we need a way for web to stream sound data). - This is pretty close to the ideas you talked about on Discord, though doing it this way means we don't require any special handling for "frame 1 preloaders".

In the meantime, we would improve the swf module to use iterators and never allocate. However, this won't solve the points "if we wanted to skip definitions of character IDs we'd already defined" or the Tag type being large. I agree with not wanting to rely on the optimizer too much, but I think we have a lot of work to do before "a big type that may or may not be inlined away" is a bottleneck 😉.

I agree with other points on Discord:

We allocate the memory for the decompressed SWF ahead of time, and decompress into that on the fly from the downloading task.
From there, the Player tries to preload up until the number of frames in framesLoaded

maybe involve the time budget logic there?

bytes

madsmtm added 4 commits February 9, 2021 23:24

swf: Load Sprite tags lazily

bba46ec

core: Use existing swf tag parsing in MovieClip.preload

afd5cbb

core: Store actions by frame so we don't need to search for them

30a0b23

Don't run run_frame_internal unnecessarily on single frames

deb87d5

madsmtm force-pushed the use-swf-parser branch from 37d9abf to deb87d5 Compare February 9, 2021 22:24

danielhjacobs added swf Related to the SWF format A-core Area: Core player, where no other category fits T-refactor Type: Refactor / Cleanup labels Sep 17, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use swf Tag parsing in core #3174

Use swf Tag parsing in core #3174

madsmtm commented Feb 9, 2021 •

edited

Loading

Herschel commented Feb 9, 2021 •

edited

Loading

madsmtm commented Feb 10, 2021

Use swf Tag parsing in core #3174

Are you sure you want to change the base?

Use swf Tag parsing in core #3174

Conversation

madsmtm commented Feb 9, 2021 • edited Loading

Herschel commented Feb 9, 2021 • edited Loading

madsmtm commented Feb 10, 2021

madsmtm commented Feb 9, 2021 •

edited

Loading

Herschel commented Feb 9, 2021 •

edited

Loading