Scraps performance – Draw call dynamic batching

This post will act as both a DevLog entry and, hopefully, some small help to people trying to get Unity’s black-box dynamic batching working.

Scraps as it is right now performs fantastically on my desktop machine with a Radeon 7850, pulling over 300fps. It performs pretty poorly on my four-year-old laptop with a GeForce 9400M, where I have to cut the graphics settings to get 50fps. Now, it’s a fairly old machine, but Scraps is a pretty basic-looking game right now, and if the thing can run Portal 2 on low settings, it sure should be able to run Scraps without turning the graphics way down. Plus, a vehicle can have any number of parts, so the number of things to render can keep growing indefinitely.

One of the big issues with Scraps is the number of draw calls. Well, it’s not a big issue yet, but I can see it’s going to be, because when you build a vehicle out of lots of separate parts – parts that can come off again or be added to – it’s necessarily made up of lots of separate meshes. And separate meshes usually mean more draw calls. And more draw calls always mean more CPU (yeah, CPU) time.

For non-moving objects, you can tell Unity they’re Static and it can automatically batch several objects into one draw call. All the objects on a vehicle in Scraps are moving (because the vehicle moves), but it’s still possible to coax Unity into dynamically batching multiple meshes into a single draw call if you fulfill a bunch of somewhat esoteric criteria. The draw call page in the manual has a list under “Tips”, but it turns out that it’s not the whole list.

One big one (which is on the list) is that meshes have to share the same material. Almost all vehicle parts in Scraps had essentially the same material, just with a different texture for each one, so I combined them all into a couple of 2048×2048 texture atlases (sorry 20th century graphics card users), updated the UVs, and merged a whole bunch of materials into one without any visible change. The only negative that I can think of is that if a lot of parts aren’t used at all in a scene, it’s a small waste of texture space.
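The "updated the UVs" step is just a scale and an offset per part. Here is a minimal sketch of that arithmetic in Python, not the actual tool I used: `AtlasRect`, `remap_uv` and the example placement are made-up names and numbers for illustration, and it assumes each part's original UVs sit inside the 0..1 range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtlasRect:
    """Where one part's texture sits inside the atlas, in pixels (hypothetical layout)."""
    x: int
    y: int
    width: int
    height: int


def remap_uv(u, v, rect, atlas_size=2048):
    """Map a UV from the part's own 0..1 texture space into atlas UV space."""
    scale_u = rect.width / atlas_size
    scale_v = rect.height / atlas_size
    offset_u = rect.x / atlas_size
    offset_v = rect.y / atlas_size
    return (offset_u + u * scale_u, offset_v + v * scale_v)


# Hypothetical example: a 512x512 part texture placed at pixel (1024, 512).
rect = AtlasRect(x=1024, y=512, width=512, height=512)
print(remap_uv(0.0, 0.0, rect))  # (0.5, 0.25)
print(remap_uv(1.0, 1.0, rect))  # (0.75, 0.5)
```

One thing to watch: a part whose UVs tile outside 0..1 can't be remapped this way, because the repeats would land on a neighbouring part's texture in the atlas.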

I also met the other criteria for scaling and polygon count, at least on some sub-meshes. So now there should be lots of batched draw calls, right? Well, there weren’t. There were zero.

Forward rendering, combined materials: No draw calls batched.

The tips say “Objects that receive real-time shadows will not be batched.” Turning off receiving shadows on all parts had no effect. But I noticed that switching from Forward Rendering to Vertex Rendering did:

Vertex rendering, same settings: Some batched.

Plus vertex rendering has way fewer draw calls overall – of course it also looks horrible and the terrain has no texture anymore. Anyway, could the now-working batching be because vertex rendering has no shadows at all? Turns out, yes. Turning off both casting and receiving shadows in forward rendering created about as many batched draw calls as expected. But then I had no shadows.

Combining Meshes

The best solution would be to combine all the different part meshes into one big vehicle mesh. Then we wouldn’t need to worry about dynamic batching at all, because if it’s all one big object with one material, it’s only one draw call anyway. Unity provides a cool script called CombineChildren in the standard Scripts assets that does exactly that automatically. Using it on a vehicle I made, it merged everything as advertised into five combined meshes, since I was using five different materials in total:

Forward rendering, combined meshes: No batching needed, way fewer draw calls.

This was a bit of a naive approach – it also combined all my gun parts, wheels, etc., so nothing animated or moved at all anymore. But my solid-block vehicle sliding along the terrain was super-efficient. If only this were a normal vehicle game with normal single-mesh vehicles.

This could be the road I end up taking, but I foresee potential horrible headaches with implementation. Every time parts are added or removed, the mesh will have to be recreated, and that takes a bit of time. Plus it’ll need checks added for stuff like parts that move, or whether the mesh goes over the max vertex limit. Do we have time for all that in the middle of a firefight when a generator gets knocked off?
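To get a feel for the checks involved, here is a rough Python sketch of the planning step only: deciding which parts could be merged, which have to stay separate because they move, and where a group has to be split to stay under a vertex limit. It is not CombineChildren and not Scraps code. `Part`, `plan_combined_meshes`, the part names and the 65,535 limit (what a 16-bit index buffer can address) are all assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

MAX_VERTICES = 65535  # assumed per-mesh limit (16-bit index buffer)


@dataclass
class Part:
    name: str
    material: str
    vertex_count: int
    moves: bool = False  # wheels, turrets etc. that animate on their own


def plan_combined_meshes(parts, max_vertices=MAX_VERTICES):
    """Return (groups, left_alone).

    groups: list of (material, [part names]); each group fits under max_vertices.
    left_alone: names of parts that must stay as separate meshes.
    """
    by_material = defaultdict(list)
    left_alone = []
    for part in parts:
        # Moving parts and oversized parts are never merged.
        if part.moves or part.vertex_count > max_vertices:
            left_alone.append(part.name)
        else:
            by_material[part.material].append(part)

    groups = []
    for material, members in by_material.items():
        current, current_count = [], 0
        for part in members:
            # Start a new combined mesh when the next part would overflow the limit.
            if current and current_count + part.vertex_count > max_vertices:
                groups.append((material, current))
                current, current_count = [], 0
            current.append(part.name)
            current_count += part.vertex_count
        if current:
            groups.append((material, current))
    return groups, left_alone


# Hypothetical vehicle; a tiny limit is used here just to show the split.
parts = [
    Part("chassis", "atlas_a", 400),
    Part("generator", "atlas_a", 250),
    Part("wheel", "atlas_a", 300, moves=True),
    Part("cannon_base", "atlas_b", 500),
]
groups, left_alone = plan_combined_meshes(parts, max_vertices=600)
print(groups)
print(left_alone)
```

Even in this toy form, the whole plan has to be redone whenever a part is added or knocked off, which is exactly the cost I'm worried about.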

One Weird Tip (not invented by “a Mom”)

Turns out there’s a secret to at least getting some dynamic batching, but I don’t yet fully understand why it works in some cases and not others. Essentially you want to put each material in a different render queue.

You can theoretically do this by using Material.renderQueue in code, but I got dynamic batching to work by specifying the queue position in the shader itself, e.g.:

Tags {"Queue" = "Transparent-1" }

That will set the shader’s position in the queue to the usual position for transparent objects, minus one. Then use -2, +1, etc. to give each of the other materials its own queue position. This gave me some working batching at least, with Forward rendering and shadows still on:

Forward rendering, original settings: Some batched!

OK, so that’s cool, but for me it only works if I use “Transparent-1” or numbers close to it. And my shaders aren’t even transparent shaders, so I want to use the likes of “Geometry+1” instead, but then I get no dynamic batching. And I don’t know why. For now it’s good enough, but I’ll need to look into it sometime. Unfortunately there’s no obvious information about it on the Internet that I haven’t already repeated here. If anyone really knows why it works like it does, I’d love to know as well.
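For reference, those queue tags are just names for integers. Unity's built-in queues are Background (1000), Geometry (2000), AlphaTest (2450), Transparent (3000) and Overlay (4000), and the +N or -N is added on top. A small Python sketch of that arithmetic follows; `resolve_queue` is a made-up helper for illustration, not a Unity function.

```python
import re

# Unity's built-in named render queues and their integer values.
NAMED_QUEUES = {
    "Background": 1000,
    "Geometry": 2000,
    "AlphaTest": 2450,
    "Transparent": 3000,
    "Overlay": 4000,
}


def resolve_queue(tag):
    """Turn a tag value such as 'Transparent-1' into its integer queue number."""
    match = re.fullmatch(r"\s*([A-Za-z]+)\s*([+-]\s*\d+)?\s*", tag)
    if not match or match.group(1) not in NAMED_QUEUES:
        raise ValueError(f"unrecognised queue tag: {tag!r}")
    # The offset is optional; strip any spaces inside it before converting.
    offset = int(match.group(2).replace(" ", "")) if match.group(2) else 0
    return NAMED_QUEUES[match.group(1)] + offset


print(resolve_queue("Transparent-1"))  # 2999
print(resolve_queue("Geometry+1"))     # 2001
```

So the setting that batches for me is queue 2999 and the one that doesn't is 2001. These are the same integers that Material.renderQueue takes if you set the queue from code instead. That still doesn't explain the difference, but it does make it easier to experiment with values in between.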

Note: Switching from Forward to Deferred rendering produced the same results – no dynamic batching until I added the render queue trick, then the same amount as I get in Forward rendering.
