- Set the CBuffer variables
- Call DrawIndexed
Simple. OK, the next scenario is when the models are instanced. Yes, I can use DrawIndexedInstanced, but my question is: what's the best way to send the instance data to the GPU?
So far, I can think of 3 ways:
Option 1 - Storing and using one Instance Buffer per model
The render loop would then be something like this:
for (model in scene) {
    if (model.hasInstances()) {
        if (model.isDynamic()) {
            model.UpdateInstanceBuffer(....);
        }
        model.DrawIndexedInstanced(....);
    } else {
        model.DrawIndexed(...);
    }
}
Option 2 - Using a single Instance buffer for the entire scene and updating it for each draw call
The render loop would then be something like this:
for (model in scene) {
    if (model.hasInstances()) {
        UpdateInstanceBuffer(&model.InstanceData, ....);
        model.DrawIndexedInstanced(....);
    } else {
        model.DrawIndexed(...);
    }
}
Option 3 - Caching instances into a buffer for an entire batch (or if memory requirements aren't a problem, the whole frame). Directly inspired by Battlefield 3 - slide 30.
The render loop would then be something like this:
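std::vector<float4> instanceData;
std::vector<uint> offsets;

// Pack every instanced model's data into one array and record where each model starts
for (instancedModel in scene) {
    offsets.push_back(instanceData.size());
    for (float4 data in instancedModel.InstanceData) {
        instanceData.push_back(data);
    }
}

BindInstanceBufferAsVSResource(&instanceData);

// One draw per model; the cbuffer tells the vertex shader where that model's instances begin
for (uint i = 0; i < scene.size(); ++i) {
    UpdateVSCBuffer(offsets[i], ....);
    scene[i].DrawIndexedInstanced(....);
}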
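To make the packing step of Option 3 a bit more concrete, here is a minimal CPU-side sketch in standard C++. `Float4`, `Model`, `PackedInstances`, and `PackInstances` are hypothetical names made up for illustration, and the actual GPU upload and draw calls are left out. It only shows how the recorded offsets line up with the packed data.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the HLSL float4 type: one element of instance data.
struct Float4 {
    float x, y, z, w;
};

// Hypothetical CPU-side model; only its instance data matters for this sketch.
struct Model {
    std::vector<Float4> instanceData;  // empty means the model is not instanced
};

// One contiguous array for the whole batch plus the start of each model's slice.
struct PackedInstances {
    std::vector<Float4> data;
    std::vector<std::uint32_t> offsets;     // in float4 elements, not bytes
    std::vector<std::size_t> modelIndices;  // scene index that each offset belongs to
};

// Appends every instanced model's data to a single array and records
// the element offset at which each model's instances begin.
PackedInstances PackInstances(const std::vector<Model>& scene) {
    PackedInstances packed;
    for (std::size_t i = 0; i < scene.size(); ++i) {
        const Model& model = scene[i];
        if (model.instanceData.empty()) {
            continue;  // non-instanced models take the plain DrawIndexed path
        }
        packed.offsets.push_back(static_cast<std::uint32_t>(packed.data.size()));
        packed.modelIndices.push_back(i);
        packed.data.insert(packed.data.end(),
                           model.instanceData.begin(),
                           model.instanceData.end());
    }
    return packed;
}
```

In this sketch, `packed.data` is what would be uploaded once per batch, and each `packed.offsets[k]` is the value that would go into the VS cbuffer before drawing model `packed.modelIndices[k]`.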
Pros and Cons:
Option 1 - Individual InstanceBuffers per model
Pros:
- Static instancing is all cached, i.e. you only have to map/update/unmap a buffer once.
Cons:
- A ton of instance buffers. I may be overthinking things, but this seems like a lot of memory, especially since all buffers are a static size. So you either have to define exactly how many instances of an object can exist, or include some extra memory for wiggle room.
Option 2 - Single Instance buffer for the entire scene
Pros:
- Only one instance buffer. Potentially a much smaller memory footprint than Option 1. However, it needs to be as large as our largest number of instances.
Cons:
- Requires a map/update/unmap for every model that needs to be instanced. I have no idea if this is expensive or not.
Option 3 - Caching instances into a buffer for an entire batch
Pros:
- Far fewer map/update/unmap calls than Option 2
- Can support multiple types of instance buffers (as long as they are multiples of float4)
Cons:
- Static instances still need to be updated every frame.
- Indexes out of a cbuffer. (Can cause memory contention)
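The memory and update trade-offs above can be put into rough numbers. The following is only a back-of-the-envelope sketch, not a measurement: `ModelInfo`, `Cost`, and the `EstimateOption*` functions are hypothetical names, `capacity` stands for the "wiggle room" size of a per-model buffer in Option 1, and Option 3 is assumed to fit the whole frame into a single batch.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical per-model description used only for this estimate.
struct ModelInfo {
    std::size_t instanceCount;  // 0 means the model is drawn with plain DrawIndexed
    std::size_t capacity;       // Option 1 only: instances its own buffer is sized for
    bool isDynamic;             // instance data changes every frame
};

struct Cost {
    std::size_t bufferBytes;      // total instance-buffer memory
    std::size_t updatesPerFrame;  // steady-state map/update/unmap operations per frame
};

// Option 1: one fixed-size buffer per instanced model.
// Static buffers are filled once up front, so only dynamic models cost an update per frame.
Cost EstimateOption1(const std::vector<ModelInfo>& scene, std::size_t stride) {
    Cost cost{0, 0};
    for (const ModelInfo& m : scene) {
        if (m.instanceCount == 0) continue;
        cost.bufferBytes += m.capacity * stride;
        if (m.isDynamic) ++cost.updatesPerFrame;
    }
    return cost;
}

// Option 2: one shared buffer sized for the largest model, refilled before every instanced draw.
Cost EstimateOption2(const std::vector<ModelInfo>& scene, std::size_t stride) {
    Cost cost{0, 0};
    std::size_t largest = 0;
    for (const ModelInfo& m : scene) {
        if (m.instanceCount == 0) continue;
        largest = std::max(largest, m.instanceCount);
        ++cost.updatesPerFrame;
    }
    cost.bufferBytes = largest * stride;
    return cost;
}

// Option 3: everything packed into one buffer and uploaded once
// (assumes the whole frame fits in a single batch).
Cost EstimateOption3(const std::vector<ModelInfo>& scene, std::size_t stride) {
    Cost cost{0, 0};
    std::size_t total = 0;
    for (const ModelInfo& m : scene) total += m.instanceCount;
    cost.bufferBytes = total * stride;
    cost.updatesPerFrame = total > 0 ? 1 : 0;
    return cost;
}
```

For example, with a 64-byte stride (one 4x4 float matrix per instance, itself an assumption) and a scene of one static model with 100 instances in a 128-slot buffer plus one dynamic model with 10 instances in a 16-slot buffer, Option 1 comes out at 9216 bytes and 1 update per frame, Option 2 at 6400 bytes and 2 updates, and Option 3 at 7040 bytes and 1 update.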
So those are my thoughts. What are yours? Would you choose any of the three options? Or is there a better option? Let me know in the comments below, on Twitter, or with a Pastebin/Gist.
Happy coding
-RichieSams