Tuesday, May 6, 2014

How do you implement Geometry Instancing?

So at some point in the graphics pipeline, you have a list of models that need to be rendered. The normal scenario:
  1. Set the CBuffer variables
  2. Call DrawIndexed

Simple. OK, the next scenario is when the models are instanced. Yes, I can use DrawIndexedInstanced, but my question is: what's the best way to send the instance data to the GPU?

So far, I can think of 3 ways: 

Option 1 - Storing and using one Instance Buffer per model

The render loop would then be something like this:

for (model in scene) {
    if (model.hasInstances()) {
        if (model.isDynamic()) {
            model.UpdateInstanceBuffer(....);
        }
        model.DrawIndexedInstanced(....);
    } else {
        model.DrawIndexed(...);
    }
}
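
As a rough illustration of the caching behaviour Option 1 relies on, here is a minimal CPU-only sketch. Everything in it (Float4, InstanceBuffer, Model, PrepareInstances) is a hypothetical stand-in rather than engine or graphics API code, and Upload() merely plays the role of a map/update/unmap round trip. It assumes a fixed capacity per buffer, so an upload that does not fit simply fails.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an HLSL float4.
struct Float4 {
    float x, y, z, w;
};

// Hypothetical CPU-side stand-in for one model's GPU instance buffer.
// Upload() plays the role of a map/update/unmap round trip.
class InstanceBuffer {
public:
    explicit InstanceBuffer(std::size_t capacity) : storage_(capacity) {
        assert(capacity > 0); // a zero-sized buffer could never hold an instance
    }

    // Returns false when the data exceeds the fixed capacity.
    bool Upload(const std::vector<Float4>& data) {
        if (data.size() > storage_.size()) {
            return false;
        }
        std::copy(data.begin(), data.end(), storage_.begin());
        ++uploadCount_;
        return true;
    }

    std::size_t UploadCount() const { return uploadCount_; }

private:
    std::vector<Float4> storage_;
    std::size_t uploadCount_ = 0;
};

struct Model {
    Model(std::vector<Float4> data, bool dynamic, std::size_t capacity)
        : instanceData(std::move(data)), isDynamic(dynamic), buffer(capacity) {}

    std::vector<Float4> instanceData;
    bool isDynamic;
    bool uploaded = false;
    InstanceBuffer buffer;
};

// Called once per frame, just before the instanced draw would be issued.
// Static models upload only on first use; dynamic models upload every frame.
bool PrepareInstances(Model& model) {
    if (model.isDynamic || !model.uploaded) {
        if (!model.buffer.Upload(model.instanceData)) {
            return false; // over capacity: the reserved headroom ran out
        }
        model.uploaded = true;
    }
    return true;
}
```

Under these assumptions, a static model costs one upload for its whole lifetime, while a dynamic model costs one upload per frame.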

Option 2 - Using a single Instance buffer for the entire scene and updating it for each draw call

The render loop would then be something like this:

for (model in scene) {
    if (model.hasInstances()) {
        UpdateInstanceBuffer(&model.InstanceData, ....);

        model.DrawIndexedInstanced(....);
    } else {
        model.DrawIndexed(...);
    }
}
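
To make the sizing and update pattern of Option 2 concrete, here is a small CPU-only sketch. The names (Float4, Model, RequiredCapacity, RenderFrame, FrameStats) are hypothetical, a std::vector stands in for the single GPU instance buffer, and the draw calls are only counted, not issued. It assumes a model with no instance data is drawn without instancing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an HLSL float4.
struct Float4 {
    float x, y, z, w;
};

// Hypothetical model record; empty instanceData means "not instanced".
struct Model {
    std::vector<Float4> instanceData;
};

// Smallest capacity (in float4s) the shared buffer can have:
// the largest instance data of any single model.
std::size_t RequiredCapacity(const std::vector<Model>& scene) {
    std::size_t largest = 0;
    for (const Model& model : scene) {
        largest = std::max(largest, model.instanceData.size());
    }
    return largest;
}

struct FrameStats {
    std::size_t uploads = 0;        // stand-ins for map/update/unmap
    std::size_t instancedDraws = 0; // where DrawIndexedInstanced would go
    std::size_t plainDraws = 0;     // where DrawIndexed would go
};

FrameStats RenderFrame(const std::vector<Model>& scene,
                       std::vector<Float4>& sharedBuffer) {
    FrameStats stats;
    for (const Model& model : scene) {
        if (model.instanceData.empty()) {
            ++stats.plainDraws;
            continue;
        }
        assert(model.instanceData.size() <= sharedBuffer.size());
        // Overwrite the front of the shared buffer with this model's data.
        std::copy(model.instanceData.begin(), model.instanceData.end(),
                  sharedBuffer.begin());
        ++stats.uploads;
        ++stats.instancedDraws;
    }
    return stats;
}
```

In this sketch the upload count always equals the number of instanced models in the scene, which is the cost listed against Option 2 in the comparison.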

Option 3 - Caching instances into a buffer for an entire batch (or, if memory requirements aren't a problem, the whole frame). Directly inspired by Battlefield 3 - slide 30.

The render loop would then be something like this:

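std::vector<float4> instanceData;
std::vector<uint> offsets;

// Pack every model's instance data into one contiguous array,
// recording where each model's range starts (in float4s)
for (instancedModel in scene) {
    offsets.push_back(instanceData.size());
    for (float4 data in instancedModel.InstanceData) {
        instanceData.push_back(data);
    }
}

BindInstanceBufferAsVSResource(&instanceData);

for (uint i = 0; i < scene.size(); ++i) {
    // Tell the vertex shader where this model's instances start
    UpdateVSCBuffer(offsets[i], ....);

    scene[i].DrawIndexedInstanced(....);
}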
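
The packing step above can be sketched as compilable C++ on the CPU side only. This is a minimal sketch under stated assumptions, not the Battlefield 3 implementation: Float4, InstancedModel, PackedFrame, PackInstanceData and Float4Index are all hypothetical names, and the layout assumes each instance occupies a whole number of float4s. Float4Index shows the index arithmetic a vertex shader would presumably perform with the per-draw offset and its instance ID.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an HLSL float4.
struct Float4 {
    float x, y, z, w;
};

// Hypothetical per-model record: instance data as a flat list of float4s.
struct InstancedModel {
    std::vector<Float4> instanceData;
    std::uint32_t float4sPerInstance; // e.g. 3 for a 3x4 world matrix
};

struct PackedFrame {
    std::vector<Float4> data;           // contents of the single GPU buffer
    std::vector<std::uint32_t> offsets; // start of each model's range, in float4s
};

// Concatenates every model's instance data and records where each range begins.
PackedFrame PackInstanceData(const std::vector<InstancedModel>& models) {
    PackedFrame frame;

    std::size_t total = 0;
    for (const InstancedModel& model : models) {
        total += model.instanceData.size();
    }
    frame.data.reserve(total);
    frame.offsets.reserve(models.size());

    for (const InstancedModel& model : models) {
        // Each instance must be a whole number of float4s.
        assert(model.float4sPerInstance > 0 &&
               model.instanceData.size() % model.float4sPerInstance == 0);
        frame.offsets.push_back(static_cast<std::uint32_t>(frame.data.size()));
        frame.data.insert(frame.data.end(), model.instanceData.begin(),
                          model.instanceData.end());
    }
    return frame;
}

// Index of one float4 row of one instance inside the packed buffer.
std::uint32_t Float4Index(std::uint32_t modelOffset, std::uint32_t instanceId,
                          std::uint32_t float4sPerInstance, std::uint32_t row) {
    return modelOffset + instanceId * float4sPerInstance + row;
}
```

Because the offsets are counted in float4s rather than in instances, models with different per-instance layouts can share the same buffer, as long as each layout is a multiple of float4.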

Pros and Cons: 

Option 1 - Individual InstanceBuffers per model
Pros:
  1. Static instancing is all cached, ie. you only have to map/update/unmap a buffer once.
Cons:
  1. A ton of instance buffers. I may be overthinking things, but this seems like a lot of memory, especially since every buffer has a fixed size. So you either have to define exactly how many instances of an object can exist, or reserve some extra memory for wiggle room.
Option 2 - Single InstanceBuffer for all models
Pros:
  1. Only one instance buffer. Potentially a much smaller memory footprint than Option 1. However, it needs to be as large as the largest instance count of any single model.
Cons:
  1. Requires a map/update/unmap for every model that needs to be instanced. I have no idea if this is expensive or not.
Option 3 - CBuffer array with all the instances for a frame/batch
Pros:
  1. Much less map/update/unmap than Option 2
  2. Can support multiple types of instance buffers (as long as they are multiples of float4)
Cons:
  1. Static instances still need to be updated every frame.
  2. The shader indexes into a cbuffer. (Can cause memory contention)
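
To put rough numbers on the memory and update trade-offs listed above, here is a back-of-the-envelope cost model. It is a sketch under stated assumptions: ModelInfo, Cost and the three OptionNCost functions are hypothetical, only instance-buffer uploads are counted (the small per-draw offset update in Option 3 is ignored), Option 1 is measured in a steady-state frame after static buffers have been filled, and Option 3 treats the whole frame as one batch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one instanced model for a single frame.
struct ModelInfo {
    std::size_t instanceCount; // instances drawn this frame
    std::size_t capacity;      // instances reserved by Option 1's fixed-size buffer
    bool isDynamic;            // instance data changes every frame
};

struct Cost {
    std::size_t bufferBytes;     // total instance-buffer memory
    std::size_t uploadsPerFrame; // map/update/unmap round trips per frame
};

// Option 1: one fixed-size buffer per model; only dynamic models re-upload.
Cost Option1Cost(const std::vector<ModelInfo>& models, std::size_t bytesPerInstance) {
    Cost cost{0, 0};
    for (const ModelInfo& m : models) {
        assert(m.instanceCount <= m.capacity);
        cost.bufferBytes += m.capacity * bytesPerInstance;
        if (m.isDynamic) {
            ++cost.uploadsPerFrame;
        }
    }
    return cost;
}

// Option 2: one buffer sized for the largest model; one upload per model.
Cost Option2Cost(const std::vector<ModelInfo>& models, std::size_t bytesPerInstance) {
    std::size_t largest = 0;
    for (const ModelInfo& m : models) {
        largest = std::max(largest, m.instanceCount);
    }
    return Cost{largest * bytesPerInstance, models.size()};
}

// Option 3: one buffer holding every instance in the frame; one upload in total.
Cost Option3Cost(const std::vector<ModelInfo>& models, std::size_t bytesPerInstance) {
    std::size_t total = 0;
    for (const ModelInfo& m : models) {
        total += m.instanceCount;
    }
    const std::size_t uploads = models.empty() ? 0 : 1;
    return Cost{total * bytesPerInstance, uploads};
}
```

For example, with three models of 100, 10 and 50 instances, reserved capacities of 128, 16 and 64, and 64 bytes per instance, this model gives 13312 bytes for Option 1, 6400 bytes for Option 2 and 10240 bytes for Option 3. It says nothing about how expensive each upload actually is, which would need profiling on real hardware.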

So those are my thoughts. What are yours? Would you choose any of the three options? Or is there a better option? Let me know in the comments below, on Twitter, or with a Pastebin/Gist.

Happy coding
-RichieSams
