Monday, March 10, 2014

Introducing the Halfling Project

Hello everyone!

It's been entirely too long since I've posted about what I've been working on. Granted, I did make a post a couple weeks ago about Git, but that was mostly for my class. So here goes!

We last left off with me wrapping up GSoC with ScummVM. I have since joined the ScummVM dev team (Yay!) and my current progress on the ZVision engine was merged into the master branch. Unfortunately, due to school keeping me quite busy and another project, I haven't had much time to work more on the engine. That said, it's not abandoned! I'm planning on working more on it after I graduate in August.

I have always been quite fascinated by computer graphics, especially in the algorithms that make real-time graphics possible. Wanting to get into the field, I started teaching myself DirectX 11 last December using Frank Luna's wonderful book, An Introduction to 3D Game Programming with DirectX 11. However, rather than just using his base code, I chose to create my own rendering framework, and thus The Halfling Project was born.

"Why re-invent the wheel?", you ask? Because it forces me to fully understand the graphics concepts, rather than just copy-pasting cookie-cutter code. Also, no matter how recent a tutorial is, there is bound to be some code that is out of date. For example, Frank Luna's code uses .fx files and the D3DX library. Effect files can still be used, but Microsoft discourages it. And the D3DX library doesn't exist anymore. Granted it has a replacement (DirectXMath), but it has a slightly different API. Thus, even if I were to 'copy-paste', I would still have to change the code to fit the new standards.

That said, I didn't come up with everything from scratch. The Halfling Project is heavily influenced by Luna's code, MJP's sample framework, and Glenn Fiedler's blog posts. Overall, The Halfling Project is just a collection of demos that happen to use the same base framework. So, with that in mind, let me describe some of the demos and what I plan for the future.

(If you would like to try out the demos for yourself, there are compiled binaries in my Git repo. You will need a DirectX11 capable graphics card or integrated graphics and will need to install the VS C++ 120 redistributable, which is included with the demos.)

Crate Demo:

My "Hello World" of DirectX 11! Ha ha! So much code for a colored box.... I can't tell you how happy I was when it worked though!

Me: "Look! Look what I made!"
My roommate: "What? It's a box."
Me: "But.... it was hard..."

I guess he had a point though. On to more interesting things!

Wave Simulation Demo:

So the next thing to change was to make the geometry a bit more interesting. I borrowed a wave simulation algorithm from Frank Luna's code and created this demo. Each update, it applies the wave equation to each vertex and updates the Vertex Buffer.

Lighting Demo:

So now we had some interesting geometry, now it was time for some lights! Well, one light...

I actually didn't use the wave simulation geometry because it required a dynamic vertex buffer. (Yes I know you could do it with a static buffer and transformations, but baby steps) Instead, I borrowed another function from Frank Luna's code that used sin/cos to create hills. The lighting is a forward renderer using Lambert diffuse lighting and Blinn-Phong specular lighting. Rather than bore you with my own re-hash of what's already written, I will point you to Google.

Deferred Shading Demo:

This is where I diverged from Frank Luna's book and started off on my own. I like to read graphics white papers and talks on my bus ride to and from school. One that I really liked was Andrew Lauritzen's talk about Tiled Shading. In my head, deferred shading was the next logical step after traditional forward shading, so I launched in, skipping right to tiled deferred shading. However, it wasn't long before I was in way over my head. I guess I should have seen that coming, but hind-sight is 20-20. Therefore I resolved to first implement naïve deferred shading, and THEN think about tiled (and perhaps clustered).

So how is deferred shading different than forward shading? 

Traditional Forward:
  1. The application submits all the triangles it wants rendered to the GPU.
  2. The hardware rasterizer turns the triangles into pixels and sends them off to the pixel shader
  3. The pixel shader applies any lighting equations you have
    • Assuming no light culling, this means the lighting equation is invoked
      ((# pixels from submitted triangles) x (# lights)) times
  4. The output merger rejects pixels that fail the depth test and does pixel blending if blending is enabled

Traditional Deferred:
  • GBuffer Pass:
    1. The application submits all the triangles it wants rendered to the GPU.
    2. The hardware rasterizer turns the triangles into pixels and sends them off to the pixel shader
    3. The pixel shader stores the pixel data in a series of texture buffers called Geometry Buffers or GBuffers for short
      • GBuffer contents vary by implementation, mostly depending on your lighting equation in the second pass
      • Common data is World Position, Surface Normal, Diffuse Color, Specular Color, and Specular Power
    4. The output merger rejects pixels that fail the depth test. Blending is NOT allowed.
  • Lighting Pass:
    1. The application renders a fullscreen quad, guaranteeing a pixel shader thread for every pixel on the screen
    2. The pixel shader samples the GBuffers for the data it needs to light the pixel
    3. Then applies the lighting equation and returns the final color
      • Assuming no light culling, this means the lighting equation is invoked
        ((# pixels on screen) x (# lights)) times
    4. The output merger is pretty much a pass-though, as we don't use a depth buffer for this pass.

So what's the difference? Why go through all that extra work?

Deferred Shading invokes the lighting equation fewer times (generally)

In the past 10 years, there has been a push to make real-time graphics more and more realistic. A massive part of realism is lighting. But, lighting is usually THE most expensive calculation for a scene. In forward shading, you calculate lighting for each and every pixel that the rasterizer creates. However, depending on your scene, a large number of these pixels will be rejected by the depth test. Thus, a large number of calculations were *wasted* in a sense. Granted there are ways around this, but they aren't perfect and I'll leave that for future exploration. Thus, deferred shading effectively separates scene complexity and lighting complexity.

This all said, deferred shading isn't the cure-all for everything; it does have some significant draw-backs
  1. It requires a large* amount of bandwidth and memory to store the GBuffers
    • Large is a relative term. It ultimately depends on what platform you're targeting
  2. It requires hardware that allows multiple render targets
    • Somewhat of a moot point with today's hardware, but still something to watch for
  3. No hardware anti-aliasing.
  4. No transparent geometry / blending

So how is my deferred shading demo implemented?

Albedo-MaterialIndex DXGI_FORMAT_R8G8B8A8_UNORM

8 bits 8 bits 8 bits 8 bits
Albedo Red Albedo Green Albedo Blue Material Index
Normal Phi Normal Theta

Albedo Stores the RGB diffuse color read from texture mapping
MaterialIndex An offset index to a global material array in the shader
Normal The fragment surface unit normal stored in spherical coordinates. (We don't store radius since we know it's 1 for a unit normal)
Depth The hardware depth buffer. It stores (1 - z/w). By swapping the depth planes, we spread the depth precision out more evenly.

Converting the normal to/from spherical coordinates is just some trig, but here is the code I use. Note: My code assumes that the GBuffer can handle non-uniform data. (AKA, potentially outside the range [0, 1])

I use the depth buffer to calculate the world position of the pixel. The basic principle is that since we know the position of the pixel on the screen, using that, the depth, and the inverse ViewProjection matrix, we can calculate the world postion. I'll point you here and here for more information.

So you managed to get through all that, let me reward you with a video and some screenshots. :)

With 500 point lights and 500 spot lights

Visualizing the GBuffers

And one last one to show you that the depth buffer does actually have data in it:

Well that's it for now! I have another demo I'm working on right now, but I'll leave that for another post. If you want a sneak peak, there is a build of it in my repo.

As always, feel free to ask questions and leave comments or suggestions.


1 comment: