Friday, March 13, 2015

Shooting Objects from Across the Way

This is the third post in a series documenting my adventures in creating a GPU path tracer from scratch
1. Tracing Light in a Virtual World
2. Creating Randomness and Accumulating Change

If you recall, the very high level algorithm for path tracing is:
  1. From each pixel, fire a ray out into the scene from the eye.
  2. If the ray hits an object
    1. Use the material properties to accumulate attenuation
    2. Bounce the ray in a new direction
    3. GOTO 2
  3. If the ray hits a light
    1. Multiply the light and the attenuation
    2. Add the result to the accumulation buffer
    3. GOTO 5
  4. If the ray doesn't hit anything
    1. GOTO 5
  5. GOTO 1 until sufficiently converged

The first post in the series covered step 1 and the second post covered step 3b. This post is going to cover how to test if a ray hits an object in the scene (aka, 2 and 3).

Ray - Sphere Intersection

Foreword: Scratchapixel has an excellent lesson covering ray-object intersections. Much of what I say below is based on their lessons. Much props to the authors.

A sphere can be represented mathematically with vector representing the location of the sphere's center, and its radius. (You can save a multiplication in a later calculation if you store the radius squared instead)
struct Sphere {
    float3 Center;
    float RadiusSquared;

A ray can be represented as a vector representing its origin and a unit vector representing its direction:
struct Ray {
    float3 Origin;
    float3 Direction;

Let's look at a picture of the most common type of ray - sphere intersection:
\overrightarrow{R_{o}} &= ray \: origin \\
\overrightarrow{R_{d}} &= ray \: direction \\
\overrightarrow{S_{c}} &= sphere \: center \\
S_{r} &= sphere \: radius \\
\overrightarrow{P_{0}} &= first \: intersection \: point \\
\overrightarrow{P_{1}} &= second \: intersection \: point \\
t_{0} &= distance \: from \: \overrightarrow{R_{o}} \: to \: \overrightarrow{P_{0}} \\
t_{1} &= distance \: from \: \overrightarrow{R_{o}} \: to \: \overrightarrow{P_{1}} \\

We would like to find $\overrightarrow{P_{0}}$ and $\overrightarrow{P_{1}}$. Mathematically, they are defined as:
\overrightarrow{P_{0}} &= \overrightarrow{R_{o}} + t_{0} \overrightarrow{R_{d}} \\
\overrightarrow{P_{1}} &= \overrightarrow{R_{o}} + t_{1} \overrightarrow{R_{d}} \\

We already know $\overrightarrow{R_{o}}$ and $\overrightarrow{R_{d}}$, so we just need to find $t_{0}$ and $t_{1}$. In order to do so, let's define a few new variables.

And then define $t_{0}$ and $t_{1}$ in terms of $t_{ca}$ and $t_{hc}$:
t_{0} &= t_{ca} - t_{hc} \\
t_{1} &= t_{ca} + t_{hc} \\

Now we can begin solving for our unknown variables. To start, let's look at the right triangle formed by $t_{ca}$, $\overrightarrow{L}$, and $d$.

We can solve for $t_{ca}$ by using the definition of cosine and the dot product:
\[\cos\left ( \theta \right ) = \frac{adjacent}{hypotenuse} = \frac{t_{ca}}{\left \| \overrightarrow{L} \right \|}\]
\[\overrightarrow{m} \cdot \overrightarrow{n} = \left \| m \right \| \left \| n \right \| \cos \left ( \theta \right )\]

\[\begin{split} ie: \end{split} \qquad \qquad
\overrightarrow{R_{d}} \cdot \overrightarrow{L} &= \left \| \overrightarrow{R_{d}} \right \| \left \| \overrightarrow{L} \right \| \cos \left ( \theta \right ) \\
                                 &= \frac{\left \| \overrightarrow{L} \right \| t_{ca}}{\left \| \overrightarrow{L} \right \|} \\
                                 &= t_{ca}

$\overrightarrow{R_{d}}$ is a unit vector. Therefore, $\left \| \overrightarrow{R_{d}} \right \| = 1$ and cancels out. Then, if we replace $\cos \left ( \theta \right )$ with its definition, we can cancel $\left \| \overrightarrow{L} \right \|$ from top and bottom. Thus, we're left with just $t_{ca}$.

Using the Pythagorean Theorem and a trick with the dot product, we can solve for d:
\[\overrightarrow{m} \cdot \overrightarrow{m} \equiv \left \| \overrightarrow{m} \right \| ^{2}\]

\[\begin{align} t_{ca} \: ^{2} + d^{2} &=  \left \| \overrightarrow{L} \right \| ^{2}\\
t_{ca} \: ^{2} + d^{2} &= \overrightarrow{L} \cdot \overrightarrow{L} \\
                      d^{2} &= \overrightarrow{L} \cdot \overrightarrow{L} - t_{ca} \: ^{2}

To solve for $t_{hc}$, let's look at the triangle formed by $S_{r}$, $t_{hc}$, and $d$.

Using the Pythagorean Theorem again, we can solve for $t_{hc}$:
t_{hc} \: ^{2} + d^{2} &= S_{r} \: ^{2} \\
                      t_{hc} &= \sqrt{S_{r} \: ^{2} - d^{2}}

Whew! We made it! Using the above equations, we can find the two intersection points. Let's look at some code that implements the above and walk through some special cases / early outs.
 * Test for the intersection of a ray with a sphere
 * @param ray           The ray to test
 * @param sphere        The sphere to test
 * @param normal_out    Filled with normal of the surface at the intersection point. Not changed if no intersection.
 * @return              The distance from the ray origin to the nearest intersection. -1.0f if no intersection
__device__ float TestRaySphereIntersection(Scene::Ray &ray, Scene::Sphere &sphere, float3 &normal_out) {
    float3 L = sphere.Center - ray.Origin;
    float t_ca = dot(L, ray.Direction);
    // Ray points away from the sphere
    if (t_ca < 0) {
        return -1.0f;
    float d_squared = dot(L, L) - t_ca * t_ca;
    // Ray misses the sphere
    if (d_squared > sphere.RadiusSquared) {
        return -1.0f;
    float t_hc = sqrt(sphere.RadiusSquared - d_squared);
    float t_0 = t_ca - t_hc;
    float t_1 = t_ca + t_hc;
    float nearestIntersection;
    float normalDirection;
    if (t_0 > 0 && t_1 > 0) {
        // Two intersections
        // Return the nearest of the two
        nearestIntersection = min(t_0, t_1);
        normalDirection = 1.0f;
    } else {
        // Ray starts inside the sphere
        // Return the far side of the sphere
        nearestIntersection = max(firstIntersection, secondIntersection);
        // We reverse the direction of the normal, since we are inside the sphere
        normalDirection = -1.0f;
    normal_out = normalize(((ray.Origin + (ray.Direction * nearestIntersection)) - sphere.Center) * normalDirection);
    return nearestIntersection;

The algorithm can be summarized as follows:
  1. Calculate $t_{ca}$
  2. If $t_{ca}$ is negative, $\overrightarrow{R_{d}}$ is pointing away from the sphere. Thus, there can not be an intersection
  3. Calculate $d^{2}$
  4. If $d^{2}$ is greater than $S_{r} \: ^{2}$, the ray misses the sphere.
  5. Calculate $t_{hc}$
  6. Calculate $t_{0}$ and $t_{1}$.
  7. If $t_{0}$ and $t_{1}$ are both positive, the ray starts outside the sphere and intersects it twice. Choose the closest of the two intersections
  8. If either $t_{0}$ or $t_{1}$ is negative, the ray starts inside the sphere and intersects it on the way out. Choose the positive intersection.
    • They both can't both be negative, since that would mean the ray is pointing away from the sphere, and we already checked for that.
    • See following picture.

Ray - Plane Intersection

A plane can be represented using any point on the plane, and a unit normal vector from the plane.
struct Plane {
    float3 Point;
    float3 Normal;

\overrightarrow{P_{p}} &= plane \: point \\
\overrightarrow{P_{n}} &= plane \: normal

Let's look at the intersection from the top.

We would like to find $d$. We can start by taking the dot product between $\overrightarrow{R_{d}}$ and  $\overrightarrow{P_{n}}$. We get $\cos \left ( \theta_{1} \right )$, since both  $\overrightarrow{R_{d}}$ and  $\overrightarrow{P_{n}}$ are unit vectors.
\overrightarrow{R_{d}} \cdot \overrightarrow{P_{n}} &= \left \| \overrightarrow{R_{d}} \right \| \left \| \overrightarrow{P_{n}} \right \| \cos \left ( \theta_{1} \right ) \\
          &= \cos \left ( \theta_{1} \right )

Let's look at the triangle formed by $\overrightarrow{P_{n}}$ and $\overrightarrow{L}$, where $\overrightarrow{L}$ is the vector between $\overrightarrow{P_{p}}$ and $\overrightarrow{R_{o}}$.

If we take the dot product between $\overrightarrow{P_{n}}$ and  $\overrightarrow{L}$, we get $a$.

\overrightarrow{P_{n}} \cdot \overrightarrow{L} &= \left \| \overrightarrow{P_{n}} \right \| \left \| \overrightarrow{L} \right \| \cos \left ( \theta_{2} \right ) \\
                                 &= \frac{\left \| \overrightarrow{L} \right \| a}{\left \| \overrightarrow{L} \right \|} \\
                                 &= a

Finally, let's look at the two triangles formed between $\overrightarrow{P_{n}}$, $\overrightarrow{R_{d}}$, and the plane itself.

They are similar triangles, so we can solve for $d$ using the Law of Similar Triangles and the definition of cosine:
\[\cos \left ( \theta_{1} \right ) = \frac{b}{c}\]

\frac{a + b}{c + d} &= \frac{b}{c} \\
c \left ( a + b \right ) &= b \left ( c + d \right ) \\
ac + bc &= bc + bd \\
ac &= bd \\
d &= \frac{ac}{b} \\
d &= \frac{a}{\cos \left ( \theta_{1} \right )}

Let's look at the code implementation of the above and walk through it.
 * Test for the intersection of a ray with a plane   
 * @param ray           The ray to test
 * @param plane         The plane to test
 * @param normal_out    Filled with normal of the surface at the intersection point. Not changed if no intersection.
 * @return              The distance from the ray origin to the nearest intersection. -1.0f if no intersection
__device__ float TestRayPlaneIntersection(Scene::Ray &ray, Scene::Plane &plane, float3 &normal_out) {
    float cos_theta1 = dot(plane.Normal, ray.Direction);
    // If cos_theta1 is greater than -epison,
    // the ray is perpendicular to the plane or points away from the plane normal
    if (cos_theta1 > -1.0e-6f) {
        return -1.0f;
    normal_out = plane.Normal;
    float a = dot(plane.Normal, plane.Point - ray.Origin);
    return a / cos_theta1;

The algorithm can be summarized as follows:
  1. Calculate $\cos \left ( \theta_{1} \right )$
  2. If $\cos \left ( \theta_{1} \right )$ is greater than or equal to zero, the ray is perpendicular to the plane, or faces away
    • In the code, we use a small episilon to account for floating point innacuracies
  3. Calculate a
  4. Return $\frac{a}{\cos \left ( \theta_{1} \right )}$

Testing Them Out

In order to test the code out, I added on to the eye ray code we created in the first post. After creating the ray, I tried to intersect it with an object in the scene. If it hit, I did some basic Lambertian shading with a hard-coded directional light. If it didn't, I colored the pixel black. Here's the results:

Very nice! If I do say so myself. I don't have a plane in the scene because I didn't implement the plane intersection code until I had skipped ahead to the path tracing itself. (I got too excited. Ha ha!) So you'll have to take my word for it that the code does work. Below is the kernel code used to create the image above:
__global__ void PathTraceKernel(unsigned char *textureData, uint width, uint height, size_t pitch, DeviceCamera *g_camera, Scene::Sphere *g_spheres, uint numSpheres, uint hashedFrameNumber) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    // Create a local copy of the camera
    DeviceCamera camera = *g_camera;
    // Global threadId
    int threadId = (blockIdx.x + blockIdx.y * gridDim.x) * (blockDim.x * blockDim.y) + (threadIdx.y * blockDim.x) + threadIdx.x;
    // Create random number generator
    curandState randState;
    curand_init(hashedFrameNumber + threadId, 0, 0, &randState);
    // Calculate the first ray for this pixel
    Scene::Ray ray = {camera.OriginCalculateRayDirectionFromPixel(x, y, width, height, camera, &randState)};
    // Generate a uniform random number
    //float randNum = curand_uniform(&randState);
    // Try to intersect with the spheres;
    float closestIntersection = FLT_MAX;
    float3 normal;
    for (uint i = 0; i < numSpheres; ++i) {
        float3 newNormal;
        float intersection = TestRaySphereIntersection(ray, g_spheres[i], newNormal);
        if (intersection > 0.0f && intersection < closestIntersection) {
            closestIntersection = intersection;
            normal = newNormal;
    float3 pixelColor;
    if (closestIntersection < FLT_MAX) {
        float attentuation = max(dot(normal, make_float3(0.70710678118f, 0.70710678118f, -0.70710678118f)), 0.0f);
        pixelColor = make_float3(0.846, 0.933, 0.949) * attentuation + make_float3(0.15f, 0.15f, 0.15f);
    } else {
        pixelColor = make_float3(0.0f, 0.0f, 0.0f);
    if (x < width && y < height) {
        // Get a pointer to the pixel at (x,y)
        float *pixel = (float *)(textureData + y * pitch) + 4 /*RGBA*/ * x;
        // Write out pixel data
        pixel[0] += pixelColor.x;
        pixel[1] += pixelColor.y;
        pixel[2] += pixelColor.z;
        // Ignore alpha, since it's hardcoded to 1.0f in the display
        // We have to use a RGBA format since CUDA-DirectX interop doesn't support R32G32B32_FLOAT


There we go! We now have everything we need to do path tracing. The next post will be just that: creating a basic path tracing kernel. I've already implemented it in code, so it's just a matter of how quickly I can write up the post. (teaser picture). Keep an eye out!

The code for everything in this post is on GitHub. It's open source under the Apache license, so feel free to use it in your own projects. The path tracing code is already pushed to the repo, if you're feeling antsy, you can give that a look.

As always, feel free to ask questions, make comments, and if you find an error, please let me know.

Happy coding!


  1. Hi, I'm trying to run your code, but ran into some problems. The code compiles fine in vs2013 with cuda 6.5, but when running it I get an error with d3d11.dll pointing to ID3D11ShaderResourceView *hdrSRV in basic_path_tracer.cpp. Can you tell me what to do to solve it?

    1. Hmmm. That is odd. I'm aware of another issue with between the CUDA and D3D textures that I believe to due to the interop code. I'm going to try to fix that today. I'll post here when I push and you can try the updated code.

  2. Thanks a lot! Looking forward to try the update. I should also mention that I don't have the latest DX11 SDK installed, and I'm running this on a laptop with a GT 840M.

    1. The code uses the DirectX SDK that's shipped with the Windows SDK (which comes with Visual Studio). If you're able to compile, then it shouldn't be a problem.

      Are you running the Debug build or the Release build? The Debug build is horridly slow. I found that the WDDM will cause the driver to reset because it thinks the GPU has hung. You can manually override the timeout time (or turn off the timeout) in the NVIDIA Nsight Monitor.

      Or run the Release build. Which is much much much faster.

      I pushed some changes if you want to test those out. (I wish I had another computer I could test on. To help prevent the 'it works on my computer' problems)

  3. Thanks for your time. I tried your new code in Release mode but got the same error as before: Unhandled exception at 0x0F8B03CD (nvwgf2um.dll) in BasicPathTracer.exe: 0xC0000005: Access violation reading location 0x0000BA0D.

    I'll do some more googling to see if I can solve it.

  4. Nevermind, it runs when I open the exe outside Visual Studio :) Really fast, are you planning to develop this further?

    1. Oh! I know what the problem is. Visual Studio stores the working directory for debugging sessions in a .user file (which I don't store on the repo since it contains other user information) The default working directory is the directory of the project file itself. which is, of course, wrong.

      If the working directory is wrong, the program will throw exceptions when it tries to find the cuda dll and/or the shader file.

      To fix the problem, go to the project settings and change the working directory to: $OutDir)

    2. And I'm definitely going to develop it further!! I've become addicted. Ha ha! I'm working on the next blog post right now

    3. Great! Changing working directory did the trick.I know how addictive gpu path tracing can be, been trying it myself. It can be really fun. I actually have some crazy ideas to speed up GI calculations, but haven't had the chance to test them yet. Looking forward to your next post.

    4. Next post is up: