Introduction to ray tracing

In class we discussed the basics of ray tracing. Ray tracing is an object-based rendering method. In contrast, the polygon-based "Z-buffer" rendering that we will be doing later this semester is an image-based method, because it makes use of image coherence from one pixel to the next in an image. This makes Z-buffering fast but limited in its power. Graphics Processing Units, like the products currently offered by nVidia and ATI, rely on image-based Z-buffer methods.

But ray tracing is interesting because it is simple yet powerful, since it doesn't rely on image coherence. You can easily do things with ray tracing, such as casting shadows and seeing reflections in objects, that are very difficult to approach with image-based methods. Unfortunately, ray tracing tends to be slow (precisely because it does not make use of pixel-to-pixel coherence in the image). But that's the price you pay.

We talked in class about how you set up a camera (eye point and image plane) with ray tracing, and we started to talk about how you ray trace a scene consisting of a collection of shiny spheres, like the examples I showed in the previous lecture.

One nice thing about setting up a camera view for ray traced rendering is that it is conceptually so simple: for each pixel of the final image, you shoot a ray (V,W) - with origin point V and direction vector W - backwards from the eye, through the appropriate point on the image plane, and just see what the ray hits first.

Although you can place the camera eye point anywhere, and the image plane anywhere, it's easiest to think about having the eye point at the origin, and the image plane floating somewhere in negative Z, aligned with the X,Y plane. You can always transform everything later with a matrix transformation.

In particular, we can place the image plane as a rectangle floating in the object world, where the left edge of the rectangle is at X = -1.0, the right edge is at X = +1.0, and its Z coordinate is Z = -focal_length.

So the origin point of a camera ray is always just V = (0,0,0), and the direction of the camera ray to pixel (i,j) of an M×N pixel image is:

W = ( 2.0 * i / M - 1.0 , 2.0 * (N-j) / M - 1.0 , -focal_length )
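
Here is a minimal sketch of that computation in Java (the class and method names are hypothetical, not anything we wrote in class). It assumes M is the image width in pixels, N is the height, and pixel (0,0) is the top-left corner. Note that both coordinates are divided by M, which keeps pixels square; for a square image (M = N), Y runs from -1 to +1 just like X.

    class Camera {
        // Direction vector W of the ray through pixel (i, j) of an M x N image.
        // The eye V sits at the origin and the image plane floats at Z = -focalLength.
        // Flipping j to (N - j) makes +Y point up, since j increases downward
        // in image coordinates.
        static double[] rayDirection(int i, int j, int M, int N, double focalLength) {
            double x = 2.0 * i / M - 1.0;        // -1 at the left edge, +1 at the right edge
            double y = 2.0 * (N - j) / M - 1.0;  // +1 at the top, -1 at the bottom (when M == N)
            return new double[] { x, y, -focalLength };
        }
    }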

If you have a scene of spheres, then for every pixel ray (V,W) you need to figure out which sphere the ray hits first.

A sphere is described by its center and radius, which is just the four numbers (Cx, Cy, Cz, R), and consists of all points (x,y,z) in space that satisfy the equation:

(x - Cx)² + (y - Cy)² + (z - Cz)² - R² = 0

The ray (V,W) represents all points:

(x,y,z) = ( Vx + t Wx , Vy + t Wy , Vz + t Wz ),     where t ≥ 0

Substituting the second equation into the first, we get:

(Vx + t Wx - Cx)² + (Vy + t Wy - Cy)² + (Vz + t Wz - Cz)² - R² = 0     where t ≥ 0

When we combine terms, this turns into a quadratic polynomial in t:

A t² + B t + C = 0

for some values of A, B and C. Expanding out the squares gives:

A = Wx² + Wy² + Wz²
B = 2 ( Wx (Vx - Cx) + Wy (Vy - Cy) + Wz (Vz - Cz) )
C = (Vx - Cx)² + (Vy - Cy)² + (Vz - Cz)² - R²

We can just solve for t via the quadratic formula we showed in class, which produces either no real roots (the ray has missed the sphere), one double root (the ray just grazes the sphere), or two real roots (the ray enters and exits the sphere). The first visible intersection is the smallest root with t ≥ 0.
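
Putting the pieces together, here is a minimal sketch in Java (again with hypothetical names and storage conventions, not code from class): one method solves the quadratic for a single sphere, stored as the four numbers (Cx, Cy, Cz, R), and another loops over all the spheres in the scene to find which one a ray hits first.

    class RaySphere {
        // Smallest t >= 0 at which ray (V, W) hits the sphere (Cx, Cy, Cz, R),
        // or Double.POSITIVE_INFINITY if the ray misses it entirely.
        static double intersect(double[] V, double[] W, double[] sph) {
            double dx = V[0] - sph[0], dy = V[1] - sph[1], dz = V[2] - sph[2];
            double A = W[0] * W[0] + W[1] * W[1] + W[2] * W[2];
            double B = 2.0 * (W[0] * dx + W[1] * dy + W[2] * dz);
            double C = dx * dx + dy * dy + dz * dz - sph[3] * sph[3];
            double d = B * B - 4.0 * A * C;        // discriminant
            if (d < 0.0)
                return Double.POSITIVE_INFINITY;   // no real roots: the ray missed
            double t = (-B - Math.sqrt(d)) / (2.0 * A);  // smaller root: near side of sphere
            if (t < 0.0)                                 // near side is behind the eye,
                t = (-B + Math.sqrt(d)) / (2.0 * A);     // so try the far side
            return t >= 0.0 ? t : Double.POSITIVE_INFINITY;
        }

        // Which sphere does the ray hit first? Just keep the smallest t over the scene.
        static int firstHit(double[] V, double[] W, double[][] spheres) {
            int nearest = -1;
            double tMin = Double.POSITIVE_INFINITY;
            for (int s = 0; s < spheres.length; s++) {
                double t = intersect(V, W, spheres[s]);
                if (t < tMin) { tMin = t; nearest = s; }
            }
            return nearest;   // -1 means the ray hit nothing
        }
    }

For each pixel (i,j) you would build W with the camera sketch above, then call firstHit with V = (0,0,0); whatever sphere index comes back is the one that pixel sees.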