As AHR is now "mature", I decided to go and try to implement it inside Unreal Engine 4. Still at early stages, but getting there.
Also, AHR is rendering at about 40 FPS on my system for the High preset (16 rays / 32 samples), all while using less memory, as one of the grids is no longer needed.
Anyway, for more information, follow the thread here: Unreal Engine Forums link
Well, this blog needed some fresh air, so here it is.
I have been working on my GI technique, and one aspect I wanted to fix was the voxel clear code.
As is the trend today, AHR relies on a voxelized scene representation to compute the GI, and given that the scene can change every frame, the grid needs to be cleared every frame as well.
That is a problem, as calling ClearUnorderedAccessViewUint() on a 512x512x512 grid takes about 25 ms on my system (GTX 750 Ti + FX-6300). That's quite a bad result, taking into account that, using a high quality preset, rendering the GI itself takes 25 ms.
So, how did I manage to bring that 25 ms down to 0.7 ms (about 35 times faster, a ~3500% improvement), you may ask? Well, here is how.

I'm using 32 bits per voxel because DirectX forces me to if I want to use atomics, but the data I store doesn't need that much precision. So I took 8 bits out of each voxel and use them to store a frame counter. When raytracing, I compare the frame ID stored in the voxel to the current frame ID to see if the voxel is valid.
So now I have 256 frames to clear the grids, which I do with a simple compute shader. I could use ClearView(), and that would be awesome, but it's only available in DX 11.1, and (to the best of my knowledge) only AMD supports creating an 11.1 device, so that's a no-go.
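To make the idea concrete, here is a minimal CPU-side C++ sketch of it. This is not the actual AHR shader code: the bit layout (top 8 bits for the frame ID, low 24 bits for the payload), the function names, and the "clear 1/256 of the grid per frame" schedule are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout: top 8 bits = frame ID, low 24 bits = voxel payload.
constexpr std::uint32_t kPayloadBits = 24;
constexpr std::uint32_t kPayloadMask = (1u << kPayloadBits) - 1u;

// The frame ID is the frame number modulo 256, so it wraps every 256 frames.
constexpr std::uint8_t frameId(std::uint32_t frameNumber) {
    return static_cast<std::uint8_t>(frameNumber & 0xFFu);
}

// Stamp a 24-bit payload with the ID of the frame that wrote it.
inline std::uint32_t packVoxel(std::uint32_t payload, std::uint8_t id) {
    assert(payload <= kPayloadMask);  // payload must fit in 24 bits
    return (static_cast<std::uint32_t>(id) << kPayloadBits) | payload;
}

constexpr std::uint32_t voxelPayload(std::uint32_t voxel) {
    return voxel & kPayloadMask;
}

constexpr std::uint8_t voxelFrameId(std::uint32_t voxel) {
    return static_cast<std::uint8_t>(voxel >> kPayloadBits);
}

// A voxel only counts if it was written during the current frame.
// A fully cleared voxel (0) carries an empty payload either way.
constexpr bool isVoxelValid(std::uint32_t voxel, std::uint8_t currentId) {
    return voxelFrameId(voxel) == currentId;
}

// Amortized clear (hypothetical schedule): zero 1/256 of the grid each frame,
// before voxelization, so every voxel is reset before its frame ID can repeat.
inline void clearSlice(std::vector<std::uint32_t>& grid, std::uint32_t frameNumber) {
    const std::size_t sliceSize = (grid.size() + 255) / 256;
    const std::size_t begin = std::min(grid.size(), sliceSize * frameId(frameNumber));
    const std::size_t end = std::min(grid.size(), begin + sliceSize);
    std::fill(grid.begin() + static_cast<std::ptrdiff_t>(begin),
              grid.begin() + static_cast<std::ptrdiff_t>(end), 0u);
}
```

The point of the slice clear is that a stale voxel would look valid again once the 8-bit counter wraps around, so the whole grid has to be wiped at least once every 256 frames. Spreading that work out means each frame only touches a small part of the grid.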

Overall, the results are really good. With that change, the total frame time got a 50% reduction, without any noticeable impact on visual quality.
I still have some things to fix, but the idea is there.

Screenshot!


Well, this blog has been dead for a while. I was working hard on getting AHR ready for GDC, as I was rejected at SIGGRAPH (though not by much, actually).
So, here's a new video:
Again, I can't provide a lot of details. I just hope I get into GDC; I'll explain everything there ;)
Global Illumination is the thing of the moment. Rendering it in real time is a really complex problem, so a lot of techniques have surfaced lately, mainly driven by the raw power and programmability of current-gen GPUs and the new generation of consoles.
So I decided to have my own take on the problem.
The inspiration for my technique came from looking at Sparse Voxel Octree Global Illumination a couple of years back. I can't comment a lot right now because I applied to present it at the Real Time Live! conference, which is part of SIGGRAPH.

Below is a video of a baseline implementation of the algorithm. The demo actually runs more smoothly, but Camtasia seems to have a significant impact on the framerate.


With Visual Studio 2012, Microsoft dumped the DirectX SDK, and the effects library with it (along with other things), which I needed for my engine.
To compensate for that, I wrote some code to load an XML file that describes the inputs and outputs of the shaders, just like the .fx files used to do.
Having that working, I realized XML is a bit too cumbersome for that use, and that coming up with a small language would speed up development. Plus, I've always wanted to write a parser/compiler/VM.
Before starting with it, I decided to try something smaller first: something to parse and execute a kind of assembly code meant to run embedded in an application.
So after a lot of coding, and messing with regular expressions, I finally managed to get it working.
The parser can load code like this:
and produce an in-memory representation that the VM can execute. The example above adds two arrays of floats in parallel, and stores the result in another array.
The VM code runs from 10% to 20% slower, which isn't that bad for a simple implementation. OK, so now some code! I won't post the VM, as it is a lot of code, but I'll post the parser.

Parser:

So I finally got to work on the particle simulation again, and I made some progress: mainly, fixing the displacement of the particles and adding a "pseudo z culling" step. What I'm actually doing is comparing the distance from the sample to the camera with the distance from the scene to the camera, and only adding the sample to the final color if the sample's distance is smaller.
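As a rough illustration of that test, here is a small C++ sketch. The real thing lives in a shader; the `Color` struct, the function name, and the plain additive blend are assumptions for the sake of the example.

```cpp
// Hypothetical RGB color used only for this sketch.
struct Color {
    float r, g, b;
};

// "Pseudo z culling": add the particle sample to the accumulated color only
// when it is closer to the camera than the scene surface along the same ray.
// Returns true if the sample was added, false if it was culled.
inline bool accumulateIfVisible(Color& accum, const Color& sample,
                                float sampleDistance, float sceneDistance) {
    if (sampleDistance >= sceneDistance) {
        return false;  // the scene is in front of the sample, so skip it
    }
    accum.r += sample.r;
    accum.g += sample.g;
    accum.b += sample.b;
    return true;
}
```

Here `sceneDistance` would typically come from the scene's depth buffer, converted to the same units as the sample distance.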
Oh, I forgot. I also optimized part of the voxel update code, so I'm now using 80,000 particles at 90 - 110 fps. Quite nice!
There are still things to do, though. First of all, add some shading, maybe even shadows. I have another thing in mind, which I mentioned earlier, regarding compression of the voxels (particles are sparse after all, right?), but it still needs some more work.
For now, here are some screens:





Working on a parser the other day, I decided to use the new <regex> library in C++11.
It's awesome, once you get to know the regex syntax. To make development faster, I coded a couple of functions. They are nothing fancy, but they speed up my workflow.
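The original helper functions are not reproduced here. As a sketch of the kind of thin wrappers that are handy with <regex>, something like the following would do; the names `findAll` and `captureGroups` are made up for this example.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: every non-overlapping match of `pattern` in `text`.
inline std::vector<std::string> findAll(const std::string& text,
                                        const std::string& pattern) {
    const std::regex re(pattern);  // default ECMAScript syntax
    std::vector<std::string> matches;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        matches.push_back(it->str());
    }
    return matches;
}

// Hypothetical helper: the capture groups of the first match of `pattern`
// in `text`, or an empty vector when nothing matches.
inline std::vector<std::string> captureGroups(const std::string& text,
                                              const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch match;
    std::vector<std::string> groups;
    if (std::regex_search(text, match, re)) {
        for (std::size_t i = 1; i < match.size(); ++i) {  // group 0 is the whole match
            groups.push_back(match[i].str());
        }
    }
    return groups;
}
```

For example, `findAll("add r0, r1, r2", R"(r\d+)")` collects the three register names, and `captureGroups("mov r1, 42", R"((\w+)\s+(\w+),\s*(\d+))")` splits the line into opcode, register and literal. Constructing a std::regex is relatively expensive, so for hot paths it would be better to build the regex once and pass it in.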
