The talks by the two big desktop graphics hardware vendors gave me some interesting insight into how much more goes into creating a good VR experience than one might think. I already knew that creating content for VR requires new thinking. You can't just take any game mechanics and make a straight port. But I had not realized how much work there has been on the graphics hardware and drivers to support VR well.
The first talk was by Nathan Reed of nVidia and Dean Beeler of Oculus, and the second one was by Layla Mah of AMD. What struck me was that although their presentation styles were different, the content of the two talks overlapped heavily: apparently both nVidia and AMD have implemented very similar features in their graphics drivers to support VR.
So why do they even need to update their drivers to support VR at all? Can't you just render a right-eye and a left-eye view and be done with it? That's definitely possible, but for VR it is extremely important to keep latency as low as possible. If you move your head, the display has to be updated as soon as possible, or there is a real risk that the user gets sick. You can get motion sickness after only a couple of seconds of high-latency rendering, and it can linger for hours. To avoid this, they say "movement to photons" should take less than 20 milliseconds. Traditionally, graphics drivers have been optimized for throughput, not for low latency, which is why new features were needed to make VR comfortable.
With a screen update frequency of 60 Hz (as on the Samsung Gear VR), one frame is approximately 17 milliseconds. Traditionally you start rendering right after one vblank, and if you render fast enough the result becomes visible at the next vblank, 17 milliseconds later. That is dangerously close to the 20 millisecond limit, so any frame rate hiccup easily pushes you over it. If I understood it correctly, Mah even suggested a limit of 10 milliseconds, which would be impossible to meet if you read out the head orientation right after a vsync and rendered the scene with that camera rotation: by the time the user sees the new frame one vsync later, the head may already have moved too far.
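The arithmetic behind that budget is simple enough to sketch (the numbers below are illustrative, not from the talks, and I'm ignoring sensor read and display scanout delays):

```python
# Back-of-the-envelope motion-to-photon numbers at 60 Hz.
# (Illustrative figures only; ignores sensor and scanout delays.)
refresh_hz = 60
frame_ms = 1000.0 / refresh_hz        # ~16.7 ms between vblanks
budget_ms = 20.0                      # "movement to photons" target

# Pose sampled right after vblank N, frame visible at vblank N+1:
baseline_ms = frame_ms
headroom_ms = budget_ms - baseline_ms

# One missed vblank delays the frame by a full extra refresh:
hiccup_ms = 2 * frame_ms

print(f"baseline {baseline_ms:.1f} ms, headroom {headroom_ms:.1f} ms")
print(f"after one missed vblank: {hiccup_ms:.1f} ms")
```

With barely 3 ms of headroom in the best case, a single missed vblank already lands you at twice the budget, which is why the driver-level tricks below matter.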
The solution to this is asynchronous timewarp. It works like this: you get the head orientation right before rendering the scene. Then you render the scene to a frame buffer that is slightly larger than what will be visible. This is the main rendering pass, and it can take several milliseconds, during which the user's head may move. Then you get the head orientation again and warp the rendered image to adjust for the new head orientation. (This warping reminded me of good old QuickTime VR, though of course you don't need to render a full 360-degree panorama for asynchronous timewarp.)
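For pure head rotation, the warp itself boils down to a 3x3 homography applied to the rendered image. Here is a minimal sketch of that math, assuming a simple pinhole camera model and my own sign conventions (the talks didn't give formulas, so treat this as an illustration rather than what the drivers actually do):

```python
import numpy as np

def intrinsics(f, cx, cy):
    # Pinhole projection: focal length f (pixels), principal point (cx, cy).
    return np.array([[f, 0.0, cx],
                     [0.0, f, cy],
                     [0.0, 0.0, 1.0]])

def rot_y(deg):
    # Head turning left/right about the vertical axis.
    a = np.radians(deg)
    return np.array([[np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

def timewarp_homography(K, r_render, r_display):
    # With no translation, reprojecting the rendered image to the new
    # head orientation is the homography H = K * R_delta * K^-1,
    # where R_delta = R_display * R_render^T.
    r_delta = r_display @ r_render.T
    return K @ r_delta @ np.linalg.inv(K)

K = intrinsics(500.0, 640.0, 360.0)
H = timewarp_homography(K, rot_y(0.0), rot_y(1.0))  # head turned 1 degree

# The old image centre shifts by f * tan(1 deg), roughly 8.7 pixels:
p = H @ np.array([640.0, 360.0, 1.0])
print(p[:2] / p[2])
```

The slightly oversized frame buffer from the main rendering pass is what provides the pixels that this shift drags into view at the image edges.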
To enable asynchronous timewarp without requiring the CPU to wait until right before vsync to issue the draw calls for the warping, the graphics card vendors implemented late latching. Late latching allows the CPU to update constant buffers on the GPU after the draw calls have been submitted. You can queue up the warping draw calls so they run after the scene rendering, while continuing to update the head rotation matrices; when the GPU finally executes the warp, it uses the latest values.
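As a toy illustration of the idea (a hypothetical API in plain Python, not the actual D3D/OpenGL extensions), the point is that the queued draw call holds a reference to the constant buffer and reads its contents only at execution time:

```python
# Toy illustration of late latching (hypothetical API, plain Python).
# The draw call captures a *reference* to the constant buffer when it
# is queued, but reads the contents only when the "GPU" executes it.

class ConstantBuffer:
    def __init__(self):
        self.head_pose = None

def make_warp_call(cbuf):
    def warp():
        # Runs at execution time, so it sees the latest latched value.
        return f"warping with pose {cbuf.head_pose}"
    return warp

cbuf = ConstantBuffer()
queue = [make_warp_call(cbuf)]   # submitted before the pose is final

cbuf.head_pose = "yaw=10deg"     # CPU keeps updating the buffer...
cbuf.head_pose = "yaw=12deg"     # ...right up until execution

print(queue[0]())                # uses the last value written
```

The submission order is fixed early, but the data the warp consumes stays fresh, which is exactly the latency win.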
I suppose late latching should be useful not only for head rotation, but also for getting low latency with hand tracking systems such as Leap Motion or the Sixense STEM, so the GPU renders the user's hands with poses sampled as late as possible.
A lot of the complexity of all this comes from the old-fashioned way that displays work. Even if nothing has changed on the screen, the display updates at a fixed frequency, just like an old CRT. And just like on an old CRT, you have to wait for the next vsync before your latest frame buffer can become visible (unless you accept tearing, of course). There are displays that get rid of this heritage (using nVidia's G-Sync and AMD's FreeSync), but none of them are used for VR.
Also, late latching doesn't help if your scene rendering takes more than one frame. In that case, you want the asynchronous timewarp to run right before vsync even if the new frame isn't ready, reusing the frame buffer from the previous finished rendering; that way you at least get the rotation updated, if not the position and moving objects. This requires the graphics driver to preempt the current rendering, run the timewarp, and then switch back to the original rendering (like task switching on a CPU). Both AMD and nVidia mentioned this in their talks, so apparently they are working on it. Both of them also mentioned the possibility of using two graphics cards, one for each eye, and submitting the same draw calls to both of them instead of issuing nearly identical calls twice. That's only useful if you have two graphics cards, but it seems they're working on making this kind of stereo submission possible with a single GPU too.
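That fallback policy can be sketched in a few lines, assuming a toy model where render i starts at vsync i and must finish within one refresh (the real driver logic, with preemption mid-frame, is of course far more involved):

```python
def warped_sources(render_ms, frame_ms=16.7):
    # For each vsync, pick the scene render the timewarp reads from.
    # Toy model: render i starts at vsync i; at vsync i+1 we warp it
    # if it finished within one refresh, otherwise we re-warp the last
    # finished render (None = nothing rendered yet).
    shown, last_ok = [], None
    for t in render_ms:
        if t <= frame_ms:
            last_ok = len(shown)
        shown.append(last_ok)
    return shown

# Render 2 takes 25 ms and misses its vsync; its slot reuses render 1,
# so the user still gets a rotation update from the warp.
print(warped_sources([12.0, 14.0, 25.0, 13.0]))  # -> [0, 1, 1, 3]
```

The display never skips a rotation update even when the scene renderer stalls, which is the whole point of running the warp asynchronously.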
I saw a comment somewhere that all of these solutions and workarounds are temporary, and that in the future we will not have to worry about frame rate because GPUs will be fast enough to always render at a high frame rate... I wouldn't hold my breath. For the foreseeable future, artists won't think the state-of-the-art consumer graphics cards are good enough for everything they want to do. And until then, these tricks are useful for squeezing as much as possible out of the GPU when doing VR.