Here's a site that goes over three implementations of compensating for lens distortion. It turns out that using a vertex shader to distort the scene geometry based on the camera position is the fastest, since it doesn't require a second shader pass or copying the scene to an intermediate texture.
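To make the idea concrete, here is a minimal sketch (in Python for readability; in practice this math runs per-vertex in a GLSL shader) of the radial pre-distortion such a vertex shader might apply. The coefficients k1, k2 are illustrative placeholders, not real HMD values, and whether the warp expands or compresses depends on the sign convention for your particular lens.

```python
import math

def predistort(x, y, k1=0.22, k2=0.24):
    """Radially pre-warp a vertex's normalized screen position so the
    physical lens's distortion cancels it out.
    k1, k2 are made-up illustrative coefficients, not real HMD values."""
    r2 = x * x + y * y                      # squared distance from lens center
    scale = 1.0 + k1 * r2 + k2 * r2 * r2    # radial polynomial distortion model
    return x * scale, y * scale

# The center is unchanged; off-center points move along their radial direction.
print(predistort(0.0, 0.0))  # (0.0, 0.0)
print(predistort(0.3, 0.6))
```

Because the warp is purely radial, each vertex moves along the line from the lens center through the vertex, which is why straight lines through the center stay straight.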
The Oculus Go uses Fixed Foveated Rendering, which means that the headset's display is divided into 19 sections of varying resolution. The Snapdragon 821, which powers the Oculus Go, is able to render the pieces in parallel.
Interestingly enough, today it was revealed that the new Oculus will have "varifocal lenses" to track what you're looking at, and presumably to do foveated rendering.
Following up on the comment above, I'd like to shamelessly plug the Virtual Reality DeCal here. We do a lecture on the many applications of VR which you can find here, and it covers some of the things discussed in the lecture, by alpan, and more.
Broadly speaking, there are two kinds of controllers: those with 3-DOF (rotation) and those with 6-DOF (rotation + position). 3-DOF controllers are usually cheaper and rely on an internal IMU with a gyroscope to know their rotation. The most common example of a 3-DOF controller is your phone whenever you use it for tilt controls.
6-DOF controllers use different tech, often matching the headset they're paired with. The HTC Vive controllers (shown above) contain a number of IR sensors that detect timed sweeps of IR light from the base stations and send that timing data to the computer, just like the headset itself. The sensors sit in the little dips and are why the controller needs the big ring at the top. It's easy to google if you want to learn more, but I particularly like this visual rendition.
The Oculus Rift's Touch controllers have a similar story, but in reverse. They're covered in IR lights that get picked up by the external sensors (again, just like the headset). The big difference here is that Touch controllers don't have to stream sensor data back to the computer - instead they emit IR that is interpreted by the sensors and the computer. If you're ever using one, you can use a phone camera to see the IR lights on the device (provided your phone camera doesn't have an IR filter, as some fancier ones do).
The Windows MR controllers are a little more interesting since there are no external sensors in the setup. Cameras on the headset pick up white LED lights on the controllers and use that data to solve for the controller's positions relative to the headset. This means that if you put the controller behind you and out of sight, it will no longer track!
A lot of the time, people think of VR as a sort of gimmick technology that's mostly meant for games. However, VR has a LOT of potential applications that can benefit the workplace and many different fields. For one, the military could use VR for flight / battle simulation, which lets people practice expensive and dangerous tasks with no risk. VR is also very useful in healthcare, where it can be used for things such as robotic surgery.
Most of the focus so far has been on the actual VR headset and display itself - I was wondering how hand-held controllers (like the tilt brush pictured here) fit into the picture. Is it just a motion tracker that sends signals to the headset about its position? What other information does it provide?
Say we wanted to have a full rendering of just the bunny. With this robot, how does the gantry capture light coming out of the base of the bunny (where it is sitting on the cloth)? Would it be better to place it on some sort of see-through surface so the camera can capture what it looks like on the bottom?
Although slightly inaccurate as you point out, the term "360-degree" is often taken to mean views in every direction, including up and down - "360-degree video," for example. I guess the proper term would be a "4π steradian FOV," but that's not quite as catchy.
To chime in on the last 2 paragraphs by James, there is research being done at Berkeley on pre-filtering images and using special displays so that nearsighted or farsighted users can see the image clearly without glasses. Check out Prof. Brian Barsky! I don't know much about the project beyond its existence, but (correct me if I'm wrong) it involves some kind of microlens array, with concepts not unlike the light field cameras we've just studied.
Didn't know where to put this comment since it wasn't mentioned in any of the VR lectures, but "time warping" is a technique for lowering latency between rendering and movement in the Oculus Rift system, designed with help from John Carmack, a big name in graphics/Oculus/gaming. Here is a video which explains time warping. In essence, the system samples head orientation at the start of the frame, renders the image, and right before the refresh of the screen samples head orientation again and re-projects the rendered image so that it matches the more current head orientation.
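As a rough illustration of the idea (not the actual Oculus implementation, which re-projects in 3D on the GPU), pure yaw rotation can be approximated as a horizontal image shift of pixels-per-degree times the late-sampled yaw change. All names, the FOV value, and the sign convention below are assumptions for the sketch:

```python
def timewarp_shift(delta_yaw_deg, fov_deg=90.0, width_px=1280):
    """Approximate rotational timewarp as a horizontal pixel shift:
    (pixels per degree of FOV) * (yaw change since render time).
    Small-angle sketch only; real timewarp re-projects in 3D."""
    px_per_deg = width_px / fov_deg
    return int(round(delta_yaw_deg * px_per_deg))

def apply_timewarp(row, delta_yaw_deg, fov_deg=90.0):
    """Shift one scanline of the rendered image; edge pixels repeat.
    Sign convention (positive yaw -> scene slides left) is arbitrary here."""
    shift = timewarp_shift(delta_yaw_deg, fov_deg, len(row))
    if shift == 0:
        return row[:]
    if shift > 0:
        return row[shift:] + [row[-1]] * shift
    return [row[0]] * (-shift) + row[:shift]
```

The key point is that this correction is applied at the last possible moment before scan-out, using head-tracking data much fresher than the frame itself.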
As mentioned in the next lecture, this software is called foveated rendering. Basically, spend more time rendering where the eye gaze lands.
Why isn't the lens adjusted in a way that forces the light to go to the photodiode, instead of letting most of it hit the circuitry? I'm not sure whether what I mean here is possible, but wouldn't it be better to rotate it a little bit so that the light which would otherwise land on the circuitry is redirected to the diode?
Why do most cameras not try to get a wider gamut just like Sony and Kodak? Is it a matter of cost? or is the difference not really noticeable?
I'm not sure if I'm imagining this, but it seems like even the highest quality of photos taken with very capable light field cameras are still a bit blurry. Why is that? Is it because of the computation of the re-focusing?
One possible way around this could be eye-tracking software. One could also imagine contact lenses with electronics on them, which could possibly track eye motion while rendering an image for the user, although at the moment that sounds more like sci-fi to me.
Don't we need beyond 360-degree FOV, since that won't account for looking up or down? What would it be called to have a view in every direction around you?
For tracking position, you'd probably want to use a kalman filter: https://en.wikipedia.org/wiki/Kalman_filter
This technique takes in account prior knowledge of states in order to determine and correct for the probable degree of error, making it more robust than other techniques.
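For anyone curious what that looks like in practice, here is a minimal one-dimensional Kalman filter sketch (constant-position model; the noise variances q and r are illustrative, not tuned for any real tracker):

```python
class Kalman1D:
    """Minimal 1D Kalman filter for tracking a roughly static position.
    q: process noise variance (how much the true state can drift per step)
    r: measurement noise variance (how noisy each sensor reading is)"""
    def __init__(self, x0=0.0, p0=1.0, q=0.01, r=0.5):
        self.x, self.p, self.q, self.r = x0, p0, q, r

    def update(self, z):
        # Predict: state estimate unchanged, but uncertainty grows.
        self.p += self.q
        # Correct: blend prediction and measurement z via the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)
        return self.x

kf = Kalman1D()
for z in [1.1, 0.9, 1.05, 0.95, 1.0]:   # noisy readings around a true value of 1.0
    est = kf.update(z)
```

The gain k is exactly where the "prior knowledge of states" comes in: when the filter's own uncertainty p is small relative to the measurement noise r, new readings barely move the estimate.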
@stecd: SNR improvement measures the factor of improvement between the two pixel types, one that captures 10,000, another that captures 1000, so what we want is SNR_10000 / SNR_1000, which equals sqrt(10000) / sqrt(1000) and that makes sqrt(10)
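The arithmetic is quick to verify, using the fact that photon shot noise is Poisson (signal N, noise sqrt(N), so SNR = sqrt(N)):

```python
import math

def shot_noise_snr(n_electrons):
    """Shot-noise-limited SNR: Poisson signal N with noise sqrt(N)
    gives SNR = N / sqrt(N) = sqrt(N)."""
    return math.sqrt(n_electrons)

snr_big = shot_noise_snr(10_000)    # 100.0
snr_small = shot_noise_snr(1_000)   # ~31.6
improvement = snr_big / snr_small   # sqrt(10000/1000) = sqrt(10) ~ 3.16
```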
Shouldn't it be $3 \times N$ for 3 rows, N columns (one row for each primary light)?
@kana-mycin: Great questions!
As to your first question, it is not possible to have a real spectrum that would lie outside the convex hull of visible colors shown here. To do so, we would mathematically need to have a spectrum with "negative energy" at some wavelengths. My earlier comment on this slide gives a bit more detail on this.
As to your second question, yes, we can extend the gamut of a display by adding more primary lights, each with its own unique spectral power distribution. For example, some TVs try to add "yellow" pixels in addition to red, green and blue, for exactly this reason. Your intuition is correct that monochromatic spectra would maximize the addressable gamut by placing these primary lights on the convex hull shown in the figure. Your choice of four wavelengths is strategic too, and would geometrically cover most of the visible colors shown. Of course, creating a device with real pixels emitting those monochromatic wavelengths would be an engineering challenge.
Also, when designing displays, the fact that we only have three values R G B limits the color gamut to a triangular subset of the CIE chromaticity plot. Is there a reason why we can't simply add a few more display colors around the outside of the plot? For example if we had pure 460, 500, 520, and 620 nm spectra, wouldn't the resulting quadrilateral cover a pretty large portion of the chromaticity diagram?
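One way to sanity-check the "larger gamut" claim is to compare polygon areas on the chromaticity plane with the shoelace formula. The (x, y) coordinates below are rough illustrative approximations of monochromatic-primary chromaticities, not exact spectral-locus values:

```python
def polygon_area(pts):
    """Shoelace formula: area of a simple polygon given (x, y)
    vertices listed in order around the boundary."""
    s = 0.0
    n = len(pts)
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

# Rough, illustrative chromaticities (not exact CIE locus values).
rgb_triangle = [(0.68, 0.32), (0.17, 0.80), (0.14, 0.05)]            # ~620, ~520, ~460 nm
quad = [(0.68, 0.32), (0.17, 0.80), (0.01, 0.55), (0.14, 0.05)]      # adds a ~500 nm primary

print(polygon_area(quad) > polygon_area(rgb_triangle))  # True
```

Adding the fourth primary near the cyan bulge of the locus is exactly where an RGB triangle loses the most area, which matches the intuition in the question.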
Is it possible to generate a spectrum that falls outside this convex hull, by perhaps specifying values of x and y? For example the point (x,y) = (0.7, 0.1). And what would that look like?
Just to clarify what @cs184-afm was saying for anyone reading these slides, the left image is the true spectral power distribution of the sun and the right image is the spectral power distribution that might be produced by a photo of the sun on your computer.
@whyalex These are extremal rays passing through the aperture and through conjugate points. To calculate the correct angle that you describe we would use Gauss' ray construction as shown in previous slides. Here, we are taking a first look at specific ranges of conjugate distances.
You're right, it does cut down the resolution of the original sensor because we're allocating CMOS sensors to capture "redundant light" (although we really know that we're also capturing the direction of the light ray as well).
However, Ren's demo was done on images that were already at the resolution captured by each of the microlenses. Playing with depth of field is not going to change the resolution of the image, because all you're doing is summing up several of the (u, v) images together. So the resolution of the image will be identical whether you're using a subaperture image or creating a virtual aperture.
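Here is a minimal whole-pixel sketch of that shift-and-add idea (real refocusing uses fractional shifts with interpolation; the data layout and parameter names here are assumptions for illustration):

```python
def shift_image(img, dy, dx):
    """Integer-pixel shift of a 2D list, clamping to the nearest edge pixel."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            out[y][x] = img[sy][sx]
    return out

def refocus(views, shift_scale):
    """Shift-and-add refocusing sketch: average the (u, v) sub-aperture
    views after shifting each one proportionally to its (u, v) offset.
    shift_scale = 0 reproduces a plain average (original focal plane).
    Note the output has the same resolution as each input view."""
    acc = None
    for (u, v), img in views.items():
        shifted = shift_image(img, round(v * shift_scale), round(u * shift_scale))
        if acc is None:
            acc = [row[:] for row in shifted]
        else:
            for y in range(len(acc)):
                for x in range(len(acc[0])):
                    acc[y][x] += shifted[y][x]
    n = len(views)
    return [[p / n for p in row] for row in acc]

img = [[1.0, 2.0], [3.0, 4.0]]
views = {(0, 0): img, (1, 0): img}
out = refocus(views, 0.0)   # averaging identical views reproduces img
```

The output resolution never changes with `shift_scale`, which is the point being made above: refocusing only redistributes which (u, v) samples get summed together.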
Yes, typo, thank you for the correction!
@dunkin_donuts: think of it in linear algebra terms. There is no reason that the coefficients on the DCT basis functions will be non-negative!
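A tiny numerical check makes this concrete. Below is an unnormalized DCT-II coefficient computed by brute force (constants chosen just for illustration); mirroring a block flips the sign of the first horizontal-frequency coefficient:

```python
import math

def dct2_coeff(block, u, v):
    """One (unnormalized) 2D DCT-II coefficient of an NxN block."""
    n = len(block)
    s = 0.0
    for y in range(n):
        for x in range(n):
            s += (block[y][x]
                  * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                  * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
    return s

# Bright-left/dark-right gives a positive (1, 0) coefficient;
# the mirrored block gives the same magnitude with a negative sign.
block = [[1.0, 0.0], [1.0, 0.0]]
flipped = [[0.0, 1.0], [0.0, 1.0]]
print(dct2_coeff(block, 1, 0), dct2_coeff(flipped, 1, 0))  # positive, then negative
```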
I like this discussion. Yes, if you have only three primary lights, then attempting to match $s$ at every wavelength as closely as possible will generally give a signal that is perceptually less similar. We have to take into account the viewer's sensitivity functions.
This distribution is for independent, identically distributed random events. Raindrops, arrival of photons, ...
@dunkin_donuts: Great question! For short wavelength visible blue light, 90% of photons are absorbed in the first micron. For long wavelength red light, it can take up to 8-9 microns. This site has a table.
FYI: afterimage = an impression of a vivid sensation (especially a visual image) retained after the stimulus has ceased.
We vividly see colors in this image due to the previous image's hue and saturation content; we see the opposite (complementary) colors.
How was the sqrt(10) improvement calculated? We have 10,000 photoelectrons, which gives SNR = sqrt(10,000) = 100, and for the 1,000 photoelectrons, SNR = sqrt(1,000) ≈ 31.6. Where did sqrt(10) come from?
What are the rays being traced in this picture? They're not the parallel/chief/focal ones, right? How do we know what the correct angle of the ray is after it passes through the lens?
I don't think it particularly matters. The only thing that is consistent is that the lens is represented by u and v, though you could just as easily swap the variables around since most of these just represent coordinates anyway.
@Anna I think you can do this by just taking a single pixel out of every microlens.
@hwl I suspect it's due to the design of microlens and sensor arrays.
I don't think BDPT is covered?
This diagram seems to depict a Max function as opposed to linearity?
@changchuming nothing special for image processing; I think you are correct that it should be written as (row, column) to match the illustration.
In this slide, (x,y) represents a microlens location within the image. But on previous slides, x represented a position on the focus plane. Which one is the proper definition? Or are they equivalent?