For me, this was the most anticipated talk of the two days of Google I/O 2015. Some of it pleasantly surprised me, and some of it concerns me. My concerns are about key concepts that VR developers need to grasp firmly but that felt glossed over. You can watch the presentation on YouTube: https://youtu.be/Qwh1LBzz3AU
The talk was about 45 minutes long, and I took careful notes. I’m putting my notes here and adding comments with additional information. The idea to post my notes came to me because, after the presentation, I noticed some people asking basic questions about information the presenters had just covered. I believe this was because the talk did not go into much depth, and these are huge concepts for newcomers. In my opinion, the presentation works well as a first introduction and survey of basic VR application concepts.
I would recommend that you watch the presentation, take your own notes and use this post as a tool to compare. I may not have captured every detail. I hope I can help you grasp some of the basic building blocks in order to prepare for some of the more advanced concepts.
Why VR?
The discussion begins by asking the question “why virtual reality?”, and the general answer is “teleportation”. They mention exploring virtual ruins, transporting users to virtual exhibits, going to any location and, finally, the “super power” of flying.
Advice for VR application developers
“Start small” and “focus on real ‘game changers’”. Ask what part of an existing application would be improved in VR. Their example of adding VR to an existing 2D application is real estate: putting the photo gallery into VR would be an improvement, while the other parts of the application are left on the 2D screen.
Best practices when supporting Cardboard
It is best to refer to the VR feature as Cardboard rather than VR, so it is clear to the user what they need to use. Instructions for using Cardboard should be provided by linking to the Cardboard site rather than reproduced directly.
Over one million Cardboard units have been sold in a year’s time; the audience for applications will continue to build.
When crafting applications you must test them: try early and often.
Empathy for Users
Have empathy for users and ask yourself “how would it feel?”. Where does the user drop into the environment? What is a good starting point to ease into the virtual world?
Users are intimidated by large objects; they feel weak and small. Users are protective of their eyes and do not like airborne projectiles near the eyes, nor being in proximity to sharp objects.
Large objects moving in close to the user trigger a fear of being crushed. Users do not want to collide with objects.
Comfortable environment: if the user is made tall, a small environment feels less threatening. Large spaces can feel intimidating.
Perspective: First or Third?
Perspective is important. Most virtual environments are first-person, but some use a third-person perspective (looking at the main character). A third-person example is the VR Cardboard arcade game “RoboBliteration”: https://play.google.com/store/apps/details?id=com.otherworld.robobliterationfull&hl=en
Applications for Humans
We are building applications for humans and need to understand basic mechanisms.
Convergence: the eyes rotate inward as objects get closer. Move a finger toward your eyes and they will cross to “converge” on it. Objects can become too close to allow focus.
When creating the scene, have the user focus on one thing at a time. People are better at sorting depth for closer images than for far-away images.
The stereoscopic scene needs to be set up with the user’s interpupillary distance (IPD), the distance between the eyes.
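To make convergence concrete, here is a minimal sketch of the geometry: the angle each eye rotates inward to fixate a point straight ahead, given the IPD. The 64 mm default and the function name are my own illustration, not anything from the talk.

```python
import math

def vergence_angle_deg(distance_m: float, ipd_m: float = 0.064) -> float:
    """Angle each eye rotates inward to converge on a point straight
    ahead at the given distance (half-IPD over distance trigonometry)."""
    return math.degrees(math.atan((ipd_m / 2.0) / distance_m))

# The closer the object, the harder the eyes must cross:
for d in (3.0, 1.0, 0.3, 0.1):
    print(f"{d:4.1f} m -> {vergence_angle_deg(d):5.2f} degrees per eye")
```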
You must ensure the scale of objects in the scene matches objects in the real world.
Note: they demonstrated an interesting application showing a 2D overhead view of the user centered in a series of rings that represent distance. Placing an object between the rings with the mouse updated a front view rendered from the user’s perspective.
Software Applications
Cardboard now has an SDK for Unity. There are easy-to-use “prefabs” that drop into Unity and “just work”, and there is a camera script that updates an existing camera in Unity to work with Cardboard.
Design Principles
It is best to learn design principles while in VR. A new application is available called “Cardboard Design Lab” that features 10 lessons.
Interaction Design
Design the environment to leverage the entire canvas. The user is free to look anywhere, so use cues to direct attention to the action area of the environment. To reorient the user, fade the view to black and rotate the environment to the desired area.
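Here is a rough sketch of that “fade, rotate, fade back” cue. The Scene class and fade routine are stand-ins of my own for a real renderer; nothing here comes from the Cardboard SDK.

```python
import time
from dataclasses import dataclass

@dataclass
class Scene:
    world_yaw_deg: float = 0.0  # rotation of the environment around the user
    alpha: float = 1.0          # 1.0 = fully visible, 0.0 = black

def fade(scene: Scene, to_alpha: float, seconds: float = 0.3, steps: int = 10):
    """Linearly fade the view toward to_alpha (stand-in for a real fade)."""
    start = scene.alpha
    for i in range(1, steps + 1):
        scene.alpha = start + (to_alpha - start) * i / steps
        time.sleep(seconds / steps)

def reorient_user(scene: Scene, target_yaw_deg: float):
    """Fade to black, rotate the environment while the screen is dark
    (so the user never sees the world spin), then fade back in."""
    fade(scene, to_alpha=0.0)
    scene.world_yaw_deg = target_yaw_deg
    fade(scene, to_alpha=1.0)

reorient_user(Scene(), target_yaw_deg=90.0)
```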
Edge Placement
New users do not automatically know they should turn their head to use the head tracker. Place an object partially off screen, at the edge of the view, as a cue for the user to turn their head without being told. The user will learn about head tracking naturally through discovery.
Light Cues
Use light rays to highlight a directed path or objects to interact with. Illuminate the door you wish the user to pass through.
Gaze Cues: detect where the user’s head is directed and respond, e.g. cause stars in the sky to flash when looked at.
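A gaze cue usually reduces to a dot-product test between the head’s forward vector and the direction to the object. This is my own minimal sketch; the 15-degree threshold is an arbitrary assumption.

```python
import math

def is_gazed_at(forward, to_object, threshold_deg: float = 15.0) -> bool:
    """True if the head's forward vector points within threshold_deg of
    the object's direction (both given as unit 3-vectors)."""
    dot = sum(f * t for f, t in zip(forward, to_object))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
    return angle < threshold_deg

# Looking straight ahead (+z); a star about 10 degrees above the horizon:
star_dir = (0.0, 0.1736, 0.9848)
print(is_gazed_at((0.0, 0.0, 1.0), star_dir))  # True -> make the star flash
```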
Human Technical Specifications
Vestibular-ocular disparity: a mismatch that arises when the surrounding view moves at a different rate than the fluid in the vestibular system of the inner ear. The result is motion sickness.
You must always maintain head tracking! If lag is detected, fade to black rather than show a delayed image update.
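A watchdog like the following captures the idea: if frames arrive late enough to betray a tracking stall, show black instead of stale imagery. The 60 Hz budget and the callback names are assumptions for illustration only.

```python
import time

FRAME_BUDGET_S = 1.0 / 60.0  # assume a 60 Hz display
MAX_LATE_FRAMES = 3          # tolerate brief hiccups before fading

late_frames = 0
last_frame = time.monotonic()

def on_frame(render, fade_to_black):
    """Call once per frame: render normally, but switch to black
    rather than presenting a delayed image when tracking lags."""
    global late_frames, last_frame
    now = time.monotonic()
    late_frames = late_frames + 1 if now - last_frame > FRAME_BUDGET_S * 2 else 0
    last_frame = now
    if late_frames >= MAX_LATE_FRAMES:
        fade_to_black()
    else:
        render()

on_frame(render=lambda: print("frame rendered"),
         fade_to_black=lambda: print("fading to black"))
```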
We do not detect when we are in motion at a constant velocity. We detect when we are accelerating and decelerating.
Always keep the horizon line stable; do not let it move like the deck of a rocking boat.
Resolution, Acuity and Field of View
Resolution: the pixel density needed to match the eye is about 60 pixels per degree; a 60 by 60 pixel patch fills one square degree of the visual field. That density provides 20/20 acuity.
The human field of view is about 210 degrees horizontal and 100 degrees vertical, so a display matching it would need 12,600 by 6,000 pixels. Note that only the fovea sees at high resolution, and that covers just 2 degrees of the field. The eye moves constantly to bring the fovea onto objects of interest.
The Google Nexus 7 in Cardboard provides about 100 degrees at 13 pixels per degree.
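The arithmetic behind those figures is straightforward; this back-of-envelope check just multiplies the talk’s numbers together.

```python
PIXELS_PER_DEGREE_2020 = 60      # density quoted for 20/20 acuity
FOV_H_DEG, FOV_V_DEG = 210, 100  # quoted human field of view

width = FOV_H_DEG * PIXELS_PER_DEGREE_2020   # 12,600
height = FOV_V_DEG * PIXELS_PER_DEGREE_2020  # 6,000
print(f"Eye-matching display: {width} x {height} pixels "
      f"({width * height / 1e6:.1f} megapixels)")
```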
In-World User Interface
Situational awareness: embody the interface in the world. Instead of a “meta interface” like a health bar, show actual damage graphically.
Hearing
Hearing lets you build a mental model of the environment, and an environment author can direct attention with spatial audio. Play the sounds of objects before the user sees them in the field of view.
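Even without a full spatial-audio pipeline, you can convey direction with a constant-power stereo pan. This is a crude stand-in of my own, not how Cardboard audio actually works.

```python
import math

def stereo_gains(azimuth_deg: float) -> tuple:
    """Constant-power pan: 0 = straight ahead, +90 = hard right,
    -90 = hard left. A crude stand-in for true spatial audio."""
    pan = math.radians((azimuth_deg + 90.0) / 2.0)  # map [-90, 90] -> [0, 90]
    return math.cos(pan), math.sin(pan)             # (left gain, right gain)

# A creature growling off to the user's right, before it is visible:
left, right = stereo_gains(60.0)
print(f"left={left:.2f}, right={right:.2f}")  # left=0.26, right=0.97
```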
Powerful Medium
Immersed
VR is such a powerful medium that you can forget you are in VR; you can become completely absorbed in the tasks you are performing. The sense of presence, combined with a flow of tasks, can enhance creative work: the notion of being lost in, or immersed in, the experience.
Memory
During the actual VR experience you are aware you are in VR, but afterward, memories of virtual events are not immediately distinguishable from memories of real ones. The memories can seem real, and you have to remind yourself that they came from a virtual event.
Answers to questions at the end reveal that as field of view (degrees) goes up, pixels per degree (acuity) goes down. Remember that you need 60 pixels per degree for 20/20 vision, but only across the central 2 degrees covered by the fovea.
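That trade-off is easy to see with a fixed panel: widening the optics’ field of view spreads the same pixels thinner. The 1280-pixel-per-eye figure below is an illustrative assumption, not a measurement from the talk.

```python
PANEL_PIXELS_PER_EYE = 1280  # e.g. half of a 2560-pixel-wide screen

for fov_deg in (60, 80, 100, 120):
    ppd = PANEL_PIXELS_PER_EYE / fov_deg
    print(f"FOV {fov_deg:3d} deg -> {ppd:5.1f} px/deg (60 needed for 20/20)")
```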
My favorite thing to come from the talk was the expression “snackable VR”, given in an answer. The question was about adding more complex features to Cardboard. They explained the trade-offs in viewer design: the more features you try to support, the more constraints you are up against. The goal of Cardboard is pure simplicity: make it as simple as possible while still giving the user an effective experience. They mention the success of “Tilt Brush”.
There will be future posts on this blog expanding on that last concept: how trade-offs work under the constraints of more complexity. One example is a huge dividing line in the feature matrix of viewer product selection: supporting eyeglasses. I am very close to this subject, as I am not only nearsighted but also have fairly strong astigmatism (meaning my eyeballs are shaped more like footballs than spheres). I need an exact prescription to correct my vision; no simple focus knob, like the one on binoculars, is going to help me completely.
To understand the impact of supporting eyeglasses on the entire viewer, we must first look at where the glasses reside in the system. Obviously, glasses must go between the eye lens of the viewer and the eye. One of the first things I notice when trying out Brand X viewers (or Brand “O”, as the case may be) is that the light mask is not wide enough to let my glasses fit horizontally.
Of the dimensional impacts on the system, width is the least concern. The dimension of most concern is the depth of the viewer box: how far out in front of the eyes will the screen be? Fitting glasses between the eye and the lens requires an eye relief (the distance from eye to lens) of at least an inch.
This means the screen is ultimately pushed further away from the head. We call the effect of the screen’s weight sitting away from the head a “lever arm”. It is like a seesaw in the park: the screen is the person on the other side and your neck is the fulcrum. A small child can lift an adult if his seat is far enough from center.
The further the screen is from the center of the neck, the more effective weight it has. What’s more interesting, you get the full effect of inertia on sharp head turns: once you get a more distant screen in motion and suddenly stop, it wants to keep going and gives your neck a little adjustment for free. It can even twist your glasses off your face in a unique and unfriendly way. Back in the day, our helmets weighed over four pounds and we had to counterweight them in the back. That is another story.
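The seesaw arithmetic is simple: torque about the neck grows linearly with how far out the mass sits. All the numbers below are illustrative assumptions, not measurements.

```python
MASS_KG = 0.35  # assumed phone-plus-viewer mass
G = 9.81        # gravitational acceleration, m/s^2

for depth_m in (0.08, 0.12, 0.16):   # distance from the neck pivot to the screen
    torque = MASS_KG * G * depth_m   # static torque in newton-meters
    print(f"screen at {depth_m * 100:.0f} cm -> {torque:.2f} N*m on the neck")
```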
If you keep the lens the same size and move it further from the eye, perspective makes the lens appear smaller. Instead of seeing the same field coming through the lens, you start to see the edges of the lens and a vignetted image. That forces you to use larger-diameter lenses to compensate for the increased eye distance.
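A thin-lens, pupil-as-a-point simplification shows why: the aperture needed to subtend a given field grows linearly with eye relief. The 90-degree field and the relief values are my own example numbers.

```python
import math

def min_lens_diameter_mm(eye_relief_mm: float, fov_deg: float) -> float:
    """Smallest lens aperture that still subtends the full field of view
    from the eye (thin-lens, pupil-as-a-point simplification)."""
    return 2.0 * eye_relief_mm * math.tan(math.radians(fov_deg / 2.0))

for relief in (15, 25):  # mm; roughly 25 mm leaves room for eyeglasses
    print(f"eye relief {relief} mm -> lens >= "
          f"{min_lens_diameter_mm(relief, 90):.0f} mm for a 90-degree field")
```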
When a spherical lens of the same power is increased in diameter, it quickly becomes thicker and heavier, and as we will see, larger spherical lenses are prone to greater distortions. At that point, you have to consider using multiple thinner lenses together. Of course, that means...moving the screen out further to fit the longer optics box…
Note: one of the developments we accomplished at ThreeSight VR in 2015 was the TopMount Viewer. The phone sits above the eyes with the screen facing down, and a waveguide steers the images to the eyes. The phone’s weight is distributed better as it moves outward, creating a sleeker form factor with a wedge-shaped side profile. We are looking for software application partners to update their games and other applications to work with it.
I will be writing soon on more viewer design topics. Thanks for reading!