Designing Interactive Experiences in Virtual Worlds

I remember when a folder was an awkward thing that spilled papers all over the floor when you picked it up the wrong way. When I was in high school, the word 'icon' still carried religious connotations. I saw my first hourglass in the hands of a witch in the movie The Wizard of Oz.

The symbolism of the current digital age was established during my generation's youth (and yeah, I'm kind of old). This means that although my son will never hold a floppy disk, the symbol for 'save' in most applications is still a floppy disk, and not a USB key.

There's a word for these metaphors we carry with us into a later age to help us navigate it: skeuomorphs. But skeuomorphs are changing, too. That is why we now click clouds with arrows instead of floppy disks when we want to save data.

Virtual reality is pushing this change forward. Affordances like cursors don't work well in a fully immersive VR environment, especially once you add gestures into the equation. Moving your whole arm to drive a cursor through 3D space feels counterintuitive: cursors were designed to move in 2D space, and that's how we expect them to move.

2D cursors are an example of how using existing symbols in a 3D environment causes user pain.

The biggest difference between interacting in immersive 3D space and interacting on a screen is that we are born interacting in an immersive 3D space, so the affordances we use (power buttons, light switches, teapot handles) are already deeply ingrained.

We are currently in a state of pre-convergence. To replicate real-world interactions in VR, we need haptic feedback and accurate hand tracking. We must feel the switch move up and down, and see our hands respond with as little latency as possible.

That level of realism is on the horizon, but we aren't quite there yet. It will be a few years before haptic hand-tracking technology is available at the consumer level.

How do we create intuitive user experiences for 3D space using available technology? 

I believe the key to good interaction design in VR is smart use of physical gestures. New control hardware is available that is designed specifically to allow extra degrees of freedom. A 2D cursor in a 3D environment is confusing, but a 3D cursor that can visibly be moved on a plane (forward, backward, side to side) and along the y axis (up and down) is a good way for users to instantly grasp the new constraints.

Imagine a mouse that can be raised and lowered, rotated, and pushed forward and backward. The Razer Hydra does this pretty well, and there are other gaming controllers on the consumer market that can be used to interact in 3D space. But there aren't enough of them.
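As a rough sketch of what that controller-to-cursor mapping might look like in code (the ControllerPose type, the resting origin, and the gain value are all assumptions for illustration, not any particular SDK's API):

```typescript
// A minimal sketch of a 3D cursor driven by a tracked 6-DOF controller.
// ControllerPose and the gain value are illustrative; a real integration
// would read poses from the controller's own SDK.

interface Vec3 { x: number; y: number; z: number; }

interface ControllerPose { position: Vec3; }

class Cursor3D {
  position: Vec3 = { x: 0, y: 0, z: 0 };
  // Amplify hand motion so small, comfortable movements cover the scene.
  private readonly gain = 2.5;

  update(pose: ControllerPose, origin: Vec3) {
    // Map the controller's offset from a resting origin onto the cursor,
    // preserving all three degrees of freedom (x, y, and z).
    this.position = {
      x: (pose.position.x - origin.x) * this.gain,
      y: (pose.position.y - origin.y) * this.gain,
      z: (pose.position.z - origin.z) * this.gain,
    };
  }
}
```

Amplifying small hand motions like this also helps with the arm fatigue problem discussed next.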

Objects that 'look clickable' in 3D space must appear to be within reach.

When designing for controllers like the Hydra, it is important to remember that waving one's arm is not a comfortable or natural mode of interaction for anything except perhaps racket sports and fending off angry bees. Avoid large gestures or control mechanisms that require a user to hold an arm out or wave it for extended periods.

In many cases, it helps to default the motion of the affordance (like the cursor) to an x/z plane when selections need to be made based on depth, and to an x/y plane when you want the user to interact with a flat surface (drawing on a canvas or choosing from a flat menu). In tests involving the Oculus Rift and the Meta augmented reality glasses, my team and I noticed that users first move the controller side to side, and only later attempt to move it forward and back or rotate it.
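A minimal sketch of that plane-defaulting, with invented names (PlaneLock, applyPlaneLock); the idea is simply to zero out the axis that doesn't belong to the current interaction:

```typescript
// Constrain cursor motion to the plane that suits the current interaction.
interface Vec3 { x: number; y: number; z: number; }

type PlaneLock = 'xz' | 'xy' | 'free';

function applyPlaneLock(delta: Vec3, lock: PlaneLock): Vec3 {
  switch (lock) {
    case 'xz': // depth-based selection: discard vertical motion
      return { x: delta.x, y: 0, z: delta.z };
    case 'xy': // flat surface (canvas, menu): discard depth
      return { x: delta.x, y: delta.y, z: 0 };
    case 'free': // full 3D motion once the user has found their bearings
      return delta;
  }
}
```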

Ensure that a user's first natural motion makes it obvious that they can interact in three dimensions.

How do you transfer the metaphor of clickable menu items and objects to a non-flat display environment? Unlike with 2D displays, in VR you can move your head, and objects in the environment sit at different depths. This means that reaching out to touch them (especially if you don't have good cursor affordance tracking) can be disorienting. Objects can occlude one another in 3D.

On the other hand, the fact that you can move your head and look around objects allows for a level of interaction that transcends point-and-click. For example, sustaining your gaze on an interactive object for a period of time (a 'dwell') can trigger an interaction. Sustained gaze combined with small gestures opens up a whole new set of interaction possibilities.
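Here is a minimal sketch of a gaze-dwell trigger, assuming a per-frame update loop and a hypothetical gaze raycast that reports which interactive object (if any) the user is currently looking at:

```typescript
// A minimal gaze-dwell sketch: if the user's gaze stays on the same
// object for `dwellMs`, fire its interaction. The Interactive interface
// and the per-frame caller are assumptions for illustration.

interface Interactive { id: string; activate(): void; }

class DwellSelector {
  private target: Interactive | null = null;
  private gazeStart = 0;

  constructor(private dwellMs = 800) {}

  // Call once per frame with the object under the gaze ray (or null).
  update(gazed: Interactive | null, nowMs: number) {
    if (gazed?.id !== this.target?.id) {
      // Gaze moved to a new object (or away): restart the dwell timer.
      this.target = gazed;
      this.gazeStart = nowMs;
      return;
    }
    if (this.target && nowMs - this.gazeStart >= this.dwellMs) {
      this.target.activate();
      this.target = null; // don't re-trigger until gaze leaves and returns
    }
  }
}
```

In practice a visible progress cue (a filling ring, a growing highlight) should accompany the timer, so the user understands that their attention is doing the work.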

The Pensato VR interface, which pairs the Razer Hydra and the Oculus Rift with Ableton Live, is a great example of this.

The user's attention is a control mechanism.

This is exciting because virtual and augmented reality can add relevant layers of information and data to objects the user is paying attention to.

When you look at a tree in the real world, you know it's a tree. Many facts about its unique qualities of tree-ness pop into your mind instantly (coniferous or deciduous, fruit bearing, maple, birch, pine). In VR, you can add additional layers of data, such as age and history, or even an option to zoom into the sub-atomic particles of the tree.
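A hedged sketch of that kind of progressive disclosure: the longer the user attends to an object, the deeper the layer of data revealed. The layer contents and dwell thresholds here are invented purely for illustration:

```typescript
// Progressive disclosure: reveal deeper data layers as attention persists.
interface DataLayer { afterMs: number; label: string; }

// Example layers for the tree; all values are illustrative.
const treeLayers: DataLayer[] = [
  { afterMs: 500,  label: 'Species: sugar maple' },
  { afterMs: 2000, label: 'Age: ~40 years' },
  { afterMs: 5000, label: 'Zoom available: cellular / sub-atomic view' },
];

// Given how long the user has been attending to the object, return
// the labels that should currently be visible.
function visibleLayers(attentionMs: number, layers: DataLayer[]): string[] {
  return layers.filter(l => attentionMs >= l.afterMs).map(l => l.label);
}
```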

Virtual objects can reference a unique user's past interactions and alter what comes next.

Consider how well positioned VR is to leverage inputs like voice control, eye tracking, emotive tracking, and brain-computer interfaces, none of which map well onto screen-based displays. Every object in an augmented or virtual environment can potentially respond to a user's actions, voice, motions, presence, and feelings.
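Architecturally, one way to sketch this is as a multimodal event dispatcher, where each scene object registers handlers only for the modalities it responds to. The modality names and types below are illustrative; real events would come from the relevant tracking SDKs:

```typescript
// Sketch of a multimodal dispatcher: scene objects subscribe to the
// input modalities they care about. Names are invented for illustration.

type Modality = 'gaze' | 'voice' | 'gesture' | 'presence' | 'affect';

interface ModalityEvent { modality: Modality; payload: unknown; }

class SceneObject {
  private handlers = new Map<Modality, (e: ModalityEvent) => void>();

  on(modality: Modality, handler: (e: ModalityEvent) => void) {
    this.handlers.set(modality, handler);
  }

  receive(event: ModalityEvent) {
    // Ignore modalities this object hasn't subscribed to.
    this.handlers.get(event.modality)?.(event);
  }
}

// Usage: a door that responds to a voice command or a sustained gaze.
const door = new SceneObject();
door.on('voice', e => console.log('heard:', e.payload));
door.on('gaze', () => console.log('door highlights under gaze'));
```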

This is very different from what currently exists in 2D environments, where users have been trained to look for clickable things.

The long-term solution to the current control metaphor problem is to base interactions on intuitive, real-life behavior. Reliable voice and gesture control, combined with hardware that creates haptic feedback, is not too far away. These solutions should be considered when designing for virtual environments.

I believe mimicking and then augmenting real-life actions and reactions should be the foundation for interaction design in VR if we want to create the best possible user experiences in these new worlds.