When the Nintendo Wii video game console was released in 2006, it ushered in a new paradigm for video game interfaces. To many, even the humble gamepad represents a complex control system abstracted from what’s happening on-screen. If you grew up with a gamepad in your hands this doesn’t seem to be an issue, but for the non-gaming majority it’s a control system with a significant learning curve.
The Wii changed this by introducing motion controls that removed some of that abstraction. When you swung the controller, the character on-screen would swing their bat or tennis racket the same way. The learning curve was diminished and the floodgates to mainstream or “casual” gaming were opened. Sales of the Wii were phenomenal, with over 100 million units shipped. The two “core” console makers, Microsoft and Sony, probably never expected the humble Wii console with its comparatively weak technical specifications to be so successful, but clearly they took note. Today both Xbox and PlayStation consoles from the last two generations have motion gaming options.
Sony chose to combine its camera system with handheld controllers much like those of the Wii, but Microsoft went a different route entirely. Its motion tracking system, the Kinect, uses no controllers at all. It consists of a sensor package that sits above or below the screen or television, watching the user and translating body motion and gestures into an input that a video game or other software package can understand.
There are several versions of the Kinect. The original version, designed for the Xbox 360, was codenamed Project Natal and was marketed quite heavily by Microsoft as a revolutionary product for its console. Reportedly the marketing budget for the device was $500 million, a massive push to capture the mainstream market that the Wii had won over so successfully.
The release of the Kinect in 2010 was followed by a Windows version in 2012. Microsoft’s latest console, the Xbox One, also has its own updated version of the Kinect.
What Does the Kinect Do?
The Kinect uses a combination of sensors and software to interpret motion, gestures, voice commands and facial features. It can therefore give games the ability to understand your posture, gestures and actions, and translate them into interactive input.
Of course, since the user is pantomiming without any physical feedback, the only way to show them what they are doing is through visual, on-screen cues. Different developers take different approaches to this, but usually it is done with iconography or via an avatar that mirrors the user’s movements.
So in essence, the Kinect creates a digitised version of the user that can directly interact with the virtual world on screen.
A demo of the interaction possibilities Microsoft envisioned, known as “Milo”, was presented to much fanfare around the release of the original device. The demo shows a young woman interacting naturally with a digital character named Milo. They converse, make eye contact, exchange objects and generally act as two people in the same space. Much of the demonstration was, however, pre-rendered and recorded. In the end a fully realized version of this software was never released, and it remains only an ambitious technology demonstration.
How Does the Kinect Work?
The Kinect is the result of years of research by Microsoft and its partners. Indeed, the newly announced Microsoft HoloLens incorporates technology that was first introduced in the form of the Kinect.
The Kinect contains an infrared (IR) depth-sensing camera, capable of detecting movement towards and away from the device. So if a user reaches towards something on the screen, the Kinect will know. It also has an IR emitter, which provides the infrared light that the depth-sensing camera needs in order to see.
The Kinect has yet another camera, known as the colour sensor, which captures colour and visual features for object recognition.
Finally, there is an array of microphones that allows for positional audio.
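To make the idea of detecting a reach from depth data concrete, here is a minimal sketch. It assumes we already have the distance from the sensor to the user's hand and torso; the `DepthSample` type, the `is_reaching` function and the 0.35 m threshold are all hypothetical names and values chosen for illustration, not part of any Kinect software.

```python
from dataclasses import dataclass


@dataclass
class DepthSample:
    """Distances from the sensor to two body parts, in metres (hypothetical type)."""
    hand_m: float
    torso_m: float


def is_reaching(sample: DepthSample, threshold_m: float = 0.35) -> bool:
    """Return True when the hand is at least threshold_m closer to the sensor than the torso."""
    return (sample.torso_m - sample.hand_m) >= threshold_m


resting = DepthSample(hand_m=2.0, torso_m=2.1)   # hand roughly level with the body
reaching = DepthSample(hand_m=1.5, torso_m=2.1)  # hand stretched towards the screen

print(is_reaching(resting))   # False
print(is_reaching(reaching))  # True
```

A real system would presumably smooth these readings over several frames, but the core comparison is simply a difference in depth.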
The sensor bar also sits atop a motorised stand, allowing the device to automatically correct for movements that take the user out of its field of view.
These sensors generate several streams of data for use by software applications such as video games: for example, a stream for skeletal data, a stream for colour data and a stream for depth data.
The latest version of the Kinect (known as the V2), meant for use with the latest Xbox console, the Xbox One, has several improvements over the original Xbox 360 version.
The Kinect V2 has much improved resolutions, a new advanced time-of-flight camera, a wider field of view, the ability to track six player skeletons at once and the ability to directly measure the user’s heart rate. It also requires less physical space for the user than the original.
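One way to picture these streams is as a single frame object that bundles a sample from each. The sketch below is purely illustrative: `SensorFrame`, its field layout and the convention that a depth of zero means "no reading" are assumptions made for this example, not the format used by the Kinect or its SDK.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) position in metres
Pixel = Tuple[int, int, int]        # (red, green, blue)


@dataclass
class SensorFrame:
    """One sample from each of the three streams (hypothetical layout)."""
    colour: List[List[Pixel]]    # rows of colour pixels
    depth: List[List[int]]       # rows of distances in millimetres
    skeleton: Dict[str, Joint]   # joint name -> position


def nearest_depth_mm(frame: SensorFrame) -> Optional[int]:
    """Return the smallest non-zero depth reading, or None if there is none.

    Zero is treated as 'no reading' here; that convention is an assumption.
    """
    readings = [d for row in frame.depth for d in row if d > 0]
    return min(readings) if readings else None


frame = SensorFrame(
    colour=[[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]],
    depth=[[0, 1800], [2100, 1950]],
    skeleton={"head": (0.0, 0.6, 2.0), "hand_right": (0.3, 0.1, 1.8)},
)

print(nearest_depth_mm(frame))       # 1800
print(sorted(frame.skeleton.keys())) # ['hand_right', 'head']
```

A game would typically read only the streams it needs, for instance the skeleton alone to drive an avatar.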
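The basic principle behind a time-of-flight camera is that light travels at a known speed, so the time a pulse takes to reach an object and return gives the distance. The sketch below shows only that textbook relationship (distance = speed of light × round-trip time ÷ 2); the function names are made up for this example, and it does not attempt to describe how the Kinect V2 actually measures that time.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(round_trip_s: float) -> float:
    """Distance to an object given the time light takes to go there and back."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def round_trip_for_distance(distance_m: float) -> float:
    """Inverse of the above: how long light needs for the out-and-back trip."""
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S


# A user standing 2 m from the sensor:
t = round_trip_for_distance(2.0)
print(f"{t * 1e9:.2f} ns")  # 13.34 ns
print(round(distance_from_round_trip(t), 6))  # 2.0
```

The tiny time involved, a few nanoseconds at living-room distances, gives a sense of why this kind of sensor is technically demanding.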
The Significance of the Kinect for Virtual Reality
The capabilities of the Kinect have obvious applications for augmented reality, as can be seen from the integration of its technology in the HoloLens augmented reality head-mounted display (HMD). At the same time, it is positioned to be an important technology for virtual reality as well. Even when the user is wearing a virtual reality HMD, the Kinect can still see them and transport a digitised representation of their movements into the virtual space. Wearing an HMD, you might look down and see your own body, or raise your arm and, thanks to the Kinect, see your virtual arm do the same. Using the Kinect in combination with an HMD removes the awkward barrier represented by the screen and allows full body motion tracking without the need for cumbersome equipment worn on the body. Simply stand where the Kinect can see you and it can track your movements.
The convenience of this approach has not escaped Microsoft’s competitors. Sony’s PlayStation VR relies on external tracking by the PlayStation Camera, and Valve Corporation’s SteamVR system also makes use of external IR scanners to perform motion tracking. Both implementations are, however, different enough from the Kinect to leave it a unique piece of technology in its own right.
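As a rough illustration of transporting tracked movements into a virtual space, the sketch below maps joint positions measured relative to the sensor onto an avatar placed somewhere in a virtual world. Everything here is an assumption for the sake of the example: the `to_virtual_space` function, the joint names and the simple offset-and-scale mapping are hypothetical, and a real application would likely also need to account for sensor orientation.

```python
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


def to_virtual_space(
    joints: Dict[str, Vec3], origin: Vec3, scale: float = 1.0
) -> Dict[str, Vec3]:
    """Place sensor-relative joint positions into the virtual world.

    Each joint is scaled and then offset by the avatar's origin in the world.
    """
    ox, oy, oz = origin
    return {
        name: (ox + x * scale, oy + y * scale, oz + z * scale)
        for name, (x, y, z) in joints.items()
    }


# Joint position as seen by the sensor, in metres (made-up values).
tracked = {"hand_right": (0.25, 0.5, 2.0)}

# The avatar stands at (10, 0, -5) in the virtual world.
avatar = to_virtual_space(tracked, origin=(10.0, 0.0, -5.0))
print(avatar["hand_right"])  # (10.25, 0.5, -3.0)
```

Running this mapping on every frame is what would make the virtual arm follow the real one.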