Soar Volumetric Explained

By: Conor Stokes and Justin Baker

What do we do at Soar?

The essence of Soar is end-to-end volumetric video solutions, including live streaming… but what does that mean in detail?

For many people, volumetric video is a new form of media, and the concepts and production pipeline involved will seem unfamiliar. The purpose of this article is to explain those concepts, with some specifics about how we at Soar make that happen for you.

What is Volumetric Video?

Modern 3D video games and computer-generated animations start with artists creating a 3D geometric representation of a scene, such as a textured triangle mesh model, and performing projective rendering, such as ray tracing or rasterization, to show that as a 2D image. Volumetric video flips that around, taking 2D video streams from multiple cameras and turning them into an animated 3D geometric representation.
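To make the "flipped around" direction concrete, here is a minimal sketch of lifting a 2D pixel with a depth measurement back into 3D, inverting the pinhole projection that rasterizers and ray tracers normally run forwards. The camera intrinsics here are hypothetical values, purely for illustration.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift a 2D pixel (u, v) with a depth measurement back to a 3D point,
    inverting the pinhole projection."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

def project(point, fx, fy, cx, cy):
    """Forward pinhole projection (3D point -> 2D pixel): the direction
    game engines and CGI renderers normally go."""
    x, y, z = point
    return np.array([fx * x / z + cx, fy * y / z + cy])

# Round trip: a pixel with known depth maps to 3D and back to the same pixel.
fx = fy = 500.0
cx, cy = 320.0, 240.0          # hypothetical camera intrinsics
p3d = backproject(100.0, 80.0, 2.0, fx, fy, cx, cy)
p2d = project(p3d, fx, fy, cx, cy)
```

Volumetric capture does this for every pixel of every camera, then fuses the resulting point sets into one coherent 3D scene.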

Think of it as full color 3D scanning, but as an animation sequence that can be played back in real-time like a film clip.

Unlike traditional stereo 3D video, there is a full 3D representation of the scene, so this can be used with those same game engines and CGI renderers that we mentioned earlier to re-render the scene from any view. It can also be rendered from multiple views at once, for use with a stereo display, such as a VR or mixed reality headset, or a light-field display that allows glasses-free 3D.

Soar capture running on a Looking Glass

A piece of the real world that can be taken and seen somewhere else, virtual or real: the closest thing we have today to teleportation.

Soar capture in AR

It also opens up the captured scene to any effects that are possible in those systems, giving creators and viewers a level of power to manipulate content that regular 2D video simply does not. Realistic 3D particle systems, physical interactions, and lighting effects can be added that are very hard to add to traditional 2D video.

You can be the controller, and see yourself doing it; because these aren’t holograms, you can actually play with them.

Soar's Solution

Our solution starts with inputs from one or more time-synchronized camera feeds providing color and depth, for example in real time from a color + depth camera such as an Azure Kinect. To run it, you’ll need at least a modern gaming PC with a mid-to-high-end video card, as well as enough I/O to feed in all of your cameras (up to 10). This can be set up in any light-controlled environment where the cameras are stable - no green screen required.
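Time synchronization matters because every camera must contribute frames captured at (very nearly) the same instant. A simple way to think about it is grouping frames by nearest timestamp; the sketch below is illustrative, with hypothetical function names and a made-up tolerance, not Soar's actual API.

```python
def group_synchronized(feeds, tolerance_us=2000):
    """Greedily match each frame of the reference camera (feeds[0]) with the
    nearest-in-time frame from every other camera, keeping only sets where
    all matches fall within the tolerance. Timestamps are in microseconds."""
    groups = []
    for t_ref in feeds[0]:
        matched = [t_ref]
        ok = True
        for other in feeds[1:]:
            nearest = min(other, key=lambda t: abs(t - t_ref))
            if abs(nearest - t_ref) > tolerance_us:
                ok = False
                break
            matched.append(nearest)
        if ok:
            groups.append(matched)
    return groups

# Three cameras at ~30 fps (one frame every ~33,333 us) with small jitter.
cam0 = [0, 33333, 66666]
cam1 = [500, 33900, 66100]
cam2 = [100, 32900, 67000]
sets = group_synchronized([cam0, cam1, cam2])
```

In practice, hardware sync signals (which the Azure Kinect supports) keep the jitter far tighter than software matching alone.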


Color, Depth, Infrared Feeds

Calibration is done with a single box-shaped marker, which is used to figure out where all the cameras are in the 3D scene, as well as to set the frame of reference. This process takes about a minute and only needs to be repeated when the cameras have been moved, or when you want a new frame of reference. Once this has been done, you can also calibrate to remove static background objects, an optional process which takes only a few seconds.
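At the heart of marker-based calibration like this is recovering a rigid transform (rotation + translation) between the marker's known geometry and where each camera observes it. A standard way to solve that is the Kabsch algorithm; the sketch below applies it to the corners of a unit cube, as a stand-in for the real marker, which Soar's actual calibration may handle differently.

```python
import numpy as np

def rigid_transform(src, dst):
    """Kabsch algorithm: least-squares rotation R and translation t
    such that R @ src_i + t ~= dst_i (both inputs Nx3)."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Known corner positions of a unit cube in the marker's frame (metres).
cube = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                 [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]], float)

# Simulate a camera that sees the cube rotated 90 degrees about Z and shifted.
theta = np.pi / 2
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 2.0])
observed = cube @ R_true.T + t_true

R_est, t_est = rigid_transform(cube, observed)
```

Once each camera's pose relative to the marker is known, all the feeds can be expressed in one shared frame of reference.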


Calibration Cube

Now you’re ready to start making volumetric video! There’s a record button, a stop button, and the ability to instantly play back your videos, just like a phone camera. Our solution offers full end-to-end real-time, so you’ll be able to see a 3D preview of what is in the scene straight away.

To do this, we take the potentially noisy and incomplete depth images, add in some estimates of the missing information and perform a water-tight surface reconstruction. This means we find the most probable location of the surface of the objects in the capture area, given the inputs. We also do this in such a way that every surface is closed to form a volume. This means our system is not limited to capturing only people. It can capture furniture, props, walls, ceilings, and floors.
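The idea of fusing noisy per-camera estimates into one closed surface can be sketched with a truncated signed distance field (TSDF), a common representation for this kind of reconstruction. The example below simulates each camera's contribution by adding independent noise to the true distance to a sphere; a real pipeline would derive these values from depth images, and Soar's reconstruction is not necessarily TSDF-based.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(-2, 2, 41)
X, Y, Z = np.meshgrid(grid, grid, grid, indexing="ij")
true_sdf = np.sqrt(X**2 + Y**2 + Z**2) - 1.0     # sphere of radius 1

# Each "camera" contributes a noisy, truncated signed-distance estimate.
n_cams, trunc = 4, 0.5
views = [np.clip(true_sdf + rng.normal(0, 0.05, true_sdf.shape), -trunc, trunc)
         for _ in range(n_cams)]

# Fusion = averaging the truncated estimates, giving each voxel's most
# probable distance to the surface given all inputs. The zero level set is
# the surface, and because the field is defined everywhere, it is closed.
fused = np.mean(views, axis=0)

inside = fused[20, 20, 20]     # grid centre: inside the sphere
outside = fused[0, 0, 0]       # far corner: outside the sphere
```

Averaging drives the noise down as cameras are added, which is part of why multi-camera rigs reconstruct more cleanly than a single sensor.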


Soar capture of a person holding baseball

This surface reconstruction is available in the form of a triangle mesh, which is the most common representation used for 3D geometry today, with near-ubiquitous support in content creation pipelines and rendering systems from VFX to game engines. It’s also quick to reconstruct geometry in other common and useful representations, such as signed distance fields or voxels.


Soar capture in Wireframe mode

Color is added to the surfaces by mixing the input from all of the cameras in a way that best represents what the viewer might see from their point of view during rendering. This leads to higher quality textures and view dependent effects, making our captures appear more realistic.
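The "best represents what the viewer might see" idea can be illustrated with view-dependent blending: each camera's color sample is weighted by how closely its viewing direction agrees with the viewer's, so cameras that saw the surface the way the viewer would dominate the mix. The weighting function below (a sharpened cosine) is one common illustrative choice, not necessarily the one Soar uses.

```python
import numpy as np

def blend_colors(cam_dirs, cam_colors, view_dir):
    """cam_dirs: Nx3 unit vectors from the surface point toward each camera.
    view_dir: unit vector from the surface point toward the viewer.
    Returns the per-camera weights and the blended RGB color."""
    cam_dirs = np.asarray(cam_dirs, float)
    cos = cam_dirs @ np.asarray(view_dir, float)
    w = np.clip(cos, 0.0, None) ** 4      # sharpen toward aligned cameras
    w /= w.sum()
    return w, w @ np.asarray(cam_colors, float)

# Two cameras: one aligned with the viewer, one 60 degrees off-axis.
dirs = [[0.0, 0.0, 1.0],
        [np.sin(np.pi / 3), 0.0, np.cos(np.pi / 3)]]
colors = [[1.0, 0.0, 0.0],                # aligned camera sees red
          [0.0, 0.0, 1.0]]                # off-axis camera sees blue
weights, color = blend_colors(dirs, colors, [0.0, 0.0, 1.0])
```

Because the weights shift as the viewer moves, effects like specular highlights that only certain cameras captured reappear from the right angles.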


Soar capture in a T-Pose

Geometry can be captured, surface reconstructed and then compressed, with textures and audio, in real-time. It can be streamed up to the cloud and broadcast like a traditional live-stream, or saved locally. Our in-house compression format for geometry can reproduce high quality meshes, with no loss of topology, even at under 5 bits a triangle including all connectivity and vertex information. This means perceptually lossless streams at 30 frames per second under 20Mbps.
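A quick back-of-the-envelope check puts those figures in perspective: at under 5 bits per triangle, how many triangles per frame fit inside a 20 Mbps stream at 30 frames per second? (The arithmetic below is illustrative, not a Soar mesh-size specification.)

```python
bits_per_triangle = 5
fps = 30
budget_bps = 20_000_000            # 20 Mbps geometry budget

bits_per_frame = budget_bps / fps                  # ~666,667 bits per frame
max_triangles = bits_per_frame / bits_per_triangle # triangles that fit
```

That works out to roughly 133,000 triangles per frame of geometry budget, which is comfortably in the range of a detailed character mesh.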

Soar capture of Josh Allen

Raw captures of camera feeds can also be saved, with all of the information required to re-process captures with different settings and time-cuts. This lets you get the absolute best results out of data you don’t need to stream straight away.

You can then export sequences of traditional meshes and textures from our system in standard formats, such as OBJ and GLB, for integration into other pipelines: VFX in traditional video, offline rendering, or static scans for content creation or 3D printing (with some clean-up). Exports can also be used with volumetric video editing software such as Arcturus HoloEdit.
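To show how little magic there is in the exported geometry, here is a minimal writer for one frame's triangle mesh in Wavefront OBJ form (vertex and face records only). A real export also carries texture coordinates and material references; this sketch, with an illustrative file name, is just the geometric skeleton.

```python
def write_obj(path, vertices, faces):
    """Write a triangle mesh as a minimal Wavefront OBJ file."""
    with open(path, "w") as f:
        for x, y, z in vertices:
            f.write(f"v {x} {y} {z}\n")
        for a, b, c in faces:              # OBJ face indices are 1-based
            f.write(f"f {a + 1} {b + 1} {c + 1}\n")

# A single triangle as the smallest possible "capture" frame.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
tris = [(0, 1, 2)]
write_obj("frame_0001.obj", verts, tris)
obj_text = open("frame_0001.obj").read()
```

A volumetric sequence exported this way is simply one such mesh (plus textures) per frame, which is why it drops so easily into existing DCC and VFX tools.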


Soar capture in MeshLab

Playback of both pre-recorded captures and streams is supported inside of 3D engines like Unity (Unreal coming soon, as well as web and a native library). Because content has a full triangle mesh representation, a wide range of the standard features of Unity are available, and your content can be viewed in a scene with other geometry straight away.

Soar capture in a Boxing Ring

We also work well with AR…

Soar capture real-time streamed to a phone in AR

and particle systems VFX.

Soar capture with VFX

How is it used Today?

Customers are using Soar’s volumetric video for everything from up-close-and-personal interactive content and art pieces, to real-time simulated training, to premiere TV production VFX, and nearly everything in between. Our customers take advantage of the flexibility of volumetric media and our multiple available generalized workflows that can easily be adapted to suit their needs.

Soar capture in a Virtual Environment

We believe that as an emerging form of media, it should be easy and accessible to use, with straightforward and understandable workflows, no matter what industry you work in, your skill level, or your use case. But the big secret is how fun it is! Try it for yourself and find out!