Multi-User Holoconferencing:
What is holoportation?
Holoportation allows real-time, 3D visual interactions by capturing, compressing, and transmitting fully volumetric representations of humans into XR/VR/AR environments. Think of it as teleporting someone into a virtual space, where you can interact with their realistic 3D representation as if they were physically present. Unlike standard 2D video calls, holoportation creates a dynamic, three-dimensional space that mimics real-world interaction.
This pillar centers on the research, design, and development of a real-time holoportation system for XR applications, with the goal of achieving photorealistic, fully volumetric 3D reconstruction of humans in real time. Key objectives include advancing light-field-based volumetric capture, optimizing compression and transmission methods, and enabling scalable multi-user holoconferencing. The following accomplishments represent significant progress towards WP2’s goals and establish a strong foundation for further development and integration of the holoportation system:
- Finalized the initial system architecture, core API definitions, and foundational data structures for volumetric video processing.
- Refined the deployment architecture and mitigated risks through effective contingency planning.
- Established centralized repositories to support collaborative development.
- Conducted internal SDK and API workshops.
- Performed initial internal tests, successfully validating system functionality.
- Completed the first extrinsic calibration steps towards automatic calibration.
- Implemented basic automatic cloud deployment services.
- Prepared and validated pre-recorded datasets for upcoming testing phases.
Holoportation is emerging as a transformational technology in the quest for fully immersive experiences. By allowing distant individuals to engage in real-time, life-like visual interactions, it adds a new dimension to the way we connect, socialize, and cooperate over vast distances. The key components of this technology are advanced compression methods, live volumetric capturing, and the capacity to provide smooth experiences even under diverse network and computational conditions.
The Technology Behind the Magic:
Realistic Volumetric Capturing: To achieve high-end holoportation, the process begins with volumetric capturing, which uses multiple cameras to record a person from every angle. Advanced light-field technology captures not just the visual data, but also the depth, allowing the system to create an accurate 3D representation. This technology provides a photorealistic, full-volume reconstruction that captures even subtle details, such as facial expressions and body movements, creating the illusion of physical presence.
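As an illustration of the capture stage, the sketch below back-projects a single depth map into a camera-space point cloud using pinhole intrinsics; the intrinsic values and depth resolution are placeholders rather than the project's actual calibration data.

```python
import numpy as np

def depth_to_point_cloud(depth_m: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Back-project a depth map (in metres) into camera-space 3D points.

    depth_m : (H, W) array of metric depth values, 0 where depth is invalid.
    K       : 3x3 intrinsic matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    Returns an (N, 3) array of XYZ points for all valid pixels.
    """
    h, w = depth_m.shape
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    # Pixel grid: u runs along image columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    valid = z > 0

    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=-1)

# Example with placeholder intrinsics for a 640x576 depth sensor (hypothetical values).
K = np.array([[504.0, 0.0, 320.0],
              [0.0, 504.0, 288.0],
              [0.0, 0.0, 1.0]])
depth = np.full((576, 640), 1.5)           # synthetic flat depth at 1.5 m
points = depth_to_point_cloud(depth, K)    # (N, 3) camera-space points
```

In a multi-camera rig, each camera's points would then be transformed into a shared world frame using the extrinsic calibration discussed under Challenges.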
Compression and Optimization for Real-Time Transmission: Given the massive amount of data generated by volumetric capturing, compression becomes essential. The system applies advanced compression techniques to reduce the size of the 3D data while maintaining high visual quality. This enables real-time streaming of volumetric video over existing network infrastructures, from high-speed 5G to standard broadband. But the real innovation lies in the ability to deliver these experiences under heterogeneous computation and network conditions. Whether users are on high-end setups or limited hardware, the system dynamically optimizes the transmission, balancing quality and performance to ensure smooth holoportation sessions regardless of the user’s device or connection speed.
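As a rough illustration of how quality can be traded against bandwidth, the sketch below quantizes a point cloud onto a voxel grid whose resolution is picked from the estimated throughput; the tiers, thresholds, and voxel sizes are invented for this example and do not describe the project's actual codec.

```python
import numpy as np

# Hypothetical quality tiers: minimum estimated bandwidth (Mbps) -> voxel size (metres).
# Coarser voxels mean fewer points and a smaller payload at lower visual quality.
QUALITY_TIERS = [(50.0, 0.004), (20.0, 0.008), (5.0, 0.016), (0.0, 0.032)]

def pick_voxel_size(bandwidth_mbps: float) -> float:
    """Choose the finest voxel size whose tier the measured bandwidth can sustain."""
    for min_bw, voxel in QUALITY_TIERS:
        if bandwidth_mbps >= min_bw:
            return voxel
    return QUALITY_TIERS[-1][1]

def voxel_downsample(points: np.ndarray, voxel: float) -> np.ndarray:
    """Keep one representative point per occupied voxel (simple grid quantization)."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(idx)]

# Usage: adapt the transmitted cloud to the current network estimate.
cloud = np.random.rand(200_000, 3)            # stand-in for one captured frame
voxel = pick_voxel_size(bandwidth_mbps=12.0)  # e.g. a constrained mobile link
reduced = voxel_downsample(cloud, voxel)      # fewer points -> smaller frame to encode
```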
Scalable Multi-User Experiences: The future of holoportation isn’t just about one-on-one interactions. With multi-user holoconferencing, entire groups of people (for instance, up to 8 users) can engage in shared virtual environments, with each person’s 3D representation interacting seamlessly. This requires robust solutions to handle the scalability of volumetric video, using edge computing and cloud technologies to process and stream multiple users’ data in real time.
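To make the multi-user requirement concrete, here is a minimal, hypothetical session model: a session caps participation at eight users and wires each new joiner into a full mesh of stream subscriptions. The class and field names are illustrative, not part of the actual system.

```python
from dataclasses import dataclass, field

MAX_USERS = 8  # example cap from the text; real limits depend on the deployment

@dataclass
class Participant:
    user_id: str
    uplink_mbps: float                      # estimated uplink for the user's own stream
    subscriptions: set[str] = field(default_factory=set)

@dataclass
class HoloSession:
    session_id: str
    participants: dict[str, Participant] = field(default_factory=dict)

    def join(self, user: Participant) -> None:
        if len(self.participants) >= MAX_USERS:
            raise RuntimeError("session full")
        # Full mesh of subscriptions: everyone receives everyone else's volumetric stream.
        for other in self.participants.values():
            other.subscriptions.add(user.user_id)
            user.subscriptions.add(other.user_id)
        self.participants[user.user_id] = user

    def leave(self, user_id: str) -> None:
        self.participants.pop(user_id, None)
        for other in self.participants.values():
            other.subscriptions.discard(user_id)
```

In practice the fan-out would typically be handled by an edge or cloud relay rather than a client-side mesh, so each user uploads their volumetric stream only once.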
Challenges:
1. Real-Time Photorealistic Volumetric Reconstruction: Current performance reaches only 15 frames per second, well short of the 30-frames-per-second target for real-time, photorealistic, full-body 3D reconstruction. Ongoing work focuses on increasing processing speed, reducing latency, and improving quality.
2. Hardware Limitations for High-Resolution Capture: Achieving full-body, high-resolution volumetric reconstruction across multiple devices (e.g., Android) poses significant challenges with current techniques.
3. Complex Multi-User Scalability: Supporting real-time multi-user holoportation requires overcoming considerable obstacles relating to latency, bandwidth, and computing capacity. The task at hand is to implement scalable solutions that preserve quality while adapting to the number of users and sessions.
4. Compression: The volumetric data needs to be efficiently compressed in real time. The project is working on algorithms that handle high volumes of data while staying within bandwidth limits and maintaining quality.
5. Volumetric Video Streaming Efficiency: The final goal is to maximize performance and user perception while also guaranteeing interoperability and lowering overall resource usage. To reach this goal, we need to integrate innovative enablers and strategies for adaptive volumetric video streaming, including client- and server-based rate adaptation, edge-assisted architectures, QoS-aware pipelines, and advanced streaming techniques leveraging users’ viewing patterns and activity (e.g. viewports, positions…); a rate-adaptation sketch follows this list.
6. Achieving Accurate Calibration for Multi-View Systems: To achieve accurate point cloud reconstruction in a multi-camera setup, it’s essential to calibrate each camera’s internal parameters (intrinsics) and their relative positions and orientations (extrinsics). This ensures that the volumetric data remains consistent throughout the scene. The team is actively working on improving calibration techniques to address specific challenges, including lens distortions from wide-angle cameras and issues arising in low-contrast areas. The approach blends traditional optimization methods with machine learning algorithms for enhanced precision and resilience (a calibration sketch follows this list).
7. SDK Elaboration and Development: The Capturer SDK must be integrated seamlessly and its deployment optimized across various use cases. This requires refining core APIs, enhancing data structures, and conducting comprehensive workshops to equip team members for successful functionality testing and validation. Achieving this integration is crucial for delivering real-time, photorealistic, and scalable multi-user holoconferencing experiences.
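The rate-adaptation sketch referenced in challenge 5: one possible client-side rule assigns each remote participant a quality level from a bandwidth budget, prioritized by how close that participant sits to the viewer's viewport centre. The quality levels, bitrates, and weighting are assumptions made for illustration only.

```python
import math

# Hypothetical quality levels and their approximate bitrates in Mbps.
LEVELS = {"low": 3.0, "medium": 8.0, "high": 20.0}

def viewport_weight(view_dir, user_dir) -> float:
    """Weight in [0, 1]: 1 when the remote user is at the viewport centre, 0 behind the viewer."""
    dot = sum(a * b for a, b in zip(view_dir, user_dir))
    norm = math.sqrt(sum(a * a for a in view_dir)) * math.sqrt(sum(a * a for a in user_dir))
    return max(0.0, dot / norm) if norm else 0.0

def adapt(users, view_dir, budget_mbps: float) -> dict:
    """Greedily assign quality levels, prioritising the users the viewer is looking at.

    users: mapping user_id -> direction vector from the viewer to that user.
    """
    ranked = sorted(users, key=lambda u: viewport_weight(view_dir, users[u]), reverse=True)
    plan, remaining = {}, budget_mbps
    for user in ranked:
        for level in ("high", "medium", "low"):
            if LEVELS[level] <= remaining:
                plan[user] = level
                remaining -= LEVELS[level]
                break
        else:
            plan[user] = "low"  # over budget: request the minimum and let congestion control cope
    return plan

# Example: three remote users, a 25 Mbps budget, viewer looking roughly towards user_a.
plan = adapt({"user_a": (0, 0, 1), "user_b": (1, 0, 0.2), "user_c": (-1, 0, -0.5)},
             view_dir=(0, 0, 1), budget_mbps=25.0)
```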
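The calibration sketch referenced in challenge 6: given a camera's intrinsics and distortion coefficients, OpenCV's solvePnP can recover that camera's extrinsics relative to a checkerboard whose pose defines the shared world frame. The board geometry here is a placeholder, and this classical step stands in for only part of the pipeline, which also involves learning-based refinement.

```python
import cv2
import numpy as np

# Hypothetical checkerboard: 9x6 inner corners, 40 mm squares, defining the world frame.
PATTERN = (9, 6)
SQUARE_M = 0.04

# 3D coordinates of the corners on the board plane (Z = 0), in board/world coordinates.
obj_pts = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
obj_pts[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_M

def extrinsics_from_board(image_bgr, K, dist):
    """Return (R, t) mapping board/world coordinates into this camera's frame, or None."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if not found:
        return None
    corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1),
                               (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    ok, rvec, tvec = cv2.solvePnP(obj_pts, corners, K, dist)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return R, tvec

# With (R_i, t_i) per camera, a world point X maps to camera i as X_i = R_i @ X + t_i,
# which is what allows all depth streams to be fused into one consistent point cloud.
```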
Key Features:
Photorealistic 3D Representation: Real-time volumetric reconstruction of individuals with detailed capture of facial expressions, gestures, and body movements. Creates the illusion of physical presence in virtual spaces.
Real-Time Interaction: Enables live, immersive communication in virtual or augmented reality environments, mimicking natural interaction.
Multi-User Scalability: Supports shared virtual environments for multiple users (e.g., up to 8 participants). Allows real-time interaction among all participants within the same virtual space.
Advanced Compression Techniques: Reduces the size of volumetric data to enable seamless streaming over diverse network conditions, from high-speed 5G to standard broadband.
Dynamic Optimization: Balances quality and performance based on available computational power and network bandwidth, ensuring smooth experiences on a range of devices.
Cloud and Edge Computing Integration: Employs distributed computing to handle the processing load and enable scalability for larger groups or complex sessions.
Immersive Experiences Across Devices: Ensures compatibility with various XR/VR/AR setups, including high-end systems and lower-powered devices.
Automatic Calibration and Adaptation: Uses machine learning-enhanced methods to improve multi-camera calibration, ensuring accurate point cloud reconstruction.
QoS-Aware Streaming: Implements Quality of Service (QoS) pipelines and viewport-based streaming to adapt to user activity and maintain efficient performance.
Main Components:
Volumetric Capture System: Multi-camera setups for capturing users from all angles. Light-field technology records depth and texture for 3D reconstruction.
Compression and Transmission Modules: Advanced algorithms for real-time compression of 3D data. Optimized for both high-quality visuals and low-bandwidth transmission.
Cloud and Edge Computing Infrastructure: Facilitates scalable processing and distribution of volumetric video streams. Enables real-time computation across distributed systems.
Real-Time Rendering Engine: Converts captured volumetric data into realistic 3D visuals for display in XR/VR/AR environments. Ensures smooth rendering under diverse network and hardware constraints.
Multi-User Management System: Handles synchronization, interaction, and data integration for multiple participants. Provides mechanisms to minimize latency and maintain consistency in shared environments.
Calibration System: Integrates extrinsic and intrinsic calibration tools for multi-camera systems. Corrects for lens distortion and aligns depth data for accurate 3D reconstruction.
Software Development Kit (SDK): Core APIs and data structures for developers to integrate holoportation into various applications. Includes deployment tools and documentation for smooth implementation (a hypothetical integration sketch follows this list).
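To give a feel for how these components fit together from a developer's point of view, here is a purely hypothetical usage sketch; the module, class, and method names (holosdk, Capturer, Session, Renderer) are invented for illustration and do not reflect the actual Capturer SDK API.

```python
# Hypothetical integration flow -- names and signatures are illustrative only.
import holosdk  # placeholder module name, not the real SDK

def run_holoconference(server_url: str, session_id: str) -> None:
    # 1. Start local volumetric capture (multi-camera rig with calibrated extrinsics).
    capturer = holosdk.Capturer(calibration="rig_calibration.json")

    # 2. Join a shared session; the service negotiates compression level and transport.
    session = holosdk.Session.connect(server_url, session_id=session_id, max_users=8)

    # 3. Publish the local volumetric stream and render every remote participant.
    session.publish(capturer.stream())
    renderer = holosdk.Renderer(target="openxr")  # e.g. an XR headset backend
    try:
        for frame_set in session.remote_frames():  # one decoded frame per remote user
            renderer.draw(frame_set)
    finally:
        session.close()
        capturer.stop()
```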
Impact:
The impact of holoportation goes beyond casual conversations. In sectors like healthcare, education, remote work, and entertainment, holoportation is set to redefine how we interact. Imagine virtual classrooms where students and teachers can engage as if they were physically in the same space, or remote work meetings where team members feel truly present, no matter their location. Thanks to its ability to adapt to various network conditions and computational environments, holoportation opens up new possibilities for global collaboration, bringing high-end, realistic interactions to users across all types of devices.

Holoportation represents a bold step toward the future of communication, allowing for photorealistic, real-time interactions in virtual spaces. By combining the power of volumetric capturing, cutting-edge compression, and scalable network optimization, this technology is making remote communication more immersive, interactive, and real than ever before. Whether for personal use or professional collaboration, holoportation is set to bridge the gap between the physical and digital worlds, transforming how we connect with others across the globe.