XVI Robotics
CONFIDENTIAL · SERIES ANGEL++ 2026
Agent-Native · Humanoid Foundation Model

Agent-Native
Universal Humanoid Foundation Model

Agent · VLM · WBC — a unified three-layer architecture,
the only path bridging digital intelligence and physical intelligence.

FOUNDED
2025.12
TEAM
10 + N Agents
RAISE
$15–20M
STAGE
Angel++
THE CORE BOTTLENECK

Embodied AI's core bottleneck: no general-purpose brain

Hardware is mature enough, but the industry has no universal brain — the ecosystem can't take off.
Why? Two hidden structural fractures.

02 · PROBLEM
ISSUE 01 · DATA
01

Data has never really scaled

LLMs unlocked scaling laws with internet-scale text;
embodied data has never reached that scale:
teleoperation is expensive and low-coverage, and there is no scalable data engine.

COST/UNIT
HIGH
COVERAGE
LOW
SCALING LAW
N/A
ISSUE 02 · TWO WORLDS
02

Digital and physical R&D are siloed

Agents and LLMs evolve at warp speed in the digital world;
robots evolve in isolation in the physical world.
But Physical AGI = Digital AI + Physical AI.

DIGITAL
LLM · Agent
PHYSICAL
Robot · WBC
What's missing: one unified brain that bridges the digital and physical worlds.
CORE TEAM · AI-NATIVE COMPANY

Core Team · An Agent-Driven AI-Native Company

Full-stack coverage: LLMs, agents, motion control, robot hardware — 10 humans + N AI agents, per-capita output far above traditional teams.

03 · TEAM
Flood Sung
FOUNDER & CEO

宋鸿涌
Former Head of Post-Training / RL at Moonshot AI; deeply involved in the K-series LLMs
Hands-on with RLHF, long-chain reasoning, and agentic-task training; holds a Transformer-as-brain conviction
Creator of MetaBot, an already battle-tested Agent-Native organization paradigm
Full-stack experience across LLM · Agent · motion control · robot hardware
CORE TEAM · 4 LEADS
04
VP
YH · VP of Technology — former Head of Long-Context Post-Training at Moonshot AI; ByteDance Seed researcher
WBC
FHQ · Head of Humanoid Locomotion — Nanjing University PhD, first-author Nature Communications
NAV
WZC · Humanoid Navigation — Shanghai AI Lab postdoc, core author of InternVLA-N1
MANI
ZZA · Loco-Manipulation — Tsinghua MS, Renforce-Dynamics community lead
ORG MODEL · AGENT-NATIVE
10 + N
HUMAN + AGENTS
Agent-Native organization powered by MetaBot — per-capita output equivalent to a ≈ 50-person team.
METABOT · OPEN-SOURCE AI AGENT INFRA

MetaBot · Agent-Native Org Infrastructure

An agent framework reaching from digital to physical — the gateway to Physical AGI.

04 · INFRA
github.com/xvirobotics/metabot
MODULE 01

MetaMemory

Persistent knowledge base; agents share memory docs and HTML — org knowledge accrues automatically.
MODULE 02

Skill Hub

Agents upload and share the skills they accumulate — experience becomes reusable across the team.
MODULE 03

Agent Bus

Agents interconnect across instances — task delegation, real-time messaging, and live collaboration.
MODULE 04

T5T · Top 5 Things

Project-management skill — every agent's project has a kanban; the committee sees everything at a glance.
WHY THIS IS THE MOAT

Why this is the moat

01
Cognitive moat: only an agent-native founder builds this. Putting "agent-native infrastructure" at the very top of the company's priority list is a decision about mindset, not technology. Teams that do not live in this paradigm cannot even see the path.
02
The path to a fully self-evolving multi-agent organization. The org accumulates memory / skills / goals / projects like a swarm — every accumulation is the launchpad for the next step. MetaMemory + Skill Hub + Agent Bus + T5T is its skeleton.
03
The org is the testbed — iteration speed is a generation ahead. Every day we use MetaBot to validate and accelerate ourselves — an order of magnitude faster than a normal company.
04
One framework extending into the physical world — the gateway to Physical AGI. Open-sourced, claiming the Agent-for-Robotics niche.
10 + N
OUTPUT
Per-capita output ≈ 50-person team
THREE-LAYER UNIFIED STACK

An Agent-Driven Three-Layer Unified Architecture

05 · ARCHITECTURE
L1
LAYER ONE

MetaBot · Agent Layer

ALWAYS-ON
Top-level orchestration bridging digital and physical worlds
01 Top-level agent orchestration: task planning · multi-step reasoning · error recovery
02 MetaMemory + Skill Hub + Agent Bus + T5T
03 The hub connecting digital and physical worlds
04 Human supervision + self-evolution loop
L2
LAYER TWO

VLM · Vision-Language Brain

5–10 Hz
Humanoid Foundation Model · perception & decision
01 Video pretraining → post-training → RL (analogous to a Computer Use Agent)
02 Core components: DreamVPT + IDM
03 In-Context RL · learn on the fly, adapt fast to new tasks
L3
LAYER THREE

WBC · Motion Cerebellum

50–500 Hz
Whole-Body Controller · execution layer
01 Controls full body (29 DOF) + dexterous hands (22 × 2)
02 Trained independently with RL in simulation (Isaac Gym)
03 Vision-aware, adapts across terrains
STACK OVERVIEW
Digital Intelligence ⇋ Physical Intelligence
L1 · AGENT · MetaBot · Digital World · Always-On
L2 · VLM · Vision-Language Brain · Perception + Decision · 5–10 Hz
L3 · WBC · Whole-Body Control · Physical World · 50–500 Hz
KEY INSIGHT
L2 and L3 connect via latent space — decision and control flow seamlessly.
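
A minimal sketch of how the two rates could interleave, assuming the upper ends of the stated ranges (VLM at 10 Hz, WBC at 500 Hz) and reading "22×2" as 22 DOF per hand. The function names, the latent size, and the placeholder math are hypothetical illustrations, not XVI's implementation:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

BODY_DOF = 29                          # full-body joints (from the deck)
HAND_DOF = 22                          # per hand, assuming "22x2" means 22 DOF x 2 hands
TOTAL_DOF = BODY_DOF + 2 * HAND_DOF    # joint targets emitted per control step

VLM_HZ = 10      # upper end of the stated 5-10 Hz range
WBC_HZ = 500     # upper end of the stated 50-500 Hz range
LATENT_DIM = 8   # hypothetical latent size, chosen only for illustration


@dataclass
class Latent:
    """Hypothetical L2 -> L3 interface: a latent vector and the VLM step that made it."""
    vlm_step: int
    values: List[float]


def vlm_decide(step: int) -> Latent:
    """Placeholder for the slow perception/decision layer (L2)."""
    return Latent(vlm_step=step, values=[float(step)] * LATENT_DIM)


def wbc_control(latent: Latent) -> List[float]:
    """Placeholder for the fast whole-body controller (L3)."""
    mean = sum(latent.values) / len(latent.values)
    return [mean] * TOTAL_DOF


def run(seconds: int) -> Tuple[int, int, int]:
    """Simulate both loops on an integer tick clock running at WBC_HZ."""
    ticks_per_decision = WBC_HZ // VLM_HZ   # control ticks that reuse one latent
    latent: Optional[Latent] = None
    targets: List[float] = []
    vlm_calls = wbc_calls = 0
    for tick in range(seconds * WBC_HZ):
        if tick % ticks_per_decision == 0:
            # Slow loop: refresh the latent (always fires on tick 0).
            latent = vlm_decide(vlm_calls)
            vlm_calls += 1
        # Fast loop: every tick consumes the most recent latent.
        targets = wbc_control(latent)
        wbc_calls += 1
    return vlm_calls, wbc_calls, len(targets)


if __name__ == "__main__":
    print(run(seconds=1))
```

Under these assumptions, one simulated second yields 10 VLM decisions and 500 WBC steps, so each latent is reused for 50 control ticks, and every WBC step emits 29 + 2 × 22 = 73 joint targets.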
KEYWORDS · CORE TECH

Core Technology Keywords

From the underlying method up to model-level capability, four keywords define XVI's brain.

06 · CORE TECH
KEYWORD 01
/method

DreamVPT

Synthetic video + visual pretraining. The brain learns physical intuition from massive "dreamed" videos.
KEYWORD 02
/architecture

Long-Context Whole-Body VLA

Long context with unified vision-language-action: one end-to-end policy covering the full body and dexterous hands.
KEYWORD 03
/learning

In-Context RL

Learn on the fly, with no retraining. The policy adapts inside the task context, like GPT-style few-shot prompting.
KEYWORD 04
/capability

Compositional Generalization

VLM × WBC compositional generalization — WBC covers full-body motion, VLM perceives and understands the world, their product covers everything.
GENERAL FOUNDATION · UNIVERSAL HUMANOID FM

Same playbook as the LLMs · Benchmark-Driven

We're building a universal humanoid foundation model — not a vertical solution. Same scaling as LLMs, same benchmark grind.
Every public humanoid benchmark — indoor, outdoor, manipulation, navigation, single-step, long-horizon — we aim to top all of them. General capability proven by hard evidence.

07 · GENERAL
PUBLIC BENCHMARKS · FULL COVERAGE
DOMAIN 01

Indoor Manipulation

Home · office · lab — grasping, placement, tool use
DOMAIN 02

Outdoor Locomotion

Complex terrain · dynamic environments · long-range autonomous navigation
DOMAIN 03

Bimanual Coordination

Symmetric / asymmetric two-hand tasks · assembly · transport · tool handoff
DOMAIN 04

Long-Horizon Tasks

Multi-step planning · error recovery · tool-chain calls
DOMAIN 05

Human-Robot Collab

Natural language understanding · joint operation · intent inference
DOMAIN 06

Generalization

New objects · new scenes · zero-shot transfer
The general foundation is the chassis — each public benchmark is hard evidence that "we can ship anything." Not a claim — a leaderboard.
TASTE × MOAT · TARGETED BETS

Beyond general · betting on high-value physical-world scenarios

We bet on high-value physical-world scenarios — places humans can't go, won't go, or shouldn't go.
These three directions are not capability boundaries — they're resource focus. Exclusive data, exclusive scenarios, exclusive benchmarks: a moat nobody else can replicate.

08 · MARKET
PRIVATE BENCHMARKS · EXCLUSIVE SCENARIOS
MARKET 01
PRIVATE BENCHMARK

Humanoid Astronauts

Space-station inspection · lunar/Mars base construction · scientific payload deployment
COST ↓
1–2 orders of magnitude
UPTIME
24 × 7
MARKET 02
PRIVATE BENCHMARK

Robot Hardware Engineers

Robots autonomously testing other robots — replacing human hardware engineers
ITERATION
24 × 7
COST ↓
Massive
MARKET 03
PRIVATE BENCHMARK

Robot Lab Technicians

Replacing physics-experiment researchers — autonomously design and run experiments
THROUGHPUT
10×
SAFETY
HIGH
ANALOGY · THE TRAILBLAZER PATH

Claude is a general LLM · Anthropic bet on coding · topped SWE-bench · shipped Claude Code.

XVI is the universal humanoid foundation · betting on these three directions · each one a killer app of embodied AI.

General is the foundation · taste is the moat — both required, no conflict.

ROADMAP × BUSINESS MODEL

Model Leadership → Vertical Integration

Two-phase path — Phase 1 obsesses over the model layer to establish authority; Phase 2 launches in-house hardware toward mass-produced humanoids. Move fast first, go heavy later — never both at once.

09 · ROADMAP
PHASE 01 · MODEL-FIRST
2026 — 2027 H1 · win the model layer first
2026 · H1

Tech Validation

DreamVPT + IDM + WBC
~100h real-robot seed
core PoC running
2026 · H2

Open-Source Release

scale to 1000h data
model open-sourced · arXiv paper
DeepSeek-style playbook
2026 · Q4

Mars Demo

Ulanqab field site
In-Context RL closed loop
first public live demo
2027 · H1

Model SOTA

leading VLA benchmarks
10000h data
authority established
MODE · LASER FOCUS

10 people, all-in on the model

No hardware distraction · no proactive OEM partnerships · no commercial KPIs
OPEN SOURCE · COMMUNITY-DRIVEN

Models + papers fully open

DeepSeek playbook · community-first · let the SOTA model do the talking
AUTHORITY · MODEL LEADERSHIP

Top the benchmarks

Establish embodied VLA authority · build leverage for Phase 2 fundraise
PHASE 02 · VERTICAL INTEGRATION
From 2027 H2 · in-house hardware · mass-produced humanoids
2027 · H2

Launch In-House Hardware

funding closes → form humanoid team
supply-chain build-out
in-house roadmap locked
2028

GPT-4 Moment

embodied GPT-4 moment
full-body prototype v1
industry inflection reached
2029

Mass Production

XVI in-house humanoids ship
MARKET 01–03 via first-party RaaS (Robot-as-a-Service)
data flywheel kicks in
2030

Scale Deployment

humanoids in everyday spaces
high-value first, household last
core position in the value chain
PRIMARY · MAIN BATTLE

XVI first-party humanoid RaaS

MARKET 01-03 delivered on our own humanoids · full-stack end-to-end service
DATA FLYWHEEL

In-house humanoid · 100% data ownership

Scarce high-value data flows back into the brain · model moat compounds
SECONDARY · SECOND CURVE

API licensing to OEMs

Model leadership spills over naturally · doesn't cannibalize first-party humanoid
Build model authority first · then vertically integrate — not Tesla doing both at once, not Mobileye staying out of hardware forever. We take the third path.
THE RAISE

Funding

A clear capital allocation to fuel the full arc — from PoC to open-source release to ecosystem build-out.

10 · FUNDING
ROUND · ANGEL++

Angel++ · Lead investor welcome

RAISE AMOUNT
$15–20M
USD 15M – 20M
RUNWAY
18 – 24 MONTHS
MILESTONE
2026 Q4 DEMO
ALLOCATION
100%
COMPUTE · 40%
DATA · 30%
TEAM · 20%
HW · 10%
40%
COMPUTE

Compute

GPU cluster leasing — fuels large-scale RL simulation and VLM training
30%
DATA

Data

Video capture · annotation · synthetic-data generation pipeline
20%
TEAM

Team

Core hires + long-term equity incentives (ESOP)
10%
HARDWARE

Robots

Procuring humanoid platforms and dexterous hands for real-robot validation
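
The split above can be turned into dollar figures with simple arithmetic. A minimal sketch, assuming the percentages apply uniformly across the USD 15M to 20M range and, for the burn estimate, that the raise is spent evenly over the runway (both are illustrative assumptions, not statements from the deck):

```python
# Allocation percentages as stated on the funding slide.
ALLOCATION_PCT = {"compute": 40, "data": 30, "team": 20, "hardware": 10}


def split_raise(raise_usd_m: float) -> dict:
    """Return the amount per bucket, in USD millions, for a given raise size."""
    assert sum(ALLOCATION_PCT.values()) == 100, "allocation must sum to 100%"
    return {name: raise_usd_m * pct / 100 for name, pct in ALLOCATION_PCT.items()}


def monthly_burn(raise_usd_m: float, runway_months: int) -> float:
    """Average monthly spend in USD millions, assuming even spending (an assumption)."""
    return raise_usd_m / runway_months


low = split_raise(15)    # lower end of the stated raise
high = split_raise(20)   # upper end of the stated raise

if __name__ == "__main__":
    for name in ALLOCATION_PCT:
        print(f"{name:9s} ${low[name]:.1f}M to ${high[name]:.1f}M")
    # Widest implied range: smallest raise over longest runway,
    # largest raise over shortest runway.
    print(f"burn      ${monthly_burn(15, 24):.3f}M to ${monthly_burn(20, 18):.3f}M per month")
```

Under these assumptions, compute works out to $6M to $8M, data to $4.5M to $6M, team to $3M to $4M, and hardware to $1.5M to $2M.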
CONTACT
floodsung@xvirobotics.com · xvirobotics.com
→ LET'S BUILD THE UNIFIED BRAIN