next-state

Video-to-3D scene compiler with live AI-driven agent simulation. Upload a short video of any indoor space and watch it come alive as an interactive 3D scene populated with autonomous agents.

How it works

Video Upload → Gemini Analysis → ADK Compile Pipeline → CompiledScenePackage JSON → Three.js Simulation

1. Upload a short video (up to 20 s) of an indoor space.
2. Gemini 3.1 Pro analyzes the video, detecting objects, people, zones, and spatial layout.
3. The ADK pipeline compiles the analysis into a structured 3D scene with agents, furniture, and navigation graphs.
4. The Three.js frontend renders the scene and runs a live simulation with utility-based AI agents.

The server is stateless at runtime — the browser owns all live world state.

Tech stack

Layer	Choice
Frontend	Vite + React 19 + Three.js r183 + R3F v9.5
State	Zustand
3D Renderer	WebGPURenderer (auto-fallback to WebGL 2)
UI	shadcn/ui + Tailwind CSS
Backend	Node.js + TypeScript
AI	Gemini 3.1 Pro/Flash-Lite via `@google/genai`
Orchestration	`@google/adk` v0.4.0
Validation	Zod
Testing	Vitest

Getting started

Prerequisites

Node.js 20+
A Google AI API key (paid account)

Setup

# Clone and install
git clone <repo-url> && cd next-state
npm install

# Configure environment
cp .env.example .env  # or create .env manually

Add your API key to .env:

GEMINI_API_KEY=<your-key>
PORT=3001
NODE_ENV=development
CORS_ORIGIN=http://localhost:5173
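
As an illustration of how these variables might be read at startup, here is a minimal sketch. The `ServerConfig` type and `loadConfig` function are hypothetical names, not taken from the repository, and the fallback values simply reuse the example values above. The project lists Zod for validation; this sketch uses plain checks instead so that it has no dependencies.

```typescript
// Hypothetical config shape; the real server may structure this differently.
type ServerConfig = {
  geminiApiKey: string;
  port: number;
  nodeEnv: string;
  corsOrigin: string;
};

// Reads the four variables shown in the .env example and fails fast on bad input.
// Fallback values are assumptions that mirror the example values in this README.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const geminiApiKey = env.GEMINI_API_KEY;
  if (!geminiApiKey) {
    throw new Error("GEMINI_API_KEY is required");
  }

  const port = Number(env.PORT ?? "3001");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }

  return {
    geminiApiKey,
    port,
    nodeEnv: env.NODE_ENV ?? "development",
    corsOrigin: env.CORS_ORIGIN ?? "http://localhost:5173",
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)` after the `.env` file has been loaded.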

Run

# Terminal 1 — Backend (port 3001)
cd server && npm run dev

# Terminal 2 — Frontend (port 5173)
cd client && npm run dev

Open http://localhost:5173 and upload a video.

Project structure

client/          React + Three.js frontend
  src/
    scene/       3D rendering (agents, environment, furniture, labels)
    store/       Zustand state management
    components/  UI panels (inspector, debug overlay, upload)
server/          Node.js API server
  src/
    adk/         Gemini ADK compile pipeline
      agents/    Structuring, style extraction
      prompts/   LLM prompt templates
shared/          Shared TypeScript types and Zod schemas

Features

Procedural 3D agents with animated walking, sitting, talking, fidgeting
Agent props — laptops, phones, cups rendered on agent bodies
Thought bubbles — pop up when agents change their intent
Click-to-inspect — click any agent to see their mind state, traits, goals
Furniture inference — deterministic infill adds chairs, tables, and venue-appropriate objects
Density-aware population — scenes auto-populate with synthetic agents based on crowd density
Style extraction — colors, materials, and lighting matched from the source video
Utility-based AI — agents use softmax sampling over utility scores for natural behavior
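
To make the utility-based AI item concrete, here is a minimal sketch of softmax sampling over utility scores. It is not the project's implementation: the `Action` type, the function names, and the `temperature` parameter are assumptions introduced for illustration.

```typescript
// Hypothetical action shape: a label plus a utility score computed elsewhere.
type Action = { name: string; utility: number };

// Numerically stable softmax: subtract the max before exponentiating.
// Lower temperature makes the highest-utility action dominate more strongly.
function softmax(utilities: number[], temperature = 1): number[] {
  if (utilities.length === 0) return [];
  const scaled = utilities.map((u) => u / temperature);
  const max = Math.max(...scaled);
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Draws one action with probability proportional to exp(utility / temperature).
// `rand` is injectable so a fixed value can be supplied in tests.
function sampleAction(
  actions: Action[],
  temperature = 1,
  rand: () => number = Math.random,
): Action {
  if (actions.length === 0) throw new Error("no actions to sample from");
  const probs = softmax(actions.map((a) => a.utility), temperature);
  let r = rand();
  for (let i = 0; i < actions.length; i++) {
    r -= probs[i];
    if (r < 0) return actions[i];
  }
  // Floating-point drift can leave r at or just above 0; fall back to the last action.
  return actions[actions.length - 1];
}
```

Sampling rather than always taking the highest score means a lower-utility action is still chosen occasionally, which is one way to get the varied, natural-looking behavior described above.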

API endpoints

Method	Path	Description
POST	`/api/upload-video`	Upload video to Gemini Files API
POST	`/api/compile-scene`	Start ADK compile pipeline
GET	`/api/compile-progress/:jobId`	SSE stream of compile steps
GET	`/api/scene/:sceneId`	Full CompiledScenePackage JSON
POST	`/api/agent-refresh`	Sparse cognitive update
POST	`/api/intervention`	World-state mutation
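
A client could wrap the routes above in a small typed helper. The sketch below is illustrative only: `API_BASE`, `Route`, and `routes` are hypothetical names, the base URL assumes the local backend on port 3001 from the setup section, and request and response bodies are left out because they are not documented here.

```typescript
// Assumed local backend address (PORT=3001 in the example .env).
const API_BASE = "http://localhost:3001";

type Route = { method: "GET" | "POST"; url: string };

// One builder per endpoint in the table. Path parameters are URL-encoded
// so that IDs containing reserved characters stay within a single path segment.
const routes = {
  uploadVideo: (): Route => ({ method: "POST", url: `${API_BASE}/api/upload-video` }),
  compileScene: (): Route => ({ method: "POST", url: `${API_BASE}/api/compile-scene` }),
  compileProgress: (jobId: string): Route => ({
    method: "GET",
    url: `${API_BASE}/api/compile-progress/${encodeURIComponent(jobId)}`,
  }),
  scene: (sceneId: string): Route => ({
    method: "GET",
    url: `${API_BASE}/api/scene/${encodeURIComponent(sceneId)}`,
  }),
  agentRefresh: (): Route => ({ method: "POST", url: `${API_BASE}/api/agent-refresh` }),
  intervention: (): Route => ({ method: "POST", url: `${API_BASE}/api/intervention` }),
};
```

In a browser, the SSE progress stream could then be consumed with `new EventSource(routes.compileProgress(jobId).url)`, and the finished scene fetched with `fetch(routes.scene(sceneId).url)`. The event payload format is not specified in this README.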

Testing

cd server && npm run test
cd client && npm run test
