Surface Studio

A unified projection mapping studio that combines quad warping, 13 interactive widget types, camera-based hand and face tracking, configurable TouchOSC-style actions, voice commands with Whisper STT, and LLM integration — all in a single self-contained HTML file.

Projection Mapping Hand Tracking Face Detection Voice Commands 13 Widgets TouchOSC Actions Ollama LLM Zero Install

▶ Quick Start

Serve the interactive-spaces/ folder over HTTP (e.g. python -m http.server 8000)
Open the launcher page in your browser
Click "Open Editor" to enter the full editing interface
Click "+ Quad" in the toolbar to create your first surface
Drag widgets from the palette on the left onto the canvas
Switch to Preview mode to see warped output and drag corners
Open a second window with "Open Projection" for fullscreen output
Connect a camera in Camera mode to enable hand/face tracking

Tip: The projection window updates live as you edit. Move widgets, drag corners, change properties — everything syncs instantly via BroadcastChannel.

☰ Modes

Surface Studio uses hash-based routing with three modes:

🏠

Launcher

The landing page. Two buttons: "Open Editor" and "Open Projection". Each opens in a new window.

✎

Editor #editor

Full app interface with menu bar, 3-panel layout, toolbar, and status bar. Where you build your project.

▢

Projection #projection

Fullscreen black canvas with warped quads and interactive widgets. This goes on your projector.

⚙ Architecture

The entire app is a single index.html file (~2100 lines). No build step, no dependencies beyond CDN imports for MediaPipe and Whisper.

Data Flow

Key Technologies

Homography — 4-point DLT algorithm for perspective transforms, converted to CSS matrix3d()
MediaPipe — HandLandmarker (21 landmarks) + FaceLandmarker (468+ landmarks) via WebAssembly
Whisper — Tiny English model via @huggingface/transformers, runs entirely in-browser
Ollama — Local LLM via localhost:11434 REST API with streaming responses
BroadcastChannel — Real-time cross-window sync without WebSocket

▦ Editor Layout

☰ Menu Bar

Desktop-style menu bar. Click to open, hover to switch between menus.

File

New Project Ctrl+N — Clear everything and start fresh
Open / Import Ctrl+O — Load a .json project file
Save Ctrl+S — Save to localStorage (auto-saves on every change)
Export JSON — Download the project as a .json file
Screenshot Ctrl+Shift+P — Save the canvas as a PNG

Edit

Undo / Redo Ctrl+Z / Ctrl+Shift+Z — Up to 40 levels of undo
Delete Del — Remove selected widget or quad
Duplicate Ctrl+D — Clone the selected widget with offset

View

Edit / Preview / Camera 1 2 3 — Switch canvas mode
Show Grid Ctrl+G — Toggle 60px alignment grid
Snap to Grid — Snap widget movement to 20px increments
Toggle Panels — Hide/show left and right panels

Quads

Add / Delete Quad — Create or remove surfaces
Layout: Grid / Row / Column — Auto-arrange quads in the projection space
Reset Corners — Reset selected quad to default rectangle

Widgets

Quick-add any of the 13 widget types to the selected quad.

Camera

Start / Stop Camera — Toggle webcam feed
Start Calibration — Begin the 4-point calibration flow
Show Hand Overlay — Toggle skeleton visualization
Test Grid — Project alignment grid for verification

Voice

Start/Stop Listening F5 — Toggle voice recognition
LLM Prompt Ctrl+L — Open text prompt for Ollama
Check Ollama — Test connection to local LLM

Hierarchy Panel

The left panel shows a tree of all quads and their widgets. Click to select. Each quad shows a colored dot matching its border color. Widgets are indented with a type icon prefix.

Widget Palette

Below the hierarchy, a 3-column grid shows all 13 widget types. Click any item to add it to the center of the selected quad.

Canvas Area

The center panel has three views:

Edit Mode 1

Flat view of the quad's virtual canvas. Click to select widgets, drag to move, drag the bottom-right handle to resize. Shows all widgets as labeled rectangles.

Preview Mode 2

Live warped preview showing actual widget rendering with CSS matrix3d transforms. Drag corner handles to warp quads. Drag the quad body to move the entire surface. Updates the projection window in real-time.

Camera Mode 3

Live webcam feed with MediaPipe hand skeleton and face mesh overlay. Used for calibration and monitoring tracking quality.

Inspector Panel

The right panel has three tabs:

Properties Tab

Shows editable fields for the selected item:

Quad selected: Name, color, canvas dimensions, corner coordinates
Widget selected: Type info, transform (X/Y/W/H), all type-specific properties (label, color, value, src, etc.)

Actions Tab

Configure TouchOSC-style triggers and actions for the selected widget. See the Action System section.

Camera Tab

Camera controls, calibration buttons, hand/face tracking stats, and settings toggles.

Status Bar

The bottom bar shows live status indicators:

Voice — Listening status (green dot when active)
Whisper — Model load state and readiness
Ollama — Connection status and active model name
Face — Face count and mouth state when camera is active
Cal — Calibration accuracy in pixels
Info — Quad and widget counts

▪ 13 Widget Types

▣

Button

Pressable button with label and color. Triggers: press, release. Default 160×60.

━

Slider

Horizontal slider with min/max/value. Draggable in projection. Trigger: value-change. Default 200×50.

◎

Knob

Rotary knob with canvas rendering. Drag up/down to change value. Trigger: value-change. Default 80×80.

⭘

Toggle

On/off switch with sliding thumb animation. Triggers: toggle-on, toggle-off. Default 120×40.

⬛

Big Button

Large circular/rounded button with icon support, custom background, and font size. Triggers: press, release. Default 240×240.

Readout

Numeric/text display with label, value, unit, and format. Great as an action target. Trigger: touch. Default 160×60.

Label

Static text with configurable font size and color. Trigger: touch. Default 200×40.

▨

Image

Displays an image from URL or data URI. Supports contain/cover fit modes. Drag & drop supported. Default 320×240.

▶

Video

Embedded video player with autoplay, loop, and muted options. Drag & drop supported. Default 640×360.

⌗

Web (iframe)

Live web page embed. Supports any URL including GitHub Pages. Renders as actual interactive iframe in projection. Default 640×480.

♫

Soundboard

Grid of sound trigger buttons. Configurable columns and sound sources. Trigger: press. Default 300×200.

●

Mic

Circular microphone button with recording state animation. Triggers: record-start, record-stop. Default 100×100.

</>

HTML

Raw HTML content rendered directly. Use for custom layouts, embedded content, or dynamic displays. Trigger: touch. Default 400×200.

▢ Quads & Projection Mapping

Quads are the projection surfaces. Each quad is a rectangular canvas that gets warped via 4-point homography to match a physical surface.

Working with Quads

Add: Click "+ Quad" in the toolbar or use Quads > Add Quad
Select: Click in the hierarchy panel or on the canvas
Warp: Switch to Preview mode, drag the corner handles
Move: In Preview mode, click and drag the quad body
Layout: Use Quads > Layout to auto-arrange as grid, row, or column
Delete: Select and press Del or use the inspector

Homography Math

Each quad's four corners define a perspective transform computed via Direct Linear Transform (DLT). The resulting 3×3 homography matrix is converted to a CSS matrix3d() for GPU-accelerated rendering.

Multi-quad: You can create as many quads as needed. Each has its own virtual canvas (default 1920×1080), color, and independent corner positions. All quads render simultaneously in the projection window.

⚡ Action System (TouchOSC-style)

Every widget can have triggers that fire actions. This is the core interactivity system, inspired by TouchOSC's trigger/action model.

How It Works

Select a widget and go to the Actions tab in the inspector
Choose a trigger event from the dropdown (e.g., press) and click "+ Trigger"
Under the trigger, click "+ Action" to add what happens
Configure the action type, target widget, and parameters
Optionally set a voice keyword on the trigger for voice activation

Trigger Events

Widget Type	Available Triggers
Button, Big Button	`press`, `release`, `face-detected`, `face-lost`, `mouth-open`, `mouth-close`, `blink`
Slider, Knob	`value-change`
Toggle	`toggle-on`, `toggle-off`
Mic	`record-start`, `record-stop`
Readout, Label	`touch`, `face-detected`, `face-lost`, `mouth-open`, `mouth-close`, `blink`
Image, Video, HTML, Web	`touch`
Soundboard	`press`

Action Types

Set Widget Prop

Set a property on any widget. E.g., set a readout's value to 100, change a label's text, update a slider's value.

Toggle Widget Prop

Toggle a boolean property on a target widget (e.g., a toggle's on state).

Play Sound

Play an audio file by URL. Use for sound effects, alerts, or ambient audio triggered by interactions.

Send Broadcast

Send a message on a named BroadcastChannel. Use to communicate with other web apps or custom integrations.

Run Command

Execute a voice command string programmatically. E.g., add button at 100 200.

Query LLM

Send a prompt to Ollama and display the streaming response in the command log.

Example: Button Controls a Readout

Add a Button and a Readout to a quad
Select the Button, go to Actions tab
Add trigger: press
Add action: Set Widget Prop
Target: the Readout widget
Prop: value, Value: 100
Now pressing the button in projection mode sets the readout to "100"

Voice Keywords: Set a voice keyword on any trigger (e.g., "start show"). When that phrase is spoken, the trigger fires — even before checking built-in voice commands.

📎 Drag & Drop

Drop files or URLs directly onto the editor canvas:

Image files (PNG, JPG, GIF, SVG) — Creates an Image widget with the file embedded as a data URI
Video files (MP4, WebM) — Creates a Video widget with the file embedded
URLs (dragged from browser address bar or links) — Creates an iframe widget with the URL as source

Position: The widget is created at the drop location on the canvas, so you can place it precisely where you want.

🌐 Live Web Embeds

The Web (iframe) widget embeds live web pages that render at full resolution within warped quads in projection mode.

Use Cases

Embed dashboards, data visualizations, or monitoring tools
Display GitHub Pages projects as interactive surfaces
Show web apps, documentation, or any URL
Combine multiple web embeds across different quads

Setting Up

Add a Web widget from the palette
In the inspector, set the src property to your URL
The iframe renders live in projection mode with script/form support

CORS note: Some sites block iframe embedding via X-Frame-Options. GitHub Pages, personal sites, and most web apps work fine. Sites like Google or Twitter will not load in iframes.

📷 Camera Setup

Surface Studio uses your webcam for two things: hand tracking (touch interaction on projected surfaces) and face tracking (expression-based triggers).

Starting the Camera

Switch to Camera mode (3) — camera starts automatically
Or use Camera > Start Camera from the menu
Grant browser permission when prompted
The camera runs in the background even when viewing Edit/Preview

First load: MediaPipe models (~5-10MB) are downloaded from CDN on first use and cached by the browser.

✋ Hand Tracking

MediaPipe HandLandmarker detects up to 2 hands with 21 landmarks each, running on your GPU via WebAssembly.

How Touch Works

The index fingertip (landmark 8) position determines the cursor location
Touch detection uses two signals:
- Finger extended (tip above PIP joint)
- Z-depth < -0.05 (finger pressing toward camera)
When both conditions are met, a touch event fires
After calibration, camera coordinates are transformed to projector space via homography

Landmark Map

Camera View Overlay

When enabled, the camera view draws:

Green skeleton — Bone connections between landmarks
Green dots — Each landmark position
Pink dot — Index fingertip (landmark 8), larger than others
Stats text — Hand count and face count in the top-left corner

😶 Face Tracking

MediaPipe FaceLandmarker detects up to 2 faces with 468+ landmarks each, enabling expression-based interactions.

What's Detected

Face Presence

Detects when faces appear or disappear. Fires face-detected and face-lost triggers on widgets.

Mouth Open/Close

Measures upper-to-lower lip distance relative to face height. Fires mouth-open and mouth-close triggers.

Eye Blinks

Eye Aspect Ratio (EAR) detects when both eyes close and reopen. Fires blink trigger. Tracks total blink count.

Head Pose

Compares nose tip position to face center to estimate looking direction: left, right, up, down, or center.

Face Overlay

The camera view draws face landmarks in cyan/aqua:

Jawline contour — Full face silhouette
Eyebrows — Left and right brow arcs
Lips — Outer lip contour
Nose bridge — Center vertical line
Eye centers — Larger dots using iris landmarks
Key points — Nose tip, eye corners, lip centers

Expression Triggers

Use face events as triggers in the Action System. Example use cases:

Open mouth to trigger a sound effect
Blink to cycle through slide content
Face detection starts an ambient animation, face lost pauses it
Head pose direction controls which quad is highlighted

Performance: Hand and face tracking run simultaneously on the GPU. On modern hardware, both maintain 30+ FPS. The camera inspector shows live FPS stats.

🎯 Calibration

Calibration maps camera coordinates to projector coordinates so hand touches land on the right widgets.

4-Point Calibration Flow

Open the projection window on your projector
Point your camera at the projected surface
Go to Camera > Start Calibration (or say "calibrate")
The projection shows dot 1 (top-left) with a pulsing ring animation
In the camera view, click where you see dot 1
Dot 1 turns green (confirmed), dot 2 appears
Repeat for all 4 corners: TL → TR → BR → BL
System computes homography and reports reprojection error
Status bar shows permanent "Cal: 2.1px" accuracy badge

Calibration Quality

< 5px — Excellent. Touch will be very accurate.
5-10px — Good. Adequate for buttons and large widgets.
> 10px — Warning shown. Consider redoing calibration. Make sure dots are clearly visible and click precisely.

Tips for accuracy: Keep the camera steady and perpendicular to the surface. Ensure good lighting so the projected dots are clearly visible in the camera feed. Click the exact center of each dot.

Persistence

Calibration data is saved to localStorage and restored on reload. Use Camera > Reset Calibration to clear it.

👆 Touch Interaction

After calibration, your hand becomes a touch controller for the projected surface.

Interaction Model

Hover: Point your index finger at the surface — cursor follows
Press: Push your finger toward the surface (z-depth decreases) — triggers touch-start
Release: Pull finger back — triggers touch-end

Widget Interactions

Widget	Touch Behavior
Button / Big Button	Press visual feedback + fires `press` trigger
Slider	Drag horizontally to change value
Knob	Drag up/down to change value
Toggle	Tap to switch on/off
Soundboard	Tap individual sound buttons
Mic	Tap to start/stop recording indicator

Touch Cursor

The projection window shows a circular cursor that follows the tracked fingertip. It shrinks and fills when pressing, giving clear visual feedback.

🎤 Voice Commands

Press F5 to start voice recognition. Whisper STT processes 3-second audio chunks locally in your browser.

Built-in Commands

Command	Example	Description
`add [type]`	"add button at 200 300 labeled Play"	Add a widget at position with optional label
`add quad`	"add quad named Display"	Create a new quad
`select [name]`	"select play button"	Select a widget or quad by name
`move [x] [y]`	"move to 500 400"	Move selected widget
`resize [w] [h]`	"resize 300 200"	Resize selected widget
`set color [color]`	"set color red"	Change widget color (named colors or hex)
`delete`	"delete"	Delete selected widget/quad
`duplicate`	"duplicate"	Clone selected widget
`layout [mode]`	"layout grid"	Auto-arrange quads (grid/row/column)
`save`	"save"	Save project
`calibrate`	"calibrate"	Start calibration flow
`start/stop camera`	"start camera"	Toggle webcam
`ask [question]`	"ask suggest a layout"	Query Ollama LLM

Priority: Widget voice keywords are checked first, then built-in commands. This lets you override or extend the command vocabulary per project.

Whisper STT

Speech recognition uses Xenova/whisper-tiny.en running entirely in your browser via @huggingface/transformers.

Model size: ~40MB, downloaded once and cached
Processing: 3-second audio chunks at 16kHz
Silence detection: Chunks below energy threshold 0.005 are discarded
Language: English only (tiny.en model)

The status bar shows download progress during first load.

Ollama LLM Integration

Connect to a local Ollama instance for AI-powered assistance.

Setup

Install Ollama and pull a model: ollama pull llama3
Make sure Ollama is running on localhost:11434
Surface Studio auto-detects available models (prefers llama3, then mistral)
Status bar shows connection status and model name

Usage

Menu: Voice > LLM Prompt Ctrl+L
Voice: Say "ask" followed by your question
Action: Use query-llm action type on widget triggers

Responses stream into the command log in real-time.

Figurate API Key

Surface Studio includes a built-in place to store your Figurate key for voice and image-description/analyze integrations.

Open Help > Add Figurate Key...
Paste your key in the prompt. Figurate keys start with fg_
Click OK to save, or clear the field to remove the key

Storage: The key is saved locally in this browser under surface-studio-figurate-key. It is not written into exported project JSON.

Use this when you want Surface Studio or related integrations to call Figurate-backed voice, narration, image description, or analyze endpoints without re-entering the key each session.

Voice Keywords

Any trigger can have a voiceKeyword — a phrase that activates the trigger when spoken.

Example

Add a Button widget labeled "Start Show"
Go to Actions tab, add a press trigger
Set voice keyword to start show
Add actions under that trigger (e.g., set readout value, play sound)
Now saying "start show" fires all those actions

⌨ Keyboard Shortcuts

Key	Action
`1`	Edit mode
`2`	Preview mode
`3`	Camera mode
`Ctrl+N`	New project
`Ctrl+O`	Open / Import
`Ctrl+S`	Save
`Ctrl+Z`	Undo
`Ctrl+Shift+Z`	Redo
`Ctrl+D`	Duplicate widget
`Ctrl+G`	Toggle grid
`Ctrl+L`	LLM prompt
`Ctrl+Shift+P`	Screenshot
`Del` / `Backspace`	Delete selected
`F5`	Toggle voice listening
`H`	Projection HUD: show / hide controls
`M`	Projection HUD: toggle minimal view
`C`	Projection HUD: toggle cursor
`G`	Projection HUD: toggle test grid
`F`	Projection HUD: toggle fullscreen
`Escape`	Deselect all

💾 Save & Export

Auto-save: Every change is debounced (500ms) and saved to localStorage
Manual save: Ctrl+S or File > Save
Export: Downloads the full project as a .json file
Import: Load a previously exported .json file
Undo/Redo: 40-level undo stack, survives within the session
Calibration: Saved separately in localStorage, persists across sessions

📡 BroadcastChannel API

The editor and projection windows communicate via BroadcastChannel('surface-studio').

Message Types

Type	Direction	Description
`project-update`	Editor → Projection	Full project state + test grid flag
`widget-update`	Both directions	Single widget props update
`widget-activated`	Projection → Editor	Widget interaction event
`touch`	Editor → Projection	Hand tracking cursor position
`touch-start`	Editor → Projection	Finger press detected
`touch-end`	Editor → Projection	Finger released
`touch-visibility`	Editor to Projection	Shows or hides the projected touch cursor when tracking changes

Custom integration: Use the send-broadcast action type to send messages on any channel. External web apps can listen on the same channel to react to Surface Studio events.

⚠ Troubleshooting

Camera not starting

Ensure you're serving over HTTP (not file://). Camera requires secure context.
Check browser permissions — click the lock icon in the address bar
Try a different browser (Chrome recommended for MediaPipe)

MediaPipe models not loading

Models are fetched from CDN on first use. Check your internet connection.
If behind a firewall, ensure cdn.jsdelivr.net and storage.googleapis.com are accessible

Whisper not transcribing

First load downloads ~40MB. Watch the status bar for progress.
Speak clearly and at normal volume
Silence detection threshold is 0.005 RMS — very quiet speech may be discarded

Ollama not connecting

Ensure Ollama is running: ollama serve
Check CORS: Ollama 0.1.24+ allows localhost by default
Verify models: ollama list

Projection window not updating

Both windows must be on the same origin (same host:port)
BroadcastChannel requires same-origin — no cross-port communication
Try closing and reopening the projection window

Iframe not loading

The target site may block iframes via X-Frame-Options or CSP
GitHub Pages, personal sites, and most web apps work fine
Test with a simple URL like https://example.com first

Calibration inaccurate

Keep the camera steady and avoid parallax
Click the exact center of each calibration dot
Ensure the camera can see the full projected area
Redo calibration if you move the camera or projector

Surface Studio is part of the Interactive Spaces toolkit. Built with vanilla HTML, CSS, and JavaScript. No build tools required.

Surface Studio

▶ Quick Start

☰ Modes

⚙ Architecture

Data Flow

Key Technologies

▦ Editor Layout

☰ Menu Bar

File

Edit

View

Quads

Widgets

Camera

Voice

Toolbar

Hierarchy Panel

Widget Palette

Canvas Area

Inspector Panel

Properties Tab

Actions Tab

Camera Tab

Status Bar

▪ 13 Widget Types

▢ Quads & Projection Mapping

Working with Quads

Homography Math

⚡ Action System (TouchOSC-style)

How It Works

Trigger Events

Action Types

Example: Button Controls a Readout

📎 Drag & Drop

🌐 Live Web Embeds

Use Cases

Setting Up

📷 Camera Setup

Starting the Camera

✋ Hand Tracking

How Touch Works

Landmark Map

Camera View Overlay

😶 Face Tracking

What's Detected

Face Overlay

Expression Triggers

🎯 Calibration

4-Point Calibration Flow

Calibration Quality

Persistence

👆 Touch Interaction

Interaction Model

Widget Interactions

Touch Cursor

🎤 Voice Commands

Built-in Commands

Whisper STT

Ollama LLM Integration

Setup

Usage

Figurate API Key

Voice Keywords

Example

⌨ Keyboard Shortcuts

💾 Save & Export

📡 BroadcastChannel API

Message Types

⚠ Troubleshooting

Camera not starting

MediaPipe models not loading

Whisper not transcribing

Ollama not connecting

Projection window not updating

Iframe not loading

Calibration inaccurate