
Surface Studio

A unified projection mapping studio that combines quad warping, 13 interactive widget types, camera-based hand and face tracking, configurable TouchOSC-style actions, voice commands with Whisper STT, and LLM integration — all in a single self-contained HTML file.

Projection Mapping · Hand Tracking · Face Detection · Voice Commands · 13 Widgets · TouchOSC Actions · Ollama LLM · Zero Install

Quick Start

  1. Serve the interactive-spaces/ folder over HTTP (e.g. python -m http.server 8000)
  2. Open the launcher page in your browser
  3. Click "Open Editor" to enter the full editing interface
  4. Click "+ Quad" in the toolbar to create your first surface
  5. Drag widgets from the palette on the left onto the canvas
  6. Switch to Preview mode to see warped output and drag corners
  7. Open a second window with "Open Projection" for fullscreen output
  8. Connect a camera in Camera mode to enable hand/face tracking
Tip: The projection window updates live as you edit. Move widgets, drag corners, change properties — everything syncs instantly via BroadcastChannel.

Modes

Surface Studio uses hash-based routing with three modes:

🏠
Launcher
The landing page. Two buttons: "Open Editor" and "Open Projection". Each opens in a new window.
Editor #editor
Full app interface with menu bar, 3-panel layout, toolbar, and status bar. Where you build your project.
Projection #projection
Fullscreen black canvas with warped quads and interactive widgets. This goes on your projector.

Architecture

The entire app is a single index.html file (~2100 lines). No build step, no dependencies beyond CDN imports for MediaPipe and Whisper.

Data Flow

[Diagram: BroadcastChannel data flow. The editor window (quad/widget editing, corner dragging, camera feed and tracking, voice commands) and the projection window (warped quad rendering, live widgets, touch input, action triggers) sync via project-update, widget-update, and touch messages. localStorage auto-saves project state and calibration data (editor: read/write; projection: read-only). The camera feeds MediaPipe WASM, which returns landmarks to the editor.]
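The sync protocol above can be sketched as a small message router; the handler wiring and function names here are illustrative assumptions, not the app's internals.

```javascript
// Minimal sketch of the editor → projection sync protocol. Message type
// names follow the table later in this document; everything else is
// illustrative.
function createRouter() {
  const handlers = new Map();
  return {
    on(type, fn) { handlers.set(type, fn); },
    // Dispatch one message envelope of the shape { type, ...payload }.
    dispatch(msg) {
      const fn = handlers.get(msg.type);
      if (fn) { fn(msg); return true; }
      return false; // unknown message types are ignored
    },
  };
}

// Hypothetical usage in the projection window:
// const chan = new BroadcastChannel('surface-studio');
// const router = createRouter();
// router.on('project-update', (m) => renderProject(m.project));
// router.on('touch', (m) => moveCursor(m.x, m.y));
// chan.onmessage = (e) => router.dispatch(e.data);
```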

Key Technologies

Editor Layout

[Diagram: Editor layout. Menu bar: File, Edit, View, Quads, Widgets, Camera, Voice. Toolbar: project name, "+ Quad", Edit / Preview / Cam mode switcher. Left panel (170px): hierarchy tree plus widget palette. Center canvas (flex): flat widget view in Edit mode, warped output in Preview, video feed in Camera. Right panel (260px): Props / Actions / Camera tabs with transform, properties, and type-specific fields. Status bar: Whisper state, Ollama model, face count, calibration accuracy, quad and widget counts.]

Toolbar

The toolbar shows the project name (editable), a "+ Quad" button, and the canvas mode switcher (Edit / Preview / Camera).

Hierarchy Panel

The left panel shows a tree of all quads and their widgets. Click to select. Each quad shows a colored dot matching its border color. Widgets are indented with a type icon prefix.

Widget Palette

Below the hierarchy, a 3-column grid shows all 13 widget types. Click any item to add it to the center of the selected quad.

Canvas Area

The center panel has three views:

Edit Mode (shortcut: 1)
Flat view of the quad's virtual canvas. Click to select widgets, drag to move, drag the bottom-right handle to resize. Shows all widgets as labeled rectangles.
Preview Mode (shortcut: 2)
Live warped preview showing actual widget rendering with CSS matrix3d transforms. Drag corner handles to warp quads. Drag the quad body to move the entire surface. Updates the projection window in real time.
Camera Mode (shortcut: 3)
Live webcam feed with MediaPipe hand skeleton and face mesh overlays. Used for calibration and for monitoring tracking quality.

Inspector Panel

The right panel has three tabs:

Properties Tab

Shows editable fields for the selected item:

Actions Tab

Configure TouchOSC-style triggers and actions for the selected widget. See the Action System section.

Camera Tab

Camera controls, calibration buttons, hand/face tracking stats, and settings toggles.

Status Bar

The bottom bar shows live status indicators:

13 Widget Types

Button
Pressable button with label and color. Triggers: press, release. Default 160×60.
Slider
Horizontal slider with min/max/value. Draggable in projection. Trigger: value-change. Default 200×50.
Knob
Rotary knob with canvas rendering. Drag up/down to change value. Trigger: value-change. Default 80×80.
Toggle
On/off switch with sliding thumb animation. Triggers: toggle-on, toggle-off. Default 120×40.
Big Button
Large circular/rounded button with icon support, custom background, and font size. Triggers: press, release. Default 240×240.
Readout
Numeric/text display with label, value, unit, and format. Great as an action target. Trigger: touch. Default 160×60.
Label
Static text with configurable font size and color. Trigger: touch. Default 200×40.
Image
Displays an image from URL or data URI. Supports contain/cover fit modes. Drag & drop supported. Default 320×240.
Video
Embedded video player with autoplay, loop, and muted options. Drag & drop supported. Default 640×360.
Web (iframe)
Live web page embed. Supports any URL including GitHub Pages. Renders as actual interactive iframe in projection. Default 640×480.
Soundboard
Grid of sound trigger buttons. Configurable columns and sound sources. Trigger: press. Default 300×200.
Mic
Circular microphone button with recording state animation. Triggers: record-start, record-stop. Default 100×100.
HTML
Raw HTML content rendered directly. Use for custom layouts, embedded content, or dynamic displays. Trigger: touch. Default 400×200.

Quads & Projection Mapping

Quads are the projection surfaces. Each quad is a rectangular canvas that gets warped via 4-point homography to match a physical surface.

Working with Quads

Homography Math

Each quad's four corners define a perspective transform computed via Direct Linear Transform (DLT). The resulting 3×3 homography matrix is converted to a CSS matrix3d() for GPU-accelerated rendering.

Multi-quad: You can create as many quads as needed. Each has its own virtual canvas (default 1920×1080), color, and independent corner positions. All quads render simultaneously in the projection window.
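The DLT-to-matrix3d pipeline can be sketched in plain JavaScript. For the special case of mapping a quad's unit-square canvas onto four corners there is a closed-form solution (Heckbert's projective mapping); the corner order TL, TR, BR, BL and all function names here are assumptions, not the app's actual code.

```javascript
// Closed-form homography from the unit square to an arbitrary quad.
// Corner order assumed: top-left, top-right, bottom-right, bottom-left.
function quadHomography([x0, y0], [x1, y1], [x2, y2], [x3, y3]) {
  const dx1 = x1 - x2, dx2 = x3 - x2, dx3 = x0 - x1 + x2 - x3;
  const dy1 = y1 - y2, dy2 = y3 - y2, dy3 = y0 - y1 + y2 - y3;
  const den = dx1 * dy2 - dy1 * dx2; // zero only for degenerate quads
  const g = (dx3 * dy2 - dy3 * dx2) / den;
  const h = (dx1 * dy3 - dy1 * dx3) / den;
  return {
    a: x1 - x0 + g * x1, b: x3 - x0 + h * x3, c: x0,
    d: y1 - y0 + g * y1, e: y3 - y0 + h * y3, f: y0,
    g, h, // the 3x3 matrix is [[a,b,c],[d,e,f],[g,h,1]]
  };
}

// Apply the homography to a point (u, v) in the unit square.
function warp(H, u, v) {
  const w = H.g * u + H.h * v + 1;
  return [(H.a * u + H.b * v + H.c) / w, (H.d * u + H.e * v + H.f) / w];
}

// Expand the 3x3 matrix into a column-major CSS matrix3d() string
// for GPU-accelerated rendering.
function toMatrix3d(H) {
  return `matrix3d(${H.a},${H.d},0,${H.g},${H.b},${H.e},0,${H.h},0,0,1,0,${H.c},${H.f},0,1)`;
}
```

A quad's rendered element gets `element.style.transform = toMatrix3d(H)` (after scaling the unit square up to the virtual canvas size).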

Action System (TouchOSC-style)

Every widget can have triggers that fire actions. This is the core interactivity system, inspired by TouchOSC's trigger/action model.

How It Works

  1. Select a widget and go to the Actions tab in the inspector
  2. Choose a trigger event from the dropdown (e.g., press) and click "+ Trigger"
  3. Under the trigger, click "+ Action" to add what happens
  4. Configure the action type, target widget, and parameters
  5. Optionally set a voice keyword on the trigger for voice activation
[Diagram: Action system flow. A widget event (press, value-change, ...) matches a trigger (optionally carrying a voice keyword), which runs actions (set-prop, play-sound, ...) against a target widget, modifying its properties. Example: pressing a "Play" button with a press trigger (keyword "play") runs Set Widget Prop (prop "value", value "100") on a Readout, setting it to 100.]

Trigger Events

Available triggers by widget type:

Button, Big Button: press, release, face-detected, face-lost, mouth-open, mouth-close, blink
Slider, Knob: value-change
Toggle: toggle-on, toggle-off
Mic: record-start, record-stop
Readout, Label: touch, face-detected, face-lost, mouth-open, mouth-close, blink
Image, Video, HTML, Web: touch
Soundboard: press

Action Types

Set Widget Prop
Set a property on any widget. E.g., set a readout's value to 100, change a label's text, update a slider's value.
Toggle Widget Prop
Toggle a boolean property on a target widget (e.g., a toggle's on state).
Play Sound
Play an audio file by URL. Use for sound effects, alerts, or ambient audio triggered by interactions.
Send Broadcast
Send a message on a named BroadcastChannel. Use to communicate with other web apps or custom integrations.
Run Command
Execute a voice command string programmatically. E.g., add button at 100 200.
Query LLM
Send a prompt to Ollama and display the streaming response in the command log.
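Putting the pieces together, a minimal dispatcher for the trigger/action model might look like this; the data shape and field names are hypothetical, not the app's actual schema.

```javascript
// Hypothetical widget config: a Play button whose press trigger sets a
// readout's value to 100 (field names are assumptions).
const playButton = {
  id: 'btn1', type: 'button', label: 'Play',
  triggers: [{
    event: 'press', voiceKeyword: 'play',
    actions: [{ type: 'set-prop', target: 'readout1', prop: 'value', value: '100' }],
  }],
};

// Fire every action whose trigger matches an event on a widget.
function fireEvent(widget, event, widgetsById) {
  for (const trig of widget.triggers || []) {
    if (trig.event !== event) continue;
    for (const act of trig.actions) {
      if (act.type === 'set-prop') {
        widgetsById[act.target][act.prop] = act.value;
      } else if (act.type === 'toggle-prop') {
        widgetsById[act.target][act.prop] = !widgetsById[act.target][act.prop];
      }
      // play-sound, send-broadcast, run-command, and query-llm omitted here
    }
  }
}
```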

Example: Button Controls a Readout

  1. Add a Button and a Readout to a quad
  2. Select the Button, go to Actions tab
  3. Add trigger: press
  4. Add action: Set Widget Prop
  5. Target: the Readout widget
  6. Prop: value, Value: 100
  7. Now pressing the button in projection mode sets the readout to "100"
Voice Keywords: Set a voice keyword on any trigger (e.g., "start show"). When that phrase is spoken, the trigger fires — even before checking built-in voice commands.

📎 Drag & Drop

Drop files or URLs directly onto the editor canvas:

Position: The widget is created at the drop location on the canvas, so you can place it precisely where you want.

🌐 Live Web Embeds

The Web (iframe) widget embeds live web pages that render at full resolution within warped quads in projection mode.

Use Cases

Setting Up

  1. Add a Web widget from the palette
  2. In the inspector, set the src property to your URL
  3. The iframe renders live in projection mode with script/form support
CORS note: Some sites block iframe embedding via X-Frame-Options. GitHub Pages, personal sites, and most web apps work fine. Sites like Google or Twitter will not load in iframes.

📷 Camera Setup

Surface Studio uses your webcam for two things: hand tracking (touch interaction on projected surfaces) and face tracking (expression-based triggers).

Starting the Camera

First load: MediaPipe models (~5-10MB) are downloaded from CDN on first use and cached by the browser.

Hand Tracking

MediaPipe HandLandmarker detects up to 2 hands with 21 landmarks each, running in-browser via WebAssembly with GPU acceleration.

How Touch Works

  1. The index fingertip (landmark 8) position determines the cursor location
  2. Touch detection uses two signals:
    • Finger extended (tip above PIP joint)
    • Z-depth < -0.05 (finger pressing toward camera)
  3. When both conditions are met, a touch event fires
  4. After calibration, camera coordinates are transformed to projector space via homography
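The two-signal touch test above reduces to a few comparisons on the landmark array. Indices 8 (index tip) and 6 (index PIP) follow MediaPipe's standard hand landmark numbering; the helper name is an assumption.

```javascript
// Touch heuristic from MediaPipe hand landmarks. Image coordinates have
// y increasing downward, so "tip above PIP" means tip.y < pip.y. The
// -0.05 z threshold is the value stated in this document.
function isTouching(landmarks) {
  const tip = landmarks[8]; // index fingertip (cursor position)
  const pip = landmarks[6]; // index PIP joint (touch reference)
  const extended = tip.y < pip.y;  // finger extended
  const pressing = tip.z < -0.05;  // pressing toward the camera
  return extended && pressing;
}
```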

Landmark Map

[Diagram: MediaPipe's 21 hand landmarks. 0: wrist. Thumb 1-4 (CMC, MCP, IP, tip). Index finger 5-8 (MCP, PIP, DIP, tip); the index tip (8) is the cursor position, with the PIP joint as the touch reference. Middle 9-12, ring 13-16, pinky 17-20 (each MCP, PIP, DIP, tip). Fingertips: 4 (thumb), 8 (index), 12 (middle), 16 (ring), 20 (pinky).]

Camera View Overlay

When enabled, the camera view draws:

😶 Face Tracking

MediaPipe FaceLandmarker detects up to 2 faces with 468+ landmarks each, enabling expression-based interactions.

What's Detected

Face Presence
Detects when faces appear or disappear. Fires face-detected and face-lost triggers on widgets.
Mouth Open/Close
Measures upper-to-lower lip distance relative to face height. Fires mouth-open and mouth-close triggers.
Eye Blinks
Eye Aspect Ratio (EAR) detects when both eyes close and reopen. Fires blink trigger. Tracks total blink count.
Head Pose
Compares nose tip position to face center to estimate looking direction: left, right, up, down, or center.
[Diagram: Face tracking detection zones. Nose tip: head-pose reference. Eyes: blink detection via Eye Aspect Ratio (EAR). Mouth: open/close from lip distance relative to face height (mouth-open / mouth-close). Iris center: gaze tracking. Jawline: face silhouette contour. Face presence: detected/lost events (face-detected / face-lost). Head-pose arrows: up / down / left / right.]
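The blink and mouth detectors described above reduce to simple ratio checks; the landmark choices and the 0.2 threshold here are illustrative assumptions, not the app's exact values.

```javascript
const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y);

// Eye Aspect Ratio over six eye-contour points [p1..p6]: p1/p4 are the
// horizontal corners, p2/p3 the upper lid, p6/p5 the lower lid. The
// ratio collapses toward 0 as the eye closes.
function eyeAspectRatio([p1, p2, p3, p4, p5, p6]) {
  return (dist(p2, p6) + dist(p3, p5)) / (2 * dist(p1, p4));
}

// Mouth-open measure: upper-to-lower lip distance normalized by face height.
function mouthOpenRatio(upperLip, lowerLip, faceHeight) {
  return dist(upperLip, lowerLip) / faceHeight;
}

// A blink fires when EAR dips below a threshold (eyes closed) on both
// eyes and then recovers (eyes reopen). 0.2 is a typical cutoff.
const EYES_CLOSED = (ear) => ear < 0.2;
```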

Face Overlay

The camera view draws face landmarks in cyan/aqua:

Expression Triggers

Use face events as triggers in the Action System. Example use cases:

Performance: Hand and face tracking run simultaneously on the GPU. On modern hardware, both maintain 30+ FPS. The camera inspector shows live FPS stats.

🎯 Calibration

Calibration maps camera coordinates to projector coordinates so hand touches land on the right widgets.

4-Point Calibration Flow

  1. Open the projection window on your projector
  2. Point your camera at the projected surface
  3. Go to Camera > Start Calibration (or say "calibrate")
  4. The projection shows dot 1 (top-left) with a pulsing ring animation
  5. In the camera view, click where you see dot 1
  6. Dot 1 turns green (confirmed), dot 2 appears
  7. Repeat for all 4 corners: TL → TR → BR → BL
  8. System computes homography and reports reprojection error
  9. The status bar shows a persistent accuracy badge (e.g. "Cal: 2.1px")
[Diagram: 4-point calibration and homography transform. The projection view shows each dot (1 TL, 2 TR, 3 BR, 4 BL) with a pulsing ring animation; you click where the dot appears in the camera view; the confirmed dot turns green and the next appears; after all four, a 3x3 homography H is computed via DLT and the reprojection error is reported in pixels.]
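The reported accuracy figure can be computed as the mean reprojection error over the clicked points; this sketch assumes a row-major 3x3 homography array and illustrative helper names.

```javascript
// Apply a 3x3 homography (row-major, 9 elements) to a 2D point.
function applyH(H, [x, y]) {
  const w = H[6] * x + H[7] * y + H[8];
  return [(H[0] * x + H[1] * y + H[2]) / w, (H[3] * x + H[4] * y + H[5]) / w];
}

// Mean pixel distance between the mapped camera clicks and the known
// projector dot positions, i.e. the value behind a "Cal: 2.1px" badge.
function reprojectionError(H, cameraPts, projectorPts) {
  let sum = 0;
  for (let i = 0; i < cameraPts.length; i++) {
    const [px, py] = applyH(H, cameraPts[i]);
    sum += Math.hypot(px - projectorPts[i][0], py - projectorPts[i][1]);
  }
  return sum / cameraPts.length;
}
```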

Calibration Quality

Tips for accuracy: Keep the camera steady and perpendicular to the surface. Ensure good lighting so the projected dots are clearly visible in the camera feed. Click the exact center of each dot.

Persistence

Calibration data is saved to localStorage and restored on reload. Use Camera > Reset Calibration to clear it.

👆 Touch Interaction

After calibration, your hand becomes a touch controller for the projected surface.

Interaction Model

Widget Interactions

Button / Big Button: press visual feedback plus the press trigger fires
Slider: drag horizontally to change the value
Knob: drag up/down to change the value
Toggle: tap to switch on/off
Soundboard: tap individual sound buttons
Mic: tap to start/stop the recording indicator

Touch Cursor

The projection window shows a circular cursor that follows the tracked fingertip. It shrinks and fills when pressing, giving clear visual feedback.

🎤 Voice Commands

Press F5 to start voice recognition. Whisper STT processes 3-second audio chunks locally in your browser.

Built-in Commands

add [type] ("add button at 200 300 labeled Play"): add a widget at a position with an optional label
add quad ("add quad named Display"): create a new quad
select [name] ("select play button"): select a widget or quad by name
move [x] [y] ("move to 500 400"): move the selected widget
resize [w] [h] ("resize 300 200"): resize the selected widget
set color [color] ("set color red"): change widget color (named colors or hex)
delete ("delete"): delete the selected widget or quad
duplicate ("duplicate"): clone the selected widget
layout [mode] ("layout grid"): auto-arrange quads (grid/row/column)
save ("save"): save the project
calibrate ("calibrate"): start the calibration flow
start/stop camera ("start camera"): toggle the webcam
ask [question] ("ask suggest a layout"): query the Ollama LLM
Priority: Widget voice keywords are checked first, then built-in commands. This lets you override or extend the command vocabulary per project.
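A parser for a few of the built-in commands might look like this; the grammar here is inferred from the examples above, and the real parser in Surface Studio may differ.

```javascript
// Hypothetical voice-command parser covering three of the built-in
// commands. Returns a structured command object, or null if the text
// is not recognized (so widget keywords or the LLM can handle it).
function parseCommand(text) {
  let m;
  if ((m = text.match(/^add (\w+) at (\d+) (\d+)(?: labeled (.+))?$/i))) {
    return { cmd: 'add', type: m[1], x: +m[2], y: +m[3], label: m[4] || null };
  }
  if ((m = text.match(/^move to (\d+) (\d+)$/i))) {
    return { cmd: 'move', x: +m[1], y: +m[2] };
  }
  if ((m = text.match(/^set color (\S+)$/i))) {
    return { cmd: 'set-color', color: m[1] };
  }
  return null; // not a built-in command
}
```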

Whisper STT

Speech recognition uses Xenova/whisper-tiny.en running entirely in your browser via @huggingface/transformers.

The status bar shows download progress during first load.
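Chunked transcription can be sketched as follows. The commented pipeline call mirrors the @huggingface/transformers API; the chunking helper and variable names are assumptions.

```javascript
// Slice a mono Float32Array into fixed-length chunks for STT; the app
// processes 3-second chunks, so that is the default here.
function chunkAudio(samples, sampleRate, seconds = 3) {
  const size = sampleRate * seconds;
  const chunks = [];
  for (let i = 0; i < samples.length; i += size) {
    chunks.push(samples.subarray(i, Math.min(i + size, samples.length)));
  }
  return chunks;
}

// Hypothetical in-browser usage with @huggingface/transformers:
// import { pipeline } from '@huggingface/transformers';
// const stt = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en');
// for (const chunk of chunkAudio(micSamples16k, 16000)) {
//   const { text } = await stt(chunk);
//   handleVoiceCommand(text);
// }
```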

Ollama LLM Integration

Connect to a local Ollama instance for AI-powered assistance.

Setup

  1. Install Ollama and pull a model: ollama pull llama3
  2. Make sure Ollama is running on localhost:11434
  3. Surface Studio auto-detects available models (prefers llama3, then mistral)
  4. Status bar shows connection status and model name

Usage

Responses stream into the command log in real-time.
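Ollama streams newline-delimited JSON from its /api/generate endpoint; a minimal accumulator for those lines might look like this, with the fetch/reader plumbing sketched in comments.

```javascript
// Ollama's /api/generate streams NDJSON objects of the shape
// {"response":"...","done":false}. This helper accumulates the streamed
// text from raw lines.
function collectOllamaStream(ndjsonLines) {
  let text = '';
  for (const line of ndjsonLines) {
    if (!line.trim()) continue;
    const obj = JSON.parse(line);
    if (obj.response) text += obj.response;
    if (obj.done) break; // final object carries timing stats, no text
  }
  return text;
}

// Hypothetical fetch usage:
// const res = await fetch('http://localhost:11434/api/generate', {
//   method: 'POST',
//   body: JSON.stringify({ model: 'llama3', prompt, stream: true }),
// });
// ...read res.body with a TextDecoder, split on '\n', and feed the
// lines to collectOllamaStream (or append each piece to the command log).
```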

Figurate API Key

Surface Studio includes a built-in place to store your Figurate key for voice and image-description/analyze integrations.

  1. Open Help > Add Figurate Key...
  2. Paste your key in the prompt. Figurate keys start with fg_
  3. Click OK to save, or clear the field to remove the key
Storage: The key is saved locally in this browser under surface-studio-figurate-key. It is not written into exported project JSON.

Use this when you want Surface Studio or related integrations to call Figurate-backed voice, narration, image description, or analyze endpoints without re-entering the key each session.

Voice Keywords

Any trigger can have a voiceKeyword — a phrase that activates the trigger when spoken.

Example

  1. Add a Button widget labeled "Start Show"
  2. Go to Actions tab, add a press trigger
  3. Set voice keyword to start show
  4. Add actions under that trigger (e.g., set readout value, play sound)
  5. Now saying "start show" fires all those actions

Keyboard Shortcuts

1: Edit mode
2: Preview mode
3: Camera mode
Ctrl+N: New project
Ctrl+O: Open / Import
Ctrl+S: Save
Ctrl+Z: Undo
Ctrl+Shift+Z: Redo
Ctrl+D: Duplicate widget
Ctrl+G: Toggle grid
Ctrl+L: LLM prompt
Ctrl+Shift+P: Screenshot
Del / Backspace: Delete selected
F5: Toggle voice listening
H: Projection HUD: show / hide controls
M: Projection HUD: toggle minimal view
C: Projection HUD: toggle cursor
G: Projection HUD: toggle test grid
F: Projection HUD: toggle fullscreen
Escape: Deselect all

💾 Save & Export

📡 BroadcastChannel API

The editor and projection windows communicate via BroadcastChannel('surface-studio').

Message Types

project-update (Editor → Projection): full project state plus the test-grid flag
widget-update (both directions): single widget property update
widget-activated (Projection → Editor): widget interaction event
touch (Editor → Projection): hand-tracking cursor position
touch-start (Editor → Projection): finger press detected
touch-end (Editor → Projection): finger released
touch-visibility (Editor → Projection): shows or hides the projected touch cursor when tracking changes
Custom integration: Use the send-broadcast action type to send messages on any channel. External web apps can listen on the same channel to react to Surface Studio events.

Troubleshooting

Camera not starting

MediaPipe models not loading

Whisper not transcribing

Ollama not connecting

Projection window not updating

Iframe not loading

Calibration inaccurate

Surface Studio is part of the Interactive Spaces toolkit. Built with vanilla HTML, CSS, and JavaScript. No build tools required.