The slide deck I used at today's annual Web standards conference ran to a hundred pages, essentially a full report. It was made with W3C's official slide tool, b6plus.js, as a single HTML file, rich in Web standards and accessibility features. If you missed the Zoom meeting, you can view it at: tpac2025.webspatial.dev
Posts by Dexter Yang
That falls under accessibility: provide fallback alternatives, like head movement in place of eye gaze, or ray-based hand tracking. Another option is going back to classic XR controllers (like AVP + PS VR2 controller)
According to the visionOS interaction guidelines, far-field eye + hand input counts as indirect interaction, and near-field touch input as direct interaction; both are "natural interactions." The OS handles them and maps both to the same spatial gesture events, so developers can write one set of code to support both.
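On the Web side, the same principle is what makes one-set-of-code possible: if the browser and OS map both gaze-plus-pinch (indirect) and near touch (direct) to the same standard DOM events, as described above, then ordinary event code serves both. A minimal sketch under that assumption:

```html
<!-- Minimal sketch: assumes the OS/browser maps both indirect (gaze + pinch)
     and direct (near touch) input to the same standard DOM events, as the
     post describes. One handler then covers both interaction styles. -->
<button id="buy">Buy</button>
<script>
  document.getElementById("buy").addEventListener("click", () => {
    // Fires whether the user pinched while looking at the button
    // (indirect) or reached out and tapped it directly (direct).
    console.log("activated");
  });
</script>
```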
3. Multiple videos or livestreams can be watched and interacted with at the same time, including cross-room interactions.
I designed a TikTok for Vision Pro using Sora.
Features:
1. Each "video" goes beyond 2D screens and frames; it can be made of volumetric entities in real space.
2. You can interact across spaces with livestreams, like tossing gifts into them as if they were real objects.
Me with AI glasses in the Cyberpunk 2077 world (I made three versions with Sora 2 spliced together: Corpo, regular, and a non-2077 TV-style one).
P.S. The third (MR goggles) is actually the most realistic, practical, and technically feasible type of multimodal AI wearable.
I submitted a breakout session proposal for W3C's annual global tech conference (TPAC 2025) this November. The topic is "WebSpatial API for Spatialized HTML/CSS and Spatialized PWAs on spatial and multimodal AI devices"!
github.com/w3c/tpac2025...
What OS architecture and development paradigm shifts follow?
Which new Web APIs are essential for the mainstream Web to get a boarding pass to such devices?
Why are client-side AI agents totally different from server-side AI agents yet just as important?
Why will client-side AI agents drive hands-free wearable multimodal devices to become mainstream platforms?
What new app capabilities are required on these devices?
we need new Web APIs like WebSpatial to bring the new XR development paradigm to the Web. Only then can web-based solutions have a real shot at replacing native apps in many scenarios, especially in AI-related use cases and those that blend with mobile real-world environments.
It also breaks away from mainstream web development, missing out on another big Web advantage (the significantly larger pool of web developers compared to 3D developers).
Just like native spatial apps on visionOS don't prioritize using 3D engines but instead adopt 2D frameworks like SwiftUI,
This kind of heavy, all-in immersive experience not only fails to leverage the Web's strengths (like URL-based on-demand access) but also repeats the limitations of native XR apps (where each app creates its own isolated space instead of blending into the existing space).
WebXR sessions mimic traditional native XR apps: they take over the whole spatial environment and both eyes' views. That means they cannot coexist within the same space alongside other 2D/3D apps (2D overlays don't count), and they cannot even coexist with their own webpage content.
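For contrast, this is the standard WebXR entry point being described: once an immersive session starts, the page owns the whole headset view.

```html
<script>
  // Standard WebXR (a real Web API): entering an immersive session hands
  // this one page the entire spatial environment and both eyes' views.
  // Nothing else - other apps, or even this page's own 2D content -
  // shares that space until the session ends.
  async function enterImmersive() {
    const session = await navigator.xr.requestSession("immersive-vr");
    // From here on, rendering goes through WebGL layers bound to `session`;
    // ending the session is the only way back to the shared 2D context.
    return session;
  }
</script>
```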
Then step into the shadows. Join us. We're recruiting in San Jose: jobs.bytedance.com/en/mobile/po...
I was the only one at the table drinking Diet Coke. To my right: SDK/Runtime gurus and a Silicon Valley leader. To my left: browser engine experts and QA veterans. I've decided to call our gang the Spatial Syndicate.
You craving spatial power on the Web too?
it's still a 3D engine that uses WebGL to render everything - 2D and 3D - on the canvas or dual-eye screens. It doesn't support visionOS's Shared Space. Its development mindset is based on 3D graphics APIs - unfamiliar to most web developers who think in terms of 2D GUIs. See the doc for details.
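The split described here is visible in the code itself. A hedged side-by-side sketch (standard R3F and React syntax, trimmed to the essentials):

```jsx
// R3F components describe a three.js scene graph rendered into a WebGL
// canvas - they are not DOM elements, despite the React-like syntax:
import { Canvas } from "@react-three/fiber";

function Scene() {
  return (
    <Canvas>
      {/* everything inside <Canvas> renders via WebGL, not HTML/CSS */}
      <mesh>
        <boxGeometry args={[1, 1, 1]} />
        <meshStandardMaterial color="orange" />
      </mesh>
    </Canvas>
  );
}

// ...whereas mainstream React thinks in 2D GUI terms, with HTML/CSS:
function Card() {
  return <div className="card">Hello</div>;
}
```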
I like R3F. But on one hand, R3F is like React Native - uses React-like syntax but isn't real React, so it's disconnected from the regular HTML/CSS-based React code used on standard websites. It needs its own ecosystem and can't integrate with the mainstream React world.
On the other hand,
How can Web Apps have this kind of capability too, and achieve the UI demonstrated in my first three screenshots: bsky.app/profile/dext...
where the TikTok camera interface becomes the default Home screen. App distribution wouldn't just rely on icon grids, but also come from the context of the live environment (as part of multimodal input). Unless glasses or lightweight headsets completely replace phones before that happens.
The physical screen's background is inevitably trending toward dynamic content. XR headsets also use solid, opaque screens fed by cameras, so many of their functions are also possible on smartphones.
Its final form might look like a "TikTok OS,"
After the iPhone GUI switched to Liquid Glass, the Home/Launcher screen background could be a live camera feed instead of a static wallpaper, making the phone feel almost transparent (though sadly, they didn't roll out that feature this time).
So the Glass Material / Liquid Glass not only affects where GUI software can be used and what it can do, but also impacts the whole system architecture and how apps are developed.
Android XR can't really do this. It only performs alpha blending. The same limitation keeps its Home Space from mixing 2D and 3D apps the way visionOS Shared Space does (bsky.app/profile/dext...).
- pre-raster layout info in browser engines, ECS data in 3D engines, node graph data for building shaders - so the OS can render all apps and environments together (bsky.app/profile/dext...).
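A purely illustrative TypeScript sketch of what such higher-level GUI data (like the pre-raster layout info mentioned above) might contain. All names here are invented for illustration, not a real API:

```typescript
// Illustrative only: the kind of pre-raster scene description an app might
// hand to the OS compositor instead of a finished GPU frame, so the OS can
// re-render the UI each frame against whatever environment sits behind it.
interface SpatialNode {
  kind: "panel" | "text" | "model";                 // what to draw
  layout: { x: number; y: number; z: number; w: number; h: number };
  material?: "glass" | "opaque";                    // resolved per frame by the OS
  children?: SpatialNode[];
}

const tree: SpatialNode = {
  kind: "panel",
  layout: { x: 0, y: 1.2, z: -0.5, w: 0.8, h: 0.5 },
  material: "glass",
  children: [
    { kind: "text", layout: { x: 0, y: 0, z: 0, w: 0.7, h: 0.1 } },
  ],
};

// The OS, not the app, walks this tree every frame and rasterizes it
// against the live environment behind it.
console.log(tree.children?.length);
```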
Every GUI pixel must be rendered in real time based on whatever is sitting behind it, following both environmental changes and the user's movements (like eye movements).
Consequently, an app can no longer rasterize itself and pass a finished GPU frame to the OS. It must expose higher-level GUI data -
In spatial computing, a GUI's background is the actual physical space around you, whose brightness, color, and style never stop changing. Fixed solid background colors, or even the usual light and dark modes (themselves only a limited fix for adapting the GUI to different environments), are inadequate.
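As a concrete contrast: CSS today offers only fixed values or a two-state light/dark switch, while a spatial material has to be resolved per frame by the system. The `light-dark()` function below is standard CSS; the `--xr-background-material` property follows my reading of the WebSpatial docs and should be treated as an assumption here.

```css
/* Today's limited fixes: one fixed color, or at best two states. */
.panel {
  background: light-dark(#ffffff, #1e1e1e);
}

/* A spatial GUI instead declares a material and lets the system re-render
   it every frame against whatever real environment is behind it
   (property name per the WebSpatial docs; treat as an assumption): */
.panel-spatial {
  --xr-background-material: translucent;
}
```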
Web UI in WebSpatial quick example
Spatial UI example in WebSpatial's Techshop demo
Web UI in WebSpatial quick example
Glass material
I have spent over a year working with the "Glass Material" UI, so let me stick up for Liquid Glass: do not dismiss it. It's not just about looks; it's a big, necessary, fundamental shift as GUI software leaves flat screens and adapts to spatial computing, so it can work with multimodal AI and AR/MR.
Don't miss this talk at AWE on June 11 - we'll showcase some WebSpatial app demos.
awexr.com/usa-2025/age...
WebSpatial is an open-source React SDK that lets you turn regular HTML/CSS-based websites into spatial apps on Vision Pro.
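A hedged sketch of the idea (the `enable-xr` attribute and the CSS custom property follow my reading of the WebSpatial docs; treat the exact names as assumptions): ordinary React with HTML/CSS, with spatial behavior layered on top rather than a 3D-engine rewrite.

```jsx
function ProductPanel() {
  return (
    // "enable-xr" opts this element into spatial rendering; the CSS
    // custom property asks the system for a glass-like material
    // (both names per the WebSpatial docs; assumptions here).
    <div enable-xr style={{ "--xr-background-material": "translucent" }}>
      <h1>Techshop</h1>
      <p>Ordinary HTML/CSS content, now a spatial panel.</p>
    </div>
  );
}
```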
github.com/webspatial/w...
What if the web could be spatialized too?