
Posts by Bilawal Sidhu

Post image

Eric Schmidt got a standing ovation from the TED audience this morning.

An absolute pleasure to interview him on the red circle.

We dove into the big questions—superintelligence, national strategy, open source, and what it means to be human in the age of AI.

One for the books.

1 year ago 4 0 0 0
Post image

TikTok ban imminent, yet funny how things change.

>2020: Stressed about TikTok drama at 120K subs.

>2024: Sitting at 994K and completely unfazed.

Ban it? Cool, I’ll build elsewhere. Keep it? Roger that, I’ll double down.

The game is bigger than any one app. Who cares about vanity metrics.

1 year ago 6 1 1 0
Post image

Merry Christmas y’all! 🎄

Pictured: 3D scan vs. ground truth of the feast to follow

1 year ago 8 0 0 0
Post image

Omnidirectional 3D video of reality — damn near teleportation in a VR headset.

This $17,000 VR camera released in 2017 was ahead of its time. 17 cameras → cloud stitching → 8K × 8K stereo VR video.

The moment is ripe for a new 4D capture rig optimized for dynamic 3D Gaussians. Anyone building one?
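For a sense of why cloud stitching was needed, the raw numbers are brutal. A back-of-the-envelope sketch (assuming one 8192 × 8192 frame per eye, 24-bit RGB, 30 fps — the rig's actual internal formats aren't public):

```python
# Rough data-rate estimate for uncompressed 8K x 8K stereo VR video.
WIDTH = HEIGHT = 8192          # pixels per eye (assumed square 8K x 8K frame)
EYES = 2                       # stereo pair
BYTES_PER_PIXEL = 3            # 24-bit RGB, no chroma subsampling
FPS = 30                       # assumed frame rate

bytes_per_frame = WIDTH * HEIGHT * EYES * BYTES_PER_PIXEL
gbytes_per_second = bytes_per_frame * FPS / 1e9
print(f"{gbytes_per_second:.1f} GB/s uncompressed")  # ~12.1 GB/s
```

Roughly 12 GB/s before compression — no 2017-era camera body was streaming that to disk, hence the offload to cloud stitching.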

1 year ago 14 0 1 0
Preview
Seeing art in a new way: VR tools let characters jump right in We all know you’re not supposed to touch the pieces of art in a museum, but what if you could jump inside them? In YouTube creator SoKrispy’s latest VR video, Do Not Tou…

JUMP VR tools (2017): blog.google/products/goo...

1 year ago 2 0 0 0
Preview
Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos Use stereo videos from the internet to create a dataset of over 100,000 real-world 4D scenes with metric scale and long-term 3D motion trajectories.

It's one of those through lines you see when tackling a timeless mission like mapping the world or spatial computing: VR content created for immersion becomes the foundation for teaching machines to understand how the world moves. Sometimes innovation chains together in unexpected ways! stereo4d.github.io
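The reason stereo footage yields metric scale is plain triangulation: depth follows from disparity once you know focal length and camera baseline. A minimal sketch of that relationship (not the paper's actual pipeline, which also handles fisheye rectification and long-term tracking):

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Metric depth from stereo disparity: Z = f * B / d.

    disparity_px: horizontal disparity per pixel (pixels)
    focal_px:     focal length in pixels
    baseline_m:   distance between the two cameras (meters)
    """
    disparity_px = np.asarray(disparity_px, dtype=float)
    with np.errstate(divide="ignore"):
        return focal_px * baseline_m / disparity_px

# A 64 mm baseline (roughly human IPD, like VR180 rigs) and a 1000 px focal:
z = depth_from_disparity([32.0, 16.0, 8.0], focal_px=1000.0, baseline_m=0.064)
print(z)  # [2. 4. 8.] meters
```

Because VR180 cameras were built with a human-eye baseline for immersion, that same baseline gives the dataset real-world scale for free.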

1 year ago 2 1 1 0
Post image

And since we're dealing with real stereoscopic content, the results are notably better than with synthetic data, giving you a faithful rendition of the real world across a diverse set of subject matter.

1 year ago 1 0 1 0
Post image

They're using it to train a model called DynaDUSt3R that predicts both 3D structure and motion from video frames, meaning it tracks how objects move between frames while simultaneously reconstructing their 3D shape.
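In pointmap terms (the DUSt3R family predicts a 3D point for every pixel), per-pixel motion falls out by differencing corresponding points across time. A toy sketch of the idea, not the model itself:

```python
import numpy as np

def scene_flow(points_t0, points_t1):
    """3D motion vectors between two per-pixel pointmaps of shape (H, W, 3).

    Assumes the pointmaps are pixel-aligned and expressed in a shared
    world frame, which is what a DynaDUSt3R-style model would predict.
    """
    return points_t1 - points_t0

# Toy example: a single "pixel" whose 3D point moved 0.1 m along x.
p0 = np.zeros((1, 1, 3))
p1 = np.array([[[0.1, 0.0, 0.0]]])
flow = scene_flow(p0, p1)
print(flow)  # [[[0.1 0.  0. ]]]
```

Chaining these per-frame motion vectors over many frames is what yields the long-term 3D trajectories the dataset advertises.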

1 year ago 2 0 1 0
Post image

It was always clear that stereo datasets would be valuable -- and we launched some cool VR tools with it back in 2017 (link below). But the game changer now in 2024 is the scale -- they're providing 110K clips :-) That's the kind of massive, real-world dataset that was just a dream in those days!

1 year ago 2 0 1 0
Video

Check out this Stereo4D paper from Google DeepMind. It's a pretty clever approach to a persistent problem in computer vision -- getting good training data for how things move in 3D. The key insight is using VR180 videos -- those stereo fisheye videos we launched back in 2017 for YouTube VR 🧵

1 year ago 13 2 1 0
Post image

The future isn't just virtual or augmented – it's ambient and intelligent

The Google XR unlocked event in NYC

1 year ago 2 0 0 0

5. Image to video (remix) feature is cool, but CLEARLY needs UI like Kling/Runway motion paint so it isn’t a chaotic mess / constant game of slot machine AI

Will be interesting to do head-to-head comparisons with US and Chinese models now that Sora is live.

1 year ago 2 0 0 0

3. Physics is still very wonky (no magic fix yet) – the rhino slides all across the ground; phones appear and disappear like a magic trick

4. Wow, is there a lot of news footage in the training data – generating grainy nighttime footage is no problem at all

1 year ago 1 0 1 0

1. Sora is VERY good at generating high-frequency detail (the video doesn't seem blurry at all) – it's the most impressive quality to me

2. As expected, Sora is great at well imaged landmarks – AI’s ability to generate custom “stock” footage remains promising

1 year ago 1 0 1 0
Video

MKBHD dropped his review of OpenAI's Sora, the much-hyped AI video model, after a week of testing.

5 immediate observations:

1 year ago 2 0 1 0
AI Just Changed 3D Forever: Genie 2, World Labs, CAT4D
YouTube video by Creative Tech Digest

The future of 3D AI took some serious leaps -- from single images to fully interactive, dynamic 3D worlds. Here's what's cooking at the cutting edge: youtu.be/T7bcYSSSC6s

1 year ago 2 0 0 0
Video

Wav2Lip can FINALLY rest in peace. Being able to retarget the facial performance of characters in *existing* live-action & CG video makes Act-One an extremely useful tool for all types of creators.

Nicely done RunwayML!

1 year ago 2 1 0 0

the entire bay area quaked hearing that chatgpt pro is gonna cost $200/month

1 year ago 4 0 0 0

Very cool! Would love to see a workflow breakdown

1 year ago 0 0 0 0
Preview
Genie 2: A large-scale foundation world model Generating unlimited diverse training environments for future general agents

The race for building the biggest, baddest world model is very much on. Meanwhile, all I can think is "if only Stadia was still around!"

Check out the various results (and some fun outtakes) below: deepmind.google/discover/blo...

1 year ago 0 0 0 0

Not quite ready for prime time, but promising on two fronts:

1. For game developers: enabling rapid prototyping of interactive experiences straight from concept art

2. For AI research: providing unlimited, diverse 3D environments for training and testing AI agents

1 year ago 2 0 1 0

Right now Genie 2 can generate consistent worlds for up to a minute. And this world model seems to generate larger 3D worlds than what World Labs showcased yesterday. Plus they're dynamic vs. static worlds – the foliage moves in the wind, the water ripples etc.

1 year ago 3 0 1 0
Video

Imagine making 2D concept art for a game world – pressing a button – and suddenly you can walk around an interactive 3D world. That's what Google DeepMind's new Genie 2 paper can do – simulate virtual worlds, including the consequences of any action (e.g. unlock a door, jump, swim, etc.).
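The interaction loop such action-conditioned world models expose is simple to state: prompt with an image, then repeatedly feed an action in and get the next frame out. A hypothetical sketch of that control flow (Genie 2 itself isn't publicly runnable, so the model here is a stub):

```python
import numpy as np

class StubWorldModel:
    """Stand-in for an action-conditioned video world model:
    next_frame = f(frame_history, action). The real thing is a large
    autoregressive video model; this stub just perturbs one pixel."""
    def step(self, frames, action):
        frame = frames[-1].copy()
        frame[0, 0] = hash(action) % 256   # pretend the action changed the world
        return frame

def rollout(model, first_frame, actions):
    """Generate a trajectory by feeding each action back into the model."""
    frames = [first_frame]
    for action in actions:                 # e.g. "jump", "open_door", "swim"
        frames.append(model.step(frames, action))
    return frames

concept_art = np.zeros((64, 64, 3), dtype=np.uint8)   # the prompt image
traj = rollout(StubWorldModel(), concept_art, ["jump", "swim"])
print(len(traj))  # 3 frames: prompt + one per action
```

The "consequences of any action" claim lives entirely inside `step`: the model, not a game engine, has to decide what unlocking a door does to the next frame.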

1 year ago 8 0 2 0

It's the same reason people browse Zillow houses or watch shows about mansions. AI or not — software reviews simply don't hit the same.

1 year ago 1 0 1 0

Observed: All mega popular tech creators focus on hardware — there's no MKBHD for software. It's literally called "Unbox Therapy" for a reason. Even if people won't buy the devices, there's something about vicariously living through that tech review experience.

1 year ago 2 0 1 0
Video

Tencent’s open weights Hunyuan Video 13B model looks impressive — oh, and image-to-video and facial performance? They’re coming too.

If 2024 was the year open-source LLMs caught up with closed-source AI — 2025 will be the year open-source video catches up.

1 year ago 6 1 1 0
Video

World Labs first demo dropped, and it’s consistent 3D worlds from a single 2D image.

Decent volume size to move around in — def a big step up from the RGB + depth 360 environments we’re used to e.g. Blockade Labs.

Stylized results look good; I'd love to see more photorealistic AI generations!
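For contrast, the older RGB + depth 360 approach is easy to sketch: each equirectangular pixel plus its depth unprojects to exactly one 3D point, which is precisely why you can't move far from the capture point before holes appear. A minimal version (assuming a full 360 × 180 equirectangular layout):

```python
import numpy as np

def equirect_depth_to_points(depth):
    """Unproject an equirectangular depth map (H, W) to 3D points (H, W, 3).

    Each pixel (u, v) maps to a longitude/latitude direction on the unit
    sphere; the depth value then scales that ray.
    """
    h, w = depth.shape
    u = (np.arange(w) + 0.5) / w          # 0..1 across the image
    v = (np.arange(h) + 0.5) / h
    lon = (u - 0.5) * 2 * np.pi           # -pi .. pi
    lat = (0.5 - v) * np.pi               # +pi/2 (up) .. -pi/2 (down)
    lon, lat = np.meshgrid(lon, lat)
    dirs = np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)  # unit rays
    return depth[..., None] * dirs

pts = equirect_depth_to_points(np.full((90, 180), 2.0))  # everything 2 m away
print(pts.shape)  # (90, 180, 3)
```

One point per pixel means a single shell of geometry around the viewer — a true 3D world representation has volume behind and around objects, which is the step up here.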

1 year ago 11 2 1 1

Google 2.5D temporal data, very nice.

1 year ago 6 1 0 0
Video

Augmented reality x-ray vision to “see through” concrete.

Your infrastructure won’t just be scanned — it’ll be anchored to reality.

Demo: Pix4D reality capture with precise geospatial localization.
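The "precise geospatial localization" step boils down to standard geodesy: convert WGS84 lat/lon/height to ECEF, then rotate offsets into a local east-north-up frame at the AR anchor. A sketch of that math (not Pix4D's implementation; coordinates are illustrative):

```python
import numpy as np

A = 6378137.0                 # WGS84 semi-major axis (m)
E2 = 6.69437999014e-3         # WGS84 first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, h):
    """WGS84 geodetic coordinates -> Earth-centered Earth-fixed (meters)."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    n = A / np.sqrt(1 - E2 * np.sin(lat) ** 2)   # prime vertical radius
    x = (n + h) * np.cos(lat) * np.cos(lon)
    y = (n + h) * np.cos(lat) * np.sin(lon)
    z = (n * (1 - E2) + h) * np.sin(lat)
    return np.array([x, y, z])

def ecef_to_enu(p, ref_lat_deg, ref_lon_deg, ref_h):
    """Offset from a reference point, expressed as east/north/up (meters)."""
    lat, lon = np.radians(ref_lat_deg), np.radians(ref_lon_deg)
    d = p - geodetic_to_ecef(ref_lat_deg, ref_lon_deg, ref_h)
    r = np.array([
        [-np.sin(lon),                np.cos(lon),               0.0],
        [-np.sin(lat) * np.cos(lon), -np.sin(lat) * np.sin(lon), np.cos(lat)],
        [ np.cos(lat) * np.cos(lon),  np.cos(lat) * np.sin(lon), np.sin(lat)],
    ])
    return r @ d

# A pipe 3 m below the anchor shows up at up = -3 in local AR coordinates:
pipe = geodetic_to_ecef(40.0, -74.0, -3.0)
enu = ecef_to_enu(pipe, 40.0, -74.0, 0.0)
print(enu)  # ~[0, 0, -3]
```

Once scanned infrastructure lives in ENU meters around the anchor, rendering it "through" the concrete is just a camera pose away.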

1 year ago 4 1 0 0

Video compression is pretty bad compared to X and Threads too

1 year ago 3 0 0 0