Posts by Kenan Tang

Constructed using Nano Banana Pro, this dataset contains 28,000 2K-resolution images tracking the gradual destruction of image content across 100 consecutive edits.

Witness the collapse firsthand.

🔗https://huggingface.co/datasets/kenantang/Banana100

#AI #ModelCollapse #NanoBananaPro #Google

🧵2/2

2 months ago 1 0 0 0
Video

AI agents can lead to an irreversible de-evolution of human knowledge📉

As shown in the video, agentic models drive a cycle of decay: when they edit images iteratively, they introduce invisible noise that accumulates until quality collapses.

To quantify this decay, we built Banana100.

🧵1/2

2 months ago 2 0 2 0
Post image

We are recruiting postdocs @ai-ucsb.bsky.social !
With @haewonjeong.bsky.social and Yao Qin

You want to lead the future of AI4Science?

Apply to UCSB Real AI For Science Initiative 🌟
Deadline: Sept 15, 2025.

This is the view you'll have from... your desk!
By @adelemyers.bsky.social

8 months ago 7 4 1 2
Post image

The @cvprconference.bsky.social AI Art Online Gallery 2025 is now live 🥳

Featuring 100+ artworks across aesthetics, environment and identity.

Check it out👇

thecvf-art.com

#CVPR2025 #CVPRAIart #creativeAI

10 months ago 6 2 0 0
Post image

🎨 Excited to share that my work is featured in the CVPR AI Art Gallery 2025!
Come and see how AI image generation can be controlled with surgical precision.

Link: thecvf-art.com/project/comp...

Thanks to @elluba.bsky.social and @cvprconference.bsky.social for hosting the event!
#CVPR2025

10 months ago 6 2 0 0
Post image Post image Post image Post image

Flux.1 Kontext [pro] failing on an image editing task. The task is to add a backpack onto this bench. 100% failure rate. None of the prompt-based models, including GPT-4o and Gemini 2.5 Pro, have succeeded on this task.

10 months ago 2 0 0 0
Preview
NutriBench: the first dataset for evaluating LLMs at carbohydrate estimation
NutriBench is the first publicly available natural language meal description based nutrition benchmark.

📍 Join us at ICLR tomorrow (April 24) at 10 am, Hall 3 + Hall 2B #19, for our poster on NutriBench: the first publicly available natural language meal description benchmark for nutrition estimation!
Here's our webpage: mehak126.github.io/nutribench.h...
@dongx1997.bsky.social

#ICLR #AI4health

11 months ago 1 1 0 0

SPICE enables complex edits like gesture adjustment, action modification, and object addition with occlusion. SPICE is compatible with major diffusion model UIs (Automatic1111/ComfyUI) and supports popular models like Flux Dev, SDXL, SD1.5, and their variants.

1 year ago 2 0 0 0
Post image Post image Post image Post image

Our team has officially open-sourced the SPICE image editing workflow.

Paper: arxiv.org/abs/2504.09697
Code: github.com/kenantang/sp...

1 year ago 4 0 1 0

📢 Open Source & Collaboration:

- Full GitHub release this April.

- Early testing available now.

Perfect for smart creative platforms and developers building advanced image processing tools. DM for early test access! 🚀

#AI #ImageEditing #OpenSource #SPICE

1 year ago 1 0 0 0

Example scenario: Adding a backpack onto a bench. Traditional methods face spatial errors and distortion. SPICE uses a two-stage denoising method to ensure exact object placement every time.

1 year ago 0 0 1 0

4/ Resolution limits? Not here. SPICE natively handles any resolution—4K, vertical screens, ultra-wide—without cropping or compression. Total creative freedom.

1 year ago 0 0 1 0

3/ SPICE excels at spatial reasoning. With minimal user prompts, it accurately constructs complex 3D spatial relationships. Precise editing, simplified.

1 year ago 0 0 1 0

2/ Ever struggled with multi-step image distortion? SPICE enables ultra-long editing—100+ iterations without degradation. Say goodbye to cumulative distortion issues!

1 year ago 0 0 1 0

1/ SPICE supports diverse artistic styles seamlessly—photorealistic, cartoon, or any LoRA-compatible art style. True cross-style adaptability, no compromises.

1 year ago 0 0 1 0
Post image

🧵 Our team just introduced SPICE, a novel image editing framework that significantly outperforms GPT-4o & Gemini 2.0 in single-step editing tests.

Here’s why SPICE matters in AI-driven creative workflows: 👇

1 year ago 2 0 1 0

Could you further specify your question? The model needs the user to tell it what to edit in the image.

1 year ago 1 0 1 0
Post image Post image

The image editing workflow we propose succeeds in editing a challenging image from Emu Edit. The prompt is “open the refrigerator door in the image”.

1 year ago 1 0 1 0
Video

PD and Abomination

#AiArt #DarkestDungeon

1 year ago 1 0 0 0

The results in the 4 images are generated with Flux Dev. Since the pipeline is training-free, any model can be used (SDXL, etc.). Also, there is no need for a specific inpainting checkpoint, which is not currently available for a wide range of base models.

1 year ago 1 0 0 0
Post image Post image Post image Post image

Localized and precise image editing results from a pipeline we developed. Users only need to provide crude sketches and masks. No hyperparameter tuning or prompt engineering is needed. All results are from the first attempt.

Baseline: arxiv.org/abs/2402.17525

1 year ago 3 0 1 0

We have finished developing an open-source tool that does exactly this. Please refer to my previous posts for examples. The tool will be finalized and released soon.

1 year ago 0 0 0 0
Preview
Getting 50% (SoTA) on ARC-AGI with GPT-4o
You can just draw more samples

Six months ago someone put a for-loop around GPT-4o and got 50% on the ARC-AGI test set and 72% on a held-out training set redwoodresearch.substack.com/p/getting-50... Just sample 8000 times with beam search.

o3 is probably a more principled search technique...

1 year ago 137 23 4 6

Starter pack for #artists in #STEM.
#Artists creating #scientific #visuals (#illustration, #installation, etc.), and
#scientists creating artworks (not necessarily #sciart) are welcome.
Please spread the word and let me know if you would like to be added.
#science #art

go.bsky.app/NzXHtrF

1 year ago 26 9 10 0
Preview
We Looked at 78 Election Deepfakes. Political Misinformation Is Not an AI Problem.

In fact, @sayash.bsky.social and I have just published an essay with them, where we play our usual role of looking at the evidence and tamping down AI hype and fears instead of playing them up.
knightcolumbia.org/blog/we-look...

(Cross-posted to AI Snake Oil aisnakeoil.com/p/we-looked-...)

1 year ago 17 6 2 0
AIM-FM Workshop @ NeurIPS'24

🚨 Only 1 day to go! 🚨

Join us at AIM-FM: Advancements In Medical Foundation Models workshop at NeurIPS 2024!

📅 When: December 14th, 2024, 8:20 a.m. PST
📍 Where: East Ballroom A, B
🔗 Details: aim-fm-24.github.io/NeurIPS/

#AIM-FM #NeurIPS2024 #MedicalAI #FoundationModels

1 year ago 4 4 2 0
Preview
Who and What comprise AI Skepticism? An attempt to do justice to a diverse community

buildcognitiveresonance.substack.com/p/who-and-wh...

1 year ago 0 0 0 0
Video

"Sora is a data-driven physics engine."
x.com/chrisoffner3...

1 year ago 137 16 12 10
Post image

I just updated the translation span annotations from our EMNLP Findings paper. Llama-3.3-70B-Instruct is a free and powerful alternative to gpt-4-0125-preview on this task.

Paper: arxiv.org/abs/2410.00988

Demo: kenantang.github.io/cjk-idioms-gpt/

#AI #LLM #NLP #Translation

1 year ago 1 0 0 0
Preview
Don’t Ride This Bike! Generative AI’s persistent trouble with compositionality and parts
When the text-to-image AI generation system DALL-E2 was released in April 2022, the two of us, together with Scott Aaronson, ran some informal experiments to probe its abilities.

open.substack.com/pub/garymarc...

1 year ago 0 0 0 0