Constructed using Nano Banana Pro, this dataset contains 28,000 2K-resolution images tracking the gradual destruction of image content across 100 consecutive edits.
Witness the collapse firsthand.
🔗https://huggingface.co/datasets/kenantang/Banana100
#AI #ModelCollapse #NanoBananaPro #Google
🧵2/2
Posts by Kenan Tang
AI agents can lead to an irreversible de-evolution of human knowledge📉
As shown in the video, agentic models drive a cycle of decay: when they edit images iteratively, they introduce invisible noise that accumulates until image quality collapses.
To quantify this decay, we built Banana100.
🧵1/2
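A toy illustration of the decay mechanism (a hypothetical sketch, not the Banana100 protocol): if each edit re-encodes the image with a small amount of independent noise, the error compounds over 100 iterations.

```python
import numpy as np

rng = np.random.default_rng(0)
image = np.full((64, 64), 0.5)  # flat gray stand-in for a real image
original = image.copy()

errors = []
for step in range(100):
    # each "edit" re-encodes the image, adding a little independent noise
    image = np.clip(image + rng.normal(0, 0.02, image.shape), 0.0, 1.0)
    errors.append(float(np.abs(image - original).mean()))

# noise accumulates: later edits drift much further from the original
print(f"error after 1 edit:    {errors[0]:.3f}")
print(f"error after 100 edits: {errors[-1]:.3f}")
```

Each step's noise is invisible on its own, but the drift behaves like a random walk, so the cumulative error keeps growing.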
We are recruiting postdocs @ai-ucsb.bsky.social !
With @haewonjeong.bsky.social Yao Qin
Want to lead the future of AI4Science?
Apply to UCSB Real AI For Science Initiative 🌟
Deadline: Sept 15, 2025.
This is the view you'll have from... your desk!
By @adelemyers.bsky.social
The @cvprconference.bsky.social AI Art Online Gallery 2025 is now live 🥳
Featuring 100+ artworks across aesthetics, environment and identity.
Check it out👇
thecvf-art.com
#CVPR2025 #CVPRAIart #creativeAI
🎨 Excited to share that my work is featured in the CVPR AI Art Gallery 2025!
Come and see how AI image generation can be controlled with surgical precision.
Link: thecvf-art.com/project/comp...
Thanks to @elluba.bsky.social and @cvprconference.bsky.social for hosting the event!
#CVPR2025
Flux.1 Kontext [pro] failing on an image editing task. The task is to add a backpack onto this bench. 100% failure rate. None of the prompt-based models, including GPT-4o and Gemini 2.5 Pro, have succeeded on this task.
📍 Join us at ICLR tomorrow (April 24) at 10 am, Hall 3 + Hall 2B #19, for our poster on NutriBench: the first publicly available natural language meal description benchmark for nutrition estimation!
Here's our webpage: mehak126.github.io/nutribench.h...
@dongx1997.bsky.social
#ICLR #AI4health
SPICE enables complex edits like gesture adjustment, action modification, and object addition with occlusion. SPICE is compatible with major diffusion model UIs (Automatic1111/ComfyUI) and supports popular models like Flux Dev, SDXL, SD1.5, and their variants.
Our team has officially open-sourced the SPICE image editing workflow.
Paper: arxiv.org/abs/2504.09697
Code: github.com/kenantang/sp...
📢 Open Source & Collaboration:
- Full GitHub release this April.
- Early testing available now.
Perfect for smart creative platforms and developers building advanced image processing tools. DM for early test access! 🚀
#AI #ImageEditing #OpenSource #SPICE
Example scenario: Adding a backpack onto a bench. Traditional methods face spatial errors and distortion. SPICE uses a two-stage denoising method to ensure exact object placement every time.
4/ Resolution limits? Not here. SPICE natively handles any resolution—4K, vertical screens, ultra-wide—without cropping or compression. Total creative freedom.
3/ SPICE excels at spatial reasoning. With minimal user prompts, it accurately constructs complex 3D spatial relationships. Precise editing, simplified.
2/ Ever struggled with multi-step image distortion? SPICE enables ultra-long editing—100+ iterations without degradation. Say goodbye to cumulative distortion issues!
1/ SPICE supports diverse artistic styles seamlessly—photorealistic, cartoon, or any LoRA-compatible art style. True cross-style adaptability, no compromises.
🧵 Our team just introduced SPICE, a novel image editing framework that significantly outperforms GPT-4o & Gemini 2.0 in single-step editing tests.
Here’s why SPICE matters in AI-driven creative workflows: 👇
Could you further specify your question? The model needs the user to tell it what to edit in the image.
The image editing workflow we propose succeeds in editing a challenging image from Emu Edit. The prompt is “open the refrigerator door in the image”.
PD and Abomination
#AiArt #DarkestDungeon
The results in the 4 images are generated with Flux Dev. Since the pipeline is training-free, any base model can be used (SDXL, etc.). There is also no need for a dedicated inpainting checkpoint, which matters because such checkpoints are not available for many base models.
Localized and precise image editing results from a pipeline we developed. Users only need to provide crude sketches and masks. No hyperparameter tuning or prompt engineering needed. All results are first attempts.
Baseline: arxiv.org/abs/2402.17525
We have been developing an open-source tool that does exactly this. Please refer to my previous posts for examples. The tool will be finalized and released soon.
Six months ago someone put a for-loop around GPT-4o and got 50% on the ARC-AGI test set and 72% on a held-out training set redwoodresearch.substack.com/p/getting-50... Just sample 8000 times with beam search.
o3 is probably a more principled search technique...
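The "for-loop around GPT-4o" idea can be sketched as repeated sampling plus a selection step. A hypothetical toy version below: `propose_program` is a stub standing in for a stochastic model call (the real pipeline sampled candidate Python programs and kept those that reproduced the task's given input/output examples).

```python
import random

def propose_program(rng):
    # stub for a stochastous model call; a small fraction of samples succeed
    return "correct" if rng.random() < 0.01 else "incorrect"

def passes_examples(program):
    # stub verifier: the real pipeline ran each candidate program on the
    # task's given input/output examples and checked the results
    return program == "correct"

def solve_by_sampling(n_samples=8000, seed=0):
    rng = random.Random(seed)
    for i in range(n_samples):
        candidate = propose_program(rng)
        if passes_examples(candidate):
            return candidate, i + 1  # solution and samples consumed
    return None, n_samples

solution, used = solve_by_sampling()
print(solution, used)
```

Even with a tiny per-sample success rate, thousands of draws plus a cheap verifier make a hit very likely, which is the essence of the approach.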
Starter pack for #artists in #STEM.
#Artists creating #scientific #visuals (#illustration, #installation, etc.), and #scientists creating artworks (not necessarily #sciart) are welcome.
Please spread the word and let me know if you would like to be added.
#science #art
go.bsky.app/NzXHtrF
In fact, @sayash.bsky.social and I have just published an essay with them, where we play our usual role of looking at the evidence and tamping down AI hype and fears instead of playing them up.
knightcolumbia.org/blog/we-look...
(Cross-posted to AI Snake Oil aisnakeoil.com/p/we-looked-...)
🚨 Only 1 day to go! 🚨
Join us at AIM-FM: Advancements In Medical Foundation Models workshop at NeurIPS 2024!
📅 When: December 14th, 2024, 8:20 a.m. PST
📍 Where: East Ballroom A, B
🔗 Details: aim-fm-24.github.io/NeurIPS/
#AIM-FM #NeurIPS2024 #MedicalAI #FoundationModels
"Sora is a data-driven physics engine."
x.com/chrisoffner3...
I just updated the translation span annotations from our EMNLP Findings paper. Llama-3.3-70B-Instruct is a free and powerful alternative to gpt-4-0125-preview on this task.
Paper: arxiv.org/abs/2410.00988
Demo: kenantang.github.io/cjk-idioms-gpt/
#AI #LLM #NLP #Translation