Posts by Pushakar Gaikwad

Post image

Qwen Image Edit Character Consistency
While some hosted versions show very poor character consistency, the official version from Qwen Chat performs quite well.
We will need to wait for the GGUF or another low-VRAM version to test its local performance.
#qwen #qwen_image #ai

7 months ago 1 1 0 0
Post image

Qwen Image Edit Virtual Clothing Try-On
The virtual clothing try-on feature works consistently when prompted correctly, using a stitched image of the person and the clothing as a single input.
#Qwen #AI #QwenImage

7 months ago 0 1 0 0
GitHub - pushakargaikwad/project_extras: Tweaks to make Projects App of Frappe ERPNext more friendlier

Just released a Frappe ERPNext app with some handy project management tweaks.

First feature: task dependencies can now be set across projects. Pick dependent tasks from any project, not just the current one.

#frappe #erpnext #projectmanagement
Check it out: github.com/pushakargaik...

7 months ago 0 0 0 0
Post image

About one word per second response speed on 8 GB VRAM.
20B model, after all. 😅

8 months ago 0 0 0 0

Let's go, OpenAI gpt-oss!!
#ollama #openai

8 months ago 0 0 1 0
gpt-oss - a openai Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases.

huggingface.co/collections/...

8 months ago 0 0 0 0
Post image

Holy sh!t, this is not a drill.
OpenAI releases an open-source reasoning, agentic model.

- 120B and 20B variants
- Apache 2.0 license!!!
#openai #opensource #chatgpt

8 months ago 1 1 1 1

If you want an open-source alternative to Google Genie 3, these folks are building it.
I saw WAN 2.1 used somewhere, so future versions may be optimized for consumer hardware.
#genie3 #opensource
stdstu12.github.io/YUME-Project/

8 months ago 1 1 0 0
Qwen-Image in ComfyUI: New Era of Text Generation in Images! High-Fidelity Text Rendering Meets Modular Local Workflows!

blog.comfy.org/p/qwen-image...

8 months ago 0 0 0 0
🚀🚀 Qwen Image [GGUF] available on Hugging Face

GGUF version
www.reddit.com/r/StableDiff...

8 months ago 0 0 1 0

Prompts from the technical report: qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Q...

8 months ago 0 0 1 0
Post image
8 months ago 0 0 1 0
Post image Post image

A higher step count (30) renders text better.

8 months ago 0 0 1 0
Post image
8 months ago 0 0 1 0
Post image

GGUF speeds things up and is more manageable on low VRAM.
Qwen Image in ComfyUI. Now re-creating some of the examples on 8GB VRAM.

8 months ago 0 0 1 0
Post image Post image

Qwen Image works in ComfyUI 🥳
Apache 2.0
SOTA open-source text rendering
Initial loading and offloading times on low VRAM were too high.
#Qwen #OpenSource

8 months ago 1 1 1 0
Genie 3: A New Frontier for World Models Today we are announcing Genie 3, a general purpose world model that can generate an unprecedented diversity of interactive environments. Given a text prompt, Genie 3 can generate dynamic worlds...

deepmind.google/discover/blo...

8 months ago 0 0 0 0
Video

When the person walks into the puddle: my mind melts at what it would take to build this in a video game, and this AI model does it in real time.
Almost photoreal.
Soon, no amount of money thrown at building AAA games will render as well as what can be generated, which looks as good as real life.
credit: MattMcGill_

8 months ago 0 0 1 0
Video

24 fps generated in real time. Shadows, water reflections, and a realistic furry dog with animations that respond to player input.
It's literally a "video" game.
credit: jparkerholder
"#Genie3 feels like a watershed moment for world models: we can now generate multi-minute, real-time interactive simulations"

8 months ago 0 0 2 1
Post image

btw, had to change this line to get Chatterbox TTS running locally

8 months ago 0 0 0 0

Used Chatterbox TTS to clone my voice for video voice-over.
Whisper for subtitles.
Blender VSE for video editing.

End-to-end open-source pipeline in progress. 👨‍💻
#ai #genai #chatterbox #tts #blender
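
For the Whisper-for-subtitles step, here is a minimal sketch of turning transcription segments into an SRT file that Blender's VSE or any player can load. It assumes each segment is a dict with "start" and "end" in seconds plus "text", which is the shape the openai-whisper package returns under result["segments"]. The function names are hypothetical and this is not necessarily how the pipeline in the post does it.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments shaped like {"start", "end", "text"}."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        text = seg["text"].strip()
        # Each SRT cue: running number, time range, then the subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


# Assumed usage with openai-whisper installed (not run here):
#   import whisper
#   result = whisper.load_model("base").transcribe("voiceover.wav")
#   open("voiceover.srt", "w", encoding="utf-8").write(
#       segments_to_srt(result["segments"]))
```

The file name voiceover.wav and the "base" model size are placeholders; pick whatever fits your VRAM.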

8 months ago 0 1 1 0
WAN 2.2: Free AI Video Generator on Your PC | Text‑to‑Video Just 8GB VRAM #aivideogenerator (YouTube video by Pushakar Gaikwad)

Looking for a way to generate unrestricted, unlimited, free AI video locally, with no conditions, that is realistic, smooth, high-quality, and cinematic?
WAN 2.2 is a truly open-source, Apache-licensed model that answers all of this.
#aivideo #wan
youtube.com/shorts/BwMGz...

8 months ago 0 0 0 1
Faster Iteration in Image Generation in ComfyUI Using This Technique - Beginner Tip Accelerate your workflow with faster iteration in image generation in ComfyUI. Learn this proven technique to quickly refine images, reduce wait times, and achieve high-quality results using convergin...

pushakar.com/2025/06/02/f...

10 months ago 0 0 0 0
Post image Post image

Want to iterate faster and get better results in ComfyUI?
I just shared a beginner tip for a simple technique that makes image generation quick, consistent, and fun. Check out how using one, four, and ten steps affects the detail—great for beginners and pros alike.
Details in the article 👇🏻

10 months ago 0 0 1 0
Extract Audio (MP3, WAV) from Video: Convert MP4, MKV, AVI, MOV to Audio with FFmpeg - Pushakar Gaikwad How to extract audio (MP3, WAV, OGG, M4A) from any video file (MP4, MKV, AVI, MOV, and more) using ffmpeg on Linux, Windows, or Mac. Simple commands, clear examples, and a handy zsh function to automa...

pushakar.com/2025/05/30/e...

10 months ago 0 0 0 0
Post image

Created a handy zsh function to convert any video to audio (mp3, wav, etc.) with ffmpeg in seconds! 🎬🎵 Works on Linux, Windows, and Mac.
Full guide & copy-paste code in the post 👇
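
The zsh function itself lives in the linked guide and is not reproduced here. As a rough illustration of the same idea, this Python sketch builds an ffmpeg command that drops the video stream and encodes the audio. The codec settings and the function name are assumptions chosen for the example, not taken from the guide.

```python
from pathlib import Path

# Encoder arguments per output format (assumed defaults for this sketch).
AUDIO_CODECS = {
    "mp3": ["-c:a", "libmp3lame", "-q:a", "2"],  # VBR MP3, high quality
    "wav": ["-c:a", "pcm_s16le"],                # uncompressed 16-bit PCM
}


def build_extract_command(video_path: str, fmt: str = "mp3") -> list[str]:
    """Return the ffmpeg argument list that extracts audio from a video."""
    if fmt not in AUDIO_CODECS:
        raise ValueError(f"unsupported format: {fmt}")
    source = Path(video_path)
    # Write the audio next to the source file, swapping the extension.
    target = source.with_suffix(f".{fmt}")
    # -vn tells ffmpeg to ignore the video stream.
    return ["ffmpeg", "-i", str(source), "-vn", *AUDIO_CODECS[fmt], str(target)]


# Assumed usage, with ffmpeg available on PATH (not run here):
#   import subprocess
#   subprocess.run(build_extract_command("clip.mp4", "wav"), check=True)
```

Returning a list instead of a shell string avoids quoting problems with file names that contain spaces.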

10 months ago 0 0 1 0
Post image

Reading JLA 1997 #1. Is that Wolverine on the left and Doom on the right?

11 months ago 0 0 0 0
Video

Much better, IMO, compared to the first try. [BSKY might have compressed the video]

11 months ago 0 0 0 0
Post image

A touch of compositor to make the text Star Wars yellow

11 months ago 0 0 1 0
Post image

Realized that creating an empty, parenting all the logos to it, and animating the empty instead of the camera would have been simpler and given more control over the title crawl.

11 months ago 0 0 1 0