#cuda hashtag - Bluesky

3 days ago

#CUDA:
Use #Custom-installer option
to redirect Toolkit path to E:\.

@thekhronosgroup.fosstodon.org.ap.brid.gy

3 days ago

#CUDA 13.2 cu130
Uninstall toolkit and related drivers
via Control Panel
Look for entries labeled
#NVIDIA-CUDA or
#CUDA-Toolkit

Khronos Group

3 days ago

Join us at IWOCL 2026 for Paulius Velesko's keynote, chipStar: OpenCL as a Portability Layer for CUDA/HIP Applications

Keynote at IWOCL 2026: Paulius Velesko presents chipStar — compiling unmodified CUDA/HIP code into OpenCL & SPIR-V fat binaries that run on Intel, AMD, NVIDIA, ARM, and RISC-V hardware. No recompilation needed.

Join us at IWOCL 2026, May 6–8 in Heilbronn […]

[Original post on fosstodon.org]

3 days ago

To use or set
#variables
in Windows 11 command files
(batch scripts) and to
avoid typing long pathnames for
#Python 3.14t
#Python 3.14
#CUDA 13.2
you can use
#set command
for temporary sessions
or
#setx for permanent changes
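
The difference between a temporary and a permanent variable can be sketched from Python as well. This is a minimal illustration, not the post author's method: the variable name `CUDA_DEMO_HOME` and the path `E:\CUDA\v13.2` are hypothetical, and the `setx` command line is only built here, never executed. Assigning to `os.environ` behaves roughly like `set`: the value is visible to the current process and the children it starts, and disappears when the process ends.

```python
import os
import subprocess
import sys


def set_for_session(name: str, value: str) -> None:
    """Rough analogue of `set NAME=value`: affects this process and its children only."""
    os.environ[name] = value


def build_setx_command(name: str, value: str) -> list[str]:
    """Build (but do not run) the Windows `setx NAME value` command line."""
    return ["setx", name, value]


def read_in_child(name: str) -> str:
    """Start a child Python process and return the value it inherits for `name`."""
    code = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical variable name and path, used only for illustration.
    set_for_session("CUDA_DEMO_HOME", r"E:\CUDA\v13.2")
    print(read_in_child("CUDA_DEMO_HOME"))
    print(build_setx_command("CUDA_DEMO_HOME", r"E:\CUDA\v13.2"))
```

A value written with `setx` is picked up by command windows opened afterwards, not by the one that ran it, so a batch script that needs the value immediately would still use `set` as well.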

3 days ago

DRTriton: Large-Scale Synthetic Data Reinforcement Learning for Triton Kernel Generation

Developing efficient CUDA kernels is a fundamental yet challenging task in the generative AI industry. Recent researches leverage Large Language Models (LLMs) to automatically convert PyTorch refer…

#Triton #CUDA #LLM

hgpu.org?p=30706

3 days ago

AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search

Writing high-performance GPU kernels is among the most labor-intensive tasks in machine learning systems engineering. We present AutoKernel, an open-source framework that applies an autonomous agen…

#CUDA #Triton #Package

hgpu.org?p=30703

SearchEngine

@searchengine.activitypub.awakari.com.ap.brid.gy

3 days ago

Original post on hgpu.org

AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search Writing high-performance GPU kernels is among the most labor-intensive tasks in machine learning systems engineering...

#Computer #science #CUDA #paper #Machine #learning #nVidia #nVidia #B200 #nVidia #H100

Origin […]

Arif Solmaz

@arifsolmaz.bsky.social

3 days ago

GPUmonty harnesses CUDA to simulate the spectacular light from material spiraling into black holes, running relativistic radiative transfer 10x faster than CPU codes.

https://github.com/black-hole-group/gpumonty

#BlackHoles #CUDA #Astrophysics

PitCrew

@pitcrewgg.bsky.social

5 days ago

ICYMI: NVIDIA driver 595.58.03 released as the big new recommended stable driver for Linux

#CUDA #GeForce #Linux #LinuxGaming #NVIDIA #OpenGL #PCGaming #RTXOn #Vulkan

www.gamingonlinux.com/2026/03/nvid...

6 days ago

How to verify
in Windows 11
that the
#cuda.tile (cuTile) library
got properly installed with
#CUDA 13.2
for headless GPU
#GeForce-RTX-5060
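
One possible first check, sketched in Python under the assumption that the library is importable as `cuda.tile` (the name used in the post): ask the import system whether it can locate the module, and look for the `nvidia-smi` tool on `PATH`. The helper name `module_available` is hypothetical. This only shows that the package and driver tool can be found; it does not prove that kernels actually run on the GPU.

```python
import importlib.util
import shutil


def module_available(dotted_name: str) -> bool:
    """Return True if the import system can locate `dotted_name`."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is missing.
        return False


if __name__ == "__main__":
    # "cuda.tile" is the module name taken from the post; treat it as an assumption.
    print("cuda.tile importable:", module_available("cuda.tile"))
    # shutil.which returns the full path of the executable, or None if it is not on PATH.
    print("nvidia-smi found at:", shutil.which("nvidia-smi"))
```

Because the check never initializes the GPU, it works the same way on a headless card; running a small kernel afterwards would be the stronger test.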

DragonXI AI fellow

@dragonxi.ai

6 days ago

which #AI stack
for robotics development
in
#windows11
with
#CUDA 13.2
and
#python 3.14
and
#PyTorch 2.10.0

6 days ago

MobileKernelBench: Can LLMs Write Efficient Kernels for Mobile Devices?

Large language models (LLMs) have demonstrated remarkable capabilities in code generation, yet their potential for generating kernels specifically for mobile devices remains largely unexplored. In …

#CUDA #LLM #CodeGeneration

hgpu.org?p=30695

6 days ago

SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits

As agentic AI systems become increasingly capable of generating and optimizing GPU kernels, progress is constrained by benchmarks that reward speedup over software baselines rather than proximity t…

#CUDA #Triton #Benchmarking #Package

hgpu.org?p=30694

6 days ago

LLMQ: Efficient Lower-Precision LLM Training for Consumer GPUs

We present LLMQ, an end-to-end CUDA/C++ implementation for medium-sized language-model training, e.g. 3B to 32B parameters, on affordable, commodity GPUs. These devices are characterized by low mem…

#CUDA #LLM #Package

hgpu.org?p=30692

LLMs

@llms.activitypub.awakari.com.ap.brid.gy

6 days ago

Original post on hgpu.org

MobileKernelBench: Can LLMs Write Efficient Kernels for Mobile Devices? Large language models (LLMs) have demonstrated remarkable capabilities in code generation, yet their potential for generating...

#Computer #science #CUDA #paper #Benchmarking #Code #generation #LLM #nVidia #nVidia #A100 […]
