Can your database system predict underprovisioning before it even happens?
Meet ◒ xBound, the very first framework for join size lower bounds. xBound tells you *at least* how many tuples your SQL query will produce.
Brought to you by @microsoft.com Gray Systems Lab & @utndatasystems.bsky.social.
Posts by Andreas Kipf
I hope you've had a great start to the year! I'm excited to announce our blog. We're kicking things off with a look back at everything that happened in 2025.
utndatasystems.github.io/blog/2025/re...
Looking forward to great discussions and catching up with everyone at VLDB!
Tuesday, 1:45 PM:
🪂 Parachute: Single-Pass Bi-Directional Information Passing
by Mihail Stoian (Research 8 — Westminster, 4F)
PDF: www.vldb.org/pvldb/vol18/...
Code: github.com/utndatasyste...
Monday, 11:45 AM:
🍫 Instance-Optimized String Fingerprints
by Mihail Stoian (AIDB Workshop — Westminster, 4F)
PDF: arxiv.org/pdf/2507.10391
Code: github.com/utndatasyste...
Our lab is excited to be presenting two papers at VLDB 2025 in London this week! 🇬🇧
🛠️ The position requires strong programming skills in C++ and Python.
We've already published early results in this space:
- Virtual, TRL @ NeurIPS'24: arxiv.org/pdf/2410.140...
- Virtual, EDBT'25 (Best Demo): openproceedings.org/2025/conf/ed...
📍 Based in Nuremberg, a vibrant and historic city in the heart of Bavaria, our lab offers a stimulating research environment and an excellent quality of life.
This fully funded position (100% E13 salary) is part of an exciting collaboration with the Machine Learning Lab, focusing on systems-oriented research at the intersection of databases and machine learning.
The Data Systems Lab is seeking a motivated PhD candidate to join our team and work on foundation models for data compression.
Off to SIGMOD 2025 in Berlin! 🚄
Here’s our schedule:
Today, 4:20 PM:
💡 Redbench: A Benchmark Reflecting Real Workloads (aiDM)
Wed, 2:00 PM:
🏆 DPconv: Super-Polynomially Faster Join Ordering
Thu, 2:30 PM:
❄️ Pruning in Snowflake: Working Smarter, Not Harder
Come say hi! 👋
Excited to share our latest paper in collaboration with Snowflake! Congratulations to my PhD student, @andizimmerer.bsky.social, on his first SIGMOD publication, and many thanks to Snowflake for their support—both in providing essential statistics and championing the work.
Fantastic news 🎖️
@mihailstoian.bsky.social will present DPconv at SIGMOD in Berlin this June.
Thrilled to share that we've received the Best Demonstration Award 🏆 at EDBT 2025!
Congratulations to my students @mihailstoian.bsky.social and Ping-Lin Kuo for their excellent work and dedication over the past few weeks—well deserved!
Paper: openproceedings.org/2025/conf/ed...
We just released Redbench, a new benchmark that contains 30 analytical SQL workloads that can be used to benchmark workload-driven optimizations. Go check it out!
GitHub: github.com/utndatasyste...
A huge thank you to our speakers and everyone who contributed to the great discussions at today’s workshop!
A big shout-out to Manisha Luthra and @matthiasboehm7.bsky.social for co-organizing—always a pleasure working together! 😊
We just announced the program. Hope to see you in Bamberg on Tuesday!
luthramanisha.github.io/ML4Sys-and-S...
Chapeau!
During their time with us, they gained insights into our research projects, particularly DataLoom and Virtual. Moreover, they had the chance to enjoy the modern furniture in our seminar rooms.
DataLoom: dl.acm.org/doi/pdf/10.1...
Virtual: arxiv.org/pdf/2410.14066
It was a pleasure to host six students from the Software Engineering Elite Graduate Master's Program (University of Augsburg, @tum.de, and @lmumuenchen.bsky.social) for a brief visit to UTN.
In addition to the 30 students in the program, researchers from UTN's AI labs joined us, keen to explore how AI could further enhance Firebolt's caching system.
Alex showcased how Firebolt caches results from arbitrary operators using a cost-benefit-based replacement policy. He also introduced Firebolt's hash join, which features a compact hash table design minimizing its memory footprint.
This week, in the Data Engineering course of the AI & Robotics Master's program at UTN, we had the pleasure of hosting Alex Hall from @firebolthq.bsky.social for an insightful talk on optimizing SQL query performance through caching and reusing subresults in repetitive workloads.
Are you a fan of Parquet and at #NeurIPS2024 tomorrow? Let's meet at our poster at @trl-research.bsky.social to see how you can reduce your Parquet file sizes by up to 40%.
Virtual compresses tables via functions while ensuring fast column scans.
⏰ 2.30pm
📍East Meeting Room 11 & 12
Come join us in Bamberg in March. The submission process is very lightweight: you just need to complete a Google Form.
Vol:17 No:12 → DataLoom: Simplifying Data Loading with LLMs
👥 Authors: Alexander Van Renen, Mihail Stoian, Andreas Kipf
📄 PDF: https://www.vldb.org/pvldb/vol17/p4449-renen.pdf