I recorded a longer video of me talking about our recent product dev ideas for building a company in public, but sadly Bluesky limits videos to 3 minutes 😭. If you have the attention span for 5 minutes, you can find it here: www.youtube.com/watch?v=TFDM... #indiehackers #bootstrapping
Posts by philip
We published an overview of optimization goals, contexts and approaches that can be found in industry and academia: www.heltweg.org/posts/optim....
If you are running general-purpose data pipelines in the cloud, how can you optimize your infrastructure to make it more efficient? I could contribute to a study to find out. 🧵#databs #dataengineering
It might be, but why would you rob yourself of one of social media's few pleasures 😬
Added numbers to my French learning game, ready to make the same tired old joke everyone makes about the language. But, on a positive engineering note, adding numbers was as easy as just adding another CSV file. So, good for me 😊.
Few pictures can create a positive sense of fragility and dread (maybe an appreciation for life?). These are some of them.
Ah forgot to tag @jmduke.com pls talk to me senpai
The first post is live on my blog, about our current ideation and product ideas: www.heltweg.org/posts/ideat....
I always wish my feed was more interesting to read, with more honest, raw posts and fewer announcements and less AI slop.
So, trying to be the change I want to see in the world, I will try to post more openly about building an independent business + make grammar mistakes to prove that I write myself :).
We did very narrow testing, so I cannot confidently comment on that. My assumption would be that it would have a smaller effect (participants less used to that style), but still some correctness improvements (fewer off-by-one errors, and it stays semantically clear what the order must be). But again, we haven't tested it.
Should languages for experts use spreadsheet syntax or numeric index to select cells in 2D data?
We designed a controlled experiment to test this. Importantly, participants made significantly fewer mistakes when using spreadsheet syntax.
Details: www.heltweg.org/posts/is-sp....
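To make the two addressing styles concrete, here is a minimal sketch in Python. It is not taken from the experiment or the language studied there; the helper names `a1_to_index` and `cell` and the sample `grid` are hypothetical and only illustrate how a spreadsheet reference such as `B3` maps to a zero-based numeric index.

```python
import re


def a1_to_index(ref: str) -> tuple[int, int]:
    """Convert a spreadsheet reference such as 'B3' to zero-based (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"not a valid cell reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    # Column letters are a base-26 number without a zero digit: A=1 ... Z=26, AA=27.
    for ch in letters.upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def cell(grid: list[list[str]], ref: str) -> str:
    """Look up a cell by spreadsheet reference instead of numeric indices."""
    row, col = a1_to_index(ref)
    return grid[row][col]


# Hypothetical sample data, only for illustration.
grid = [
    ["name", "city", "age"],
    ["Ada", "London", "36"],
    ["Linus", "Helsinki", "28"],
]

by_reference = cell(grid, "B3")  # spreadsheet syntax: column B, row 3
by_index = grid[2][1]            # numeric index: row 2, column 1 (zero-based)
```

Both lookups select the same cell. The numeric version needs the reader to remember the row/column order and the zero-based offset, which is where off-by-one mistakes tend to creep in.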
More writeups on how to extend model capabilities, continuing with the Model Context Protocol (MCP) from the bottom up. Worth looking into the basics of the spec before using it :).
Another day, another write-up. Today: Cache-Augmented Generation, an alternative to RAG for providing context to LLMs that works with small knowledge bases. Together with notes on in-context learning, I think this concludes my dive into this cluster of topics. Or did I miss something important?
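A rough sketch of the prompt side of that idea, assuming the whole knowledge base is small enough to fit in the context window: the documents are joined into one prefix that is built once and reused for every question, with no retrieval step. The function names and the sample documents are hypothetical. The actual caching (reusing the model's key-value state for the shared prefix) happens inside the model runtime and is not shown here.

```python
def build_cached_prefix(documents: list[str]) -> str:
    """Join the whole (small) knowledge base into one prompt prefix, built once."""
    numbered = [f"[doc {i}] {text}" for i, text in enumerate(documents, start=1)]
    return "Answer using only the documents below.\n\n" + "\n".join(numbered) + "\n\n"


def build_prompt(prefix: str, question: str) -> str:
    """Append a question to the shared prefix; only this suffix changes per query."""
    return f"{prefix}Question: {question}\nAnswer:"


# Hypothetical knowledge base, only for illustration.
documents = [
    "The garden notes cover RAG, in-context learning and FAIR data.",
    "Cache-augmented generation preloads all documents into the context.",
]

prefix = build_cached_prefix(documents)
prompts = [
    build_prompt(prefix, "Which topics do the notes cover?"),
    build_prompt(prefix, "What does cache-augmented generation preload?"),
]
```

Because every prompt starts with the identical prefix, a runtime that supports prefix or key-value caching can process the documents once and only compute the short question suffix per request.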
Also: Added a word-learning extension to my conjugation game (and a recent error view), because I desperately need it 🤦. But it motivated me to fail for 30 minutes this morning, so I count that as a success.
Another write-up, this time on alternatives to RAG, especially Knowledge-Augmented Generation. This one we actually used in a project, right when the research around GraphRAG came out. Very helpful for adding domain knowledge, especially if a keyword-based search is not really useful.
Today's data fundamentals look: FAIR data principles. Again ancient by data science/AI timelines, but good to remind myself :) rhazn.github.io/garden-ds/F...
It's kind of sad that these initiatives never really gain traction. But I guess a large part of the beauty of csv and json is that they're very basic and unstructured, but human readable. Reminds me of the simplicity of GTFS vs the more advanced standards and which one is more common 😅
A great article on data organization basics. Spreadsheets are everywhere: more data than engineers want to admit is managed in Excel, and a little data hygiene can help immensely 😅: doi.org/10.1080/000...
I have the same experience and would love to read your thinking! 😅
I don't want to be rude and I like the question, but I assume this is just a social ad for your tool. Happy to engage with real human accounts though, so feel free to ask under that again ;).
Day 2 of digital gardening for AI/DS, today having a look at RAGs. Main insights today from reading the original (?) NeurIPS paper by Lewis et al., getting some exposure to the academic language used in AI. I like parametric memory vs. non-parametric memory for example :). 🧵
silicon valley s2, ep10 should be required watching for everyone in an early tech startup
Bit of a vulnerable post, but I assume others feel similarly? With AI/DS, I have an intuitive understanding, but unsure about deep knowledge. Hard to know, hard to build. I want to try with digital gardening, slowly writing. Let's see where it takes me 😅. Today: Entrances. rhazn.github.io/garden-ds/
I am coming around to that realization more and more for social media as well. Better to be a cool fish in a cool pond than a generic guppy in the ocean :).
Added more verbs and an English-only mode, so I can combine it with infinitives-only to learn the new verbs themselves 🥳. I wonder how effective this is: it is very easy for me to slip into a flow state that just pattern-matches verbs... but on the other hand, quick pattern matching might be good?