philip (@heltweg.org) Bsky

Building a company in public - Finding ideas

YouTube video by Philip Builds In Public

I recorded a longer video of me talking about our recent product dev ideas for building a company in public, but sadly Bluesky limits videos to 3 minutes 😭. If you have the attention span for 5 minutes, you can find it here: www.youtube.com/watch?v=TFDM... #indiehackers #bootstrapping

4 hours ago 2 0 0 0

Optimization Opportunities for Cloud-Based Data Pipeline Infrastructures | Philip Heltweg

We published an overview of optimization goals, contexts and approaches that can be found in industry and academia: www.heltweg.org/posts/optim....

1 day ago 1 0 0 0

If you are running general-purpose data pipelines in the cloud, how can you optimize your infrastructure to make it more efficient? I could contribute to a study to find out. 🧵#databs #dataengineering

1 day ago 0 0 1 0

It might be, but why would you rob yourself of one of social media's few pleasures 😬

2 days ago 1 0 0 0

Added numbers to my French learning game, ready to make the same tired old joke everyone makes about the language. But, on a positive engineering note, adding numbers was as easy as just adding another CSV file. So, good for me 😊.

5 days ago 1 0 0 0

Few pictures can create a positive sense of fragility and dread (maybe an appreciation for life?). These are some of them.

5 days ago 0 0 0 0

Ah forgot to tag @jmduke.com pls talk to me senpai

5 days ago 0 0 0 0

Ideation and Product Ideas | Philip Heltweg

I always wish my feed was more interesting to read, with more honest, raw posts and less pure announcements and AI slob. So, trying to be the change I want to see in the world, I will try to post more openly about trying to build an independent business (and at the same time make enough grammar mistakes to prove that I type this myself).

The first post is live on my blog, about our current ideation and product ideas: www.heltweg.org/posts/ideat....

5 days ago 1 0 1 0

I always wish my feed was more interesting to read, with more honest, raw posts and less announcements and AI slob.

So, trying to be the change I want to see in the world, I will try to post more openly about building an independent business + make grammar mistakes to prove that I write myself :).

5 days ago 0 0 1 0

We did very narrow testing, so I can not confidently comment on that. My assumption would be that it would have lesser effects (participants less used to that style), but still some correctness improvements (off-by-one, still semantically clear what the order must be). But again, haven't tested.

1 week ago 1 1 0 0

Is spreadsheet syntax better than numeric indexing for cell selection? | Philip Heltweg

Should languages for experts use spreadsheet syntax or numeric index to select cells in 2D data?

We designed a controlled experiment to test. Importantly, participants made significantly fewer mistakes when using spreadsheet syntax.

Details: www.heltweg.org/posts/is-sp....

1 week ago 5 0 1 0

Model Context Protocol (MCP)

The Model Context Protocol (MCP) is a standard for AI agents to interact with external tools.

rhazn.github.io/garden-ds/a...

2 weeks ago 0 0 0 0

More writeups on how to extend model capabilities, continuing with the Model Context Protocol (MCP) from the bottom up. Worth looking into the basics of the spec before using it :).

2 weeks ago 1 0 1 0

Cache-Augmented Generation

Cache-Augmented Generation (CAG) is an alternative to Retrieval-Augmented Generation to provide additional context to Large Language Models.

rhazn.github.io/garden-ds/a...

3 weeks ago 0 0 0 0

Another day, another write-up. Today: Cache-Augmented Generation, an alternative to RAG for providing context to LLMs that works with small knowledge bases. Together with notes on in-context learning, I think this concludes my dive into this cluster of topics. Or did I miss something important?

3 weeks ago 0 0 1 0

Also: Added a word-learning extension to my conjugation game (and a recent error view), because I desperately need it 🤦. But it motivated me to fail for 30 minutes this morning, so I count that as a success.

3 weeks ago 0 0 0 0

Knowledge-Augmented Generation

Knowledge-Augmented Generation is an approach / alternative to Retrieval-Augmented Generation that relies on Knowledge Graphs (KG) for information storage and retrieval.

rhazn.github.io/garden-ds/a...

3 weeks ago 1 0 0 0

Another write-up, this time on alternatives to RAG, especially Knowledge-Augmented Generation. This one we actually used in a project, right when the research around GraphRAG came out. Very helpful to add domain knowledge, especially if a keyword-based search is not really useful.

3 weeks ago 1 0 1 0

FAIR Data Principles

Originally published in 2016 with the goal of defining “good data management” to improve scholarly data with the goal of making downstream use and re-use easier and improve the output of research investments.

Today's data fundamentals look: FAIR data principles. Again ancient by data science/AI timelines, but good to remind myself :) rhazn.github.io/garden-ds/F...

3 weeks ago 2 0 0 0

It's kind of sad that these initiatives never really gain traction. But I guess a large part of the beauty of csv and json is that they're very basic and unstructured, but human readable. Reminds me of the simplicity of GTFS vs the more advanced standards and which one is more common 😅

3 weeks ago 2 0 1 0

Frictionless Data

Data software and standards

There's Frictionless Data (frictionlessdata.io) by the OKFN :)

3 weeks ago 9 0 2 0

A great article on data organization basics. Spreadsheets are everywhere, more data than engineers want to admit is managed in Excel and some small data hygiene can help immensely 😅: doi.org/10.1080/000...

3 weeks ago 0 0 0 0

I have the same experience and would love to read your thinking! 😅

3 weeks ago 1 0 0 0

I don't want to be rude and I like the question, but I assume this is just a social ad for your tool. Happy to engage with real human accounts though, so feel free to ask under that again ;).

4 weeks ago 0 0 0 0

Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) refers to enhancing the answer generation of a Large Language Model with additional information that is relevant to the user query.

rhazn.github.io/garden-ds/a...

4 weeks ago 0 2 0 0

Day 2 of digital gardening for AI/DS, today having a look at RAGs. Main insights today from reading the original (?) NeurIPS paper by Lewis et al., getting some exposure to the academic language used in AI. I like parametric memory vs. non-parametric memory for example :). 🧵

4 weeks ago 0 0 2 0

silicon valley s2, ep10 should be required watching for everyone in an early tech startup

4 weeks ago 6 0 1 0

index

An attempt to deliberately study the foundations of data science and artificial intelligence, inspired by the concept of digital gardens and the structure of A Pattern Language.

Bit of a vulnerable post, but I assume others feel similarly? With AI/DS, I have an intuitive understanding, but unsure about deep knowledge. Hard to know, hard to build. I want to try with digital gardening, slowly writing. Let's see where it takes me 😅. Today: Entrances. rhazn.github.io/garden-ds/

4 weeks ago 0 0 0 0

I am coming around to that realization more and more for social media as well. Better to be a cool fish in a cool pond than a generic guppy in the ocean :).

4 weeks ago 1 0 1 0

Added more verbs and an English only mode, this way I can use it + Infinitives only to learn the new verbs themselves 🥳. I wonder how effective this is, very easy to slip into a flow state for me that just pattern matches verbs... but on the other hand, quick pattern matching might be good?

1 month ago 1 0 0 0
