Posts by Ian Johnson 🔬🤖

Video

alternative view

1 year ago 2 0 0 0
Video

mean pooling

1 year ago 1 0 1 0

I'm interested in chatting about some data vis work!

1 year ago 5 0 0 0
Post image

implemented a new rendering component for latent scope's scatter plot. had to replace regl-scatterplot with d3-zoom + regl shaders so we could support mobile

1 year ago 7 1 0 0

She is interested in policy, and it sounds like the potential for adopting tech is both exciting and overwhelming. perhaps good examples where tech intervention has had clear benefits. I think plant health would be a great place to start!

1 year ago 2 0 0 0

hi Gabriel, do you have any resources/reading to recommend? I have a friend working in Taiwan to improve local agriculture who's interested in learning about tech potential to help

1 year ago 0 0 1 0
Post image

I'll be at @unireps.bsky.social this Saturday presenting a new experimental pipeline to visually explore structured neural network representations. The core idea is to take thousands of prompts that activate a concept, and then cluster and draw them using MultiDiffusion. 🧵👇

1 year ago 31 8 2 0

is it the overhead of running / opening / managing notebook files via browser?

I found using notebooks in vscode (and cursor) got me over the hump of "just getting started" since I'm already in the ide so much

1 year ago 1 0 1 0

cool! what are you mapping exactly?

1 year ago 1 0 1 0

yes, i want to use it for storage but in order to do so i need to do this inefficient conversion. i'm wondering if there is a better choice to store with

1 year ago 0 0 1 0

I've been operating under the assumption that parquet is the best way to store intermediate data, but now that I'm trying to handle incoming image data it feels a bit wasteful, especially since converting to bytes only runs at about 40 it/s

1 year ago 0 0 1 0
Post image

am i missing something for handling image data in parquet files?

I can load a dataset from HF like:
dataset = load_dataset("Marqo/marqo-ge-sample", split='google_shopping')
df = pd.DataFrame(dataset)
but i need to convert the images to bytes if I want to do:
df.to_parquet("sample.parquet")

1 year ago 2 0 1 0
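For readers following the parquet question above, here is a minimal sketch of what "converting the images to bytes" can look like, assuming the image column holds Pillow images. The helper names `image_to_png_bytes` and `png_bytes_to_image` are hypothetical and not from the original post, and PNG is only one possible encoding.

```python
import io

from PIL import Image


def image_to_png_bytes(img: Image.Image) -> bytes:
    """Encode a Pillow image as PNG bytes (hypothetical helper)."""
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return buf.getvalue()


def png_bytes_to_image(data: bytes) -> Image.Image:
    """Decode PNG bytes back into a Pillow image (hypothetical helper)."""
    return Image.open(io.BytesIO(data))


# Round-trip a tiny synthetic image as a stand-in for a dataset row.
img = Image.new("RGB", (4, 4), color=(255, 0, 0))
data = image_to_png_bytes(img)
restored = png_bytes_to_image(data)
```

A bytes column like this is something a DataFrame can write to parquet, for example by mapping the helper over the image column before calling `df.to_parquet(...)`. The per-image encode step is the part the post describes as slow.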
Post image

“They said it could not be done”. We're releasing Pleias 1.0, the first suite of models trained on open data (either permissibly licensed or uncopyrighted): Pleias-3b, Pleias-1b and Pleias-350m, all based on the two trillion tokens set from Common Corpus.

1 year ago 248 85 11 19
Preview
GitHub - j-mahowald/clip-loc-maps: Repository for the paper "Integrating Visual and Textual Inputs for Searching Large-Scale Map Collections with CLIP"

CLIP search for 562K maps in Lib of Congress github.com/j-mahowald/c... Paper: 2024.computational-humanities-research.org/papers/paper... #chr2024

1 year ago 27 5 2 0
Post image

the algorithm is not some deity but a landscape, the feed is an uber ride across the manifold, only the windows are blacked out. what if you had a map of the algorithm? what if the UX of the feed let you look out of the window?

musing with @infowetrust.com
image from distill.pub/2017/aia/

1 year ago 5 3 0 0
Latent Scope

Spent the day playing with this. I'm absolutely blown away @enjalot.bsky.social!

- Choose any embedding from HF
- Project with UMAP, cluster with HDBSCAN
- Use Ollama to label the clusters (Works incredibly well!)

1 year ago 8 2 0 0

😙👌📊📈📚

1 year ago 2 0 0 0
Preview
Paper Cone Christmas Decorations - Free Printable Make these cute Santa and Christmas tree paper cone Christmas decorations with our free printable templates.

what do you think about having cut out templates like this cool cone ornament coloring thing:
picklebums.com/paper-cone-c...

1 year ago 0 0 1 0

what's crazy to me is that so many of these can be run very efficiently on an M1 MacBook Pro, and just fine on a VM with only CPU.
crazy how much value you can pull out of text without billions of parameters

1 year ago 6 0 0 0
Post image

If you're interested in embedding models for retrieval (search), clustering, classification, paraphrase mining, etc., then there's now 10,000 fully free and open source options on @hf.co via Sentence Transformers.

Check out the most popular ones here: huggingface.co/models?libra...

1 year ago 32 7 2 1
Post image

I've organized and participated in many unconferences in the past, and they are always the most intense exchange of ideas and information that I've experienced. Given the energy we're seeing in the registration this one is poised to be no different!

register today!
hiddenstates.org

1 year ago 1 1 0 0

After the morning keynotes we will have a short voting session where topics get put on the board and everyone gets a few votes. Then the most popular topics get assigned to different session times. We will have parallel tracks and breakout rooms for the niche topics with dedicated interest too.

1 year ago 0 0 1 0

There is lots of interest in steering and alignment by leveraging the latest interpretability techniques like SAEs. Many people also brought up dimensionality reduction and visualization as well as better ways to extract structure from models.

So how will everyone get to talk about these topics?

1 year ago 0 0 1 0

The beauty of the unconf format is the self-organizing nature, people find each other based on common curiosities. We have noticed some themes in the topics shared during registration:

Lots of people want to go beyond the chat interface, and there appear to be lots of ideas for how to do that.

1 year ago 0 0 1 0

First we've got 2 amazing keynote speakers to kick off the day: @lelandmcinnes.bsky.social and @thesephist.com

Leland has built indispensable tools for working with model internals, namely UMAP, HDBSCAN and DataMapPlot.
Linus has published inspiring design research interfacing with hidden states.

1 year ago 1 0 1 0

We've hit a critical mass of registrations! The caliber of attendees is exciting, we've got researchers from companies big and small, academic and indie. We've got prototypers and UXers who have worked on bleeding-edge interfaces as well as household names.

let's talk about the unconf experience:

1 year ago 2 1 1 0
Post image

Hidden States is happening next week in SF!

It's a one-day unconference gathering researchers, designers, prototypers and engineers interested in pushing the boundaries of AI interfaces, going below the API and working with the hidden states.

hiddenstates.org

1 year ago 12 2 1 2
enjalot's tweets | Latent Scope

I've also made another tool for exploring unstructured text data (i.e. tweets) via a map of sorts:
enjalot.github.io/latent-scope...

1 year ago 3 0 1 0
Post image Post image

If you do this with enough data you start to get a map of the patterns found in your dataset.

When you embed new data, like the question for a RAG query, you can see where on the map it lands.

1 year ago 2 0 1 0
Post image Post image

You can map more and more points; a less similar point will show up a little further away.
As you add more points a map starts to form, with clusters of similar data spread out before you.

1 year ago 3 0 1 0
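The map thread above rests on the idea that similar items have nearby embeddings. Here is a rough sketch of that idea using cosine distance on made-up 3-dimensional vectors. The vectors, labels, and helper names are illustrative assumptions. A real map like the ones in the posts would use a learned embedding model plus a 2D projection such as UMAP, which this sketch does not do.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, points):
    """Return the label of the stored point closest to the query."""
    return min(points, key=lambda label: cosine_distance(query, points[label]))


# Toy "embeddings" (made up for illustration, not from a real model).
points = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

# Similar items sit closer together than dissimilar ones.
d_similar = cosine_distance(points["cat"], points["kitten"])
d_different = cosine_distance(points["cat"], points["invoice"])

# A new query lands next to whatever it resembles most.
query = [0.1, 0.0, 0.95]
landing_spot = nearest(query, points)
```

The same lookup is what lets you see where a new item, such as a RAG question, lands relative to the existing data.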