#OpenWebSearchEU
Towards a shared infrastructure for assembling web search engines

Web search engines are essential for navigating the web. Suppose we look at the web as a service provided by public utility companies, similar to electricity, water or telephone. To make sure that everyone has access to the web, public utility companies have to be subject to public control and regulation. Without regulation, a single firm may abuse its natural monopoly, for instance by raising prices, by degrading the service, by delivering unequal quality to different groups, or by pushing advertisements and propaganda. Equity requires that all citizens can access the web at a fair price and at a sufficient level of quality, via transparent, well-regulated, community-based or government-based control.

OpenWebSearch.eu is a European Union funded project that researches what a transparent, well-regulated, community-based web search engine would look like. The project builds the index for a web search engine on open infrastructure that is distributed over four data centers in four different European countries. The data centers cooperatively crawl the web, cooperatively preprocess and enrich the web data, and cooperatively build an inverted index that is shared with the world. We envision a future where a search engine is “assembled” from parts provided by many different companies, based on public standards.

I will discuss public standards for search engine indexes, such as the common index file format (CIFF), and approaches based on open data formats like Parquet and open cloud object storage like S3. Furthermore, I will show how researchers can query the Open Web Index remotely using a low-cost local machine, without the need to download the full index, even though it currently consists of more than 10 billion web pages.

_To be presented at the European Conference on Information Retrieval (ECIR 2026) IR4Good track on 30 March 2026 in Delft, the Netherlands_

I will be giving an invited talk at the #ECIR2026 IR4Good track about #OpenWebSearchEU: "Towards a shared infrastructure for assembling web search engines"

djoerdhiemstra.com/2026/towards-a-shared-in...

Original post on idf.social

#ECIR2026 notifications were friendly to me 🤗

1. Full paper "Open Web Indexes for Remote Querying" with @gijs and @djoerd.

Can we let ppl query the Terabytes of Web Index we collect in #OpenWebSearch.EU in new ways, making good use of Parquet, S3, DuckDB?

Turns out the answer is a big YES! […]
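
The post is truncated, so the paper's actual method is not shown here. As general background only, and assuming nothing about the paper beyond its mention of Parquet and S3: a Parquet file keeps its metadata in a footer at the end of the file, so a client can locate that footer with one small ranged read instead of downloading the whole file. A minimal Python sketch of that step follows; the function names and the fake file are hypothetical and used purely for illustration.

```python
import struct

PARQUET_MAGIC = b"PAR1"


def footer_byte_range(file_size: int, tail: bytes) -> tuple[int, int]:
    """Return (start, end) offsets, end exclusive, of the footer metadata.

    `tail` must hold the last 8 bytes of the file: a 4-byte little-endian
    footer length followed by the 4-byte magic "PAR1".
    """
    if len(tail) != 8 or tail[4:] != PARQUET_MAGIC:
        raise ValueError("not a Parquet file tail")
    (footer_len,) = struct.unpack("<I", tail[:4])
    end = file_size - 8
    start = end - footer_len
    if start < 4:  # the file also starts with a 4-byte magic
        raise ValueError("footer length exceeds file size")
    return start, end


def range_header(start: int, end: int) -> str:
    """Format an HTTP Range header value (HTTP byte ranges are inclusive)."""
    return f"bytes={start}-{end - 1}"


# Fake file for illustration: magic, 100 data bytes, a 20-byte footer,
# the footer length, and the closing magic.
fake = (
    PARQUET_MAGIC
    + b"\x00" * 100
    + b"\x01" * 20
    + struct.pack("<I", 20)
    + PARQUET_MAGIC
)
start, end = footer_byte_range(len(fake), fake[-8:])
print(range_header(start, end))
```

With the footer in hand, a reader knows where each column chunk lives and can request only the byte ranges a query needs, which is what makes querying large remote files from a small local machine plausible.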

More digital sovereignty for Europe – Open Web Index launched The OpenWebSearch.EU project aims to strengthen Europe's digital independence. With the launch of the Open Web Index (OWI), it has reached a milestone for open internet search.

🔎 The #OpenWebSearchEU project, coordinated by @mgrani.bsky.social and #OpenSearch Foundation, aims to strengthen Europe's digital sovereignty. With the launch of the #OpenWebIndex (OWI), it has reached a milestone for open internet search:
🧪


#NGISearch & #OpenWebSearchEU with #NGISargasso at the #NGIForum2025! ✨

Collaboration with the #NextGenerationInternet community is key to shape an #OpenInternet together 🚀

Discover more about NGI Sargasso: ngisargasso.eu 🌊

#NGIForum25 #DigitalSovereignty #NGI #OpenSource

@ngi4eu.bsky.social


#OpenWebSearchEU is a silver sponsor of #ECIR2025!

https://ecir2025.eu/sponsors/

Welcome - OpenWebSearch.eu – Promoting Europe's Independence in Web Search

Open web index #OWI update:

4 billion URLs crawled
185 different languages
28 million Hosts
750 TB crawled
1 TB crawled per day
147 WARC Datasets
17.5 TB size of Open Web Index
28.8 TB size of WARC datasets
346 public datasets

#OpenWebSearchEU #OpenWebSearch

https://ows.eu
