
Posts by Diomidis Spinellis


Generative AI upends the goals and nature of empirical software research. Here are my notes from Emerson Murphy-Hill's MSR keynote talk on the topic.
www.spinellis.gr/blog/2026041...

1 week ago 2 0 0 0

Unless you need something that's only available in R / Perl, consider using a more mainstream language, like Python or Java. I've been there, and found that the cost of switching was lower than the cost of staying.

1 month ago 1 0 0 0

It's unclear how the cited example was generated. Was it really summarization, or was the model given the freedom to add its own spice? My experience is that state-of-the-art models are quite good (though not perfect) at summarization.

1 month ago 2 0 1 0
blog dds: 2026-03-02 — Vibe coding toward the incident horizon

Vibe coding toward the incident horizon

We are living through a golden age of generative AI: a time when progress is both breathtaking and somehow still unable to reliably open a PDF without hallucinating the author’s middle name.
[…]

Read the complete text at www.spinellis.gr/blog/2026030...

1 month ago 3 0 0 1

The video of my #FOSDEM 2026 talk on integrating the Fourth Research Edition Unix tape into the GitHub Unix history repository is now available online: ftp.belnet.be/mirror/FOSDE...

2 months ago 9 3 0 0
The discovered Fourth Edition tape (photo credit Rob Ricci)


Several news outlets reported the amazing discovery of a 1970s Fourth Edition Research Unix magnetic tape. I incorporated the tape’s source code into the Unix History Repository hosted on GitHub and studied the code’s composition. Here's what I found.
www.spinellis.gr/blog/2025122...

3 months ago 18 8 0 0

Now I did: literally LoL!

4 months ago 2 0 0 0

The LaTeX Korrektor: A funny video series addressing serious LaTeX style infractions. Andreas Zeller (a German), like Dr. Strangelove, convincingly fakes familiar stereotypes. (He also suggests compounding German words as an alternative to the em-dash.)

4 months ago 3 0 1 0

I also get substantial benefit from it, so I can't say it is. I think it's more akin to nuclear energy: big potential, big risks.

5 months ago 4 0 0 0
PR diff


I just received an AI-generated GitHub pull request that incorrectly replaced 450 lines of test code with a comment. Be VERY CAREFUL when reviewing out-of-the-blue PRs these days. AI-generated ones are likely to contain difficult-to-spot faults disguised as plausible fixes.

github.com/dspinellis/g...

5 months ago 11 3 0 1

The deadline for the Personal Number issuance process expires on 5 November. With the Personal Number letter generator you can choose a good combination of letters for your own Personal Number yourself. You will find it at www.balab.aueb.gr/genpa/

5 months ago 3 4 0 0

I just added a note to the GitHub repository of advice for writing LaTeX documents: avoid the “itemize” environment when you're describing elements. The “description” environment is better for this purpose.

github.com/dspinellis/l...

5 months ago 4 1 0 0
How to spot fake scientists and stop them from publishing papers Journals are considering doing identity checks to expose fake authors — but there are downsides.

A feature in Nature journal published today takes us through the fascinating and frightening world of fake authors, reviewers, and articles. It mentions my recent work on fake articles attributed to known scholars.

www.nature.com/articles/d41...

5 months ago 8 2 0 0
Process tree output.


Fun, fun, fun! I'm running a binary tournament merge to combine 64 SQLite databases (83 GB in total) into one. More details in doi.org/10.1109/MS.2...
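
For readers unfamiliar with the term, a binary tournament merge combines the inputs pairwise, round by round, until one database remains; 64 inputs need six rounds. The post does not show the actual commands, so the following is only a minimal sketch under stated assumptions: every database holds a table with the same schema, the table name comes from a trusted caller, and merging means concatenating rows. The function names `merge_pair` and `tournament_merge` are hypothetical.

```python
import sqlite3
from pathlib import Path


def merge_pair(left: Path, right: Path, out: Path, table: str) -> None:
    """Create `out` holding the rows of `table` from both input databases."""
    con = sqlite3.connect(out)
    try:
        con.execute("ATTACH DATABASE ? AS l", (str(left),))
        con.execute("ATTACH DATABASE ? AS r", (str(right),))
        # Assumption: `table` is a trusted identifier, not user input.
        con.execute(f"CREATE TABLE main.{table} AS SELECT * FROM l.{table}")
        con.execute(f"INSERT INTO main.{table} SELECT * FROM r.{table}")
        con.commit()
    finally:
        con.close()


def tournament_merge(dbs: list[Path], workdir: Path, table: str) -> Path:
    """Merge databases pairwise in rounds and return the final database path."""
    if not dbs:
        raise ValueError("no databases to merge")
    round_no = 0
    while len(dbs) > 1:
        winners = []
        # Pair up neighbours: (0, 1), (2, 3), ...
        for i in range(0, len(dbs) - 1, 2):
            out = workdir / f"round{round_no}_{i // 2}.db"
            merge_pair(dbs[i], dbs[i + 1], out, table)
            winners.append(out)
        if len(dbs) % 2:
            winners.append(dbs[-1])  # odd one out advances unmerged
        dbs = winners
        round_no += 1
    return dbs[0]
```

The sketch runs the merges one after another. The pairs within a round are independent of each other, so they could also be handed to separate processes.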

6 months ago 38 3 1 1

A big problem with reviewing GenAI-written text is that it is often (by design) very plausible, yet it can also be incorrect or superfluous, so detecting its faults requires substantial concentrated effort.

6 months ago 8 2 1 0

Real backups must be automatically maintained as multiple encrypted, integrity-checked, tested, monitored, and documented historical copies, on different media, offsite, and offline or immutable.

6 months ago 1 2 0 0

RAID, snapshots, Apple Time Machine, Windows Volume Shadow, cloud storage, availability zone mirroring, database replication, Git/GitHub, DRBD. These are all useful, but NOT true backups.

6 months ago 1 2 1 0
NIRS fire destroys government's cloud storage system, no backups available A fire at the National Information Resources Service (NIRS) Daejeon headquarters destroyed the government’s G-Drive cloud storage system, erasing work files saved individually by some 750,000 civil se...

Following the permanent loss of 858 TB of data due to a fire in a South Korean government data center, here's a reminder of the several technologies that are NOT backup solutions:

koreajoongangdaily.joins.com/news/2025-10...

6 months ago 7 2 1 0

Saw the following hidden deep inside the Linux kernel arm compilation commands. I don't think that the arguments were arbitrarily ordered 😃

ar mPiT init/main.o vmlinux.a arch/arm/kernel/head.o

6 months ago 5 2 0 0

Why I Choose Email Over Messaging

My colleagues and friends know that I prefer to communicate with them via email rather than chat messaging. There are many benefits in such a choice. You may want to consider them and adopt the same stance.

www.spinellis.gr/blog/2025092...

6 months ago 9 4 0 1

The proposed changes are well reasoned and documented. More thought could have been given to asking for shorter proposals. A good proposal takes months to write, which is time not devoted to research.

6 months ago 1 0 0 0
    Checking uu_sed v0.0.1 (/xfs/dds/sed/src/uu/sed)
error: manual implementation of `.is_multiple_of()`
   --> src/uu/sed/src/fast_io.rs:869:8
    |
869 |     if (out_off + head_align) % block_size != 0 {
    |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: replace with: `!(out_off + head_align).is_multiple_of(block_size)`
    |
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_is_multiple_of
    = note: `-D clippy::manual-is-multiple-of` implied by `-D warnings`
    = help: to override `-D warnings` add `#[allow(clippy::manual_is_multiple_of)]`


With all the GenAI buzz we tend to forget the value of 100% correct deterministic tools. Thank you Rust clippy!

7 months ago 13 4 0 0

TIL that setting Turnitin checks to *include* bibliographic references (a check typically disabled, as it generates false positive matches) can help identify hallucinated generative AI references (as these are less likely to appear elsewhere).

7 months ago 5 1 0 0

A special place in Hell is reserved for those who create custom editing areas in their programs, where the operating system's keyboard editing bindings, copy, paste, and our macros stop working.

7 months ago 6 1 0 0
Google Scholar Is Doomed Academia built entire careers on a free Google service with zero guarantees. What could go wrong?

Since search is dead, how soon do you think Google Scholar is headed for the Google Graveyard? I'm betting it's soon, and academia is NOT prepared

8 months ago 939 406 54 132
Logos of Crossref, DataCite, ORCID, ROR, PubMed, USPTO


Google Scholar's lack of an API hinders automation and scientometric studies. In common with all proprietary academic databases, searches are opaque and non-replicable. With the Alexandria3k Python package you can run sophisticated queries on your laptop. dspinellis.github.io/alexandria3k/

7 months ago 12 3 0 0
That message means your script `fetch-uspto.sh` is expecting an environment variable named **`MYODP_KEY`**, which hasn’t been set in your shell.

This usually happens when a script needs an API key to authenticate against the USPTO Open Data Portal (ODP).

To fix it, you need to:

1. **Obtain the API key**

   * Go to the [USPTO Open Data Portal](https://developer.uspto.gov/data/bulk-search) and sign in (or register).
   * Generate an API key.

2. **Set the environment variable** before running the script:

   ```bash
   export MYODP_KEY="your_api_key_here"
   ./fetch-uspto.sh
   ```

   Or prepend it inline:

   ```bash
   MYODP_KEY="your_api_key_here" ./fetch-uspto.sh
   ```

3. If you want it to persist across sessions, add the line to your shell config (e.g., `~/.bashrc`, `~/.zshrc`):

   ```bash
   export MYODP_KEY="your_api_key_here"
   ```

---

Do you want me to show you how to modify `fetch-uspto.sh` so that it gives clearer instructions (like printing how to set the variable), instead of just failing?


That's a first! As I write a program's error message, I catch myself wondering whether GenAI will be able to understand it and produce a useful suggestion. (It does.)
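
One way to write such a message is to name the missing setting, say what it is for, and show how to supply it. The post does not show the script's actual message or code, and the original is a shell script, so this Python version is only an illustrative sketch. The variable name `MYODP_KEY` comes from the screenshot; `require_env` and `MissingSettingError` are hypothetical names.

```python
import os


class MissingSettingError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def require_env(name, purpose, environ=os.environ):
    """Return the value of `name`, or fail with an actionable message."""
    value = environ.get(name)
    if value:
        return value
    # Say what is missing, why it is needed, and how to fix it.
    raise MissingSettingError(
        f"Environment variable {name} is not set; it is needed {purpose}.\n"
        f'Set it before running, for example: export {name}="your_api_key_here"'
    )
```

A caller might use it as `key = require_env("MYODP_KEY", "to query the USPTO Open Data Portal")` and let the exception text reach the user unchanged.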

7 months ago 3 0 0 0
— In UML create a comment note.

— In UML diagrams, a **comment** (or “note”) is shown as a rectangle with a dog-eared corner.
Graphviz can mimic this by using `shape=note`.

Example in DOT:

```dot
digraph UML_Comment {
    rankdir=LR;
    node [fontname="Arial", fontsize=12];

    // Regular UML component
    A [label="«component»\nOrder Service", shape=component];

    // UML comment
    Cmt [label="Processes orders\nand handles payments", shape=note, style=filled, fillcolor="#ffffcc"];

    // Dashed association from comment to element
    Cmt -> A [style=dashed, arrowhead=none];
}
```

**Key points:**

* `shape=note` → dog-eared corner rectangle.
* `fillcolor` → makes the note background match UML style (light yellow is common).
* `style=dashed, arrowhead=none` → matches UML’s “annotates” relationship from a comment to a model element.

---

If you want, I can insert UML comment nodes into your **O1/O2 Graphviz diagram** to annotate entire clusters or specific tasks, so they look like sticky notes connected to the relevant part. Would you like me to do that?


Many years ago I contributed code to #GraphViz for several new node shapes, including UML notes. Now #ChatGPT advises me how to use them! 😃

8 months ago 5 0 0 0

This kind of thing looks like an opportunity for a new generation of developers to make their mark in open source

8 months ago 5 1 0 0
GitHub - uutils/sed: Rewrite of sed in Rust. Contribute to uutils/sed development by creating an account on GitHub.

What are the challenges and gains when porting legacy systems software written in C to #Rust? Here's my own experience porting the BSD Unix / macOS stream editor sed to Rust as part of #uutils.
IEEE Software article: doi.org/10.1109/MS.2...
Source code: github.com/uutils/sed/

8 months ago 17 6 0 1