And a side-by-side comparison of how much difference Nastaliq makes to readability! (Article from Jang)
Posts by Osama Khalid
Since I can't force people to adopt better standards, I did the next best thing: a web extension that renders Urdu in Nastaliq so that you can have a consistent, beautiful font across all pages.
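A rough sketch of the idea, not the extension's actual code: one way a tool like this could decide which text to restyle is to check how much of it falls in the Unicode Arabic block, then emit a CSS rule with a Nastaliq font stack. The function names, the 0.5 threshold, the `line-height` value and the choice of fonts in the stack are all assumptions made for illustration, and a real browser extension would do this in JavaScript rather than Python.

```python
# Hypothetical helper names; the Arabic block U+0600..U+06FF covers the
# basic letters used by Urdu (and also Arabic, Persian, etc.).
ARABIC_BLOCK = range(0x0600, 0x0700)


def arabic_script_ratio(text: str) -> float:
    """Return the share of alphabetic characters that are in the Arabic block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    arabic = sum(1 for ch in letters if ord(ch) in ARABIC_BLOCK)
    return arabic / len(letters)


def needs_nastaliq(text: str, threshold: float = 0.5) -> bool:
    """Heuristic: restyle text that is mostly Arabic script (assumed threshold)."""
    return arabic_script_ratio(text) >= threshold


def nastaliq_css(selector: str, fonts=("Noto Nastaliq Urdu", "Gulzar")) -> str:
    """Build a CSS rule that prefers Nastaliq fonts for the given selector."""
    stack = ", ".join(f'"{name}"' for name in fonts)
    return f"{selector} {{ font-family: {stack}, serif; line-height: 2; }}"


if __name__ == "__main__":
    sample = "اردو"
    if needs_nastaliq(sample):
        print(nastaliq_css("p"))
```

Note that this heuristic cannot tell Urdu apart from Arabic or Persian, which share the same Unicode block, so a real tool would likely need a language check on top of it.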
However, because Urdu developers have been slow to adopt standard web practices, a vast number of Urdu websites are rendered in some form of Naskh instead of Nastaliq.
The Urdu on the internet, however, is... not. It's generally written in Naskh, which looks like this (cc: @bbcurdu.bsky.social).
This isn't because fonts aren't available. We have had Urdu fonts since the 80s with Noori Nastaliq. The contemporary font space includes fonts like Mehr, Noto and Gulzar.
I can read Urdu that is on the internet, but the experience is jarring. The Urdu I encounter every day in newspapers, on signs, is written beautifully in Nastaliq:
🚨 I am functionally illiterate when it comes to Urdu on the internet. So I made a couple of web extensions to render Urdu text in Nastaliq and help readability.
For Chrome: chromewebstore.google.com/detail/osama...
and Firefox:
addons.mozilla.org/en-US/firefo...
You can download it for Chrome here: chromewebstore.google.com/detail/osama...
and for Firefox here: addons.mozilla.org/en-US/firefo...
And because I enjoy making videos, I made a short video demoing the functionality and the problem it solves. 3/n
www.youtube.com/watch?v=N2B-...
Problem: Every time I come across a word in my L2 that I don't know, I find myself copy-pasting it into a translator, which breaks my reading flow.
Solution: A tool that instantly translates a selected word into my L1 with a keyboard shortcut and keeps track of all the new words I look up. 2/n
🚨 Weekend Project: I finally finished a small project I've been working on. A web extension to solve my personal language learning problem!
www.youtube.com/shorts/XsAdZ... 1/n
Handwritten notecard. Prompt at the top asks: What object was most devastating for you to lose, and how have you been coping? Answer below: Dentures. Were thrown away and coping isn't somthing I can do. It’s made me feel ugly, unworthy, can’t go get a job with no teeth. So how can get off the streets until another pair can be made?
ProPublica spoke to about 150 people who had lived in homeless encampments when cities cleared them out in “sweeps.”
We distributed notecards so people could tell us about the toll in their own words.
➡️ This is what they wrote: projects.propublica.org/impact-of-ho...
"When the public discourse focuses on whether chatbots have consciousness, we're not talking about documented harms like privacy violations, algorithmic bias... The fake risk of sentient AI provides perfect cover for ignoring real risks that affect real people today"
www.readtpa.com/p/stop-prete...
Swahili’s long history as a ‘language on the road’ has traversed 19th-century trade routes, entered 20th-century classrooms, carried Julius Nyerere on his political ‘safaris’, and forged connections across Africa.
www.historytoday.com/archive/behi...
References
1. ai.meta.com/research/no-...
2. onlinebooksoutlet.com/products/roa...
3. huggingface.co/hexgrad/Koko...
Over the weekend, I digitized and OCRed parts of it and then used Kokoro TTS [3] to generate an audiobook.
This is a proof of concept—not intended to be perfect, but rather an attempt to probe the limits of TTS in Urdu. 3/3
Once upon a time, my friend Abdullah gave me a copy of Roald Dahl's Boy that his mom had translated into Urdu [2]. 2/3
🚨 Weekend project:
Urdu Audiobook Generation:
youtube.com/watch?v=PYrO...
Urdu is generally considered a low-resource language[1], and there isn't enough training data to build high-quality systems.
I wanted to explore how far I can push TTS (text-to-speech) in Urdu. 1/3
letterformarchive.org/news/this-ju...
Chopstick sleeves can be both functional and cultural artifacts.
Since the Chagos Islands are back in the news.
www.bbc.com/news/article...
On the issue with domain names (e.g. .io, .tv, .nu) and digital colonialism
www.wired.com/story/the-di...
Archived: archive.ph/im6HN
Among the many applications of stylometry... weaponizing language (or any technology) as a tool of prosecution always leaves a bad taste in my mouth
www.thedial.world/articles/new...
Executions of the conceivably innocent are no better than human sacrifice
latimes.com/opinion/stor...
"When we kill the guilty, we become needlessly cruel. When we kill the conceivably innocent, we become a mockery of ourselves and our supposed allegiance to justice."
🙋‍♂️ I am finishing my PhD
That's it! I am leaving academia and becoming a YouTuber!
Osama Khalid portrait with text: "FINAL EXAM: Osama Khalid Wednesday, Nov. 20 10:15 CT"
📢 Osama Khalid (osamakhalid.bsky.social) will defend his doctoral thesis
"Charting Sensorial Style: New Frontiers in Linguistic Style Analysis"
tomorrow Wednesday 11/20 at 10:15am in 2520C UCC.
Deets at bit.ly/khalid_11_20
#FinalExam #PhDLife #UIowaGrad24 #AcademicSky #stylometry #linguistics 👨‍🎓 🤞
I believe ں is not just a dotless ن, but as a nasal vowel has an independent existence, and should be treated like a separate letter.
On the other hand, ء is just a diacritic and has no place in the Urdu alphabet!
#linguistics
Did you know ى is more likely to occur at the end of a word than at the start? There are more words that start with م than words that end with it.
Used Python+MATLAB+MSPaint to visualize how the letters in the #Urdu language are distributed within words.
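For anyone curious, the counting step behind a chart like this can be sketched in a few lines of Python. This is only an illustration, not the code used for the original visualization: the function names and the tiny word list are made up, and a real analysis would run over a large Urdu word list or corpus.

```python
from collections import Counter


def positional_counts(words):
    """Count how often each letter appears as the first and last letter of a word."""
    first, last = Counter(), Counter()
    for word in words:
        # Keep only letters so punctuation and diacritics do not skew positions.
        letters = [ch for ch in word if ch.isalpha()]
        if not letters:
            continue
        first[letters[0]] += 1
        last[letters[-1]] += 1
    return first, last


def end_vs_start(letter, first, last):
    """Return (count at word start, count at word end) for one letter."""
    return first[letter], last[letter]


if __name__ == "__main__":
    # Hypothetical sample; replace with a real word list.
    sample_words = ["میں", "کتاب", "مکان", "پانی"]
    first, last = positional_counts(sample_words)
    print("Most common initial letters:", first.most_common(3))
    print("Most common final letters:", last.most_common(3))
```

Extending this to every position within a word, rather than only the first and last, would give the full distribution that a plot can then show.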