This is really beautiful content. I’m assuming it comes from the fact that there are Google teams tasked with digitizing old manuscripts?
I work with a library (Biblioteca Philosophica Hermetica) in Amsterdam that has thousands of manuscripts from the renaissance to the early modern period… all very esoteric. We really want to get the renaissance into model training! Over 75% of books (1450-1700) are unscanned — and the manuscripts are in even worse shape.
Curious if anyone knows if there are any new handwriting recognition benchmarks? I’ve noticed the main model providers have plateaued in the past year on their ability to read manuscripts / modern handwriting… I think the lack of well-designed competitive benchmarks is the issue…
I love positive examples of the intersection of AI and the humanities.
Seeing the mention of Round Hand reminded me of Gilbert & Sullivan’s description in H.M.S. Pinafore of the importance of handwriting if, like Sir Joseph Porter K.C.B., you want to rise to the top of the tree:
As office boy I made such a mark
That they gave me the post of a junior clerk.
I served the writs with a smile so bland,
And I copied all the letters in a big round hand—
I copied all the letters in a hand so free,
That now I am the Ruler of the Queen’s Navee!
Browsing this led me to wonder if there is a font available for the Carolingian Minuscule style. Found "Dr. Pfeffer's Fonts", apparently free to download. Wasn't disappointed. I might try using one as a coding font for that real meditative "monastic scribe" kind of vibe...
https://robert-pfeffer.net/schriftarten/englisch/index.html?...
It should've included another 100 years to catalogue and showcase the devolvement towards kindergarten handwriting.
I always like throwing these old handwritten documents into LLMs to see how well they do. GPT-5 did nicely on the quitclaim deed in Anglicana, which is very hard to read.
For me it's in Dutch, so it must detect my language and adapt (since HN blocks non-English content)?
Presumably they are doing that based on browser or OS language settings rather than IP-based location? As an English-speaker living in a German-speaking land nothing infuriates me more than websites that assume I would rather have German language content simply based on my IP address rather than checking my system language.
But yes, to confirm your assumption - I followed the link above and got the English version.
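For anyone curious what "checking the system language" looks like in practice: browsers send an `Accept-Language` header with the user's preferred languages and optional `q` weights. Below is a minimal sketch of choosing a content language from that header instead of from the IP address. The function name `pick_language` and the list of supported languages are hypothetical, and this says nothing about how the site in question actually decides.

```python
def pick_language(accept_language: str, supported: list[str], default: str = "en") -> str:
    """Pick the best supported language from an Accept-Language header value.

    `supported` is assumed to hold lowercase primary tags such as "en" or "de".
    A wildcard entry ("*") is not treated specially and simply never matches.
    """
    candidates = []
    for position, part in enumerate(accept_language.split(",")):
        piece = part.strip()
        if not piece:
            continue
        # Split "de;q=0.5" into the tag and its optional quality parameter.
        tag, _, params = piece.partition(";")
        tag = tag.strip().lower()
        params = params.strip()
        quality = 1.0  # a missing q value means full preference
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # malformed weight: ignore this entry
        if tag and quality > 0:
            # Sort by descending quality, then by original header order.
            candidates.append((-quality, position, tag))

    for _, _, tag in sorted(candidates):
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # "en-gb" falls back to "en"
        if primary in supported:
            return primary
    return default


# An English speaker browsing from a German IP address still gets English.
print(pick_language("en-GB,en;q=0.9,de;q=0.5", ["de", "en", "nl"]))
```

The point of the sketch is only that the header already carries the user's stated preference, so there is no need to guess from geography.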
Slightly off-topic:
1. When I loaded the page, it bombarded me with a banner asking me, "Interested in sports?" (Yes, I am, but I came here to read about English handwriting. Go away.)
2. At the end, it presented me with a "badge" for finishing a whole "book"! Yeah, maybe people's attention spans would be better if they weren't bombarded with little banners at the beginning.
I did not get either banner looking at the page using Firefox Focus on mobile, but this parallax scrolling experience used on this site (and many others) that today's webdevs seem hot to trot over is abysmal. I'm on an 8" tablet and still have to find the sweet spot where the text does not obscure the image, so my eyes are unable to quickly dart between both as I read. It's absurd that people think this is a good way to present information. Another commenter replying to you mentioned why not just ditch the JS, and for a site like this I say, right, why not? I'd much rather have the static text and image positions afforded by simple and elegant HTML and CSS as opposed to this finicky BS that was clearly tested on maybe one viewport and called good.
I'm torn on point 2... on the one hand, I'm insulted to be "rewarded" for my tenacity in managing to "read" what is basically a very short article with a lot of pictures. On the other hand, I'm working daily with people who refuse to read messages on Teams chat if they are longer than five words and respond either with an answer that proves they didn't make it past the first sentence or with the even worse "quick call?" So perhaps anything that rewards primary-school-level literacy among adults is a positive step after all.
And when I visited the page with JavaScript disabled, it displayed all the text but no images.
HTML has the img tag. There’s no need for JavaScript to add images to the DOM!
Sorry. That was an oversight. Now the text is also generated by JavaScript manipulating the DOM tree.
/S