AI Modernization Powers OldNYC Expansion: 10,000 New Historic Photos Added Through GPT and OpenStreetMap
Key Takeaways
- OpenAI's GPT-4o and GPT-4o-mini enabled the geolocation of 6,000 additional historical photos by extracting location details from image descriptions with semantic understanding
- OCR coverage improved by 28% (25,000 to 32,000 images), with GPT outperforming the custom legacy OCR system in 75% of comparisons
- Integration of OpenStreetMap and historical street datasets increased geolocation accuracy for mapped images to 96%, addressing the limitations of modern geocoding services on defunct street intersections
Summary
OldNYC, a historical photo archive and map of New York City, has expanded from 39,000 to 49,000 photographs through a major 2024 rebuild leveraging modern AI tools and open-source technologies. The expansion was driven by three key improvements: better geolocation using OpenAI's GPT-4o to extract location details from photo descriptions, dramatically improved optical character recognition (OCR) using GPT-4o-mini to transcribe historical catalog text, and a switch from Google Maps to OpenStreetMap for more accurate historical street data. These enhancements raised geolocation accuracy to 87% for photos with usable location data (96% for the images that were ultimately mapped), while OCR coverage grew from 25,000 to 32,000 images, with GPT outperforming the previous custom pipeline in approximately 75% of cases.
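The switch away from a modern geocoder matters because many intersections in the collection no longer exist. The idea can be sketched as a lookup against a local table of historical intersections rather than a live geocoding call; the dataset, street names, and coordinates below are illustrative placeholders, not OldNYC's actual data model:

```python
# Hypothetical sketch: resolve an intersection against a local historical
# street dataset (e.g. derived from NYPL records) instead of a modern
# geocoder, which fails on streets that no longer exist.

HISTORICAL_INTERSECTIONS = {
    # (street_a, street_b) -> (lat, lng); both orders are tried at lookup time.
    ("north 6th street", "bedford avenue"): (40.7172, -73.9565),
    ("13th avenue", "gansevoort street"): (40.7399, -74.0097),  # defunct street
}

def normalize(street: str) -> str:
    """Lowercase and strip whitespace so lookups are case-insensitive."""
    return street.strip().lower()

def locate(street_a: str, street_b: str):
    """Return (lat, lng) for an intersection, or None if it is unknown."""
    a, b = normalize(street_a), normalize(street_b)
    return (HISTORICAL_INTERSECTIONS.get((a, b))
            or HISTORICAL_INTERSECTIONS.get((b, a)))

print(locate("Gansevoort Street", "13th Avenue"))  # -> (40.7399, -74.0097)
```

A real pipeline would fall back to a modern geocoder only when the historical table has no match, which is one plausible way to reach high accuracy on defunct streets.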
The project demonstrates how large language models can solve complex historical digitization challenges that require semantic understanding of context. GPT's ability to interpret ambiguous historical descriptions—such as understanding "North 6th" as "North 6th Street" and extracting relevant intersections while ignoring irrelevant details—enabled the automated geolocation of approximately 6,000 additional photos. The integration of historical street datasets from the New York Public Library further improved accuracy by correcting modern geocoding errors on streets that no longer exist in their historical configurations.
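The extraction step described above can be sketched as a prompt asking the model for normalized cross streets plus a parser for its reply. The prompt wording, JSON schema, and function names here are assumptions for illustration, not the project's actual implementation:

```python
import json

def build_prompt(description: str) -> str:
    """Build a hypothetical prompt asking the model for cross streets."""
    return (
        "Extract the two cross streets from this historical photo "
        "description. Expand abbreviations (e.g. 'North 6th' -> "
        "'North 6th Street') and ignore irrelevant details. Reply as JSON: "
        '{"street1": ..., "street2": ...}, using null values if no '
        "intersection is mentioned.\n\n"
        f"Description: {description}"
    )

def parse_reply(reply: str):
    """Parse the model's JSON reply into an intersection tuple, if any."""
    data = json.loads(reply)
    if data.get("street1") and data.get("street2"):
        return (data["street1"], data["street2"])
    return None

# Simulated model reply for a description mentioning "North 6th and Bedford Ave."
reply = '{"street1": "North 6th Street", "street2": "Bedford Avenue"}'
print(parse_reply(reply))  # -> ('North 6th Street', 'Bedford Avenue')
```

Asking for a rigid JSON shape keeps the downstream geocoding step simple: descriptions without an intersection parse cleanly to None instead of producing garbage coordinates.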
Editorial Opinion
This project exemplifies how modern generative AI can dramatically improve historical digitization projects that were previously bottlenecked by technical limitations. By combining GPT's sophisticated text understanding with open-source mapping infrastructure, the OldNYC team achieved what custom-built systems couldn't—accurate interpretation of ambiguous historical descriptions and reliable text extraction from degraded archival images. This approach could serve as a model for other digital humanities initiatives seeking to unlock historical collections at scale.