Session 7
Language and LLM Safety
Guest Merve Tekgürler on LLMs for translation, historical, and low-resource languages. Examining the OpenAI Model Spec and DeepSeek censorship.
Large language models are built overwhelmingly on English-language data. When they encounter Arabic, Turkish, Farsi, or Urdu, they do not simply perform worse — they perform differently, importing biases and assumptions baked into their English-dominant training sets. Our guest, Merve Tekgürler from Stanford, brings a unique perspective to this problem. A historian of the Ottoman-Polish borderlands, she is training handwritten text recognition models for eighteenth-century Ottoman Turkish — a language that most commercial AI systems cannot even parse. Her work demonstrates both the promise and the limitations of applying LLMs to non-English, low-resource languages. We pair her visit with readings on how “safety” in LLMs often means safety for some communities and silencing for others — from the OpenAI Model Spec to DeepSeek’s censorship patterns.
Who decides what counts as a “safe” AI response? What is lost when historical and minority languages are excluded from the training data? And what does it mean to build language technology that works for the powerful but fails the vulnerable?