I am a fourth-year Ph.D. student in the Johns Hopkins Computer Science department, affiliated with the Center for Language and Speech Processing, where I am co-advised by Jason "The Adveisner" Eisner and David Yarowsky, which means my academic lineage forms a DAG rather than a tree (see here). I specialize in Natural Language Processing, Computational Linguistics and Machine Learning, focusing on statistical approaches to phonology, morphology and low-resource languages. Previously, I was a visiting Ph.D. student at the Center for Information and Language Processing at Ludwig-Maximilians-Universität München, supported by a Fulbright Fellowship and a DAAD Research Grant, under the supervision of Hinrich Schütze, with whom I still actively collaborate. These days, I also collaborate a lot with Tim Vieira. Since Fall 2016, I have been supported by an NDSEG graduate fellowship. On a lighter note, I'm a big fan of the passive voice. Outside of the university, I spend a lot of time reading modern German-language literature; Thomas Mann, Hermann Hesse and Max Frisch are favorites.
I am co-organizing the CoNLL 2017 shared task on morphological reinflection with Christo Kirov, John Sylak-Glassman, Ekaterina Vylomova, Patrick Xia, Manaal Faruqui, Géraldine Walther, Sandra Kübler, David Yarowsky, Jason Eisner and Måns Huldén. This is the first major shared task on morphology in NLP and will likely feature data in 50 unique languages! Check out a previous version of the shared task here, which was hosted in conjunction with SIGMORPHON.
Details: Full CV, Google Scholar, Semantic Scholar, Twitter
Where Am I?
- 2017/03: Received Outstanding Paper Award at EACL 2017
- 2017/02: Giving a talk at Universität Heidelberg hosted by Stefan Riezler
- 2017/01: Attending the workshop "From Characters to Understanding Natural Language" at Schloss Dagstuhl in Germany
- 2016/11: Giving a talk at EMNLP 2016 in Austin, Texas about tree-structured models for morphology
- 2016/10: Giving a talk at the University of Alberta hosted by Greg Kondrak
- 2016/09: Visiting Google Research NYC to discuss LSTM and finite-state transducer mashups
- 2016/08: Giving a talk at SIGMORPHON 2016 in Berlin about the SIGMORPHON shared task on morphological reinflection
- 2016/08: Giving a talk at ACL 2016 in Berlin about my paper with Hinrich Schütze and Jason Eisner, "Morphological Smoothing and Extrapolation of Word Embeddings"
- 2016/07: Giving an invited talk at Universität Tübingen hosted by Gerhard Jäger
- 2016/06: Giving a talk at NAACL 2016 in San Diego on my paper with Tim Vieira and Hinrich Schütze, "A Joint Model of Orthography and Morphological Segmentation"
My research interests lie in statistical approaches to phonology and morphology. I am a staunch empiricist and believe in modeling linguistic data as they are, in an atheoretic manner. My current research involves building graphical models over strings using weighted finite-state transducers to infer underlying phonological forms. I also work on unsupervised morphology induction using very large corpora, focusing on heavily inflected languages (fusional and agglutinative). Beyond these two projects, I am interested in experimental phonology and modeling experimental data. In the past, I worked with Chris Callison-Burch on using crowdsourcing to improve Arabic dialect identification, and under Ben Van Durme on multilingual named entity recognition for social media text during a SCALE summer workshop.
Christo Kirov, John Sylak-Glassman, Rebecca Knowles, Ryan Cotterell and Matt Post. A Rich Morphological Tagger for English: Exploring the Cross-Linguistic Tradeoff Between Morphology and Syntax. EACL. 2017. pdf
Chandler May, Ryan Cotterell and Benjamin Van Durme. Analysis of Morphology in Topic Modeling. arXiv preprint. 2016. arXiv
Gaurav Kumar, Yuan Cao, Ryan Cotterell, Chris Callison-Burch, Daniel Povey and Sanjeev Khudanpur. Translation of the CALLHOME Egyptian Arabic Corpus For Conversational Speech Translation. IWSLT. 2014. pdf
Ryan Cotterell, Adithya Renduchintala, Naomi Saphra, and Chris Callison-Burch. An Algerian Arabic-French Code-Switched Corpus. LREC Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools. 2014. pdf data
David Etter, Francis Ferraro, Ryan Cotterell, Olivia Buzek, and Benjamin Van Durme. Nerit: Named Entity Recognition for Informal Text. Technical Report 11. HLTCOE, Johns Hopkins University. July, 2013. pdf
- Automata and Computation Theory (600.271) - Teaching Assistant - Spring 2014
- Natural Language Processing (600.465) - Teaching Assistant - Fall 2013