Welcome

I am a fourth-year Ph.D. student in the Johns Hopkins Computer Science department affiliated with the Center for Language and Speech Processing, where I am co-advised by Jason "The Adveisner" Eisner and David Yarowsky, which means my academic lineage forms a DAG rather than a tree (see here). I specialize in Natural Language Processing, Computational Linguistics and Machine Learning, focusing on deep learning and statistical approaches to phonology, morphology, linguistic typology and low-resource languages. I have won best paper awards at ACL 2017 and EACL 2017 after having twice runnered-up (EMNLP 2015, NAACL 2016). Previously, I was a visiting Ph.D. student at the Center for Information and Language Processing at Ludwig-Maximilians-Universität München, supported by a Fulbright Fellowship and a DAAD Research Grant, under the supervision of Hinrich Schütze, with whom I still actively collaborate. These days, I also collaborate a lot with Tim Vieira. Since Fall 2016, I have been supported by an NDSEG graduate fellowship. On a lighter note, I'm a big fan of the passive voice and I insist forewent is a valid inflection of forego. Outside of the university, I spend a lot of time reading modern German-language literature; Thomas Mann, Hermann Hesse and Max Frisch are favorites.

I am co-organizing the CoNLL 2017 shared task on morphological reinflection with Christo Kirov, John Sylak-Glassman, Ekaterina Vylomova, Patrick Xia, Manaal Faruqui, Géraldine Walther, Sandra Kübler, David Yarowsky, Jason Eisner and Måns Huldén. This is the first major shared task on morphology in NLP and will likely feature data in 50 unique languages! Check out a previous version of the shared task here, which was hosted in conjunction with SIGMORPHON.

Details: Full CV, Google Scholar, Semantic Scholar, Twitter

News

  • 2017/06: Received Outstanding Paper Award at ACL 2017
  • 2017/06: Starting an internship at Google Research, NYC
  • 2017/05: Giving a talk at Universität Heidelberg hosted by Stefan Riezler
  • 2017/03: Received Outstanding Paper Award at EACL 2017
  • 2017/01: Attending the workshop "From Characters to Understanding Natural Language" at Schloss Dagstuhl in Germany
  • 2016/11: Giving a talk at EMNLP 2016 in Austin, Texas about tree-structured models for morphology
  • 2016/10: Giving a talk at the University of Alberta hosted by Greg Kondrak
  • 2016/09: Visiting Google Research NYC to discuss LSTM and finite-state transducer mashups
  • 2016/08: Giving a talk at SIGMORPHON 2016 in Berlin about the SIGMORPHON shared task on morphological reinflection
  • 2016/08: Giving a talk at ACL 2016 in Berlin about my paper with Hinrich Schütze and Jason Eisner, "Morphological Smoothing and Extrapolation of Word Embeddings"
  • 2016/07: Giving an invited talk at Universität Tübingen hosted by Gerhard Jäger
  • 2016/06: Giving a talk at NAACL 2016 in San Diego on my paper with Tim Vieira and Hinrich Schütze, "A Joint Model of Orthography and Morphological Segmentation"

Research

My research interests lie in statistical approaches to phonology and morphology. I am a staunch empiricist and believe in modeling linguistic data as they are, in an atheoretic manner. My current research involves building graphical models over strings using weighted finite-state transducers to infer underlying phonological forms. I also work on unsupervised morphology induction using very large corpora, focusing on heavily inflected languages (fusional and agglutinative). Beyond these two projects, I am interested in experimental phonology and modeling experimental data. In the past, I worked with Chris Callison-Burch on using crowdsourcing to improve Arabic dialect identification and, during a SCALE summer workshop, under Ben Van Durme on multilingual named entity recognition for social media text.

Selected Publications

Ryan Cotterell and Georg Heigold. Cross-lingual, Character-Level Neural Morphological Tagging. EMNLP. 2017.

Ryan Cotterell, Ekaterina Vylomova, Huda Khayrallah, Christo Kirov and David Yarowsky. Paradigm Completion for Derivational Morphology. EMNLP. 2017. pdf data

Ryan Cotterell and Jason Eisner. Probabilistic Typology: Deep Generative Models of Vowel Inventories. ACL. 2017. pdf arXiv (Outstanding Paper)

Katharina Kann, Ryan Cotterell and Hinrich Schütze. One-Shot Neural Cross-Lingual Transfer for Paradigm Completion. ACL. 2017. arXiv

Ryan Cotterell, Christo Kirov, John Sylak-Glassman, Géraldine Walther, Ekaterina Vylomova, Patrick Xia, Manaal Faruqui, Sandra Kübler, David Yarowsky, Jason Eisner, and Mans Hulden. CoNLL-SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection in 52 Languages. CoNLL. 2017. arXiv

Ryan Cotterell and Hinrich Schütze. Joint Semantic Synthesis and Morphological Analysis of the Derived Word. TACL. 2017. arXiv

Francis Ferraro, Adam Poliak, Ryan Cotterell and Benjamin Van Durme. Frame-Based Continuous Lexical Semantics through Exponential Family Tensor Factorization and Semantic Proto-Roles. *SEM. 2017. pdf arXiv

Christo Kirov, John Sylak-Glassman, Rebecca Knowles, Ryan Cotterell and Matt Post. A Rich Morphological Tagger for English: Exploring the Cross-Linguistic Tradeoff Between Morphology and Syntax. EACL. 2017. pdf slides

Ryan Cotterell, John Sylak-Glassman and Christo Kirov. Neural Graphical Models over Strings for Principal Parts Morphological Paradigm Completion. EACL. 2017. pdf slides (Outstanding Paper)

Ryan Cotterell, Adam Poliak, Ben Van Durme and Jason Eisner. Explaining and Generalizing Skip-Gram through Exponential Family Principal Component Analysis. EACL. 2017. pdf code

Ekaterina Vylomova, Ryan Cotterell, Timothy Baldwin and Trevor Cohn. Context-Aware Prediction of Derivational Word-forms. EACL. 2017. pdf arXiv slides code

Arun Kumar, Ryan Cotterell, Lluís Padró and Antoni Oliver. Morphological Analysis of the Dravidian Language Family. EACL. 2017. pdf data

Katharina Kann, Ryan Cotterell and Hinrich Schütze. Neural Multi-Source Morphological Reinflection. EACL. 2017. pdf arXiv code

Chandler May, Ryan Cotterell and Benjamin Van Durme. Analysis of Morphology in Topic Modeling. arXiv preprint. 2016. arXiv

Ryan Cotterell, Arun Kumar and Hinrich Schütze. Morphological Segmentation Inside-Out. EMNLP. 2016. pdf treebank data code

Katharina Kann, Ryan Cotterell and Hinrich Schütze. Neural Morphological Analysis: Encoding-Decoding Canonical Segments. EMNLP. 2016. pdf code

Tim Vieira*, Ryan Cotterell* and Jason Eisner. Speed-Accuracy Tradeoffs in Tagging with Variable-Order CRFs and Structured Sparsity. EMNLP. 2016. pdf code

Ryan Cotterell, Hinrich Schütze and Jason Eisner. Morphological Smoothing and Extrapolation of Word Embeddings. ACL. 2016. pdf slides

Ryan Cotterell, Christo Kirov, John Sylak-Glassman, David Yarowsky, Jason Eisner and Mans Hulden. The SIGMORPHON 2016 Shared Task—Morphological Reinflection. SIGMORPHON. 2016. pdf slides www

Ryan Cotterell, Tim Vieira and Hinrich Schütze. A Joint Model of Orthography and Morphological Segmentation. NAACL. 2016. pdf slides data code (Runner-up for Best Paper)

Pushpendre Rastogi, Ryan Cotterell and Jason Eisner. Weighting Finite-State Transductions With Neural Context. NAACL. 2016. pdf slides code

John Sylak-Glassman and Ryan Cotterell. Contrastive Morphological Typology and Logical Hierarchies. Chicago Linguistic Society. 2016. pdf slides

Nanyun Peng, Ryan Cotterell and Jason Eisner. Dual Decomposition for Graphical Models over Strings. EMNLP. 2015. pdf slides

Thomas Müller, Ryan Cotterell, Alexander Fraser and Hinrich Schütze. Joint Lemmatization and Morphological Tagging with Lemming. EMNLP. 2015. pdf code (Honorable Mention for Best Paper)

Ryan Cotterell, Thomas Müller, Alexander Fraser and Hinrich Schütze. Labeled Morphological Segmentation with Semi-Markov Models. CoNLL. 2015. pdf code

Ryan Cotterell, Nanyun Peng, and Jason Eisner. Modeling Word Forms Using Latent Underlying Morphs and Phonology. TACL. 2015. pdf slides data

Ryan Cotterell and Jason Eisner. Penalized Expectation Propagation for Graphical Models over Strings. NAACL. 2015. pdf slides

Ryan Cotterell and Hinrich Schütze. Morphological Word Embeddings. NAACL. 2015. pdf code

Gaurav Kumar, Yuan Cao, Ryan Cotterell, Chris Callison-Burch, Daniel Povey and Sanjeev Khudanpur. Translation of the CALLHOME Egyptian Arabic Corpus For Conversational Speech Translation. IWSLT. 2014. pdf

Ryan Cotterell, Nanyun Peng, and Jason Eisner. Stochastic Contextual Edit Distance and Probabilistic FSTs. ACL. 2014. pdf code

Ryan Cotterell and Chris Callison-Burch. A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic. LREC. 2014. pdf data

Ryan Cotterell, Adithya Renduchintala, Naomi Saphra, and Chris Callison-Burch. An Algerian Arabic-French Code-Switched Corpus. LREC Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools. 2014. pdf data

David Etter, Francis Ferraro, Ryan Cotterell, Olivia Buzek, and Benjamin Van Durme. Nerit: Named Entity Recognition for Informal Text. Technical Report 11. HLTCOE, Johns Hopkins University. July, 2013. pdf

Teaching

Education

Johns Hopkins University

Ph.D. in Computer Science
Advisors: Jason Eisner and David Yarowsky

Bachelor of Arts in Cognitive Science
Minor: Linguistics
Advisor: Colin Wilson
May 2013


Last change: July 3, 2017