Cross-Document Event-Keyed Summarization [arxiv]
William Walden, Pavlo Kuchmiichuk, Alexander Martin, Chihsheng Jin, Angela Cao, Claire Sun, Curisia Allen, Aaron Steven White
In this project, we constructed a cross-document summarization dataset on top of the FAMuS (Frames Across Multiple Sources) dataset. Results from extensive experiments show that cross-document summarization is a non-trivial task for language models, and that smaller models can even outperform larger ones when fine-tuned on the dataset.
Presented at XLLM@ACL2025 and PEER2025 [Slides].
Dynamics in the phonological encoding of bilingual speech production [pdf]
Senior thesis. I used a character-naming paradigm to test whether the phonological mapping status of cognates in Mandarin and Shanghai Dialect affects the speech production process. In addition to the commonly observed cognate facilitation effect, there are two novel findings that have not previously been reported in this field. First, the complexity of the phonological mapping between the two languages interferes with speech production. Second, whether a phoneme is the most frequent match for its counterpart in the latent language also affects speech production. More specifically, if the Mandarin phoneme in the trial is the most frequent match for the Shanghai Dialect phoneme (which is absent from the experiment), the cognate facilitation effect disappears.