Zae Myung Kim

University of Minnesota Twin Cities, Minneapolis, Minnesota, USA


200 Union St SE

Minneapolis, MN 55455

I am currently a third-year Ph.D. candidate in the Minnesota NLP group, led by Prof. Dongyeop Kang, at the University of Minnesota Twin Cities.

My research interests center on two key areas. First, I focus on enhancing the representation of (very) long textual contexts (e.g., sequences of documents) by utilizing long-term hierarchical structures identified in linguistics, such as discourse frameworks and linguistic moves. This endeavor aims to delve deeper than the surface level of text, akin to “reading between the lines”: uncovering the pragmatic relations and long-term motivations underlying the authoring of sentences and paragraphs. In doing so, I seek to enhance our (and language models’) understanding and processing of complex textual materials, revealing the intricate patterns and motivations that guide the construction of extended narratives.
Second, I aim to develop language models capable of generating content guided by these long-term structures, which effectively act as a “plan for writing.” This approach aims to improve thematic cohesion, controllability, and efficiency (via “caching” of these structures) when producing text across extensive contexts.

The overarching goal of my research is to advance the capabilities of language models in understanding and generating extended texts by employing the nuanced, long-term hierarchical structures found in linguistics. This ambition is particularly significant given that the linguistic framework for analyzing the “general” hierarchical structures of lengthy texts is complex and, as a result, remains underexplored. My research endeavors to fill this gap by offering a computational approach that enables the analysis and utilization of these intricate structures.

Regarding modeling and computational strategies, my interest lies in the application of (hyper)graph analyses, hierarchical generation techniques, and the alignment of language models with human or linguistic feedback. This approach encompasses exploring the potential of hypergraphs for capturing complex relationships within texts, employing hierarchical structures to guide the generation process, and refining language models through feedback that aligns with (long-term) human linguistic intuition.

Before joining the Ph.D. program, I worked as a researcher at NAVER LABS Europe and on the Papago team at NAVER Korea, where I researched various topics in neural machine translation (NMT), such as the analysis of language-pair-specific multilingual representations, document-level NMT with discourse information, cross-attention-based website translation, and quality estimation for evaluating NMT models.

I received a B.Eng. degree in Computer Science from Imperial College London in 2011. From 2012 to 2013, I served in the Republic of Korea Army Special Forces as an army interpreter and geospatial image analyst. In 2016, I received an M.S. degree in Computer Science from the Korea Advanced Institute of Science and Technology (KAIST).

For the Spring 2024 semester, I am actively involved in organizing the meetings and seminars for Textgroup. Additionally, since October 2020, I have been leading Seeking-SOTA, a deep learning study group that convenes weekly. This group brings together researchers, academics, and professionals in South Korea, all dedicated to staying at the cutting edge of the field of deep learning.

My Ph.D. program is generously supported by 3M Science and Technology Fellowships.


Jun 15, 2024 Our research on differentiating between machine-generated and human-authored texts through the analysis of “discourse motifs” has been accepted to appear at ACL 2024.
Jun 13, 2024 I am thrilled to be interning at Amazon AGI this summer, where I will be working on model alignment with discourse signals.
Sep 25, 2023 I completed my summer internship at Salesforce, where I developed a challenging benchmark dataset for the paper citation task for large language models.
Oct 06, 2022 My summer internship work at Grammarly on improving the iterative text revision task has been accepted to appear at EMNLP 2022.
May 26, 2022 Our system demonstration for interactive and iterative text revision was presented at the In2Writing workshop at ACL 2022 and received the Best Paper Award 🎉.