GIST

An experiment on improving single- and multi-document summarization

GIST investigated methods for single- and multi-document summarization, comparing abstractive and extractive approaches. The project predated the LLM era, when abstractive summarization models struggled to capture a document’s content faithfully. To ensure transparency and reliability, we therefore opted for extractive summarization. For single-document summarization, a simple heuristic method (Edmundson) worked best; for multi-document summarization, SG-Sum was the highest-performing algorithm.
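To give a feel for why a heuristic like Edmundson's is so transparent, here is a minimal sketch of the idea: each sentence gets a weighted score from cue words, key (frequent) words, title words, and its location in the document, and the top-scoring sentences are returned in their original order. This is not the code used in GIST; the word lists, the equal weights, the stopword list, and the function names are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only; a real system would tune these.
BONUS_WORDS = {"significant", "important", "conclude", "results"}
STIGMA_WORDS = {"maybe", "hardly", "perhaps"}
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
             "we", "that", "for", "on", "it", "this", "was"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def edmundson_summary(text, title, n_sentences=2, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return the n_sentences highest-scoring sentences, in document order."""
    w_cue, w_key, w_title, w_loc = weights  # assumed equal weights
    sentences = split_sentences(text)
    if not sentences:
        return []

    # Key-word evidence: how often each content word occurs in the document.
    freqs = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    title_words = set(tokenize(title)) - STOPWORDS

    scored = []
    for i, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        content = [t for t in tokens if t not in STOPWORDS]
        # Cue: bonus words raise the score, stigma words lower it.
        cue = sum(t in BONUS_WORDS for t in tokens) - sum(t in STIGMA_WORDS for t in tokens)
        # Key: average document frequency of the sentence's content words.
        key = sum(freqs[t] for t in content) / max(len(content), 1)
        # Title: number of content words shared with the title.
        title_score = sum(t in title_words for t in content)
        # Location: favor the first and last sentences.
        loc = 1.0 if i in (0, len(sentences) - 1) else 0.0
        score = w_cue * cue + w_key * key + w_title * title_score + w_loc * loc
        scored.append((score, i))

    # Pick the best sentences (ties go to the earlier one), then restore order.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:n_sentences]
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]
```

Because every component of the score can be inspected, it is easy to explain why a given sentence was selected, which fits the transparency goal described above.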

I was co-PI on this internally funded research project, which I pitched and won a small grant for.