Do Multi-Document Summarization Models Synthesize?
This paper investigates whether modern multi-document summarization models can synthesize conflicting information to produce accurate summaries. The authors conduct experiments to assess the models' effectiveness at this type of synthesis and find that although the models partially perform the task, they have limitations such as over-sensitivity to changes in input ordering and under-sensitivity to changes in input composition. To improve synthesis performance, the authors propose generating a diverse set of candidate outputs and selecting the best one according to an expected aggregate measure. The results demonstrate the effectiveness of this approach, and the authors call for further research into multi-document summarization methods and learning objectives that account for the need to synthesize in some summarization settings.
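The candidate-generation-and-selection idea can be illustrated with a minimal toy sketch. Everything here is hypothetical: the scoring function, the numeric "expected aggregate" (the mean of values reported across conflicting documents), and the example candidates are illustrative stand-ins, not the authors' actual measure or data.

```python
import re
from statistics import mean

def aggregate_score(candidate: str, reported_values: list[float]) -> float:
    """Toy 'expected aggregate measure' (an assumption, not the paper's
    metric): negative distance between the number stated in the candidate
    and the mean of the conflicting values reported across input
    documents. Higher is better."""
    match = re.search(r"[-+]?\d+(?:\.\d+)?", candidate)
    if match is None:
        return float("-inf")  # candidate states no figure at all
    return -abs(float(match.group()) - mean(reported_values))

def select_best(candidates: list[str], reported_values: list[float]) -> str:
    """Pick the candidate output that best matches the expected aggregate."""
    return max(candidates, key=lambda c: aggregate_score(c, reported_values))

# Three input documents report conflicting figures.
values = [40.0, 50.0, 60.0]
candidates = [
    "The studies report a figure of 40.",       # copies one source
    "The studies report a figure of 60.",       # copies another source
    "Across studies, the figure is about 50.",  # synthesizes (mean = 50)
]
best = select_best(candidates, values)
```

Here the selection step prefers the candidate that synthesizes the conflicting inputs over ones that merely copy a single source, which is the intuition behind scoring a diverse candidate pool rather than trusting a single decoded output.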