Survey: Evaluating the Quality of Texts Produced by NLP Systems
Abstract
I survey techniques and experimental designs used to evaluate the quality of texts produced by NLP systems, including machine translation, natural language generation, and summarisation. I present evaluation as a type of scientific hypothesis testing, and include in this survey papers from the broader scientific community as well as papers from the NLP community.
Published
2024-12-05
Section
Survey Article