Status Quo and Problems of Requirements Engineering for Machine Learning: Results from an International Survey

Marcos Kalinowski, Daniel Mendez, Hugo Villamizar, Stefan Biffl, Michael Felderer, Stefan Wagner, Tony Gorschek, Antonio Pedro Santos Alves, Görkem Giray, Teresa Baldassarre, Jürgen Musil, Niklas Lavesson, Kelly Azevedo, Tatiana Escovedo, Helio Lopes and Eduardo Zimelewicz

24th International Conference on Product-Focused Software Process Improvement, 2023 · doi: https://doi.org/10.48550/arXiv.2310.06726

abstract

Systems that use Machine Learning (ML) have become commonplace for companies that want to improve their products and processes. Literature suggests that Requirements Engineering (RE) can help address many problems when engineering ML-enabled systems. However, the empirical evidence on how RE is applied in practice in the context of ML-enabled systems is dominated by isolated case studies with limited generalizability. We conducted an international survey to gather practitioner insights into the status quo and problems of RE in ML-enabled systems. We gathered 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrapping with confidence intervals and qualitative analyses on the reported problems involving open and axial coding procedures. We found significant differences in RE practices within ML projects. For instance, (i) RE-related activities are mostly conducted by project leaders and data scientists, (ii) the prevalent requirements documentation format is the interactive Notebook, (iii) non-functional requirements focus mainly on data quality, model reliability, and model explainability, and (iv) the main challenges include managing customer expectations and aligning requirements with data. The qualitative analyses revealed that practitioners face problems related to lack of business domain understanding, unclear goals and requirements, low customer engagement, and communication issues. These results provide a better understanding of the adopted practices and of the problems that exist in practical environments. We put forward the need to further adapt and disseminate RE-related practices for engineering ML-enabled systems.
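The abstract mentions bootstrapping with confidence intervals but does not describe the procedure. As a rough illustration only, the sketch below computes a percentile bootstrap interval for the share of respondents who selected one survey option. The function name `bootstrap_proportion_ci`, the resample count, the 95% level, and the toy responses are all assumptions made for this example; they are not the authors' data, settings, or implementation.

```python
import random


def bootstrap_proportion_ci(responses, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the share of True values.

    `responses` is a list of booleans, one per respondent (hypothetical input).
    Returns (point_estimate, lower_bound, upper_bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(responses)
    estimates = []
    for _ in range(n_resamples):
        # Resample respondents with replacement, keeping the sample size.
        resample = rng.choices(responses, k=n)
        estimates.append(sum(resample) / n)
    estimates.sort()
    # Take the alpha/2 and 1 - alpha/2 quantiles of the resampled shares,
    # clamping the indices so they always stay inside the list.
    lower_idx = max(0, int((alpha / 2) * n_resamples))
    upper_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    point = sum(responses) / n
    return point, estimates[lower_idx], estimates[upper_idx]


# Toy data (invented for illustration): 60 of 100 respondents chose the option.
toy_responses = [True] * 60 + [False] * 40
point, low, high = bootstrap_proportion_ci(toy_responses)
print(f"share = {point:.2f}, 95% bootstrap interval = [{low:.2f}, {high:.2f}]")
```

The percentile method is only one of several bootstrap interval constructions; the abstract does not state which variant the authors used.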

url: https://arxiv.org/abs/2310.06726