A practical blueprint for evaluating conversational AI at scale - Enggist