Comparing to human testers
To assess the performance of our AI agent against human testers, we employed a data-driven A/B testing approach, mirroring our standard feature launch process. Both human testers and our AI agent conducted test runs under identical parameters. Our findings reveal the following:
Accuracy: 75% (AI) vs. 80% (manual)
Efficiency: qa-ai-agent detected 300% more bugs than human testers in the same timeframe
Scalability: New tests can be integrated within 15 minutes if the prompt has already been tested, or in approximately 1.5 hours if prompt testing is still needed, as opposed to the hours required to train manual testers
Cost Savings: Our token cost analysis shows an 86% reduction compared to traditional manual testing expenses (a worked sketch of this calculation follows this list)
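To make the savings figure concrete, here is a minimal sketch of how a token-cost analysis like ours can be computed. Every price, rate, and token count below is a hypothetical placeholder chosen only to reproduce the reported 86% reduction; none of them are our actual figures.

```python
# Illustrative token-cost comparison. All figures are hypothetical
# placeholders picked to reproduce the reported 86% reduction; they are
# not our real pricing or usage data.

TOKEN_PRICE_PER_1K = 0.01  # hypothetical $ per 1K tokens for the model behind qa-ai-agent
HUMAN_HOURLY_RATE = 50.0   # hypothetical fully loaded $ per hour for a manual tester


def ai_run_cost(tokens_used: int) -> float:
    """Cost of one qa-ai-agent test run, priced purely by tokens consumed."""
    return tokens_used / 1000 * TOKEN_PRICE_PER_1K


def manual_run_cost(hours_spent: float) -> float:
    """Cost of one manual test run, priced by tester time."""
    return hours_spent * HUMAN_HOURLY_RATE


def cost_reduction(ai_cost: float, manual_cost: float) -> float:
    """Relative saving of AI runs over manual runs, as a fraction."""
    return 1.0 - ai_cost / manual_cost


# A run consuming 700K tokens ($7.00) vs. one tester-hour ($50.00):
# 1 - 7 / 50 = 0.86, i.e. the 86% reduction cited above.
print(f"{cost_reduction(ai_run_cost(700_000), manual_run_cost(1.0)):.0%}")
```

The same functions extend to a full test cycle by summing per-run costs on each side before taking the ratio.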
It's important to acknowledge that human testers retain an advantage in areas that are challenging for test automation, such as the user onboarding flow, which requires a real human ID and liveness tests (e.g., selfie verification).