Job Description

Functional AI Tester - GenAI


About the Role

You will be involved in QA for GenAI features, including Retrieval-Augmented Generation (RAG), conversational AI, and agentic evaluations. The role centers on:

  • Systematic GenAI evaluation (qualitative and quantitative metrics)

  • ETL and data quality testing for the data flows that feed AI systems

  • Python-driven automated testing

This position is hands-on and collaborative, partnering with AI engineers, data engineers, and product teams to define measurable acceptance criteria and ship high-quality AI features.

Key Responsibilities

  • Test strategy and planning

    Define risk-based test strategies and detailed test plans for GenAI features.

    Establish clear acceptance criteria with stakeholders for functional, safety, and data quality aspects.

  • Python test automation

    Build and maintain automated test suites using Python (e.g., PyTest, requests).

    Implement reusable utilities for prompt/response validation, dataset management, and result scoring.

    Create regression baselines and golden test sets to detect quality drift.
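
    As an illustrative sketch only (the scoring function, toy golden set, and threshold are hypothetical, not part of any existing suite), a golden-set regression check might look like:

```python
# Minimal golden-set regression sketch (illustrative names and data).
# A "golden set" pairs prompts with reference answers; a scoring function
# compares the model's output against the reference, and a threshold
# flags quality drift between releases.

def token_overlap_score(candidate: str, reference: str) -> float:
    """Crude similarity: fraction of reference tokens present in the candidate."""
    ref_tokens = set(reference.lower().split())
    cand_tokens = set(candidate.lower().split())
    if not ref_tokens:
        return 0.0
    return len(ref_tokens & cand_tokens) / len(ref_tokens)

# Toy golden set; a real one would live in version-controlled data files.
GOLDEN_SET = [
    {"prompt": "What is the capital of France?",
     "reference": "Paris is the capital of France."},
]

def check_against_golden_set(generate, threshold: float = 0.5):
    """Run every golden prompt through `generate`; return cases below threshold."""
    failures = []
    for case in GOLDEN_SET:
        answer = generate(case["prompt"])
        score = token_overlap_score(answer, case["reference"])
        if score < threshold:
            failures.append((case["prompt"], score))
    return failures
```

    In a PyTest suite, each golden case would typically become a parametrized test via `@pytest.mark.parametrize`, so drift shows up as individual test failures.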

  • GenAI evaluation

    Develop evaluation harnesses covering factuality, coherence, helpfulness, safety, bias, and toxicity.

    Design prompt suites, scenario-based tests, and golden datasets for reproducible measurements.

    Implement guardrail tests including prompt-injection resilience, unsafe content detection, and PII redaction checks.

    Track quality metrics over time.
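
    One guardrail test from this family, sketched with deliberately simple patterns (a real system would use a dedicated PII detector; the function names here are illustrative):

```python
import re

# Illustrative PII-redaction guardrail check: verify a model response
# contains no obvious PII patterns (emails, US-style SSNs). The regexes
# are intentionally simplistic, for demonstration only.

PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def find_pii(text: str) -> dict:
    """Return a mapping of PII type -> matches found in the text."""
    hits = {}
    for name, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits

def assert_redacted(response: str) -> None:
    """Fail loudly if any PII pattern survives redaction."""
    hits = find_pii(response)
    assert not hits, f"PII leaked in response: {hits}"
```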

  • RAG and semantic retrieval testing

    Verify alignment between retrieved sources and generated answers.

    Design and run adversarial retrieval tests (e.g., misleading, ambiguous, or out-of-scope queries).

    Measure retrieval relevance, precision/recall, grounding quality, and hallucination reduction.
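
    Precision and recall at k are standard retrieval metrics; a minimal sketch (document IDs and relevance judgments are toy data):

```python
# Illustrative retrieval-relevance metrics for a single query:
# precision@k = relevant hits in top k / k retrieved
# recall@k    = relevant hits in top k / total relevant documents

def precision_at_k(retrieved: list, relevant: set, k: int) -> float:
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return sum(1 for doc in top_k if doc in relevant) / len(top_k)

def recall_at_k(retrieved: list, relevant: set, k: int) -> float:
    if not relevant:
        return 0.0
    top_k = retrieved[:k]
    return sum(1 for doc in top_k if doc in relevant) / len(relevant)
```

    In practice these would be averaged across a query set, with relevance judgments maintained alongside the golden datasets.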

  • API and application testing

    Test REST endpoints supporting GenAI features (request/response contracts, error handling, timeouts).
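
    A contract check might be sketched as follows, using only the standard library for clarity (in practice pydantic or jsonschema would do this more robustly; the field names below are illustrative, not a real API contract):

```python
# Minimal response-contract check. Each expected field maps to the
# type (or tuple of types) its value must have.

EXPECTED_CONTRACT = {
    "answer": str,
    "sources": list,
    "latency_ms": (int, float),
}

def contract_violations(payload: dict) -> list:
    """Return a list of human-readable contract violations (empty = pass)."""
    problems = []
    for field, expected_type in EXPECTED_CONTRACT.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for {field}: {type(payload[field]).__name__}")
    return problems
```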

  • ETL and data quality validation

    Test ingestion and transformation logic; validate schema, constraints, and field-level rules.

    Implement data profiling, reconciliation between sources and targets, and lineage checks.

    Verify data privacy controls, masking, and retention policies across pipelines.
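
    A source-to-target reconciliation check might be sketched like this (the key, columns, and table contents are toy data; real pipelines would run this against query results):

```python
import hashlib

# Illustrative reconciliation: compare row presence and a per-key
# checksum of selected columns between source and target tables.

def row_checksum(row: dict, columns: list) -> str:
    """Stable checksum of the selected columns of one row."""
    joined = "|".join(str(row.get(c, "")) for c in columns)
    return hashlib.sha256(joined.encode()).hexdigest()

def reconcile(source_rows, target_rows, key: str, columns: list) -> dict:
    """Report keys missing, unexpected, or with mismatched column values."""
    src = {r[key]: row_checksum(r, columns) for r in source_rows}
    tgt = {r[key]: row_checksum(r, columns) for r in target_rows}
    return {
        "missing_in_target": sorted(set(src) - set(tgt)),
        "unexpected_in_target": sorted(set(tgt) - set(src)),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }
```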

  • Non-functional testing

    Performance and load testing focused on latency, throughput, concurrency, and rate limits for LLM calls.

    Cost-aware testing (token usage, caching effectiveness) and timeout/retry behavior validation.

    Reliability and resilience checks including error recovery and fallback behavior.
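
    Retry behavior is easiest to validate against a simulated flaky dependency; a sketch (the wrapper and its signature are hypothetical, not a real client API):

```python
# Illustrative retry check: a wrapper that retries a failing call a
# bounded number of times and reports the attempt count, so tests can
# assert both the recovery path and how many retries occurred.

def call_with_retries(fn, max_attempts: int = 3):
    """Invoke fn(); retry on exception up to max_attempts, then re-raise.

    Returns (result, attempts_used) on success.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(), attempt
        except Exception as exc:
            last_error = exc
    raise last_error
```

    A test would inject a callable that fails a fixed number of times before succeeding, then assert on the attempt count and on the exception raised when the budget is exhausted.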

  • Share results and insights; recommend remediation and preventive actions.

Required Qualifications

  • Experience

    5+ years in software QA, including test strategy, automation, and defect management.

    2+ years testing AI/ML or GenAI features, with hands-on evaluation design.

    4+ years testing ETL/data pipelines and data quality.

  • Technical skills

    Python: strong proficiency building automated tests and tooling (PyTest, requests, pydantic, or similar).

    API testing: REST contract testing, schema validation, negative testing.

    GenAI evaluation: crafting prompt suites, golden datasets, rubric-based scoring, and automated evaluation pipelines.

    RAG testing: retrieval relevance, grounding validation, chunking/indexing verification, and embedding checks.

    ETL/data quality: schema and constraint validation, reconciliation, lineage awareness, data profiling.

  • Quality and governance

    Understanding of LLM limitations and methods to detect and reduce hallucinations.

    Safety and compliance testing including PII handling and prompt-injection resilience.

    Strong analytical and debugging skills across services and data flows.

  • Soft skills

    Excellent written and verbal communication; ability to translate quality goals into measurable criteria.

    Collaboration with AI engineers, data engineers, and product stakeholders.

    Organized, detail-oriented, and outcomes-focused.

Nice to Have

  • Experience with evaluation frameworks or tooling for LLMs and RAG quality measurement.

  • Experience creating synthetic datasets to stress specific behaviors.
