Orion Innovation is a premier, award-winning, global business and technology services firm. Orion delivers game-changing business transformation and product development rooted in digital strategy, experience design, and engineering, with a unique combination of agility, scale, and maturity. We work with a wide range of clients across many industries including financial services, professional services, telecommunications and media, consumer products, automotive, industrial automation, professional sports and entertainment, life sciences, ecommerce, and education.
Role: Performance Test Engineer - Generative AI
Experience: 5+ years (with hands-on performance testing in GenAI / LLM-based applications)
Role Overview:
We are seeking a skilled and detail-oriented Performance Test Engineer with strong experience in Generative AI (GenAI) projects. The ideal candidate will be responsible for ensuring the scalability, reliability, and optimal performance of AI-powered applications, including Large Language Model (LLM) integrations, conversational AI systems, and Retrieval-Augmented Generation (RAG) pipelines. This role requires expertise in performance engineering, cloud platforms, and testing of AI/ML workloads in production environments.
Key Responsibilities
• Performance Strategy & Planning:
  • Define and implement performance testing strategies for GenAI and LLM-based applications.
  • Identify performance bottlenecks across APIs, model inference layers, vector databases, and cloud infrastructure.
  • Establish performance benchmarks, SLAs, and scalability targets for AI-driven systems.
• Performance Testing & Engineering:
  • Design, develop, and execute load, stress, spike, endurance, and scalability tests for GenAI applications.
  • Perform performance testing of LLM-powered APIs (e.g., ChatGPT-like applications) hosted on cloud platforms.
  • Validate latency, throughput, token usage, concurrency handling, and cost-performance trade-offs.
  • Conduct performance validation for RAG pipelines including embedding generation and vector search.
  • Analyze model inference time, GPU/CPU utilization, memory usage, and autoscaling behavior.
• Tools & Automation:
  • Develop automated performance test scripts using tools such as JMeter, LoadRunner, k6, or Gatling.
  • Monitor system performance using APM tools like Dynatrace, AppDynamics, Azure Monitor, or AWS CloudWatch.
  • Integrate performance testing into CI/CD pipelines using Azure DevOps or similar platforms.
  • Create dashboards and reports for performance metrics and trend analysis.
• Cloud & Infrastructure Testing:
  • Conduct performance testing on AI solutions deployed on Azure, AWS, or GCP.
  • Validate autoscaling configurations, containerized deployments (Docker, Kubernetes), and serverless architectures.
  • Assess performance of vector databases such as Chroma, Pinecone, Weaviate, or FAISS under load.
• Collaboration & Optimization:
  • Collaborate with AI engineers, data scientists, DevOps, and architects to optimize model serving and API performance.
  • Recommend improvements in prompt engineering, caching strategies, batching, and parallelization.
  • Support capacity planning and cost optimization for LLM-based applications.
• Governance & Reporting:
  • Document performance test results, bottlenecks, and optimization recommendations.
  • Ensure compliance with security and data privacy standards in performance environments.
  • Present findings to stakeholders and provide actionable insights.
Key Requirements
• Technical Skills:
  • 5+ years of experience in Performance Testing and Engineering.
  • Hands-on experience in performance testing GenAI / LLM-based applications.
  • Experience working with LLM platforms such as OpenAI GPT models, Gemini, Llama 2, Claude, or Grok.
  • Understanding of concepts like tokenization, embeddings, vector search, and RAG architecture.
  • Experience testing AI services hosted on Azure AI Services, Azure ML, AWS Bedrock, or Google Vertex AI.
  • Proficiency in performance testing tools such as JMeter, LoadRunner, k6, or Gatling.
  • Knowledge of API testing tools like Postman or Rest Assured.
  • Familiarity with monitoring tools such as Azure Monitor, AWS CloudWatch, Grafana, or Prometheus.
  • Experience with containerization (Docker) and orchestration (Kubernetes).
  • Basic scripting knowledge in Python or Java for test automation.
  • Understanding of CI/CD pipelines and DevOps practices.
• GenAI-Specific Knowledge:
  • Experience testing conversational AI applications and chatbot performance.
  • Knowledge of inference latency optimization techniques for LLMs.
  • Understanding of GPU-based workloads and performance considerations.
  • Exposure to agentic frameworks like LangChain, Semantic Kernel, AutoGen, or CrewAI (preferred).
  • Experience validating performance of vector databases (Chroma, Pinecone, Weaviate, FAISS).
Qualifications
• Bachelor's degree in Computer Science, Information Technology, or a related field.
• 5+ years of experience in performance testing, with at least 2 years in AI/ML or GenAI projects.
• Experience in testing cloud-native, microservices-based applications.
• Strong analytical and troubleshooting skills.
• Excellent communication and stakeholder management skills.
Orion is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, citizenship status, disability status, genetic information, protected veteran status, or any other characteristic protected by law.
Candidate Privacy Policy
Orion Systems Integrators, LLC and its subsidiaries and its affiliates (collectively, "Orion," "we" or "us") are committed to protecting your privacy. This Candidate Privacy Policy (orioninc.com) ("Notice") explains:
• What information we collect during our application and recruitment process and why we collect it;
• How we handle that information; and
• How to access and update that information.
Your use of Orion services is governed by any applicable terms in this notice and our general Privacy Policy.