WHY SOCURE?
Socure is building the identity trust infrastructure for the digital economy: verifying 100% of good identities in real time and stopping fraud before it starts. The mission is big, the problems are complex, and the impact is felt by businesses, governments, and millions of people every day.
We hire people who want that level of responsibility. People who move fast, think critically, act like owners, and care deeply about solving customer problems with precision. If you want predictability or narrow scope, this won't be your place. If you want to help build the future of identity with a team that holds a high bar for itself, keep reading.
ABOUT THE ROLE
We are seeking a Senior SDET to support the next evolution of quality engineering by combining automated functional validation, production health monitoring, and AI-driven failure analysis.
This role is focused on ensuring that our Disaster Recovery (DR) and production environments are not only available, but fully functional, continuously validated, and increasingly capable of self-diagnosis. The ideal candidate will bring strong automation skills, systems thinking, and a passion for improving reliability across complex distributed systems.
JOB OVERVIEW
As a Senior SDET, you will design and build automated quality and validation systems that strengthen confidence in both production and disaster recovery readiness. You will partner closely with QA, SRE, and Engineering teams to validate critical business workflows, improve observability, reduce alert noise, and accelerate incident detection and resolution.
This role sits at the intersection of test automation, production reliability, and intelligent diagnostics, helping advance Socure's shift from traditional QA practices toward more autonomous, resilient quality systems.
JOB RESPONSIBILITIES
- Design and implement automated functional health checks for DR and production environments using synthetic transactions and API validation.
- Build continuous validation pipelines that verify end-to-end business workflows such as authentication, transactions, and system integrations.
- Develop intelligent alerting mechanisms based on functional failures and customer-impacting behavior, not solely infrastructure metrics.
- Integrate observability signals including logs, metrics, and traces with automated test frameworks to improve system visibility and diagnosis.
- Develop AI/ML-driven approaches to detect failure patterns, correlate issues across services, and identify probable root causes.
- Build systems that recommend or trigger automated remediation actions to support early-stage self-healing capabilities.
- Partner cross-functionally with QA, SRE, and Engineering teams to improve service reliability, incident response, and recovery readiness.
- Define, measure, and report on functional SLAs, service health indicators, and quality metrics.
- Contribute to disaster recovery drills, readiness exercises, and automated validation efforts that improve resilience over time.
JOB REQUIREMENTS
- 5+ years of experience in QA Automation, SDET, Software Engineering, or a related technical discipline.
- Strong experience building and maintaining automated test frameworks, including tools such as Playwright, Jest, SuperTest, and REST API testing frameworks.
- Experience working in cloud environments, preferably AWS.
- Familiarity with observability and monitoring platforms such as Datadog, New Relic, CloudWatch, Splunk, or similar tools.
- Strong programming skills in TypeScript, Python, Java, or similar languages.
- Experience designing end-to-end test strategies for distributed systems and production-like environments.
- Strong problem-solving skills, with the ability to analyze failures across application, infrastructure, and workflow layers.
PREFERRED QUALIFICATIONS
- Experience with synthetic monitoring, production validation, or proactive health-checking systems.
- Exposure to AI/ML techniques for anomaly detection, log analysis, or failure correlation.
- Experience with CI/CD pipelines, release automation, and validation gates.
- Understanding of microservices architecture, distributed system failure modes, and incident management practices.
- Familiarity with SRE concepts such as SLIs, SLOs, error budgets, or production-readiness practices.
Socure is an equal opportunity employer that values diversity in all its forms within our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
If you need an accommodation during any stage of the application or hiring process, including interview or onboarding support, please reach out to your Socure recruiting partner directly.