Job Description

Who We Are:
Wayfair leads e-commerce for all things home, driven by modern tech. We seek sharp thinkers who design scalable systems while keeping a startup mindset. Our culture values fast, data-driven innovation. The Observability team needs experienced engineers skilled in cloud-native design, legacy maintenance, and SRE best practices—plus ideas for improvement! We collaborate across Tech to ensure platforms and services are production-ready, contributing to both platform and software codebases.
What You’ll Do:
● As a DevOps Engineer, you will join our team to help grow our systems into best-in-class
for efficiency, stability, observability, velocity, and scale in the e-commerce space, and
engage with the product and engineering teams from Day 1 to proactively design, build, and
maintain our systems and software
● Influence the design and architecture of Wayfair’s systems as part of our Cloud Enablement
journey while maintaining critical pieces of our legacy tools; collaborate with
development teams to design scalable and reliable systems, considering aspects such
as fault tolerance, availability, and performance
● Work with both software engineers and platform engineers to develop and optimize
repeatable systems that let the two sides build on each other’s work. There’s a wide range of
opportunities to both guide the broad conversation and dive into the nuances of our code
and architecture
● Help service owners build realistic SLOs, set SLAs and error budgets, and ensure
production services have “reliability” built into their design
● Provide production support, even once your own self-healing and automation are in place,
and creatively solve challenging engineering problems across our stack
● Participate in a shared on-call schedule managed across Engineering
● Automate repetitive tasks to increase efficiency and reduce human error
● Mentor new hires and other engineers through leading by example, tech talks, pair
programming, and other avenues to increase technical efficiency across the organization
We Are a Match Because You Have:
● 6+ years of experience working in a DevOps or SRE role, or in software development with an
understanding of cloud infrastructure
● Experience with cloud platforms (GCP, AWS, Azure) and containerization technologies
(e.g. Docker, Kubernetes)
● Experience with server-side software engineering (Python, Go, Java, Bash, etc.)
● Design experience with distributed systems, microservices architecture, and related
technologies
● Strong understanding of monitoring and alerting, with a focus on performance monitoring,
tracing instrumentation, and SLIs/SLOs/SLAs
● Experience decoupling monolithic services is a plus
● Knowledge of CI/CD pipelines and version control systems (e.g. Git)
● Excellent communication skills with engineers, product managers, and business
stakeholders alike
● Knowledge of configuration management tools (e.g. Puppet, Ansible, Chef, Terraform)
● Passion for leading large, cross-cutting technical initiatives through to delivery, building
cross-functional consensus, and influencing design decisions
● Ample experience gathering and balancing requirements from technical and business
stakeholders, and reaching consensus on prioritization
● Experience mentoring engineers and leading code reviews
