(Senior) Site Reliability Engineer Kubernetes (f/m/d)

Job description

Are you passionate about IT systems and new, innovative technologies that support running workloads in a flexible, scalable, and secure way?

In the Core Platforms & Services Division, you contribute to the continuous operation and improvement of our 24/7 e-commerce infrastructure. Building on an already strong and mature DevOps and Infrastructure as Code approach, we focus on developing and innovating new parts of our platform as well as enabling and guiding our technical users and DevOps teams.
While focusing on the services that you and your team provide, you are also welcome and encouraged to contribute to cross-functional topics.

The Platform Engineering team is currently responsible for two technical products: the Container Orchestration Platform and the Log Management and Monitoring Platform, which are two of the main core platforms provided to our entire IT organization.
As part of our Platform Engineering strategy, the chapter continues to provide and develop the central execution platform for all workloads in all clouds.

Your responsibilities

  • Develop and manage our Container Orchestration Platform based on Multi-Cluster Kubernetes and Istio
  • Improve our central Log Management and Monitoring Platform based on the Elastic Stack (formerly ELK)
  • Design and improve additional services for Monitoring & Alerting, Application Performance Monitoring (APM) and Tracing
  • Improve our platform availability and stability so that the company's business can run without disruption
  • Provide solutions by making use of open-source tools or implementing your own
  • Support our DevOps teams in using our products and services in the most efficient way
  • Take part in the rotating on-call duty for the services your team is responsible for

Your profile


  • You already have at least 3 years of relevant Platform Engineering experience
  • You hold a university degree in a relevant field or have a proven track record in IT infrastructure
  • You have gained knowledge in several of the following areas:
    • Infrastructure and software provisioning using Terraform or Ansible
    • Architecture design and operations of Kubernetes in production environments
    • Log processing and analysis using the Elastic Stack, including configuring and optimizing Elasticsearch clusters in production
    • Time series data collection, processing, storage and visualization (e.g. Prometheus, Telegraf, InfluxDB or others)
    • Application Performance Monitoring and Tracing (e.g. Jaeger, Elastic APM or others)
  • You are familiar with the DevOps toolbox: Git, Docker, Jira, Confluence, Bitbucket and others
  • You possess development skills in Bash and Python for debugging, optimizing and automating tasks
  • Your general understanding of network infrastructure basics such as DNS, DHCP, firewalling and load balancing will help you work cross-functionally
  • You speak fluent English (at least CEFR B2)

What makes zooplus a great place to work:

🌎 Motivated and friendly teams with over 50 nationalities
🙌 Dedicated Buddy to support your onboarding
💪 A partnership with a selected gym
🏬 A central office location in Munich
✈️ Flexible working hours and a healthy work-life balance with 28 vacation days (plus Dec 24th and 31st off)
⏰ A hybrid working model with 2 office days and 3 flexible days
💻 Modern workspaces, state-of-the-art equipment, and a mobile phone
📈 Continuous development through internal and external training opportunities
📖 Personal budget for individual training initiatives and attending your favorite tech conferences
🐾 Employee discount for all company shops
💰 Company pension scheme
🎈 Company events

In case you didn't know

Zooplus AG was founded in Munich in 1999, and today we are Europe's leading online retailer for pet-related products. With an annual double-digit growth rate, we have already successfully rolled out our business model in 30 European countries, and we expect further sustainable growth in European e-commerce. We remain faithful to the key values of our company: a dynamic and flexible approach, constant learning opportunities, and inventive thinking at every level and position in the organization.

Want to know more? Let's talk! Apply now.