Site Reliability Engineer
Description
We’re looking for a Site Reliability Engineer to join our growing team!
**Tasks**
- Collaborate with the platform team to understand system complexity and identify gaps in operational excellence.
- Participate in the design of resilient, high-availability infrastructure.
- Implement and enhance monitoring and observability across infrastructure and services, using tools like OpenTelemetry and Elastic Cloud.
- Collaborate with the central Observability Platform and the Application Operation community to implement best practices around security, scalability, reliability, and observability.
- Introduce new concepts and technologies to improve overall platform performance, creating documentation and training materials.
- Participate in on-call rotations (24x7) and handle incident management.
- Take a customer (developer) perspective to streamline observability and operation across the platform and beyond.
**Benefits**
- Vacation / Christmas bonus
- Flexible working hours
- Special deals with selected partners
- Hybrid working possible
- Employee events
- Mentoring
**Requirements**
- University degree in engineering or (business) informatics.
- Minimum 3 years of experience as an SRE or in IT Operations.
- Experience with logging and monitoring systems such as OpenTelemetry and Elastic Cloud.
- Proficiency in coding/scripting languages like Python, Shell, Java, or .NET.
- Experience with Infrastructure-as-Code (Terraform) and containerization on cloud platforms (Azure, GCP).
- Understanding of Agile software development processes.
- Ability to assess multiple stakeholders’ needs and communicate technical solutions effectively.
- Capability to make high-impact decisions quickly and responsibly.
- Experience with Kubernetes
- Experience with Networking
- Business English
Location: Tirana, Tirane, Albania
Specifications
Employment Type
Full-time
Experience Level
Mid-level
Remote Work
No
Pay Period
Monthly
Application Method
Website
Application URL
https://apply.lufthansagroup.careers/index.php?ac=apply&q=64b9d284f8f71ca199178b2900296e5c74b8e902&utm_source=linkedin
CV Required
Yes