Senior Site Reliability Engineer, Database Operations: ClickHouse : GitLab

DevOps

March 21, 2025

Full time

Job Description

Anywhere
Posted 1 year ago

Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other GitLab production systems running smoothly 24x7x365. SREs are a blend of pragmatic operators and software craftspeople who apply sound engineering principles, operational discipline, and mature automation to our environments and the GitLab codebase. We specialize in systems, whether that is networking, the Linux kernel, or a more specific interest in scaling, algorithms, or distributed systems, and we work across these functions:

Design, build, and maintain ClickHouse and PostgreSQL clusters to support high-demand, enterprise-scale workloads.

Provision and orchestrate cloud infrastructure in GCP using configuration management tools (Ansible, Chef), IaC (Terraform), the Kubernetes ecosystem (Helm charts, Operators), and distributed consensus (etcd).

Design and implement enterprise-grade, high-availability ClickHouse solutions with ClickHouse Keeper, sharding, and replication, optimized for large-scale and dynamic datasets.

Optimize and scale high-transaction PostgreSQL clusters with Patroni and streaming replication for GitLab’s core applications on GCP.

Build and maintain early warning systems, monitoring, and alerting tools (e.g., Prometheus/Grafana) to predict capacity needs, monitor query latency and replication lag, and ensure resource optimization across platforms.

Enable cross-database integrations and workflows, such as ClickHouse-to-PostgreSQL data federation, CDC, and logical replication, to support hybrid analytics.

Respond to platform alerts, user emergencies, and support requests while ensuring strict adherence to SLOs, including during SRE on-call rotations.

Enhance infrastructure security by implementing and updating measures that protect GitLab’s systems and ensure compliance with regulatory requirements (e.g., GDPR, FedRAMP, SOC2, ISO).

Partner with internal and external compliance assessors as Subject Matter Experts during certifications and recertifications.

Collaborate with engineering teams to address architectural bottlenecks, plan service rollouts and migrations, and shape the future roadmap while maintaining strong operational readiness.

Mandatory technical skills and experience

Advanced database platform management experience, preferably using PostgreSQL and ClickHouse at scale

Advanced cloud infrastructure automation and management, preferably using Ansible, Chef, Terraform, Helm charts, Operators, and Kubernetes

Solid experience with at least one programming language: Go, Ruby or Python

Advanced experience with Linux

Extensive on-call experience as an SRE supporting mission-critical systems

Solid incident management experience, across all phases: Analysis, Remediation, RCA and Corrective Actions

Solid experience implementing monitoring at scale (preferably Prometheus and Grafana)

Mandatory non-technical skills, experience and characteristics

Willingness and ability to live and promote GitLab’s unique CREDIT Values in one’s day-to-day work and interactions with teammates.

Superior verbal and written communication skills

Cool, collected and composed under pressure

Comfortable and productive working asynchronously across timezones and cultures, at the speed and scale of business.

Enable others to excel

Be a Leader of One

Act Like an Owner with GitLab’s resources.

How GitLab will support you

Benefits to support your health, finances, and well-being

All-remote, asynchronous work environment

Flexible Paid Time Off

Team Member Resource Groups

Equity Compensation & Employee Stock Purchase Plan

Growth and development budget

Parental leave

Home office support

Date Posted: March 21, 2025

Expiration date: --

Experience: 3 years

Gender: Both

Qualification: Bachelor's degree
