Senior Site Reliability Engineer / Infrastructure Architect (Principal DevOps Engineer)

97EX

$2.6-3K [Monthly]
Remote, 3-5 Yrs Exp, Bachelor's, Full-time
Remote details

Open country: Worldwide

Language requirements: Chinese

Job description

Benefits

  • Employee Recognition and Rewards

    Distributed team, No tracking system, No office politics

  • Time Off and Leave

    Paid time off, Unlimited or flexible PTO, Government leave

Job Responsibilities

1. Cloud Native Architecture Design and Governance:
- Design highly available architectures on AWS and Cloudflare, extending beyond CDN configuration to implement edge logic with Cloudflare Workers and secure access layers using Argo Tunnel/Zero Trust.
- Manage AWS multi-account structures via Organizations, and architect cross-Region networking (Transit Gateway, VPC Peering, VPN) to resolve complex connectivity and latency challenges.
- Enforce Infrastructure as Code (Terraform/Pulumi) across edge rules and underlying resources to minimize manual console operations.

2. Deep Kubernetes Engineering:
- Maintain large-scale EKS or self-managed clusters, performing performance tuning and troubleshooting of core components such as etcd, CNI plugins (Cilium/Calico), and CoreDNS.
- Develop Kubernetes Operators/Controllers or kubectl plugins to enhance platform automation based on business requirements.
- Bridge local development and production environments (Docker Compose to Helm/Kustomize) to ensure consistency.

3. Engineering Productivity and Observability:
- Design and maintain complex CI/CD pipelines, integrating code quality analysis (SonarQube), container image security scanning, and automated testing.
- Implement GitOps workflows using ArgoCD or Flux.
- Build a Prometheus-based monitoring system with in-depth runtime (Go/Java) and system-level (eBPF) performance analysis.

4. System-Level Support and Reliability:
- Maintain middleware such as Nginx, Redis, and Kafka, with the ability to debug at the source level and tune parameters.
- Address system bottlenecks under high concurrency (TCP queues, file handles, memory management).

Job Requirements

- Linux Systems Expert: Deep understanding of Linux kernel internals and proficient use of perf, strace, tcpdump, eBPF, and other tools to diagnose CPU, I/O, and network issues in production.
- Cloud and Networking Proficiency: Familiarity with AWS infrastructure limits (API rate limits, EBS IOPS) and Cloudflare fundamentals (Anycast, SSL handshake), with a deep understanding of the TCP/IP stack and HTTP/2/3 protocols.
- Kubernetes Hands-On Experience: In-depth knowledge of cgroups and namespaces, service meshes (Istio/Linkerd), and rapid diagnosis of pod scheduling failures or crashes.
- Development Skills: Proficient in Go or Python, capable of reading open-source code, fixing bugs, and developing backend tools.

Preferred Qualifications

- Contributor to CNCF open source projects.
- Experience maintaining systems handling hundreds of millions of daily requests.
- Hands-on experience implementing chaos engineering in production environments.
Dora lee

HR Manager, 97EX

Replied 7 times today

Posted on 27 December 2025

Boss Safety Reminder

If the position requires you to work abroad, please be careful and watch out for fraud.

If you encounter an employer who does any of the following during your job search, please report it immediately:

  • withholds your ID,
  • requires you to provide a guarantee or collects your property,
  • forces you to invest or raise funds,
  • collects illicit benefits,
  • or engages in other illegal practices.