Site Reliability Engineer (SRE)

  • Atlanta, GA, USA
  • Full-Time
  • On-Site

Job Description:

Role Description:

The SRE will work within the Video Network division to design, build, and operate our next-generation Video Cloud platform, driving efficiency, reliability, and scalability across our cloud infrastructure. The role works primarily on AWS, with opportunities to expand across multi-cloud environments (Azure, GCP).

Deliverables:

  • Deploy solutions in POC, Staging, and Production environments, ensuring reliability & scalability
  • Lead/support customer onboarding, including environment setup and configuration
  • Provide tech support to partners/customers on Synamedia technologies, products & solutions
  • Troubleshoot and resolve moderate to complex tech issues, ensuring timely resolution & customer satisfaction
  • Replicate/analyze issues in a controlled lab environment to validate fixes and improvements
  • Document tech solutions & best practices, contributing to internal knowledge bases & support documentation
  • Deliver tech presentations & cross-training sessions to internal/external stakeholders
  • Collaborate closely with cross-functional teams (Engineering, Sales, and Product Management) to enhance product quality and customer experience
  • Foster teamwork by actively sharing insights & collaborating with peers toward common objectives
  • Demonstrate a continuous commitment to technical excellence, innovation, and learning

Responsibilities:

  • Design, build, and operate scalable and secure Cloud infrastructure solutions across AWS, Azure, or GCP
  • Manage and resolve Service Requests, Incidents, Problems, and Change Requests related to Cloud environments
  • Analyze complex technical issues, propose effective solutions, and communicate recommendations clearly to stakeholders
  • Drive automation across the infrastructure: develop tools, scripts, and pipelines to minimize manual intervention and improve operational efficiency
  • Monitor system performance and anticipate scaling needs to ensure service stability under varying workloads
  • Implement and maintain monitoring and observability frameworks to proactively detect and remediate system anomalies
  • Create and maintain documentation, including architecture diagrams, runbooks, and knowledge base articles
  • Define and track key metrics for Cloud resource utilization, performance, and cost efficiency
  • Build cost-optimization dashboards and automation to visualize and control cloud spend at both infrastructure and Kubernetes levels
  • Collaborate with development and operations teams to enhance CI/CD pipelines, ensuring smooth deployments and high availability
  • Continuously research and adopt emerging tools, frameworks, and best practices in Cloud and DevOps

Soft Skills:

  • Analytical and troubleshooting skills
  • Eager to learn, with the technical aptitude to assimilate new material quickly (essential)
  • Excellent written and verbal communication skills (essential)
  • Flexible: adapts readily to a changing environment (essential)
  • Able to take initiative and drive change (essential)
  • Performs well under pressure and in disruptive environments where priorities can change in response to customer demand (essential)
  • Capacity and passion for helping customers; good customer engagement (essential)
  • Customer-facing skills: negotiation, customer satisfaction, and clear verbal, written, and presentation communication
  • Highly organized, with the ability to manage multiple projects & escalations in a fast-paced environment