Careers

Help Build the Open Cloud

Site Reliability Engineer

San Francisco, CA, US

At Joyent, engineers at every level directly influence our business and services. Our Site Reliability Engineers are a hybrid of software and systems engineers responsible for reliability, scalability, and automation while keeping an eye on latency, performance, and capacity.

You Want

To automate infrastructure behind customer-facing APIs with high availability, reliability, and scalability.

To build systems that:

  • Look elegant and reliable on the outside
  • Deal with the complexities of rigorous business logic
  • Transparently handle hardware failures

To work in Go, and with other tools as necessary, to build systems.

To participate and thrive in a delivery-oriented, goal-centric culture.

Successful candidates will

  • Demonstrate the ability to be productive and independent in a remote environment
  • Design, write, and maintain software to improve the availability, scalability, reliability, performance, and efficiency of Joyent’s services, incorporating third-party open-source tools when available
  • Create new designs for a growing number of distributed systems
  • Design and implement the tools and processes used for deployment and change management
  • Plan and execute configuration management
  • Automate resource provisioning and allocation processes
  • Own, maintain, and continuously improve all systems provided as a service, such as monitoring and datastores
  • Engage in service capacity planning and demand forecasting, anticipating performance bottlenecks
  • Run software performance analysis
  • Plan and execute disaster recovery drills
  • Participate in rotating on-call duties

You have

A love of systems engineering, APIs, and making applications secure. You enjoy collaborating and coming up with reliable solutions that solve business needs. You constantly seek ways to ensure your systems meet the objectives while improving performance.

The ideal candidate doesn’t have all of the following, but is seeking to gain experience with them:

  • Comfort with languages such as Go and Python
  • Minimum of 4 years of industry experience in engineering
  • Familiarity with algorithms, data structures, and complexity analysis
  • Experience working with Linux systems from kernel to shell including working with system libraries, file systems, and client-server protocols
  • Experience with networking (TCP/IP, UDP, ICMP, ARP, DNS, load balancing, etc.)
  • Experience with configuration management tools (Ansible)
  • Systematic problem solving
  • Elegant and simple solutions to complex problems
  • Strong sense of ownership and drive
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems

Joyent offers

  • An opportunity to build a cloud solution and scale it to meet the needs of the world’s largest cloud consumers
  • A highly distributed, remote-friendly team (US preferred)
  • An opportunity to shape product and business strategy and to grow into new roles as the organization grows

About Joyent

Joyent, a wholly owned subsidiary of Samsung, is the open cloud company. Joyent builds technology at the pinnacle of scale, performance, stability, and security to accelerate the transformation toward the mobile and cloud-centric world. Joyent designs, builds, and manages market-competitive cloud computing solutions and services for Samsung Electronics and its partners at global scale.

How To Apply

To apply, please submit a brief introduction, a copy of your resume, and a link to your GitHub or LinkedIn profile to with "Site Reliability Engineer" in the subject line. Qualified applicants with criminal histories will be considered for the position in a manner consistent with the Fair Chance Ordinance.
