Senior Site Reliability Engineer
Remote • Full Time • Experienced

Department: Engineering
Reports to: Head of Engineering
Location: 100% remote, USA or Canada only

Company Summary

Publishing.com empowers individuals from all walks of life to generate meaningful income streams through book publishing. As a leading online education platform, we specialize in guiding our students through the processes of writing, publishing, and selling books and audiobooks on major platforms like Amazon and Audible. We are thrilled to announce that Publishing.com has been recognized as the 19th fastest-growing private company in America for 2023, according to the prestigious Inc. 5000 list. Over the past two years, we've experienced an incredible 30% year-over-year growth and expanded our team by 500%. Recently, we hit a major milestone by helping 60,000+ students through our programs.

Our mission is to become the premier destination for all publishing-related needs. In line with this vision, we are excited to announce the launch of our latest innovation, Publishing.ai, software designed to further revolutionize the publishing industry. This year marks a significant milestone in our journey toward that goal, as we continue to expand our offerings and support our community of publishers.

Who You Are

You are an operations-focused engineer with a passion for system reliability and scalability! You enjoy collaborating with both development and IT Ops teams to build reliable, high-performing, and secure infrastructure. Whether it’s automating infrastructure deployments, enhancing monitoring systems, or working with tools like Pulumi, Sentry, and Heroku, you thrive in ensuring operational efficiency. You excel at improving workflows, reducing manual interventions, and making sure systems are always running smoothly.

And you have a great attitude!

About The Role
As our first Site Reliability Engineer, you’ll be instrumental in building and maintaining the infrastructure and operational workflows that power our business. You will collaborate closely with the IT team to ensure the reliability, security, and performance of our cloud infrastructure and our website. Additionally, you will work with our software engineers to support their DevOps needs and lead key initiatives such as defining SLAs, SLIs, and SLOs. You will also implement automation, monitoring, and incident management systems to ensure smooth, scalable operations and continuously drive improvements in system reliability and performance.

Responsibilities

  • Build highly scalable web applications
  • Propose, design, and implement scalable solutions to support our ever-growing, ambitious marketing initiatives
  • Collaborate with other engineers by participating in design reviews and code reviews
  • Work closely with design, product, sales, and marketing teams to gather requirements and identify opportunities for improvement
  • Employ the most recent software development and deployment techniques
  • Implement data-oriented solutions for improving user experience

Requirements

  • Operational Expertise: Strong experience working closely with IT Ops teams to manage infrastructure and operational workflows.
  • Heroku: Experience managing services on Heroku, including scaling, performance optimization, and troubleshooting.
  • Pulumi: Hands-on experience using Pulumi or similar IaC tools to automate cloud infrastructure.
  • Sentry: Expertise in using Sentry for monitoring and alerting, ensuring prompt detection and resolution of system issues.
  • Cloud Platforms: Deep knowledge of cloud platforms such as AWS, Azure, or GCP, with a focus on reliability, security, and scalability.
  • CI/CD Expertise: Strong experience building and managing CI/CD pipelines using tools like Jenkins, GitLab CI, or similar.
  • Automation & Scripting: Proficiency in scripting languages such as Python or Bash to automate tasks and improve operational efficiency.
  • Monitoring & Observability: Experience with observability tools (Sentry, Prometheus, Grafana, Datadog) to ensure system reliability.
  • Collaboration: Excellent communication skills with the ability to collaborate cross-functionally with IT Ops, development, and other teams.
  • Security & Compliance: Strong understanding of cloud security best practices and experience with compliance frameworks such as SOC 2 and GDPR.

Preferred Skills
  • Incident Response: Experience leading incident response efforts and building automated incident management systems.
  • Terraform or Other IaC Tools: Experience with Terraform or other Infrastructure as Code tools in addition to Pulumi.
  • Kubernetes & Docker: Knowledge of container orchestration tools like Kubernetes and Docker.
  • Zapier, Webflow, HubSpot, and other no-code and low-code platforms: As a business, we use various low-code and no-code solutions. Experience monitoring and operating such systems, and contributing to their reliability and operational excellence, is highly desirable.

Why Publishing.com?

At Publishing.com, our dedication to our mission and core values isn't just talk; it's reflected in how we treat our team. We believe in nurturing our employees' well-being, supporting their families, and empowering them to contribute to their communities. Here's how we stand out:

  • Recently recognized as #19 on the Inc. 5000 list of Fastest Growing Private Companies in America for 2023
  • We are a completely remote team located worldwide with 100+ employees
  • We have great benefits, including paid time off (PTO), competitive health, vision, and dental benefits, a 401(k), and team socials (yes, even remotely)
  • We care deeply about our culture and live by our company values: (1) Service that WOWs, (2) Ultimate Team Player, (3) Great Freakin' Attitude, (4) Billion Dollar Standards
  • We encourage learning, growth, and continuous improvement and create meaningful programs to support our employees' professional development
  • If you want to join a team on the ground floor, this is your chance: we are expanding beyond being an education company to become the one-stop shop for all your self-publishing needs
*Some benefits are available to our US-based employees only. 

At Publishing.com, we're dedicated to assembling teams as diverse as a kaleidoscope and fostering an atmosphere as warm as your favorite coffee shop. We understand that the job application process can sometimes feel daunting, but we’re here to offer our support. Don't hesitate to reach out with any questions or concerns about the hiring process – if you're interested in joining our ranks, we're eager to hear from you! Email us at careers@publishing.com if you need additional support.

We strive to seek out and support individuals from all backgrounds, recognizing that your unique experience contributes to the richness of our collective knowledge. We are committed to fostering an environment where we learn from each other's beliefs and experiences and celebrate the differences that will ultimately drive our innovation. We strive to ensure that every member of our team feels valued and respected, wherever they are located. Come be a part of our community – your talents and contributions are welcome!
