Site Reliability Engineer
Gig Harbor, WA 98335 US | Fully Remote TELECOMMUTE US
Applications and services formerly hosted in private cloud and on-premises environments are moving to a multi-cloud environment (AWS/Azure). We’re looking to transform our technology stack to be cloud-native, leveraging AWS services and industry best practices related to Site Reliability Engineering.
Our client is on a digital transformation journey to implement and automate infrastructure and platform creation through infrastructure as code (IaC), with Azure DevOps as the template repository. This role will build, deploy, and manage various Azure and AWS cloud environments running on Kubernetes, microservices, and EC2.
The Site Reliability Engineers are responsible for the planning, design, management, maintenance, and support of cloud computing applications. They determine technological needs and suggest solutions that meet them. They enhance the delivery of cloud deployments and collaborate with development teams and other personnel to streamline infrastructure services.
Essential Job Functions:
- Work with development teams to accommodate tool and access requests
- Standardize monitoring and alerting environments
- Participate in patching and vulnerability remediation
- Manage IAM roles and conduct basic security audits
- Audit Pipelines and inventory accounts
- Administer source code repositories
- Work with senior engineers to scope proposed projects
- Become a member of the incident response team
- Participate in standup meetings
- Repair and recover from hardware, software, and process failures.
- Design, execute, roll out, and evangelize the cloud operating model.
- Lead in defining standards for tools supporting pipelines with a security-first mindset.
- Ensure the quality of architecture and design of systems.
- Plan and coordinate reviews and approval of technical deliverables.
- Build and execute unit tests and unit test plans.
- Monitor progress by maintaining dialogue on work and results.
- Develop installation and monitoring tools for support and operations.
- Work closely with developer teams to create an automated CI/CD pipeline.
- Conduct assessments, build blueprints and roadmaps, build proofs of concept, and write technical points of view.
- Continuously improve patterns, practices, and operational efficiency within the team.
- Contribute to training and customer support activities as needed.
- Communicate fluently with business stakeholders, product managers, researchers, and developers.
- Provide recommendations and technology-based solutions to business requirements
- Provide 1st, 2nd, and 3rd level support to Service Desk and other staff
- Create documentation of server systems, operational procedures, topology, and hardware/software inventory
- Provide technical guidance and mentoring of team members, as needed
- Maintain a good working knowledge of all firm infrastructure and applications services
- Document problem resolutions into the appropriate systems
- Troubleshoot, analyze, and resolve system and user problems
- Plan, organize, and coordinate work assignments and prioritize workload
- Rapidly deploy fixes to systems in response to newly identified stability and security threats
Qualifications:
- Knowledge and understanding of ITIL and SRE
- Bachelor’s degree from an accredited college or equivalent combination of education and experience
- 5 years of related experience preferred
- Experience with open-source databases such as PostgreSQL
- Experience with Scrum or Agile methodologies
- Knowledge of Terraform or CloudFormation
- Experience building and maintaining Kubernetes clusters
- Knowledge of AWS and other cloud services
- Experience with programming languages such as Java or .NET
- Prior experience in working on a team
- Strong communication, interpersonal and mentoring skills
- Ability to adapt to a changing environment
- Self-motivation and ability to stay focused amid distractions
- Experience with dashboarding and reporting management
- The employee may be required to report to a different local office as a normal, contemplated, and mandated incident of their employment
- Office environment with frequent computer, mouse, and keyboard use
- Alternating between sitting or standing as needed
- Hearing, talking, reaching, grasping