Gig Harbor, WA 98335 US | Fully Remote TELECOMMUTE US
Applications and services formerly hosted in our private cloud and on-premises environments are moving to a multi-cloud environment (AWS/Azure). We’re looking to transform our technology stack to be cloud-native, leveraging AWS services and industry best practices in Site Reliability Engineering.
This role specializes in the creation, testing, and implementation of cloud-based infrastructure. It is responsible for the planning, design, management, maintenance, and support of cloud computing applications. The role determines technological needs and suggests solutions that meet them. It also enhances the delivery of cloud deployments and collaborates with development teams and other personnel to streamline infrastructure services.
This role evaluates older applications and determines whether they are viable candidates for migration to cloud services. It designs, implements, and manages cloud-based systems for businesses and, when appropriate, helps debug cloud stacks. It collaborates with engineering and development teams to evaluate and identify optimal solutions, and educates teams on the implementation of new cloud technologies and initiatives.
Essential Job Functions:
- Repair and recover from hardware, software, and process failures.
- Provide backup and recovery support and guidance for cloud resources.
- Design, execute, roll out, and evangelize the cloud operating model.
- Lead in defining standards for tools supporting pipelines, with a security-first mindset.
- Coach other engineers in best practices and encourage experimentation.
- Ensure the quality of architecture and design of systems.
- Plan and coordinate reviews and approval of technical deliverables.
- Build and execute unit tests and unit test plans.
- Produce data-based reports on technology risk for senior management.
- Develop infrastructure documentation and technology mappings in compliance with SOPs.
- Monitor progress by maintaining dialogue on work and results.
- Develop installation and monitoring tools for support and operations.
- Work closely with developer teams to create an automated CI/CD pipeline.
- Conduct assessments, build blueprints and roadmaps, build proofs of concept, and write technical points of view.
- Perform audit checks for security, process, and resource compliance.
- Continuously improve patterns, practices, and operational efficiency within the team.
- Contribute to training and customer support activities as needed.
- Communicate fluently with business stakeholders, product managers, researchers, and developers.
- Provide recommendations and technology-based solutions to business requirements
- Provide 1st, 2nd, and 3rd level support to Service Desk and other staff
- Troubleshoot problems, answer hardware and software questions, and provide general technical assistance to the firm.
- Create documentation of server systems, operational procedures, topology, and hardware/software inventory
- Maintain and coordinate servicing of all network-connected servers and peripherals
- Provide technical guidance and mentoring of team members, as needed
- Maintain a good working knowledge of all firm infrastructure and application services
- Maintain asset records
- Document problem resolutions into the appropriate systems
- Troubleshoot, analyze, and resolve system and user problems
- Plan, organize, and coordinate work assignments and prioritize workload
- Provide backup coverage for other Cloud Engineers when needed
- Rapidly deploy fixes to systems in response to newly identified stability and security threats
- Knowledge and understanding of ITIL and SRE
- Other activities as may be assigned by your manager
Qualifications:
- Bachelor's or graduate degree in computer engineering, computer science, engineering, or information systems, or an equivalent combination of education and experience
- Three to five years of related experience
- Fluent in Python, PowerShell, and a variety of other programming languages, software, and systems
- Strong computing and scripting skills
- Comfortable with Agile practices
- Able to work in an environment using cloud systems
- Familiar with SaaS processes and products
- Experience leading projects
- Excellent communication and interpersonal skills, and a professional appearance
- Proficiency in backup integrity and recovery coordination, along with disaster preparedness planning
- Knowledge of WAN/LAN networking protocols, including DHCP, DNS, and WINS
- Self-motivated, with the ability to work in both a team environment and individually
- Knowledge of the relevant computer systems, applications, and equipment, including technical terminology and concepts, sufficient to provide customer support
- Ability to understand and comply with the relevant department and/or corporate policies, procedures, and guidelines as they pertain to customer support
- Knowledge of relevant commonly used concepts, best practices, and procedures
- Knowledge of local and remote computer system diagnostic tools
- Strong communication (oral, technical, and written) skills
- Strong analytical ability, good judgment, and strategic, multidimensional thinking
- Solves problems using interpersonal and diplomacy skills
- Detail-oriented and organized
- Team player with strong commitment and dedication to the position
Physical Requirements and Working Conditions:
- The employee may be required to report to a different local office as a normal, contemplated, and mandated incident of their employment
- Ability to lift and carry 50 pounds
- Travel up to 25% of the time
- Office environment with frequent computer, mouse, keyboard use
- Alternating between sitting or standing as needed