Job Description

Responsibilities

SRE (Network Engineer)

A Site Reliability Engineer (SRE) will spend up to 50% of their time on "ops"-related work such as investigating and troubleshooting issues, responding to incidents, and maintaining playbooks and other relevant documentation. Since the system that an SRE oversees is expected to be highly available and self-healing, the SRE should spend the other 50% of their time on development tasks such as improving CI and deployment pipelines, enhancing monitoring capabilities, and keeping systems updated. Although you will be on the SRE team, this role focuses on Network Administration/Engineering. The ideal candidate is either a software engineer with a good network administration background or a highly skilled network administrator/engineer with knowledge of deployment automation, coding, and DevOps.

Responsibilities

  • Support our State and Federal enrollment programs
  • Support network architecture needs from build to operational support:
      • Firewall policy management
      • Network routing
      • Load balancer management
      • Device patching
  • Troubleshooting & Support of Network Edge Devices: Manage and resolve network issues, including low-level application interactions and performance problems at the edge
  • Patch Management: Regularly apply security patches and updates to operating systems and applications in Windows/Linux environments to maintain compliance and security
  • Endpoint Security: Collaborate with SecOps to implement and manage endpoint security solutions, including antivirus and encryption tools. Provide rapid response to critical vulnerabilities
  • Advanced Troubleshooting and Support: Provide advanced troubleshooting and root cause analysis (RCA) for hardware, software, and network issues in Windows/Linux OS environments and on edge devices
  • Ownership of product KPIs and SLA reporting (e.g., outages)
  • Availability and performance of production services.
  • Deployment of upgrades and installation of new patches.
  • Troubleshooting, error logs analysis, reports generation, capacity planning, etc.
  • Management of automated deployments into production and lower environments.
  • Collaborate with other Engineering and Operations Teams: Work closely with other teams, including Cloud, Sustainment, Security, Helpdesk, and dev teams, to ensure seamless integration and support of endpoint systems across multiple environments.
  • Performance Monitoring and Optimization: Monitor endpoint, application, and infrastructure performance, and make adjustments to improve user experience and system efficiency. Develop monitoring dashboards and system reporting.
  • User Training and Documentation: Create and maintain documentation on new systems and processes.
  • Automation and Scripting: Develop scripts and automation tools using PowerShell to streamline endpoint configuration and management tasks such as deployment, patching, and monitoring for Windows OS environments.
  • Participate in an on-call rotation to provide support for critical systems outside of standard business hours, including evenings, weekends, and holidays.

Qualifications

  • Minimum 6 years of experience supporting AWS cloud-based, highly available solutions.
  • Experience in Network Administration/Engineering
  • Extensive knowledge of AWS cloud platform networking services
  • Experience with cloud network technologies and virtualized network devices, such as firewalls and load balancers:
      • Fortinet FortiOS firewalls
      • F5 BIG-IP load balancers
      • Other vendor experience considered a plus
  • Experience with Terraform and/or Ansible and automation strategies for network appliance management
  • Experience with Linux administration
  • Desired: minimum 6 years of experience working in network, SRE, DevOps, or software engineering.
  • Certification in, or relevant experience with, networking and AWS and/or Azure cloud services is a big plus.
  • BS/MS in Computer Science, Mathematics, Engineering, or equivalent experience.