Job Summary:
The Hypervisor, Storage, and Backup Lead is responsible for managing, maintaining, and optimizing the project's virtual infrastructure, storage systems, and data backup solutions. This role ensures high availability, performance, and compliance across virtualized environments and data systems, providing seamless support and solutions for mission-critical systems.
Key Responsibilities:
1. Hypervisor Management
- Design, implement, and maintain virtual infrastructure using hypervisor technologies, including VMware and Oracle Linux Virtualization Manager (OLVM).
- Monitor and optimize virtual machine performance, resource allocation, and utilization to ensure optimal system function.
- Ensure high availability and reliability of the virtual infrastructure through proactive maintenance, monitoring, and incident resolution.
- Manage and resolve incidents and problems within the virtual environment, collaborating with internal teams as necessary.
2. Storage Management
- Oversee the administration of storage systems, such as SAN (Storage Area Network) and NAS (Network-Attached Storage).
- Conduct regular storage health checks and utilization assessments, identifying and implementing optimization opportunities.
- Ensure adequate storage capacity is maintained to meet application and infrastructure demands.
- Resolve storage-related incidents and issues promptly to minimize downtime.
3. Backup Management
- Manage and optimize backup solutions, ensuring data protection across the project's environment.
- Monitor backup schedules and jobs to confirm successful completion, addressing any failures promptly.
- Ensure compliance with organizational data retention policies and applicable regulations.
- Resolve incidents related to the backup solution and perform necessary recovery actions.
4. Documentation and Reporting
- Maintain comprehensive documentation of virtual environments, storage configurations, and backup procedures.
- Generate regular reports on system performance, capacity planning, and backup status to support decision-making.
- Prepare Root Cause Analysis (RCA) reports for high-severity incidents and collaborate on preventive measures.
- Escalate issues to vendors when necessary to ensure timely incident resolution.
Qualifications:
- Bachelor's degree in Information Technology, Computer Science, or a related field.
- 5+ years of experience in hypervisor management (VMware, OLVM, or similar), storage systems (SAN, NAS), and backup solutions.
- Strong understanding of virtualization principles, resource allocation, and capacity planning.
- Proficiency in storage administration, health monitoring, and incident management.
- Experience with backup and recovery systems, including scheduling, monitoring, and troubleshooting.
- Excellent problem-solving skills with the ability to address incidents quickly and effectively.
- Strong communication skills and ability to generate clear, detailed documentation and reports.
Preferred Skills:
- Certifications in VMware, OLVM, or related virtualization and storage solutions.
- Experience with high-availability architectures and disaster recovery planning.
- Familiarity with compliance frameworks related to data retention and recovery.