Who We Are
At Kyndryl, we design, build, manage and modernize the mission-critical technology systems that the world depends on every day. So why work at Kyndryl? We are always moving forward – always pushing ourselves to go further in our efforts to build a more equitable, inclusive world for our employees, our customers and our communities.
The Role
The Monitoring Tools Administrator is responsible for deploying, managing, and maintaining various IT monitoring tools used to ensure the performance, availability, and reliability of infrastructure and applications. This role involves configuring diverse monitoring platforms, integrating them with IT systems, and providing insights to optimize operations through real-time data and reporting.
With an unwavering focus on quality, robustness, and security, you'll be a driving force in implementing cutting-edge tools that enhance our operations, improve reliability, and gather valuable feedback on our platforms. Your ability to identify and mitigate common operational issues will play a crucial role in delivering seamless experiences to our customers.
If you're passionate about pushing the boundaries of technology, thrive in a collaborative environment, and are motivated by the opportunity to shape the future of reliability engineering, then we want to hear from you. Join our team and be part of a dynamic and forward-thinking organization that values innovation and excellence in everything we do.
Responsibilities:
- Implementation and Configuration:
  - Deploy and configure leading monitoring tools (e.g., Zabbix, Nagios, Prometheus, Grafana, SolarWinds, Dynatrace, New Relic) across diverse IT environments.
  - Define and implement monitoring strategies for servers, applications, networks, and cloud services.
  - Customize dashboards, alerts, and reporting tools to meet organizational needs.
- Alert and Incident Management:
  - Design and manage alerts to provide early detection of potential issues.
  - Integrate monitoring tools with ITSM platforms for seamless incident management and resolution.
  - Fine-tune thresholds and alerting mechanisms to minimize false positives.
- Optimization and Performance Monitoring:
  - Analyze performance metrics and trends to identify areas for improvement.
  - Continuously evaluate and optimize monitoring configurations to enhance accuracy and efficiency.
  - Provide recommendations for capacity planning and scalability based on monitoring insights.
- Integration and Automation:
  - Integrate monitoring tools with other IT systems (e.g., CI/CD pipelines, APM tools, cloud platforms).
  - Automate routine monitoring tasks through scripting and configuration management tools.
  - Collaborate with DevOps and SRE teams to incorporate monitoring into the software development lifecycle.
- Maintenance and Support:
  - Perform routine maintenance, updates, and upgrades for all monitoring tools.
  - Troubleshoot and resolve technical issues with monitoring systems.
  - Provide training and support to team members on monitoring tools and practices.
- Documentation and Reporting:
  - Create detailed documentation of monitoring configurations, processes, and best practices.
  - Generate and distribute regular performance reports and ad hoc analysis for stakeholders.
  - Maintain an updated inventory of monitored assets and systems.
Your Future at Kyndryl
Kyndryl has a global footprint, which means that as a Monitoring Tools Administrator at Kyndryl you will have opportunities to work on projects and collaborate with colleagues from around the world. This role is dynamic and influential – offering a wide range of professional and personal growth opportunities that you won’t find anywhere else.
Who You Are
You’re good at what you do and possess the required experience to prove it. Equally important, you have a growth mindset and are keen to drive your own personal and professional development. You are customer-focused: someone who prioritizes customer success in their work. And finally, you’re open and borderless, naturally inclusive in how you work with others.
Requirements:
- Experience and Technical Skills:
  - 5+ years of experience working with monitoring tools in IT environments.
  - Proficiency in using one or more major monitoring platforms (e.g., Zabbix, Prometheus, Grafana, SolarWinds, Dynatrace, New Relic, Datadog).
  - Strong knowledge of network protocols (SNMP, ICMP, HTTP, etc.), servers, and cloud services.
  - Experience in scripting (Python, Bash, PowerShell) for automation and customization.
  - Familiarity with container and orchestration monitoring (e.g., Kubernetes, Docker).
  - Knowledge of databases and query languages (e.g., SQL, PromQL).
- Recommended Certifications:
  - Relevant certifications such as Zabbix Certified Specialist, AWS/Azure Monitoring certifications, or Dynatrace Professional.
  - ITIL Foundation Certification for understanding IT Service Management processes.
- Soft Skills:
  - Strong analytical and problem-solving skills.
  - Ability to collaborate across teams, including IT, DevOps, and application development.
  - Excellent communication and documentation skills for technical and non-technical audiences.
- English Level:
  - Intermediate English proficiency for technical documentation and communication with global teams.
Being You
Diversity is a whole lot more than what we look like or where we come from; it’s how we think and who we are. We welcome people of all cultures, backgrounds, and experiences. But we’re not doing it single-handedly: our Kyndryl Inclusion Networks are only one of many ways we create a workplace where all Kyndryls can find and provide support and advice. This dedication to welcoming everyone into our company means that Kyndryl gives you – and everyone next to you – the ability to bring your whole self to work, individually and collectively, and support the activation of our equitable culture. That’s the Kyndryl Way.
What You Can Expect
With state-of-the-art resources and Fortune 100 clients, every day is an opportunity to innovate and to build new capabilities, relationships, processes, and value. Kyndryl cares about your well-being and prides itself on offering benefits that give you choice, reflect the diversity of our employees and support you and your family through the moments that matter – wherever you are in your life journey. Our employee learning programs give you access to the best learning in the industry to receive certifications, including Microsoft, Google, Amazon, Skillsoft, and many more. Through our company-wide volunteering and giving platform, you can donate, start fundraisers, volunteer, and search over 2 million non-profit organizations. At Kyndryl, we invest heavily in you; we want you to succeed so that together, we will all succeed.
Get Referred!
If you know someone who works at Kyndryl, when asked ‘How Did You Hear About Us’ during the application process, select ‘Employee Referral’ and enter your contact's Kyndryl email address.
Kyndryl is the world's largest provider of IT infrastructure services serving thousands of enterprise customers in more than 60 countries.