Observability Engineer

Location CO-Barranquilla
Posted Date 1 day ago (4/3/2025 11:56 AM)
Job ID 2025-3808
Number of Positions 1
Category ITO

Job Summary

The Observability Engineer will design, implement, and maintain observability solutions for complex systems and applications. This role requires a solid understanding of monitoring and observability practices, as well as expertise in tools and technologies used to collect and analyze performance, logging, and metrics data.  

Responsibilities

  • Monitoring Setup and Configuration: Configure monitoring tools to gather data from various systems, applications, and network components. Define metrics, configure data collection agents, and ensure proper connectivity and access.
  • Alert Management: Monitor alerts, perform triage to identify critical issues, analyze alert patterns, and configure escalation workflows to ensure timely response and resolution.
  • Performance Analysis and Troubleshooting: Use tool features to analyze metrics, logs, and traces. Conduct root cause analysis, troubleshoot issues, and identify areas for optimization.
  • Incident Response: Collaborate across teams to respond to incidents quickly, handling triage, communication, and coordination with stakeholders. Participate in post-incident reviews to identify improvements.
  • Dashboard and Visualization: Develop and maintain dashboards and visualizations that offer a consolidated view of system health and performance. Customize dashboards based on specific business and operational requirements.
  • Capacity Planning and Scalability: Monitor resource utilization and trends to forecast capacity needs. Collaborate on resource planning and provisioning to support scalability and optimal performance.
  • Tool Administration and Maintenance: Perform routine administration tasks for observability tools, including user management, access control, and system upgrades. Monitor the health and availability of these tools.
  • Documentation and Knowledge Sharing: Document configurations, troubleshooting steps, and best practices. Contribute to knowledge bases and share insights with the team.
  • Tool Integration and Automation: Integrate observability tools with other systems, including ticketing and incident management platforms. Automate monitoring configurations and reporting to improve efficiency.
  • Continuous Improvement and Research: Stay updated on observability trends, research new tools and methods, and continuously improve monitoring setups to align with best practices.
  • Other duties as assigned.

Skills and Experience

  • Bachelor's degree in computer science or a related technical field preferred.
  • 5+ years of experience in software engineering or IT with a focus on monitoring, alerting, and analysis.
  • Proficiency in application, cloud infrastructure, and monitoring tool administration.
  • Hands-on experience with SolarWinds, Elasticsearch (AWS OpenSearch), and similar tools (e.g., Splunk).
  • Experience with APM tools such as AppDynamics, or alternatives such as Dynatrace or New Relic.
  • Proficiency in scripting languages and data formats (Python, PowerShell, and JSON preferred).
  • Strong understanding of web services and CI/CD pipelines.
  • Ability to thrive in a fast-paced environment, with excellent problem-solving, adaptability, and teamwork skills.
  • Knowledge of Infrastructure as Code (IaC), particularly CDK and Terraform, is highly desirable.
  • Passion for DevOps, application/API monitoring, automation, and reliability.

About Auxis

This is a full-time position with a Monday-Friday work schedule, with some variations as needed, including an on-call coverage rotation. Occasional night or weekend work may be required for special projects. This position is 100% work from office.
