






Context
The customer wanted better visibility into its information system. No centralized monitoring solution was in place, which complicated incident analysis and slowed down decision-making.
Problem
The IT system suffered from sporadic latency and slowdowns, and there was little insight into its overall activity. Bottlenecks and the most resource-intensive applications could not be identified quickly.
Solution
I designed and deployed a complete observability stack:
- Umami to collect analytics and track user behavior in real time.
- Prometheus to aggregate system and application metrics.
- Loki + Alloy to centralize and index logs from all services.
- Grafana to visualize metrics and give teams clear, interactive dashboards.
- Automated alerting via Prometheus + Slack so that the team is notified as soon as a critical threshold is crossed.
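To illustrate the alerting idea, here is a minimal Python sketch of threshold-based alerting. It is not the configuration used in this project, where the logic lives in Prometheus alerting rules rather than in application code. The metric names, threshold values, and sample readings are all hypothetical. The only external detail relied on is that a Slack incoming webhook accepts a JSON body with a "text" field. The sketch builds that body but does not send it.

```python
import json
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str      # hypothetical metric name, not taken from the real setup
    critical: float  # an alert fires when the sample is strictly above this


def evaluate(samples: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one alert message per metric whose sample exceeds its threshold."""
    alerts = []
    for t in thresholds:
        value = samples.get(t.metric)
        # Metrics with no sample are skipped rather than treated as failures.
        if value is not None and value > t.critical:
            alerts.append(
                f"CRITICAL: {t.metric} = {value:.1f} (threshold {t.critical:.1f})"
            )
    return alerts


def slack_payload(alerts: list[str]) -> str:
    """Build the JSON body for a Slack incoming webhook (uses the "text" field)."""
    return json.dumps({"text": "\n".join(alerts)})


# Hypothetical thresholds and readings, for illustration only.
thresholds = [
    Threshold("memory_used_percent", 90.0),
    Threshold("request_latency_ms", 500.0),
]
samples = {"memory_used_percent": 93.4, "request_latency_ms": 120.0}

alerts = evaluate(samples, thresholds)
payload = slack_payload(alerts)
print(payload)
```

With these sample values, only the memory threshold is crossed, so the payload carries a single alert line. Building the message separately from sending it makes the threshold logic easy to test without a network connection.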
Impact
- Detection and removal of unnecessary processes, freeing up 15% of RAM.
- Identification of the most resource-intensive applications, which made it possible to prioritize optimization work.
- A gain of 3 hours per week for the Customer Success team, which can now focus on strategic customer support rather than on manual copy-and-paste work.