Nagios is still the king of monitoring. For our needs, a heterogeneous environment with dozens of servers, databases and application servers, Nagios with its plugin ecosystem remains unbeaten. But monitoring a Java application server takes more than checking whether a port responds.
JMX monitoring
The check_jmx plugin connects to the GlassFish JMX port over SSL and reads metrics: heap memory usage, thread count, loaded classes, GC time, connection pool utilization and session count.
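To make the threshold side of such a check concrete, here is a minimal sketch of how a heap reading could be turned into a Nagios result. It is not the check_jmx plugin itself. The function name `evaluate_heap` and the 80/90 percent thresholds are assumptions for illustration; only the exit codes and the performance-data layout follow the standard Nagios plugin convention.

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_heap(used_bytes, max_bytes, warn_pct=80.0, crit_pct=90.0):
    """Return (exit_code, output_line) for one heap-usage reading.

    The thresholds are illustrative defaults, not values from the article.
    """
    if max_bytes <= 0:
        # Without a known maximum we cannot compute a percentage.
        return UNKNOWN, "JMX UNKNOWN - heap maximum not reported"
    pct = 100.0 * used_bytes / max_bytes
    if pct >= crit_pct:
        state = CRITICAL
    elif pct >= warn_pct:
        state = WARNING
    else:
        state = OK
    # Performance data after the pipe: label=value[UOM];warn;crit;min;max
    perfdata = f"heap_used_pct={pct:.1f}%;{warn_pct:.0f};{crit_pct:.0f};0;100"
    return state, f"JMX {STATE_NAMES[state]} - heap used {pct:.1f}% | {perfdata}"
```

A real plugin would print the output line and exit with the returned code, for example `code, line = evaluate_heap(used, maximum); print(line); sys.exit(code)`, so that Nagios can pick up both the state and the graphable value.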
Custom plugins for business metrics
Our custom Perl plugins check the number of orders processed per hour, measure the response time of SOAP endpoints, and check the age of the last record in the audit log. Business monitoring is what the client values most.
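The audit-log check is the easiest of these to illustrate. The sketch below is written in Python rather than Perl and is not the plugin described above. The function name `check_audit_age` and the 15 and 60 minute thresholds are assumptions. How the timestamp of the last record is obtained (a database query, a log file) is left out on purpose.

```python
from datetime import datetime, timedelta, timezone

# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_audit_age(last_record_at, now,
                    warn=timedelta(minutes=15), crit=timedelta(minutes=60)):
    """Return (exit_code, output_line) based on the age of the newest audit record.

    `warn` and `crit` are hypothetical thresholds chosen for this example.
    """
    if last_record_at is None:
        return UNKNOWN, "AUDIT UNKNOWN - no audit record found"
    age = now - last_record_at
    seconds = int(age.total_seconds())
    if age >= crit:
        state, name = CRITICAL, "CRITICAL"
    elif age >= warn:
        state, name = WARNING, "WARNING"
    else:
        state, name = OK, "OK"
    # Performance data: label=value[UOM];warn;crit;min
    perfdata = (f"audit_age={seconds}s;"
                f"{int(warn.total_seconds())};{int(crit.total_seconds())};0")
    return state, f"AUDIT {name} - last record {seconds}s old | {perfdata}"
```

Passing `now` in as an argument, for example `datetime.now(timezone.utc)`, keeps the function easy to test with fixed timestamps.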
Alerting and escalation
We use a three-tier model. WARNING sends an email to the team. CRITICAL sends an SMS to the on-call admin, who has 30 minutes to respond. A CRITICAL state that lasts more than 30 minutes escalates to a senior admin. The on-call rotation changes weekly and includes developers, which motivates them to write stable code.
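The tiers and the weekly rotation can be sketched as two small functions. In Nagios itself this logic normally lives in contact and escalation configuration rather than in code, so treat this purely as an illustration of the rules. The names `notification_for` and `on_call`, the returned strings, and the rotation list passed by the caller are all hypothetical.

```python
from datetime import date


def notification_for(state, minutes_in_state):
    """Map a Nagios state and its duration to a notification tier."""
    if state == "WARNING":
        return "email:team"
    if state == "CRITICAL":
        # More than 30 minutes in CRITICAL escalates past the on-call admin.
        if minutes_in_state > 30:
            return "escalate:senior-admin"
        return "sms:on-call"
    return None  # other states are not covered by the three-tier model


def on_call(day, rotation):
    """Pick the on-call person for the week containing `day`.

    Weeks run Monday to Sunday: ordinal day 1 in Python's calendar is a Monday,
    so integer division by 7 yields a continuous week counter that does not
    reset at year boundaries.
    """
    week_index = (day.toordinal() - 1) // 7
    return rotation[week_index % len(rotation)]
```

Using a continuous week counter instead of the ISO week number avoids the same person being on call twice in a row around New Year.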
SLA reporting
We generate a monthly report from Nagios availability data. We consistently meet the 99.5% uptime target thanks to monitoring and fast response.
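It helps to see what 99.5% means in minutes. The sketch below shows the arithmetic only. The function names are assumptions, and the downtime figure would come from the Nagios availability report rather than from this code.

```python
def downtime_budget_minutes(days_in_month, target_pct):
    """Minutes of downtime allowed in a month at the given uptime target."""
    total_minutes = days_in_month * 24 * 60
    return (100.0 - target_pct) / 100.0 * total_minutes


def availability_pct(days_in_month, downtime_minutes):
    """Measured availability for the month as a percentage."""
    total_minutes = days_in_month * 24 * 60
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def meets_sla(days_in_month, downtime_minutes, target_pct=99.5):
    """True if the measured availability reaches the target."""
    return availability_pct(days_in_month, downtime_minutes) >= target_pct
```

For a 30-day month, a 99.5% target leaves a budget of about 216 minutes, or 3.6 hours, of downtime. A single undetected overnight outage would use up more than that, which is why fast detection matters for the SLA.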
Lessons learned
Monitoring is not a nice-to-have; it’s a must-have. The investment pays off the first time you catch an incident before the client does.