MONQ 5.2: a new resource-service model for root cause analysis and the immediate health map of IT objects

In June, the MONQ Digital Lab team released a new version of the platform. The object health map and the resource-service model now have new capabilities.

MONQ 5.2 helps engineers manage IT infrastructure failures more effectively and maintain the health of the entire system and its digital services.

1. In the health map of a configuration unit (an element of the resource-service model), users can rely on a synthetic parameter to quickly assess the state of their objects and how the objects affect one another.

The health of an object is calculated based on the health of the objects affecting it, as well as the monitoring events associated with it. The built-in calculation model can be easily changed. Two tools are used for this:

1) the link weight;
2) the critical factor.

The link weight is used to assess the "equivalent" effect, while the critical factor means that health is inherited directly, which suits critical nodes. To calculate health, MONQ performs two calculations: one by weight and one by critical factor. The result is the lower of the two.

For example, take a cluster of 5 objects. The first object is the master: if it fails, the cluster is broken no matter what happens to the rest. The remaining objects are additional "nodes". All five objects have a weight of 1, but the critical factor is set only for the master. According to the model, if the master fails or degrades, the state of the cluster will be no better than that of the master. If one of the other nodes fails, the cluster health will be 80%. The model thus makes it possible to quickly assess the state of the IT environment as a whole.
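The two-calculation rule can be illustrated with a minimal Python sketch. It is not MONQ's implementation: it assumes the calculation by weight is a weighted average of the influencing objects' health and that the calculation by critical factor takes the lowest health among critical links, which is consistent with the 80% figure in the example. The names `Link` and `object_health` are hypothetical, and monitoring events are left out for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Link:
    """An influence link from one object to its parent (hypothetical structure)."""
    health: float           # health of the influencing object, 0..100
    weight: float = 1.0     # link weight used in the weighted calculation
    critical: bool = False  # True if the parent inherits this health directly


def object_health(links: List[Link]) -> float:
    """Return the lower of the weighted result and the critical result."""
    if not links:
        return 100.0  # assumption: an object with no influences is healthy
    total_weight = sum(link.weight for link in links)
    # Calculation 1 (assumed): weighted average of the influencing objects.
    by_weight = sum(link.health * link.weight for link in links) / total_weight
    # Calculation 2 (assumed): worst health among critical links.
    by_critical = min((link.health for link in links if link.critical), default=100.0)
    return min(by_weight, by_critical)


# The cluster from the example: a critical master plus four ordinary nodes.
one_node_down = [Link(100.0, critical=True), Link(0.0)] + [Link(100.0)] * 3
master_down = [Link(0.0, critical=True)] + [Link(100.0)] * 4

print(object_health(one_node_down))  # 80.0
print(object_health(master_down))    # 0.0
```

With the master down, the weighted calculation alone would still report 80%, so taking the lower of the two results is what lets a critical node pull the whole cluster down.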

The service solves the difficult task of instantly recalculating the health of the entire graph, immediately taking into account any changes users make to the model.
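As a rough illustration of what recalculating a whole graph involves, the sketch below recomputes every object in dependency order, so each object is evaluated after the objects that influence it. It says nothing about how MONQ achieves its speed. It assumes an acyclic graph, positive link weights, a weighted-average calculation by weight, and leaf health taken from monitoring; `recompute_health` and the data layout are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Tuple

# node -> list of (influencing node, link weight, critical flag); hypothetical layout
Influences = Dict[str, List[Tuple[str, float, bool]]]


def recompute_health(leaf_health: Dict[str, float], influences: Influences) -> Dict[str, float]:
    """Recompute health for every node, visiting influencing nodes first."""
    sorter = TopologicalSorter(
        {node: [child for child, _, _ in links] for node, links in influences.items()}
    )
    health = dict(leaf_health)
    for node in sorter.static_order():
        links = influences.get(node)
        if not links:
            health.setdefault(node, 100.0)  # assumption: no data means healthy
            continue
        total_weight = sum(weight for _, weight, _ in links)
        by_weight = sum(health[child] * weight for child, weight, _ in links) / total_weight
        by_critical = min((health[child] for child, _, crit in links if crit), default=100.0)
        health[node] = min(by_weight, by_critical)
    return health


# A service that depends critically on a five-object cluster.
model: Influences = {
    "cluster": [("master", 1.0, True), ("n1", 1.0, False), ("n2", 1.0, False),
                ("n3", 1.0, False), ("n4", 1.0, False)],
    "service": [("cluster", 1.0, True)],
}
leaves = {"master": 100.0, "n1": 0.0, "n2": 100.0, "n3": 100.0, "n4": 100.0}
print(recompute_health(leaves, model)["service"])  # 80.0
```

Changing a weight or a critical flag in `model` and calling the function again reflects the edit in every dependent object; a production system would more likely recompute only the affected part of the graph.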

2. The new resource-service model is integrated with the configuration unit (UC) card.

A tab showing the UC health calculation has been added to the card. This greatly helps in root cause analysis: you can see in detail which factors affect the object most negatively and follow the whole chain. In upcoming releases, all the information currently available in the old system interface will move to the new card.

3. MONQ 5.2 can now save the positions of graph vertices in the resource-service model: in addition to the automatic layout algorithm, the user can fix vertices on the graph. A combined mode is also available, in which some vertices are fixed and the rest are placed by the automatic layout algorithm. The key point is that vertex positions are not "global" but are stored in a specific user filter. You can therefore build different views and "maps" of the resource-service model and share the filter with other teams.
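The combined mode can be pictured with a short Python sketch: fixed positions are looked up per filter, and only the remaining vertices go through automatic placement. Everything here is an assumption made for illustration, including the function names, the dictionary layout, and the circle placement that stands in for MONQ's actual layout algorithm.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float]


def circle_layout(vertex_ids: List[str], radius: float = 100.0) -> Dict[str, Position]:
    """Stand-in for an automatic layout: spread vertices evenly on a circle."""
    count = len(vertex_ids)
    return {
        vertex: (
            radius * math.cos(2 * math.pi * index / count),
            radius * math.sin(2 * math.pi * index / count),
        )
        for index, vertex in enumerate(vertex_ids)
    }


def layout_for_filter(
    vertex_ids: List[str],
    fixed_by_filter: Dict[str, Dict[str, Position]],
    filter_id: str,
) -> Dict[str, Position]:
    """Use the positions saved in this filter; lay out the rest automatically."""
    fixed = fixed_by_filter.get(filter_id, {})
    free = [vertex for vertex in vertex_ids if vertex not in fixed]
    positions = circle_layout(free) if free else {}
    # Saved positions apply only to vertices present in the current view.
    positions.update({v: pos for v, pos in fixed.items() if v in vertex_ids})
    return positions


# Positions are stored per filter, so two teams can see different maps.
saved = {"network-team": {"db": (10.0, 20.0)}}
vertices = ["db", "app", "lb"]
print(layout_for_filter(vertices, saved, "network-team")["db"])  # (10.0, 20.0)
print(layout_for_filter(vertices, saved, "ops-team")["db"])      # (100.0, 0.0)
```

Keying the saved positions by filter rather than by vertex alone is what keeps one team's arrangement from overwriting another's.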

4. Unlike the previous version, 5.2 supports asynchronous data transfer over WebSockets. Unit card statuses and health values are all updated via WebSockets.

5. Thanks to the d3.js library, performance is significantly improved when building and displaying large graphs.


Some details about MONQ that you may have missed.

MONQ is an AIOps platform for log analysis, functional UX monitoring, and automated incident management.

MONQ reduces business risks and improves financial performance that depends on the reliability of IT infrastructure and digital services. The platform increases IT productivity with AI, machine learning, and automation.

MONQ reduces the number of failure notifications that have to be processed manually, centralizes IT monitoring, and automatically identifies the events most important to the business and informs teams about them. The platform resolves problems automatically and has an effective notification system.
MONQ makes root cause analysis more efficient and speeds up the investigation of IT incidents by 70-95%.

MONQ helps engineers shed routine work and free up time for more important things: creativity, development, and growth. Routine work is left to robots.

MONQ Digital Lab is among the winners of the Cybersecurity Challenge 2020 and Startup Village 2020.