[interview] How to use data to navigate complex systems better? Talking about monitoring and observability.

In today's constantly evolving technological landscape, businesses are looking for solutions that help them manage complex systems and ensure optimal performance. However, although 90% of IT professionals believe observability is important and strategic to their business, only 26% say their observability practice is mature. Why the gap? Why do businesses struggle to implement monitoring and observability at a high level?

In this interview, we explore the significance of monitoring and observability. Our guests, Edyta Kałka, Senior Engineering Manager, and Maciej Manturewicz, Director of Engineering, share their insights, real-world experiences, and opinions on how to create effective monitoring and observability solutions.

Starting from the ground up, what are monitoring and observability, and why do we need them?

Edyta Kałka: Monitoring and observability play a crucial role in understanding and managing complex systems. Nowadays, systems are highly complex and distributed, consisting of interconnected components and services. This complexity makes it challenging to understand and manage their behavior and performance effectively.

Maciej Manturewicz: And that's where monitoring and observability come in. These activities provide valuable insights into such systems, enabling better management and troubleshooting. During this interview, we’ll sometimes use the abbreviation M/O instead of the full name, monitoring and observability.

Which features of monitoring and observability make them so helpful to the modern tech industry?

MM: Firstly, with the shift towards microservices architecture and containerization, systems have become even more complex. Traditional monitoring approaches aren't sufficient anymore because numerous independent microservices communicate with each other. Observability provides a comprehensive view of the system by collecting and analyzing data from various sources, allowing for better monitoring and troubleshooting.

Additionally, the widespread adoption of cloud computing and dynamic infrastructure has added new layers of complexity. Applications and services can now scale rapidly, and infrastructure components may change dynamically. Monitoring and observability are crucial in understanding the performance and health of these dynamic environments, ensuring efficient resource utilization, and quickly identifying issues.

EK: Furthermore, businesses today are increasingly focused on providing exceptional user experiences. Monitoring and observability allow organizations to gain deep insights into how their applications and services perform from the user's perspective. By tracking user interactions, monitoring can identify performance bottlenecks, usability issues, and other factors that impact the overall user experience.

How do you assist M/O solution providers? 

MM: We offer expertise in various areas. We can boost clients' solutions with one of our vertical strengths; for example, we can help enhance generic APM solutions with advanced functionalities from NPM (network performance monitoring). 

Another example may be a client that has a brilliant algorithm to analyze performance data but struggles with visualizing it. In such cases, we provide a UI development team with experience in observability solutions to assist them.

Yet another case is a monitoring product vendor whose SDLC (software development life cycle) process is inefficient. In that case, our Platform Engineering team can step in and build an efficient CI/CD platform, automate QA processes, and help speed up development.

So, you help M/O solution providers build their solutions, but you don't want to build your own product?

MM: We differentiate ourselves as an engineering services provider rather than a product company for several compelling reasons. 

Firstly, we prioritize building solid partnerships with our clients. We believe that competing with them would negatively impact our collaborative approach. Instead, we prefer to explore and leverage various technologies, programming languages, and tools across different domains, allowing us to deliver comprehensive solutions. 

However, if a highly customized solution is required, we can integrate open source software (OSS) into a cohesive and tailored system. This versatility enables us to adapt to unique client requirements and address their needs.

Could you tell me more about your expertise in the field of monitoring and observability?

EK: Our team has a core unit that has worked for major APM (application performance monitoring) solutions providers for several years. Additionally, our portfolio includes clients from different flavors of the monitoring and observability space. We stay updated by attending relevant conferences like KubeCon and Monitorama, where we gain insights from industry experts. Even during general tech conferences, we make it a point to attend dedicated talks on monitoring and observability. By staying on top of new technologies, we can provide comprehensive expertise to our clients.

I noticed you mentioned "M/O flavors". Could you explain what those flavors are?

MM: M/O flavors refer to different aspects of monitoring and observability in the context of network and application infrastructure.

Monitoring involves collecting, processing, aggregating, and displaying real-time quantitative data about a system. It helps teams understand the system's health, address issues, gain measurable insights into user experience, and respond better to customer needs. Observability allows for diagnosing a system's internal state based on its external outputs. Building observability in from the early system design phase provides deep understanding and rich context for effective troubleshooting.

Both terms are quite broad, so when going into detail, individual specialities need to be considered. Under the heading of M/O flavors we can include:

  • infrastructure monitoring, 
  • network observability, 
  • application performance management, 
  • log monitoring, 
  • alerting, 
  • events and incidents correlation and management, 
  • observability-driven load management, 
  • security observability, and more.

I should mention that this list includes only the flavors we consider the most important. We’ll describe just a few of them to give you a better sense of how different they are.

One of the flavors is the security domain. Organizations can monitor traffic between microservices and detect security vulnerabilities faster and more efficiently. This visibility into the network enables network teams to troubleshoot issues, optimize performance, protect against threats, and plan for growth.

EK: However, achieving network observability can be challenging in today's complex and dynamic network environments. Networks span multiple data centers, clouds, and edge locations, and configurations constantly change due to software-defined networking, containerization, orchestration, and automation.

To cover this, it is important to collect and analyze network data from various sources and formats like flow data, internet routing data, synthetic tests, network metrics, and even contextual information about infrastructure, applications, users, customers, and policies. 

Moreover, with a network performance monitoring (NPM) solution, organizations can detect and resolve issues faster, optimize network performance and costs, protect against threats like DDoS attacks, and effectively plan for future network needs. It provides complete visibility into networks and their impact on business outcomes and user experience. 
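As a rough illustration of the flow-data analysis mentioned above, here is a minimal Python sketch that totals bytes per source address to find the "top talkers". The `Flow` record and its field names are assumptions made for this example, not the schema of any particular NPM product or flow export format.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Flow(NamedTuple):
    """Hypothetical, simplified flow record (real flow exports carry many more fields)."""
    src: str
    dst: str
    byte_count: int


def top_talkers(flows: Iterable[Flow], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n source addresses that sent the most bytes, largest first."""
    totals: Counter = Counter()
    for flow in flows:
        totals[flow.src] += flow.byte_count
    return totals.most_common(n)
```

The same aggregation could be keyed by destination address instead, which is one simple way to spot a single target receiving an unusual share of traffic.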

MM: The next flavor is log monitoring. It allows businesses to gain insights by continuously analyzing logs and detecting critical events or anomalies. Applications write logs as events happen, which makes logs a rich source of information about the application. Effective log monitoring can support real-time monitoring, visualizations, and integration with other monitoring tools, enabling comprehensive observability across the entire company infrastructure.

However, it can be challenging because log and message formats often differ widely from one application to another.
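To make the format problem concrete, here is a minimal Python sketch that maps two different log layouts onto one common record and then filters for critical events. Both layouts are assumptions for illustration: a JSON line with the hypothetical fields `ts`, `severity`, and `msg`, and a plain-text line of the form `<timestamp> <LEVEL> <message>`.

```python
import json
import re
from typing import Iterable, Iterator, Optional

# Assumed plain-text layout: "<timestamp> <LEVEL> <message>"
PLAIN_LINE = re.compile(r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$")


def normalize(line: str) -> Optional[dict]:
    """Map one raw log line to a {timestamp, level, message} record, or None if unparseable."""
    line = line.strip()
    if not line:
        return None
    if line.startswith("{"):
        try:
            raw = json.loads(line)
        except json.JSONDecodeError:
            return None
        # The field names "ts", "severity", and "msg" are assumed for this example.
        return {
            "timestamp": raw.get("ts"),
            "level": str(raw.get("severity", "")).upper(),
            "message": raw.get("msg", ""),
        }
    match = PLAIN_LINE.match(line)
    return match.groupdict() if match else None


def critical_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield normalized records whose level is ERROR or CRITICAL."""
    for line in lines:
        record = normalize(line)
        if record and record["level"] in {"ERROR", "CRITICAL"}:
            yield record
```

Once every source is reduced to the same record shape, alerting and visualization code only has to understand one format, and each new log layout needs only one more parser.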

Another flavor we want to describe further is observability-driven load management. It involves collecting and analyzing various metrics and data points related to the system's behavior, resource utilization, and workload characteristics. This allows us to gain insights into its operational state and make informed decisions to optimize resource allocation and handle load fluctuations effectively.

Even a healthy application requires load management: unexpected traffic spikes can reveal insufficient capacity allocation, and service upgrades can introduce performance regressions due to bugs. Load management is also needed when unpredictable traffic spikes occur during new product launches or sales promotions.

Observability-driven load management is a remedy for these problems. It limits their harmful effects by automatically managing how different types of users access application resources.
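One possible shape of this idea, sketched in Python under stated assumptions: a manager that watches recent request latencies and, when the observed 95th percentile exceeds a target, sheds lower-priority traffic first. The tier names (`premium`, `standard`, `background`), the thresholds, and the `LoadManager` class are all hypothetical and only illustrate the principle, not a production design.

```python
import math
from collections import deque

# Hypothetical user tiers; a lower number means higher priority.
PRIORITY = {"premium": 0, "standard": 1, "background": 2}


class LoadManager:
    """Admit or shed requests by user tier, based on recently observed latency."""

    def __init__(self, latency_target_ms: float, window: int = 100):
        self.latency_target_ms = latency_target_ms
        self.samples = deque(maxlen=window)  # keeps only the most recent observations

    def observe(self, latency_ms: float) -> None:
        """Record the latency of one completed request."""
        self.samples.append(latency_ms)

    def p95(self) -> float:
        """Nearest-rank 95th percentile of the current window (0.0 if empty)."""
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        return ordered[math.ceil(0.95 * len(ordered)) - 1]

    def admit(self, tier: str) -> bool:
        """Decide whether a request from the given tier should be served now."""
        ratio = self.p95() / self.latency_target_ms
        if ratio <= 1.0:
            lowest_admitted = 2   # healthy: serve every tier
        elif ratio <= 2.0:
            lowest_admitted = 1   # moderate overload: shed background traffic
        else:
            lowest_admitted = 0   # severe overload: serve premium only
        # Unknown tiers are treated as the lowest priority.
        return PRIORITY.get(tier, 2) <= lowest_admitted
```

A caller would invoke `observe()` after each request completes and `admit()` before accepting a new one, so the admission decision always follows the measured state of the system instead of a static limit.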

The last question: How about AI? Do you see a place for it in the monitoring world?

EK: We firmly believe that any tool without AI capabilities is at a significant disadvantage in today's market. The integration of AI has become crucial for staying competitive and meeting the evolving needs of users.

AI is revolutionizing monitoring and observability by delivering predictive analytics. AI-powered systems can anticipate future occurrences that may impact availability and performance, enabling proactive identification and prevention of potential issues for uninterrupted services. 

Service analytics allows for analyzing both IT and business data, uncovering behavioral patterns, and identifying positive business outcomes through business value dashboards, enabling informed decision-making. Moreover, with AI's causal analysis, organizations also gain insights into the root causes of availability or performance problems. 

Automated workflows driven by AI empower systems to adapt and optimize outcomes by dynamically adjusting configurations and acting proactively. 

The possibilities are even broader, but the above examples show how organizations can gain an edge, enhancing performance and delivering exceptional user experiences with the help of AI. 

_

Edyta Kałka – Senior Engineering Manager

Edyta specializes in network monitoring product design and innovation. She started as a software developer and then moved into managing development teams. At CodiLime, she is a Senior Engineering Manager and cooperates with clients in the network and cloud performance monitoring industry. Privately, she is a salsa dancer and lover of water sports, including surfing & kitesurfing.

Maciej Manturewicz – Director of Engineering

Maciej has strong experience in delivering full-stack software projects for cloud, microservices and monitoring. A DevOps culture enthusiast, he is passionate about QA automation and perfecting CI/CD.

Karolina Rusinowicz – Content writer

A content writer with a passion for software development and a unique blend of creativity and technical expertise. Karolina has been crafting engaging and insightful articles in collaboration with seasoned developers.
