OpenTelemetry Chaos Simulator. Experience observability without writing any code

Dan Kowalski - 2024-01-05

According to Gartner, one barrier to entry in the observability and application performance monitoring (APM) market is the complexity of implementation and configuration. APM solutions often take substantial time and expertise to set up and maintain, which is difficult for organizations that lack dedicated resources or technical proficiency. The cost of APM tools and services can be a further barrier. It is important to evaluate the requirements and costs of an APM solution carefully to ensure a successful and cost-effective deployment.

APM Adoption Hurdles

If you have never used an APM solution before, you are probably researching and asking yourself questions like "What exactly does this do?" and "How easy is it to understand this tool?". With Application Performance Management (APM), getting those answers can be lengthy and time-consuming, because they usually sit in documentation or a video, or require contacting a sales representative or delving into a potentially complex do-it-yourself (DIY) solution.

This is a problem because customers want to know both the optimal route and the potential consequences when something goes wrong. To find out, however, they don't want to go through the hassle of setting up an entire infrastructure, creating subscriptions, or adding packages to their applications.

We asked ourselves how we could enable users to see the power of APM without any of the usual work required to instrument the customer's software or integrate a vendor into it. We wanted to showcase simple applications of root cause analysis using familiar technologies like SQL Server or Redis cache. The solution for us was the OpenTelemetry Chaos Simulator, an open source project from Immersive Fusion available on GitHub. With this sample, we answer the question "What exactly does OpenTelemetry or this vendor do?" with minimal effort from the user, and the entire process can be completed in a matter of minutes. There is also the OpenTelemetry Demo, which shows "a microservice-based distributed system intended to illustrate the implementation of OpenTelemetry in a near real-world environment". It is an unrelated project that solves a different problem, but be sure to check it out as well once you are more comfortable with OpenTelemetry.

Trials and tribulations

A common practice in the software industry is for potential customers to try out any software before making a purchasing decision. This is done to allow them to effectively evaluate various aspects of the software such as its features, compatibility with their existing systems, and overall usability. By doing so, customers are able to gain firsthand experience with the software and ensure that it meets their specific needs and requirements.

In order to facilitate this process, there are numerous methods that software companies employ to provide trial versions to potential customers. These include offering free trials, online demos, and even virtual machines or appliances that allow users to test the software in a controlled environment. Additionally, some companies provide access to sample source code, enabling customers to examine the underlying code and better understand the software's functionality.

The importance of facilitating this trial period cannot be overstated. Even the most exceptional software products can potentially lose a prospect if the amount of effort or work required to try out the software is perceived to be significant. A seamless and hassle-free trial experience is crucial in order to keep potential customers engaged and interested in making a purchase.

Moreover, it is imperative that the software company ensures there is no confusion or ambiguity surrounding the billing process. If customers encounter any uncertainty or complexity when it comes to understanding how they will be charged for the software, it can act as a negative factor, deterring them from proceeding with the purchase. Therefore, it is essential for software companies to provide clear and transparent information about their pricing plans and billing procedures, thus instilling confidence and trust in the potential customers.

When adopting an APM vendor, one crucial step is adding the vendor's packages to the customer application, that is, integrating the vendor's software into the application's existing codebase. This involves installing the required dependencies, configuring the APM agent or SDK, and instrumenting the application code so that it captures telemetry data and transmits it to the vendor's platform.

OpenTelemetry

We have mentioned OpenTelemetry several times already, and for good reason. OpenTelemetry is an open-source observability framework that provides a unified way to collect, process, and export telemetry data from applications. It allows developers to instrument their code to capture metrics, traces, and logs, which can then be used to monitor and troubleshoot the performance and behavior of their applications. With OpenTelemetry, you can integrate various telemetry data sources into your application, such as libraries, frameworks, and third-party services. Any vendor that supports OpenTelemetry is much easier to onboard with, because the project and its community provide libraries for a large number of languages.

Chaos Engineering

Image: Causing chaos, breaking things, unplugging a server in a network room

Chaos engineering is a practice that involves intentionally injecting failures and disruptions into a system to test its resilience and identify potential weaknesses. It is a proactive approach to ensure that a system can withstand unexpected events and continue to function properly.

By simulating various failure scenarios, chaos engineering helps organizations uncover vulnerabilities and improve the overall reliability of their systems. It allows them to identify and address potential issues before they cause significant disruptions or downtime.

Chaos engineering can be applied to different layers of a system, including infrastructure, applications, and networks. It involves techniques such as fault injection, traffic manipulation, and resource exhaustion to create controlled chaos and observe how the system responds.

The goal of chaos engineering is not to cause harm or chaos for the sake of it, but rather to gain insights into the system's behavior under stress and ensure that it can recover gracefully. By intentionally introducing failures in a controlled environment, organizations can build more resilient systems that can withstand real-world challenges.
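As a rough illustration of fault injection, the sketch below wraps an operation so that a configurable fraction of calls fail on purpose, then counts how the caller fared. It is a minimal sketch and not how any particular chaos tool works; `with_fault_injection`, `observe`, `InjectedFault`, and `read_profile` are hypothetical names made up for this example.

```python
import random
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


class InjectedFault(RuntimeError):
    """Raised in place of a real failure so injected faults are easy to tell apart."""


def with_fault_injection(
    operation: Callable[[], T],
    failure_rate: float,
    rng: random.Random,
) -> Callable[[], T]:
    """Wrap `operation` so that a fraction of calls fail on purpose."""

    def wrapped() -> T:
        # rng.random() is uniform in [0.0, 1.0), so a rate of 0.0 never fails
        # and a rate of 1.0 always fails.
        if rng.random() < failure_rate:
            raise InjectedFault("injected fault: dependency unavailable")
        return operation()

    return wrapped


def observe(operation: Callable[[], object], attempts: int) -> Dict[str, int]:
    """Call the operation repeatedly and count how the system responded."""
    counts = {"ok": 0, "failed": 0}
    for _ in range(attempts):
        try:
            operation()
            counts["ok"] += 1
        except InjectedFault:
            counts["failed"] += 1
    return counts


def read_profile() -> str:
    # Hypothetical stand-in for a call to a real dependency.
    return "profile-data"


if __name__ == "__main__":
    rng = random.Random(7)  # a fixed seed keeps the experiment repeatable
    flaky = with_fault_injection(read_profile, failure_rate=0.3, rng=rng)
    print(observe(flaky, attempts=100))
```

Passing the random generator in explicitly keeps the experiment controlled: the same seed reproduces the same sequence of failures, which makes the system's response easier to compare between runs.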

Sandboxing

Image: Programmers playing in a sandbox

Sandboxing is a technique used in software development and security to create a controlled environment for running applications. It involves isolating an application from the rest of the system, preventing it from accessing sensitive resources or causing harm to the system.

Sandboxing is commonly used to test and evaluate applications in a safe and controlled manner. It allows developers to run potentially malicious or untrusted code without risking the security and stability of the underlying system. By confining the application to a sandbox, any malicious actions or vulnerabilities are contained within the sandbox and cannot affect the rest of the system.

OpenTelemetry Chaos Simulator

Video: OpenTelemetry Chaos Simulator. Experience observability without writing any code

The OpenTelemetry Chaos Simulator is a simple Angular/ASP.NET Core application that answers the questions "What's the impact?" and "What's the risk?". It lets you inject failures into, and eject them from, a piece of software and observe how these faults influence the telemetry produced.

It is already preconfigured with an OpenTelemetry exporter so you don't need to change any code or enter any API keys.

The source code is available on GitHub. You can clone or fork it and run it locally, deploy it somewhere if you like, or just go to demo.iapm.app for an already deployed version that is ready to use.
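If you run the project locally and want to send telemetry to a different backend, it may help to know that OpenTelemetry SDKs generally honor a set of standard environment variables for the OTLP exporter. The sketch below shows how such settings could be resolved. The variable names come from the OpenTelemetry specification, but whether the simulator itself reads them is an assumption to verify against the repository; `resolve_otlp_settings` and `parse_headers` are hypothetical helpers written for this example.

```python
import os
from typing import Dict, Mapping

# Conventional local OTLP ports: 4317 for gRPC, 4318 for HTTP.
DEFAULT_ENDPOINTS = {
    "grpc": "http://localhost:4317",
    "http/protobuf": "http://localhost:4318",
    "http/json": "http://localhost:4318",
}


def parse_headers(raw: str) -> Dict[str, str]:
    """Parse a 'key1=value1,key2=value2' header string into a dict."""
    headers: Dict[str, str] = {}
    for pair in raw.split(","):
        if "=" not in pair:
            continue  # skip empty or malformed entries
        key, value = pair.split("=", 1)
        headers[key.strip()] = value.strip()
    return headers


def resolve_otlp_settings(env: Mapping[str, str]) -> Dict[str, object]:
    """Resolve exporter settings from environment-style key/value pairs."""
    # The default protocol differs between SDKs; grpc is an assumption here.
    protocol = env.get("OTEL_EXPORTER_OTLP_PROTOCOL", "grpc")
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT") or DEFAULT_ENDPOINTS.get(
        protocol, DEFAULT_ENDPOINTS["grpc"]
    )
    return {
        "service_name": env.get("OTEL_SERVICE_NAME", "unknown_service"),
        "protocol": protocol,
        "endpoint": endpoint,
        "headers": parse_headers(env.get("OTEL_EXPORTER_OTLP_HEADERS", "")),
    }


if __name__ == "__main__":
    print(resolve_otlp_settings(os.environ))
```

With a vendor that requires authentication, the API key would typically travel in `OTEL_EXPORTER_OTLP_HEADERS`; the deployed demo needs none of this because its exporter is already configured.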

Objective

Image: Multiple prospects walking on a bridge from a barren field to a circular platform in the sky

What

  1. Vendor-Neutral OpenTelemetry example
  2. End-to-end application performance management experience
  3. Demo requires no setup effort from the user

Why

  1. Simple and quick APM experience of OpenTelemetry

How

  1. No-code
  2. No-touch
  3. Readily Available OpenTelemetry libraries

Demo

  1. Sample application (Angular/.NET Core)
  2. Configured OpenTelemetry OTLP Exporter
  3. Basic chaos - failure injection/ejection (break/fix functionality)
  4. Sandboxes - users can break/fix independently of each other

The main goal of this initiative is to provide a hands-off, code-free, holistic application performance management exposure to those interested in understanding the potential of OpenTelemetry. This example uses readily available OpenTelemetry libraries to ensure vendor-neutrality and ease of understanding.

The demo at demo.iapm.app uses an OTLP Exporter, which transmits telemetry data into a demonstration Immersive APM application grid. We will also take a look at some basic uses of chaos or failure injection into isolated sandbox instances to allow different users to break or fix functionality independently of each other. Upon arrival, a new sandbox will be generated for you, unless you are reusing a previous one.
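The sandbox behavior described here (a new sandbox on arrival unless a previous one is reused, with each user's break/fix state kept separate) can be pictured with a small registry. This is a minimal sketch under stated assumptions, not the simulator's actual implementation; `Sandbox`, `SandboxRegistry`, and the flow name `sql` are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Sandbox:
    sandbox_id: str
    # Names of the flows this sandbox currently has failures injected into.
    broken_flows: Set[str] = field(default_factory=set)


class SandboxRegistry:
    def __init__(self) -> None:
        self._sandboxes: Dict[str, Sandbox] = {}

    def get_or_create(self, sandbox_id: Optional[str] = None) -> Sandbox:
        """Reuse a known sandbox, otherwise generate a fresh one."""
        if sandbox_id is not None and sandbox_id in self._sandboxes:
            return self._sandboxes[sandbox_id]
        new_id = uuid.uuid4().hex
        sandbox = Sandbox(sandbox_id=new_id)
        self._sandboxes[new_id] = sandbox
        return sandbox


if __name__ == "__main__":
    registry = SandboxRegistry()
    alice = registry.get_or_create()  # first visit: new sandbox
    bob = registry.get_or_create()    # another user: separate sandbox
    alice.broken_flows.add("sql")     # Alice breaks her SQL flow
    print(bob.broken_flows)           # Bob's sandbox is unaffected
    print(registry.get_or_create(alice.sandbox_id) is alice)  # reuse on return
```

Because the injected-failure state lives on the sandbox rather than on the application as a whole, one visitor breaking a flow never changes what another visitor sees.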

The chaos simulator consists of three straightforward steps:

  1. Inject a failure (break the sandbox) or eject it (fix the sandbox).
  2. Run a flow, with or without a failure injected.
  3. Show the result of the flow that was run.
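The three steps above can be sketched as a small class. This is only an illustration of the break, run, and report cycle; the class and method names and the flow keys `sql` and `redis` are hypothetical and are not taken from the simulator's source.

```python
from typing import Dict


class FlowFailure(Exception):
    """Signals that a flow hit an injected failure."""


class ChaosSimulator:
    def __init__(self) -> None:
        # Flow name -> whether a failure is currently injected.
        self._broken: Dict[str, bool] = {"sql": False, "redis": False}

    def _require_known(self, flow: str) -> None:
        if flow not in self._broken:
            raise KeyError(f"unknown flow: {flow}")

    # Step 1: inject (break) or eject (fix) a failure.
    def break_flow(self, flow: str) -> None:
        self._require_known(flow)
        self._broken[flow] = True

    def fix_flow(self, flow: str) -> None:
        self._require_known(flow)
        self._broken[flow] = False

    # Step 2: run the flow, which fails only while a failure is injected.
    def run_flow(self, flow: str) -> None:
        self._require_known(flow)
        if self._broken[flow]:
            raise FlowFailure(f"{flow} flow failed: injected failure")

    # Step 3: turn the outcome into a line for the terminal.
    def run_and_report(self, flow: str) -> str:
        try:
            self.run_flow(flow)
        except FlowFailure as exc:
            return f"FAILURE: {exc}"
        return f"SUCCESS: {flow} flow completed"


if __name__ == "__main__":
    sim = ChaosSimulator()
    print(sim.run_and_report("sql"))  # happy path
    sim.break_flow("sql")
    print(sim.run_and_report("sql"))  # broken path
    sim.fix_flow("sql")
    print(sim.run_and_report("sql"))  # happy path again
```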

We want to enable the most efficient and least effort-intensive way to observe a happy/broken path in a demo application, without requiring changes in the code or settings, deployment, or running the sample within a container.

There are two ways in which we display the flow results:

One is a simple text terminal that shows the output of the flow immediately. This happy/broken path representation is of limited diagnostic value because it is shown from the user's perspective, not the developer's. However, it is helpful when running through the example quickly.

The other is the configured APM tool, in this case an Immersive APM application grid. This is how developers or other personnel responsible for software application health would use APM. An APM tool can show the telemetry in detail on a single, unified screen, regardless of which computer the error occurred on.

Example flows include interacting with a SQL Database and Redis Cache. We can observe the happy path and break the flows to witness the broken path. Running either flow without injecting failure produces a SUCCESS output message in the terminal. We can break the flows by pressing the "Break" buttons. If we run the flows again, we see FAILURE messages in the terminal.

Upon opening the configured APM tool, you will immediately latch onto the sandbox you are using and will not see other users' sandbox interactions. There, you can easily observe the happy and broken path. Moreover, for the failure cause, you can view the error, the full stack trace, and any related logs.
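To make this concrete, here is a toy model of what such telemetry might carry: each recorded span is tagged with the sandbox it came from, and a failure is attached as an exception event holding the error type, message, and stack trace. It is a simplified stand-in and not the OpenTelemetry API. The `exception.*` keys mirror OpenTelemetry's semantic conventions for exception events, while `sandbox.id`, `SpanRecord`, `record_flow`, and `spans_for_sandbox` are hypothetical names chosen for illustration.

```python
import traceback
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SpanRecord:
    name: str
    attributes: Dict[str, str]
    status: str = "OK"
    events: List[Dict[str, str]] = field(default_factory=list)


def record_flow(name: str, sandbox_id: str, flow: Callable[[], None]) -> SpanRecord:
    """Run a flow and capture its outcome, tagged with the owning sandbox."""
    span = SpanRecord(name=name, attributes={"sandbox.id": sandbox_id})
    try:
        flow()
    except Exception as exc:
        span.status = "ERROR"
        span.events.append({
            "name": "exception",
            "exception.type": type(exc).__name__,
            "exception.message": str(exc),
            # format_exc() returns the traceback of the exception being handled.
            "exception.stacktrace": traceback.format_exc(),
        })
    return span


def spans_for_sandbox(spans: List[SpanRecord], sandbox_id: str) -> List[SpanRecord]:
    """Keep only the spans that belong to one sandbox."""
    return [s for s in spans if s.attributes.get("sandbox.id") == sandbox_id]


def healthy_flow() -> None:
    pass


def broken_flow() -> None:
    raise ConnectionError("redis unavailable (injected)")


if __name__ == "__main__":
    spans = [
        record_flow("redis-flow", "sandbox-a", broken_flow),
        record_flow("redis-flow", "sandbox-b", healthy_flow),
    ]
    for span in spans_for_sandbox(spans, "sandbox-a"):
        print(span.status, span.events[0]["exception.message"])
```

Filtering on the sandbox attribute is what would let a viewer show only your own interactions, and the exception event is where the error and full stack trace would come from.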

Conclusion

Image: Zoom into computer code with OpenTelemetry and APM

The OpenTelemetry Chaos Simulator demonstrates the power of observability without the usual instrumentation or integration effort. It does so through root cause analysis, showcasing happy and broken functionality with familiar technologies like SQL Server or Redis cache. By combining chaos engineering and sandboxes into a deployed solution, it simplifies or eliminates hurdles to APM adoption during the initial stages of discovery. No custom work is needed, and no subscription or trial is required. The code is not proprietary: it is open source and can be used with any vendor, including Immersive Fusion. The OpenTelemetry Chaos Simulator is helpful to us, and we hope it is for others.

In conclusion, this is a simple, no-touch, no-code, end-to-end application performance management experience in less than five minutes.

Try the chaos simulator with Immersive APM today or with another APM vendor of your choosing.


Dan Kowalski

Father, technology aficionado, gamer, grid master

About Immersive Fusion

Immersive Fusion (immersivefusion.com) is an innovator in Application Performance Monitoring and Management (APM) utilizing web, VR, and 3D technologies, and the creator of Immersive APM. Our solutions empower software and operations engineers with the ability to view and troubleshoot their applications, resulting in rapid root-cause analysis, decreased downtime, and higher productivity. Learn more about or join Immersive Fusion on LinkedIn, Mastodon, Twitter, YouTube, Facebook, Instagram, GitHub, Discord.
