Enhance OpenTelemetry gRPC With a Consistent Hash Load Balancer

This article demonstrates how to leverage Envoy's consistent hash load balancing for OpenTelemetry OTLP gRPC traffic.

The use case

The OpenTelemetry Collector (OTel collector) is deployed as an agent alongside the application on remote servers. It sends telemetry data (logs, traces, metrics) from the application and the host into central storage through a gateway deployed on the Kubernetes cluster.
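
For orientation, a stripped-down agent configuration of this kind could look like the sketch below (only the metrics pipeline is shown; the gateway endpoint, receivers, and batch settings are illustrative placeholders, not the actual deployment values):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  hostmetrics:                # host-level metrics (CPU, memory, ...)
    scrapers:
      cpu:
      memory:
processors:
  batch:                      # send data in batches to the gateway
    timeout: 5s
exporters:
  otlp:
    endpoint: otel-gateway.example.com:4317   # placeholder gateway address
service:
  pipelines:
    metrics:
      receivers: [otlp, hostmetrics]
      processors: [batch]
      exporters: [otlp]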

The gateway-side OTel collector is deployed using the OpenTelemetry Operator Helm chart, with a Kubernetes HPA scaling replicas based on CPU load. Traffic is routed through a headless service because a standard Kubernetes Service is not a good fit for gRPC, as described in the linked article. With this setup, however, there is no load balancing on the Kubernetes side, which the same article also points out.
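
A headless service is simply a Service with clusterIP set to None, so DNS resolves directly to the individual collector pod IPs instead of a single virtual IP. A minimal sketch (the name, selector label, and port naming are placeholders, not the actual Helm chart output) might look like this:

apiVersion: v1
kind: Service
metadata:
  name: opentelemetry-collector-headless     # placeholder name
spec:
  clusterIP: None                            # headless: DNS returns the pod IPs directly
  selector:
    app.kubernetes.io/name: opentelemetry-collector   # placeholder selector label
  ports:
    - name: otlp-grpc
      port: 4317
      targetPort: 4317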

This lack of load balancing, combined with the OTel agents sending data in batches, causes data from the same remote host to be forwarded randomly through the OTel collector gateway replicas. The same data is then written into storage multiple times, multiplied by the number of replicas, because each replica attaches a different value to the label holding its identity. This drastically increases storage usage, and queries must aggregate the replica label away.

Let’s show it in an example.
Take one of the OTel agent metrics, otelcol_process_uptime, to which the OTel gateway adds a label called otelcol_replica holding the name of the replica. The OTel gateway has four replicas; let's query the metric using PromQL on the storage side:

avg by (otelcol_replica)(otelcol_process_uptime{hostname="xxxxxx"})

{otelcol_replica="opentelemetry-collector-5fc9f8g5sj5"}  2502046.749352578
{otelcol_replica="opentelemetry-collector-5fc9f8pfmvh"}  2502096.74889717
{otelcol_replica="opentelemetry-collector-5fc9f8rzkh4"}  2502156.749325255
{otelcol_replica="opentelemetry-collector-5fc9f8xj95v"}  2502136.749453457

As demonstrated, the data coming from the remote host are written four times into the storage.

The solution to this problem is a load-balancing mechanism that consistently routes data from the same remote source through the same OTel collector replica. That is where Envoy proxy is a perfect candidate, as it offers load balancers based on consistent hashing.

The solution

Envoy proxy is deployed with two replicas behind a headless service, placed between the ingress and the OTel collector gateway.

It is configured with a ring-hash load balancer keyed on the X-Forwarded-For HTTP header, with HTTP/2 enabled for the upstream clusters:

...
route:
  cluster: "opentelemetry-collector-cluster"
  hash_policy:
    - header:
        header_name: x-forwarded-for
...
clusters:
- name: opentelemetry-collector-cluster
  connect_timeout: 0.25s
  type: STRICT_DNS
  dns_lookup_family: V4_ONLY
  lb_policy: RING_HASH
  http2_protocol_options: {}
...

This configuration ensures that data from the same source IP flows through the same OTel gateway replica for as long as that replica exists. With this consistent routing, only one copy of the data from each remote host is written into storage.

If a replica fails, Envoy redirects the data flow to the next member of the hash ring, so for a short period two copies of the data will exist in storage, because the value of the label holding the OTel collector replica identity changes.

Conclusion

Consider a high-load environment where the OTel gateway is scaled to a large number of replicas: how much storage capacity could be saved with a consistent, reliable data flow from remote sources?

Author

Gabriel Illés
Senior DevOps Engineer

Dedicated professional with experience in managing cloud infrastructure and system administration, integrating cloud-based infrastructure components, and developing automation and data engineering solutions. Good at troubleshooting problems and building successful solutions. Excellent verbal and written communicator with a strong background in cultivating positive relationships and exceeding goals.

The entire Grow2FIT consulting team: Our team

Case study: 365.bank – Evaluating the Future: A Comprehensive Review of Bank’s New Architecture

365.bank is poised to modernize its core IT systems, including its core banking and omnichannel platforms, for various business and technological reasons. The bank opted for modern, cloud-based solutions. The primary challenge was to confirm whether this new architecture was feasible and deliverable and could effectively address the initial reasons for initiating the program. The bank needed assurance that the transition would not only be technologically sound but also align with its business objectives and future growth plans.

Solution

Employing a structured methodology, Grow2FIT’s approach for each area included:

  • An initial workshop to review the proposed TO-BE architecture and the identified issues.
  • Preparation of a draft output for each domain.
  • Follow-up workshops for collaborative refinement of these drafts.
  • Completion and finalization of the outputs.

The areas reviewed were:

  • Accounts & Cards
  • Payments
  • Consumer & Mortgage Loans
  • Corporate & Treasury
  • Data, Reporting, Compliance & CRM
  • Front-end, New Omnichannel platform integration

Result

After a strategic review, Grow2FIT advised 365.bank to proceed with a phased approach to IT system enhancement, focusing on key areas such as payment gateway functionality and new customer channels. The recommendation includes the implementation of a new Cloud Data Warehouse solution, focusing initially on incorporating new requirements into this platform.

We also recommended retaining core banking systems where beneficial. Further stages involve consideration of system evolution based on specific technological, financial, and market-driven factors. Details of the implementation are kept general to respect confidentiality agreements.

Contact Person

Martin Petrík, 365.bank Program Manager

About the client

365.bank is a Slovak bank that carries out its business activities mainly on the basis of the Commercial Code and the Banking Act. The bank offers its clients a wide range of banking and financial products and services. Its core activities include accepting deposits, providing loans, performing domestic and cross-border transfers of funds, providing investment services, performing investment activities and providing ancillary services under the Act on Securities.

Key Technologies

  • Mambu
  • Backbase
  • AWS

Leveraging OpenTelemetry for Fault-Tolerant Prometheus Metrics with Envoy Mirroring

There are many use cases where metrics collected from applications or services need to be forwarded from the local environment to remote, centralized long-term storage such as Thanos or Mimir.

This article outlines how to build a fault-tolerant, highly available solution for collecting and forwarding metrics from applications and services running in Kubernetes to remote Prometheus-compatible long-term TSDB storage. It assumes working knowledge of the components used, such as the OpenTelemetry collector, Prometheus in agent mode, and Envoy proxy request mirroring. Detailed configuration is outside the scope of this article.

The Design

The OTel collector collects metrics from the desired resources, and its pipeline is configured using OpenTelemetry collector receivers, processors, and exporters to process the collected metrics and send them to the Envoy proxy endpoint.
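
A simplified collector pipeline of this kind could look roughly like the following (the scrape target and the Envoy endpoint are placeholders, not the exact production configuration):

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: example-app               # placeholder scrape target
          static_configs:
            - targets: ['example-app:8080']
processors:
  batch: {}
exporters:
  prometheusremotewrite:
    endpoint: http://envoy-proxy.monitoring.svc:9090/api/v1/write   # placeholder Envoy listener address
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [prometheusremotewrite]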

The Envoy proxy is configured with a static route mirror policy whose upstream clusters are the Prometheus pods. This means that the Envoy proxy connects directly to the k8s pods rather than to the k8s service in front of them; each Prometheus pod represents an Envoy upstream cluster. Data is routed primarily to one of the two Prometheus replicas and mirrored to the second one.
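
In Envoy's route configuration, request mirroring is expressed with request_mirror_policies on the route; a fragment of the static configuration might look like this (cluster names and pod addresses are placeholders):

...
route:
  cluster: "prometheus-replica-a"             # primary Prometheus pod (placeholder)
  request_mirror_policies:
    - cluster: "prometheus-replica-b"         # every request is also mirrored here
...
clusters:
- name: prometheus-replica-a
  type: STRICT_DNS
  connect_timeout: 0.25s
  load_assignment:
    cluster_name: prometheus-replica-a
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: prometheus-agent-0.prometheus.monitoring.svc   # placeholder pod DNS name
              port_value: 9090
- name: prometheus-replica-b
  ...
...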

Prometheus is deployed into the k8s cluster with two replicas in agent mode with the remote-write receiver feature enabled. An external label, prometheus_replica, is also added to each instance; Thanos uses it to deduplicate series sent from high-availability Prometheus instance pairs.
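
For illustration, the relevant Prometheus settings look roughly like this (flag names correspond to recent Prometheus releases; the replica label value would normally be derived from the pod name):

# startup flags
--enable-feature=agent
--web.enable-remote-write-receiver

# prometheus.yml fragment
global:
  external_labels:
    prometheus_replica: prometheus-agent-0   # unique per replica, e.g. the pod name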

Conclusion

This design helped make monitoring more resilient and reduced time series data gaps in Grafana dashboards.

Author

Gabriel Illés
Senior DevOps Engineer

Dedicated professional with experience in managing cloud infrastructure and system administration, integrating cloud-based infrastructure components, and developing automation and data engineering solutions. Good at troubleshooting problems and building successful solutions. Excellent verbal and written communicator with a strong background in cultivating positive relationships and exceeding goals.

The entire Grow2FIT consulting team: Our team

Introducing Libor Vanek: Seasoned Technology and Banking Expert Joining Our Team

A warm welcome to Libor Vanek, our new Technology & Banking Consultant. With an extensive career spanning over two decades, Libor brings a wealth of knowledge and experience to our team. His expertise lies in data, integration, banking, and fintech, with a keen focus on aligning business stakeholders with IT delivery teams.

Libor’s approach is rooted in agile methodologies, including Scrum and Kanban, ensuring rapid, iterative deliveries that build momentum, consensus, and trust. His previous roles include Senior Data Architect at Walmart/Asda and Architecture Lead at Scroll Finance. Libor’s skillset includes enterprise and solution architecture, data mesh methodologies, and modern data stack technologies.

His addition to our team marks a significant milestone in our journey towards innovative technology solutions in (and beyond) banking.

Check our other Senior Consultants here

CloudGuard by Grow2FIT

At Grow2FIT, we offer bespoke solutions tailored for businesses of all sizes, from startups to enterprises. Our dedication is to ensure your cloud infrastructure always performs at its best. Backed by our team of seasoned experts, we pledge continuous monitoring, proactive upkeep, strategic cost optimization, and agile enhancements for an efficient, cost-effective cloud ecosystem.

Basic Package Features

Choose CloudGuard basic package for peace of mind, knowing that your cloud infrastructure is under expert watch. And when you’re ready to delve deeper into optimization and strategic planning, our advanced services are just a call away.

Price: ranges from €500 to €1,000 monthly (excluding VAT). The final quotation depends on the complexity and size of your infrastructure.

Additional Details:

  • Service hours: 5×8
  • SLA: Best effort

Additional Services

Bodyguards of your cloud

Tomáš Čorej
Grow2FIT Cloud & DevOps Consultant

Tomáš has 15 years of experience designing and building high-performance, cost-effective solutions for automating the maintenance of physical servers. He prefers commodity hardware and open-source tools such as MaaS.io, OpenStack, Terraform, Juju, or Ceph. At the same time, he has extensive experience integrating open-source tools into startup and corporate environments and operating on-premise, cloud, and hybrid solutions.

Kamil Madáč
Grow2FIT Cloud & DevOps Consultant

Kamil is a Senior Cloud / Infrastructure consultant with 20+ years of experience and strong know-how in designing, implementing, and administering private cloud solutions (primarily built on open-source solutions such as OpenStack). He has many years of experience with application development in Python and, more recently, in Go. Kamil has substantial know-how in SDS (software-defined storage), SDN (software-defined networking), data storage (Ceph, NetApp), administration of Linux servers, and operation of deployed solutions. Kamil is a regular contributor to open-source projects (OpenStack, Kuryr, the Python Requests library).

Petr Drastil
Grow2FIT Cloud & DevOps Consultant

DevOps Consultant and Architect with previous experience in software development, focusing on the design and implementation of IaaS and PaaS solutions in the cloud (AWS, Azure) and Kubernetes. Petr has worked on multiple projects delivering standardised tooling used by developers to break legacy monolithic solutions into separate services with independent lifecycles. He is also experienced in shifting applications from dedicated servers to the Kubernetes / Red Hat OpenShift platform. Petr has experience in the finance (Deutsche Börse), telco (Deutsche Telekom) and e-commerce (Walmart Global Tech) sectors.

And many others… The entire Grow2FIT consulting team: Our team

How many software development environments are needed and why?

In software engineering, a "software development environment" refers to a combination of processes, tools, and infrastructure that developers use to design, create, test, and maintain software. This includes everything from Integrated Development Environments (IDEs), such as Visual Studio, Eclipse, and IntelliJ, to foundational tools and libraries and even broader components like databases, servers, and network setups. Simply said, it denotes a particular set of infrastructure resources set up to execute a program under specific conditions.

As software advances through its life cycle, different environments address the unique requirements of the Development and Operations teams. Given today’s rapid and competitive digital business setting, development teams must fine-tune their workflows to stay ahead. An efficient workflow enhances team productivity and guarantees the delivery of prompt and reliable software.

Benefits of Harnessing Multiple Environments

Parallel Development

Software development often resembles balancing multiple tasks at once. While introducing new features, it's vital not to disrupt a live application by bringing in bugs, performance issues, or security vulnerabilities. While one part of the team might be fully occupied with crafting fresh features, another could be refining an existing version based on feedback from testing. Having segregated environments enables teams to work on different tasks without stepping on each other's toes.

Enhanced Security

Limiting access to production data is crucial. By distributing data across various environments, we strengthen the security of production data and preserve its integrity. This reduces the chance of unintentional modifications to the live data during development or testing phases.

Minimized Application Downtime

These days, application stability and uptime are more crucial than ever. Customers expect and rely on consistent service availability. Repeated disruptions can damage a company's reputation. By cultivating multiple environments and establishing rigorous testing, we position ourselves to launch robust and reliable software.

Efficient Hotfix Deployment

There are moments when a quick fix or enhancement must be rolled out with great speed. For such instances, having an environment that mirrors production closely and is free from ongoing feature development is invaluable. This dedicated environment facilitates quick feature or fix deployment, followed by testing, before a seamless transition to live production.

An In-Depth Look at Development Environments

As software evolves from an idea to a full-fledged application, it passes through various stages, each with its unique set of tools, protocols, and objectives. These stages, or environments, form the backbone of the development lifecycle, ensuring that software is crafted, refined, tested, and deployed precisely.

Local Development Environment

The initial stage of software development occurs in the local development environment. It acts as the primary workspace where developers initiate the coding process, often directly on their personal computers with a distinct project version. This setting allows a developer to construct application features without interference with other ongoing developments. While this environment is suitable for running unit and integration tests (with mock external services), end-to-end tests are typically less frequent. Developers commonly employ Integrated Development Environments (IDEs), software platforms offering an extensive suite of coding, compiling, testing, and debugging tools.

Integration Environment

At this stage in the development process, developers aim to merge their code into a team's codebase. With many developers or teams working independently, conflicts and test failures can naturally arise during this integration. In expansive projects, where multiple teams focus on distinct segments (or microservices), the integration environment becomes the critical platform where all these separate functionalities come together. Additionally, integration tests may be adjusted here to ensure application stability. Differences between teams' implementations (such as mismatched API integration points) often originate in the initial analysis stage. Furthermore, the challenge of developing cloud-native features locally emphasizes the integration environment's essential role, highlighting distinctions between local setups and actual cloud operations.

Test Environment

Also known as the quality assurance environment, it employs rigorous tests to evaluate individual features and the application’s overall functionality. Tests range from internal service interactions (integration tests) to all-inclusive tests, including internal and external services (end-to-end tests). Typically, the test environment doesn’t demand the extensive infrastructure of a production setting. The primary goal is to ensure the software meets specifications and sort out any defects before they reach production. Organizations might optimize processes by combining the integration and test environments, facilitating simultaneous initial integration and testing.

Staging Environment

The staging or pre-production environment aims to simulate the production environment regarding resource allocation, computational demands, hardware specifications, and overall architecture. This simulation ensures the application’s readiness to handle expected production workloads. Organizations sometimes opt for a soft launch phase, where the software goes through internal use before its full-scale production deployment. Access to the staging environment is typically limited to specific individuals like stakeholders, sponsors, testers, or developers working on imminent production patches. This environment’s closeness to the actual production setting makes it the go-to for urgent fixes, which, once tested here, can swiftly be promoted to production.

Production Environment

The production environment refers to the final and live phase providing end-user access. This setup includes hardware and software components like databases, servers, APIs, and other external services, all scaled for real-world use. The infrastructure in the production environment must be prepared to handle large volumes of traffic, cyber threats, or hardware malfunctions.

Other Environments

The specific needs of an application, the scale of the project, or business requirements may necessitate the introduction of additional environments. Some of the more common ones include:

  • Performance Environment: Dedicated to gauging the application’s efficiency and response times.
  • Security Testing Environment: The primary focus is to assess the application’s resilience to vulnerabilities and threats.
  • Alpha/Beta Testing Environments: These are preliminary versions of the application made available to a restricted group of users for early feedback.
  • Feature Environments: New functionalities can be evaluated in a standalone domain before being incorporated into the primary integration environment.

Summary

The software development process requires a series of specialized environments tailored to different stages of its lifecycle. The number and nature of these environments can vary based on the size and requirements of the project. For example, in some cases, to optimize workflows, the integration and testing environments might be combined into one, providing a unified platform for both merging code and conducting initial tests.

While performance-focused environments have their importance, with proper monitoring tools the production environment can sometimes remove the need for a separate performance environment.

In conclusion, the software development environment isn’t a one-size-fits-all approach. It demands careful planning and customization to fit a project’s specific goals and needs. Making the right choices in setting up these environments is critical to ensuring a smooth journey from idea to launch, ultimately delivering top-notch applications.

Author

Róbert Ďurčanský
Senior Fullstack Developer

Róbert is a highly skilled Senior Fullstack Developer with over 15 years of experience in the software development industry. With a strong background in back-end and front-end development as well as UX and graphics, and a passion for delivering high-quality solutions, Róbert has proven expertise in a wide range of technologies and frameworks. He is proficient in TypeScript, Angular, Java, Spring Boot, Kotlin, and AWS Cloud Solutions, among others. Throughout his career, Róbert has worked on various projects, including e-commerce platforms, financial systems, and game development.

The entire Grow2FIT consulting team: Our team

Reference: Raiffeisen Bank International – Designing a Digital Bank’s Data Architecture

Raiffeisen Bank International (RBI), a prominent banking group, was on the journey of launching its new digital banking platform. With the rapid digitization of banking services and the increasing demand for seamless online customer experiences, RBI recognized the imperative need for a robust and adaptable data architecture. While the bank had in-house teams proficient in traditional banking systems, they sought external expertise to harness the full potential of contemporary cloud technologies.

The Problem

RBI’s vision of its digital bank was modern, agile, and future-ready. The challenge was twofold:

  • Designing a data architecture that would be scalable, efficient, and capable of handling the vast influx of digital transactions.
  • Ensuring that the architecture, while modern, would remain compliant with internal and external regulations and seamlessly integrate with RBI’s existing systems.

Our Solution

Our specialized team of Data Consultants delved into the project with a two-pronged approach:

  • Serverless and Cloud-Agnostic Architecture: Our design principles prioritized a serverless framework on AWS. This not only ensured automatic scalability without the overhead of managing servers but also brought down operational costs. Moreover, by designing the architecture to be cloud-agnostic, we ensured that RBI would not be tethered to a single cloud provider, granting them flexibility and resilience in their digital endeavours.
  • Integration and Compliance: Acknowledging the paramount importance of security and regulation in the banking sector, our solution was meticulously tailored. We:
    • Conducted a comprehensive Requirements Analysis to ascertain the bank’s needs and align our design accordingly.
    • Crafted the Data Architecture and Data Processing blueprint utilizing a suite of cloud-agnostic services, ensuring optimal data flow, storage, and retrieval mechanisms.
    • Ensured Internal Regulation Compliance by integrating the architecture with RBI’s internal environment, embedding requisite security measures, and devising a robust security concept.

Outcome

With our intervention, Raiffeisen Bank International now boasts a state-of-the-art digital banking data architecture that stands as a beacon of efficiency, resilience, and adaptability. The bank is poised to deliver unmatched digital banking experiences to its customers while staying ahead of the curve in the rapidly evolving fintech landscape.

Key Technologies

  • AWS

Welcome Marián Ivančo: Software Architect with 20+ Years of Experience

We are pleased to announce Marián Ivančo has joined our team. With over 20 years in the field, Marián has extensive experience in designing and implementing complex IT systems. His work has covered a range of sectors, including finance, gaming, and energy.

Marián is adept at migrating from legacy systems to modern container solutions. His technical expertise includes Java, Kubernetes, cloud solutions, and container platforms. Throughout his career, he’s played pivotal roles in large-scale system integrations and migrations.

We’re looking forward to Marián’s contributions and the wealth of experience he brings to our team.

Check our other Senior Consultants here

Our Summer Teambuilding Adventure Was a Splash!

🌊☀️ Had an absolute blast at our summer teambuilding event! 🚣‍♂️🏄‍♀️

Check out this video for a sneak peek of our adventurous day filled with rafting, surfing, and more! 💦
Grateful for a team that knows how to work hard and play hard. 💪😄

Reference: Atlas Group – Monitoring, Support, and Infrastructure Development

Atlas Group is a technology-driven organization that relies on Kubernetes for its infrastructure. They sought assistance in monitoring, support, and problem-solving in their Kubernetes environment. Additionally, they required help in setting up a distributed block-based storage solution based on LINSTOR to provide persistent volumes for their pods or NFS storage.

Solution

Grow2FIT, a service provider specializing in Kubernetes and infrastructure management, partnered with Atlas Group to address their needs. The following services were provided:

  • Monitoring and Support
    • Implemented a monitoring system to identify issues or anomalies in the Kubernetes environment proactively.
    • Established a support mechanism so that problems encountered were promptly addressed and resolved by the Grow2FIT team.
    • Responded to requests for assistance regarding Kubernetes and other related technologies.
  • Problem Solving and Consultation
    • Provided consultation services to Atlas Group, offering expertise and guidance in problem-solving and troubleshooting within the Kubernetes ecosystem.
  • Infrastructure Development
    • Upgraded Kubernetes to newer versions, ensuring smooth transitions and minimizing disruptions.
    • Engaged in ongoing maintenance and problem resolution related to Kubernetes and other infrastructure components.
  • Distributed Block-based Storage (LINSTOR)
    • Assisted Atlas Group in setting up a distributed block-based storage solution based on LINSTOR.
    • Configured LINSTOR to provide persistent volumes for their pods, enabling data persistence and reliability (a typical StorageClass definition is sketched after this list).
    • Integrated NFS storage into the infrastructure, leveraging LINSTOR to extend the available storage options.
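
For illustration, a LINSTOR-backed StorageClass for dynamic provisioning of persistent volumes typically looks something like the sketch below (the name, replica count, and storage pool are placeholders, not the client's actual configuration):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-replicated                 # placeholder name
provisioner: linstor.csi.linbit.com        # LINSTOR CSI driver
parameters:
  autoPlace: "2"                           # keep two replicas of each volume (placeholder)
  storagePool: lvm-thin                    # placeholder storage pool name
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

A PersistentVolumeClaim referencing this class then gets a replicated LINSTOR volume provisioned automatically.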

Result

  • Swift identification and resolution of issues through proactive monitoring and responsive support.
  • Successful implementation of LINSTOR, providing reliable and persistent volumes for their pods.
  • A collaborative partnership between Atlas Group and Grow2FIT ensured ongoing support and consultation, enabling their infrastructure’s seamless development and enhancement.

Key Technologies

  • Kubernetes
  • LINSTOR

Contact Person

Tomáš Řehák, Head of Engineering