Smart Data Platform  ·  Component Reference INTERNAL
Platform Architecture  ·  Technical Reference

Component Reference

Architecture rationale, ChainSys-specific configuration, and service applicability for all 19 platform components across infrastructure, runtime, data, AI, observability, and deployment layers.

19 components  ·  7 groups  ·  Source: ChainSys Platform Technical Architecture 2025

Group 1  ·  3 components
Infrastructure & Security
🌐
Apache HTTPD
Reverse Proxy & SSL Termination
Infrastructure Latest stable Active
What it is
Apache HTTP Server is one of the world's most widely deployed open-source web servers. In the ChainSys architecture it operates exclusively as a reverse proxy and SSL termination point – not as a direct application server.
Role in ChainSys Platform
HTTPD sits at the outermost edge of the DMZ tier, acting as the first layer of ingress control. All inbound traffic from browsers, REST/SOAP clients, and enterprise source systems is received here. HTTPD terminates TLS, enforces HTTPS, and forwards requests into the internal web tier – ensuring no direct internet-to-backend path exists.
Configuration & Usage
  • Configured with mod_proxy and mod_proxy_http for reverse proxying to the Tomcat cluster
  • Handles SSL/TLS certificate termination – internal hops use plain HTTP within the trusted network boundary
  • HTTP → HTTPS redirect enforced at the proxy layer for all external traffic
  • Request filtering at the DMZ boundary prevents malformed or oversized payloads reaching application services
  • Works alongside Keycloak (authentication) and ActiveMQ (async messaging) within the DMZ
HTTPD handles routing and termination only. Authentication and authorisation are delegated entirely to Keycloak – HTTPD does not hold identity state.
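The bullets above translate into a vhost pair along these lines – a minimal sketch with mod_ssl and mod_proxy_http loaded, where the hostname, certificate paths, internal target, and size limit are all placeholder values:

```apacheconf
# DMZ ingress sketch – names and paths are illustrative
<VirtualHost *:443>
    ServerName platform.example.com

    SSLEngine on
    SSLCertificateFile    /etc/httpd/certs/platform.crt
    SSLCertificateKeyFile /etc/httpd/certs/platform.key

    # TLS terminates here; the hop to the web tier is plain HTTP
    ProxyPreserveHost On
    ProxyPass        / http://tomcat-haproxy.internal:8080/
    ProxyPassReverse / http://tomcat-haproxy.internal:8080/

    # Reject oversized payloads at the DMZ boundary (10 MB example)
    LimitRequestBody 10485760
</VirtualHost>

<VirtualHost *:80>
    ServerName platform.example.com
    # Enforce HTTPS for all external traffic
    Redirect permanent / https://platform.example.com/
</VirtualHost>
```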
Service Applicability
All Services (DMZ ingress)
⚖️
HAProxy
High-Availability Load Balancer
Infrastructure Latest stable Active
What it is
HAProxy is a high-performance TCP/HTTP load balancer and proxy, designed for high availability and horizontal scale-out. It is a de facto standard for open-source load balancing in enterprise Linux environments.
Role in ChainSys Platform
HAProxy is deployed at every horizontal scale tier in the ChainSys architecture – not just at the web tier. This means the Web Tier (Tomcat cluster), each Application Service cluster (dataZap, dataZen, dataZense, BOTS, SAB independently), and the Database Tier all sit behind an HAProxy instance. New nodes are added to the HAProxy backend pool without downtime.
Configuration & Usage
  • Distributes HTTP traffic across the Tomcat cluster at the Web Tier
  • Each application service (dataZap, dataZen, dataZense, Smart BOTS, SAB) scales independently behind its own HAProxy backend pool
  • PostgreSQL Primary/Replica read scale-out routed through HAProxy at the Database Tier
  • Session affinity (sticky sessions) configurable per deployment profile
  • Health checks detect failed nodes and remove them from the backend pool automatically
  • Redis cluster and ActiveMQ broker clustering complement HAProxy at the cache and messaging tiers
HAProxy enables stateless application services – shared state is stored in Redis and PostgreSQL, so any node behind HAProxy can serve any request.
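A backend pool with health checks and optional sticky sessions looks roughly like this – an illustrative haproxy.cfg fragment in which backend names, addresses, and the health endpoint are placeholders:

```haproxy
# Illustrative pools – names, ports, and node counts are placeholders
frontend web_in
    bind *:8080
    default_backend tomcat_cluster

backend tomcat_cluster
    balance roundrobin
    # Sticky sessions via cookie, enabled per deployment profile
    cookie SRV insert indirect nocache
    option httpchk GET /health
    server tomcat1 10.0.1.11:8080 check cookie tomcat1
    server tomcat2 10.0.1.12:8080 check cookie tomcat2

backend datazap_cluster
    balance leastconn
    option httpchk GET /health
    server dz1 10.0.2.11:8080 check
    server dz2 10.0.2.12:8080 check
```

Failed health checks remove a node from rotation automatically; adding a `server` line (or using runtime API scaling) grows the pool without downtime.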
Service Applicability
All Services
🔐
Keycloak
Identity & Access Management
Infrastructure Latest stable Active
What it is
Keycloak is an open-source Identity and Access Management (IAM) platform supporting SAML 2.0, OAuth 2.0, OpenID Connect (OIDC), LDAP, Kerberos, and Active Directory. It is the ChainSys platform's single authentication authority.
Role in ChainSys Platform
Keycloak is the central IdP for the entire platform. Every authentication flow – user login, service-to-service JWT issuance, SSO, MFA – passes through Keycloak. Zero Trust is enforced at every layer: JWT tokens are validated on every inbound request, with no implicit trust between internal services.
Configuration & Usage
  • One Realm per Tenant: each customer organisation maps to a dedicated Keycloak Realm, providing full identity isolation – authentication, SSO configuration, MFA policy, and IdP federation are independent per realm with no cross-realm bleed
  • Supports SAML 2.0, OAuth 2.0, OIDC, LDAP, Kerberos, Active Directory – enterprise IdP federation out of the box
  • MFA enforcement configurable per realm and per role
  • SSO across all platform services within a realm – single login, all products
  • JWT tokens issued per session; validated on every API request across all service boundaries
  • Role-Based Access Control (RBAC) fine-grained per service, per data domain, and per tenant
Tenant isolation in Keycloak is the identity layer of the platform's multi-tenancy model. Combined with dedicated database schemas per tenant, it ensures complete logical separation between customer organisations.
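The realm-per-tenant model can be sketched as a minimal Keycloak realm import – realm and client names here are illustrative, not actual ChainSys configuration:

```json
{
  "realm": "tenant-acme",
  "enabled": true,
  "sslRequired": "external",
  "clients": [
    {
      "clientId": "datazap-ui",
      "protocol": "openid-connect",
      "publicClient": false,
      "redirectUris": ["https://acme.platform.example.com/*"]
    }
  ]
}
```

Each realm exposes its own isolated OIDC endpoints (e.g. `/realms/tenant-acme/protocol/openid-connect/token`), so tokens, SSO sessions, and MFA policy never cross tenant boundaries.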
Service Applicability
All Services (platform-wide IAM)
Group 2  ·  1 component
Messaging
📨
Apache ActiveMQ
Message Broker & Event Bus
Messaging Latest stable Active
What it is
Apache ActiveMQ is a widely deployed open-source message broker supporting JMS, AMQP, MQTT, OpenWire, and STOMP. It provides reliable, asynchronous message delivery with persistence and broker clustering.
Role in ChainSys Platform
ActiveMQ sits in the DMZ as the platform's internal event bus, decoupling producers from consumers across services. It is the backbone of dataZap's CDC (Change Data Capture) pipeline – pipeline trigger events, ETL job notifications, and cross-service coordination messages are delivered through ActiveMQ queues and topics.
Configuration & Usage
  • Deployed in the DMZ alongside HTTPD and Keycloak – within the trust boundary but before the application tier
  • Used by dataZap for CDC event delivery: source system change events are queued and consumed by pipeline processors asynchronously
  • Cross-service coordination: pipeline completion events notify downstream services (e.g. dataZen validation triggers, dataZense catalog refresh)
  • Broker clustering supported for message throughput scale-out
  • Persistent queues ensure no event loss during downstream service restarts or processing delays
  • Message TTL, retry policies, and dead-letter queues configured per pipeline type
ActiveMQ is the async backbone for event-driven pipeline orchestration. For synchronous service-to-service calls, services communicate directly via REST APIs secured by JWT.
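Per-queue dead-letter routing of the kind described above is configured in the broker's destination policy – a sketch of an activemq.xml fragment, where the queue wildcard and prefix are placeholders:

```xml
<!-- Illustrative broker policy – queue names and prefixes are placeholders -->
<destinationPolicy>
  <policyMap>
    <policyEntries>
      <!-- dataZap CDC queues: failed messages go to a per-queue DLQ -->
      <policyEntry queue="datazap.cdc.>">
        <deadLetterStrategy>
          <individualDeadLetterStrategy queuePrefix="DLQ."
                                        useQueueForQueueMessages="true"/>
        </deadLetterStrategy>
      </policyEntry>
    </policyEntries>
  </policyMap>
</destinationPolicy>
```

Message TTL and redelivery counts are set on the producing/consuming side per pipeline type, so poison messages land in `DLQ.datazap.cdc.*` rather than blocking the queue.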
Service Applicability
dataZap · dataZen · dataZense Catalog
Group 3  ·  4 components
Application Runtime
☕
Spring Boot
Backend Application Framework (Current)
Application Runtime 3.1.6 Migrating → Quarkus
What it is
Spring Boot is a Java framework that simplifies the bootstrapping and development of Spring-based applications. It provides auto-configuration, embedded server support, and an extensive ecosystem of integrations.
Role in ChainSys Platform
Spring Boot is the current runtime for all ChainSys data and business platform services: dataZap, dataZen, dataZense Catalog, dataZense Analytics, Smart BOTS, Smart App Builder, and the Platform Foundation layer. It has underpinned the platform's growth from inception and remains fully operational during the Quarkus migration.
Configuration & Usage
  • Version 3.1.6 across all services currently on Spring Boot
  • Auto-configuration supports rapid adaptation and independent service deployment without XML boilerplate
  • Embedded Tomcat provides self-contained service packaging – deployed as standalone JARs
  • Transitioning to Quarkus via the strangler-fig pattern: existing services remain fully operational while new capabilities are built on Quarkus
  • Spring Security handles service-level JWT validation (tokens issued by Keycloak)
  • Spring Data JPA manages PostgreSQL persistence across all services
Spring Boot 3.1.6 is not the target state. All services are planned for Quarkus migration. New feature development should be designed for Quarkus compatibility.
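The Keycloak and PostgreSQL integration bullets above correspond to standard Spring Boot 3 configuration – an illustrative application.yml fragment in which realm, host, and database names are placeholders:

```yaml
# Illustrative application.yml – hosts and realm names are placeholders
spring:
  security:
    oauth2:
      resourceserver:
        jwt:
          # JWTs issued by the tenant's Keycloak realm are validated here
          issuer-uri: https://idp.example.com/realms/tenant-acme
  datasource:
    url: jdbc:postgresql://pg-haproxy.internal:5432/platform
  jpa:
    hibernate:
      # Schema managed externally; the service only validates it
      ddl-auto: validate
```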
Service Applicability
dataZap · dataZen · dataZense Catalog · dataZense Analytics · Smart BOTS · Smart App Builder · Platform Foundation
⚡
Quarkus
Backend Application Framework (Next-Gen)
Application Runtime Latest stable Active
What it is
Quarkus is a Kubernetes-native Java framework built for GraalVM and OpenJDK HotSpot, designed to make Java a first-class citizen in container and cloud-native environments. It features a reactive programming model (Mutiny), compile-time dependency injection (ArC), and native image compilation.
Role in ChainSys Platform
Quarkus is the strategic target runtime for all ChainSys platform services. The AI Gateway is already fully live on Quarkus, validating the stack in production. The migration follows a strangler-fig pattern – existing Spring Boot services remain operational while services are progressively moved to Quarkus.
Configuration & Usage
  • Native image compilation via GraalVM reduces container image sizes by 60–70% compared to Spring Boot fat JARs – critical for Kubernetes deployment density
  • Startup time drops from seconds (JVM Spring Boot) to milliseconds (Quarkus native) – enabling fast pod scaling in Kubernetes
  • Reactive model (Mutiny / Vert.x) provides non-blocking I/O – particularly beneficial for AI Gateway (high-concurrency LLM calls) and dataZap (high-throughput pipeline processing)
  • Compile-time dependency injection (ArC) eliminates reflection overhead – lower memory footprint per service instance
  • RESTEasy Reactive for JAX-RS endpoints; Panache for PostgreSQL persistence (replaces Spring Data JPA)
  • Native Kubernetes health check integration: readiness/liveness probes out of the box
Migration status: AI Gateway ✅ Complete  |  dataZap, dataZen, dataZense 🔄 In Pipeline  |  Smart BOTS, SAB, Foundation 🔄 Planned
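The OIDC, Panache/PostgreSQL, health-probe, and native-build bullets map onto a handful of properties – an illustrative application.properties fragment where hosts and realm names are placeholders:

```properties
# Illustrative Quarkus configuration – values are placeholders
quarkus.oidc.auth-server-url=https://idp.example.com/realms/tenant-acme
quarkus.datasource.jdbc.url=jdbc:postgresql://pg-haproxy.internal:5432/platform
# SmallRye Health exposes /q/health/ready and /q/health/live for Kubernetes probes
quarkus.smallrye-health.root-path=/q/health
# Build the GraalVM native image inside a container (no local GraalVM needed)
quarkus.native.container-build=true
```

A native build is then produced with `./mvnw package -Dnative`.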
Service Applicability
AI Gateway · dataZap (pipeline) · dataZen (pipeline) · dataZense (pipeline)
🖥️
Apache Tomcat
Web Tier Servlet Container
Application Runtime 10.x Active
What it is
Apache Tomcat is an open-source Java Servlet container and web server maintained by the Apache Software Foundation. It implements the Jakarta Servlet, Jakarta Expression Language, and WebSocket specifications.
Role in ChainSys Platform
Tomcat forms the Web Tier – the HTTP serving layer that sits between the HTTPD reverse proxy and the application service cluster. It handles HTTP request processing, session management, and routing to the correct application service. The Tomcat cluster is fronted by HAProxy for horizontal scale-out.
Configuration & Usage
  • Version 10.x (Jakarta EE 9+ compatible, required for Spring Boot 3.x)
  • Clustered deployment behind HAProxy – new Tomcat nodes added to the backend pool without traffic disruption
  • Session affinity (sticky sessions) configurable: useful for stateful web UI flows while maintaining horizontal scale
  • Handles HTTP/1.1 and HTTP/2 request processing
  • Forwards API requests to the appropriate application service cluster based on URL routing rules
  • JVM tuning (heap, GC settings) applied per-node based on traffic profile
Service Applicability
All Services (Web Tier)
🟩
Node.js
Web Application Deployment Runtime
Application Runtime v12 → v22 LTS Active (upgrading)
What it is
Node.js is a JavaScript runtime built on Chrome's V8 engine, designed for building scalable server-side applications. In the ChainSys architecture it is used specifically as the deployment runtime for applications generated by Smart App Builder.
Role in ChainSys Platform
When a developer or citizen developer builds a web or mobile application in the Smart App Builder low-code environment, the generated application is deployed as a Node.js server. The Node.js runtime hosts the app at runtime, connects to CouchDB for app data, and uses dataZap connectors for backend enterprise data access.
Configuration & Usage
  • Current version: Node.js 12 (legacy LTS – planned upgrade)
  • Target version: Node.js 22 LTS – part of the Smart App Builder modernization initiative
  • Generated apps are deployed as standalone Node.js processes with their own CouchDB data partition
  • dataZap connectors provide access to enterprise backend systems (ERP, CRM, databases) from within SAB-built apps
  • Each deployed app is isolated: separate process, separate CouchDB partition, separate Keycloak client configuration
  • Node.js 22 LTS brings V8 improvements, native ESM support, and security patches critical for production deployments
Node.js 12 has reached end-of-life and no longer receives security patches. The upgrade to Node.js 22 LTS is a planned modernization item alongside the Quarkus migration roadmap.
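Generated apps can pin the target runtime in their manifest so deployment tooling rejects stale Node versions – an illustrative package.json fragment (the package name is a placeholder):

```json
{
  "name": "sab-generated-app",
  "private": true,
  "engines": {
    "node": ">=22.0.0"
  }
}
```

With `engines` declared, package managers and CI checks can enforce the Node.js 22 LTS floor during the migration off v12.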
Service Applicability
Smart App Builder
Group 4  ·  5 components
Data Stores
🐘
PostgreSQL
Primary Relational Database
Data Store Latest stable Active
What it is
PostgreSQL is a powerful open-source object-relational database system with a strong reputation for reliability, feature robustness, and standards compliance. It is the ChainSys platform's primary persistent store for all structured data.
Role in ChainSys Platform
PostgreSQL is used across all five platform services as the source of truth for metadata, workflow state, governance records, datamart results, and platform configuration. It is the most broadly used data component in the platform – every service reads and writes to PostgreSQL.
Configuration & Usage
  • Primary/Replica configuration for read scale-out: read queries routed to replicas, writes to primary via HAProxy
  • Write scaling via vertical node sizing or table partitioning for high-volume datamart workloads
  • Dedicated schema per tenant (configurable) – provides logical data isolation without separate database instances; separate DB instance option available for high-security tenants
  • Authorization Engine enforces tenant-scoped access at query execution time – no cross-tenant data leakage
  • Stores: platform metadata, ETL job state (dataZap), MDM golden records (dataZen), catalog metadata & lineage (dataZense), governance policies, analytics datamart, SAB app definitions, Foundation audit logs
  • JSONB columns used for semi-structured metadata (catalog attributes, dynamic workflow configuration) without sacrificing relational integrity
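The schema-per-tenant and JSONB bullets above can be sketched in plain SQL – schema, table, and attribute names here are illustrative, not the actual platform data model:

```sql
-- Illustrative schema-per-tenant layout – names are placeholders
CREATE SCHEMA tenant_acme;

CREATE TABLE tenant_acme.catalog_asset (
    id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name       text NOT NULL,
    -- semi-structured metadata alongside relational columns
    attributes jsonb NOT NULL DEFAULT '{}'
);

-- GIN index makes JSONB containment queries efficient
CREATE INDEX ON tenant_acme.catalog_asset USING gin (attributes);

-- e.g. find assets flagged as PII in their dynamic metadata
SELECT name
FROM tenant_acme.catalog_asset
WHERE attributes @> '{"pii": true}';
```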
Service Applicability
dataZap · dataZen · dataZense Catalog · dataZense Analytics · Smart App Builder · Platform Foundation
🛋️
Apache CouchDB
NoSQL Document Store
Data Store Latest stable Active
What it is
Apache CouchDB is an open-source document-oriented NoSQL database that uses JSON for documents, JavaScript for MapReduce indexes, and HTTP as its API. Its multi-master replication model makes it well-suited for distributed application data.
Role in ChainSys Platform
CouchDB serves as the dedicated data store for Smart App Builder-generated applications. When SAB generates a web or mobile app, that app's runtime data (user records, form submissions, transactional app data) is stored in a CouchDB partition specific to that app. The reference in the Technical Architecture PDF is explicit: 'CouchDB would be needed only if dynamic mobile applications are to be generated.'
Configuration & Usage
  • Partitioned per tenant and per deployed application – each SAB app has its own CouchDB database
  • Accessed via the Node.js runtime (SAB app backend) and via dataZap connectors (for ETL/integration from SAB app data into enterprise systems)
  • HTTP-native API simplifies integration with Node.js runtime without ORM overhead
  • Document model suits the flexible, schema-on-write nature of SAB-generated app data
  • Multi-master replication supports offline-capable mobile apps with eventual consistency sync
  • Cluster nodes scale horizontally – additional CouchDB nodes added to the cluster as SAB app volume grows
CouchDB is not a general-purpose platform store. Its scope is strictly Smart App Builder app data. All other structured data – including SAB app metadata and configurations – is held in PostgreSQL.
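Because CouchDB's API is plain HTTP, the per-app data model needs no driver or ORM – an illustrative request sketch, with database and document names as placeholders:

```
# Illustrative CouchDB HTTP calls – names are placeholders
PUT  /tenant_acme__app_orders            # one database per deployed SAB app
PUT  /tenant_acme__app_orders/order-001  # JSON document = one app record
     { "customer": "ACME", "total": 420.00, "status": "submitted" }
GET  /tenant_acme__app_orders/order-001  # read back over plain HTTP
```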
Service Applicability
Smart App Builder
🔍
Apache Solr
Full-Text Search & Metadata Index
Data Store Latest stable Active
What it is
Apache Solr is a highly scalable, open-source enterprise search platform built on Apache Lucene. It provides full-text search, faceted search, hit highlighting, dynamic clustering, and rich document handling.
Role in ChainSys Platform
Solr provides the search and indexing engine for dataZense Catalog exclusively. When data assets are catalogued, their metadata, descriptions, classifications, and PII tags are indexed in Solr. This enables the intelligent search experience in the catalog – keyword search, faceted filtering, and metadata discovery across all catalogued enterprise data assets. The Technical Architecture PDF explicitly states: 'SOLR would be needed only if data cataloging is implemented.'
Configuration & Usage
  • Dedicated Solr cluster node in on-premise multi-node deployments (separate VM, not co-located)
  • Indexes: dataset metadata, column descriptions, business glossary terms, PII classifications, data lineage tags, usage statistics
  • Powers the intelligent search in the Data Catalog UI – keyword queries, fuzzy matching, faceted navigation by domain, sensitivity, source system
  • Updated incrementally as new datasets are catalogued or metadata is updated
  • Solr schema aligned to ChainSys metadata model – custom field types for lineage relationships and PII classification hierarchies
  • Not deployed for services that do not require catalog search: dataZap, dataZen, SAB, Smart BOTS, and Analytics have no Solr dependency
Solr is a catalog-only component. Deploying it for non-catalog use cases adds operational overhead without benefit – the architecture explicitly scopes it to the catalog service only.
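A faceted catalog query uses Solr's standard request parameters – an illustrative sketch in which the core name and field names are placeholders, not the actual ChainSys schema:

```
# Illustrative faceted catalog query – core and field names are placeholders
GET /solr/catalog/select
      ?q=description:"customer revenue"
      &fq=sensitivity:PII
      &facet=true
      &facet.field=source_system
      &facet.field=domain
```

`q` does the keyword match, `fq` narrows by sensitivity classification, and the `facet.field` parameters drive the faceted navigation counts in the Catalog UI.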
Service Applicability
dataZense Catalog
⚡
Redis
In-Memory Cache & Session Store
Data Store Latest stable Active
What it is
Redis is an open-source in-memory data structure store, used as a cache, message broker, and session store. It supports strings, hashes, lists, sets, sorted sets, bitmaps, hyperloglogs, and geospatial indexes.
Role in ChainSys Platform
Redis is the platform's shared session cache and API response cache. Because all application services are stateless (a prerequisite for horizontal scaling behind HAProxy), Redis holds the shared state that allows any node in a service cluster to serve any request. This is what makes the ChainSys horizontal scaling model work: nodes are interchangeable because state lives in Redis and PostgreSQL, not in the application process.
Configuration & Usage
  • Redis Cluster mode for horizontal cache scaling – sharding across Redis nodes as cache volume grows
  • Shared across all platform services – single Redis cluster, logically partitioned by service and tenant prefix
  • Session data: authenticated user sessions cached post-Keycloak validation to avoid repeated token introspection
  • API response caching: expensive read queries (catalog search results, analytics aggregations) cached with configurable TTL
  • Reduces database read load – cache hit rate significantly reduces PostgreSQL query pressure during peak traffic
  • Distributed lock support: used by dataZap pipeline scheduler to prevent duplicate job execution across cluster nodes
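The scheduler-lock bullet above follows the standard Redis `SET NX PX` pattern – an illustrative command sketch, with key names and TTL as placeholders:

```
# Acquire: set only if the key does not exist (NX), auto-expire in 30s (PX)
SET lock:datazap:job:42 node-7 NX PX 30000
# Replies OK when acquired; nil means another node already holds the lock

# Release only if this node still owns it (never delete a newer holder's lock)
EVAL "if redis.call('get', KEYS[1]) == ARGV[1]
      then return redis.call('del', KEYS[1]) else return 0 end"
     1 lock:datazap:job:42 node-7
```

The TTL guarantees the lock self-releases if the owning node crashes mid-job, so a duplicate run is delayed at most 30 seconds rather than blocked forever.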
Service Applicability
All Services (shared cache)
📁
Git / SVN
Source & Artifact Version Control
Data Store Latest stable Active
What it is
Git (distributed) and SVN (centralised) are version control systems for tracking changes to files and coordinating work among developers. In the ChainSys platform they serve as both integration targets (for source data) and storage for platform-managed artifacts.
Role in ChainSys Platform
Git and SVN serve two roles in the platform: as connector targets in dataZap (pulling versioned configuration or code files as source data for migration or integration pipelines), and as the versioning substrate for platform-managed artifacts – ETL mapping templates, transformation scripts, and workflow definitions created within the platform are versioned for auditability and rollback.
Configuration & Usage
  • dataZap includes Git and SVN as supported endpoint types – pipelines can read versioned files from Git/SVN repositories as source data
  • 2,000+ ready-made mapping templates (Oracle EBS, Oracle Fusion, SAP S/4HANA, Microsoft Dynamics) are versioned artifacts stored in version control
  • Custom transformation scripts and business rules authored within dataZap are versioned: rollback to any prior version on pipeline failure
  • Lineage tracking: dataZense Catalog can trace data origins back to versioned transformation artifacts for full audit trails
  • Integration with CI/CD pipelines: platform configuration artifacts can be promoted across environments (dev → staging → prod) via Git pull requests
Service Applicability
dataZap · dataZen
Group 5  ·  3 components
AI & Intelligence
🤖
AI Gateway
LLM Orchestration & Governance Layer
AI & Intelligence Quarkus + ReactJS Active
What it is
The AI Gateway is a ChainSys-built service that brokers all AI/LLM function calls between platform services and external or self-hosted language model providers. It is the first ChainSys service fully built on the Quarkus stack, validating the target architecture in production.
Role in ChainSys Platform
Every AI call within the ChainSys platform – whether from dataZap's AI-assisted data cleansing, dataZen's duplication detection, dataZense's automated metadata generation, or SAB Autonomous agentic workflows – routes through the AI Gateway. No service calls an LLM directly. The Gateway enforces governance, manages System Prompts as versioned artifacts, controls token budgets, and provides a unified audit log of all AI interactions.
Configuration & Usage
  • Built on Quarkus (backend) and ReactJS (management console) – the live reference for the platform's Quarkus migration
  • Supports: OpenAI, Anthropic, Azure OpenAI, AWS Bedrock, Google Vertex AI, and self-hosted models via Ollama
  • System Prompt versioning: prompts are managed as versioned artifacts in the Gateway console – promotion, rollback, and A/B testing of prompts without code deployment
  • Token budget management: per-service and per-model token limits enforced at the gateway, preventing runaway AI cost
  • Model routing: requests routed to different providers based on capability, cost, latency, or data residency requirements
  • Immutable audit log of all AI calls: input prompt, model used, token count, response hash – available for compliance review
  • Circuit breaker: falls back to alternative provider on LLM API timeout or rate limit
The AI Gateway is the only path from the ChainSys platform to external LLM providers. This design choice ensures governance is never bypassed and all AI interactions are logged centrally.
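The token-budget bullet above can be sketched in a few lines. This is an illustrative model only – `GatewayBudgets`, `TokenBudget`, and `BudgetExceeded` are hypothetical names, not the Gateway's real API:

```python
# Hypothetical sketch of per-service, per-model token budgets.
# Class names and limits are illustrative – not the AI Gateway's actual API.
from dataclasses import dataclass


@dataclass
class TokenBudget:
    limit: int
    used: int = 0


class BudgetExceeded(Exception):
    """Raised when a call would push a (service, model) pair over its limit."""


class GatewayBudgets:
    def __init__(self):
        self._budgets = {}  # (service, model) -> TokenBudget

    def set_limit(self, service, model, limit):
        self._budgets[(service, model)] = TokenBudget(limit)

    def charge(self, service, model, tokens):
        # Reject the call *before* it reaches the LLM provider
        budget = self._budgets[(service, model)]
        if budget.used + tokens > budget.limit:
            raise BudgetExceeded(f"{service}/{model} over token budget")
        budget.used += tokens


budgets = GatewayBudgets()
budgets.set_limit("dataZap", "gpt-4o", 1_000)
budgets.charge("dataZap", "gpt-4o", 600)      # within budget
try:
    budgets.charge("dataZap", "gpt-4o", 500)  # 600 + 500 > 1,000
except BudgetExceeded:
    print("rejected")                         # prints: rejected
```

In the real Gateway the counters would live in shared state (e.g. Redis) so every Gateway node enforces the same budget, but the rejection logic is the same shape.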
Service Applicability
AI Gateway (owns the service) · dataZap · dataZen · dataZense Catalog · dataZense Analytics · Smart App Builder
🧠
LLM Providers
Large Language Model APIs
AI & Intelligence Various Active
What it is
Large Language Models (LLMs) are AI systems trained on large text corpora that can perform natural language understanding, generation, reasoning, and code synthesis. ChainSys integrates with both cloud-hosted and self-hosted LLM providers via the AI Gateway.
Role in ChainSys Platform
LLMs power the intelligence layer across all ChainSys products: AI-assisted data cleansing and transformation suggestions in dataZap, duplication detection and golden record scoring in dataZen, automated metadata generation and PII classification in dataZense Catalog, natural language analytics queries in dataZense Analytics, and multi-step agentic workflow orchestration (SAB Autonomous) in Smart App Builder.
Configuration & Usage
  • Cloud providers: OpenAI (GPT-4o, GPT-4 Turbo), Anthropic (Claude), Azure OpenAI (enterprise data residency), AWS Bedrock (Titan, Claude, Llama), Google Vertex AI (Gemini)
  • Self-hosted: Ollama enables running open-weight models (Llama 3, Mistral, Phi-3) on-premise – used where data cannot leave the customer's environment
  • All traffic routes through the AI Gateway – providers are transparent to calling services
  • Model selection configurable per prompt type: smaller/faster models for classification tasks, larger models for complex reasoning
  • Data residency: Azure OpenAI and AWS Bedrock used for customers with regional data sovereignty requirements
  • No training data leakage: all API calls use inference endpoints only – customer data is not used for model training
Model selection is managed in the AI Gateway System Prompt configuration, not hardcoded in individual services. Switching models requires no service deployment.
Service Applicability
via AI Gateway – all services
🗄️
Vector Store
Semantic Search & RAG Foundation
AI & Intelligence TBD Roadmap
What it is
A vector store (also called a vector database) stores numerical embeddings – dense vector representations of text, metadata, or other content – and enables similarity search over them. It is foundational for Retrieval-Augmented Generation (RAG) patterns, where LLMs are grounded with retrieved context rather than relying on parametric memory alone.
Role in ChainSys Platform
The vector store is a planned addition to the ChainSys AI architecture to enable semantic search over the Data Catalog and RAG-based intelligent data discovery. Rather than keyword matching (currently served by Apache Solr), embeddings allow finding semantically similar datasets, columns, and business glossary terms β€” even when exact keywords don't match.
Configuration & Usage
  • Use case 1: Semantic catalog search – user queries like 'find all datasets related to customer revenue' match by meaning, not just keyword
  • Use case 2: AI-assisted metadata generation – embed existing catalog entries, retrieve similar ones as context for LLM-generated descriptions
  • Use case 3: SAB Autonomous context grounding – embed enterprise knowledge base documents; retrieve relevant chunks as context for agentic workflow steps
  • Technology TBD – candidates include pgvector (PostgreSQL extension), Weaviate, Qdrant, or Chroma
  • Will integrate with the AI Gateway for embedding generation (using LLM provider embedding APIs)
  • Expected to complement rather than replace Apache Solr – Solr for keyword/faceted search, vector store for semantic search
Vector Store is on the technology roadmap – not currently deployed. Target timeline aligns with the dataZense Catalog Quarkus migration.
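If the pgvector candidate were chosen, semantic catalog search would look roughly like this – a sketch only, since the technology is still TBD; table name, dimension, and query vector are placeholders:

```sql
-- Sketch assuming the pgvector candidate – names and dims are illustrative
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE catalog_embedding (
    asset_id  bigint PRIMARY KEY,
    embedding vector(1536)   -- dimension must match the embedding model used
);

-- Nearest-neighbour search by cosine distance for a semantic catalog query
SELECT asset_id
FROM catalog_embedding
ORDER BY embedding <=> '[0.12, -0.03, ...]'::vector
LIMIT 5;
```

The query vector would come from the AI Gateway's embedding API call over the user's natural-language search text.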
Service Applicability
dataZense Catalog (primary) · Smart App Builder (SAB Autonomous) · AI Gateway
Group 6  ·  1 component
Observability
📡
Apache SkyWalking
APM & Distributed Tracing
Observability Latest stable Active
What it is
Apache SkyWalking is an open-source Application Performance Monitoring (APM) and observability platform purpose-built for distributed systems. It provides distributed tracing, service topology mapping, metrics aggregation, and log correlation.
Role in ChainSys Platform
SkyWalking provides end-to-end observability across the ChainSys multi-service platform. In a distributed architecture spanning HTTPD, Tomcat, Spring Boot services, Quarkus (AI Gateway), PostgreSQL, Redis, and ActiveMQ, correlating a slow user-facing request back to its root cause requires distributed tracing – SkyWalking provides this across all service boundaries.
Configuration & Usage
  • Agent-based instrumentation: Java agents injected into Spring Boot and Quarkus services at startup – no code changes required
  • Distributed trace collection: every inbound request generates a trace ID propagated across all service hops – dataZap pipeline trigger → ActiveMQ → dataZen validation → dataZense catalog update
  • Service topology map: auto-generated call graph showing dependencies between services, databases, and external systems – used by architects and SREs to understand platform wiring
  • Latency profiling: P50/P95/P99 latency per endpoint per service – identifies slow queries and processing bottlenecks
  • Error rate alerting: threshold-based alerts on error rates per service, triggering on-call notifications
  • Database span capture: PostgreSQL and Redis calls are traced with query text and duration – slow query identification without manual EXPLAIN ANALYZE
  • AI Gateway integration: LLM call spans captured with provider, model, token count, and latency – enables AI cost and performance tracking
SkyWalking is referenced in the 2025 Technical Architecture document as the platform's APM solution. Its service topology and distributed tracing capabilities are particularly valuable as the Quarkus migration progresses – ensuring no observability gaps during the transition.
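Agent-based instrumentation amounts to attaching the SkyWalking Java agent at service startup – an illustrative launch sketch where the agent path, service name, and OAP address are placeholders:

```shell
# Illustrative agent attachment – paths and endpoints are placeholders
export SW_AGENT_NAME=datazap-service
export SW_AGENT_COLLECTOR_BACKEND_SERVICES=skywalking-oap.internal:11800
java -javaagent:/opt/skywalking/agent/skywalking-agent.jar \
     -jar datazap-service.jar
```

No source change is needed: the agent instruments the JVM at startup and reports spans to the OAP collector named in the environment variables.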
Service Applicability
All Services (platform-wide APM)
Group 7  ·  1 component
Deployment
☸️
Kubernetes / Docker
Container Orchestration & Runtime
Deployment Latest stable Active (SAB) / Roadmap (others)
What it is
Docker is a container runtime for packaging applications and their dependencies into portable container images. Kubernetes (K8s) is the de facto container orchestration platform β€” managing deployment, scaling, health, and networking of containerised workloads across a cluster.
Role in ChainSys Platform
Kubernetes and Docker are currently deployed for Smart App Builder. The containerisation of SAB validates the deployment model and operational tooling (Helm charts, ingress controllers, autoscaling) before wider rollout. The Quarkus migration is the key enabler for expanding K8s to all other services – Quarkus native images produce containers 60–70% smaller than Spring Boot fat JARs, making per-node density economically viable.
Configuration & Usage
  • Current scope: Smart App Builder only – Node.js app containers and SAB backend are deployed via Kubernetes
  • Roadmap: all other services (dataZap, dataZen, dataZense, Smart BOTS, Platform Foundation) will be containerised as they complete the Quarkus migration
  • Docker provides the container runtime; Kubernetes manages cluster orchestration – scheduling, health checks, rolling updates, self-healing
  • Kubernetes-native health probes (readiness/liveness) are first-class in Quarkus – startup time milliseconds vs seconds in Spring Boot, critical for fast pod recycling
  • Horizontal Pod Autoscaler (HPA) will replace HAProxy-based manual cluster expansion for K8s-deployed services
  • Helm charts used for deployment templating across environments (dev, staging, production)
  • On-premise and cloud-managed Kubernetes (GKE, AKS, EKS) both supported – deployment topology matches customer infrastructure
The sequence is deliberate: Quarkus migration first → container image size reduction → Kubernetes deployment economically viable. Containerising Spring Boot fat JARs would work technically but creates operational overhead that Quarkus native images eliminate.
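The probe and autoscaling bullets above combine into a deployment plus HPA pair – an illustrative sketch using a Quarkus-style service, where the image, ports, and thresholds are placeholders:

```yaml
# Illustrative Quarkus service deployment – names and thresholds are placeholders
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-gateway
spec:
  replicas: 2
  selector:
    matchLabels: { app: ai-gateway }
  template:
    metadata:
      labels: { app: ai-gateway }
    spec:
      containers:
        - name: ai-gateway
          image: registry.internal/ai-gateway:native
          # Quarkus exposes these SmallRye Health endpoints by default
          readinessProbe:
            httpGet: { path: /q/health/ready, port: 8080 }
          livenessProbe:
            httpGet: { path: /q/health/live, port: 8080 }
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-gateway
spec:
  scaleTargetRef: { apiVersion: apps/v1, kind: Deployment, name: ai-gateway }
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target: { type: Utilization, averageUtilization: 70 }
```

The HPA replaces manual HAProxy pool expansion for K8s-deployed services: pods are added or removed against the CPU target, and millisecond-scale Quarkus startup keeps scale-out responsive.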
Service Applicability
Smart App Builder · dataZap (roadmap) · dataZen (roadmap) · dataZense (roadmap) · Smart BOTS (roadmap) · Platform Foundation (roadmap)