
technology

Technologies and platforms

Tools are interchangeable. Architecture principles remain. This overview shows the open source and open technologies that Kaufman AIS works with or typically uses in enterprise knowledge systems. It is not tech stack marketing but transparency about capabilities, operations and integration options, written for management, IT management, enterprise architecture and engineering in medium-sized and large companies.

Arrange a conversation

Enterprise knowledge system architecture with open technologies and modular platforms

Our selection principles

In enterprise AI, open source and open platforms are means, not ends in themselves. We choose building blocks based on architectural principles, not tool religion. The following principles guide enterprise AI architecture and operations in our projects.

Open standards

Open interfaces and formats facilitate integration, portability and long-term operability across system boundaries.

API First

Services and knowledge layers can be accessed via APIs. This creates modular enterprise knowledge systems instead of monolithic isolated solutions.

Cloud optional

Architectures work in cloud, private cloud and on premise. Sovereign AI and self-hosted AI remain plannable.

Self-hosted possible

Critical components can be operated in your own data center. Control over data, models and access remains with the company.

Modular

Building blocks are interchangeable. Retrieval, memory, inference and assistance remain decoupled and expandable.

Portable

Workloads and configurations are transferable between environments. No unnecessary vendor lock-in at the infrastructure level.

Model Agnostic

Models are interchangeable. What matters is grounding, governance and connection to company knowledge, not a single provider.

Operational

Technology must work in enterprise operations. Observability, security, scaling and maintainability are part of the architecture.

Knowledge systems and retrieval

Knowledge systems combine sources, index, parsing and retrieval. These building blocks make corporate knowledge discoverable and addressable. They are part of the knowledge layer, not a replacement for governance and permissions.

OpenSearch

Open source search and analysis platform for full text, aggregations and scalable queries.

Full text index and enterprise search

Elasticsearch

Widely used search and analytics engine with rich indexing and observability ecosystem.

Search and analytics layers

Qdrant

Vector database with a focus on performance, filtering and production operations.

Vector index for semantic retrieval

Weaviate

Vector database with hybrid search, modules and graph functions.

Hybrid retrieval and knowledge objects

Meilisearch

Fast, developer-friendly search engine for low-latency applications.

Application-oriented search and typeahead

LlamaIndex

Framework for data connectors, indexing and RAG pipelines over heterogeneous sources.

Orchestration of retrieval pipelines

Haystack

Open source framework for NLP pipelines, QA and production-ready RAG systems.

Pipeline framework for retrieval and QA

Apache Solr

Enterprise Search Server based on Lucene with faceted search and scaling.

Classic enterprise full-text index

Apache Tika

Content detection and text extraction from documents and binary formats.

Parsing and metadata extraction

Unstructured

Pipeline for document parsing, chunking and unstructured content preparation.

ETL for unstructured knowledge sources
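
Several of the building blocks above combine full-text and vector retrieval. One common way to merge the two result lists is reciprocal rank fusion. The following is a minimal sketch under stated assumptions, not how any particular engine implements hybrid search: the document IDs are hypothetical, and `k=60` is a commonly used default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID so the output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a full-text index and a vector index.
keyword_hits = ["policy-7", "manual-2", "faq-9"]
vector_hits = ["manual-2", "wiki-4", "policy-7"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, without having to compare raw scores from two different engines.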

Knowledge Runtime and Enterprise Memory

Enterprise memory needs more than vectors. Knowledge objects, context, linking and reuse are created in a runtime layer above the retrieval stores. Cognee combines graph, memory and semantic structure into a usable knowledge architecture.

Neo4j

Graph Database for relationships between entities, processes and knowledge objects.

Knowledge graph and linking

Mem0

Memory layer for persistent, user-specific context in assistance systems.

Context memory for assistants

LlamaIndex

Data connectors and index abstractions for repeatable retrieval workflows.

Index and Connector Runtime

Haystack

Pipeline Runtime for QA, RAG and documented processing steps.

Pipeline runtime for knowledge flow

Qdrant

Vector store for embeddings with filtering and production-oriented scaling.

Embedding store in the memory stack

LanceDB

Embedded vector database for local and scalable vector workloads.

Lightweight vector store

Knowledge objects, context, memory, retrieval and linking work together. More under What is Enterprise Memory and What is a Knowledge Layer.
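
To illustrate how knowledge objects and linking can work together, here is a small in-memory sketch. The `KnowledgeObject` fields, the IDs and the source names are assumptions made for this example. A graph database such as Neo4j would take the place of the dictionary in a real system.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class KnowledgeObject:
    object_id: str
    title: str
    source: str                               # system the object came from
    links: set = field(default_factory=set)   # IDs of related objects


def related_objects(store, start_id, max_hops=2):
    """Breadth-first walk over links; returns IDs reachable within max_hops."""
    seen = {start_id}
    queue = deque([(start_id, 0)])
    found = []
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in sorted(store[current].links):
            if neighbour in store and neighbour not in seen:
                seen.add(neighbour)
                found.append(neighbour)
                queue.append((neighbour, hops + 1))
    return found


# Hypothetical knowledge objects from different source systems.
store = {
    "contract-1": KnowledgeObject("contract-1", "Supplier contract", "DMS", {"supplier-a"}),
    "supplier-a": KnowledgeObject("supplier-a", "Supplier A", "ERP", {"process-3"}),
    "process-3": KnowledgeObject("process-3", "Procurement process", "Wiki", {"role-9"}),
    "role-9": KnowledgeObject("role-9", "Purchasing lead", "HR"),
}

print(related_objects(store, "contract-1"))
```

The hop limit keeps the context passed to an assistant bounded. Raising `max_hops` to 3 would also pull in `role-9`.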

Vector databases and retrieval stores

Vector stores make knowledge addressable. They are a building block, not the sole architecture. Grounding, permissions and source reference remain central.

Qdrant

Production-oriented vector database with filtering, sharding and REST API.

Primary vector index

LanceDB

Embedded Vector Store for Edge, On Premise and Integrated Workloads.

Local and embedded vector store

Weaviate

Hybrid search with vectors, keywords and modules.

Hybrid Retrieval Store

Milvus

Scalable open source vector database for large embedding quantities.

Scalable vector clusters

pgvector

PostgreSQL extension for vector search in relational workloads.

Vectors in existing SQL landscape

Making knowledge addressable means anchoring embeddings in the architecture, not just installing a vector store. More under What is retrieval.
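
The point that permissions and source reference stay central can be sketched in a few lines. This example is an illustration under assumptions: the vectors are tiny hand-written stand-ins for real embeddings, and the group names and file paths are hypothetical. A production vector store would apply the same idea through its own metadata filter.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str                # kept so every answer can cite its origin
    allowed_groups: frozenset  # groups that may read this chunk
    vector: tuple              # toy embedding, not from a real model


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(chunks, query_vector, user_groups, top_k=2):
    """Return (source, text) for the most similar chunks the user may read."""
    # Filter by permission first, so restricted content never enters the ranking.
    visible = [c for c in chunks if c.allowed_groups & user_groups]
    ranked = sorted(visible, key=lambda c: cosine(c.vector, query_vector), reverse=True)
    return [(c.source, c.text) for c in ranked[:top_k]]


chunks = [
    Chunk("Travel costs are approved by the team lead.", "hr/travel.pdf", frozenset({"staff"}), (0.9, 0.1, 0.0)),
    Chunk("Board salaries for the current year.", "board/pay.xlsx", frozenset({"board"}), (0.8, 0.2, 0.1)),
    Chunk("VPN setup for remote work.", "it/vpn.md", frozenset({"staff"}), (0.0, 0.2, 0.9)),
]

results = search(chunks, query_vector=(1.0, 0.0, 0.0), user_groups=frozenset({"staff"}))
for source, text in results:
    print(source, "->", text)
```

The board document is semantically close to the query but is never returned for a user in the `staff` group, and each hit carries its source.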

Data platform and storage

Enterprise knowledge systems need resilient storage for metadata, events, structured facts and knowledge objects. These platforms complement existing ERP, CRM and DMS landscapes.

PostgreSQL

Relational basis for metadata, configuration, pgvector and transactional workloads.

System of Record for metadata and vectors

ClickHouse

Column store for analytics, logs and high-volume queries.

Analytics and event storage

DuckDB

Embedded analytics engine for local evaluations and prototypes.

Local analysis and exploration

Apache Iceberg

Open Table Format for versioned, scalable data lakes.

Lakehouse and versioned datasets

MinIO

S3 compatible object storage for self-hosted and private cloud.

Object storage for raw data and artifacts

Redis

In memory store for cache, queues and session-related workloads.

Cache and fast buffers

MongoDB

Document store for flexible schemas and application data.

Document storage for app layers

Neo4j

Graph storage for relationships and enterprise knowledge graphs.

Graph storage for linking

Streaming and data flow

Move data instead of copying it. Event streaming connects source systems with index, memory and assistance without fragile batch chains.

Apache Kafka

De facto standard for event streaming and integration backbone.

Event backbone and integration

Redpanda

Kafka compatible streaming platform with easier operation.

Lightweight event streaming

Apache Flink

Stream processing for real-time transformations and stateful jobs.

Stream processing and aggregation

Debezium

Change data capture from relational sources into event streams.

CDC from ERP and specialist systems

Apache Pulsar

Multi-tenant messaging with streaming and queuing.

Messaging and event platform

Streaming is a capability for keeping knowledge landscapes up to date, not an end in itself. More under Enterprise Knowledge Systems Architecture.
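
How change events keep an index current can be shown with a small sketch. The event envelope (`op`, `before`, `after`) is modeled on the format Debezium emits. The records and the in-memory index are assumptions for illustration, and a real consumer would read these events from a Kafka-compatible topic instead of a list.

```python
def apply_change_event(index, event):
    """Apply one change event to an in-memory index keyed by record ID."""
    op = event["op"]
    if op in ("c", "u", "r"):      # create, update, snapshot read
        row = event["after"]
        index[row["id"]] = row
    elif op == "d":                # delete
        index.pop(event["before"]["id"], None)
    else:
        raise ValueError(f"unknown operation: {op!r}")
    return index


# Hypothetical events captured from a relational source table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "Pump A", "status": "active"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "Valve B", "status": "active"}},
    {"op": "u", "before": {"id": 1, "name": "Pump A", "status": "active"},
     "after": {"id": 1, "name": "Pump A", "status": "retired"}},
    {"op": "d", "before": {"id": 2, "name": "Valve B", "status": "active"}, "after": None},
]

index = {}
for event in events:
    apply_change_event(index, event)
print(index)
```

Because each event carries the full new state, the index follows the source system without a nightly full export.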

Transformation and orchestration

Pipelines prepare sources for index, memory and assistance. Orchestration makes dependencies, retries and data quality traceable.

Apache Airflow

Workflow orchestration for data-driven pipelines.

Batch orchestration

Dagster

Data Orchestrator with Asset Lineage and Developer Experience.

Asset based pipelines

dbt

SQL based transformations with tests and documentation.

Transformation layer in the warehouse

Apache Spark

Distributed processing for large amounts of data.

Scalable batch and stream jobs

Polars

Fast DataFrame Engine for local and medium-sized workloads.

Efficient local transformation

Prefect

Workflow engine with a focus on observability and dynamic flows.

Modern pipeline orchestration

Kestra

Event driven orchestration with declarative flows.

Declarative workflow platform
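
What an orchestrator contributes, namely dependency order and retries, can be sketched with the Python standard library. This is an illustration only, not the API of any orchestrator named above. The task names and the simulated failure are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies, max_attempts=3):
    """Run tasks in dependency order, retrying each failed task.

    tasks: name -> callable; dependencies: name -> set of upstream names.
    Returns a log of (task, attempt, outcome) tuples.
    """
    log = []
    for name in TopologicalSorter(dependencies).static_order():
        for attempt in range(1, max_attempts + 1):
            try:
                tasks[name]()
                log.append((name, attempt, "ok"))
                break
            except Exception as exc:
                log.append((name, attempt, f"failed: {exc}"))
        else:
            # The loop ended without a successful attempt.
            raise RuntimeError(f"{name} failed after {max_attempts} attempts")
    return log


calls = {"parse": 0}


def parse():
    # Simulated transient failure on the first attempt only.
    calls["parse"] += 1
    if calls["parse"] == 1:
        raise IOError("source temporarily unavailable")


tasks = {"extract": lambda: None, "parse": parse, "index": lambda: None}
dependencies = {"extract": set(), "parse": {"extract"}, "index": {"parse"}}

log = run_pipeline(tasks, dependencies)
for entry in log:
    print(entry)
```

The log records which step ran, in which order and on which attempt, which is the kind of traceability an orchestration layer is meant to provide.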

AI and inference

Inference layers make models operable. Open source and commercial models remain interchangeable as long as grounding, governance and source connection are correct.

vLLM

High throughput LLM serving for production-level inference.

LLM Serving and Batch Inference

Ollama

Local model runtime for development and self-hosted workloads.

Local inference and prototypes

LiteLLM

Proxy and routing for heterogeneous model providers via one interface.

Model routing and abstraction

Open WebUI

Web interface for local and remote model interaction.

UI for internal model usage

Haystack

RAG and QA pipelines with a clear pipeline structure.

RAG Runtime

LangGraph

Graph based agents and workflows with state management.

Agents and Workflow Runtime

OpenRouter

Gateway for heterogeneous commercial and open models.

Model gateway and routing
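
The idea that models stay interchangeable behind a routing layer can be sketched as an ordered fallback. The two provider functions below are stubs invented for this example and do not call any real service. Routing products such as LiteLLM or OpenRouter expose their own interfaces, which this sketch does not reproduce.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when no configured provider returned an answer."""


def route(prompt: str, providers: Sequence[tuple[str, Callable[[str], str]]]) -> tuple[str, str]:
    """Try providers in order and return (provider_name, answer)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def self_hosted(prompt: str) -> str:
    # Stub: simulates an unreachable on-premise inference server.
    raise TimeoutError("inference server not reachable")


def hosted_api(prompt: str) -> str:
    # Stub: stands in for a commercial API model.
    return f"[stub answer to: {prompt}]"


used, answer = route(
    "Summarize the travel policy.",
    [("self-hosted", self_hosted), ("hosted-api", hosted_api)],
)
print(used, answer)
```

The calling code depends only on the `route` function, so swapping or reordering providers is a configuration change rather than an application change.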

Agents and automation

Agents supplement assistance where recurring steps are clearly defined. Governance and human in the loop remain central.

n8n

Workflow automation with visual orchestration and self-hosting.

Integration and easy automation

Temporal

Durable execution for reliable, long-running workflows.

Robust process execution

Trigger.dev

Background jobs and agent workflows for development teams.

Event driven background jobs

Flowise

Visual builder for LLM flows and prototypes.

Prototyping LLM Flows

LangGraph

Stateful agent graphs for controlled multi-step processes.

Agent Runtime with State

CrewAI

Multi agent orchestration for specialized roles.

Multi-agent coordination

Backend and APIs

Services and APIs connect knowledge layers, intelligence layers and assistance. API First architecture keeps layers decoupled.

FastAPI

Modern Python API layer with typing and OpenAPI.

REST Services for Knowledge APIs

Node.js

Event driven runtime for I/O intensive services and BFF layer.

Service runtime and integration

NestJS

Structured Node Framework for Enterprise APIs.

Modular API Services

Python

Dominant language for AI, data and integration logic.

AI and data services

TypeScript

Type-safe development for APIs and frontend-related services.

Shared types and services

GraphQL

Flexible query layer for aggregated application data.

Flexible client APIs

tRPC

End to End type-safe APIs for TypeScript stacks.

Type-safe internal APIs

gRPC

Efficient RPC for internal service communication.

Internal service communication

Frontend and Experience

Experience Layer makes knowledge and assistance usable. Focus on clarity, permissions and source reference, not feature overload.

Next.js

React Framework for high-performance web apps and SSR.

Web app framework

React

Component based UI library for assistance interfaces.

UI components

Tailwind CSS

Utility CSS for consistent, maintainable interfaces.

Design system basis

Shadcn UI

Accessible components based on Radix and Tailwind.

UI component library

Vercel

Deployment and edge hosting for frontend workloads.

Hosting and preview deployments

Deployment and platform

Enterprise infrastructure must be deployable in a reproducible manner. Containers, Kubernetes and Infrastructure as Code are standard tools, not hype.

Docker

Container packaging for reproducible services.

Container Images

Kubernetes

Orchestration for scalable, resilient workloads.

Container platform

OpenTofu

Open source Infrastructure as Code fork of Terraform.

IaC with open governance model

Terraform

Widely used IaC for cloud and hybrid landscapes.

Infrastructure as Code

Helm

Package manager for Kubernetes deployments.

Kubernetes release management

GitHub Actions

CI/CD for repositories and automation.

Continuous integration

GitLab CI

Integrated CI/CD in GitLab environments.

Pipeline automation

Caddy

Automatic TLS and reverse proxy for edge services.

Edge Proxy and TLS

Observability and quality

Enterprise AI needs measurability. Latency, errors, costs and quality must be visible without tool sprawl.

Prometheus

Metrics collection and alerting for services.

Metrics and alerts

Grafana

Dashboards for metrics, logs and traces.

Visualization and dashboards

OpenTelemetry

Open standard for traces, metrics and logs.

Unified Telemetry

Sentry

Error tracking for applications and services.

Error analysis

Loki

Log aggregation in the Grafana ecosystem.

Log storage and query

Jaeger

Distributed tracing for microservices and pipelines.

Trace analysis
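
Making latency and errors visible starts with measuring every call. The sketch below records both in a plain dictionary. It is an assumption-laden illustration: the `observed` decorator and the `retrieve` function are hypothetical, and in operation the values would be exported to a metrics or tracing backend instead of being held in memory.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory stand-in for a metrics backend.
metrics = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def observed(name):
    """Decorator that records call count, error count and latency per operation."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                metrics[name]["errors"] += 1
                raise  # record the error, but never swallow it
            finally:
                metrics[name]["calls"] += 1
                metrics[name]["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@observed("retrieval")
def retrieve(query):
    if not query:
        raise ValueError("empty query")
    return [f"result for {query}"]


retrieve("vacation policy")
try:
    retrieve("")
except ValueError:
    pass

print(dict(metrics["retrieval"]))
```

Error rate and average latency follow directly from the three counters, and the same wrapper can be applied to retrieval, inference and parsing steps alike.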

Security and Identity

Security is an architectural requirement. Identity, secrets and edge protection are part of enterprise operations, not a separate late-stage project.

Keycloak

Open source identity and access management.

SSO and OIDC providers

Authentik

Flexible identity platform for self-hosted scenarios.

Identity Provider

Vault

Secrets management and dynamic credentials.

Secrets and key management

Traefik

Cloud native reverse proxy and ingress.

Ingress and routing

Caddy

Simple TLS Terminator and Reverse Proxy.

Edge TLS and Proxy

Documents and content

Unstructured content is often the largest pool of knowledge. Parsing, OCR and normalization are prerequisites for reliable retrieval.

Apache Tika

Extract text and metadata from Office, PDF and binary formats.

Universal document parsing

OCRmyPDF

OCR pipeline for searchable PDFs.

PDF OCR and preparation

Tesseract

Open source OCR engine for scanned documents.

Text recognition in scans

Pandoc

Conversion between document formats.

Format conversion

LibreOffice

Headless conversion and rendering of Office documents.

Office rendering and export
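
Normalization after parsing or OCR can be sketched with the standard library. The rules and the chunk sizes below are assumptions chosen for illustration. Real pipelines usually tune them per document type, and the hyphenation rule also joins genuine compounds that happen to break at a line end.

```python
import re
import unicodedata


def normalize(text):
    """Normalize extracted text before indexing."""
    text = unicodedata.normalize("NFKC", text)    # unify ligatures and width variants
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # rejoin words hyphenated at line breaks
    text = re.sub(r"\s+", " ", text)              # collapse whitespace and newlines
    return text.strip()


def chunk(text, size=40, overlap=10):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


# Typical extraction damage: a ligature, a hyphenated line break, stray spaces.
raw = "The \ufb01nal inspec-\ntion   report\nis stored in the DMS."
clean = normalize(raw)
chunks = chunk(clean)
print(clean)
print(chunks)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk, which tends to help retrieval quality.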

How we choose technologies

We deliberately do not promise any tool preference. Selection follows requirements, operations, governance, costs, scaling, sovereignty and integration into existing landscapes. Open Source Enterprise AI makes sense when operation, security and maintainability are right in the enterprise context.

  • Derive requirements from use cases, architecture and compliance, not from tool hype
  • Check operational capability, including monitoring, backup, upgrades and support model
  • Plan governance for data, models, permissions and audit early
  • Consider costs across the entire operation, not just the license or API price
  • Evaluate scaling based on data volume, number of users and latency requirements
  • Keep sovereignty and self-hosting as an option if data and models should remain in-house
  • Ensure integration into ERP, CRM, DMS and Identity without rip and replace

Enterprise technology stack decisions are architecture decisions. More under Build vs Buy AI and AI Governance for medium-sized businesses.

What we consciously avoid

Capability architecture often fails due to too many parallel initiatives without a common model. We avoid patterns that make an impression in the short term but make maintenance and transparency difficult in the long term.

  • Tool sprawl with parallel platforms without clear layers and responsibilities
  • Vendor lock-in to infrastructure, models or proprietary data formats without an exit strategy
  • Unnecessary platforms that duplicate existing systems instead of connecting knowledge
  • Early commitment to a model before architecture, sources and governance are clarified
  • Excessive agent architectures without clear boundaries, approvals and human in the loop

Not more tools. More usable systems. More under What is Agentic AI and Why AI fails without company knowledge.

Why Kaufman AIS

Kaufman AIS develops enterprise knowledge systems and applied intelligence systems for medium-sized and larger companies in Europe. For us, technology is a means for usable knowledge architecture, digital assistance and confident operation.

  • Enterprise knowledge systems as an integrated architecture, not as a collection of tools
  • Open infrastructure with self-hosting, cloud optional and European data centers
  • Sovereign AI with control over sources, models and operations
  • Knowledge layer with retrieval, permissions and source connection
  • Digital assistance with grounding and governance for departments
  • Enterprise memory for reuse of decision and process knowledge

Not more tools. More usable systems. More under Enterprise Knowledge Systems, Sovereign AI and Internal ChatGPT for Companies.

Frequently asked questions

What technologies does Kaufman AIS use?

We work in a technology-neutral manner with open source and open platforms for knowledge systems, retrieval, inference, assistance and operation. The specific selection follows architecture and requirements per project. This page shows typical building blocks, not a fixed mandatory list.

Do you only work with open source?

No. Open source is often the basis for knowledge layers, infrastructure and operations. Commercial models, cloud services and existing enterprise systems can be useful complements if integration and governance are right.

Is self-hosting possible?

Yes. Many architectures can be operated on premise or in a private cloud. Self-hosted AI is part of sovereign setups if data, models and access are to remain within the company. More at Sovereign AI.

Can existing systems remain?

Yes. Enterprise knowledge systems typically build on top of ERP, CRM, DMS and identity rather than replacing everything. More under data silos without system migration.

Which models do you support?

Models are interchangeable. We support open weight and API models via routing and inference layers. Grounding, source reference and governance are crucial, not a single provider.

How do you start?

Often with a single use case that has clear sources, measurable usage and a defined operating model. In parallel, we outline the architecture, knowledge layer and governance. More under AI Transformation and What is AI Readiness.

Assess AI opportunity in 3 minutes

A short check of systems, friction points, and goals shows where enterprise AI can create measurable impact first.

Tools are interchangeable. Architecture and operational capability determine sustainable Enterprise AI.

We will help you determine which technologies and platforms make sense for your enterprise knowledge system architecture. The conversation is aimed at management, IT management, enterprise architecture, engineering and digitalization.

Arrange a conversation

Support

Talk to us about your data architecture, your monitoring setup and the components that can run in your organization.

Rodrique Dallh
Your contact person at Kaufman AIS