OpenSearch
Open source search and analysis platform for full text, aggregations and scalable queries.
Full text index and enterprise search
technology
Tools are interchangeable. Architecture principles remain. This overview shows the open source and open technologies that Kaufman AIS works with or typically uses in enterprise knowledge systems. It is not tech stack marketing but a transparent account of capabilities, operations and integration options, written for management, IT management, enterprise architecture and engineering in medium-sized and enterprise companies.
Open source and open platforms for enterprise AI are a means, not an end in themselves. We choose building blocks based on architectural principles, not tool religion. The following principles guide enterprise AI architecture and operations in our projects.
Knowledge systems combine sources, index, parsing and retrieval. These building blocks make corporate knowledge discoverable and addressable. They are part of the knowledge layer, not a replacement for governance and permissions.
Open source search and analysis platform for full text, aggregations and scalable queries.
Full text index and enterprise search
Widely used search and analytics engine with rich indexing and observability ecosystem.
Search and analytics layers
Vector database with a focus on performance, filtering and production-oriented operations.
Vector index for semantic retrieval
Vector database with hybrid search, modules and graph functions.
Hybrid retrieval and knowledge objects
Fast, developer-friendly search engine for low-latency applications.
Application-oriented search and typeahead
Framework for data connectors, indexing and RAG pipelines over heterogeneous sources.
Orchestration of retrieval pipelines
Open source framework for NLP pipelines, QA and production-oriented RAG systems.
Pipeline framework for retrieval and QA
Enterprise Search Server based on Lucene with faceted search and scaling.
Classic enterprise full-text index
Content detection and text extraction from documents and binary formats.
Parsing and metadata extraction
Pipeline for document parsing, chunking and unstructured content preparation.
ETL for unstructured knowledge sources
Knowledge engine for knowledge objects, graph and memory with a focus on enterprise context.
Knowledge runtime and semantic linking
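Several of the building blocks above combine full-text and vector retrieval ("hybrid search"). One common way to merge two ranked result lists is reciprocal rank fusion. The following is a minimal sketch under stated assumptions, not the method of any specific product listed here: the function name, the document IDs and the constant k=60 are illustrative choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) per list it appears in;
    the contributions are summed across lists.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword index and a vector index.
keyword_hits = ["doc-a", "doc-b", "doc-c"]
vector_hits = ["doc-c", "doc-a", "doc-d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists rise to the top, without the two engines needing comparable score scales.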
Enterprise memory needs more than vectors. Knowledge objects, context, linking and reuse are created in a runtime layer above the retrieval stores. Cognee combines graph, memory and semantic structure into a usable knowledge architecture.
Knowledge engine for structured knowledge objects, graph and memory in enterprise projects.
Central knowledge runtime layer
Graph Database for relationships between entities, processes and knowledge objects.
Knowledge graph and linking
Memory layer for persistent context and user reference in assistance systems.
Context memory for assistants
Data connectors and index abstractions for repeatable retrieval workflows.
Index and connector runtime
Pipeline runtime for QA, RAG and documented processing steps.
Pipeline runtime for knowledge flow
Vector store for embeddings with filtering and production-oriented scaling.
Embedding store in the memory stack
Embedded vector database for local and scalable vector workloads.
Lightweight vector store
Knowledge objects, context, memory, retrieval and linking work together. More under What is Enterprise Memory and What is a Knowledge Layer.
Vector stores make knowledge addressable. They are a building block, not the sole architecture. Grounding, permissions and source reference remain central.
Production-oriented vector database with filtering, sharding and REST API.
Primary vector index
Embedded vector store for edge, on-premise and integrated workloads.
Local and embedded vector store
Hybrid search with vectors, keywords and modules.
Hybrid retrieval store
Scalable open source vector database for large embedding quantities.
Scalable vector clusters
PostgreSQL extension for vector search in relational workloads.
Vectors in existing SQL landscape
Making knowledge addressable means integrating embeddings into the architecture, not just installing a vector store. More under What is retrieval.
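To illustrate why permissions and source reference belong next to the vector index, here is a minimal in-memory sketch, assuming tiny two-dimensional embeddings and group-based access. The `Chunk` fields, the source URIs and the group names are hypothetical; a production vector store would apply the same filter inside its own query.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str                # kept so every answer can cite its origin
    allowed_groups: frozenset  # groups permitted to see this chunk
    embedding: tuple


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(chunks, query_embedding, user_groups, top_k=3):
    """Filter by permission first, then rank by similarity."""
    visible = [c for c in chunks if c.allowed_groups & user_groups]
    ranked = sorted(
        visible,
        key=lambda c: cosine(c.embedding, query_embedding),
        reverse=True,
    )
    return [(c.source, c.text) for c in ranked[:top_k]]


# Hypothetical corpus with made-up embeddings.
chunks = [
    Chunk("Travel policy: book via the portal.", "dms://policies/travel.pdf",
          frozenset({"all-staff"}), (1.0, 0.0)),
    Chunk("Salary bands (confidential).", "dms://hr/salary.xlsx",
          frozenset({"hr"}), (0.9, 0.1)),
    Chunk("VPN setup guide.", "wiki://it/vpn",
          frozenset({"all-staff"}), (0.0, 1.0)),
]

results = search(chunks, (1.0, 0.1), frozenset({"all-staff"}), top_k=2)
for source, text in results:
    print(source, "->", text)
```

The confidential chunk is semantically close to the query but never reaches the ranking step, because the permission filter runs before similarity is computed.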
Enterprise knowledge systems need resilient storage for metadata, events, structured facts and knowledge objects. These platforms complement existing ERP, CRM and DMS landscapes.
Relational basis for metadata, configuration, pgvector and transactional workloads.
System of Record for metadata and vectors
Column store for analytics, logs and high-volume queries.
Analytics and event storage
Embedded analytics engine for local evaluations and prototypes.
Local analysis and exploration
Open table format for versioned, scalable data lakes.
Lakehouse and versioned datasets
S3-compatible object storage for self-hosted and private cloud environments.
Object storage for raw data and artifacts
In-memory store for cache, queues and session-related workloads.
Cache and fast buffers
Document store for flexible schemas and application data.
Document storage for app layers
Graph storage for relationships and enterprise knowledge graphs.
Graph storage for links
Move data instead of copying it. Event streaming connects source systems with index, memory and assistance without fragile batch chains.
De facto standard for event streaming and integration backbone.
Event backbone and integration
Kafka-compatible streaming platform with simpler operation.
Lightweight event streaming
Stream processing for real-time transformations and stateful jobs.
Stream processing and aggregation
Change data capture from relational sources into event streams.
CDC from ERP and specialist systems
Multi-tenant messaging with streaming and queuing.
Messaging and event platform
Streaming is a capability for current knowledge landscapes, not an end in itself. More under Enterprise Knowledge Systems Architecture.
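As a rough illustration of how change events keep an index current without batch copies, the sketch below applies a stream of change records to an in-memory index. The event shape (`op`, `key`, `after`) is an assumption made for this example and is not the wire format of any particular CDC tool.

```python
def apply_change(index, event):
    """Apply one change event to the index and return the index.

    Creates and updates are upserts and deletes tolerate missing keys,
    so replaying the same event stream leads to the same final state.
    """
    op = event["op"]
    key = event["key"]
    if op in ("create", "update"):
        index[key] = event["after"]
    elif op == "delete":
        index.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")
    return index


# Hypothetical change events from a source system.
events = [
    {"op": "create", "key": "customer:1", "after": {"name": "Example GmbH"}},
    {"op": "update", "key": "customer:1", "after": {"name": "Example AG"}},
    {"op": "create", "key": "customer:2", "after": {"name": "Sample Ltd"}},
    {"op": "delete", "key": "customer:2", "after": None},
]

index = {}
for event in events:
    apply_change(index, event)

print(index)
```

Because the full stream can be replayed safely, a consumer that restarts after a failure can reprocess events without corrupting the index.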
Pipelines prepare sources for index, memory and assistance. Orchestration makes dependencies, retries and data quality transparent.
Workflow orchestration for data-driven pipelines.
Batch orchestration
Data orchestrator with asset lineage and a focus on developer experience.
Asset-based pipelines
SQL-based transformations with tests and documentation.
Transformation layer in the warehouse
Distributed processing for large amounts of data.
Scalable batch and stream jobs
Fast DataFrame Engine for local and medium-sized workloads.
Efficient local transformation
Workflow engine with a focus on observability and dynamic flows.
Modern pipeline orchestration
Event-driven orchestration with declarative flows.
Declarative workflow platform
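What orchestrators do with dependencies and retries can be sketched in a few lines using only the Python standard library. This is a simplified illustration, not how any of the listed platforms is implemented; the task names, the `run_pipeline` helper and the retry limit are assumptions.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies, max_retries=2):
    """Run tasks in dependency order, retrying each failed task.

    tasks:        mapping of task name -> zero-argument callable
    dependencies: mapping of task name -> set of tasks it depends on
    Returns a log of (task name, attempt on which it succeeded).
    """
    log = []
    for name in TopologicalSorter(dependencies).static_order():
        for attempt in range(1, max_retries + 2):
            try:
                tasks[name]()
                log.append((name, attempt))
                break
            except Exception:
                if attempt == max_retries + 1:
                    raise  # retries exhausted: fail the whole run


    return log


calls = {"chunk": 0}


def flaky_chunk():
    """Fails once to simulate a transient error, then succeeds."""
    calls["chunk"] += 1
    if calls["chunk"] == 1:
        raise RuntimeError("transient failure")


tasks = {"parse": lambda: None, "chunk": flaky_chunk, "index": lambda: None}
dependencies = {"parse": set(), "chunk": {"parse"}, "index": {"chunk"}}

log = run_pipeline(tasks, dependencies)
print(log)
```

The returned log is what makes a run inspectable afterwards: it records both the execution order and how many attempts each step needed.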
Inference layers make models operable. Open source and commercial models remain interchangeable as long as grounding, governance and source connection are correct.
High throughput LLM serving for production-level inference.
LLM Serving and Batch Inference
Local model runtime for development and self-hosted workloads.
Local inference and prototypes
Proxy and routing for heterogeneous model providers via one interface.
Model routing and abstraction
Web interface for local and remote model interaction.
UI for internal model usage
RAG and QA pipelines with a clear pipeline structure.
RAG Runtime
Graph-based agents and workflows with state management.
Agent and workflow runtime
Gateway for heterogeneous commercial and open models.
Model gateway and routing
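The idea of keeping models interchangeable behind one interface can be sketched as follows. This is a minimal illustration, assuming each backend is simply a function from prompt to text; the `ModelRouter` class, the backend names and the stub models are hypothetical and do not reflect the API of any gateway product.

```python
from typing import Callable, Dict, Optional

# Assumption: every model backend is exposed as prompt -> completion text.
ModelFn = Callable[[str], str]


class ModelRouter:
    """Routes requests to named backends through one interface."""

    def __init__(self) -> None:
        self._backends: Dict[str, ModelFn] = {}

    def register(self, name: str, backend: ModelFn) -> None:
        self._backends[name] = backend

    def complete(self, model: str, prompt: str,
                 fallback: Optional[str] = None) -> str:
        try:
            return self._backends[model](prompt)
        except Exception:
            # Covers provider errors as well as unknown model names.
            if fallback is None:
                raise
            return self._backends[fallback](prompt)


def local_model(prompt: str) -> str:
    """Stub standing in for a self-hosted open-weight model."""
    return f"[local] {prompt}"


def hosted_model(prompt: str) -> str:
    """Stub standing in for a commercial API that is currently down."""
    raise TimeoutError("provider unavailable")


router = ModelRouter()
router.register("local-open-weight", local_model)
router.register("hosted-api", hosted_model)

answer = router.complete("hosted-api", "Summarize the travel policy",
                         fallback="local-open-weight")
print(answer)
```

Callers only know model names, so a provider can be swapped or taken offline without changes to the services that consume the router.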
Agents supplement assistance where recurring steps are clearly defined. Governance and human-in-the-loop review remain central.
Workflow automation with visual orchestration and self-hosting.
Integration and easy automation
Durable execution for reliable, long-running workflows.
Robust process execution
Background jobs and agent workflows for development teams.
Event-driven background jobs
Visual builder for LLM flows and prototypes.
Prototyping LLM Flows
Stateful agent graphs for controlled multi-step processes.
Agent runtime with state
Multi-agent orchestration for specialized roles.
Multi-agent coordination
Services and APIs connect knowledge layers, intelligence layers and assistance. An API-first architecture keeps the layers decoupled.
Modern Python API layer with typing and OpenAPI.
REST Services for Knowledge APIs
Event-driven runtime for I/O-intensive services and backend-for-frontend (BFF) layers.
Service runtime and integration
Structured Node.js framework for enterprise APIs.
Modular API Services
Dominant language for AI, data and integration logic.
AI and data services
Type-safe development for APIs and frontend-related services.
Shared types and services
Flexible query layer for aggregated application data.
Flexible client APIs
End-to-end type-safe APIs for TypeScript stacks.
Type-safe internal APIs
Efficient RPC for internal service communication.
Internal service communication
The experience layer makes knowledge and assistance usable. The focus is on clarity, permissions and source reference, not feature overload.
React framework for high-performance web apps and SSR.
Web app framework
Component-based UI library for assistance interfaces.
UI components
Utility CSS for consistent, maintainable interfaces.
Design system basis
Accessible components based on Radix and Tailwind.
UI components library
Deployment and edge hosting for frontend workloads.
Hosting and preview deployments
Enterprise infrastructure must be deployable in a reproducible manner. Containers, Kubernetes and Infrastructure as Code are standard tools, not hype.
Container packaging for reproducible services.
Container Images
Orchestration for scalable, resilient workloads.
Container platform
Open source Infrastructure as Code fork of Terraform.
IaC with open governance model
Widely used IaC for cloud and hybrid landscapes.
Infrastructure as Code
Package manager for Kubernetes deployments.
Kubernetes release management
CI/CD for repositories and automation.
Continuous integration
Integrated CI/CD in GitLab environments.
Pipeline automation
Automatic TLS and reverse proxy for edge services.
Edge Proxy and TLS
Enterprise AI needs measurability. Latency, errors, costs and quality must be visible without tool sprawl.
Metrics collection and alerting for services.
Metrics and alerts
Dashboards for metrics, logs and traces.
Visualization and dashboards
Open standard for traces, metrics and logs.
Unified Telemetry
Error tracking for applications and services.
Error analysis
Log aggregation in the Grafana ecosystem.
Log storage and query
Distributed tracing for microservices and pipelines.
Trace analysis
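Making latency and errors visible can start very small. The sketch below wraps a function with a decorator that counts calls, counts errors and accumulates elapsed time. It uses only the standard library; the `observed` decorator, the `metrics` structure and the `retrieve` stub are assumptions for illustration, and cost or quality metrics would need additional signals that are not shown here.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory metrics; a real setup would export these to a metrics backend.
metrics = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def observed(name):
    """Record call count, error count and total latency under `name`."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            except Exception:
                metrics[name]["errors"] += 1
                raise
            finally:
                # Runs for successful and failed calls alike.
                metrics[name]["calls"] += 1
                metrics[name]["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@observed("retrieve")
def retrieve(query):
    """Stub retrieval step that rejects empty queries."""
    if not query:
        raise ValueError("empty query")
    return [f"result for {query}"]


retrieve("travel policy")
try:
    retrieve("")
except ValueError:
    pass

print(dict(metrics["retrieve"]))
```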
Security is an architectural requirement. Identity, secrets and edge protection are part of enterprise operations, not a separate late-stage project.
Open source identity and access management.
SSO and OIDC providers
Flexible identity platform for self-hosted scenarios.
Identity Provider
Secrets management and dynamic credentials.
Secrets and key management
Cloud-native reverse proxy and ingress.
Ingress and routing
Simple TLS terminator and reverse proxy.
Edge TLS and proxy
Unstructured content is often the largest pool of knowledge. Parsing, OCR and normalization are prerequisites for reliable retrieval.
Extract text and metadata from Office, PDF and binary formats.
Universal document parsing
OCR pipeline for searchable PDFs.
PDF OCR and preparation
Open source OCR engine for scanned documents.
Text recognition in scans
Conversion between document formats.
Format conversion
Headless conversion and rendering of Office documents.
Office rendering and export
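Once text has been extracted, it is usually normalized and split into overlapping chunks before indexing. The following is a minimal word-based sketch, assuming whitespace normalization is sufficient; the function name and the default sizes are illustrative, and real pipelines often split on sentences, headings or tokens instead.

```python
def chunk_text(text, size=200, overlap=50):
    """Split text into chunks of `size` words sharing `overlap` words.

    Splitting on whitespace also collapses repeated spaces and line
    breaks left behind by parsing or OCR.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Hypothetical extracted text with irregular whitespace.
extracted = "w0 w1  w2\nw3 w4 w5   w6 w7\tw8 w9"
pieces = chunk_text(extracted, size=4, overlap=1)
for piece in pieces:
    print(piece)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of some duplicated text in the index.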
We deliberately do not promise any tool preference. Selection follows requirements, operations, governance, costs, scaling, sovereignty and integration into existing landscapes. Open Source Enterprise AI makes sense when operation, security and maintainability are right in the enterprise context.
Enterprise technology stack decisions are architecture decisions. More under Build vs Buy AI and AI Governance for medium-sized businesses.
Capability architecture often fails due to too many parallel initiatives without a common model. We avoid patterns that make an impression in the short term but make maintenance and transparency difficult in the long term.
Not more tools. More usable systems. More under What is Agentic AI and Why AI fails without company knowledge.
Kaufman AIS develops enterprise knowledge systems and applied intelligence systems for medium-sized and larger companies in Europe. For us, technology is a means for usable knowledge architecture, digital assistance and confident operation.
Not more tools. More usable systems. More under Enterprise Knowledge Systems, Sovereign AI and Internal ChatGPT for Companies.
We work in a technology-neutral manner with open source and open platforms for knowledge systems, retrieval, inference, assistance and operation. The specific selection follows architecture and requirements per project. This page shows typical building blocks, not a fixed mandatory list.
No. Open source is often the basis for knowledge layers, infrastructure and operations. Commercial models, cloud services and existing enterprise systems can be useful complements if integration and governance are right.
Yes. Many architectures can be operated on premise or in a private cloud. Self-hosted AI is part of sovereign setups if data, models and access are to remain within the company. More under Sovereign AI.
Yes. Enterprise knowledge systems typically build on top of ERP, CRM, DMS and identity rather than replacing everything. More under data silos without system migration.
Models are interchangeable. We support open weight and API models via routing and inference layers. Grounding, source reference and governance are crucial, not a single provider.
Often with a single use case that has clear sources, measurable usage and a defined operating model. In parallel, outline the architecture, knowledge layer and governance. More under AI Transformation and What is AI Readiness.
A short check of systems, friction points, and goals shows where enterprise AI can create measurable impact first.
We will help you determine which technologies and platforms make sense for your enterprise knowledge system architecture. The conversation is aimed at management, IT management, enterprise architecture, engineering and digitalization.
Talk to us about your data architecture, your governance setup and the assistants that could work in your company.