
ADR-001: Technology Stack for CostEngine (MfgIQ)

Status: Proposed
Date: 2026-01-24
Decision Makers: Engineering Team
Stakeholders: Product, Engineering


Context

CostEngine (MfgIQ) is an intelligent manufacturing costing and quotation system targeting MSME/SME manufacturers. The system needs to:

  1. Import Excel costing templates with complex formulas
  2. Maintain a semantic layer for 50+ cost metrics
  3. Support quote versioning and comparison
  4. Integrate with ERP systems (Tally) asynchronously
  5. Generate professional quotes in multiple formats
  6. Eventually scale to hundreds of concurrent users

Key Constraints:

  - Must be fully open source to avoid vendor lock-in
  - Team comfortable with Python and TypeScript
  - Need rapid MVP delivery (6 months)
  - Future AI/copilot features planned


Decision

Core Application Stack

| Component | Choice | License | Rationale |
| --- | --- | --- | --- |
| Frontend & API | Next.js | MIT | Modern React framework, excellent DX, easy deployment |
| Primary Database | PostgreSQL 15+ | PostgreSQL | Battle-tested, JSONB support for flexible schemas |
| Cache & Queue | Redis + Celery | BSD + BSD | Proven stack for background jobs (Excel parsing, PDF gen) |
| Object Storage | MinIO | AGPL v3 | Self-hosted S3-compatible, no cloud costs |
| Authentication | Keycloak | Apache 2.0 | Open source, supports SAML/OAuth for enterprise |

Semantic Layer (Phase 1: Custom YAML)

Decision: Start with custom YAML metric definitions instead of Cube.dev or dbt.

Rationale:

  - ✅ Metrics already documented in semantic-layer.md
  - ✅ Domain well-understood (deterministic formulas)
  - ✅ Zero dependency on external platforms
  - ✅ Full control over calculation logic
  - ✅ Easy migration to dbt when complexity increases

Example Metric Definition:

```yaml
# config/metrics/material_cost.yml
metric_id: raw_material_cost
display_name: "Raw Material Cost per Piece"
description: "Total raw material cost including yield loss"
category: material_costs
data_type: decimal
unit: INR
formula: "(gross_weight * rm_rate) / yield_percent"
sql_template: |
  SELECT 
    (f.gross_weight * m.rm_rate) / NULLIF(f.yield_percent, 0) AS value
  FROM fact_cost f
  JOIN dim_material m ON f.material_id = m.material_id
  WHERE {{filters}}
dependencies:
  - gross_weight
  - rm_rate
  - yield_percent
aggregation: SUM
dimensions:
  - part_number
  - material_grade
  - time_period
```

Migration Path:

  - Phase 1 (MVP): Custom YAML → Python calculation engine (a minimal sketch follows)
  - Phase 2 (Scale): Migrate to dbt metrics (similar YAML structure)
  - Phase 3 (Enterprise): Consider Cube.dev Cloud if needed
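
To ground Phase 1, here is a minimal sketch of a YAML-driven calculation engine over the metric definition above. The load_metric and evaluate helpers are hypothetical, and a production engine would swap the restricted eval() for a real expression parser; this only illustrates the shape of the approach.

```python
import yaml

def load_metric(path: str) -> dict:
    """Load one metric definition from its YAML file."""
    with open(path) as f:
        return yaml.safe_load(f)

def evaluate(metric: dict, inputs: dict[str, float]) -> float:
    """Evaluate a metric's formula against supplied input values."""
    missing = [dep for dep in metric["dependencies"] if dep not in inputs]
    if missing:
        raise ValueError(f"missing inputs for {metric['metric_id']}: {missing}")
    # Formulas are trusted, version-controlled config, so a restricted
    # eval() is tolerable for a sketch; a production engine should parse
    # the expression properly instead.
    return eval(metric["formula"], {"__builtins__": {}}, dict(inputs))

metric = load_metric("config/metrics/material_cost.yml")
print(evaluate(metric, {"gross_weight": 2.4,    # kg per piece
                        "rm_rate": 185.0,       # INR per kg
                        "yield_percent": 0.85}))
```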


Workflow Orchestration

Decision: Deferred to Phase 2. Start with Celery; add Temporal when workflow durability becomes critical.

Phase 1 (Celery):

  - Simple background jobs: Excel parsing, PDF generation, emails (see the task sketch below)
  - Good enough for < 1000 quotes/day
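
For illustration, a minimal sketch of the Excel-parsing job on this stack, wiring Celery (Redis broker), MinIO, and openpyxl together. The bucket name, credentials, and single-header-row sheet layout are assumptions, not the project's actual configuration.

```python
import io

from celery import Celery
from minio import Minio
from openpyxl import load_workbook

app = Celery("costengine",
             broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/1")

@app.task(bind=True, max_retries=3, default_retry_delay=30)
def parse_costing_template(self, object_key: str) -> list[dict]:
    """Fetch an uploaded workbook from MinIO and extract its cost rows."""
    try:
        client = Minio("localhost:9000", access_key="minio",
                       secret_key="minio123", secure=False)
        data = client.get_object("uploads", object_key).read()
        sheet = load_workbook(io.BytesIO(data), data_only=True).active
        header = [cell.value for cell in sheet[1]]  # assumed header row
        return [dict(zip(header, (cell.value for cell in row)))
                for row in sheet.iter_rows(min_row=2)]
    except Exception as exc:
        # Simple Celery-level retry; anything needing durable multi-step
        # retry logic is exactly the Phase 2 Temporal trigger.
        raise self.retry(exc=exc)
```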

Phase 2 (Temporal): add when any of the following holds:

  - ERP integration requires retry logic (UC-701: Tally sync)
  - Bulk quote updates need reliability (UC-402: Material price changes)
  - Long-running workflows exceed 30 seconds

Choice: Temporal (MIT) over Prefect (see the workflow sketch below):

  - Better durability guarantees
  - Stronger consistency for versioning
  - Python + TypeScript SDKs
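
A minimal sketch of what the Tally sync (UC-701) might look like on the temporalio Python SDK once Phase 2 arrives; the workflow and activity names are hypothetical, and the activity body is left unimplemented.

```python
from datetime import timedelta

from temporalio import activity, workflow
from temporalio.common import RetryPolicy

@activity.defn
async def push_quote_to_tally(quote_id: str) -> str:
    # Hypothetical activity: call the Tally gateway and return a sync
    # reference; any exception here triggers Temporal's retry policy.
    raise NotImplementedError

@workflow.defn
class TallySyncWorkflow:
    @workflow.run
    async def run(self, quote_id: str) -> str:
        # Each step is journaled, so a worker crash mid-sync resumes
        # here instead of silently dropping the quote (unlike Celery).
        return await workflow.execute_activity(
            push_quote_to_tally,
            quote_id,
            start_to_close_timeout=timedelta(seconds=30),
            retry_policy=RetryPolicy(
                initial_interval=timedelta(seconds=2),
                backoff_coefficient=2.0,
                maximum_attempts=5,
            ),
        )
```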


Analytics & BI

Decision: Apache Superset (Apache 2.0) for internal dashboards

Rationale:

  - ✅ Fully open source (no Tableau/Looker costs)
  - ✅ Direct SQL access to PostgreSQL
  - ✅ Custom visualizations for cost breakdowns
  - ✅ Can embed in Next.js app

Alternative considered: Metabase (AGPL), which is simpler but less powerful.


AI/RAG Layer (Phase 3)

Decision: LlamaIndex (MIT) + LangGraph (MIT) + Qdrant (Apache 2.0)

Why deferred to Phase 3:

  - MVP costing is deterministic (formulas, no ML needed)
  - AI useful for: quote similarity, cost prediction, chatbot

Stack when needed (a similarity-search sketch follows):

  - Vector DB: Qdrant (self-hosted, Apache 2.0)
  - RAG Framework: LlamaIndex for historical quote search
  - Agents: LangGraph for multi-step reasoning
  - Embeddings: Open source models (all-MiniLM-L6-v2 or similar)
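
As a taste of the Phase 3 stack, a minimal sketch of quote-similarity search with qdrant-client and sentence-transformers; the collection name, payloads, and quote text are illustrative assumptions.

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
client = QdrantClient(url="http://localhost:6333")

# One collection of historical quotes, cosine similarity over embeddings.
client.recreate_collection(
    collection_name="quotes",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)
client.upsert("quotes", points=[
    PointStruct(id=1,
                vector=model.encode("EN8 shaft, turned, 2.4 kg").tolist(),
                payload={"quote_id": "Q-1042"}),  # illustrative payload
])

# "Find similar past quotes" for a new enquiry.
for hit in client.search("quotes",
                         query_vector=model.encode("EN8 turned shaft").tolist(),
                         limit=5):
    print(hit.payload["quote_id"], round(hit.score, 3))
```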


Analytics & Historical Data Scaling

Decision: Start with PostgreSQL (Row-based) for historical trends. Evaluate Columnar Storage (DuckDB or Citus) if quote history exceeds 5M rows.

Rationale:

  - ✅ MVP Needs: Primarily transactional (save/retrieve single quote).
  - ✅ Team Productivity: Single DB (PostgreSQL) reduces infra overhead.
  - ➕ Scale Path: If complex queries over 10 years of price history become slow, we will add DuckDB as a local analytical engine (sketched below) or enable Citus for columnar partitioning.
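
If the scale trigger fires, the DuckDB path could look like the sketch below, which attaches the live PostgreSQL database read-only via DuckDB's postgres extension. The connection string and the quoted_at column are assumptions; fact_cost and dim_material mirror the metric example above.

```python
import duckdb

con = duckdb.connect()
con.execute("INSTALL postgres")
con.execute("LOAD postgres")
# Read-only attach keeps the analytical engine away from live writes.
con.execute("ATTACH 'dbname=costengine user=app host=localhost' "
            "AS pg (TYPE postgres, READ_ONLY)")

# Columnar scan over long price history without loading PostgreSQL itself.
trend = con.execute("""
    SELECT m.material_grade,
           date_trunc('month', f.quoted_at) AS month,  -- quoted_at is assumed
           avg(m.rm_rate)                   AS avg_rm_rate
    FROM pg.fact_cost f
    JOIN pg.dim_material m USING (material_id)
    GROUP BY 1, 2
    ORDER BY 1, 2
""").df()
print(trend.head())
```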


NOT Using

| Tool | Reason for Exclusion |
| --- | --- |
| Bauplan | Overkill for transactional app; designed for data lakehouses |
| Rill Data | Focused on embedded analytics; Superset better fit |
| Airflow | Batch-oriented; Temporal better for transactional workflows |
| dbt (Phase 1) | Custom YAML simpler for MVP; migrate later when complexity grows |
| Cube.dev Cloud | Paid features not needed yet; OSS version or custom sufficient |

Consequences

Positive

  1. Zero Vendor Lock-In: All components are open source with permissive licenses
  2. Cost Efficiency: Self-hosted stack, no SaaS fees until scale
  3. Team Familiarity: Python + TypeScript + PostgreSQL are widely known
  4. Migration Path: Custom YAML → dbt is straightforward
  5. Flexibility: Can add Temporal/AI layer when actually needed

Negative

  1. Infrastructure Overhead: Need to manage PostgreSQL, Redis, MinIO ourselves
  2. Custom Semantic Layer: More code to maintain vs. using dbt/Cube
  3. Initial Complexity: Temporal learning curve when we need it
  4. No Built-in Lineage: Unlike dbt, need to build metric dependency tracking

Risks & Mitigations

| Risk | Mitigation |
| --- | --- |
| Custom YAML becomes unmaintainable | Limit to 50-100 metrics; migrate to dbt if > 100 |
| Celery insufficient for workflows | Monitor job failures; add Temporal when retry logic critical |
| PostgreSQL scale limits | Start with read replicas; consider partitioning if > 1M quotes |
| Team unfamiliar with Temporal | Delay until Phase 2; invest in training when adding |

Implementation Plan

Phase 1: MVP (Months 0-6)

✅ Next.js + PostgreSQL + Redis/Celery
✅ Custom YAML semantic layer
✅ MinIO for Excel/PDF storage
✅ Keycloak for auth
✅ Basic Superset dashboards

Deliverable: Working quote generation from Excel uploads


Phase 2: Scale (Months 6-12)

➕ Temporal for Tally ERP sync
➕ Migrate to dbt for semantic layer (if > 100 metrics)
➕ Horizontal PostgreSQL scaling
➕ Advanced Superset dashboards

Trigger: > 1000 quotes/day or ERP integration launches


Phase 3: AI Features (Months 12-18)

➕ Qdrant vector database
➕ LlamaIndex for RAG over quotes
➕ LangGraph for AI copilot
➕ Evaluate Bauplan if pivoting to analytics platform

Trigger: Customer demand for AI-assisted costing


Alternatives Considered

1. Cube.dev OSS vs. Custom YAML vs. Boring Semantic Layer

| Factor | Cube.dev OSS | Custom YAML (Chosen) | Boring Semantic Layer |
| --- | --- | --- | --- |
| Learning curve | Medium | Low | Low |
| Flexibility | High | Very High | Medium |
| Caching | Built-in | Need to build | Need to build |
| Migration to dbt | Medium effort | Low effort | Medium effort |
| Vendor risk | Open core model | None | None (OSI approved) |
| LLM Integration | No | No | Yes (designed for) |
| Database Support | Multi-DB | Any | Multi-DB (via Ibis) |

Boring Semantic Layer Assessment:

  - ✅ Open source: Apache 2.0, lightweight (pip install boring-semantic-layer)
  - ✅ Built on Ibis: Supports DuckDB, PostgreSQL, BigQuery, Snowflake
  - ✅ LLM-friendly: Designed for connecting LLMs to structured data
  - ⚠️ Phase 3 candidate: Good fit when adding AI copilot features
  - ❌ Not for MVP: Optimized for LLM use cases, not transactional costing

Decision:

  - Phase 1 (MVP): Custom YAML (simpler, full control)
  - Phase 2 (if complexity grows): Migrate to dbt
  - Phase 3 (AI features): Consider Boring Semantic Layer for LLM→Database bridge

Rationale: Boring Semantic Layer is excellent for AI/RAG use cases but adds unnecessary abstraction for deterministic costing formulas in MVP.

2. Prefect vs. Temporal vs. Celery

| Factor | Celery (Phase 1) | Temporal (Phase 2) | Prefect |
| --- | --- | --- | --- |
| Simplicity | High | Medium | Medium |
| Durability | Basic | Excellent | Good |
| Versioning support | No | Yes (built-in) | No |
| Learning curve | Low | High | Medium |

Decision: Start Celery, add Temporal when workflow complexity increases

3. Metabase vs. Superset

| Factor | Metabase | Superset (Chosen) |
| --- | --- | --- |
| License | AGPL | Apache 2.0 |
| Customization | Limited | Extensive |
| SQL access | Yes | Yes |
| Embedding | Basic | Advanced |

Decision: Superset for deeper customization needs.



Revision History

| Date | Author | Changes |
| --- | --- | --- |
| 2026-01-24 | Engineering Team | Initial ADR |

Next ADRs:

  - ADR-002: Database Schema (Star vs. Snowflake)
  - ADR-003: Excel Parsing Strategy (pandas vs. openpyxl)
  - ADR-004: Quote Versioning Implementation