# ADR-001: Technology Stack for CostEngine (MfgIQ)
**Status:** Proposed
**Date:** 2026-01-24
**Decision Makers:** Engineering Team
**Stakeholders:** Product, Engineering
## Context
CostEngine (MfgIQ) is an intelligent manufacturing costing and quotation system targeting MSME/SME manufacturers. The system needs to:
- Import Excel costing templates with complex formulas
- Maintain a semantic layer for 50+ cost metrics
- Support quote versioning and comparison
- Integrate with ERP systems (Tally) asynchronously
- Generate professional quotes in multiple formats
- Scale to hundreds of concurrent users over time
**Key Constraints:**
- Must be fully open source to avoid vendor lock-in
- Team comfortable with Python and TypeScript
- Need rapid MVP delivery (6 months)
- Future AI/copilot features planned
## Decision
### Core Application Stack
| Component | Choice | License | Rationale |
|---|---|---|---|
| Frontend & API | Next.js | MIT | Modern React framework, excellent DX, easy deployment |
| Primary Database | PostgreSQL 15+ | PostgreSQL | Battle-tested, JSONB support for flexible schemas |
| Cache & Queue | Redis + Celery | AGPLv3 (Redis 8+) + BSD (Celery) | Proven stack for background jobs (Excel parsing, PDF gen) |
| Object Storage | MinIO | AGPL v3 | Self-hosted S3-compatible, no cloud costs |
| Authentication | Keycloak | Apache 2.0 | Open source, supports SAML/OAuth for enterprise |
### Semantic Layer (Phase 1: Custom YAML)
Decision: Start with custom YAML metric definitions instead of Cube.dev or dbt.
Rationale:
- ✅ Metrics already documented in semantic-layer.md
- ✅ Domain well-understood (deterministic formulas)
- ✅ Zero dependency on external platforms
- ✅ Full control over calculation logic
- ✅ Easy migration to dbt when complexity increases
Example Metric Definition:
```yaml
# config/metrics/material_cost.yml
metric_id: raw_material_cost
display_name: "Raw Material Cost per Piece"
description: "Total raw material cost including yield loss"
category: material_costs
data_type: decimal
unit: INR
formula: "(gross_weight * rm_rate) / yield_percent"
sql_template: |
  SELECT
    (f.gross_weight * m.rm_rate) / NULLIF(f.yield_percent, 0) AS value
  FROM fact_cost f
  JOIN dim_material m ON f.material_id = m.material_id
  WHERE {{filters}}
dependencies:
  - gross_weight
  - rm_rate
  - yield_percent
aggregation: SUM
dimensions:
  - part_number
  - material_grade
  - time_period
```
**Migration Path:**
- Phase 1 (MVP): Custom YAML → Python calculation engine (sketched below)
- Phase 2 (Scale): Migrate to dbt metrics (similar YAML structure)
- Phase 3 (Enterprise): Consider Cube.dev Cloud if needed
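A minimal sketch of what the Phase 1 calculation engine could look like. The loader, the restricted `eval`, and the sample input values are illustrative assumptions, not a final design:

```python
# metrics_engine.py -- minimal sketch of the Phase 1 calculation engine.
# Loads YAML metric definitions like config/metrics/material_cost.yml and
# evaluates their formulas against a dict of input values.
from pathlib import Path
import yaml

def load_metrics(metrics_dir: str = "config/metrics") -> dict:
    """Load every metric definition, keyed by metric_id."""
    metrics = {}
    for path in Path(metrics_dir).glob("*.yml"):
        definition = yaml.safe_load(path.read_text())
        metrics[definition["metric_id"]] = definition
    return metrics

def evaluate(metric: dict, inputs: dict) -> float:
    """Evaluate a metric formula against input values.

    NOTE: eval() with stripped builtins is tolerable only because formulas
    come from our own config files; a production engine should swap in a
    proper expression parser.
    """
    missing = [d for d in metric["dependencies"] if d not in inputs]
    if missing:
        raise ValueError(f"missing inputs for {metric['metric_id']}: {missing}")
    return eval(metric["formula"], {"__builtins__": {}}, dict(inputs))

if __name__ == "__main__":
    metrics = load_metrics()
    # gross_weight in kg, rm_rate in INR/kg; yield given as a fraction here
    print(evaluate(metrics["raw_material_cost"],
                   {"gross_weight": 2.4, "rm_rate": 185.0,
                    "yield_percent": 0.92}))
```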
### Workflow Orchestration
Decision: Deferred to Phase 2. Start with Celery, add Temporal when critical.
**Phase 1 (Celery):**
- Simple background jobs: Excel parsing, PDF generation, emails (a task sketch follows)
- Good enough for < 1000 quotes/day
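A sketch of a Phase 1 background job; broker URLs, bucket name, credentials, and the sheet layout are illustrative assumptions:

```python
# tasks.py -- sketch of a Phase 1 Celery job: parse an uploaded Excel template.
# Broker/backend URLs, MinIO credentials, and sheet layout are assumptions.
import io
from celery import Celery
from minio import Minio
from openpyxl import load_workbook

app = Celery("costengine",
             broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/1")

storage = Minio("localhost:9000", access_key="minioadmin",
                secret_key="minioadmin", secure=False)

@app.task(bind=True, max_retries=3, retry_backoff=True)
def parse_costing_template(self, object_key: str) -> list[dict]:
    """Fetch an uploaded costing template from MinIO and extract its rows."""
    try:
        data = storage.get_object("uploads", object_key).read()
    except Exception as exc:
        # Transient storage hiccups: retry with exponential backoff.
        raise self.retry(exc=exc)
    wb = load_workbook(io.BytesIO(data), data_only=True)  # values, not formulas
    sheet = wb.active
    header = [c.value for c in next(sheet.iter_rows(min_row=1, max_row=1))]
    return [dict(zip(header, (c.value for c in row)))
            for row in sheet.iter_rows(min_row=2)]
```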
**Phase 2 (Temporal):** add when any of the following holds:
- ERP integration requires retry logic (UC-701: Tally sync)
- Bulk quote updates need reliability (UC-402: Material price changes)
- Long-running workflows exceed 30 seconds
**Choice: Temporal (MIT) over Prefect**
- Better durability guarantees
- Stronger consistency for versioning
- Python + TypeScript SDKs
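For Phase 2, the Tally sync could look like the sketch below using Temporal's Python SDK (`temporalio`); the activity body, task timing, and retry settings are illustrative assumptions:

```python
# tally_sync.py -- sketch of a Phase 2 Temporal workflow for Tally ERP sync.
# The activity body and retry settings are illustrative assumptions.
from datetime import timedelta
from temporalio import activity, workflow
from temporalio.common import RetryPolicy

@activity.defn
async def push_quote_to_tally(quote_id: str) -> str:
    # The actual Tally integration call would go here; stubbed in this sketch.
    return f"synced:{quote_id}"

@workflow.defn
class TallySyncWorkflow:
    @workflow.run
    async def run(self, quote_id: str) -> str:
        # Temporal persists workflow state, so retries survive worker crashes --
        # the durability guarantee that motivates adding it in Phase 2.
        return await workflow.execute_activity(
            push_quote_to_tally,
            quote_id,
            start_to_close_timeout=timedelta(minutes=2),
            retry_policy=RetryPolicy(maximum_attempts=5,
                                     backoff_coefficient=2.0),
        )
```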
### Analytics & BI
Decision: Apache Superset (Apache 2.0) for internal dashboards
**Rationale:**
- ✅ Fully open source (no Tableau/Looker costs)
- ✅ Direct SQL access to PostgreSQL
- ✅ Custom visualizations for cost breakdowns
- ✅ Can embed in Next.js app (token-exchange sketch below)
Alternative considered: Metabase (AGPL) - simpler but less powerful
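Embedding in the Next.js app goes through Superset's embedded-dashboard flow: the backend logs in, then exchanges its access token for a guest token the frontend embedding SDK consumes. A sketch of that exchange; host, credentials, and the dashboard UUID are placeholders, and depending on configuration a CSRF token may also be required:

```python
# superset_embed.py -- sketch of fetching a Superset guest token for embedding.
# Host, credentials, and dashboard UUID are placeholders.
import requests

SUPERSET = "http://localhost:8088"

def get_guest_token(dashboard_uuid: str) -> str:
    # Authenticate as a backend service account.
    login = requests.post(f"{SUPERSET}/api/v1/security/login", json={
        "username": "admin", "password": "admin",
        "provider": "db", "refresh": True,
    })
    access_token = login.json()["access_token"]
    # Exchange for a guest token scoped to one embedded dashboard.
    resp = requests.post(
        f"{SUPERSET}/api/v1/security/guest_token/",
        headers={"Authorization": f"Bearer {access_token}"},
        json={
            "user": {"username": "viewer"},
            "resources": [{"type": "dashboard", "id": dashboard_uuid}],
            "rls": [],  # row-level security rules; none in this sketch
        },
    )
    return resp.json()["token"]
```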
### AI/RAG Layer (Phase 3)
Decision: LlamaIndex (MIT) + LangGraph (MIT) + Qdrant (Apache 2.0)
**Why deferred to Phase 3:**
- MVP costing is deterministic (formulas, no ML needed)
- AI useful for: quote similarity, cost prediction, chatbot
**Stack when needed:**
- Vector DB: Qdrant (self-hosted, Apache 2.0)
- RAG framework: LlamaIndex for historical quote search
- Agents: LangGraph for multi-step reasoning
- Embeddings: open source models (all-MiniLM-L6-v2 or similar; indexing sketch below)
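The quote-similarity piece could be as small as the sketch below, using `qdrant-client` and `sentence-transformers` directly; the collection name, payload fields, and sample quotes are illustrative, and LlamaIndex would layer RAG over the same collection:

```python
# quote_similarity.py -- sketch: index quotes in Qdrant for similarity search.
# Collection name, payload fields, and sample texts are illustrative.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings
client = QdrantClient(url="http://localhost:6333")

if not client.collection_exists("quotes"):
    client.create_collection(
        collection_name="quotes",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )

quotes = [
    (1, "Machined EN8 steel bracket, 500 pcs, zinc plated"),
    (2, "Die-cast aluminium housing, 2000 pcs, powder coated"),
]
client.upsert(
    collection_name="quotes",
    points=[PointStruct(id=qid, vector=model.encode(text).tolist(),
                        payload={"text": text})
            for qid, text in quotes],
)

hits = client.search(
    collection_name="quotes",
    query_vector=model.encode("steel bracket machining").tolist(),
    limit=3,
)
for hit in hits:
    print(hit.score, hit.payload["text"])
```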
### Analytics & Historical Data Scaling
Decision: Start with PostgreSQL (Row-based) for historical trends. Evaluate Columnar Storage (DuckDB or Citus) if quote history exceeds 5M rows.
**Rationale:**
- ✅ MVP needs: primarily transactional (save/retrieve single quote)
- ✅ Team productivity: a single DB (PostgreSQL) reduces infra overhead
- ➕ Scale path: if complex queries over 10 years of price history become slow, add DuckDB as a local analytical engine (sketch below) or enable Citus for columnar partitioning
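If that point is reached, the DuckDB path could be as light as querying Parquet exports of quote history; the file layout and column names in this sketch are illustrative:

```python
# price_trends.py -- sketch of DuckDB as a local analytical engine.
# Assumes quote history is periodically exported to Parquet;
# paths and column names are illustrative.
import duckdb

# Ten years of monthly rm_rate trends straight off the Parquet exports,
# with no extra database server to run.
duckdb.sql("""
    SELECT material_grade,
           date_trunc('month', quoted_at) AS month,
           avg(rm_rate)                   AS avg_rm_rate
    FROM read_parquet('exports/quote_history_*.parquet')
    GROUP BY material_grade, month
    ORDER BY material_grade, month
""").show()
```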
### NOT Using
| Tool | Reason for Exclusion |
|---|---|
| Bauplan | Overkill for transactional app; designed for data lakehouses |
| Rill Data | Focused on embedded analytics; Superset better fit |
| Airflow | Batch-oriented; Temporal better for transactional workflows |
| dbt (Phase 1) | Custom YAML simpler for MVP; migrate later when complexity grows |
| Cube.dev Cloud | Paid features not needed yet; OSS version or custom sufficient |
## Consequences
### Positive
- Zero Vendor Lock-In: All components are open source with permissive licenses
- Cost Efficiency: Self-hosted stack, no SaaS fees until scale
- Team Familiarity: Python + TypeScript + PostgreSQL are widely known
- Migration Path: Custom YAML → dbt is straightforward
- Flexibility: Can add Temporal/AI layer when actually needed
### Negative
- Infrastructure Overhead: Need to manage PostgreSQL, Redis, MinIO ourselves
- Custom Semantic Layer: More code to maintain vs. using dbt/Cube
- Initial Complexity: Temporal learning curve when we need it
- No Built-in Lineage: Unlike dbt, need to build metric dependency tracking
## Risks & Mitigations
| Risk | Mitigation |
|---|---|
| Custom YAML becomes unmaintainable | Limit to 50-100 metrics; migrate to dbt if > 100 |
| Celery insufficient for workflows | Monitor job failures; add Temporal when retry logic critical |
| PostgreSQL scale limits | Start with read replicas; consider partitioning if > 1M quotes |
| Team unfamiliar with Temporal | Delay until Phase 2; invest in training when adding |
## Implementation Plan
### Phase 1: MVP (Months 0-6)
- ✅ Next.js + PostgreSQL + Redis/Celery
- ✅ Custom YAML semantic layer
- ✅ MinIO for Excel/PDF storage
- ✅ Keycloak for auth
- ✅ Basic Superset dashboards

**Deliverable:** Working quote generation from Excel uploads
### Phase 2: Scale (Months 6-12)
- ➕ Temporal for Tally ERP sync
- ➕ Migrate to dbt for semantic layer (if > 100 metrics)
- ➕ Horizontal PostgreSQL scaling
- ➕ Advanced Superset dashboards

**Trigger:** > 1000 quotes/day or ERP integration launches
### Phase 3: AI Features (Months 12-18)
- ➕ Qdrant vector database
- ➕ LlamaIndex for RAG over quotes
- ➕ LangGraph for AI copilot
- ➕ Evaluate Bauplan if pivoting to analytics platform

**Trigger:** Customer demand for AI-assisted costing
## Alternatives Considered
### 1. Cube.dev OSS vs. Custom YAML vs. Boring Semantic Layer
| Factor | Cube.dev OSS | Custom YAML (Chosen) | Boring Semantic Layer |
|---|---|---|---|
| Learning curve | Medium | Low | Low |
| Flexibility | High | Very High | Medium |
| Caching | Built-in | Need to build | Need to build |
| Migration to dbt | Medium effort | Low effort | Medium effort |
| Vendor risk | Open core model | None | None (OSI approved) |
| LLM Integration | No | No | Yes (designed for) |
| Database Support | Multi-DB | Any | Multi-DB (via Ibis) |
**Boring Semantic Layer Assessment:**
- ✅ Open source: Apache 2.0, lightweight (`pip install boring-semantic-layer`)
- ✅ Built on Ibis: Supports DuckDB, PostgreSQL, BigQuery, Snowflake
- ✅ LLM-friendly: Designed for connecting LLMs to structured data
- ⚠️ Phase 3 candidate: Good fit when adding AI copilot features
- ❌ Not for MVP: Optimized for LLM use cases, not transactional costing
**Decision:**
- Phase 1 (MVP): Custom YAML (simpler, full control)
- Phase 2 (if complexity grows): Migrate to dbt
- Phase 3 (AI features): Consider Boring Semantic Layer as the LLM→database bridge
Rationale: Boring Semantic Layer is excellent for AI/RAG use cases but adds unnecessary abstraction for deterministic costing formulas in MVP.
### 2. Prefect vs. Temporal vs. Celery
| Factor | Celery (Phase 1) | Temporal (Phase 2) | Prefect |
|---|---|---|---|
| Simplicity | High | Medium | Medium |
| Durability | Basic | Excellent | Good |
| Versioning support | No | Yes (built-in) | No |
| Learning curve | Low | High | Medium |
Decision: Start with Celery; add Temporal when workflow complexity increases.
### 3. Metabase vs. Superset
| Factor | Metabase | Superset (Chosen) |
|---|---|---|
| License | AGPL | Apache 2.0 |
| Customization | Limited | Extensive |
| SQL access | Yes | Yes |
| Embedding | Basic | Advanced |
Decision: Superset for deeper customization needs.
## References
- Semantic Layer Documentation
- Data Pipeline Architecture
- Use Case Estimates
- Temporal Documentation
- dbt Metrics
- Cube.dev OSS
## Revision History
| Date | Author | Changes |
|---|---|---|
| 2026-01-24 | Engineering Team | Initial ADR |
**Next ADRs:**
- ADR-002: Database Schema (Star vs. Snowflake)
- ADR-003: Excel Parsing Strategy (pandas vs. openpyxl)
- ADR-004: Quote Versioning Implementation