# ADR-001: Technology Stack for CostEngine (MfgIQ)
**Status:** Proposed
**Date:** 2026-01-24
**Decision Makers:** Engineering Team
**Stakeholders:** Product, Engineering
## Context
CostEngine (MfgIQ) is an intelligent manufacturing costing and quotation system targeting MSME/SME manufacturers. The system needs to:
- Import Excel costing templates with complex formulas
- Maintain a semantic layer for 50+ cost metrics
- Support quote versioning and comparison
- Integrate with ERP systems (Tally) asynchronously
- Generate professional quotes in multiple formats
- Scale to hundreds of concurrent users over time
**Key Constraints:**
- Must be fully open source to avoid vendor lock-in
- Team comfortable with Python and TypeScript
- Need rapid MVP delivery (6 months)
- Future AI/copilot features planned
## Decision
### Core Application Stack
| Component | Choice | License | Rationale |
|---|---|---|---|
| Frontend & API | Next.js | MIT | Modern React framework, excellent DX, easy deployment |
| Primary Database | PostgreSQL 15+ | PostgreSQL | Battle-tested, JSONB support for flexible schemas |
| Cache & Queue | Redis + Celery | AGPLv3 (Redis 8+) + BSD (Celery) | Proven stack for background jobs (Excel parsing, PDF gen) |
| Object Storage | MinIO | AGPL v3 | Self-hosted S3-compatible, no cloud costs |
| Authentication | Keycloak | Apache 2.0 | Open source, supports SAML/OAuth for enterprise |
### Semantic Layer (Phase 1: Custom YAML)
Decision: Start with custom YAML metric definitions instead of Cube.dev or dbt.
Rationale:
- ✅ Metrics already documented in semantic-layer.md
- ✅ Domain well-understood (deterministic formulas)
- ✅ Zero dependency on external platforms
- ✅ Full control over calculation logic
- ✅ Easy migration to dbt when complexity increases
Example Metric Definition:
```yaml
# config/metrics/material_cost.yml
metric_id: raw_material_cost
display_name: "Raw Material Cost per Piece"
description: "Total raw material cost including yield loss"
category: material_costs
data_type: decimal
unit: INR
formula: "(gross_weight * rm_rate) / yield_percent"
sql_template: |
  SELECT
    (f.gross_weight * m.rm_rate) / NULLIF(f.yield_percent, 0) AS value
  FROM fact_cost f
  JOIN dim_material m ON f.material_id = m.material_id
  WHERE {{filters}}
dependencies:
  - gross_weight
  - rm_rate
  - yield_percent
aggregation: SUM
dimensions:
  - part_number
  - material_grade
  - time_period
```
**Migration Path:**
- Phase 1 (MVP): Custom YAML → Python calculation engine (sketched below)
- Phase 2 (Scale): Migrate to dbt metrics (similar YAML structure)
- Phase 3 (Enterprise): Consider Cube.dev Cloud if needed
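A minimal sketch of what the Phase 1 calculation engine could look like. The loader, the restricted `eval`, and the sample input values are illustrative assumptions, not a final design:

```python
# metrics_engine.py -- minimal sketch of the Phase 1 calculation engine.
# Loads YAML metric definitions like config/metrics/material_cost.yml and
# evaluates their formulas against a dict of input values.
from pathlib import Path
import yaml

def load_metrics(metrics_dir: str = "config/metrics") -> dict:
    """Load every metric definition, keyed by metric_id."""
    metrics = {}
    for path in Path(metrics_dir).glob("*.yml"):
        definition = yaml.safe_load(path.read_text())
        metrics[definition["metric_id"]] = definition
    return metrics

def evaluate(metric: dict, inputs: dict) -> float:
    """Evaluate a metric formula against input values.

    NOTE: eval() with stripped builtins is tolerable only because formulas
    come from our own config files; a production engine should swap in a
    proper expression parser.
    """
    missing = [d for d in metric["dependencies"] if d not in inputs]
    if missing:
        raise ValueError(f"missing inputs for {metric['metric_id']}: {missing}")
    return eval(metric["formula"], {"__builtins__": {}}, dict(inputs))

if __name__ == "__main__":
    metrics = load_metrics()
    # gross_weight in kg, rm_rate in INR/kg; yield given as a fraction here
    print(evaluate(metrics["raw_material_cost"],
                   {"gross_weight": 2.4, "rm_rate": 185.0,
                    "yield_percent": 0.92}))
```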
### Workflow Orchestration
Decision: Deferred to Phase 2. Start with Celery, add Temporal when critical.
**Phase 1 (Celery):**
- Simple background jobs: Excel parsing, PDF generation, emails (a task sketch follows)
- Good enough for < 1000 quotes/day
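A sketch of a Phase 1 background job; broker URLs, bucket name, credentials, and the sheet layout are illustrative assumptions:

```python
# tasks.py -- sketch of a Phase 1 Celery job: parse an uploaded Excel template.
# Broker/backend URLs, MinIO credentials, and sheet layout are assumptions.
import io
from celery import Celery
from minio import Minio
from openpyxl import load_workbook

app = Celery("costengine",
             broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/1")

storage = Minio("localhost:9000", access_key="minioadmin",
                secret_key="minioadmin", secure=False)

@app.task(bind=True, max_retries=3, retry_backoff=True)
def parse_costing_template(self, object_key: str) -> list[dict]:
    """Fetch an uploaded costing template from MinIO and extract its rows."""
    try:
        data = storage.get_object("uploads", object_key).read()
    except Exception as exc:
        # Transient storage hiccups: retry with exponential backoff.
        raise self.retry(exc=exc)
    wb = load_workbook(io.BytesIO(data), data_only=True)  # values, not formulas
    sheet = wb.active
    header = [c.value for c in next(sheet.iter_rows(min_row=1, max_row=1))]
    return [dict(zip(header, (c.value for c in row)))
            for row in sheet.iter_rows(min_row=2)]
```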
**Phase 2 (Temporal):** add when any of the following holds:
- ERP integration requires retry logic (UC-701: Tally sync)
- Bulk quote updates need reliability (UC-402: Material price changes)
- Long-running workflows exceed 30 seconds
**Choice: Temporal (MIT) over Prefect**
- Better durability guarantees
- Stronger consistency for versioning
- Python + TypeScript SDKs
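For Phase 2, the Tally sync could look like the sketch below using Temporal's Python SDK (`temporalio`); the activity body, task timing, and retry settings are illustrative assumptions:

```python
# tally_sync.py -- sketch of a Phase 2 Temporal workflow for Tally ERP sync.
# The activity body and retry settings are illustrative assumptions.
from datetime import timedelta
from temporalio import activity, workflow
from temporalio.common import RetryPolicy

@activity.defn
async def push_quote_to_tally(quote_id: str) -> str:
    # The actual Tally integration call would go here; stubbed in this sketch.
    return f"synced:{quote_id}"

@workflow.defn
class TallySyncWorkflow:
    @workflow.run
    async def run(self, quote_id: str) -> str:
        # Temporal persists workflow state, so retries survive worker crashes --
        # the durability guarantee that motivates adding it in Phase 2.
        return await workflow.execute_activity(
            push_quote_to_tally,
            quote_id,
            start_to_close_timeout=timedelta(minutes=2),
            retry_policy=RetryPolicy(maximum_attempts=5,
                                     backoff_coefficient=2.0),
        )
```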
### Analytics & BI
Decision: Apache Superset (Apache 2.0) for internal dashboards
**Rationale:**
- ✅ Fully open source (no Tableau/Looker costs)
- ✅ Direct SQL access to PostgreSQL
- ✅ Custom visualizations for cost breakdowns
- ✅ Can embed in Next.js app (token-exchange sketch below)
Alternative considered: Metabase (AGPL) - simpler but less powerful
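Embedding in the Next.js app goes through Superset's embedded-dashboard flow: the backend logs in, then exchanges its access token for a guest token the frontend embedding SDK consumes. A sketch of that exchange; host, credentials, and the dashboard UUID are placeholders, and depending on configuration a CSRF token may also be required:

```python
# superset_embed.py -- sketch of fetching a Superset guest token for embedding.
# Host, credentials, and dashboard UUID are placeholders.
import requests

SUPERSET = "http://localhost:8088"

def get_guest_token(dashboard_uuid: str) -> str:
    # Authenticate as a backend service account.
    login = requests.post(f"{SUPERSET}/api/v1/security/login", json={
        "username": "admin", "password": "admin",
        "provider": "db", "refresh": True,
    })
    access_token = login.json()["access_token"]
    # Exchange for a guest token scoped to one embedded dashboard.
    resp = requests.post(
        f"{SUPERSET}/api/v1/security/guest_token/",
        headers={"Authorization": f"Bearer {access_token}"},
        json={
            "user": {"username": "viewer"},
            "resources": [{"type": "dashboard", "id": dashboard_uuid}],
            "rls": [],  # row-level security rules; none in this sketch
        },
    )
    return resp.json()["token"]
```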
### AI/RAG Layer (Phase 3)
Decision: LlamaIndex (MIT) + LangGraph (MIT) + Qdrant (Apache 2.0)
**Why deferred to Phase 3:**
- MVP costing is deterministic (formulas, no ML needed)
- AI useful for: quote similarity, cost prediction, chatbot
**Stack when needed:**
- Vector DB: Qdrant (self-hosted, Apache 2.0)
- RAG framework: LlamaIndex for historical quote search
- Agents: LangGraph for multi-step reasoning
- Embeddings: open source models (all-MiniLM-L6-v2 or similar; indexing sketch below)
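The quote-similarity piece could be as small as the sketch below, using `qdrant-client` and `sentence-transformers` directly; the collection name, payload fields, and sample quotes are illustrative, and LlamaIndex would layer RAG over the same collection:

```python
# quote_similarity.py -- sketch: index quotes in Qdrant for similarity search.
# Collection name, payload fields, and sample texts are illustrative.
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings
client = QdrantClient(url="http://localhost:6333")

if not client.collection_exists("quotes"):
    client.create_collection(
        collection_name="quotes",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )

quotes = [
    (1, "Machined EN8 steel bracket, 500 pcs, zinc plated"),
    (2, "Die-cast aluminium housing, 2000 pcs, powder coated"),
]
client.upsert(
    collection_name="quotes",
    points=[PointStruct(id=qid, vector=model.encode(text).tolist(),
                        payload={"text": text})
            for qid, text in quotes],
)

hits = client.search(
    collection_name="quotes",
    query_vector=model.encode("steel bracket machining").tolist(),
    limit=3,
)
for hit in hits:
    print(hit.score, hit.payload["text"])
```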
### Analytics & Historical Data Scaling
Decision: Start with PostgreSQL (Row-based) for historical trends. Evaluate Columnar Storage (DuckDB or Citus) if quote history exceeds 5M rows.
**Rationale:**
- ✅ MVP needs: primarily transactional (save/retrieve single quote)
- ✅ Team productivity: a single DB (PostgreSQL) reduces infra overhead
- ➕ Scale path: if complex queries over 10 years of price history become slow, add DuckDB as a local analytical engine (sketch below) or enable Citus for columnar partitioning
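If that point is reached, the DuckDB path could be as light as querying Parquet exports of quote history; the file layout and column names in this sketch are illustrative:

```python
# price_trends.py -- sketch of DuckDB as a local analytical engine.
# Assumes quote history is periodically exported to Parquet;
# paths and column names are illustrative.
import duckdb

# Ten years of monthly rm_rate trends straight off the Parquet exports,
# with no extra database server to run.
duckdb.sql("""
    SELECT material_grade,
           date_trunc('month', quoted_at) AS month,
           avg(rm_rate)                   AS avg_rm_rate
    FROM read_parquet('exports/quote_history_*.parquet')
    GROUP BY material_grade, month
    ORDER BY material_grade, month
""").show()
```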
### NOT Using
| Tool | Reason for Exclusion |
|---|---|
| Bauplan | Overkill for transactional app; designed for data lakehouses |
| Rill Data | Focused on embedded analytics; Superset better fit |
| Airflow | Batch-oriented; Temporal better for transactional workflows |
| dbt (Phase 1) | Custom YAML simpler for MVP; migrate later when complexity grows |
| Cube.dev Cloud | Paid features not needed yet; OSS version or custom sufficient |
## Consequences
### Positive
- Zero Vendor Lock-In: All components are open source with permissive licenses
- Cost Efficiency: Self-hosted stack, no SaaS fees until scale
- Team Familiarity: Python + TypeScript + PostgreSQL are widely known
- Migration Path: Custom YAML → dbt is straightforward
- Flexibility: Can add Temporal/AI layer when actually needed
### Negative
- Infrastructure Overhead: Need to manage PostgreSQL, Redis, MinIO ourselves
- Custom Semantic Layer: More code to maintain vs. using dbt/Cube
- Initial Complexity: Temporal learning curve when we need it
- No Built-in Lineage: Unlike dbt, need to build metric dependency tracking
## Risks & Mitigations
| Risk | Mitigation |
|---|---|
| Custom YAML becomes unmaintainable | Limit to 50-100 metrics; migrate to dbt if > 100 |
| Celery insufficient for workflows | Monitor job failures; add Temporal when retry logic critical |
| PostgreSQL scale limits | Start with read replicas; consider partitioning if > 1M quotes |
| Team unfamiliar with Temporal | Delay until Phase 2; invest in training when adding |
## Implementation Plan
### Phase 1: MVP (Months 0-6)
- ✅ Next.js + PostgreSQL + Redis/Celery
- ✅ Custom YAML semantic layer
- ✅ MinIO for Excel/PDF storage
- ✅ Keycloak for auth
- ✅ Basic Superset dashboards

**Deliverable:** Working quote generation from Excel uploads
### Phase 2: Scale (Months 6-12)
- ➕ Temporal for Tally ERP sync
- ➕ Migrate to dbt for semantic layer (if > 100 metrics)
- ➕ Horizontal PostgreSQL scaling
- ➕ Advanced Superset dashboards

**Trigger:** > 1000 quotes/day or ERP integration launches
### Phase 3: AI Features (Months 12-18)
- ➕ Qdrant vector database
- ➕ LlamaIndex for RAG over quotes
- ➕ LangGraph for AI copilot
- ➕ Evaluate Bauplan if pivoting to analytics platform

**Trigger:** Customer demand for AI-assisted costing
## Alternatives Considered
### 1. Cube.dev OSS vs. Custom YAML vs. Boring Semantic Layer
| Factor | Cube.dev OSS | Custom YAML (Chosen) | Boring Semantic Layer |
|---|---|---|---|
| Learning curve | Medium | Low | Low |
| Flexibility | High | Very High | Medium |
| Caching | Built-in | Need to build | Need to build |
| Migration to dbt | Medium effort | Low effort | Medium effort |
| Vendor risk | Open core model | None | None (OSI approved) |
| LLM Integration | No | No | Yes (designed for) |
| Database Support | Multi-DB | Any | Multi-DB (via Ibis) |
**Boring Semantic Layer Assessment:**
- ✅ Open source: Apache 2.0, lightweight (`pip install boring-semantic-layer`)
- ✅ Built on Ibis: Supports DuckDB, PostgreSQL, BigQuery, Snowflake
- ✅ LLM-friendly: Designed for connecting LLMs to structured data
- ⚠️ Phase 3 candidate: Good fit when adding AI copilot features
- ❌ Not for MVP: Optimized for LLM use cases, not transactional costing
**Decision:**
- Phase 1 (MVP): Custom YAML (simpler, full control)
- Phase 2 (if complexity grows): Migrate to dbt
- Phase 3 (AI features): Consider Boring Semantic Layer as the LLM→database bridge
Rationale: Boring Semantic Layer is excellent for AI/RAG use cases but adds unnecessary abstraction for deterministic costing formulas in MVP.
### 2. Prefect vs. Temporal vs. Celery
| Factor | Celery (Phase 1) | Temporal (Phase 2) | Prefect |
|---|---|---|---|
| Simplicity | High | Medium | Medium |
| Durability | Basic | Excellent | Good |
| Versioning support | No | Yes (built-in) | No |
| Learning curve | Low | High | Medium |
Decision: Start with Celery; add Temporal when workflow complexity increases.
### 3. Metabase vs. Superset
| Factor | Metabase | Superset (Chosen) |
|---|---|---|
| License | AGPL | Apache 2.0 |
| Customization | Limited | Extensive |
| SQL access | Yes | Yes |
| Embedding | Basic | Advanced |
Decision: Superset for deeper customization needs.
## References
- Semantic Layer Documentation
- Data Pipeline Architecture
- Use Case Estimates
- Temporal Documentation
- dbt Metrics
- Cube.dev OSS
## Revision History
| Date | Author | Changes |
|---|---|---|
| 2026-01-24 | Engineering Team | Initial ADR |
**Next ADRs:**
- ADR-002: Database Schema (Star vs. Snowflake)
- ADR-003: Excel Parsing Strategy (pandas vs. openpyxl)
- ADR-004: Quote Versioning Implementation