Analytics, engineering, warehousing, governance, ML ops, visualization. The layer that turns raw events into decisions — and the one regulators care most about. Every domain in the tree produces data; this branch is where you make sense of it.
Six sub-categories under data/. Analytics is live with Apache Superset as the first leaf. The rest are mapped and landing as content ships.
BI tools, dashboards, data exploration. Apache Superset is the first live leaf and the 2nth default for new builds. Metabase, Evidence, Druid, and Grafana mapped as stubs.
4 tools · 1 live
ETL/ELT pipelines, dbt transformations, orchestration with Airflow and Dagster, and document parsing via LiteParse. The plumbing that gets data (structured and unstructured) from source systems into warehouses and vector stores.
5 databases · 1 live
PostgreSQL, ClickHouse, DuckDB, BigQuery, Snowflake. The storage layer analytics tools read from: OLTP to OLAP, star schemas to wide tables, and the tradeoffs between them.
POPIA compliance, data quality frameworks, lineage tracking, cataloging, and access control. The regulatory and operational guardrails that keep data trustworthy and lawful.
Model training, deployment, monitoring, and feature stores. The engineering discipline that turns a notebook experiment into a production prediction service.
Charting libraries, mapping, reporting tools beyond BI dashboards. D3, Observable, deck.gl, and the patterns for building data stories that don't need a full BI platform.
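The warehousing entry above mentions the tradeoff between star schemas and wide tables. A minimal sketch of that tradeoff follows, using Python's built-in sqlite3 module as a stand-in for a real warehouse. The table and column names (fact_orders, dim_customer, wide_orders) are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory SQLite stands in for a warehouse; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE fact_orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    amount REAL
);
INSERT INTO dim_customer VALUES (1, 'north'), (2, 'south');
INSERT INTO fact_orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);

-- Wide table: the dimension attribute is copied onto every order row.
CREATE TABLE wide_orders AS
    SELECT f.order_id, f.amount, d.region
    FROM fact_orders f JOIN dim_customer d USING (customer_id);
""")

# Star schema: the region lives in one place, so each query pays for a join.
star_totals = conn.execute("""
    SELECT d.region, SUM(f.amount)
    FROM fact_orders f JOIN dim_customer d USING (customer_id)
    GROUP BY d.region ORDER BY d.region
""").fetchall()

# Wide table: no join at query time, but the region is duplicated per row.
wide_totals = conn.execute("""
    SELECT region, SUM(amount)
    FROM wide_orders
    GROUP BY region ORDER BY region
""").fetchall()

print(star_totals)
print(wide_totals)
```

Both queries return the same totals per region. The difference is where the cost sits: the star schema keeps each attribute in one place and joins at read time, while the wide table skips the join and accepts duplicated values that must be rebuilt when a dimension changes.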
Apache Superset for the read layer. LiteParse for getting unstructured documents into the pipeline in the first place. Salesforce Data Cloud as the CDP / unified-profile layer behind AgentForce. More leaves landing as the data branch grows.
Open-source BI platform. SQL Lab for ad-hoc queries, 40+ chart types, embedded analytics, RBAC with row-level security. The default BI tool for new 2nth builds — preferred over Power BI and Tableau for cost, flexibility, and data sovereignty.
Local document parsing from LlamaIndex. PDFs, DOCX, PPTX, XLSX, images → clean text or JSON with bounding boxes. Runs in a client VPC or on a laptop — no cloud, no LLM, no per-page pricing. The POPIA-safe ingestion layer for RAG.
The Customer Data Platform that grounds AgentForce. Real-time ingestion, identity resolution, harmonisation across structured and unstructured sources, RAG-ready semantic search. The load-bearing data licence for any serious AgentForce deployment.
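Row-level security, mentioned in the Superset entry above, generally works by attaching a filter predicate to a role and applying it to every query that role runs. The sketch below illustrates that idea in plain Python with sqlite3. It is not Superset's API: the RLS_RULES mapping, the role names, and the sales table are all hypothetical.

```python
import sqlite3

# Hypothetical role-to-predicate mapping (an illustration, not Superset's API).
# Predicates come from trusted configuration, never from user input.
RLS_RULES = {
    "analyst_za": "region = 'ZA'",
    "admin": "1 = 1",  # no restriction
}


def query_sales_for(conn, role):
    """Return the rows of the hypothetical sales table visible to a role."""
    predicate = RLS_RULES.get(role, "1 = 0")  # unknown roles see nothing
    sql = f"SELECT id, region, amount FROM sales WHERE {predicate} ORDER BY id"
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
INSERT INTO sales VALUES (1, 'ZA', 120.0), (2, 'KE', 80.0), (3, 'ZA', 40.0);
""")

print(query_sales_for(conn, "analyst_za"))
print(query_sales_for(conn, "admin"))
print(query_sales_for(conn, "guest"))
```

Defaulting unknown roles to a predicate that matches nothing is a deliberate choice in this sketch: a missing rule then hides data rather than exposing it.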
Data is produced by every domain in the tree. These are the branches it pulls on most — either because they generate the raw material (Business, Technology) or because they consume the output (Finance, Healthcare).