Benchmark

Architecture & Paradigm

  • Runink: A Go/Linux-native, vertically integrated data platform that combines execution, scheduling, governance, and observability in a single runtime. Unlike traditional stacks, Runink does not rely on Kubernetes or external orchestrators. Instead, it uses a Raft-based control plane to ensure high availability and consensus across services like scheduling, metadata, and security — forming a distributed operating model purpose-built for data.

  • Competitors: Use a layered, loosely coupled stack:

    • Execution: Spark, Beam (JVM-based)
    • Orchestration: Airflow, Dagster (often on Kubernetes)
    • Transformation: DBT (runs SQL on external data platforms)
    • Cluster Management: Kubernetes, Slurm
    • Governance: Collibra, Apache Atlas (external)
  • Key Differentiator: Runink is built from the ground up as a distributed system — with Raft consensus at its core — whereas competitors compose multiple tools that communicate asynchronously or rely on external state systems.


1. MapReduce vs. RDD vs. Raft for Data Pipelines


1.1 Architecture & Paradigm

| Aspect | MapReduce | RDD | Raft (Runink model) |
| --- | --- | --- | --- |
| Origin | Google (2004) | Spark (2010) | Raft (2013), adapted for distributed control |
| Execution Model | Batch, two-stage (Map → Reduce) | In-memory DAGs of transformations | Real-time coordination of distributed nodes |
| Consistency Model | Eventual (job outputs persisted) | Best effort (job outputs in memory, lineage for recovery) | Strong consistency (N/2+1 consensus) |
| Primary Use | Large batch analytics | Interactive, iterative analytics | Distributed metadata/state management for pipelines |
| Fault Tolerance | Output checkpointing | Lineage-based recomputation | Log replication and state machine replication |

1.2 Performance & Efficiency

| Aspect | MapReduce | RDD | Raft (Runink) |
| --- | --- | --- | --- |
| Cold Start Time | High (JVM startup, slot allocation) | Medium (Spark cluster overhead) | Low (Go processes, native scheduling) |
| Memory Use | Disk-heavy | Memory-heavy (RDD caching) | Lightweight (control metadata, not bulk data) |
| I/O Overhead | Heavy disk I/O (HDFS reads/writes) | Network/memory optimized, but needs enough RAM | Minimal (only metadata replication) |
| Pipeline Complexity | Requires multiple jobs for DAGs | Natural DAG execution | Direct DAG compilation from DSLs (Runink) |

1.3 Data Governance and Lineage

| Aspect | MapReduce | RDD | Raft (Runink) |
| --- | --- | --- | --- |
| Built-in Lineage | No (external) | Yes (RDD lineage graph) | Yes (atomic commit of contracts, steps, runs) |
| Governance APIs | Manual (logs, job output) | Partial (Spark listeners) | Native (contracts, lineage logs, per-slice metadata) |
| Auditability | Hard to reconstruct | Possible with effort | Native per-run audit logs, Raft-signed events |

1.4 Fault Tolerance and Recovery

| Aspect | MapReduce | RDD | Raft (Runink) |
| --- | --- | --- | --- |
| Recovery Mechanism | Re-run failed jobs | Recompute from lineage | Replay committed log entries |
| Failure Impact | Full-stage re-execution | Depends on lost partitions | Minimal if quorum is maintained |
| Availability Guarantee | None | Partial (driver failure = job loss) | Strong (as long as a majority of nodes are alive) |

1.5 Security and Isolation

| Aspect | MapReduce | RDD | Raft (Runink) |
| --- | --- | --- | --- |
| Authentication | Optional | Optional | Mandatory (OIDC, RBAC) |
| Secrets Management | Ad hoc | Ad hoc | Native, Raft-backed, scoped by Herds |
| Multi-Tenancy | None | None | Herd isolation (namespace + cgroup enforcement) |

1.6 Real-World Example

Imagine a critical pipeline for trade settlement:

  • MapReduce would force every job to write to disk between stages — slow and painful for debugging.
  • RDD would speed things up but require heavy RAM and still risk full job loss if the driver fails.
  • Raft (Runink) keeps every contract, every transformation, every secret atomically committed and recoverable — even if a node crashes mid-run, the system can resume from the last committed stage safely.
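
To make the recovery story concrete, here is a minimal Go sketch of resuming from the last committed stage. The types and stage names are hypothetical, chosen only to illustrate skipping work that was already durably committed before a crash.

```go
// Hypothetical sketch: resume a pipeline from its last committed stage.
// StageResult and the stage names are illustrative, not Runink's actual API.
package main

import "fmt"

// StageResult records a stage whose output was durably committed (e.g. via Raft).
type StageResult struct {
	Stage  string
	Output string
}

// resume re-runs only the stages that were never committed before the crash.
func resume(all []string, committed []StageResult, run func(stage string) StageResult) []StageResult {
	done := make(map[string]bool, len(committed))
	for _, c := range committed {
		done[c.Stage] = true
	}
	results := append([]StageResult{}, committed...)
	for _, s := range all {
		if done[s] {
			continue // already committed, no need to redo the work
		}
		results = append(results, run(s))
	}
	return results
}

func main() {
	stages := []string{"ingest", "validate", "enrich", "settle"}
	committed := []StageResult{{"ingest", "ok"}, {"validate", "ok"}} // state recovered after the crash
	out := resume(stages, committed, func(s string) StageResult { return StageResult{Stage: s, Output: "ok"} })
	fmt.Println(out) // only "enrich" and "settle" are re-executed
}
```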

2. Raft Advantages for Distributed Coordination

Runink:

  • Uses Raft for strong consistency and leader election across:

    • Control plane state (Herds, pipelines, RBAC, quotas)
    • Scheduler decisions
    • Secrets and metadata governance
  • Guarantees:

    • No split-brain conditions
    • Predictable and deterministic behavior in failure scenarios
    • Fault-tolerant HA (N/2+1 consensus), as the state-machine sketch after this section illustrates

Competitors:

  • Kubernetes uses etcd (Raft-backed), but tools like Airflow/Spark have no equivalent.
  • Scheduling decisions, lineage, and metadata handling are often eventually consistent or stored in external systems without consensus guarantees.
  • Result: higher complexity, latency, and coordination failure risks under scale or failure.
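
As an illustration of this pattern, the sketch below shows a tiny Raft state machine for control-plane state, built on the open-source github.com/hashicorp/raft library. The QuotaUpdate command and the FSM shape are assumptions made for the example; they are not Runink's actual types or implementation.

```go
// Illustrative sketch only: a minimal Raft state machine for control-plane
// state (Herd quotas). The command and state layout are assumptions, not
// Runink's real FSM; the pattern is what matters.
package controlplane

import (
	"encoding/json"
	"errors"
	"io"
	"sync"

	"github.com/hashicorp/raft"
)

// QuotaUpdate is a hypothetical control-plane command replicated through the log.
type QuotaUpdate struct {
	Herd     string `json:"herd"`
	CPUMilli int    `json:"cpu_milli"`
}

// controlFSM holds control-plane state; every replica converges to the same
// state because entries are applied only after a quorum (N/2+1) commits them.
type controlFSM struct {
	mu     sync.Mutex
	quotas map[string]int // herd name -> CPU quota in millicores
}

// Apply is invoked by the Raft library for each committed log entry.
func (f *controlFSM) Apply(l *raft.Log) interface{} {
	var cmd QuotaUpdate
	if err := json.Unmarshal(l.Data, &cmd); err != nil {
		return err
	}
	f.mu.Lock()
	defer f.mu.Unlock()
	f.quotas[cmd.Herd] = cmd.CPUMilli
	return nil
}

// Snapshot and Restore are required by raft.FSM; they are stubbed out here.
func (f *controlFSM) Snapshot() (raft.FSMSnapshot, error) {
	return nil, errors.New("snapshotting omitted in this sketch")
}

func (f *controlFSM) Restore(rc io.ReadCloser) error {
	return rc.Close()
}
```

In this model, the leader submits a marshalled command through its raft.Raft instance's Apply method; only after a majority of replicas acknowledge the entry does each replica's Apply run it, which is what prevents split-brain state divergence.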

3. Performance & Resource Efficiency

  • Runink:

    • Written in Go for low-latency cold starts and efficient concurrency.
    • Uses direct exec, cgroups, and namespaces rather than Docker/K8s layers (see the namespace sketch after this list).
    • Raft ensures low-overhead coordination, avoiding polling retries and state divergence.
  • Competitors:

    • Spark is JVM-based; powerful but resource-heavy.
    • K8s introduces orchestration latency, plus pod startup and scheduling delays.
    • Airflow relies on Celery/K8s executors with less efficient scheduling granularity.
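
The sketch below shows the kind of container-less launch this implies: a step executed directly in fresh Linux namespaces via Go's syscall package. The binary path and flag choice are illustrative assumptions rather than Runink's actual slice launcher, and the program requires Linux plus root (CAP_SYS_ADMIN).

```go
// Sketch of launching a worker step directly with Linux namespaces, the way a
// container-less runtime can avoid Docker/K8s layers. Requires Linux and root.
package main

import (
	"log"
	"os"
	"os/exec"
	"syscall"
)

func main() {
	cmd := exec.Command("/usr/bin/env", "true") // stand-in for a pipeline step binary
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	// New PID, mount, and UTS namespaces isolate the step without a container daemon.
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWPID | syscall.CLONE_NEWNS | syscall.CLONE_NEWUTS,
	}
	if err := cmd.Run(); err != nil {
		log.Fatalf("step failed: %v", err)
	}
}
```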

4. Scheduling & Resource Management

  • Runink:

    • Custom, Raft-backed Scheduler matches pipeline steps to nodes in real time.
    • Considers Herd quotas, CPU/GPU/Memory availability.
    • Deterministic task placement and retry logic are logged and replayable via Raft (a placement sketch follows this list).
  • Competitors:

    • Kubernetes schedulers are general-purpose and not pipeline-aware.
    • Airflow does not control actual compute — delegates to backends like K8s.
    • Slurm excels in HPC, but lacks pipeline-native orchestration and data governance.
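
A hypothetical sketch of quota-aware, deterministic placement follows; the Node and StepRequest types are invented for illustration and are not Runink's scheduler API. The point is that a stable ordering plus explicit quota checks make each decision reproducible from a replayed log.

```go
// Hypothetical sketch: pick the first node (in a stable order) that satisfies
// the step's resource request within the Herd's remaining quota.
package main

import (
	"fmt"
	"sort"
)

type Node struct {
	Name    string
	FreeCPU int // millicores
	FreeMem int // MiB
}

type StepRequest struct {
	Herd string
	CPU  int
	Mem  int
}

// place returns the chosen node, or false if nothing fits. Sorting makes the
// decision deterministic, so replaying the same log reproduces the same placement.
func place(req StepRequest, nodes []Node, herdCPULeft int) (string, bool) {
	if req.CPU > herdCPULeft {
		return "", false // Herd quota exhausted
	}
	sort.Slice(nodes, func(i, j int) bool { return nodes[i].Name < nodes[j].Name })
	for _, n := range nodes {
		if n.FreeCPU >= req.CPU && n.FreeMem >= req.Mem {
			return n.Name, true
		}
	}
	return "", false
}

func main() {
	nodes := []Node{{"node-b", 4000, 8192}, {"node-a", 500, 2048}}
	node, ok := place(StepRequest{Herd: "payments", CPU: 1000, Mem: 4096}, nodes, 6000)
	fmt.Println(node, ok) // node-b true
}
```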

5. Security Model

  • Runink:

    • Secure-by-default with OIDC + JWT, RBAC, Secrets Manager, mTLS, and field-level masking (masking is sketched after this list).
    • Secrets are versioned and replicated with Raft, avoiding plaintext spillage or inconsistent states.
    • Namespace isolation per Herd.
  • Competitors:

    • Kubernetes offers RBAC and secrets, but complexity leads to misconfigurations.
    • Airflow often shares sensitive configs (connections, variables) across DAGs.
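
Field-level masking can be as simple as tag-driven redaction before data leaves a trusted boundary. The sketch below is a generic Go illustration; the `mask` tag and Trade type are assumptions, not Runink's contract syntax.

```go
// Sketch of field-level masking driven by struct tags: string fields tagged
// mask:"true" are blanked out on a copy before logging or export.
package main

import (
	"fmt"
	"reflect"
	"strings"
)

type Trade struct {
	ID      string `mask:"false"`
	Account string `mask:"true"`
	Amount  float64
}

// maskFields returns a copy of the struct with sensitive string fields redacted.
func maskFields(v interface{}) interface{} {
	val := reflect.ValueOf(v)
	out := reflect.New(val.Type()).Elem()
	out.Set(val)
	for i := 0; i < val.NumField(); i++ {
		if val.Type().Field(i).Tag.Get("mask") == "true" && out.Field(i).Kind() == reflect.String {
			out.Field(i).SetString(strings.Repeat("*", 6))
		}
	}
	return out.Interface()
}

func main() {
	fmt.Printf("%+v\n", maskFields(Trade{ID: "T-42", Account: "ACC-991", Amount: 10.5}))
	// {ID:T-42 Account:****** Amount:10.5}
}
```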

6. Data Governance, Lineage & Metadata

  • Runink:

    • Built-in Data Governance Service stores contracts, lineage, quality metrics, and annotations.
    • Changes are committed to Raft, ensuring atomic updates and rollback support.
    • Contracts and pipeline steps are versioned and tracked centrally (a contract-versioning sketch follows this list).
  • Competitors:

    • Require integrating platforms like Atlas or Collibra.
    • Lineage capture is manual or partial, with data loss possible on failure or drift.
    • Metadata syncing lacks consistency guarantees.
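
One way to make contract versions referenceable from lineage is to fingerprint their content. The sketch below is an assumption for illustration only (the Contract type and hashing scheme are not Runink's actual schema); it shows how a stable hash lets a run record point at the exact contract it validated against.

```go
// Sketch of a versioned data contract record of the kind a governance store
// could commit atomically. Types and hashing scheme are illustrative only.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

type Contract struct {
	Name    string            `json:"name"`
	Version int               `json:"version"`
	Fields  map[string]string `json:"fields"` // field name -> type
}

// fingerprint gives a stable content hash, useful for detecting drift and for
// referencing the exact contract version from lineage records.
func fingerprint(c Contract) string {
	b, _ := json.Marshal(c) // map keys are marshalled in sorted order, so the hash is stable
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

func main() {
	c := Contract{Name: "trades", Version: 3, Fields: map[string]string{"id": "string", "amount": "decimal"}}
	fmt.Println(fingerprint(c)[:12]) // short fingerprint recorded alongside each run
}
```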

7. Multi-Tenancy

  • Runink:

    • Uses Herds as isolation units — enforced via RBAC, ephemeral UIDs, cgroups, and namespace boundaries (a cgroup sketch follows this list).
    • Raft ensures configuration updates (quotas, roles) are safely committed across all replicas.
  • Competitors:

    • Kubernetes uses namespaces and resource quotas.
    • Airflow has no robust multi-tenancy — teams often need separate deployments.
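
The cgroup side of Herd isolation can be illustrated with cgroup v2 directly. The path layout below is an assumption for the example, not necessarily how Runink arranges Herd cgroups; it requires root and a cgroup v2 hierarchy mounted at /sys/fs/cgroup.

```go
// Sketch of enforcing a per-Herd memory limit with cgroup v2, one of the Linux
// primitives mentioned above. The cgroup path layout is an assumption.
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

func limitHerdMemory(herd string, bytes int64) error {
	dir := filepath.Join("/sys/fs/cgroup", "herd-"+herd)
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	// Writing memory.max caps every process later placed in this cgroup.
	return os.WriteFile(filepath.Join(dir, "memory.max"), []byte(fmt.Sprintf("%d", bytes)), 0o644)
}

func main() {
	// Requires root and a cgroup v2 hierarchy mounted at /sys/fs/cgroup.
	if err := limitHerdMemory("payments", 2<<30); err != nil { // 2 GiB
		log.Fatal(err)
	}
}
```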

8. LLM Integration & Metadata Handling

  • Runink:

    • LLM inference is a first-class pipeline step.
    • Annotations are tied to lineage and stored transactionally in the Raft-backed governance store (illustrated after this list).
  • Competitors:

    • LLMs are orchestrated as container steps via KubernetesPodOperator or Argo.
    • Metadata is stored in external tools or left untracked.
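
The sketch below shows one way an LLM-produced annotation could be tied back to its run and contract version; every type and field here is hypothetical, included only to illustrate the linkage.

```go
// Hypothetical sketch: an LLM annotation that references the run and contract
// fingerprint it was produced from, so it can live in the governance store.
package main

import (
	"fmt"
	"time"
)

type Annotation struct {
	RunID        string    // pipeline run that produced the inference
	ContractHash string    // exact contract version the input conformed to
	Model        string    // model identifier used for the step
	Text         string    // the LLM's output, stored transactionally
	CreatedAt    time.Time
}

func main() {
	a := Annotation{
		RunID:        "run-0001",
		ContractHash: "9f3a1c0b44d2", // e.g. the fingerprint from the contract sketch above
		Model:        "example-llm",
		Text:         "Column 'amount' appears to be the settlement value in EUR.",
		CreatedAt:    time.Now().UTC(),
	}
	fmt.Printf("%+v\n", a)
}
```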

9. Observability

  • Runink:

    • Built-in metrics via Prometheus, structured logs via Fluentd (a metrics sketch follows this list).
    • Metadata and run stats are Raft-consistent, enabling reproducible audit trails.
    • Observability spans from node → slice → Herd → run.
  • Competitors:

    • Spark, Airflow, and K8s use external stacks (Loki, Grafana, EFK) that need configuration and instrumentation.
    • Logs may be disjointed or lack context.
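
As a concrete illustration of built-in metrics, the sketch below uses the Prometheus Go client to expose a per-Herd counter. The metric name and labels are assumptions for the example, not Runink's real metric set.

```go
// Sketch of exposing pipeline metrics with the Prometheus Go client; labels
// mirror the node -> slice -> Herd -> run view described above.
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// sliceRuns counts completed slices, labelled so queries can roll up by herd and status.
var sliceRuns = promauto.NewCounterVec(prometheus.CounterOpts{
	Name: "runink_slice_runs_total",
	Help: "Completed pipeline slices by herd and status.",
}, []string{"herd", "status"})

func main() {
	sliceRuns.WithLabelValues("payments", "success").Inc()
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":2112", nil))
}
```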

10. Ecosystem & Maturity

  • Runink:

    • Early-stage, but intentionally narrow in scope and highly integrated.
    • No need for external orchestrators or data governance platforms.
  • Competitors:

    • Vast ecosystems (Airflow, Spark, DBT, K8s) with huge community support.
    • Tradeoff: Requires significant integration, coordination, and DevOps effort.

11. Complexity & Operational Effort

  • Runink:

    • High initial build complexity — but centralizing state in Raft and building on Go/Linux primitives allows deterministic operations, easier debugging, and stronger safety guarantees.
    • Zero external dependencies once deployed.
  • Competitors:

    • Operationally fragmented. DevOps teams must manage multiple platforms (e.g., K8s, Helm, Spark, Airflow).
    • Requires cross-tool observability, secrets management, and governance.

✅ Summary: Why Raft Makes Runink Different

| Capability | Runink (Raft-Powered) | Spark / Airflow / K8s Stack |
| --- | --- | --- |
| State Coordination | Raft Consensus | Partial (only K8s/etcd) |
| Fault Tolerance | HA Replication | Tool-dependent |
| Scheduler | Raft-backed, deterministic | Varies per layer |
| Governance | Native, consistent, queryable | External |
| Secrets | Encrypted + Raft-consistent | K8s or env vars |
| Lineage | Immutable + auto-tracked | External integrations |
| Multi-tenancy | Herds + namespace isolation | Namespaces (K8s) |
| Security | End-to-end mTLS + RBAC + UIDs | Complex setup |
| LLM-native | First-class integration | Ad hoc orchestration |
| Observability | Built-in, unified stack | Custom integration |

Process flow

  %% Mermaid Diagram: MapReduce vs RDD vs Raft (Runink)
flowchart LR
  subgraph Normal_Operation["✅ Normal Operation (Execution Flow)"]
    subgraph MapReduce
      ID1["Input Data 📂"]
      MP["Map Phase 🛠️"]
      SS["Shuffle & Sort Phase 🔀"]
      RP["Reduce Phase 🛠️"]
      ID1 --> MP --> SS --> RP
    end

    subgraph RDD
      RID1["Input Data 📂"]
      RT["RDD Transformations 🔄"]
      AT["Action Trigger ▶️"]
      JS["Job Scheduler 📋"]
      CE["Cluster Execution (Executors) ⚙️"]
      OD["Output Data 📦"]
      RID1 --> RT --> AT --> JS --> CE --> OD
    end

    subgraph Raft_Runink
      RIN["Input Data 📂"]
      RC["Raft Commit (Contracts + Metadata) 🗄️"]
      RS["Runi Scheduler (Raft-backed) 🧠"]
      LW["Launch Slices (Isolated Workers) 🚀"]
      ROD["Output Data 📦"]
      RIN --> RC --> RS --> LW --> ROD
    end
  end

  subgraph Failure_Recovery["⚡ Failure Recovery Flow (Crash Handling)"]
    subgraph MapReduce_Failure
      MFID["Input Data 📂"]
      MFR["Map Phase Running 🛠️"]
      MC["Map Node Crash 🛑"]
      MF["Job Fails Entirely ❌"]
      MR["Manual Restart Needed 🔄"]
      MFID --> MFR --> MC --> MF --> MR
    end

    subgraph RDD_Failure
      RFID["Input Data 📂"]
      RER["RDD Execution Running 🔄"]
      RCN["RDD Node Crash 🛑"]
      DR["Driver Attempts Lineage Recompute 🔁"]
      PR["Partial or Full Job Restart 🔄"]
      RFID --> RER --> RCN --> DR --> PR
    end

    subgraph Raft_Failure
      RFD["Input Data 📂"]
      SR["Slice Running 🚀"]
      RNC["Raft Node Crash 🛑"]
      EL["Raft Detects Loss + Elects New Leader 🧠"]
      RE["Reschedule Slice Elsewhere ♻️"]
      CEX["Continue Execution Seamlessly ✅"]
      RFD --> SR --> RNC --> EL --> RE --> CEX
    end
  end

🚀 How This Model Beats the Status Quo

✅ Compared to Apache Spark

| Spark (JVM) | Runink (Go + Linux primitives) |
| --- | --- |
| JVM-based, slow cold starts | Instantaneous slice spawn using exec |
| Containerized via YARN/Mesos/K8s | No container daemon needed |
| Fault tolerance via RDD lineage/logs | Strong consistency via Raft |
| Needs external tools for lineage | Built-in governance and metadata |

✅ Compared to Kubernetes + Airflow

| Kubernetes / Airflow | Runink |
| --- | --- |
| DAGs stored in SQL, not consistent across API servers | DAGs submitted via Raft log, replicated to all |
| Task scheduling needs K8s Scheduler or Celery | Runi agents coordinate locally via consensus |
| Containers = overhead | Direct exec in a namespaced PID space |
| Secrets are environment or K8s Secret dependent | Raft-backed, RBAC-scoped Secrets Manager |
| Governance/logging external | Observability and lineage native and real-time |

🧠 Conclusion: Go + Linux internals + Raft = Data-Native Compute

Runink leverages Raft consensus not just for fault tolerance, but as a foundational architectural choice. It eliminates whole categories of orchestration complexity, state drift, and configuration mismatches by building from first principles — while offering a single runtime that natively understands pipelines, contracts, lineage, and compute.

If you’re designing a modern data platform — especially one focused on governance and efficient domain isolation — Runink is a radically integrated alternative to the Kubernetes-centric model.