Back to all terms

Resilient Distributed Dataset (RDD)

Definition Fundamental immutable data structure in Apache Spark.

Why It Matters

Enables fault-tolerant parallel processing.

Deeper Detail

Core building block for PySpark analytics.

Related Terms