Welcome to PardoX
The Speed of Rust. The Simplicity of Python.
PardoX is a high-performance DataFrame engine designed for modern data engineering. It combines the safety and speed of a Rust Core with the ease of use of a Python SDK, allowing you to process massive datasets efficiently without learning a new language.
🚀 Why PardoX?
- Zero-Copy Architecture: Data is loaded directly into memory-mapped buffers.
- SIMD Acceleration: Mathematical operations utilize AVX2/NEON CPU instructions.
- Universal Compatibility: Runs natively on Windows, Linux, and macOS (Intel & Apple Silicon).
- Native Format: The .prdx binary format allows for instant data persistence.
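To make the memory-mapping idea behind the zero-copy bullet concrete, here is a minimal sketch that uses only Python's standard library. It does not use or reflect PardoX's API or internals. The helper names `write_doubles` and `sum_mapped_doubles`, and the raw float64 file layout, are assumptions made purely for illustration.

```python
import mmap
import os
import struct
import tempfile


def write_doubles(path, values):
    """Write float64 values to a raw binary file in native byte order.

    Hypothetical helper for this sketch; this is NOT the .prdx format.
    """
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}d", *values))


def sum_mapped_doubles(path):
    """Sum float64 values by viewing the memory-mapped file in place.

    The OS maps the file pages into the process address space, and the
    memoryview reinterprets those bytes as doubles without first copying
    the whole file into a separate Python bytes object.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Both views must be released before the mapping is closed,
            # which the nested context managers take care of.
            with memoryview(mm) as raw, raw.cast("d") as view:
                return sum(view)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "column.bin")
    write_doubles(path, [1.0, 2.0, 3.5])
    print(sum_mapped_doubles(path))  # prints 6.5
```

A native engine would typically run vectorized kernels over the mapped buffer rather than a Python-level `sum`, but the access pattern (map the file, then reinterpret the bytes in place) is the same general technique.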
📚 Documentation Modules
Select a topic to start building faster pipelines:
🏁 Getting Started
- Installation - Setup guide for Windows, Linux, and macOS.
- Quick Start - Build your first ETL pipeline in 5 minutes.
📘 User Guide
- Input / Output - Learn about the multi-threaded CSV reader and SQL engine.
- Data Mutation - Perform vectorized arithmetic and data cleaning.
- Aggregations - Extract business insights and statistical metrics.
⚙️ API Reference
- Full Reference - Detailed documentation of classes and functions.
📓 Examples & Notebooks
- Jupyter Notebooks - Interactive examples and tutorials showcasing PardoX capabilities, including the v0.1 Beta Showcase with real-world ETL scenarios.
- Benchmark Scripts - Production-ready example for processing 640 million rows and transforming data into .prdx format, demonstrating real-world performance at scale.
📦 Installation
Get started immediately via pip:
Open Source Project distributed under the MIT License.
More info: www.pardox.io