Working with Parquet in Python
Read and write Apache Parquet files, a compressed columnar format designed for high-performance data processing.
Installation
pip install fairspec

Getting Started
The Parquet plugin provides:
- load_parquet_table - Load Parquet files into tables
- save_parquet_table - Save tables to Parquet files
- ParquetPlugin - Plugin for framework integration
For example:
from fairspec import load_parquet_table, Resource
table = load_parquet_table(Resource(data="table.parquet"))
# Efficient columnar format with compression

Basic Usage

Loading Parquet Files
from fairspec import load_parquet_table, Resource

# Load from local file
table = load_parquet_table(Resource(data="data.parquet"))

# Load from remote URL
table = load_parquet_table(Resource(data="https://example.com/data.parquet"))

# Load multiple files (concatenated)
table = load_parquet_table(Resource(data=["file1.parquet", "file2.parquet"]))

Saving Parquet Files
from fairspec import save_parquet_table

# Save with default options
save_parquet_table(table, path="output.parquet")

Advanced Features
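One check that sits outside the plugin itself is confirming that a path really holds a Parquet file, for instance right after a save or before a load. The following is a minimal sketch using only the Python standard library. It relies on the Parquet format's 4-byte marker PAR1, which appears at both the start and the end of a file. The helper name looks_like_parquet is hypothetical and not part of fairspec, and passing the check is only a heuristic: it does not prove the file can be read.

```python
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # 4-byte marker defined by the Parquet file format


def looks_like_parquet(path):
    """Return True if the file starts and ends with the Parquet magic bytes.

    Hypothetical helper, not part of fairspec. It inspects only the first
    and last 4 bytes, so a True result does not prove the file is readable.
    """
    file_path = Path(path)
    # A Parquet file needs at least a header marker, a 4-byte footer length,
    # and a footer marker, so anything under 12 bytes cannot qualify.
    if not file_path.is_file() or file_path.stat().st_size < 12:
        return False
    with file_path.open("rb") as handle:
        header = handle.read(4)
        handle.seek(-4, 2)  # whence=2 means "relative to the end of the file"
        footer = handle.read(4)
    return header == PARQUET_MAGIC and footer == PARQUET_MAGIC
```

For the file written in the saving example, the call would be looks_like_parquet("output.parquet").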
Remote File Loading
from fairspec import load_parquet_table, Resource

# Load from URL
table = load_parquet_table(Resource(data="https://example.com/data.parquet"))

# Load multiple remote files
table = load_parquet_table(Resource(data=[
    "https://api.example.com/data-2023.parquet",
    "https://api.example.com/data-2024.parquet",
]))
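
When remote files follow a regular naming scheme, as the two yearly URLs above do, the list passed to Resource can be built programmatically instead of typed out. This is a minimal sketch under stated assumptions: yearly_urls is a hypothetical helper, and the data-<year>.parquet pattern is taken from the example URLs rather than from any fairspec convention. Only the final commented call touches fairspec, in the same form already used in this guide.

```python
def yearly_urls(base_url, first_year, last_year):
    """Build one Parquet URL per year, inclusive of both endpoints.

    Hypothetical helper: the data-<year>.parquet pattern mirrors the example
    URLs in this guide and is an assumption about how the files are named.
    """
    if first_year > last_year:
        raise ValueError("first_year must not be greater than last_year")
    base = base_url.rstrip("/")  # tolerate a trailing slash on the base URL
    return [f"{base}/data-{year}.parquet" for year in range(first_year, last_year + 1)]


urls = yearly_urls("https://api.example.com", 2023, 2024)
print(urls)

# The list can then be passed to the loader as in the example above:
# table = load_parquet_table(Resource(data=urls))
```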