Working with CSV in Python
Comprehensive CSV file handling with automatic format detection, advanced header processing, and high-performance data operations.
Installation
Section titled “Installation”pip install fairspecGetting Started
Section titled “Getting Started”The CSV plugin provides these capabilities:
load_csv_table- Load CSV/TSV files into tablessave_csv_table- Save tables to CSV/TSV filesCsvPlugin- Plugin for framework integration
For example:
from fairspec import load_csv_table, Resource
table = load_csv_table(Resource(data="table.csv"))# the column types will be automatically inferred# or you can provide a Table SchemaBasic Usage
Section titled “Basic Usage”Loading CSV Files
Section titled “Loading CSV Files”from fairspec import load_csv_table, Resourcefrom fairspec_metadata import CsvFileDialect
# Load a simple CSV filetable = load_csv_table(Resource(data="data.csv"))
# Load with custom formattable = load_csv_table(Resource( data="data.csv", fileDialect=CsvFileDialect( delimiter=";", headerRows=[1], ),))
# Load multiple CSV files (concatenated)table = load_csv_table(Resource(data=["part1.csv", "part2.csv", "part3.csv"]))Saving CSV Files
Section titled “Saving CSV Files”from fairspec import save_csv_tablefrom fairspec_metadata import CsvFileDialect
# Save with default optionssave_csv_table(table, path="output.csv")
# Save with custom formatsave_csv_table(table, path="output.csv", fileDialect=CsvFileDialect( delimiter="\t", quoteChar="'",))
# Save as TSVsave_csv_table(table, path="output.tsv", fileDialect=CsvFileDialect(delimiter="\t"))Automatic Format Detection
Section titled “Automatic Format Detection”from fairspec import load_csv_table, Resourcefrom fairspec_metadata import CsvFileDialect
# Format is automatically detected when not specifiedtable = load_csv_table(Resource(data="unknown-dialect.csv"))# The CSV plugin will automatically infer delimiter, quote characters, etc.
# You can also explicitly specify the format if detection isn't accuratetable = load_csv_table(Resource( data="data.csv", fileDialect=CsvFileDialect( delimiter=",", quoteChar='"', headerRows=[1], ),))Advanced Features
Section titled “Advanced Features”Multi-Header Row Processing
Section titled “Multi-Header Row Processing”# CSV with multiple header rows:# Year,2023,2023,2024,2024# Quarter,Q1,Q2,Q1,Q2# Revenue,100,120,110,130
from fairspec import load_csv_table, Resourcefrom fairspec_metadata import CsvFileDialect
table = load_csv_table(Resource( data="multi-header.csv", fileDialect=CsvFileDialect( headerRows=[1, 2], headerJoin="_", ),))# Resulting columns: ["Year_Quarter", "2023_Q1", "2023_Q2", "2024_Q1", "2024_Q2"]Comment Row Handling
Section titled “Comment Row Handling”# CSV with comment rows:# # This is a comment# # Generated on 2024-01-01# Name,Age,City# John,25,NYC
from fairspec import load_csv_table, Resourcefrom fairspec_metadata import CsvFileDialect
table = load_csv_table(Resource( data="with-comments.csv", fileDialect=CsvFileDialect( commentRows=[1, 2], headerRows=[3], ),))
# Or use commentPrefix to skip lines starting with a specific charactertable = load_csv_table(Resource( data="with-comments.csv", fileDialect=CsvFileDialect( commentPrefix="#", headerRows=[1], ),))Remote File Loading
Section titled “Remote File Loading”from fairspec import load_csv_table, Resource
# Load from URLtable = load_csv_table(Resource(data="https://example.com/data.csv"))
# Load multiple remote filestable = load_csv_table(Resource(data=[ "https://api.example.com/data-2023.csv", "https://api.example.com/data-2024.csv",]))