Working with XLSX in Python
Excel (.xlsx) file handling with sheet selection, advanced header processing, and high-performance data operations.
Installation
pip install fairspec

Getting Started
The XLSX plugin provides:
load_xlsx_table - Load Excel files into tables
save_xlsx_table - Save tables to Excel files
XlsxPlugin - Plugin for framework integration
For example:
from fairspec import load_xlsx_table, Resource
table = load_xlsx_table(Resource(data="table.xlsx"))
# the column types will be automatically inferred

Basic Usage
Loading XLSX Files
from fairspec import load_xlsx_table, Resource
from fairspec_metadata import XlsxFileDialect
# Load a simple XLSX file
table = load_xlsx_table(Resource(data="data.xlsx"))
# Load with custom format (specify sheet)
table = load_xlsx_table(Resource(
    data="data.xlsx",
    fileDialect=XlsxFileDialect(sheetName="Sheet2"),
))
# Load multiple XLSX files (concatenated)
table = load_xlsx_table(Resource(data=["part1.xlsx", "part2.xlsx", "part3.xlsx"]))

Saving XLSX Files
from fairspec import save_xlsx_table
from fairspec_metadata import XlsxFileDialect
# Save with default options
save_xlsx_table(table, path="output.xlsx")
# Save with custom sheet name
save_xlsx_table(table, path="output.xlsx", fileDialect=XlsxFileDialect(sheetName="Data"))

Advanced Features
Sheet Selection
from fairspec import load_xlsx_table, Resource
from fairspec_metadata import XlsxFileDialect
# Select by sheet number (1-indexed)
table = load_xlsx_table(Resource(
    data="workbook.xlsx",
    fileDialect=XlsxFileDialect(sheetNumber=2),
))
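Because sheetNumber is 1-indexed while Python sequences are 0-indexed, it is easy to introduce an off-by-one error when deriving the number from a list of sheet names. The following is a minimal sketch of that mapping in plain Python. The helper `sheet_name_for_number` and the sheet names are hypothetical and not part of fairspec.

```python
def sheet_name_for_number(sheet_names: list[str], sheet_number: int) -> str:
    """Return the sheet name that a 1-indexed sheet number refers to."""
    if not 1 <= sheet_number <= len(sheet_names):
        raise ValueError(
            f"sheet number {sheet_number} is out of range 1..{len(sheet_names)}"
        )
    # Sheet numbers start at 1, Python list indexes start at 0.
    return sheet_names[sheet_number - 1]


# Hypothetical workbook layout, in tab order.
sheets = ["Summary", "Sales Data", "Notes"]
second_sheet = sheet_name_for_number(sheets, 2)
```

For a workbook laid out like this, sheetNumber=2 and sheetName="Sales Data" would be expected to address the same sheet.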
# Select by sheet name
table = load_xlsx_table(Resource(
    data="workbook.xlsx",
    fileDialect=XlsxFileDialect(sheetName="Sales Data"),
))

Multi-Header Row Processing
from fairspec import load_xlsx_table, Resource
from fairspec_metadata import XlsxFileDialect
# XLSX with multiple header rows
table = load_xlsx_table(Resource(
    data="multi-header.xlsx",
    fileDialect=XlsxFileDialect(
        headerRows=[1, 2],
        headerJoin="_",
    ),
))
# Resulting columns: ["Year_Quarter", "2023_Q1", "2023_Q2", "2024_Q1", "2024_Q2"]

Comment Row Handling
from fairspec import load_xlsx_table, Resource
from fairspec_metadata import XlsxFileDialect
# Skip specific comment rows
table = load_xlsx_table(Resource(
    data="with-comments.xlsx",
    fileDialect=XlsxFileDialect(
        commentRows=[1, 2],
        headerRows=[3],
    ),
))
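The row numbers in commentRows and headerRows are easier to reason about with a small model. The sketch below is plain Python, not fairspec's implementation. It assumes that row numbers are 1-indexed positions in the original sheet, that comment rows are dropped, and that multiple header rows are combined column by column with the join string, as headerJoin does in the multi-header example above. The function name `split_header_and_data` and the sample rows are hypothetical.

```python
def split_header_and_data(rows, header_rows, comment_rows=(), header_join=" "):
    """Split raw sheet rows into column names and data rows.

    Row numbers are assumed to be 1-indexed positions in the original sheet.
    """
    header_set = set(header_rows)
    comment_set = set(comment_rows)
    header_cells = []
    data = []
    for number, row in enumerate(rows, start=1):
        if number in comment_set:
            continue  # comment rows are dropped entirely
        if number in header_set:
            header_cells.append(row)
        else:
            data.append(row)
    # Join the header cells of each column, top to bottom.
    columns = [
        header_join.join(str(cell) for cell in column)
        for column in zip(*header_cells)
    ]
    return columns, data


raw_rows = [
    ["Exported by the finance team", None, None],  # row 1: comment
    ["Figures in thousands", None, None],          # row 2: comment
    ["Year", "2023", "2023"],                      # row 3: header
    ["Quarter", "Q1", "Q2"],                       # row 4: header
    ["Revenue", 120, 135],                         # row 5: data
]
columns, data = split_header_and_data(
    raw_rows, header_rows=[3, 4], comment_rows=[1, 2], header_join="_"
)
```

Under these assumptions the header rows keep their original sheet positions (3 and 4) even though two rows above them are skipped, which matches the commentRows=[1, 2], headerRows=[3] pairing shown in this section.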
# Skip rows with comment prefix
table = load_xlsx_table(Resource(
    data="data.xlsx",
    fileDialect=XlsxFileDialect(
        commentPrefix="#",
        headerRows=[1],
    ),
))

Remote File Loading
from fairspec import load_xlsx_table, Resource
# Load from URL
table = load_xlsx_table(Resource(data="https://example.com/data.xlsx"))
# Load multiple remote files
table = load_xlsx_table(Resource(data=[
    "https://api.example.com/data-2023.xlsx",
    "https://api.example.com/data-2024.xlsx",
]))

Column Selection
from fairspec import load_xlsx_table, Resource
from fairspec_metadata import XlsxFileDialect
# Select specific columns
table = load_xlsx_table(Resource(
    data="data.xlsx",
    fileDialect=XlsxFileDialect(columnNames=["name", "age", "city"]),
))
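Read as a projection, column selection keeps only the named columns and discards the rest. The following is a rough plain-Python model of that idea, assuming rows are dictionaries keyed by column name. `select_columns` is a hypothetical helper, and its handling of unknown names (raising KeyError) is a choice made for this sketch, not a statement about how fairspec behaves.

```python
def select_columns(rows, column_names):
    """Keep only the named columns of each row, in the requested order."""
    missing = [name for name in column_names if rows and name not in rows[0]]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return [{name: row[name] for name in column_names} for row in rows]


# Hypothetical rows with one more column than is selected.
people = [
    {"name": "Ada", "age": 36, "city": "London", "email": "ada@example.com"},
    {"name": "Lin", "age": 29, "city": "Taipei", "email": "lin@example.com"},
]
selected = select_columns(people, ["name", "age", "city"])
```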