API Reference: I/O Functions

This page documents all file input/output functions in Diet Pandas.

All read functions optimize the loaded DataFrame by default and return a standard pandas DataFrame; the write functions can likewise optimize a DataFrame before saving it.
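For example (a minimal sketch; the exact dtypes chosen depend on the data, and events.csv is a hypothetical file):

import dietpandas as dp

df = dp.read_csv("events.csv")   # hypothetical file
print(type(df))                  # <class 'pandas.core.frame.DataFrame'>: still plain pandas
print(df.dtypes)                 # downcast dtypes, e.g. int32/float32/category instead of int64/float64/object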

Read Functions

read_csv()

Read a CSV file with automatic memory optimization. Uses the Polars engine (when available) for 5-10x faster parsing.

dietpandas.io.read_csv(filepath, optimize=True, aggressive=False, categorical_threshold=0.5, verbose=False, use_polars=True, schema_path=None, save_schema=False, memory_threshold=0.7, auto_chunk=True, chunksize=100000, **kwargs)

Reads a CSV file using Polars engine (if available), then converts to optimized Pandas.

Automatically switches to chunked reading when the file is too large to fit in memory. This function is often 5-10x faster at parsing CSVs than pandas.read_csv, and the resulting DataFrame uses significantly less memory due to automatic optimization.

Parameters:

  • filepath (Union[str, Path], required): Path to CSV file
  • optimize (bool, default True): If True, apply diet optimization after reading
  • aggressive (bool, default False): If True, use aggressive optimization (float16, etc.)
  • categorical_threshold (float, default 0.5): Threshold for converting object columns to categories
  • verbose (bool, default False): If True, print memory reduction statistics
  • use_polars (bool, default True): If True and Polars is available, use it for parsing
  • schema_path (Union[str, Path, None], default None): Optional path to a schema file for consistent typing
  • save_schema (bool, default False): If True, save the schema after optimization (only with chunked reading)
  • memory_threshold (float, default 0.7): Use chunked reading if estimated memory > threshold * available memory
  • auto_chunk (bool, default True): If True, automatically use chunked reading for large files
  • chunksize (int, default 100000): Number of rows per chunk when using chunked reading
  • **kwargs: Additional arguments passed to the CSV reader

Returns:

  • DataFrame: Optimized pandas DataFrame

Examples:

>>> df = read_csv("large_dataset.csv")
Diet Complete: Memory reduced by 67.3%
>>> # Disable optimization if needed
>>> df = read_csv("data.csv", optimize=False)
>>> # Use aggressive mode for maximum compression
>>> df = read_csv("data.csv", aggressive=True)
>>> # Use saved schema for consistent typing
>>> df = read_csv("data.csv", schema_path="data.diet_schema.json")
>>> # Large files are automatically chunked
>>> df = read_csv("huge_file.csv")  # Automatically uses chunked reading
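The auto-chunking decision reduces to one comparison, shown in the source below. A sketch with hypothetical numbers (the real estimates come from the internal helpers _estimate_csv_memory_mb and _get_available_memory_mb):

estimated_mb = 1200.0    # hypothetical in-memory size estimate for the CSV
available_mb = 1500.0    # hypothetical available RAM
memory_threshold = 0.7

use_chunked = estimated_mb > available_mb * memory_threshold
print(use_chunked)       # True: 1200 > 1050, so read_csv streams the file in chunksize-row chunks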
Source code in src/dietpandas/io.py
def read_csv(
    filepath: Union[str, Path],
    optimize: bool = True,
    aggressive: bool = False,
    categorical_threshold: float = 0.5,
    verbose: bool = False,
    use_polars: bool = True,
    schema_path: Union[str, Path, None] = None,
    save_schema: bool = False,
    memory_threshold: float = 0.7,
    auto_chunk: bool = True,
    chunksize: int = 100000,
    **kwargs,
) -> pd.DataFrame:
    """
    Reads a CSV file using Polars engine (if available), then converts to optimized Pandas.

    Automatically switches to chunked reading when file is too large to fit in memory.
    This function is often 5-10x faster at parsing CSVs than pandas.read_csv, and the
    resulting DataFrame uses significantly less memory due to automatic optimization.

    Args:
        filepath: Path to CSV file
        optimize: If True, apply diet optimization after reading (default: True)
        aggressive: If True, use aggressive optimization (float16, etc.)
        categorical_threshold: Threshold for converting objects to categories
        verbose: If True, print memory reduction statistics
        use_polars: If True and Polars is available, use it for parsing (default: True)
        schema_path: Optional path to schema file for consistent typing
        save_schema: If True, save schema after optimization
            (only with chunked reading)
        memory_threshold: Use chunked reading if estimated memory >
            threshold * available (default: 0.7)
        auto_chunk: If True, automatically use chunked reading for large files (default: True)
        chunksize: Number of rows per chunk when using chunked reading (default: 100,000)
        **kwargs: Additional arguments passed to the CSV reader

    Returns:
        Optimized pandas DataFrame

    Examples:
        >>> df = read_csv("large_dataset.csv")
        Diet Complete: Memory reduced by 67.3%

        >>> # Disable optimization if needed
        >>> df = read_csv("data.csv", optimize=False)

        >>> # Use aggressive mode for maximum compression
        >>> df = read_csv("data.csv", aggressive=True)

        >>> # Use saved schema for consistent typing
        >>> df = read_csv("data.csv", schema_path="data.diet_schema.json")

        >>> # Large files are automatically chunked
        >>> df = read_csv("huge_file.csv")  # Automatically uses chunked reading
    """
    filepath = Path(filepath)

    # Check if we should use chunked reading
    use_chunked = False
    if auto_chunk:
        try:
            estimated_memory = _estimate_csv_memory_mb(filepath)
            available_memory = _get_available_memory_mb()

            if estimated_memory > (available_memory * memory_threshold):
                use_chunked = True
                if verbose:
                    print(
                        f"File size: ~{estimated_memory:.0f}MB, "
                        f"Available: {available_memory:.0f}MB - "
                        f"Using chunked reading"
                    )
        except Exception:
            # If we can't check, proceed with normal reading
            pass

    # Use chunked reading for large files
    if use_chunked:
        return _read_csv_chunked(
            filepath,
            chunksize=chunksize,
            optimize=optimize,
            aggressive=aggressive,
            categorical_threshold=categorical_threshold,
            verbose=verbose,
            schema_path=schema_path,
            save_schema=save_schema,
            **kwargs,
        )

    # Normal reading path
    filepath_str = str(filepath)

    # Try to use Polars for fast parsing
    if use_polars and POLARS_AVAILABLE:
        try:
            # Step 1: Fast Read with Polars
            # Polars is multi-threaded and much faster at parsing CSVs
            pl_df = pl.read_csv(filepath_str, **kwargs)

            # Step 2: Convert to Pandas
            pd_df = pl_df.to_pandas()

            if verbose:
                print("Loaded with Polars engine (fast mode)")

        except Exception as e:
            if verbose:
                print(f"Polars parsing failed ({e}), falling back to Pandas")
            # Fallback to standard Pandas
            pd_df = pd.read_csv(filepath_str, **kwargs)
    else:
        # Use standard Pandas
        if verbose and use_polars and not POLARS_AVAILABLE:
            print("Polars not installed, using standard Pandas reader")
        pd_df = pd.read_csv(filepath_str, **kwargs)

    # Apply schema if provided
    if schema_path:
        from .schema import apply_schema, load_schema

        if Path(schema_path).exists():
            if verbose:
                print(f"Applying schema from {schema_path}")
            schema = load_schema(schema_path)
            pd_df = apply_schema(pd_df, schema)
            return pd_df

    # Step 3: Apply the Diet immediately
    if optimize:
        result = diet(
            pd_df,
            verbose=verbose,
            aggressive=aggressive,
            categorical_threshold=categorical_threshold,
        )

        # Save schema if requested
        if save_schema and schema_path:
            from .schema import save_schema as save_schema_func

            save_schema_func(result, schema_path)
            if verbose:
                print(f"Saved schema to {schema_path}")

        return result

    return pd_df

Example:

import dietpandas as dp

# Basic usage
df = dp.read_csv("data.csv")

# Disable optimization
df = dp.read_csv("data.csv", optimize=False)

# Aggressive mode
df = dp.read_csv("data.csv", aggressive=True)
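A typical schema workflow, sketched under the assumption that save_schema also takes effect on the non-chunked path when schema_path points to a file that does not yet exist (the source above saves the schema after optimization whenever both are set):

import dietpandas as dp

# First read: optimize, then persist the inferred schema
train = dp.read_csv("train.csv", schema_path="train.diet_schema.json", save_schema=True)

# Later reads: the saved schema is applied so dtypes match the first read exactly
test = dp.read_csv("test.csv", schema_path="train.diet_schema.json")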

read_parquet()

Read a Parquet file with automatic memory optimization.

dietpandas.io.read_parquet(filepath, optimize=True, aggressive=False, categorical_threshold=0.5, verbose=False, use_polars=True, **kwargs)

Reads a Parquet file using Polars engine (if available), then converts to optimized Pandas.

Parameters:

  • filepath (Union[str, Path], required): Path to Parquet file
  • optimize (bool, default True): If True, apply diet optimization after reading
  • aggressive (bool, default False): If True, use aggressive optimization (float16, etc.)
  • categorical_threshold (float, default 0.5): Threshold for converting object columns to categories
  • verbose (bool, default False): If True, print memory reduction statistics
  • use_polars (bool, default True): If True and Polars is available, use it for parsing
  • **kwargs: Additional arguments passed to the Parquet reader

Returns:

  • DataFrame: Optimized pandas DataFrame

Source code in src/dietpandas/io.py
def read_parquet(
    filepath: Union[str, Path],
    optimize: bool = True,
    aggressive: bool = False,
    categorical_threshold: float = 0.5,
    verbose: bool = False,
    use_polars: bool = True,
    **kwargs,
) -> pd.DataFrame:
    """
    Reads a Parquet file using Polars engine (if available), then converts to optimized Pandas.

    Args:
        filepath: Path to Parquet file
        optimize: If True, apply diet optimization after reading (default: True)
        aggressive: If True, use aggressive optimization (float16, etc.)
        categorical_threshold: Threshold for converting objects to categories
        verbose: If True, print memory reduction statistics
        use_polars: If True and Polars is available, use it for parsing (default: True)
        **kwargs: Additional arguments passed to the Parquet reader

    Returns:
        Optimized pandas DataFrame
    """
    filepath = str(filepath)

    # Try to use Polars for fast parsing
    if use_polars and POLARS_AVAILABLE:
        try:
            pl_df = pl.read_parquet(filepath, **kwargs)
            pd_df = pl_df.to_pandas()

            if verbose:
                print("Loaded with Polars engine (fast mode)")

        except Exception as e:
            if verbose:
                print(f"Polars parsing failed ({e}), falling back to Pandas")
            pd_df = pd.read_parquet(filepath, **kwargs)
    else:
        if verbose and use_polars and not POLARS_AVAILABLE:
            print("Polars not installed, using standard Pandas reader")
        pd_df = pd.read_parquet(filepath, **kwargs)

    if optimize:
        return diet(
            pd_df,
            verbose=verbose,
            aggressive=aggressive,
            categorical_threshold=categorical_threshold,
        )

    return pd_df

Example:

import dietpandas as dp

df = dp.read_parquet("data.parquet")
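If you need reader options that only one engine understands, you can bypass Polars explicitly. A sketch (columns is accepted by both the pandas and Polars Parquet readers; the column names here are hypothetical):

# Read only selected columns with the plain pandas engine
df = dp.read_parquet("data.parquet", use_polars=False, columns=["user_id", "amount"])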

read_excel()

Read an Excel file with automatic memory optimization.

dietpandas.io.read_excel(filepath, optimize=True, aggressive=False, categorical_threshold=0.5, verbose=False, **kwargs)

Reads an Excel file and returns an optimized Pandas DataFrame.

Note: Polars support for Excel is limited, so this uses pandas.read_excel.

Parameters:

  • filepath (Union[str, Path], required): Path to Excel file
  • optimize (bool, default True): If True, apply diet optimization after reading
  • aggressive (bool, default False): If True, use aggressive optimization (float16, etc.)
  • categorical_threshold (float, default 0.5): Threshold for converting object columns to categories
  • verbose (bool, default False): If True, print memory reduction statistics
  • **kwargs: Additional arguments passed to pandas.read_excel

Returns:

  • DataFrame: Optimized pandas DataFrame

Source code in src/dietpandas/io.py
def read_excel(
    filepath: Union[str, Path],
    optimize: bool = True,
    aggressive: bool = False,
    categorical_threshold: float = 0.5,
    verbose: bool = False,
    **kwargs,
) -> pd.DataFrame:
    """
    Reads an Excel file and returns an optimized Pandas DataFrame.

    Note: Polars support for Excel is limited, so this uses pandas.read_excel.

    Args:
        filepath: Path to Excel file
        optimize: If True, apply diet optimization after reading (default: True)
        aggressive: If True, use aggressive optimization (float16, etc.)
        categorical_threshold: Threshold for converting objects to categories
        verbose: If True, print memory reduction statistics
        **kwargs: Additional arguments passed to pandas.read_excel

    Returns:
        Optimized pandas DataFrame
    """
    filepath = str(filepath)
    pd_df = pd.read_excel(filepath, **kwargs)

    if optimize:
        return diet(
            pd_df,
            verbose=verbose,
            aggressive=aggressive,
            categorical_threshold=categorical_threshold,
        )

    return pd_df

Example:

import dietpandas as dp

# Read specific sheet
df = dp.read_excel("data.xlsx", sheet_name="Sheet1")

# Read all sheets (pandas returns a dict of DataFrames; see the caution below)
dfs = dp.read_excel("data.xlsx", sheet_name=None)
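Caution: with sheet_name=None, pandas.read_excel returns a dict mapping sheet names to DataFrames, and it is not clear from the source above that the optimizer handles a dict. A defensive sketch, assuming diet is exported at the package level (io.py imports it internally):

sheets = dp.read_excel("data.xlsx", sheet_name=None, optimize=False)
dfs = {name: dp.diet(frame) for name, frame in sheets.items()}  # optimize each sheet yourself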

read_json()

Read a JSON file with automatic memory optimization.

dietpandas.io.read_json(filepath, optimize=True, aggressive=False, categorical_threshold=0.5, verbose=False, **kwargs)

Reads a JSON file and returns an optimized Pandas DataFrame.

Parameters:

  • filepath (Union[str, Path], required): Path to JSON file
  • optimize (bool, default True): If True, apply diet optimization after reading
  • aggressive (bool, default False): If True, use aggressive optimization (float16, etc.)
  • categorical_threshold (float, default 0.5): Threshold for converting object columns to categories
  • verbose (bool, default False): If True, print memory reduction statistics
  • **kwargs: Additional arguments passed to pandas.read_json

Returns:

  • DataFrame: Optimized pandas DataFrame

Examples:

>>> df = read_json("data.json")
🥗 Diet Complete: Memory reduced by 45.2%
Source code in src/dietpandas/io.py
def read_json(
    filepath: Union[str, Path],
    optimize: bool = True,
    aggressive: bool = False,
    categorical_threshold: float = 0.5,
    verbose: bool = False,
    **kwargs,
) -> pd.DataFrame:
    """
    Reads a JSON file and returns an optimized Pandas DataFrame.

    Args:
        filepath: Path to JSON file
        optimize: If True, apply diet optimization after reading (default: True)
        aggressive: If True, use aggressive optimization (float16, etc.)
        categorical_threshold: Threshold for converting objects to categories
        verbose: If True, print memory reduction statistics
        **kwargs: Additional arguments passed to pandas.read_json

    Returns:
        Optimized pandas DataFrame

    Examples:
        >>> df = read_json("data.json")
        🥗 Diet Complete: Memory reduced by 45.2%
    """
    filepath = str(filepath)
    pd_df = pd.read_json(filepath, **kwargs)

    if optimize:
        return diet(
            pd_df,
            verbose=verbose,
            aggressive=aggressive,
            categorical_threshold=categorical_threshold,
        )

    return pd_df

Example:

import dietpandas as dp

# Read JSON lines format
df = dp.read_json("data.jsonl", lines=True)

# Read standard JSON
df = dp.read_json("data.json")
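read_json forwards kwargs to pandas.read_json, which expects tabular JSON. For nested documents, one option is to flatten with pandas first and then optimize. A sketch, assuming diet is exported at the package level and nested.json is a hypothetical file:

import json
import pandas as pd
import dietpandas as dp

with open("nested.json") as f:
    records = json.load(f)

flat = pd.json_normalize(records)   # flatten nested fields into dotted columns
df = dp.diet(flat)                  # then apply the same memory optimization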

read_hdf()

Read an HDF5 file with automatic memory optimization.

dietpandas.io.read_hdf(filepath, key, optimize=True, aggressive=False, categorical_threshold=0.5, verbose=False, **kwargs)

Reads an HDF5 file and returns an optimized Pandas DataFrame.

Parameters:

  • filepath (Union[str, Path], required): Path to HDF5 file
  • key (str, required): Group identifier in the HDF5 file
  • optimize (bool, default True): If True, apply diet optimization after reading
  • aggressive (bool, default False): If True, use aggressive optimization (float16, etc.)
  • categorical_threshold (float, default 0.5): Threshold for converting object columns to categories
  • verbose (bool, default False): If True, print memory reduction statistics
  • **kwargs: Additional arguments passed to pandas.read_hdf

Returns:

  • DataFrame: Optimized pandas DataFrame

Examples:

>>> df = read_hdf("data.h5", key="dataset1")
🥗 Diet Complete: Memory reduced by 52.1%
Source code in src/dietpandas/io.py
def read_hdf(
    filepath: Union[str, Path],
    key: str,
    optimize: bool = True,
    aggressive: bool = False,
    categorical_threshold: float = 0.5,
    verbose: bool = False,
    **kwargs,
) -> pd.DataFrame:
    """
    Reads an HDF5 file and returns an optimized Pandas DataFrame.

    Args:
        filepath: Path to HDF5 file
        key: Group identifier in the HDF5 file
        optimize: If True, apply diet optimization after reading (default: True)
        aggressive: If True, use aggressive optimization (float16, etc.)
        categorical_threshold: Threshold for converting objects to categories
        verbose: If True, print memory reduction statistics
        **kwargs: Additional arguments passed to pandas.read_hdf

    Returns:
        Optimized pandas DataFrame

    Examples:
        >>> df = read_hdf("data.h5", key="dataset1")
        🥗 Diet Complete: Memory reduced by 52.1%
    """
    filepath = str(filepath)
    pd_df = pd.read_hdf(filepath, key=key, **kwargs)

    if optimize:
        return diet(
            pd_df,
            verbose=verbose,
            aggressive=aggressive,
            categorical_threshold=categorical_threshold,
        )

    return pd_df

Example:

import dietpandas as dp

df = dp.read_hdf("data.h5", key="dataset1")

Note: Requires the optional tables dependency:

pip install "diet-pandas[hdf]"
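If you are unsure which key to pass, plain pandas can list the groups in a store first:

import pandas as pd
import dietpandas as dp

with pd.HDFStore("data.h5", mode="r") as store:
    print(store.keys())   # e.g. ['/dataset1']

df = dp.read_hdf("data.h5", key="dataset1")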


read_feather()

Read a Feather file with automatic memory optimization.

dietpandas.io.read_feather(filepath, optimize=True, aggressive=False, categorical_threshold=0.5, verbose=False, **kwargs)

Reads a Feather file and returns an optimized Pandas DataFrame.

Feather is a fast, lightweight columnar data format.

Parameters:

  • filepath (Union[str, Path], required): Path to Feather file
  • optimize (bool, default True): If True, apply diet optimization after reading
  • aggressive (bool, default False): If True, use aggressive optimization (float16, etc.)
  • categorical_threshold (float, default 0.5): Threshold for converting object columns to categories
  • verbose (bool, default False): If True, print memory reduction statistics
  • **kwargs: Additional arguments passed to pandas.read_feather

Returns:

  • DataFrame: Optimized pandas DataFrame

Examples:

>>> df = read_feather("data.feather")
🥗 Diet Complete: Memory reduced by 38.7%
Source code in src/dietpandas/io.py
def read_feather(
    filepath: Union[str, Path],
    optimize: bool = True,
    aggressive: bool = False,
    categorical_threshold: float = 0.5,
    verbose: bool = False,
    **kwargs,
) -> pd.DataFrame:
    """
    Reads a Feather file and returns an optimized Pandas DataFrame.

    Feather is a fast, lightweight columnar data format.

    Args:
        filepath: Path to Feather file
        optimize: If True, apply diet optimization after reading (default: True)
        aggressive: If True, use aggressive optimization (float16, etc.)
        categorical_threshold: Threshold for converting objects to categories
        verbose: If True, print memory reduction statistics
        **kwargs: Additional arguments passed to pandas.read_feather

    Returns:
        Optimized pandas DataFrame

    Examples:
        >>> df = read_feather("data.feather")
        🥗 Diet Complete: Memory reduced by 38.7%
    """
    filepath = str(filepath)
    pd_df = pd.read_feather(filepath, **kwargs)

    if optimize:
        return diet(
            pd_df,
            verbose=verbose,
            aggressive=aggressive,
            categorical_threshold=categorical_threshold,
        )

    return pd_df

Example:

import dietpandas as dp

df = dp.read_feather("data.feather")
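Feather pairs naturally with to_feather_optimized() (documented below) for fast local caching. A sketch; whether every optimized dtype survives the round trip depends on the pyarrow version, though categoricals and downcast numerics normally do:

import dietpandas as dp
import pandas as pd

df = pd.DataFrame({"id": range(1_000_000), "tag": ["a", "b"] * 500_000})
dp.to_feather_optimized(df, "cached.feather")    # shrink dtypes, then write
df2 = dp.read_feather("cached.feather")          # reload with dtypes intact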

Write Functions

to_csv_optimized()

Write a DataFrame to CSV with memory optimization.

dietpandas.io.to_csv_optimized(df, filepath, optimize_before_save=True, **kwargs)

Saves a DataFrame to CSV, optionally optimizing it first.

Parameters:

  • df (DataFrame, required): DataFrame to save
  • filepath (Union[str, Path], required): Path where the CSV will be saved
  • optimize_before_save (bool, default True): If True, optimize the DataFrame before saving
  • **kwargs: Additional arguments passed to pandas.to_csv
Source code in src/dietpandas/io.py
def to_csv_optimized(
    df: pd.DataFrame, filepath: Union[str, Path], optimize_before_save: bool = True, **kwargs
) -> None:
    """
    Saves a DataFrame to CSV, optionally optimizing it first.

    Args:
        df: DataFrame to save
        filepath: Path where CSV will be saved
        optimize_before_save: If True, optimize the DataFrame before saving
        **kwargs: Additional arguments passed to pandas.to_csv
    """
    if optimize_before_save:
        df = diet(df, verbose=False)

    df.to_csv(filepath, **kwargs)

Example:

import dietpandas as dp
import pandas as pd

df = pd.DataFrame({'col': range(1000)})
dp.to_csv_optimized(df, "output.csv")
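Continuing the example above: **kwargs are forwarded to pandas.to_csv, so the usual writer options apply. Note that CSV stores no dtype information, so the optimization affects the in-memory frame, not what a later plain read_csv infers:

dp.to_csv_optimized(df, "output.csv", index=False)   # index=False is passed through to pandas.to_csv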

to_parquet_optimized()

Write a DataFrame to Parquet with memory optimization.

dietpandas.io.to_parquet_optimized(df, filepath, optimize_before_save=True, **kwargs)

Saves a DataFrame to Parquet format, optionally optimizing it first.

Parameters:

  • df (DataFrame, required): DataFrame to save
  • filepath (Union[str, Path], required): Path where the Parquet file will be saved
  • optimize_before_save (bool, default True): If True, optimize the DataFrame before saving
  • **kwargs: Additional arguments passed to pandas.to_parquet
Source code in src/dietpandas/io.py
def to_parquet_optimized(
    df: pd.DataFrame, filepath: Union[str, Path], optimize_before_save: bool = True, **kwargs
) -> None:
    """
    Saves a DataFrame to Parquet format, optionally optimizing it first.

    Args:
        df: DataFrame to save
        filepath: Path where Parquet file will be saved
        optimize_before_save: If True, optimize the DataFrame before saving
        **kwargs: Additional arguments passed to pandas.to_parquet
    """
    if optimize_before_save:
        df = diet(df, verbose=False)

    df.to_parquet(filepath, **kwargs)

Example:

import dietpandas as dp
import pandas as pd

df = pd.DataFrame({'col': range(1000)})
dp.to_parquet_optimized(df, "output.parquet")
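Continuing the example above: because kwargs reach pandas.to_parquet, optimization can be combined with on-disk compression (zstd assumes a pyarrow build that supports it):

dp.to_parquet_optimized(df, "output.parquet", compression="zstd")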

to_feather_optimized()

Write a DataFrame to Feather format with memory optimization.

dietpandas.io.to_feather_optimized(df, filepath, optimize_before_save=True, **kwargs)

Saves a DataFrame to Feather format, optionally optimizing it first.

Parameters:

  • df (DataFrame, required): DataFrame to save
  • filepath (Union[str, Path], required): Path where the Feather file will be saved
  • optimize_before_save (bool, default True): If True, optimize the DataFrame before saving
  • **kwargs: Additional arguments passed to pandas.to_feather
Source code in src/dietpandas/io.py
def to_feather_optimized(
    df: pd.DataFrame, filepath: Union[str, Path], optimize_before_save: bool = True, **kwargs
) -> None:
    """
    Saves a DataFrame to Feather format, optionally optimizing it first.

    Args:
        df: DataFrame to save
        filepath: Path where Feather file will be saved
        optimize_before_save: If True, optimize the DataFrame before saving
        **kwargs: Additional arguments passed to pandas.to_feather
    """
    if optimize_before_save:
        df = diet(df, verbose=False)

    df.to_feather(filepath, **kwargs)

Example:

import dietpandas as dp
import pandas as pd

df = pd.DataFrame({'col': range(1000)})
dp.to_feather_optimized(df, "output.feather")
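Continuing the example above: pandas.to_feather forwards options to pyarrow's Feather writer, so compression can be requested at write time (again assuming pyarrow support):

dp.to_feather_optimized(df, "output.feather", compression="zstd")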

Supported File Formats

Format    Read Function     Write Function           Optional Dependency
CSV       read_csv()        to_csv_optimized()       None (built-in)
Parquet   read_parquet()    to_parquet_optimized()   pyarrow
Excel     read_excel()      N/A                      openpyxl
JSON      read_json()       N/A                      None (built-in)
HDF5      read_hdf()        N/A                      tables
Feather   read_feather()    to_feather_optimized()   pyarrow

Performance Comparison

CSV Reading Performance

import time
import pandas as pd
import dietpandas as dp

# Standard pandas
start = time.time()
df_pandas = pd.read_csv("large_file.csv")
pandas_time = time.time() - start

# Diet pandas
start = time.time()
df_diet = dp.read_csv("large_file.csv")
diet_time = time.time() - start

print(f"Pandas: {pandas_time:.2f}s, Memory: {df_pandas.memory_usage().sum() / 1e6:.1f} MB")
print(f"Diet:   {diet_time:.2f}s, Memory: {df_diet.memory_usage().sum() / 1e6:.1f} MB")
# Pandas: 45.2s, Memory: 2300.0 MB
# Diet:   8.7s, Memory: 750.0 MB

Common Parameters

Most read functions support these common parameters:

  • optimize (bool, default=True): Whether to optimize memory usage
  • aggressive (bool, default=False): Use aggressive optimization mode
  • categorical_threshold (float, default=0.5): Threshold for converting object columns to categories
  • verbose (bool, default=False): Print memory reduction statistics
  • **kwargs: Additional parameters passed to the underlying pandas function

See Also