pyarrow.csv.ReadOptions

class pyarrow.csv.ReadOptions(use_threads=None, *, block_size=None, skip_rows=None, column_names=None, autogenerate_column_names=None, encoding='utf8')

Bases: pyarrow.lib._Weakrefable

Options for reading CSV files.

Parameters
  • use_threads (bool, optional (default True)) – Whether to use multiple threads to accelerate reading.

  • block_size (int, optional) – How many bytes to process at a time from the input stream. This determines multi-threading granularity as well as the size of individual chunks in the Table.

  • skip_rows (int, optional (default 0)) – The number of rows to skip before the column names (if any) and the CSV data.

  • column_names (list, optional) – The column names of the target table. If empty, fall back on autogenerate_column_names.

  • autogenerate_column_names (bool, optional (default False)) – Whether to autogenerate column names if column_names is empty. If true, column names will be of the form “f0”, “f1”… If false, column names will be read from the first CSV row after skip_rows.

  • encoding (str, optional (default 'utf8')) – The character encoding of the CSV data. Columns that cannot be decoded using this encoding can still be read as Binary.

__init__(*args, **kwargs)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(*args, **kwargs)

Initialize self.

Attributes

autogenerate_column_names

Whether to autogenerate column names if column_names is empty.

block_size

How many bytes to process at a time from the input stream.

column_names

The column names of the target table.

encoding

The character encoding of the CSV data.

skip_rows

The number of rows to skip before the column names (if any) and the CSV data.

use_threads

Whether to use multiple threads to accelerate reading.

autogenerate_column_names

Whether to autogenerate column names if column_names is empty. If true, column names will be of the form “f0”, “f1”… If false, column names will be read from the first CSV row after skip_rows.

block_size

How many bytes to process at a time from the input stream. This determines multi-threading granularity as well as the size of individual chunks in the Table.

column_names

The column names of the target table. If empty, fall back on autogenerate_column_names.

encoding

The character encoding of the CSV data.

Type

str

skip_rows

The number of rows to skip before the column names (if any) and the CSV data.

use_threads

Whether to use multiple threads to accelerate reading.