dask_awkward.from_parquet
- dask_awkward.from_parquet(path, storage_options=None, ignore_metadata=True, scan_files=False, columns=None, filters=None, split_row_groups=None)
Create an Array collection from a Parquet dataset.
- Parameters
path (str) – Location of the data, including the protocol (e.g. s3://).
storage_options (dict) – Options used to create the filesystem (see the fsspec documentation).
ignore_metadata (bool) – Ignore the Parquet metadata associated with the input dataset (the _metadata file).
scan_files (bool) – TBD
filters (list[list[tuple]], optional) – Parquet-style filters for excluding row groups based on column statistics.
split_row_groups (bool, optional) – If True, each row group becomes a partition. If False, each file becomes a partition. If None, defaults to True when a _metadata file exists and ignore_metadata=False, and to False otherwise.
- Returns
Array collection from the parquet dataset.
- Return type