dask_awkward.from_text

dask_awkward.from_text(source, blocksize='128 MiB', delimiter=b'\n', compression='infer', storage_options=None)

Create an Array collection from text data and a delimiter.

By default, this input function creates an array collection whose elements are separated by newlines.

Parameters:
  • source (str | list[str]) – Data source: a single path or a list of paths. Remote files are supported.

  • blocksize (str | int) – Size of each partition, given as an integer number of bytes or as a string such as '128 MiB'.

  • delimiter (bytes) – Delimiter to separate elements of the array (default is newline character).

  • compression (str, optional) – Compression of the files for reading (default is to infer).

  • storage_options (dict, optional) – Storage options passed to the fsspec filesystem.

Returns:

Resulting collection.

Return type:

Array

Examples

>>> import dask_awkward as dak
>>> ds = dak.from_text("s3://path/to/files/*.txt", blocksize="256 MiB")
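
To make the role of the delimiter parameter concrete, the following pure-Python sketch shows what "elements separated by a delimiter" means for a block of raw bytes. It is only an illustration and not dask-awkward's implementation: split_elements is a hypothetical helper, and dropping the empty piece after a trailing delimiter is an assumption of this sketch.

```python
def split_elements(data: bytes, delimiter: bytes = b"\n") -> list[str]:
    """Hypothetical helper: split raw bytes into text elements on a delimiter."""
    pieces = data.split(delimiter)
    # Assumption of this sketch: a trailing delimiter does not
    # produce a final empty element.
    if pieces and pieces[-1] == b"":
        pieces = pieces[:-1]
    # Decode each piece so the elements are strings rather than bytes.
    return [piece.decode("utf-8") for piece in pieces]


# Default delimiter: one element per line.
lines = split_elements(b"alpha\nbeta\ngamma\n")

# Custom delimiter: one element per semicolon-separated record.
records = split_elements(b"a;b;c", delimiter=b";")
```

In the sketch, lines holds one string per line of input and records holds one string per semicolon-separated field, which mirrors how the choice of delimiter determines where one array element ends and the next begins.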