dask_awkward.sample#
- dask_awkward.sample(arr, factor=None, probability=None)[source]#
Decimate the data to a smaller number of rows.
Must give either factor or probability.
- Parameters:
arr (dask_awkward.Array) – Array collection to sample
factor (int, optional) – if given, every Nth row will be kept. The counting restarts for each partition, so reducing the row count by an exact factor is not guaranteed
probability (float, optional) – a number between 0 and 1, giving the chance of any particular row surviving. For instance, for probability=0.1, roughly 1-in-10 rows will remain.
- Return type: