This channel is rarely used. For other channels to contact the dask community, please see https://docs.dask.org/en/stable/support.html
`da.from_array(large_numpy_array)` was blowing up workers, as (counterintuitively) the `large_numpy_array` is not actually partitioned along the expected chunks. Each worker appears to get a full copy of `large_numpy_array` regardless of the chunking.
`from_array(large_array, chunks=chunks)[0].compute()`, for example, does not allocate data to workers the way one would expect from the chunking
`scatter`, is that right?
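One way to sidestep the problem described above is to never hold the full array on the client, and instead have each chunk created inside its own task. A minimal sketch, assuming the data can be produced block by block; `load_block` is a hypothetical stand-in for whatever actually reads one block from disk or another store, and the sizes are made up:

```python
import numpy as np
import dask
import dask.array as da


def load_block(i, rows, cols):
    # Hypothetical loader: stands in for reading block i from disk.
    # Here it just fills the block with the block index.
    return np.full((rows, cols), i, dtype="float64")


n_blocks, rows, cols = 4, 250, 100

# One delayed task per chunk, so no single task carries the whole array
blocks = [
    da.from_delayed(
        dask.delayed(load_block)(i, rows, cols),
        shape=(rows, cols),
        dtype="float64",
    )
    for i in range(n_blocks)
]
x = da.concatenate(blocks, axis=0)

# Selecting the first row only depends on the first block's task
first_row = x[0].compute()
```

This swaps `da.from_array` for `dask.delayed` plus `da.from_delayed`, so it only applies when the source data can be loaded in pieces.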
`solve` operation across the last 2 dimensions, i.e. given `A` and `B` I want to do `C[i,:,:]=solve(A[i,:,:], B[i,:,:])` for all `i` in the leading dimension... I tried the below but it seems to be really slow (slower than numpy) - does anyone know what I'm doing wrong/what I could do better? Sorry if this was the wrong place to ask.
`C = da.apply_gufunc(np.linalg.solve, "(i,j),(i,k)->(j,k)", A, B, vectorize=True, output_dtypes=A.dtype)`
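A possible cause of the slowness: `vectorize=True` wraps the function in `np.vectorize`, which loops over the leading dimension in Python, one matrix at a time. `np.linalg.solve` already broadcasts over leading dimensions, so the flag can probably be dropped. A minimal sketch, assuming `A` and `B` are chunked only along the leading dimension; the sizes and test data are made up for illustration:

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)
n, m, k = 100, 4, 3

# Small, well-conditioned test systems (illustrative data only)
A_np = rng.random((n, m, m)) + m * np.eye(m)
B_np = rng.random((n, m, k))

# Chunk along the leading dimension only; core dimensions stay whole
A = da.from_array(A_np, chunks=(25, m, m))
B = da.from_array(B_np, chunks=(25, m, k))

# Same signature as in the question, but without vectorize=True:
# each block of 25 systems goes to np.linalg.solve in a single call
C = da.apply_gufunc(
    np.linalg.solve,
    "(i,j),(i,k)->(j,k)",
    A,
    B,
    output_dtypes=A.dtype,
)
result = C.compute()
```

For matrices this small, plain numpy may still win, since the per-task overhead can outweigh the work in each block.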
`scale` and then read it back, then scale up/down from that?
@mrocklin Do you think a maximum limit on the number of workers that a submitted job can use would be useful? Although I see that we already have a param for providing a set of workers.
On the broader idea of worker pools, something like this could be rather useful.
Can these two behaviours be achieved currently?
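For reference, the existing `workers=` parameter can approximate a cap by pinning a job's tasks to a fixed subset of workers. A minimal sketch, assuming an in-process local cluster with three workers; `where_am_i` and the pool size of two are illustrative choices, not an existing worker-pool feature:

```python
from distributed import Client, get_worker


def where_am_i(_):
    # Report the address of the worker that ran this task
    return get_worker().address


# In-process cluster so the example needs no external setup
with Client(
    processes=False,
    n_workers=3,
    threads_per_worker=1,
    dashboard_address=None,
) as client:
    all_workers = sorted(client.scheduler_info()["workers"])

    # Treat the first two workers as this job's "pool"
    pool = all_workers[:2]

    futures = client.map(where_am_i, range(8), workers=pool)
    used = set(client.gather(futures))
```

This restricts where the tasks may run, but the subset has to be picked by hand; it is not a scheduler-enforced maximum.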