# Automatic dispatch

The primary function of [`autoray`](autoray) is to enable writing high level
array / tensor code that is agnostic to the backend arrays being supplied. It
does this via ***'automatic dispatch'***, which has a few notable differences
to other approaches:

* It is automatic - generally neither you nor the backend array library needs
  to implement any dispatch logic, instead [`autoray`](autoray) finds, if
  necessary 'translates', and then caches the relevant functions when they are
  first called.
* It is specialized for array functions and treats [`numpy`](numpy) as the
  reference interface for call signatures of 'equivalent' functions, although
  it doesn't rely on numpy or require it to be installed.
* Despite this, there is no fixed API as such - if a backend can be inferred,
  and the relevant function imported, a [`do`](autoray.do) call is valid.

## Basics

The main function of [`autoray`](autoray) is [`do`](autoray.do), which takes a
function name followed by `*args` and `**kwargs`, and automatically looks up
(and caches) the correct backend function. There are four main ways that the
backend is inferred:

***1. Automatic backend:***

```python
do('sqrt', x)
```

Here the backend is inferred from ``x``. By default dispatch happens on the
first argument, but various functions (such as ``'stack'`` and ``'einsum'``)
know to dispatch on other arguments.

***2. Backend 'like' another array:***

```python
do('random.normal', size=(2, 3, 4), like=x)
```

Here the backend is inferred from another array and can thus be implicitly
propagated, even when functions take no array arguments. Some creation routines
such as ``"eye"`` and ``"zeros"`` will also set the default ``dtype`` and / or
device to match ``like`` in this case.

***3. Explicit backend:***

```python
do('einsum', eq, x, y, like='customlib')
```

Here one simply supplies the desired function backend explicitly.

***4. Context manager:***

```python
with backend_like('autoray.lazy'):
    xy = do('tensordot', x, y, 1)
    z = do('trace', xy)
```

Here you set a default backend for a whole block of code. This default
overrides method 1. above, but 2. and 3. still take precedence. The argument to
[`backend_like`](autoray.backend_like) can be a backend string or an example
array.

## Namespace API

An alternative way to call functions is to get a namespace object using
[`get_namespace`](autoray.get_namespace). There are two main ways to use this.

**1. Late dispatch -- alternative syntax to `do`:**

Here dispatch still occurs according to the four patterns above, but with a
syntax similar to a standard array library.

```python
import autoray as ar

xp = ar.get_namespace()
x = xp.ones((3, 4), like="torch")
xp.sqrt(x)
```

````{hint}
`autoray` also provides this 'default namespace' as an importable object like
so:
```python
from autoray import numpy as np

np.fn_name(*args, like=like, **kwargs)
```
````

**2. Early dispatch -- specialized to a particular `backend` (and optionally
`dtype` and `device`)**

Here [`get_namespace`](autoray.get_namespace) dispatches on a `like` argument:

```python
# if `like` is an array, dtype and device will be picked up from it
xp = ar.get_namespace(like="torch", device="cuda", dtype="float32")
x = xp.ones((3, 4))
xp.sqrt(x)
```

Function calls then look up and cache the specific backend function, removing
any dispatch overhead. This is a concise and fast way to use `autoray` within a
function where you are happy to lock in the backend. It is also similar to how
the [*python array api*](https://data-apis.org/array-api) works.

### Manual dispatch functions

You can manually break the process into two steps with the following functions:

* [`autoray.infer_backend`](autoray.infer_backend) - return the backend name
  for a single array.
* [`autoray.infer_backend_multi`](autoray.infer_backend_multi) - return the
  backend name based on multiple arrays.
* [`autoray.get_lib_fn`](autoray.get_lib_fn) - return the actual function for a
  given backend and function name.

If you know you are going to use a function repeatedly, you can thus avoid the
(albeit minor) overhead of dispatching each call separately, for instance:

```python
import functools

from autoray import get_lib_fn, infer_backend


def matmul_chain(*arrays):
    # if the arrays might be a mix of backends, use infer_backend_multi,
    # but here we just dispatch on the first array
    backend = infer_backend(arrays[0])
    fn = get_lib_fn(backend, 'matmul')
    return functools.reduce(fn, arrays)
```

### Other special functions

There are a few high level functions that might be preferred to attribute
access, for reasons of consistency:

* [`autoray.shape`](autoray.shape) - return the shape of an array. In most
  cases `x.shape` is fine, but this ensures the output is `tuple[int]` and also
  works for builtins without calling `numpy`.
* [`autoray.ndim`](autoray.ndim) - return the number of dimensions of an array.
* [`autoray.size`](autoray.size) - return the total number of elements in an
  array.
* [`autoray.dag`](autoray.dag) - return the adjoint of an array, i.e. the
  transpose with complex conjugation.

Functions for dealing with dtypes:

* [`autoray.get_dtype_name`](autoray.get_dtype_name) - return the name of the
  dtype of an array as a string.
* [`autoray.to_backend_dtype`](autoray.to_backend_dtype) - turn a string
  specified dtype into the equivalent dtype for a given backend.
* [`autoray.astype`](autoray.astype) - cast an array to a given dtype,
  specified as a string.

And for converting any array to a numpy array:

* [`autoray.to_numpy`](autoray.to_numpy)

```{hint}
All of these can be called via [`do`](autoray.do) as well, e.g.
`do('shape', x)`.
```

## Backends

In [`autoray`](autoray) a backend internally is simply specified by a string.
By default, the `backend` of an array is the name of the library that the class
is defined in, and the relevant functions are assumed to be in the namespace of
`backend`. If that is the case (e.g.
`cupy`), then that library is already compatible with `autoray`. Note all
backend lookups are cached on `obj.__class__` for speed.

`autoray` also handles common cases where the functions are in a different
library or sub-module (such as `jax -> jax.numpy`). This requires a simple
mapping to be specified, which `autoray` does for various libraries. You can
explicitly register a backend name (and thus default location) for a specific
class with the function [`register_backend`](autoray.register_backend):

```python
register_backend(mylib.myobjs.MyClass, 'mylib.myfuncs')
```

Now when `autoray` encounters an instance of `MyClass` it will look for
functions in `mylib.myfuncs` instead of `mylib`. You could also use an
arbitrary name for the backend, and then alias it to the correct location
separately.

````{note}
`autoray` is aware of the `scipy` namespace and relevant submodules for
`numpy`, `cupy`, `jax`, for example:
```python
do('scipy.linalg.expm', x)
```
````

## Functions

Once a `backend` is inferred and the location of the relevant functions is
known, `autoray` tries to import and cache the relevant function from that
namespace. Many libraries (e.g. `cupy`, `dask`, `jax`, `autograd`, `sparse`,
...) actively mirror the `numpy` API, so there is little else to be done. Some
other libraries (e.g. `tensorflow`, `pytorch`, ...) diverge from the `numpy`
API more, and yet have largely equivalent functions, simply defined in slightly
different places with different names and / or signatures. `autoray` has a
simple translation mechanism for:

* when functions are in a different module
  (e.g. `'trace' -> tensorflow.linalg.trace`)
* when functions have a different name
  (e.g. `'sum' -> tensorflow.reduce_sum`)
* when functions have a different signature (e.g.
  `tensordot(a, b, axes) -> torch.tensordot(a, b, dims)`)

If you want to directly provide a missing or *alternative* implementation of
some function for a particular backend you can swap one in with
[`register_function`](autoray.register_function):

```python
def my_custom_torch_svd(x):
    import torch

    print('Hello SVD!')
    u, s, v = torch.svd(x)

    return u, s, v.T

ar.register_function('torch', 'linalg.svd', my_custom_torch_svd)

x = ar.do('random.uniform', size=(3, 4), like='torch')

ar.do('linalg.svd', x)
# Hello SVD!
# (tensor([[-0.5832,  0.6188, -0.5262],
#          [-0.5787, -0.7711, -0.2655],
#          [-0.5701,  0.1497,  0.8078]]),
#  tensor([2.0336, 0.8518, 0.4572]),
#  tensor([[-0.4568, -0.3166, -0.6835, -0.4732],
#          [-0.5477,  0.2825, -0.2756,  0.7377],
#          [ 0.2468, -0.8423, -0.0993,  0.4687]]))
```

If you want to make use of the existing function you can supply ``wrap=True``,
in which case the custom function supplied should act like a decorator:

```python
def my_custom_sum_wrapper(old_fn):

    def new_fn(*args, **kwargs):
        print('Hello sum!')
        # forward both positional and keyword arguments to the wrapped function
        return old_fn(*args, **kwargs)

    return new_fn

ar.register_function('torch', 'sum', my_custom_sum_wrapper, wrap=True)

ar.do('sum', x)
# Hello sum!
# tensor(5.4099)
```

Though be careful, if you call
[`register_function`](autoray.register_function) again it will now wrap the
*new* function! Note you can combine
[`register_backend`](autoray.register_backend) and
[`register_function`](autoray.register_function) to dynamically define array
types and functions from anywhere. See also
[`register_dispatch`](autoray.register_dispatch) for controlling which
arguments are used to infer the backend for any function.

### Composing new functions

Sometimes you want to define a function that is composed of many array
functions, but you want to dispatch at the level of the whole block, not each
individual call, or indeed use a completely different implementation.
For instance, you might want to use a [`numba`](https://numba.pydata.org/) or
[`pythran`](https://pythran.readthedocs.io/en/latest/) compiled version for
`numpy`.

The [`autoray.compose`](autoray.compose) function allows you to do this. You
decorate a function, which forms the default implementation, and can then
register alternative implementations for specific backends. For instance:

```python
import autoray as ar
from autoray import compose
from numba import njit

@compose
def my_func(x):
    # get how many elements are needed to sum to 20
    return ar.do('sum', ar.do('cumsum', x, 0) < 20)

# register a numba implementation
@my_func.register('numpy')
@njit
def my_func_numba(x):
    s = 0.0
    i = 0
    while s < 20:
        s += x[i]
        i += 1
    return i - 1

# any calls like this now dispatch to my_func_numba
ar.do('my_func', x_numpy)
```

### Deviations from `numpy`

As stated above, `autoray` does not have an explicit API, but where there exist
equivalent functions, `autoray` uses the call signature of `numpy` as a
reference. The following are deviations from this:

* `do('linalg.svd', x)` - `autoray` defaults to `full_matrices=False`, since
  this is almost always what is desired, and many libraries do not even support
  `full_matrices=True`.

-------------------------------------------------------------------------------

## Comparison to alternatives

* The [Python Array API](https://data-apis.org/array-api)
  (`x.__array_namespace__()`) is a proposed standard for array libraries to
  implement that many are converging to. It hopefully will be the de-facto
  approach for writing backend array agnostic code, and
  [`autoray.get_namespace`](autoray.get_namespace) provides a similar mechanism
  with this in mind. The current benefit of `autoray` is that it does not
  require the backend library to explicitly support it, and it can dynamically
  compose and replace functions etc.

* The ``__array_function__`` protocol has been
  [suggested](https://www.numpy.org/neps/nep-0018-array-function-protocol.html)
  and now implemented in ``numpy``.
  This is more orientated around array classes that explicitly subclass
  `numpy.ndarray`.

* The [uarray](https://github.com/Quansight-Labs/uarray) project appears to
  have similar goals but is still being developed.

* [`functools.singledispatch`](https://docs.python.org/3/library/functools.html#functools.singledispatch)
  is a general *single* dispatch mechanism, but it is slower and requires the
  user to explicitly register each function they want to dispatch on.

* [`plum`](https://github.com/beartype/plum) is a general *multiple* dispatch
  mechanism, but again it would require registering every function for every
  backend explicitly.
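
To make the point about explicit registration concrete, here is a minimal sketch using only the standard library. The function name `explicit_sqrt` is hypothetical, and the builtin types `float`, `complex` and `list` stand in for array backends purely for illustration; none of this is part of `autoray`.

```python
import cmath
import math
from functools import singledispatch


@singledispatch
def explicit_sqrt(x):
    # fallback: reached for any type that has no registered implementation
    raise NotImplementedError(f"no sqrt registered for {type(x).__name__}")


@explicit_sqrt.register
def _(x: float):
    # implementation for the 'float' stand-in backend
    return math.sqrt(x)


@explicit_sqrt.register
def _(x: complex):
    # implementation for the 'complex' stand-in backend
    return cmath.sqrt(x)


@explicit_sqrt.register
def _(x: list):
    # implementation for the 'list' stand-in backend, applied elementwise
    return [explicit_sqrt(xi) for xi in x]
```

Every additional function (`sum`, `tensordot`, ...) would need its own dispatcher and one registration per type, which is the bookkeeping that a call such as `do('sqrt', x)` is designed to avoid.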