There are some non-obvious limitations in ChainerX:

  • ChainerX only supports a limited set of dtypes: bool_, int8, int16, int32, int64, uint8, float32, and float64.

  • Operations with mixed dtypes are not supported. You need to convert dtypes explicitly using either chainerx.astype() or F.cast() (where F is chainer.functions).

  • Python's true division, where 2 / 3 evaluates to 0.66... rather than 0, is not supported yet. Given an ndarray a of dtype int32, a / a returns an int32 array, not a float64 array.

  • Only a limited set of Chainer functions is well tested with the ChainerX integration.

  • The ChainerX CUDA backend requires cuDNN. See the installation guide for details.

  • As ChainerX arrays have a computational graph of their own, some operations are prohibited for safety:

    • Unless an array is free from the computational graph, in-place modification of its data is prohibited.

      import chainerx

      a = chainerx.zeros((2,), chainerx.float32)
      a.require_grad()  # connect `a` to a computational graph
      a += 1  # ! error: in-place modification of a graph-connected array

      The reason for this limitation is that backward operations may depend on the value of a, so altering it in place could unexpectedly change the computed gradients.

      You may circumvent this limitation by making a disconnected view:

      # A memory-shared view of `a` which is disconnected from the computational graph of `a`.
      b = a.as_grad_stopped()
      b += 1

      Note, however, that this operation is inherently dangerous. You must take great care to ensure that it does not affect backward computations.

      Note also that we may restrict this further in the future, so that even in-place modification of a disconnected view is allowed only if it is actually safe.

    • If an array is wrapped in a Variable with requires_grad=True (the default), you cannot reassign the array:

      import chainer
      import chainerx

      a = chainerx.zeros((2,), chainerx.float32)
      b = chainerx.zeros((2,), chainerx.float32)
      var = chainer.Variable(a)  # requires_grad=True by default
      var.array = b  # ! error: the array cannot be reassigned

      You may circumvent this by using in-place assignment on var.array:

      var.array[:] = b

      This workaround can be just as dangerous as the one for the previous limitation.
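
The two dtype limitations listed above (no mixed-dtype operations, and no true division for integer arrays) can both be handled with explicit casts. The following is a minimal sketch rather than part of ChainerX: the helper names cast_like and true_divide are hypothetical, and the code assumes only that arrays expose a dtype attribute and an astype() method, as ChainerX ndarrays do. The demo() function additionally assumes that ChainerX is installed.

```python
def cast_like(array, reference):
    """Return `array` converted to the dtype of `reference` (hypothetical helper)."""
    return array.astype(reference.dtype)


def true_divide(a, b, float_dtype):
    """Elementwise a / b computed in `float_dtype` (hypothetical helper).

    Both operands are cast first, so integer inputs yield fractional results
    and the two operands always share one dtype.
    """
    return a.astype(float_dtype) / b.astype(float_dtype)


def demo():
    # Assumption: ChainerX is installed. It is imported here so that the
    # helpers above remain importable without it.
    import chainerx

    ints = chainerx.array([2, 4, 6], dtype=chainerx.int32)
    floats = chainerx.ones((3,), chainerx.float32)

    # Mixed dtypes: cast explicitly before combining the operands.
    total = floats + cast_like(ints, floats)

    # Integer operands: cast to a float dtype before dividing.
    ratio = true_divide(ints, ints + 1, chainerx.float64)
    return total, ratio
```

Calling demo() requires a working ChainerX installation. The helpers themselves are duck-typed, so they can be exercised with any array type that provides astype() and dtype.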