This repository was archived by the owner on Nov 17, 2023. It is now read-only.

segfault in native code while trying to use CustomOp #11926

@mdespriee

Description


I'm trying to use CustomOp to create a constant, as suggested in #8428.
As soon as I declare that my CustomOp has no inputs, it fails with a segfault, and I can't find a workaround.

Environment

Code

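import org.apache.mxnet._
import org.apache.mxnet.DType.DType

// Custom operator that copies a fixed NDArray into its single output
class ConstantOp(value: NDArray) extends CustomOp {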

  def forward(isTrain: Boolean, req: Array[String], inData: Array[NDArray], outData: Array[NDArray], aux: Array[NDArray]): Unit = {
    val data = value.copyTo(outData(0).context)
    this.assign(outData(0), req(0), data)
    data.dispose()
  }

  def backward(req: Array[String], outGrad: Array[NDArray], inData: Array[NDArray], outData: Array[NDArray], inGrad: Array[NDArray], aux: Array[NDArray]): Unit = {
    throw new Exception("Backward not supported by Constant")
  }
}

class ConstantOpProp(needTopGrad: Boolean = false) extends CustomOpProp(needTopGrad) {

  override def listArguments(): Array[String] = Array()

  override def listOutputs(): Array[String] = Array("output")

  override def inferShape(inShape: Array[Shape]): (Array[Shape], Array[Shape], Array[Shape]) = {
    val data = NDArray.deserialize(this.kwargs("value").toCharArray.map(_.toByte))
    (Array(), Array(data.shape), null)
  }

  override def inferType(inType: Array[DType]): (Array[DType], Array[DType], Array[DType]) = {
    val data = NDArray.deserialize(this.kwargs("value").toCharArray.map(_.toByte))
    (Array(), Array(data.dtype), null)
  }

  override def createOperator(ctx: String, inShapes: Array[Array[Int]],
                              inDtypes: Array[Int]): CustomOp = {
    // hacky stuff to workaround the declaration using String
    val data = NDArray.deserialize(this.kwargs("value").toCharArray.map(_.toByte))
    new ConstantOp(data)
  }
}


object TestConst {
  Operator.register("constant", new ConstantOpProp())

  val value = NDArray.array(Array(1f), Shape(1))
  val const = Symbol.Custom("constant")()(
    kwargs = Map(
      "op_type" -> "constant",
      // hacky thing to workaround the fact CustomOpProp uses Map[String, String] internally for kwargs
      "value" -> String.copyValueOf(value.serialize().map(_.toChar))
    ))

  val a = Symbol.Variable("a")
  val symbol = a + const
  val e = symbol.bind(Context.defaultCtx, Map(
    "a" -> NDArray.array(Array(10f), Shape(1)))
  )

  e.forward()

  println("outputs=" + e.outputs.mkString(", "))
}

Error:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f37ec25e7a9, pid=14172, tid=0x00007f379cbf7700
#
# JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x6797a9]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /app/hs_err_pid14172.log

In the log:

[...]

Stack: [0x00007f6b28ad3000,0x00007f6b28bd4000],  sp=0x00007f6b28bcecd0,  free space=1007k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x6797a9]
C  [mxnet-scala+0x455dd4]  Java_org_apache_mxnet_LibInfo_mxCustomOpRegister::{lambda(char const*, int, char const**, char const**, MXCallbackList*)#1}::operator()(char const*, int, char const**, char const**, MXCallbackList*) const::{lambda(int, int*, unsigned int**, void*)#5}::_FUN(int, {lambda(char const*, int, char const**, char const**, MXCallbackList*)#1}, unsigned int*, unsigned int**)+0x454
C  [mxnet-scala+0x6acf4c]  mxnet::op::custom::InferShape(nnvm::NodeAttrs const&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*)+0x2dc

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.apache.mxnet.LibInfo.mxExecutorBindEX(JIII[Ljava/lang/String;[I[II[J[J[I[JJLorg/apache/mxnet/Base$RefLong;)I+0
j  org.apache.mxnet.Symbol.bindHelper(Lorg/apache/mxnet/Context;Lscala/collection/Seq;Lscala/collection/Iterable;Lscala/collection/Iterable;Lscala/collection/Iterable;Lscala/collection/Iterable;Lscala/collection/immutable/Map;Lorg/apache/mxnet/Executor;)Lorg/apache/mxnet/Executor;+767
j  org.apache.mxnet.Symbol.bind(Lorg/apache/mxnet/Context;Lscala/collection/immutable/Map;)Lorg/apache/mxnet/Executor;+38

[...]

Side note

As you can see in the code, I have to encode NDArrays as strings to pass the data. That's because the CustomOp implementation declares kwargs as Map[String, String], whereas Symbol.Custom accepts Map[String, Any]. This leads to a strange situation where, at runtime, non-String objects sit behind Java String references, yet they can't be cast back because of the type system.
Changing the kwargs type in CustomOp would be welcome.
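
Until that changes, a less fragile way to squeeze bytes into a Map[String, String] might be Base64 rather than mapping each byte to a char. The sketch below is only an illustration: KwargBytes is a hypothetical helper, not part of MXNet, and it uses nothing but java.util.Base64 from the JDK. Whether the resulting string survives the trip through the native layer is an assumption I have not verified.

```scala
import java.util.Base64

// Hypothetical helper (not part of MXNet): turns raw bytes into an
// ASCII-only string and back, so the value fits in Map[String, String].
object KwargBytes {
  def encode(bytes: Array[Byte]): String =
    Base64.getEncoder.encodeToString(bytes)

  def decode(text: String): Array[Byte] =
    Base64.getDecoder.decode(text)
}

object KwargBytesDemo extends App {
  // Stand-in for the output of value.serialize(); includes zero and negative bytes
  val payload: Array[Byte] = Array[Byte](0, 1, -1, 127, -128)

  val asString = KwargBytes.encode(payload)
  val restored = KwargBytes.decode(asString)

  // The round trip must give back exactly the original bytes
  assert(restored.sameElements(payload))
  println(asString)
}
```

In the code above, this would presumably replace String.copyValueOf(value.serialize().map(_.toChar)) on the caller side and this.kwargs("value").toCharArray.map(_.toByte) inside ConstantOpProp. It does not address the segfault itself.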
