Layer Fusion

GPU kernel fusion API reference

miopenFusionDirection_t

enum miopenFusionDirection_t

Kernel fusion direction in the network.

Values:

miopenVerticalFusion = 0

Fuses layers vertically; currently the only supported mode.

miopenHorizontalFusion = 1

Fuses layers horizontally; not implemented.

miopenCreateFusionPlan

miopenStatus_t miopenCreateFusionPlan(miopenFusionPlanDescriptor_t *fusePlanDesc, const miopenFusionDirection_t fuseDirection, const miopenTensorDescriptor_t inputDesc)

Creates the kernel fusion plan descriptor object.

Return
miopenStatus_t
Parameters
  • fusePlanDesc: Pointer to a fusion plan (output)
  • fuseDirection: Horizontal or Vertical fusion (input)
  • inputDesc: Descriptor to tensor for the input (input)

miopenDestroyFusionPlan

miopenStatus_t miopenDestroyFusionPlan(miopenFusionPlanDescriptor_t fusePlanDesc)

Destroys the fusion plan descriptor object.

Return
miopenStatus_t
Parameters
  • fusePlanDesc: A fusion plan descriptor (input)

miopenCompileFusionPlan

miopenStatus_t miopenCompileFusionPlan(miopenHandle_t handle, miopenFusionPlanDescriptor_t fusePlanDesc)

Compiles the fusion plan.

Return
miopenStatus_t
Parameters
  • handle: MIOpen handle (input)
  • fusePlanDesc: A fusion plan descriptor (input)

miopenFusionPlanGetOp

miopenStatus_t miopenFusionPlanGetOp(miopenFusionPlanDescriptor_t fusePlanDesc, const int op_idx, miopenFusionOpDescriptor_t *op)

Allows access to the operators in a fusion plan.

This API call performs bounds checking on the supplied op_idx and returns miopenStatusError if the index is out of bounds.

Return
miopenStatus_t
Parameters
  • fusePlanDesc: A fusion plan descriptor (input)
  • op_idx: Index of the required operator in the fusion plan, in the order of insertion (input)
  • op: Returned pointer to the operator (output)
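
The bounds check described above can be pictured with a small stand-alone model. This sketch does not call MIOpen: mock_status_t, mock_plan_t, mock_plan_add_op and mock_plan_get_op are hypothetical names invented for illustration. It only shows the contract, assuming operators are kept in insertion order and that an out-of-range index produces an error status rather than an out-of-bounds read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT MIOpen types. */
typedef enum { MOCK_STATUS_SUCCESS = 0, MOCK_STATUS_ERROR = 1 } mock_status_t;

#define MOCK_MAX_OPS 8

typedef struct {
    int ops[MOCK_MAX_OPS]; /* operator ids, stored in insertion order */
    int count;             /* number of operators added so far */
} mock_plan_t;

/* Appends an operator id; fails when the plan is full. */
mock_status_t mock_plan_add_op(mock_plan_t *plan, int op_id)
{
    assert(plan != NULL);
    if (plan->count >= MOCK_MAX_OPS)
        return MOCK_STATUS_ERROR;
    plan->ops[plan->count++] = op_id;
    return MOCK_STATUS_SUCCESS;
}

/* Models the documented contract: op_idx is bounds-checked, and an
   out-of-range index returns an error instead of reading past the array.
   In this mock, *op is left unchanged on failure. */
mock_status_t mock_plan_get_op(const mock_plan_t *plan, int op_idx, int *op)
{
    assert(plan != NULL && op != NULL);
    if (op_idx < 0 || op_idx >= plan->count)
        return MOCK_STATUS_ERROR;
    *op = plan->ops[op_idx];
    return MOCK_STATUS_SUCCESS;
}
```

With the real API, the analogous pattern would be to compare the miopenStatus_t returned by miopenFusionPlanGetOp against miopenStatusSuccess before using the returned operator.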

miopenFusionPlanGetWorkSpaceSize

miopenStatus_t miopenFusionPlanGetWorkSpaceSize(miopenHandle_t handle, miopenFusionPlanDescriptor_t fusePlanDesc, size_t *workSpaceSize, miopenConvFwdAlgorithm_t algo)

Queries the workspace size required for the fusion plan.

Return
miopenStatus_t
Parameters
  • handle: MIOpen handle (input)
  • fusePlanDesc: A fusion plan descriptor (input)
  • workSpaceSize: Pointer to memory to return size in bytes (output)
  • algo: Convolution algorithm selected (input)

miopenFusionPlanConvolutionGetAlgo

miopenStatus_t miopenFusionPlanConvolutionGetAlgo(miopenFusionPlanDescriptor_t fusePlanDesc, const int requestAlgoCount, int *returnedAlgoCount, miopenConvFwdAlgorithm_t *returnedAlgos)

Returns the supported algorithms for the convolution operator in the Fusion Plan.

A convolution operator in a fusion plan may be implemented by different algorithms, each representing a different tradeoff between memory and performance. The returned list of algorithms is sorted in decreasing order of priority. If the user does not request an algorithm with the miopenFusionPlanConvolutionSetAlgo call, the first algorithm in the list is used to execute the convolution in the fusion plan. This call must be immediately preceded by the miopenCreateOpConvForward call for the op in question.

Return
miopenStatus_t
Parameters
  • fusePlanDesc: A fusion plan descriptor (input)
  • requestAlgoCount: Number of algorithms to return (input)
  • returnedAlgoCount: The actual number of returned algorithms; always less than or equal to requestAlgoCount (output)
  • returnedAlgos: Pointer to the list of supported algorithms (output)
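
The selection rule described above (priority-ordered list, first entry used unless an algorithm is set explicitly, returned count capped by the requested count) can be modeled without MIOpen. Everything in this sketch is hypothetical: mock_algo_t, mock_conv_op_t and the three functions are illustration-only names, and the algorithm ids are not miopenConvFwdAlgorithm_t values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical algorithm ids; NOT miopenConvFwdAlgorithm_t values. */
typedef enum { MOCK_ALGO_A = 0, MOCK_ALGO_B = 1, MOCK_ALGO_C = 2 } mock_algo_t;

typedef struct {
    const mock_algo_t *supported; /* sorted by decreasing priority */
    int supported_count;
    int has_user_choice;          /* nonzero once an algorithm was set explicitly */
    mock_algo_t user_choice;
} mock_conv_op_t;

/* Copies at most request_count algorithms, highest priority first.
   *returned_count never exceeds request_count. */
void mock_get_algos(const mock_conv_op_t *op, int request_count,
                    int *returned_count, mock_algo_t *returned)
{
    assert(op != NULL && returned_count != NULL && returned != NULL);
    int n = request_count < op->supported_count ? request_count
                                                : op->supported_count;
    if (n < 0)
        n = 0;
    for (int i = 0; i < n; ++i)
        returned[i] = op->supported[i];
    *returned_count = n;
}

/* Records an explicit choice; returns 0 on success, -1 if unsupported. */
int mock_set_algo(mock_conv_op_t *op, mock_algo_t algo)
{
    assert(op != NULL);
    for (int i = 0; i < op->supported_count; ++i) {
        if (op->supported[i] == algo) {
            op->has_user_choice = 1;
            op->user_choice = algo;
            return 0;
        }
    }
    return -1;
}

/* Algorithm that would run: the explicit choice if any, else the list head. */
mock_algo_t mock_effective_algo(const mock_conv_op_t *op)
{
    assert(op != NULL && op->supported_count > 0);
    return op->has_user_choice ? op->user_choice : op->supported[0];
}
```

In this model a caller that never sets an algorithm gets the head of the priority list, which is the default behavior the text describes for the fusion plan.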

miopenCreateOpConvForward

miopenStatus_t miopenCreateOpConvForward(miopenFusionPlanDescriptor_t fusePlanDesc, miopenFusionOpDescriptor_t *convOp, miopenConvolutionDescriptor_t convDesc, const miopenTensorDescriptor_t wDesc)

Creates a forward convolution operator.

Return
miopenStatus_t
Parameters
  • fusePlanDesc: A fusion plan descriptor (input)
  • convOp: Pointer to an operator type (output)
  • convDesc: Convolution layer descriptor (input)
  • wDesc: Descriptor for the weights tensor (input)

miopenCreateOpActivationForward

miopenStatus_t miopenCreateOpActivationForward(miopenFusionPlanDescriptor_t fusePlanDesc, miopenFusionOpDescriptor_t *activFwdOp, miopenActivationMode_t mode)

Creates a forward activation operator.

Return
miopenStatus_t
Parameters
  • fusePlanDesc: A fusion plan descriptor (input)
  • activFwdOp: Pointer to an operator type (output)
  • mode: Activation mode (input)

miopenCreateOpBiasForward

miopenStatus_t miopenCreateOpBiasForward(miopenFusionPlanDescriptor_t fusePlanDesc, miopenFusionOpDescriptor_t *biasOp, const miopenTensorDescriptor_t bDesc)

Creates a forward bias operator.

Return
miopenStatus_t
Parameters
  • fusePlanDesc: A fusion plan descriptor (input)
  • biasOp: Pointer to an operator type (output)
  • bDesc: Bias tensor descriptor (input)

miopenCreateOpBatchNormInference

miopenStatus_t miopenCreateOpBatchNormInference(miopenFusionPlanDescriptor_t fusePlanDesc, miopenFusionOpDescriptor_t *bnOp, const miopenBatchNormMode_t bn_mode, const miopenTensorDescriptor_t bnScaleBiasMeanVarDesc)

Creates a forward inference batch normalization operator.

Return
miopenStatus_t
Parameters
  • fusePlanDesc: A fusion plan descriptor (input)
  • bnOp: Pointer to an operator type (output)
  • bn_mode: Batch normalization layer mode (input)
  • bnScaleBiasMeanVarDesc: Gamma, beta, mean, variance tensor descriptor (input)

miopenCreateOperatorArgs

miopenStatus_t miopenCreateOperatorArgs(miopenOperatorArgs_t *args)

Creates an operator argument object.

Return
miopenStatus_t
Parameters
  • args: Pointer to an operator argument type (output)

miopenDestroyOperatorArgs

miopenStatus_t miopenDestroyOperatorArgs(miopenOperatorArgs_t args)

Destroys an operator argument object.

Return
miopenStatus_t
Parameters
  • args: An operator argument object (input)

miopenSetOpArgsConvForward

miopenStatus_t miopenSetOpArgsConvForward(miopenOperatorArgs_t args, const miopenFusionOpDescriptor_t convOp, const void *alpha, const void *beta, const void *w)

Sets the arguments for the forward convolution op.

Return
miopenStatus_t
Parameters
  • args: An arguments object type (output)
  • convOp: Forward convolution operator (input)
  • alpha: Floating point scaling factor, allocated on the host (input)
  • beta: Floating point shift factor, allocated on the host (input)
  • w: Pointer to the weights tensor memory (input)

miopenSetOpArgsBatchNormInference

miopenStatus_t miopenSetOpArgsBatchNormInference(miopenOperatorArgs_t args, const miopenFusionOpDescriptor_t bnOp, const void *alpha, const void *beta, const void *bnScale, const void *bnBias, const void *estimatedMean, const void *estimatedVariance, double epsilon)

Sets the arguments for the inference batch normalization op.

Return
miopenStatus_t
Parameters
  • args: An arguments object type (output)
  • bnOp: Batch normalization inference operator (input)
  • alpha: Floating point scaling factor, allocated on the host (input)
  • beta: Floating point shift factor, allocated on the host (input)
  • bnScale: Pointer to the gamma tensor memory (input)
  • bnBias: Pointer to the beta tensor memory (input)
  • estimatedMean: Pointer to population mean memory (input)
  • estimatedVariance: Pointer to population variance memory (input)
  • epsilon: Scalar value for numerical stability (input)
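
For orientation, the tensor arguments above map onto the conventional definition of inference-time batch normalization. The formula below is the textbook form, stated here as an assumption rather than taken from this reference; it leaves out any blending by the alpha and beta scaling factors.

```latex
y = \gamma \cdot \frac{x - \mu}{\sqrt{\sigma^{2} + \epsilon}} + \beta
```

Under that reading, \gamma corresponds to bnScale, \beta to bnBias, \mu to estimatedMean, \sigma^{2} to estimatedVariance, and \epsilon to epsilon, which keeps the denominator away from zero when the variance is very small.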

miopenSetOpArgsBiasForward

miopenStatus_t miopenSetOpArgsBiasForward(miopenOperatorArgs_t args, const miopenFusionOpDescriptor_t biasOp, const void *alpha, const void *beta, const void *bias)

Sets the arguments for the forward bias op.

Return
miopenStatus_t
Parameters
  • args: An arguments object type (output)
  • biasOp: Forward bias operator (input)
  • alpha: Floating point scaling factor, allocated on the host (input)
  • beta: Floating point shift factor, allocated on the host (input)
  • bias: Pointer to the forward bias input tensor memory (input)

miopenExecuteFusionPlan

miopenStatus_t miopenExecuteFusionPlan(const miopenHandle_t handle, const miopenFusionPlanDescriptor_t fusePlanDesc, const miopenTensorDescriptor_t inputDesc, const void *input, const miopenTensorDescriptor_t outputDesc, void *output, miopenOperatorArgs_t args)

Executes the fusion plan.

Return
miopenStatus_t
Parameters
  • handle: MIOpen handle (input)
  • fusePlanDesc: Fusion plan descriptor (input)
  • inputDesc: Descriptor of the input tensor (input)
  • input: Source data tensor (input)
  • outputDesc: Descriptor of the output tensor (input)
  • output: Destination data tensor (output)
  • args: An argument object of the fused kernel (input)