42. cuBLAS Developer Guide, Chinese Edition: The Level-3 Function gemm() in cuBLAS

2.7.1. `cublas<t>gemm()`


cublasStatus_t cublasSgemm(cublasHandle_t handle,
                           cublasOperation_t transa, cublasOperation_t transb,
                           int m, int n, int k,
                           const float           *alpha,
                           const float           *A, int lda,
                           const float           *B, int ldb,
                           const float           *beta,
                           float           *C, int ldc)
cublasStatus_t cublasDgemm(cublasHandle_t handle,
                           cublasOperation_t transa, cublasOperation_t transb,
                           int m, int n, int k,
                           const double          *alpha,
                           const double          *A, int lda,
                           const double          *B, int ldb,
                           const double          *beta,
                           double          *C, int ldc)
cublasStatus_t cublasCgemm(cublasHandle_t handle,
                           cublasOperation_t transa, cublasOperation_t transb,
                           int m, int n, int k,
                           const cuComplex       *alpha,
                           const cuComplex       *A, int lda,
                           const cuComplex       *B, int ldb,
                           const cuComplex       *beta,
                           cuComplex       *C, int ldc)
cublasStatus_t cublasZgemm(cublasHandle_t handle,
                           cublasOperation_t transa, cublasOperation_t transb,
                           int m, int n, int k,
                           const cuDoubleComplex *alpha,
                           const cuDoubleComplex *A, int lda,
                           const cuDoubleComplex *B, int ldb,
                           const cuDoubleComplex *beta,
                           cuDoubleComplex *C, int ldc)
cublasStatus_t cublasHgemm(cublasHandle_t handle,
                           cublasOperation_t transa, cublasOperation_t transb,
                           int m, int n, int k,
                           const __half *alpha,
                           const __half *A, int lda,
                           const __half *B, int ldb,
                           const __half *beta,
                           __half *C, int ldc)

This function performs the matrix-matrix multiplication

$C = \alpha \, \mathrm{op}(A) \, \mathrm{op}(B) + \beta C$

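where $\alpha$ and $\beta$ are scalars, and $A$, $B$ and $C$ are matrices stored in column-major format, with dimensions $m \times k$ for $\mathrm{op}(A)$, $k \times n$ for $\mathrm{op}(B)$ and $m \times n$ for $C$. Also, for matrix $A$: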

$$
\mathrm{op}(A) =
\begin{cases}
A & \text{if transa == CUBLAS\_OP\_N} \\
A^{T} & \text{if transa == CUBLAS\_OP\_T} \\
A^{H} & \text{if transa == CUBLAS\_OP\_C}
\end{cases}
$$

and $\mathrm{op}(B)$ is defined in the same way for matrix $B$, using `transb`.
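
To make the formula concrete, here is a minimal CPU sketch of the same computation for single precision. It is only a reference for the semantics, not how cuBLAS implements the operation on the GPU. The names `ref_op_t`, `ref_op_at` and `ref_sgemm` are hypothetical and are not part of cuBLAS; the conjugate-transpose case is left out because it equals the plain transpose for real data, and the scalars are passed by value rather than by pointer for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cublasOperation_t (N and T only; for real
   data the conjugate transpose is the same as the transpose). */
typedef enum { REF_OP_N, REF_OP_T } ref_op_t;

/* Element (i, j) of op(M), where M is stored column-major with
   leading dimension ld. */
static float ref_op_at(const float *M, int ld, ref_op_t op, int i, int j)
{
    return (op == REF_OP_N) ? M[(size_t)j * ld + i]   /* M(i, j) */
                            : M[(size_t)i * ld + j];  /* M(j, i) */
}

/* C = alpha * op(A) * op(B) + beta * C, all matrices column-major.
   op(A) is m x k, op(B) is k x n, C is m x n. */
void ref_sgemm(ref_op_t transa, ref_op_t transb, int m, int n, int k,
               float alpha, const float *A, int lda,
               const float *B, int ldb,
               float beta, float *C, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += ref_op_at(A, lda, transa, i, p) *
                       ref_op_at(B, ldb, transb, p, j);
            float *c = &C[(size_t)j * ldc + i];
            /* With beta == 0 the old value of C is never read, so C
               does not have to hold valid input. */
            *c = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * *c;
        }
    }
}
```

For example, with the 2 x 3 matrix A = [[1, 2, 3], [4, 5, 6]] stored column-major as `{1, 4, 2, 5, 3, 6}` (lda = 2) and the 3 x 2 matrix B = [[7, 8], [9, 10], [11, 12]] stored as `{7, 9, 11, 8, 10, 12}` (ldb = 3), calling `ref_sgemm` with alpha = 1 and beta = 0 fills C with `{58, 139, 64, 154}`, which is [[58, 64], [139, 154]] in column-major order.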

| Param. | Memory | In/out | Meaning |
|---|---|---|---|
| handle | | input | Handle to the cuBLAS library context. |
| transa | | input | Operation op(A) that is non- or (conj.) transpose. |
| transb | | input | Operation op(B) that is non- or (conj.) transpose. |
| m | | input | Number of rows of matrix op(A) and C. |
| n | | input | Number of columns of matrix op(B) and C. |
| k | | input | Number of columns of op(A) and rows of op(B). |
| alpha | host or device | input | Scalar used for multiplication. |
| A | device | input | Array of dimensions lda x k with lda >= max(1, m) if `transa == CUBLAS_OP_N`, and lda x m with lda >= max(1, k) otherwise. |
| lda | | input | Leading dimension of the two-dimensional array used to store matrix A. |
| B | device | input | Array of dimensions ldb x n with ldb >= max(1, k) if `transb == CUBLAS_OP_N`, and ldb x k with ldb >= max(1, n) otherwise. |
| ldb | | input | Leading dimension of the two-dimensional array used to store matrix B. |
| beta | host or device | input | Scalar used for multiplication. If `beta == 0`, C does not have to be a valid input. |
| C | device | in/out | Array of dimensions ldc x n with ldc >= max(1, m). |
| ldc | | input | Leading dimension of the two-dimensional array used to store matrix C. |
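
The leading-dimension parameters are easiest to understand through the index arithmetic they imply. The sketch below, which uses hypothetical helper names (`colmajor_index`, `gemm_a_elems`, `submatrix`) that are not part of cuBLAS, shows how element (i, j) maps to a linear offset, how many elements the array for A must hold according to the table above, and how a leading dimension larger than the row count lets a routine work on a sub-matrix of a bigger allocation.

```c
#include <stddef.h>

/* Linear offset of element (i, j) in a column-major array whose
   columns are ld elements apart. */
size_t colmajor_index(int i, int j, int ld)
{
    return (size_t)j * (size_t)ld + (size_t)i;
}

/* Number of elements the array for A must hold, following the table:
   lda x k when A is not transposed, lda x m otherwise. */
size_t gemm_a_elems(int is_transposed, int m, int k, int lda)
{
    return (size_t)lda * (size_t)(is_transposed ? m : k);
}

/* Pointer to the sub-matrix whose top-left corner is (row0, col0) of a
   parent matrix with leading dimension ld. The sub-matrix is addressed
   with the parent's ld, not with its own row count. */
const float *submatrix(const float *parent, int ld, int row0, int col0)
{
    return parent + colmajor_index(row0, col0, ld);
}
```

As a usage example, a 2 x 2 block starting at row 1, column 1 of a 4 x 3 parent matrix would be described by the pointer `submatrix(P, 4, 1, 1)` together with a leading dimension of 4, which satisfies the requirement that the leading dimension be at least the number of rows of the stored matrix.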

The possible error values returned by this function and their meanings are listed in the table below:

| Error Value | Meaning |
|---|---|
| CUBLAS_STATUS_SUCCESS | The operation completed successfully. |
| CUBLAS_STATUS_NOT_INITIALIZED | The library was not initialized. |
| CUBLAS_STATUS_INVALID_VALUE | m, n or k < 0; or transa or transb is not one of CUBLAS_OP_N, CUBLAS_OP_C, CUBLAS_OP_T; or lda < max(1, m) if transa == CUBLAS_OP_N and lda < max(1, k) otherwise; or ldb < max(1, k) if transb == CUBLAS_OP_N and ldb < max(1, n) otherwise; or ldc < max(1, m); or alpha or beta is NULL; or C is NULL when C needs to be scaled. |
| CUBLAS_STATUS_EXECUTION_FAILED | The function failed to launch on the GPU. |
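
The CUBLAS_STATUS_INVALID_VALUE conditions can be checked on the host before a call is made. The following is a rough sketch of such a pre-check, written against hypothetical types (`chk_op_t`, `chk_status_t`) rather than the real cuBLAS enums. One point is an assumption: the source only says C must be non-NULL "if C needs to be scaled", and this sketch simply requires C whenever the output is non-empty.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the N / T / C operations and status codes. */
typedef enum { CHK_OP_N, CHK_OP_T, CHK_OP_C } chk_op_t;
typedef enum { CHK_OK = 0, CHK_INVALID_VALUE = 1 } chk_status_t;

static int max_int(int a, int b) { return a > b ? a : b; }

static int is_valid_op(int op)
{
    return op == CHK_OP_N || op == CHK_OP_T || op == CHK_OP_C;
}

/* Mirrors the documented invalid-value conditions for gemm arguments.
   The operations are taken as int so that out-of-range values can be
   represented and rejected. */
chk_status_t check_gemm_args(int transa, int transb, int m, int n, int k,
                             int lda, int ldb, int ldc,
                             const void *alpha, const void *beta,
                             const void *C)
{
    if (m < 0 || n < 0 || k < 0) return CHK_INVALID_VALUE;
    if (!is_valid_op(transa) || !is_valid_op(transb)) return CHK_INVALID_VALUE;
    /* A is stored as m x k when not transposed, k x m otherwise. */
    if (lda < max_int(1, transa == CHK_OP_N ? m : k)) return CHK_INVALID_VALUE;
    /* B is stored as k x n when not transposed, n x k otherwise. */
    if (ldb < max_int(1, transb == CHK_OP_N ? k : n)) return CHK_INVALID_VALUE;
    if (ldc < max_int(1, m)) return CHK_INVALID_VALUE;
    if (alpha == NULL || beta == NULL) return CHK_INVALID_VALUE;
    /* Assumption: require C whenever the output matrix is non-empty. */
    if (C == NULL && m > 0 && n > 0) return CHK_INVALID_VALUE;
    return CHK_OK;
}
```

A check like this only covers argument validation. It says nothing about CUBLAS_STATUS_NOT_INITIALIZED or CUBLAS_STATUS_EXECUTION_FAILED, which depend on the library and device state at call time.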

For references, please refer to:

sgemm, dgemm, cgemm, zgemm


Original article: https://blog.csdn.net/kunhe0512/article/details/128145108
