• 19. cuBLAS Developer Guide (Chinese Edition): The Level-2 Function gemv() in cuBLAS


    2.6.2. cublas<t>gemv()


    cublasStatus_t cublasSgemv(cublasHandle_t handle, cublasOperation_t trans,
                               int m, int n,
                               const float           *alpha,
                               const float           *A, int lda,
                               const float           *x, int incx,
                               const float           *beta,
                               float           *y, int incy)
    cublasStatus_t cublasDgemv(cublasHandle_t handle, cublasOperation_t trans,
                               int m, int n,
                               const double          *alpha,
                               const double          *A, int lda,
                               const double          *x, int incx,
                               const double          *beta,
                               double          *y, int incy)
    cublasStatus_t cublasCgemv(cublasHandle_t handle, cublasOperation_t trans,
                               int m, int n,
                               const cuComplex       *alpha,
                               const cuComplex       *A, int lda,
                               const cuComplex       *x, int incx,
                               const cuComplex       *beta,
                               cuComplex       *y, int incy)
    cublasStatus_t cublasZgemv(cublasHandle_t handle, cublasOperation_t trans,
                               int m, int n,
                               const cuDoubleComplex *alpha,
                               const cuDoubleComplex *A, int lda,
                               const cuDoubleComplex *x, int incx,
                               const cuDoubleComplex *beta,
                               cuDoubleComplex *y, int incy)
    

    This function performs the matrix-vector multiplication

    $$y = \alpha \, \text{op}(A) \, x + \beta y$$

    where A is an m × n matrix stored in column-major format, x and y are vectors, and $\alpha$ and $\beta$ are scalars. In addition, for matrix A:

    $$
    \text{op}(A) =
    \begin{cases}
    A   & \text{if } \texttt{trans == CUBLAS\_OP\_N} \\
    A^T & \text{if } \texttt{trans == CUBLAS\_OP\_T} \\
    A^H & \text{if } \texttt{trans == CUBLAS\_OP\_H}
    \end{cases}
    $$

    Here CUBLAS_OP_H denotes the conjugate transpose; in the cublasOperation_t enum this value is spelled CUBLAS_OP_C.
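To make the formula and the column-major layout concrete, here is a minimal CPU reference sketch in plain C for the single-precision real case. It is not the cuBLAS implementation: the names `ref_op_t` and `ref_sgemv` are hypothetical, scalars are passed by value rather than by pointer, and only positive strides are handled. For real data the conjugate transpose coincides with the transpose, so only the N and T cases appear.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cublasOperation_t; not the cuBLAS type. */
typedef enum { REF_OP_N, REF_OP_T } ref_op_t;

/* CPU reference for y = alpha * op(A) * x + beta * y.
 * A is m x n, column-major, with leading dimension lda (lda >= m).
 * Assumption of this sketch: incx > 0 and incy > 0. */
static void ref_sgemv(ref_op_t trans, int m, int n, float alpha,
                      const float *A, int lda,
                      const float *x, int incx,
                      float beta, float *y, int incy)
{
    assert(incx > 0 && incy > 0);

    int len_y = (trans == REF_OP_N) ? m : n;  /* rows of op(A)    */
    int len_x = (trans == REF_OP_N) ? n : m;  /* columns of op(A) */

    for (int i = 0; i < len_y; ++i) {
        float acc = 0.0f;
        for (int j = 0; j < len_x; ++j) {
            /* Element (r, c) of A lives at A[r + c * lda].
             * op(A)(i, j) is A(i, j) for N and A(j, i) for T. */
            float a = (trans == REF_OP_N) ? A[(size_t)i + (size_t)j * lda]
                                          : A[(size_t)j + (size_t)i * lda];
            acc += a * x[(size_t)j * incx];
        }
        /* When beta == 0, y is overwritten without being read. */
        float *yi = &y[(size_t)i * incy];
        *yi = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * (*yi);
    }
}
```

For example, the 2 × 3 matrix with rows (1, 2, 3) and (4, 5, 6) is stored as {1, 4, 2, 5, 3, 6} with lda = 2. Calling the sketch with REF_OP_N, alpha = 1, beta = 0 and x = (1, 1, 1) yields y = (6, 15).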

    | Param. | Memory | In/out | Meaning |
    |---|---|---|---|
    | handle | | input | handle to the cuBLAS library context. |
    | trans | | input | operation op(A) that is non- or (conj.) transpose. |
    | m | | input | number of rows of matrix A. |
    | n | | input | number of columns of matrix A. |
    | alpha | host or device | input | scalar used for multiplication. |
    | A | device | input | array of dimension lda x n with lda >= max(1,m). |
    | lda | | input | leading dimension of two-dimensional array used to store matrix A. |
    | x | device | input | vector with at least (1+(n-1)*abs(incx)) elements if trans == CUBLAS_OP_N and at least (1+(m-1)*abs(incx)) elements otherwise. |
    | incx | | input | stride between consecutive elements of x. |
    | beta | host or device | input | scalar used for multiplication; if beta == 0 then y does not have to be a valid input. |
    | y | device | in/out | vector with at least (1+(m-1)*abs(incy)) elements if trans == CUBLAS_OP_N and at least (1+(n-1)*abs(incy)) elements otherwise. |
    | incy | | input | stride between consecutive elements of y. |
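The size requirements in the table can be turned into a small calculation before allocating device buffers. The following is a sketch only: `min_vec_elems`, `gemv_sizes_t`, and `gemv_min_sizes` are hypothetical helper names, not part of cuBLAS, and the convention of returning 0 for an empty vector is an assumption of this example.

```c
#include <stddef.h>
#include <stdlib.h>  /* abs */

/* Hypothetical helper: minimum number of stored elements needed so that
 * `count` logical elements are reachable with stride `inc`,
 * i.e. 1 + (count - 1) * abs(inc). Returns 0 for an empty vector. */
static size_t min_vec_elems(int count, int inc)
{
    if (count <= 0) {
        return 0;
    }
    return 1 + (size_t)(count - 1) * (size_t)abs(inc);
}

/* Minimum element counts for the three arrays of a gemv call. */
typedef struct {
    size_t a;  /* elements in A: lda * n                     */
    size_t x;  /* elements in x: depends on op(A) and incx   */
    size_t y;  /* elements in y: depends on op(A) and incy   */
} gemv_sizes_t;

/* is_transposed is nonzero for the transpose and conjugate-transpose cases. */
static gemv_sizes_t gemv_min_sizes(int is_transposed, int m, int n,
                                   int lda, int incx, int incy)
{
    gemv_sizes_t s;
    s.a = (size_t)lda * (size_t)n;
    s.x = min_vec_elems(is_transposed ? m : n, incx);
    s.y = min_vec_elems(is_transposed ? n : m, incy);
    return s;
}
```

With m = 4, n = 3, lda = 4, incx = 2, and incy = -3 in the non-transposed case, this gives 12 elements for A, 5 for x, and 10 for y. Multiply each count by the element size (for example `sizeof(float)` for cublasSgemv) to get the byte size to allocate.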

    The error values this function may return, and their meanings, are listed below.

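    | Error Value | Meaning |
    |---|---|
    | CUBLAS_STATUS_SUCCESS | the operation completed successfully |
    | CUBLAS_STATUS_NOT_INITIALIZED | the library was not initialized |
    | CUBLAS_STATUS_INVALID_VALUE | the parameters m, n < 0 or incx, incy = 0 |
    | CUBLAS_STATUS_EXECUTION_FAILED | the function failed to launch on the GPU |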
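The CUBLAS_STATUS_INVALID_VALUE conditions can also be checked on the host before the call, which may make argument mistakes easier to locate. The sketch below covers only the conditions listed in the table; `gemv_args_invalid` is a hypothetical helper, not a cuBLAS function.

```c
/* Hypothetical pre-check: returns 1 when the arguments match the
 * CUBLAS_STATUS_INVALID_VALUE conditions from the table
 * (m < 0, n < 0, incx == 0, or incy == 0), and 0 otherwise. */
static int gemv_args_invalid(int m, int n, int incx, int incy)
{
    return (m < 0) || (n < 0) || (incx == 0) || (incy == 0);
}
```

Note that negative strides and zero-sized dimensions pass this check, since the table lists only negative dimensions and zero strides as invalid.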
  • Original article: https://blog.csdn.net/kunhe0512/article/details/126325020