clBLAS  2.11
Overview

This library provides an implementation of the Basic Linear Algebra Subprograms levels 1, 2 and 3, using OpenCL and optimized for AMD GPU hardware. It provides BLAS-1 functions SWAP, SCAL, COPY, AXPY, DOT, DOTU, DOTC, ROTG, ROTMG, ROT, ROTM, iAMAX, ASUM and NRM2, BLAS-2 functions GEMV, SYMV, TRMV, TRSV, HEMV, SYR, SYR2, HER, HER2, GER, GERU, GERC, TPMV, SPMV, HPMV, TPSV, SPR, SPR2, HPR, HPR2, GBMV, TBMV, SBMV, HBMV and TBSV and BLAS-3 functions GEMM, SYMM, TRMM, TRSM, HEMM, HERK, HER2K, SYRK and SYR2K.

This library’s primary goal is to help the end user enqueue OpenCL kernels that process BLAS functions in an OpenCL-efficient manner, while keeping the interfaces familiar to users who already know BLAS. All functions accept matrices through buffer objects.
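To make the "familiar BLAS interface" concrete, here is a minimal host-side sketch of what a single-precision GEMM computes, written with the usual BLAS-style argument list (dimensions, scalars, and leading dimensions). The function name `ref_sgemm_rowmajor` is hypothetical and is not part of clBLAS. It assumes row-major, non-transposed inputs and works on plain host arrays, whereas the clBLAS routines take OpenCL buffer objects and enqueue the work on a command queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference, NOT a clBLAS function.
 * Computes C = alpha * A * B + beta * C for row-major, non-transposed data.
 *   A is M x K with leading dimension lda (lda >= K)
 *   B is K x N with leading dimension ldb (ldb >= N)
 *   C is M x N with leading dimension ldc (ldc >= N)
 */
void ref_sgemm_rowmajor(size_t M, size_t N, size_t K,
                        float alpha, const float *A, size_t lda,
                        const float *B, size_t ldb,
                        float beta, float *C, size_t ldc)
{
    /* A leading dimension is the stride between consecutive rows. */
    assert(lda >= K && ldb >= N && ldc >= N);

    for (size_t i = 0; i < M; ++i) {
        for (size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (size_t k = 0; k < K; ++k) {
                acc += A[i * lda + k] * B[k * ldb + j];
            }
            /* Scale the product and blend it with the existing C entry. */
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
}
```

A reference like this is mainly useful for checking results read back from the device on small inputs. It is not a substitute for the library's GPU kernels.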

This library is thread-safe, with the exception of the following API functions: clblasSetup, clblasTeardown, clblasXgemm, clblasStrsm and clblasDtrsm. clblasXgemm is no longer thread-safe because of Auto-Gemm, and clblasStrsm and clblasDtrsm rely on clblasXgemm. clblasCtrsm and clblasZtrsm are still thread-safe. Developers using the library can safely call any other BLAS routine from different threads.
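One common way to call a routine that is not thread-safe from several threads is to serialize the calls behind a mutex. The sketch below shows that pattern with POSIX threads. Everything in it is illustrative: `unsafe_gemm_standin` is a hypothetical placeholder for a call such as a GEMM enqueue, and `guarded_gemm` and `run_workers` are names invented for this example. The library itself does not provide or require this wrapper.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* One process-wide lock guarding every call to the unsafe routine. */
static pthread_mutex_t gemm_lock = PTHREAD_MUTEX_INITIALIZER;

/* Shared state touched by the stand-in; unprotected access would race. */
static long unsafe_call_count = 0;

/* Hypothetical stand-in for a routine that is not thread-safe. */
void unsafe_gemm_standin(void)
{
    unsafe_call_count++;
}

/* Wrapper that lets only one thread at a time enter the unsafe routine. */
void guarded_gemm(void)
{
    pthread_mutex_lock(&gemm_lock);
    unsafe_gemm_standin();
    pthread_mutex_unlock(&gemm_lock);
}

/* Thread body: issue the requested number of guarded calls. */
static void *worker(void *arg)
{
    const int calls = *(const int *)arg;
    for (int i = 0; i < calls; ++i) {
        guarded_gemm();
    }
    return NULL;
}

/* Start n_threads workers, wait for them, and return the total call count. */
long run_workers(int n_threads, int calls_per_thread)
{
    pthread_t threads[MAX_WORKERS];
    int started = 0;

    assert(n_threads > 0 && n_threads <= MAX_WORKERS);
    unsafe_call_count = 0; /* no worker threads exist yet, so this is safe */

    for (int i = 0; i < n_threads; ++i) {
        if (pthread_create(&threads[i], NULL, worker, &calls_per_thread) != 0) {
            break; /* stop launching on failure; join what was started */
        }
        ++started;
    }
    for (int i = 0; i < started; ++i) {
        pthread_join(threads[i], NULL);
    }
    return unsafe_call_count;
}
```

With the lock in place, `run_workers(4, 1000)` counts 4000 calls when all four threads start. The mutex only covers the host-side call, so the thread-safe routines can still be called from other threads without taking it.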

Deprecated

This library provided support for the creation of scratch images to achieve better performance on older AMD APP SDKs. However, memory buffers now give the same performance as scratch images in the current SDKs. Scratch images are being deprecated, and users are advised not to use them in new applications.