Class CSRPointer
- java.lang.Object
-
- org.apache.sysds.runtime.instructions.gpu.context.CSRPointer
-
public class CSRPointer extends Object
Compressed Sparse Row (CSR) format for CUDA.
Generalized matrix multiply is implemented for the CSR format in the cuSPARSE library, among other operations.
Since we assume that the matrix is stored with zero-based indexing (i.e. CUSPARSE_INDEX_BASE_ZERO), the matrix
1.0 4.0 0.0 0.0 0.0
0.0 2.0 3.0 0.0 0.0
5.0 0.0 0.0 7.0 8.0
0.0 0.0 9.0 0.0 6.0
is stored as
val = 1.0 4.0 2.0 3.0 5.0 7.0 8.0 9.0 6.0
rowPtr = 0 2 4 7 9
colInd = 0 1 1 2 0 3 4 2 4
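The layout described above can be reproduced on the host with plain Java arrays. The following is a minimal sketch, not part of CSRPointer: the class name CsrFromDense is hypothetical, and it only illustrates how val, rowPtr and colInd relate to a dense matrix under zero-based indexing.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the SystemDS API.
final class CsrFromDense {
    final double[] val;   // non-zero values, row by row
    final int[] rowPtr;   // start offset of each row, plus nnz as the last entry
    final int[] colInd;   // column index of each non-zero value

    CsrFromDense(double[][] dense) {
        int rows = dense.length;
        rowPtr = new int[rows + 1];
        // First pass: count non-zeros so that rowPtr can be filled.
        int nnz = 0;
        for (int i = 0; i < rows; i++) {
            rowPtr[i] = nnz;
            for (double v : dense[i]) {
                if (v != 0.0) nnz++;
            }
        }
        rowPtr[rows] = nnz;
        // Second pass: copy the values and their column indices.
        val = new double[nnz];
        colInd = new int[nnz];
        int k = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < dense[i].length; j++) {
                if (dense[i][j] != 0.0) {
                    val[k] = dense[i][j];
                    colInd[k] = j;
                    k++;
                }
            }
        }
    }

    public static void main(String[] args) {
        CsrFromDense csr = new CsrFromDense(new double[][] {
            {1.0, 4.0, 0.0, 0.0, 0.0},
            {0.0, 2.0, 3.0, 0.0, 0.0},
            {5.0, 0.0, 0.0, 7.0, 8.0},
            {0.0, 0.0, 9.0, 0.0, 6.0}
        });
        System.out.println("val    = " + Arrays.toString(csr.val));
        System.out.println("rowPtr = " + Arrays.toString(csr.rowPtr));
        System.out.println("colInd = " + Arrays.toString(csr.colInd));
    }
}
```

Row i occupies the index range rowPtr[i] (inclusive) to rowPtr[i + 1] (exclusive) in both val and colInd, which is why rowPtr has one more entry than there are rows.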
-
-
Field Summary
Fields (modifier and type, field, description)
- jcuda.Pointer colInd - integer array of nnz values' column indices
- jcuda.jcusparse.cusparseMatDescr descr - descriptor of matrix, only CUSPARSE_MATRIX_TYPE_GENERAL supported
- static jcuda.jcusparse.cusparseMatDescr matrixDescriptor
- long nnz - Number of non zeroes
- jcuda.Pointer rowPtr - integer array of start of all rows and end of last row + 1
- jcuda.Pointer val - double array of non zero values
-
Method Summary
Methods (modifier and type, method, description)
- static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows)
- static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows, boolean initialize) - Factory method to allocate an empty CSR sparse matrix on the GPU
- static CSRPointer allocateForDgeam(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, CSRPointer B, int m, int n) - Estimates the number of non-zero elements from the results of a sparse cusparseDgeam operation C = a op(A) + b op(B)
- static CSRPointer allocateForMatrixMultiply(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, int transA, CSRPointer B, int transB, int m, int n, int k) - Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B and returns the CSRPointer to C with the appropriate GPU memory.
- CSRPointer clone(int rows)
- static void copyPtrToHost(CSRPointer src, int rows, long nnz, int[] rowPtr, int[] colInd) - Static method to copy a CSR sparse matrix from device to host
- static void copyToDevice(GPUContext gCtx, CSRPointer dest, int rows, long nnz, int[] rowPtr, int[] colInd, double[] values) - Static method to copy a CSR sparse matrix from host to device
- void deallocate() - Calls cudaFree lazily on the allocated Pointer instances
- void deallocate(boolean eager) - Calls cudaFree lazily or eagerly on the allocated Pointer instances
- static long estimateSize(long nnz2, long rows) - Estimate the size of a CSR matrix in GPU memory. The size of the pointers is not needed and is not added in.
- static jcuda.jcusparse.cusparseMatDescr getDefaultCuSparseMatrixDescriptor()
- boolean isUltraSparse(int rows, int cols) - Check for ultra sparsity
- jcuda.Pointer toColumnMajorDenseMatrix(jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.jcublas.cublasHandle cublasHandle, int rows, int cols, String instName) - Copies this CSR matrix on the GPU to a dense column-major matrix on the GPU.
- static int toIntExact(long l)
- String toString()
-
-
-
Field Detail
-
matrixDescriptor
public static jcuda.jcusparse.cusparseMatDescr matrixDescriptor
-
nnz
public long nnz
Number of non zeroes
-
val
public jcuda.Pointer val
double array of non zero values
-
rowPtr
public jcuda.Pointer rowPtr
integer array of the start offsets of all rows, followed by the end of the last row + 1 (rows + 1 entries in total)
-
colInd
public jcuda.Pointer colInd
integer array of nnz values' column indices
-
descr
public jcuda.jcusparse.cusparseMatDescr descr
descriptor of matrix, only CUSPARSE_MATRIX_TYPE_GENERAL supported
-
-
Method Detail
-
toIntExact
public static int toIntExact(long l)
-
getDefaultCuSparseMatrixDescriptor
public static jcuda.jcusparse.cusparseMatDescr getDefaultCuSparseMatrixDescriptor()
- Returns:
- Singleton default matrix descriptor object (set with CUSPARSE_MATRIX_TYPE_GENERAL, CUSPARSE_INDEX_BASE_ZERO)
-
estimateSize
public static long estimateSize(long nnz2, long rows)
Estimate the size of a CSR matrix in GPU memory. The size of the pointers is not needed and is not added in.
- Parameters:
- nnz2 - number of non-zeroes
- rows - number of rows
- Returns:
- size estimate
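As a rough illustration of what such an estimate involves, the sketch below adds up the three CSR arrays, assuming double-precision values and 32-bit integer indices. The class name CsrSizeEstimate is hypothetical, and the actual estimateSize may account for further items (for example the matrix descriptor) or a different value precision.

```java
// Hypothetical sketch; the real CSRPointer.estimateSize may differ in detail.
final class CsrSizeEstimate {
    static long estimateSize(long nnz, long rows) {
        long valBytes = nnz * Double.BYTES;             // one double per non-zero (assumed precision)
        long rowPtrBytes = (rows + 1) * Integer.BYTES;  // one int per row, plus the closing entry
        long colIndBytes = nnz * Integer.BYTES;         // one int per non-zero
        return valBytes + rowPtrBytes + colIndBytes;
    }

    public static void main(String[] args) {
        // The 4 x 5 example matrix from the class description has 9 non-zeroes.
        System.out.println(estimateSize(9, 4) + " bytes");
    }
}
```

For the example matrix this gives 72 + 20 + 36 = 128 bytes under the stated assumptions.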
-
copyToDevice
public static void copyToDevice(GPUContext gCtx, CSRPointer dest, int rows, long nnz, int[] rowPtr, int[] colInd, double[] values)
Static method to copy a CSR sparse matrix from host to device
- Parameters:
- gCtx - GPUContext
- dest - [input] destination location (on GPU)
- rows - number of rows
- nnz - number of non-zeroes
- rowPtr - integer array of row pointers
- colInd - integer array of column indices
- values - double array of non-zero values
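Before handing host arrays to a copy routine like this one, it can help to check that they form a consistent CSR structure. The sketch below is a hypothetical host-side helper (CsrHostCheck is not part of CSRPointer, and the extra cols argument is an assumption added so that column indices can be range-checked).

```java
// Hypothetical sanity check for host-side CSR arrays; not part of the SystemDS API.
final class CsrHostCheck {
    static void validate(int rows, int cols, long nnz, int[] rowPtr, int[] colInd, double[] values) {
        if (rowPtr.length < rows + 1)
            throw new IllegalArgumentException("rowPtr needs rows + 1 entries");
        if (colInd.length < nnz || values.length < nnz)
            throw new IllegalArgumentException("colInd and values need nnz entries");
        if (rowPtr[0] != 0 || rowPtr[rows] != nnz)
            throw new IllegalArgumentException("rowPtr must start at 0 and end at nnz");
        // Row offsets must never decrease.
        for (int i = 0; i < rows; i++) {
            if (rowPtr[i] > rowPtr[i + 1])
                throw new IllegalArgumentException("rowPtr must be non-decreasing");
        }
        // Zero-based column indices must lie in [0, cols).
        for (int k = 0; k < nnz; k++) {
            if (colInd[k] < 0 || colInd[k] >= cols)
                throw new IllegalArgumentException("column index out of range at position " + k);
        }
    }

    public static void main(String[] args) {
        int[] rowPtr = {0, 2, 4, 7, 9};
        int[] colInd = {0, 1, 1, 2, 0, 3, 4, 2, 4};
        double[] values = {1.0, 4.0, 2.0, 3.0, 5.0, 7.0, 8.0, 9.0, 6.0};
        validate(4, 5, 9, rowPtr, colInd, values);
        System.out.println("CSR arrays are consistent");
    }
}
```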
-
copyPtrToHost
public static void copyPtrToHost(CSRPointer src, int rows, long nnz, int[] rowPtr, int[] colInd)
Static method to copy a CSR sparse matrix from device to host
- Parameters:
- src - [input] source location (on GPU)
- rows - [input] number of rows
- nnz - [input] number of non-zeroes
- rowPtr - [output] pre-allocated integer array of row pointers of size (rows+1)
- colInd - [output] pre-allocated integer array of column indices of size nnz
-
allocateForDgeam
public static CSRPointer allocateForDgeam(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, CSRPointer B, int m, int n)
Estimates the number of non-zero elements from the results of a sparse cusparseDgeam operation C = a op(A) + b op(B)
- Parameters:
- gCtx - a valid GPUContext
- handle - a valid cusparseHandle
- A - Sparse Matrix A on GPU
- B - Sparse Matrix B on GPU
- m - Rows in A
- n - Columns in B
- Returns:
- CSR (compressed sparse row) pointer
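To see why the non-zero count of a sum has to be determined before C can be allocated, consider the structure of C = A + B: each row of C holds the union of the column indices of the corresponding rows of A and B. The sketch below computes the resulting rowPtr on the host. It is a hypothetical illustration (CsrAddNnz is not part of CSRPointer), it assumes column indices are sorted within each row, and it ignores numerical cancellation; the library performs this step on the GPU.

```java
import java.util.Arrays;

// Hypothetical host-side illustration of the structure of C = A + B in CSR form.
final class CsrAddNnz {
    static int[] sumRowPtr(int rows, int[] aRowPtr, int[] aColInd, int[] bRowPtr, int[] bColInd) {
        int[] cRowPtr = new int[rows + 1];
        for (int i = 0; i < rows; i++) {
            int p = aRowPtr[i], pEnd = aRowPtr[i + 1];
            int q = bRowPtr[i], qEnd = bRowPtr[i + 1];
            int count = 0;
            // Merge the two sorted index lists, counting each distinct column once.
            while (p < pEnd && q < qEnd) {
                if (aColInd[p] == bColInd[q]) { p++; q++; }
                else if (aColInd[p] < bColInd[q]) { p++; }
                else { q++; }
                count++;
            }
            count += (pEnd - p) + (qEnd - q); // leftovers from whichever row is longer
            cRowPtr[i + 1] = cRowPtr[i] + count;
        }
        return cRowPtr;
    }

    public static void main(String[] args) {
        // A: the 4 x 5 example matrix; B: ones at (0,0), (1,1), (2,2), (3,3).
        int[] aRowPtr = {0, 2, 4, 7, 9};
        int[] aColInd = {0, 1, 1, 2, 0, 3, 4, 2, 4};
        int[] bRowPtr = {0, 1, 2, 3, 4};
        int[] bColInd = {0, 1, 2, 3};
        int[] cRowPtr = sumRowPtr(4, aRowPtr, aColInd, bRowPtr, bColInd);
        System.out.println("rowPtr of C = " + Arrays.toString(cRowPtr));
        System.out.println("nnz of C    = " + cRowPtr[4]);
    }
}
```

The last entry of the computed rowPtr is the total number of non-zeroes in C, which determines how large val and colInd must be.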
-
allocateForMatrixMultiply
public static CSRPointer allocateForMatrixMultiply(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, int transA, CSRPointer B, int transB, int m, int n, int k)
Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B and returns the CSRPointer to C with the appropriate GPU memory.
- Parameters:
- gCtx - a valid GPUContext
- handle - a valid cusparseHandle
- A - Sparse Matrix A on GPU
- transA - 'T' if A is to be transposed, 'N' otherwise
- B - Sparse Matrix B on GPU
- transB - 'T' if B is to be transposed, 'N' otherwise
- m - Rows in A
- n - Columns in B
- k - Columns in A / Rows in B
- Returns:
- a CSRPointer instance that encapsulates the CSR matrix on GPU
-
allocateEmpty
public static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows, boolean initialize)
Factory method to allocate an empty CSR sparse matrix on the GPU
- Parameters:
- gCtx - a valid GPUContext
- nnz2 - number of non-zeroes
- rows - number of rows
- initialize - memset to zero?
- Returns:
- a CSRPointer instance that encapsulates the CSR matrix on GPU
-
allocateEmpty
public static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows)
-
clone
public CSRPointer clone(int rows)
-
isUltraSparse
public boolean isUltraSparse(int rows, int cols)
Check for ultra sparsity
- Parameters:
- rows - number of rows
- cols - number of columns
- Returns:
- true if ultra sparse
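A check of this kind typically compares the fill ratio nnz / (rows * cols) against a small cut-off. The sketch below illustrates the idea only: the class name UltraSparsityCheck and the threshold value are hypothetical and are not taken from the library, whose actual criterion may differ.

```java
// Hypothetical illustration of a sparsity check; the threshold is an assumption.
final class UltraSparsityCheck {
    // Assumed cut-off for illustration only; not the value used by the library.
    static final double ASSUMED_THRESHOLD = 1e-4;

    static boolean isUltraSparse(long nnz, int rows, int cols, double threshold) {
        // Divide step by step in double arithmetic to avoid overflow of rows * cols.
        double sparsity = (double) nnz / rows / cols;
        return sparsity < threshold;
    }

    public static void main(String[] args) {
        // The 4 x 5 example matrix with 9 non-zeroes is far from ultra sparse.
        System.out.println(isUltraSparse(9, 4, 5, ASSUMED_THRESHOLD));
        // Ten non-zeroes in a 1000 x 1000 matrix fall below the assumed cut-off.
        System.out.println(isUltraSparse(10, 1000, 1000, ASSUMED_THRESHOLD));
    }
}
```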
-
toColumnMajorDenseMatrix
public jcuda.Pointer toColumnMajorDenseMatrix(jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.jcublas.cublasHandle cublasHandle, int rows, int cols, String instName)
Copies this CSR matrix on the GPU to a dense column-major matrix on the GPU. This is a temporary matrix for operations such as cusparseDcsrmv. Since the allocated matrix is temporary, bookkeeping is not updated. The caller is responsible for calling "free" on the returned Pointer object.
- Parameters:
- cusparseHandle - a valid cusparseHandle
- cublasHandle - a valid cublasHandle
- rows - number of rows in this CSR matrix
- cols - number of columns in this CSR matrix
- instName - name of the invoking instruction to record Statistics.
- Returns:
- A Pointer to the allocated dense matrix (in column-major format)
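The column-major layout of the result can be illustrated on the host: element (i, j) of a rows x cols matrix is stored at index j * rows + i. The sketch below is a hypothetical plain-Java equivalent of the conversion (CsrToColumnMajor is not part of CSRPointer; the real method performs the conversion on the GPU).

```java
import java.util.Arrays;

// Hypothetical host-side illustration of CSR to dense column-major conversion.
final class CsrToColumnMajor {
    static double[] toColumnMajor(int rows, int cols, double[] val, int[] rowPtr, int[] colInd) {
        double[] dense = new double[rows * cols]; // zero-initialized by Java
        for (int i = 0; i < rows; i++) {
            for (int k = rowPtr[i]; k < rowPtr[i + 1]; k++) {
                // Column-major: consecutive entries of one column are adjacent in memory.
                dense[colInd[k] * rows + i] = val[k];
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        double[] val = {1.0, 4.0, 2.0, 3.0, 5.0, 7.0, 8.0, 9.0, 6.0};
        int[] rowPtr = {0, 2, 4, 7, 9};
        int[] colInd = {0, 1, 1, 2, 0, 3, 4, 2, 4};
        System.out.println(Arrays.toString(toColumnMajor(4, 5, val, rowPtr, colInd)));
    }
}
```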
-
deallocate
public void deallocate()
Calls cudaFree lazily on the allocated Pointer instances
-
deallocate
public void deallocate(boolean eager)
Calls cudaFree lazily or eagerly on the allocated Pointer instances
- Parameters:
- eager - whether to do eager or lazy cudaFrees
-
-