
Linear Algebra Tools

GPU-accelerated linear algebra operations powered by ArrayFire.

Overview

Matrix Operations: multiplication, inversion, decomposition
Eigenvalue Problems: eigenvalues, eigenvectors, SVD
Solvers: linear systems, least squares
Utilities: norms, ranks, condition numbers

Matrix Operations

import cyxwiz.linalg as la

# Matrix multiplication
C = la.matmul(A, B)
C = A @ B  # Operator overload

# Transpose
At = la.transpose(A)
At = A.T

# Inverse
A_inv = la.inv(A)

# Determinant
det = la.det(A)

# Trace
tr = la.trace(A)
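
The calls above take matrices `A` and `B` that are not defined in the snippet. As a rough sanity check, the sketch below builds two small matrices and verifies a few identities these operations should satisfy. It uses NumPy as a stand-in for `cyxwiz.linalg` (an assumption made only so the example runs anywhere; it says nothing about how CyxWiz implements these calls).

```python
import numpy as np  # stand-in for cyxwiz.linalg (assumption)

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
B = np.array([[1.0, 2.0],
              [0.0, 1.0]])

C = A @ B                      # matrix product
A_inv = np.linalg.inv(A)       # inverse (A must be non-singular)
det = np.linalg.det(A)         # 4*3 - 1*2
tr = np.trace(A)               # 4 + 3

# Identities that should hold up to floating-point rounding
identity_ok = np.allclose(A @ A_inv, np.eye(2))           # A A^-1 = I
transpose_ok = np.allclose((A @ B).T, B.T @ A.T)          # (AB)^T = B^T A^T
det_ok = np.isclose(np.linalg.det(C), det * np.linalg.det(B))  # det(AB) = det(A) det(B)
```

Comparing with `allclose` rather than `==` matters here, because inversion and determinants accumulate rounding error.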

Matrix Decompositions

# SVD (Singular Value Decomposition)
U, S, Vt = la.svd(A)
U, S, Vt = la.svd(A, full_matrices=False)  # Economy (reduced) SVD

# Eigendecomposition
eigenvalues, eigenvectors = la.eig(A)
eigenvalues = la.eigvals(A)  # Values only

# LU Decomposition
P, L, U = la.lu(A)

# QR Decomposition
Q, R = la.qr(A)

# Cholesky (for symmetric positive definite matrices)
L = la.cholesky(A)
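
Each decomposition can be checked by multiplying the factors back together. The following is a minimal sketch using NumPy in place of `cyxwiz.linalg` (an assumption; factor shapes and return conventions in CyxWiz may differ). It uses `eigh`, NumPy's symmetric eigensolver, rather than a general `eig`, and leaves out LU because NumPy itself has no LU routine.

```python
import numpy as np  # stand-in for cyxwiz.linalg (assumption)

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 3))        # tall rectangular matrix
A = M.T @ M + 3.0 * np.eye(3)          # symmetric positive definite by construction

# Economy SVD: U is 5x3, S holds 3 singular values, Vt is 3x3
U, S, Vt = np.linalg.svd(M, full_matrices=False)
svd_ok = np.allclose(U @ np.diag(S) @ Vt, M)

# Reduced QR: Q has orthonormal columns, R is upper triangular
Q, R = np.linalg.qr(M)
qr_ok = np.allclose(Q @ R, M) and np.allclose(Q.T @ Q, np.eye(3))

# Cholesky: A = L L^T with L lower triangular
L = np.linalg.cholesky(A)
chol_ok = np.allclose(L @ L.T, A)

# Symmetric eigendecomposition: A V = V diag(w)
w, V = np.linalg.eigh(A)
eig_ok = np.allclose(A @ V, V * w)
```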

Linear System Solvers

# Direct solve (Ax = b)
x = la.solve(A, b)

# Least squares
x, residuals, rank, s = la.lstsq(A, b)

# Pseudo-inverse
A_pinv = la.pinv(A)
x = A_pinv @ b
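
To make the difference between the three approaches concrete, here is a small worked example: a square system solved directly, and an overdetermined line fit solved by least squares and by the pseudo-inverse. NumPy stands in for `cyxwiz.linalg` (an assumption); the four-value return of `lstsq` mirrors the snippet above.

```python
import numpy as np  # stand-in for cyxwiz.linalg (assumption)

# Square system: 3x + y = 9, x + 2y = 8
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)

# Overdetermined system: fit y = c0 + c1 * t to four points
t = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * t                               # points lie exactly on a line
A_ls = np.column_stack([np.ones_like(t), t])    # 4x2 design matrix
coef, residuals, rank, s = np.linalg.lstsq(A_ls, y, rcond=None)

# The pseudo-inverse gives the same least-squares solution
coef_pinv = np.linalg.pinv(A_ls) @ y
```

For a single right-hand side, a direct solve or `lstsq` is usually preferable to forming the pseudo-inverse explicitly, which costs more and is mainly useful when the same matrix is reused many times.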

Norms and Properties

# Norms
fro_norm = la.norm(A, 'fro')  # Frobenius
spectral_norm = la.norm(A, 2)  # Spectral
nuclear_norm = la.norm(A, 'nuc')  # Nuclear

# Properties
rank = la.matrix_rank(A)
cond = la.cond(A)  # Condition number
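
All of these quantities can be read off the singular values of `A`, which is a convenient way to cross-check them. The sketch below does this with NumPy as a stand-in for `cyxwiz.linalg` (an assumption).

```python
import numpy as np  # stand-in for cyxwiz.linalg (assumption)

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Singular values, sorted from largest to smallest
s = np.linalg.svd(A, compute_uv=False)

fro_norm = np.linalg.norm(A, 'fro')      # sqrt(sum of s_i^2)
spectral_norm = np.linalg.norm(A, 2)     # largest singular value
nuclear_norm = np.linalg.norm(A, 'nuc')  # sum of singular values
cond = np.linalg.cond(A)                 # s_max / s_min (2-norm condition number)
rank = np.linalg.matrix_rank(A)          # count of singular values above a tolerance

checks = [
    np.isclose(fro_norm, np.sqrt(np.sum(s ** 2))),
    np.isclose(spectral_norm, s[0]),
    np.isclose(nuclear_norm, np.sum(s)),
    np.isclose(cond, s[0] / s[-1]),
]
```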

Performance Comparison

Benchmarks for 1000x1000 matrix multiplication:

| Backend | Time | Speedup |
| --- | --- | --- |
| CPU (single-threaded) | 2,500 ms | 1x |
| CPU (multi-threaded) | 450 ms | 5.6x |
| OpenCL (Intel UHD) | 120 ms | 20.8x |
| CUDA (RTX 3060) | 15 ms | 167x |
| CUDA (RTX 4090) | 5 ms | 500x |
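
The speedup column is the single-threaded CPU time divided by each backend's time. A minimal timing harness for reproducing the CPU rows might look like the sketch below. It is an illustration only: `time_matmul` and `speedup` are hypothetical helper names, NumPy is used instead of CyxWiz, and results will vary with hardware and BLAS configuration.

```python
import time
import numpy as np  # stand-in for cyxwiz.linalg (assumption)

def time_matmul(n, repeats=3, dtype=np.float32):
    """Return the best wall-clock time in milliseconds for an n x n matmul."""
    rng = np.random.default_rng(0)
    A = rng.standard_normal((n, n)).astype(dtype)
    B = rng.standard_normal((n, n)).astype(dtype)
    A @ B  # warm-up run so one-time setup cost is not measured
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        A @ B
        best = min(best, time.perf_counter() - start)
    return best * 1000.0

def speedup(baseline_ms, other_ms):
    """Speedup of a backend relative to the baseline time."""
    return baseline_ms / other_ms
```

Calling `time_matmul(1000)` matches the matrix size used in the table. GPU backends typically queue work asynchronously, so a GPU timing needs a device synchronization before the timer stops; how CyxWiz exposes that is not covered here.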

Node Editor Integration

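| Node | Inputs | Outputs | GPU |
| --- | --- | --- | --- |
| MatMul | A, B | C = A @ B | Yes |
| Transpose | A | A^T | Yes |
| Inverse | A | A^-1 | Yes |
| SVD | A | U, S, V | Yes |
| Solve | A, b | x | Yes |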

Best Practices

Numerical Stability
  1. Check the condition number before solving
  2. Use a decomposition that matches the matrix structure
  3. Prefer SVD for rank-deficient matrices
  4. Use Cholesky for symmetric positive definite matrices (about 2x faster than LU)
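
One way to combine the four points above is a solver that checks conditioning first and then picks a method. The sketch below is illustrative only: `robust_solve` is a hypothetical helper, the `cond_limit` threshold of 1e8 is an arbitrary choice, and NumPy stands in for `cyxwiz.linalg` (an assumption).

```python
import numpy as np  # stand-in for cyxwiz.linalg (assumption)

def robust_solve(A, b, cond_limit=1e8):
    """Solve Ax = b, choosing a method from the matrix's properties.

    Returns (x, method_name). cond_limit is an illustrative threshold.
    """
    A = np.asarray(A, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)

    # Ill-conditioned or singular: fall back to SVD-based least squares,
    # which returns the minimum-norm solution.
    if np.linalg.cond(A) > cond_limit:
        x, *_ = np.linalg.lstsq(A, b, rcond=None)
        return x, "lstsq"

    # Symmetric: try Cholesky, which fails unless A is positive definite.
    if np.allclose(A, A.T):
        try:
            L = np.linalg.cholesky(A)
            # NumPy has no dedicated triangular solver, so the general
            # solver is used for the two triangular systems here.
            y = np.linalg.solve(L, b)
            return np.linalg.solve(L.T, y), "cholesky"
        except np.linalg.LinAlgError:
            pass

    # General case: LU-based direct solve.
    return np.linalg.solve(A, b), "lu"
```

For example, `robust_solve([[4, 1], [1, 3]], [1, 2])` takes the Cholesky path, while a singular matrix such as `[[1, 2], [2, 4]]` is routed to least squares.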
Memory Management
  1. Use in-place operations when possible
  2. Release intermediate results promptly
  3. Batch operations to minimize host-device transfers
  4. Use appropriate precision (float32 vs float64)
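
The memory effect of these choices is easy to see on the host side. The sketch below uses NumPy (an assumption; whether CyxWiz offers an `out=` parameter or in-place operators is not stated here) to show the float32 versus float64 footprint, a preallocated output buffer, and an in-place update.

```python
import numpy as np  # stand-in for cyxwiz.linalg (assumption)

n = 1000
A64 = np.ones((n, n), dtype=np.float64)
A32 = A64.astype(np.float32)

# float32 halves the footprint: 4 bytes per element instead of 8
bytes64 = A64.nbytes   # 8,000,000 bytes
bytes32 = A32.nbytes   # 4,000,000 bytes

# Reuse a preallocated buffer instead of allocating a new result
B = np.full((n, n), 2.0, dtype=np.float32)
out = np.empty_like(A32)
np.matmul(A32, B, out=out)

# In-place update: modifies A32 without creating a temporary copy
A32 *= 0.5

# Drop the reference to a large intermediate once it is no longer needed
del A64
```

Dropping to float32 trades precision for memory and speed, so it is worth checking the condition number of the problem before making that switch.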