Computation times

Total execution time for 10 files from all galleries: 18:04.929

| Example | Time (mm:ss.sss) | Memory (MB) |
| --- | --- | --- |
| Fused Attention (../python/tutorials/06-fused-attention.py) | 13:08.989 | 0.0 |
| Matrix Multiplication (../python/tutorials/03-matrix-multiplication.py) | 02:06.838 | 0.0 |
| Persistent Matmul (../python/tutorials/09-persistent-matmul.py) | 01:25.823 | 0.0 |
| Fused Softmax (../python/tutorials/02-fused-softmax.py) | 00:37.173 | 0.0 |
| Layer Normalization (../python/tutorials/05-layer-norm.py) | 00:29.256 | 0.0 |
| Vector Addition (../python/tutorials/01-vector-add.py) | 00:09.602 | 0.0 |
| Group GEMM (../python/tutorials/08-grouped-gemm.py) | 00:06.130 | 0.0 |
| Low-Memory Dropout (../python/tutorials/04-low-memory-dropout.py) | 00:00.794 | 0.0 |
| Libdevice (tl.extra.libdevice) function (../python/tutorials/07-extern-functions.py) | 00:00.292 | 0.0 |
| Block Scaled Matrix Multiplication (../python/tutorials/10-block-scaled-matmul.py) | 00:00.033 | 0.0 |
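
As a quick sanity check, the per-example times can be summed and compared with the reported total. The sketch below is only an illustration: the helper names `parse_ms` and `format_ms` are hypothetical, and it assumes every timestamp has the form `mm:ss.sss` with exactly three fractional digits, as in the listing above.

```python
# Per-example times copied from the listing above (mm:ss.sss).
TIMES = [
    "13:08.989", "02:06.838", "01:25.823", "00:37.173", "00:29.256",
    "00:09.602", "00:06.130", "00:00.794", "00:00.292", "00:00.033",
]


def parse_ms(stamp: str) -> int:
    """Convert 'mm:ss.sss' to integer milliseconds (assumes 3 fractional digits)."""
    minutes, seconds = stamp.split(":")
    whole, frac = seconds.split(".")
    return (int(minutes) * 60 + int(whole)) * 1000 + int(frac)


def format_ms(total_ms: int) -> str:
    """Format integer milliseconds back into 'mm:ss.sss'."""
    minutes, rest = divmod(total_ms, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


# Integer arithmetic avoids floating-point rounding in the sum.
total_ms = sum(parse_ms(t) for t in TIMES)
print(format_ms(total_ms))
```

The rows sum to 18:04.930, one millisecond more than the reported total of 18:04.929. A difference of that size is what one would expect if each row was rounded to the millisecond independently of the total.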