NVGPUOps

`nvgpu.cluster_id` (triton::nvgpu::ClusterCTAIdOp)

Syntax:

operation ::= `nvgpu.cluster_id` attr-dict

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results:

| Result | Description |
| --- | --- |
| `result` | 32-bit signless integer |
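As a hedged illustration of the syntax above: the op takes no operands, and because it implements InferTypeOpInterface its 32-bit integer result type is not spelled out. The function name `@example`, the SSA name `%cta_id`, and the use of `llvm.func` as a container are assumptions made for this sketch, not something this page prescribes.

```mlir
// Hypothetical wrapper function; only the nvgpu.cluster_id line reflects this page.
llvm.func @example() -> i32 {
  // No operands and no trailing type: the i32 result type is inferred.
  %cta_id = nvgpu.cluster_id
  llvm.return %cta_id : i32
}
```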

`nvgpu.ld_acquire` (triton::nvgpu::LoadAcquireOp)

Syntax:

operation ::= `nvgpu.ld_acquire` $sem `,` $scope `,` $addr (`,` $mask^)? attr-dict `:` functional-type($addr, $result)

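Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::SideEffects::DefaultResource}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| `sem` | ::mlir::triton::nvgpu::MemSemanticAttr | allowed 32-bit signless integer cases: 1, 2, 3, 4 |
| `scope` | ::mlir::triton::nvgpu::MemSyncScopeAttr | allowed 32-bit signless integer cases: 1, 2, 3 |

Operands:

| Operand | Description |
| --- | --- |
| `addr` | LLVM pointer in address space 1 |
| `mask` | 1-bit signless integer |

Results:

| Result | Description |
| --- | --- |
| `result` | floating-point or integer |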
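The sketch below shows how the assembly format above might read with the optional mask present. This page lists only the numeric enum cases for `sem` and `scope`, so the keywords `acquire` and `gpu` are assumed spellings and should be checked against the dialect's enum definitions. The function name and SSA names are hypothetical.

```mlir
// Hypothetical wrapper; %addr, %mask and @load_flag are illustrative names.
llvm.func @load_flag(%addr: !llvm.ptr<1>, %mask: i1) -> i32 {
  // ASSUMPTION: `acquire` and `gpu` are assumed keyword spellings of the
  // sem/scope enum cases; this page documents only their numeric values.
  // The functional type names only the address type and the result type;
  // the optional mask operand carries no explicit type in the syntax.
  %v = nvgpu.ld_acquire acquire, gpu, %addr, %mask : (!llvm.ptr<1>) -> i32
  llvm.return %v : i32
}
```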

`nvgpu.ldmatrix` (triton::nvgpu::LoadMatrixOp)

Syntax:

operation ::= `nvgpu.ldmatrix` $addr `,` $shape `,` $bit_width attr-dict `:` functional-type($addr, $result)

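Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::SideEffects::DefaultResource}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| `shape` | ::mlir::triton::nvgpu::LoadMatrixShapeAttr | allowed 32-bit signless integer cases: 0, 1 |
| `bit_width` | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| `trans` | ::mlir::UnitAttr | unit attribute |

Operands:

| Operand | Description |
| --- | --- |
| `addr` | LLVM pointer in address space 3 |

Results:

| Result | Description |
| --- | --- |
| `result` | LLVM structure type or 32-bit signless integer |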

`nvgpu.tensor_memory_base` (triton::nvgpu::TensorMemoryBaseAddress)

Syntax:

operation ::= `nvgpu.tensor_memory_base` attr-dict

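Op that represents the base address of tensor memory in a kernel. It is used to simplify lowering from TritonGPU to LLVM.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results:

| Result | Description |
| --- | --- |
| `result` | LLVM pointer in address space 6 |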
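A minimal sketch of the syntax above, assuming an `llvm.func` container and the hypothetical names `@tmem_base` and `%base`. The op has no operands, and its pointer result in address space 6 is inferred rather than written.

```mlir
// Hypothetical wrapper; only the nvgpu.tensor_memory_base line reflects this page.
llvm.func @tmem_base() -> !llvm.ptr<6> {
  // Result type (!llvm.ptr<6>) is inferred, so no type annotation follows the op.
  %base = nvgpu.tensor_memory_base
  llvm.return %base : !llvm.ptr<6>
}
```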

`nvgpu.wgmma` (triton::nvgpu::WGMMAOp)

Syntax:

operation ::= `nvgpu.wgmma` $opA `,` $opB `,` $useC (`,` $opC^)? attr-dict `:` functional-type(operands, $res)

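Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| `m` | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| `n` | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| `k` | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| `eltTypeC` | ::mlir::triton::nvgpu::WGMMAEltTypeAttr | wgmma operand type, either 's8', 's32', 'e4m3', 'e5m2', 'f16', 'bf16', 'tf32', or 'f32' |
| `eltTypeA` | ::mlir::triton::nvgpu::WGMMAEltTypeAttr | wgmma operand type, either 's8', 's32', 'e4m3', 'e5m2', 'f16', 'bf16', 'tf32', or 'f32' |
| `eltTypeB` | ::mlir::triton::nvgpu::WGMMAEltTypeAttr | wgmma operand type, either 's8', 's32', 'e4m3', 'e5m2', 'f16', 'bf16', 'tf32', or 'f32' |
| `layoutA` | ::mlir::triton::nvgpu::WGMMALayoutAttr | wgmma layout, either 'row' or 'col' |
| `layoutB` | ::mlir::triton::nvgpu::WGMMALayoutAttr | wgmma layout, either 'row' or 'col' |

Operands:

| Operand | Description |
| --- | --- |
| `opA` | wgmma operand A/B type |
| `opB` | wgmma operand A/B type |
| `useC` | 1-bit signless integer |
| `opC` | LLVM structure type |

Results:

| Result | Description |
| --- | --- |
| `res` | LLVM structure type |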

`nvgpu.wgmma_wait_group` (triton::nvgpu::WGMMAWaitGroupOp)

Syntax:

operation ::= `nvgpu.wgmma_wait_group` $input attr-dict `:` type($input)

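Interfaces: InferTypeOpInterface

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| `pendings` | ::mlir::IntegerAttr | 32-bit signless integer attribute |

Operands:

| Operand | Description |
| --- | --- |
| `input` | LLVM structure type |

Results:

| Result | Description |
| --- | --- |
| `output` | LLVM structure type |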
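The following sketch illustrates the syntax above, with `pendings` supplied through the attribute dictionary. The two-element struct is a deliberately small stand-in for an accumulator, and the names `@wait`, `%acc`, and `%done` are hypothetical. The sketch also assumes that the inferred result type matches the input type, which this page does not state explicitly.

```mlir
// Hypothetical wrapper; the struct type is shortened for illustration.
llvm.func @wait(%acc: !llvm.struct<(f32, f32)>) -> !llvm.struct<(f32, f32)> {
  // `pendings` goes in attr-dict; 0 is an illustrative value.
  // Only the input type is written; the result type is inferred
  // (ASSUMPTION: inferred to be the same struct type as the input).
  %done = nvgpu.wgmma_wait_group %acc {pendings = 0 : i32} : !llvm.struct<(f32, f32)>
  llvm.return %done : !llvm.struct<(f32, f32)>
}
```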

`nvgpu.warp_id` (triton::nvgpu::WarpIdOp)

Syntax:

operation ::= `nvgpu.warp_id` attr-dict

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results:

| Result | Description |
| --- | --- |
| `result` | 32-bit signless integer |
