[arch] internal/simdgen: update documentations

Junyang Shao (Gerrit)

Jun 7, 2025, 10:17:01 PM
to goph...@pubsubhelper.golang.org, golang-co...@googlegroups.com

Junyang Shao has uploaded the change for review

Commit message

internal/simdgen: update documentations

This CL is generated by Gemini
Change-Id: If3323c3a23b0d2197390d1a239bdcbedd60615d2
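
Reviewer's sketch (not part of the CL): the gen_utility.go and gen_simdTypes.go hunks in this change work together. When a masked op is split, the unmasked copy takes its text from DocUnmasked, and the stub comment becomes a block comment with the documentation first. The Go program below is a minimal approximation of that flow. The opDef type and the unmasked and docComment helpers are hypothetical names introduced only for illustration, and VPADDD and AVX2 are example values, not output taken from simdgen.

```go
package main

import (
	"fmt"
	"strings"
)

// opDef is a hypothetical, trimmed-down stand-in for simdgen's operation
// record; only the fields touched by this change are modeled.
type opDef struct {
	Go            string
	Documentation *string // doc for the op as named in the YAML
	DocUnmasked   *string // doc to use once "Masked" is stripped
}

// unmasked mirrors the gen_utility.go hunk: copy the op, strip "Masked"
// from its name, and switch the documentation to the unmasked text.
func unmasked(op opDef) (opDef, error) {
	if !strings.HasPrefix(op.Go, "Masked") {
		return opDef{}, fmt.Errorf("not a masked operation: %s", op.Go)
	}
	op2 := op
	op2.Go = strings.ReplaceAll(op2.Go, "Masked", "")
	op2.Documentation = op2.DocUnmasked
	return op2, nil
}

// docComment approximates the layout of the updated stub template:
// an optional documentation paragraph, then the Asm/Arch line.
func docComment(op opDef, asm, ext string) string {
	var b strings.Builder
	b.WriteString("/*")
	if op.Documentation != nil {
		b.WriteString("\n" + *op.Documentation)
	}
	fmt.Fprintf(&b, "\nAsm: %s, Arch: %s\n*/", asm, ext)
	return b.String()
}

func main() {
	doc := "MaskedAdd adds corresponding elements of two vectors, masked."
	docU := "Add adds corresponding elements of two vectors."
	masked := opDef{Go: "MaskedAdd", Documentation: &doc, DocUnmasked: &docU}

	plain, err := unmasked(masked)
	if err != nil {
		panic(err)
	}
	// Illustrative instruction and extension names.
	fmt.Println(docComment(masked, "VPADDD", "AVX512"))
	fmt.Println(docComment(plain, "VPADDD", "AVX2"))
}
```

One consequence worth checking in review: under this scheme a masked op that omits docUnmasked would leave its generated unmasked variant with no documentation paragraph at all, because the nil pointer is copied over the masked text.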

Change diff

diff --git a/internal/simdgen/categories.yaml b/internal/simdgen/categories.yaml
index 1ffe16d..05ae129 100644
--- a/internal/simdgen/categories.yaml
+++ b/internal/simdgen/categories.yaml
@@ -2,75 +2,131 @@
- go: Add
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Add adds corresponding elements of two vectors.
- go: SaturatedAdd
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ SaturatedAdd adds corresponding elements of two vectors with saturation.
- go: MaskedAdd
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedAdd adds corresponding elements of two vectors, masked.
+ docUnmasked: !string |-
+ Add adds corresponding elements of two vectors.
- go: MaskedSaturatedAdd
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedSaturatedAdd adds corresponding elements of two vectors with saturation, masked.
+ docUnmasked: !string |-
+ SaturatedAdd adds corresponding elements of two vectors with saturation.
- go: Sub
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ Sub subtracts corresponding elements of two vectors.
- go: SaturatedSub
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ SaturatedSub subtracts corresponding elements of two vectors with saturation.
- go: MaskedSub
masked: "true"
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedSub subtracts corresponding elements of two vectors, masked.
+ docUnmasked: !string |-
+ Sub subtracts corresponding elements of two vectors.
- go: MaskedSaturatedSub
masked: "true"
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedSaturatedSub subtracts corresponding elements of two vectors with saturation, masked.
+ docUnmasked: !string |-
+ SaturatedSub subtracts corresponding elements of two vectors with saturation.
- go: PairwiseAdd
commutative: "false"
extension: "AVX.*"
- documentation: "Add pairs of elements in vector x and store them in higher half of the target; Add pairs of elements in vector y and store them in lower half of the target"
+ documentation: !string |-
+ PairwiseAdd horizontally adds adjacent pairs of elements.
+ For x = [x0, x1, x2, x3] and y = [y0, y1, y2, y3], the result is [y0+y1, y2+y3, x0+x1, x2+x3].
- go: PairwiseSub
commutative: "false"
extension: "AVX.*"
- documentation: "Sub pairs of elements in vector x and store them in higher half of the target; Sub pairs of elements in vector y and store them in lower half of the target"
+ documentation: !string |-
+ PairwiseSub horizontally subtracts adjacent pairs of elements.
+ For x = [x0, x1, x2, x3] and y = [y0, y1, y2, y3], the result is [y0-y1, y2-y3, x0-x1, x2-x3].
- go: SaturatedPairwiseAdd
commutative: "false"
extension: "AVX.*"
- documentation: "Add pairs of elements in vector x and store them in higher half of the target; Add pairs of elements in vector y and store them in lower half of the target; With saturation"
+ documentation: !string |-
+ SaturatedPairwiseAdd horizontally adds adjacent pairs of elements with saturation.
+ For x = [x0, x1, x2, x3] and y = [y0, y1, y2, y3], the result is [y0+y1, y2+y3, x0+x1, x2+x3].
- go: SaturatedPairwiseSub
commutative: "false"
extension: "AVX.*"
- documentation: "Sub pairs of elements in vector x and store them in higher half of the target; Sub pairs of elements in vector y and store them in lower half of the target; With saturation"
+ documentation: !string |-
+ SaturatedPairwiseSub horizontally subtracts adjacent pairs of elements with saturation.
+ For x = [x0, x1, x2, x3] and y = [y0, y1, y2, y3], the result is [y0-y1, y2-y3, x0-x1, x2-x3].
- go: And
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ And performs a bitwise AND operation between two vectors.
- go: MaskedAnd
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedAnd performs a masked bitwise AND operation between two vectors.
+ docUnmasked: !string |-
+ And performs a bitwise AND operation between two vectors.
- go: Or
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Or performs a bitwise OR operation between two vectors.
- go: MaskedOr
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedOr performs a masked bitwise OR operation between two vectors.
+ docUnmasked: !string |-
+ Or performs a bitwise OR operation between two vectors.
- go: AndNot
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ AndNot performs a bitwise AND NOT operation between two vectors.
- go: MaskedAndNot
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedAndNot performs a masked bitwise AND NOT operation between two vectors.
+ docUnmasked: !string |-
+ AndNot performs a bitwise AND NOT operation between two vectors.
- go: Xor
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Xor performs a bitwise XOR operation between two vectors.
- go: MaskedXor
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedXor performs a masked bitwise XOR operation between two vectors.
+ docUnmasked: !string |-
+ Xor performs a bitwise XOR operation between two vectors.
# We also have PTEST and VPTERNLOG, those should be hidden from the users
# and only appear in rewrite rules.
# const imm predicate(holds for both float and int|uint):
@@ -84,311 +140,560 @@
constImm: 0
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 0 if it has;"
+ documentation: !string |-
+ Equal compares for equality.
+ Const Immediate = 0.
- go: Less
constImm: 1
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 1 if it has;"
+ documentation: !string |-
+ Less compares for less than.
+ Const Immediate = 1.
- go: LessEqual
constImm: 2
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 2 if it has;"
+ documentation: !string |-
+ LessEqual compares for less than or equal.
+ Const Immediate = 2.
- go: IsNan # For float only.
constImm: 3
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 3 if it has; Returns mask element True if either one of the input\\'s element is Nan; Please use this method as x\\.IsNan\\(x\\) to check x only;"
+ documentation: !string |-
+ IsNan checks if elements are NaN. Use as x.IsNan(x).
+ Const Immediate = 3.
- go: NotEqual
constImm: 4
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 4 if it has;"
+ documentation: !string |-
+ NotEqual compares for inequality.
+ Const Immediate = 4.
- go: GreaterEqual
constImm: 5
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 5 if it has;"
+ documentation: !string |-
+ GreaterEqual compares for greater than or equal.
+ Const Immediate = 5.
- go: Greater
constImm: 6
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 6 if it has;"
+ documentation: !string |-
+ Greater compares for greater than.
+ Const Immediate = 6.

- go: MaskedEqual
constImm: 0
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 0 if it has;"
+ documentation: !string |-
+ MaskedEqual compares for equality, masked.
+ Const Immediate = 0.
+ docUnmasked: !string |-
+ Equal compares for equality.
+ Const Immediate = 0.
- go: MaskedLess
constImm: 1
masked: "true"
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 1 if it has;"
+ documentation: !string |-
+ MaskedLess compares for less than, masked.
+ Const Immediate = 1.
+ docUnmasked: !string |-
+ Less compares for less than.
+ Const Immediate = 1.
- go: MaskedLessEqual
constImm: 2
masked: "true"
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 2 if it has;"
+ documentation: !string |-
+ MaskedLessEqual compares for less than or equal, masked.
+ Const Immediate = 2.
+ docUnmasked: !string |-
+ LessEqual compares for less than or equal.
+ Const Immediate = 2.
- go: MaskedIsNan # For float only.
constImm: 3
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 3 if it has; Returns mask element True if either one of the input\\'s element is Nan; Please use this method as x\\.IsNan\\(x\\) to check x only;"
+ documentation: !string |-
+ MaskedIsNan checks if elements are NaN, masked. Use as x.IsNan(x).
+ Const Immediate = 3.
+ docUnmasked: !string |-
+ IsNan checks if elements are NaN. Use as x.IsNan(x).
+ Const Immediate = 3.
- go: MaskedNotEqual
constImm: 4
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 4 if it has;"
+ documentation: !string |-
+ MaskedNotEqual compares for inequality, masked.
+ Const Immediate = 4.
+ docUnmasked: !string |-
+ NotEqual compares for inequality.
+ Const Immediate = 4.
- go: MaskedGreaterEqual
constImm: 5
masked: "true"
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 5 if it has;"
+ documentation: !string |-
+ MaskedGreaterEqual compares for greater than or equal, masked.
+ Const Immediate = 5.
+ docUnmasked: !string |-
+ GreaterEqual compares for greater than or equal.
+ Const Immediate = 5.
- go: MaskedGreater
constImm: 6
masked: "true"
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 6 if it has;"
+ documentation: !string |-
+ MaskedGreater compares for greater than, masked.
+ Const Immediate = 6.
+ docUnmasked: !string |-
+ Greater compares for greater than.
+ Const Immediate = 6.
- go: Div
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ Div divides elements of two vectors.
- go: MaskedDiv
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedDiv divides elements of two vectors, masked.
+ docUnmasked: !string |-
+ Div divides elements of two vectors.
- go: Sqrt
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ Sqrt computes the square root of each element.
- go: MaskedSqrt
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedSqrt computes the square root of each element, masked.
+ docUnmasked: !string |-
+ Sqrt computes the square root of each element.
- go: ApproximateReciprocal
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ ApproximateReciprocal computes an approximate reciprocal of each element.
- go: MaskedApproximateReciprocal
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedApproximateReciprocal computes an approximate reciprocal of each element, masked.
+ docUnmasked: !string |-
+ ApproximateReciprocal computes an approximate reciprocal of each element.
- go: ApproximateReciprocalOfSqrt
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ ApproximateReciprocalOfSqrt computes an approximate reciprocal of the square root of each element.
- go: MaskedApproximateReciprocalOfSqrt
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedApproximateReciprocalOfSqrt computes an approximate reciprocal of the square root of each element, masked.
+ docUnmasked: !string |-
+ ApproximateReciprocalOfSqrt computes an approximate reciprocal of the square root of each element.
- go: MaskedMulByPowOf2 # This operation is all after AVX512, the unmasked version will be generated.
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedMulByPowOf2 multiplies elements by a power of 2, masked.
+ docUnmasked: !string |-
+ MulByPowOf2 multiplies elements by a power of 2.

- go: Round
commutative: "false"
extension: "AVX.*"
constImm: 0
+ documentation: !string |-
+ Round rounds elements to the nearest integer.
+ Const Immediate = 0.
- go: MaskedRoundWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 0
masked: "true"
+ documentation: !string |-
+ MaskedRoundWithPrecision rounds elements with specified precision, masked.
+ Const Immediate = 0.
+ docUnmasked: !string |-
+ RoundWithPrecision rounds elements with specified precision.
+ Const Immediate = 0.
- go: MaskedRoundSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 8
masked: "true"
+ documentation: !string |-
+ MaskedRoundSuppressExceptionWithPrecision rounds elements with specified precision, suppressing exceptions, masked.
+ Const Immediate = 8.
+ docUnmasked: !string |-
+ RoundSuppressExceptionWithPrecision rounds elements with specified precision, suppressing exceptions.
+ Const Immediate = 8.
- go: MaskedDiffWithRoundWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 0
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithRoundWithPrecision computes the difference after rounding with specified precision, masked.
+ Const Immediate = 0.
+ docUnmasked: !string |-
+ DiffWithRoundWithPrecision computes the difference after rounding with specified precision.
+ Const Immediate = 0.
- go: MaskedDiffWithRoundSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 8
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithRoundSuppressExceptionWithPrecision computes the difference after rounding with specified precision, suppressing exceptions, masked.
+ Const Immediate = 8.
+ docUnmasked: !string |-
+ DiffWithRoundSuppressExceptionWithPrecision computes the difference after rounding with specified precision, suppressing exceptions.
+ Const Immediate = 8.

- go: Floor
commutative: "false"
extension: "AVX.*"
constImm: 1
+ documentation: !string |-
+ Floor rounds elements down to the nearest integer.
+ Const Immediate = 1.
- go: MaskedFloorWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 1
masked: "true"
+ documentation: !string |-
+ MaskedFloorWithPrecision rounds elements down with specified precision, masked.
+ Const Immediate = 1.
+ docUnmasked: !string |-
+ FloorWithPrecision rounds elements down with specified precision.
+ Const Immediate = 1.
- go: MaskedFloorSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 9
masked: "true"
+ documentation: !string |-
+ MaskedFloorSuppressExceptionWithPrecision rounds elements down with specified precision, suppressing exceptions, masked.
+ Const Immediate = 9.
+ docUnmasked: !string |-
+ FloorSuppressExceptionWithPrecision rounds elements down with specified precision, suppressing exceptions.
+ Const Immediate = 9.
- go: MaskedDiffWithFloorWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 1
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithFloorWithPrecision computes the difference after flooring with specified precision, masked.
+ Const Immediate = 1.
+ docUnmasked: !string |-
+ DiffWithFloorWithPrecision computes the difference after flooring with specified precision.
+ Const Immediate = 1.
- go: MaskedDiffWithFloorSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 9
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithFloorSuppressExceptionWithPrecision computes the difference after flooring with specified precision, suppressing exceptions, masked.
+ Const Immediate = 9.
+ docUnmasked: !string |-
+ DiffWithFloorSuppressExceptionWithPrecision computes the difference after flooring with specified precision, suppressing exceptions.
+ Const Immediate = 9.

- go: Ceil
commutative: "false"
extension: "AVX.*"
constImm: 2
+ documentation: !string |-
+ Ceil rounds elements up to the nearest integer.
+ Const Immediate = 2.
- go: MaskedCeilWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 2
masked: "true"
+ documentation: !string |-
+ MaskedCeilWithPrecision rounds elements up with specified precision, masked.
+ Const Immediate = 2.
+ docUnmasked: !string |-
+ CeilWithPrecision rounds elements up with specified precision.
+ Const Immediate = 2.
- go: MaskedCeilSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 10
masked: "true"
+ documentation: !string |-
+ MaskedCeilSuppressExceptionWithPrecision rounds elements up with specified precision, suppressing exceptions, masked.
+ Const Immediate = 10.
+ docUnmasked: !string |-
+ CeilSuppressExceptionWithPrecision rounds elements up with specified precision, suppressing exceptions.
+ Const Immediate = 10.
- go: MaskedDiffWithCeilWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 2
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithCeilWithPrecision computes the difference after ceiling with specified precision, masked.
+ Const Immediate = 2.
+ docUnmasked: !string |-
+ DiffWithCeilWithPrecision computes the difference after ceiling with specified precision.
+ Const Immediate = 2.
- go: MaskedDiffWithCeilSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 10
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithCeilSuppressExceptionWithPrecision computes the difference after ceiling with specified precision, suppressing exceptions, masked.
+ Const Immediate = 10.
+ docUnmasked: !string |-
+ DiffWithCeilSuppressExceptionWithPrecision computes the difference after ceiling with specified precision, suppressing exceptions.
+ Const Immediate = 10.

- go: Trunc
commutative: "false"
extension: "AVX.*"
constImm: 3
+ documentation: !string |-
+ Trunc truncates elements towards zero.
+ Const Immediate = 3.
- go: MaskedTruncWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 3
masked: "true"
+ documentation: !string |-
+ MaskedTruncWithPrecision truncates elements with specified precision, masked.
+ Const Immediate = 3.
+ docUnmasked: !string |-
+ TruncWithPrecision truncates elements with specified precision.
+ Const Immediate = 3.
- go: MaskedTruncSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 11
masked: "true"
+ documentation: !string |-
+ MaskedTruncSuppressExceptionWithPrecision truncates elements with specified precision, suppressing exceptions, masked.
+ Const Immediate = 11.
+ docUnmasked: !string |-
+ TruncSuppressExceptionWithPrecision truncates elements with specified precision, suppressing exceptions.
+ Const Immediate = 11.
- go: MaskedDiffWithTruncWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 3
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithTruncWithPrecision computes the difference after truncating with specified precision, masked.
+ Const Immediate = 3.
+ docUnmasked: !string |-
+ DiffWithTruncWithPrecision computes the difference after truncating with specified precision.
+ Const Immediate = 3.
- go: MaskedDiffWithTruncSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 11
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithTruncSuppressExceptionWithPrecision computes the difference after truncating with specified precision, suppressing exceptions, masked.
+ Const Immediate = 11.
+ docUnmasked: !string |-
+ DiffWithTruncSuppressExceptionWithPrecision computes the difference after truncating with specified precision, suppressing exceptions.
+ Const Immediate = 11.

- go: AddSub
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ AddSub alternatingly adds and subtracts elements of two vectors.
- go: Average
commutative: "true"
extension: "AVX.*" # VPAVGB/W are available across various AVX versions
+ documentation: !string |-
+ Average computes the rounded average of corresponding elements.
- go: MaskedAverage
commutative: "true"
masked: "true"
extension: "AVX512.*" # Masked operations are typically AVX512
+ documentation: !string |-
+ MaskedAverage computes the rounded average of corresponding elements, masked.
+ docUnmasked: !string |-
+ Average computes the rounded average of corresponding elements.

- go: Absolute
commutative: "false"
# Unary operation, not commutative
extension: "AVX.*" # VPABSB/W/D are AVX, VPABSQ is AVX512
+ documentation: !string |-
+ Absolute computes the absolute value of each element.
- go: MaskedAbsolute
commutative: "false"
masked: "true"
extension: "AVX512.*"
+ documentation: !string |-
+ MaskedAbsolute computes the absolute value of each element, masked.
+ docUnmasked: !string |-
+ Absolute computes the absolute value of each element.

- go: Sign
# Applies sign of second operand to first: sign(val, sign_src)
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ Sign applies the sign of the second operand to the first.
# Sign does not have masked version

- go: MaskedPopCount
commutative: "false"
masked: "true"
extension: "AVX512.*" # VPOPCNT instructions are AVX512 (BITALG or VPOPCNTDQ)
+ documentation: !string |-
+ MaskedPopCount counts the number of set bits in each element, masked.
+ docUnmasked: !string |-
+ PopCount counts the number of set bits in each element.
- go: PairDotProd
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply the elements and add the pairs together"
+ documentation: !string |-
+ PairDotProd multiplies and adds adjacent pairs of elements.
- go: MaskedPairDotProd
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply the elements and add the pairs together"
+ documentation: !string |-
+ MaskedPairDotProd multiplies and adds adjacent pairs of elements, masked.
+ docUnmasked: !string |-
+ PairDotProd multiplies and adds adjacent pairs of elements.
- go: SaturatedPairDotProd
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply the elements and add the pairs together with saturation"
+ documentation: !string |-
+ SaturatedPairDotProd multiplies and adds adjacent pairs of elements with saturation.
- go: MaskedSaturatedPairDotProd
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply the elements and add the pairs together with saturation"
+ documentation: !string |-
+ MaskedSaturatedPairDotProd multiplies and adds adjacent pairs of elements with saturation, masked.
+ docUnmasked: !string |-
+ SaturatedPairDotProd multiplies and adds adjacent pairs of elements with saturation.
+
# QuadDotProd, i.e. VPDPBUSD(S) are operations with src/dst on the same register, we are not supporting this as of now.
- go: DotProdBroadcast
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply the elements and add the pairs together; the result is a broadcast of the dot product; imm8 = 127;"
+ documentation: !string |-
+ DotProdBroadcast multiplies all elements and broadcasts the sum.
+ Const Immediate = 127.
- go: Max
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Max computes the maximum of corresponding elements.
- go: MaskedMax
commutative: "true"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedMax computes the maximum of corresponding elements, masked.
+ docUnmasked: !string |-
+ Max computes the maximum of corresponding elements.
- go: Min
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Min computes the minimum of corresponding elements.
- go: MaskedMin
commutative: "true"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedMin computes the minimum of corresponding elements, masked.
+ docUnmasked: !string |-
+ Min computes the minimum of corresponding elements.
- go: Mul
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Mul multiplies corresponding elements of two vectors.
- go: MulEvenWiden
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the even index elements from the two sources of size X at index i, store the result of size 2X at index i/2"
+ documentation: !string |-
+ MulEvenWiden multiplies even-indexed elements, widening the result.
+ Result[i] = v1.Even[i] * v2.Even[i].
- go: MulHigh
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the elements from the two sources of size X at index i, store the high X bits of the result of size 2X at index i"
+ documentation: !string |-
+ MulHigh multiplies elements and stores the high part of the result.
- go: MulLow
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the elements from the two sources of size X at index i, store the low X bits of the result of size 2X at index i"
+ documentation: !string |-
+ MulLow multiplies elements and stores the low part of the result.
- go: MaskedMul
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedMul multiplies corresponding elements of two vectors, masked.
+ docUnmasked: !string |-
+ Mul multiplies corresponding elements of two vectors.
- go: MaskedMulEvenWiden
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the even index elements from the two sources of size X at index i, store the result of size 2X at index i/2"
+ documentation: !string |-
+ MaskedMulEvenWiden multiplies even-indexed elements, widening the result, masked.
+ Result[i] = v1.Even[i] * v2.Even[i].
+ docUnmasked: !string |-
+ MulEvenWiden multiplies even-indexed elements, widening the result.
+ Result[i] = v1.Even[i] * v2.Even[i].
- go: MaskedMulHigh
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the elements from the two sources of size X at index i, store the high X bits of the result of size 2X at index i"
+ documentation: !string |-
+ MaskedMulHigh multiplies elements and stores the high part of the result, masked.
+ docUnmasked: !string |-
+ MulHigh multiplies elements and stores the high part of the result.
- go: MaskedMulLow
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the elements from the two sources of size X at index i, store the low X bits of the result of size 2X at index i"
+ documentation: !string |-
+ MaskedMulLow multiplies elements and stores the low part of the result, masked.
+ docUnmasked: !string |-
+ MulLow multiplies elements and stores the low part of the result.
diff --git a/internal/simdgen/gen_simdTypes.go b/internal/simdgen/gen_simdTypes.go
index ed29a75..aac85a4 100644
--- a/internal/simdgen/gen_simdTypes.go
+++ b/internal/simdgen/gen_simdTypes.go
@@ -75,37 +75,55 @@

{{- range .OpsLen1}}

-// Asm: {{.Asm}}, Arch: {{.Extension}}{{if .Documentation}}, Doc: {{.Documentation}}{{end}}
+/*{{if .Documentation}}
+{{.Documentation}}{{end}}
+Asm: {{.Asm}}, Arch: {{.Extension}}
+*/
func (x {{(index .In 0).Go}}) {{.Go}}() {{(index .Out 0).Go}}

{{- end}}
{{- range .OpsLen2}}

-// Asm: {{.Asm}}, Arch: {{.Extension}}{{if .Documentation}}, Doc: {{.Documentation}}{{end}}
+/*{{if .Documentation}}
+{{.Documentation}}{{end}}
+Asm: {{.Asm}}, Arch: {{.Extension}}
+*/
func (x {{(index .In 0).Go}}) {{.Go}}(y {{(index .In 1).Go}}) {{(index .Out 0).Go}}

{{- end}}
{{- range .OpsLen3}}

-// Asm: {{.Asm}}, Arch: {{.Extension}}{{if .Documentation}}, Doc: {{.Documentation}}{{end}}
+/*{{if .Documentation}}
+{{.Documentation}}{{end}}
+Asm: {{.Asm}}, Arch: {{.Extension}}
+*/
func (x {{(index .In 0).Go}}) {{.Go}}(y {{(index .In 1).Go}}, z {{(index .In 2).Go}}) {{(index .Out 0).Go}}

{{- end}}
{{- range .OpsLen1Imm8}}

-// Asm: {{.Asm}}, Arch: {{.Extension}}{{if .Documentation}}, Doc: {{.Documentation}}{{end}}
+/*{{if .Documentation}}
+{{.Documentation}}{{end}}
+Asm: {{.Asm}}, Arch: {{.Extension}}
+*/
func (x {{(index .In 1).Go}}) {{.Go}}(imm8 uint8) {{(index .Out 0).Go}}

{{- end}}
{{- range .OpsLen2Imm8}}

-// Asm: {{.Asm}}, Arch: {{.Extension}}{{if .Documentation}}, Doc: {{.Documentation}}{{end}}
+/*{{if .Documentation}}
+{{.Documentation}}{{end}}
+Asm: {{.Asm}}, Arch: {{.Extension}}
+*/
func (x {{(index .In 1).Go}}) {{.Go}}(imm uint8, y {{(index .In 2).Go}}) {{(index .Out 0).Go}}

{{- end}}
{{- range .OpsLen3Imm8}}

-// Asm: {{.Asm}}, Arch: {{.Extension}}{{if .Documentation}}, Doc: {{.Documentation}}{{end}}
+/*{{if .Documentation}}
+{{.Documentation}}{{end}}
+Asm: {{.Asm}}, Arch: {{.Extension}}
+*/
func (x {{(index .In 1).Go}}) {{.Go}}(imm uint8, y {{(index .In 2).Go}}, z {{(index .In 3).Go}}) {{(index .Out 0).Go}}

{{- end}}
diff --git a/internal/simdgen/gen_utility.go b/internal/simdgen/gen_utility.go
index 6ae1cff..5b9a636 100644
--- a/internal/simdgen/gen_utility.go
+++ b/internal/simdgen/gen_utility.go
@@ -357,6 +357,7 @@
return nil, fmt.Errorf("simdgen only recognizes masked operations with name starting with 'Masked': %s", op)
}
op2.Go = strings.ReplaceAll(op2.Go, "Masked", "")
+ op2.Documentation = op2.DocUnmasked // Also change the documentation.
splited = append(splited, op2)
} else {
return nil, fmt.Errorf("simdgen only recognizes masked operations with exactly one mask input: %s", op)
diff --git a/internal/simdgen/godefs.go b/internal/simdgen/godefs.go
index ad20943..c50760b 100644
--- a/internal/simdgen/godefs.go
+++ b/internal/simdgen/godefs.go
@@ -22,6 +22,7 @@
Extension string // Extension
Zeroing *string // Zeroing is a flag for asm prefix "Z", if non-nil it will always be "false"
Documentation *string // Documentation will be appended to the stubs comments.
+ DocUnmasked *string // Documentation, unmasked version
// ConstMask is a hack to reduce the size of defs the user writes for const-immediate
// If present, it will be copied to [In[0].Const].
ConstImm *string
diff --git a/internal/simdgen/ops/AddSub/categories.yaml b/internal/simdgen/ops/AddSub/categories.yaml
index 1d08a94..8dca0e2 100644
--- a/internal/simdgen/ops/AddSub/categories.yaml
+++ b/internal/simdgen/ops/AddSub/categories.yaml
@@ -2,44 +2,76 @@
- go: Add
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Add adds corresponding elements of two vectors.
- go: SaturatedAdd
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ SaturatedAdd adds corresponding elements of two vectors with saturation.
- go: MaskedAdd
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedAdd adds corresponding elements of two vectors, masked.
+ docUnmasked: !string |-
+ Add adds corresponding elements of two vectors.
- go: MaskedSaturatedAdd
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedSaturatedAdd adds corresponding elements of two vectors with saturation, masked.
+ docUnmasked: !string |-
+ SaturatedAdd adds corresponding elements of two vectors with saturation.
- go: Sub
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ Sub subtracts corresponding elements of two vectors.
- go: SaturatedSub
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ SaturatedSub subtracts corresponding elements of two vectors with saturation.
- go: MaskedSub
masked: "true"
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedSub subtracts corresponding elements of two vectors, masked.
+ docUnmasked: !string |-
+ Sub subtracts corresponding elements of two vectors.
- go: MaskedSaturatedSub
masked: "true"
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedSaturatedSub subtracts corresponding elements of two vectors with saturation, masked.
+ docUnmasked: !string |-
+ SaturatedSub subtracts corresponding elements of two vectors with saturation.
- go: PairwiseAdd
commutative: "false"
extension: "AVX.*"
- documentation: "Add pairs of elements in vector x and store them in higher half of the target; Add pairs of elements in vector y and store them in lower half of the target"
+ documentation: !string |-
+ PairwiseAdd horizontally adds adjacent pairs of elements.
+ For x = [x0, x1, x2, x3] and y = [y0, y1, y2, y3], the result is [y0+y1, y2+y3, x0+x1, x2+x3].
- go: PairwiseSub
commutative: "false"
extension: "AVX.*"
- documentation: "Sub pairs of elements in vector x and store them in higher half of the target; Sub pairs of elements in vector y and store them in lower half of the target"
+ documentation: !string |-
+ PairwiseSub horizontally subtracts adjacent pairs of elements.
+ For x = [x0, x1, x2, x3] and y = [y0, y1, y2, y3], the result is [y0-y1, y2-y3, x0-x1, x2-x3].
- go: SaturatedPairwiseAdd
commutative: "false"
extension: "AVX.*"
- documentation: "Add pairs of elements in vector x and store them in higher half of the target; Add pairs of elements in vector y and store them in lower half of the target; With saturation"
+ documentation: !string |-
+ SaturatedPairwiseAdd horizontally adds adjacent pairs of elements with saturation.
+ For x = [x0, x1, x2, x3] and y = [y0, y1, y2, y3], the result is [y0+y1, y2+y3, x0+x1, x2+x3].
- go: SaturatedPairwiseSub
commutative: "false"
extension: "AVX.*"
- documentation: "Sub pairs of elements in vector x and store them in higher half of the target; Sub pairs of elements in vector y and store them in lower half of the target; With saturation"
\ No newline at end of file
+ documentation: !string |-
+ SaturatedPairwiseSub horizontally subtracts adjacent pairs of elements with saturation.
+ For x = [x0, x1, x2, x3] and y = [y0, y1, y2, y3], the result is [y0-y1, y2-y3, x0-x1, x2-x3].
\ No newline at end of file
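
The element order stated in the Pairwise* documentation and the meaning of "with saturation" can be checked against a scalar reference. The following is a minimal Go sketch, not simdgen code: pairwiseAdd and saturatedAdd16 are hypothetical helper names, and the 4-lane int32 and single int16 lane shapes are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// pairwiseAdd is a hypothetical scalar model of the PairwiseAdd layout as
// documented in the diff: sums of y's pairs first, then sums of x's pairs.
func pairwiseAdd(x, y [4]int32) [4]int32 {
	return [4]int32{y[0] + y[1], y[2] + y[3], x[0] + x[1], x[2] + x[3]}
}

// saturatedAdd16 models one int16 lane of a saturating add: the sum is
// clamped to the int16 range instead of wrapping around.
func saturatedAdd16(a, b int16) int16 {
	sum := int32(a) + int32(b)
	if sum > math.MaxInt16 {
		return math.MaxInt16
	}
	if sum < math.MinInt16 {
		return math.MinInt16
	}
	return int16(sum)
}

func main() {
	fmt.Println(pairwiseAdd([4]int32{1, 2, 3, 4}, [4]int32{10, 20, 30, 40}))
	fmt.Println(saturatedAdd16(32000, 1000))
}
```

The sketch follows the documented order literally; whether the generated methods place the x sums or the y sums first is something only the generator and the hardware can confirm.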
diff --git a/internal/simdgen/ops/BitwiseLogic/categories.yaml b/internal/simdgen/ops/BitwiseLogic/categories.yaml
index bc4eda7..677db09 100644
--- a/internal/simdgen/ops/BitwiseLogic/categories.yaml
+++ b/internal/simdgen/ops/BitwiseLogic/categories.yaml
@@ -2,30 +2,54 @@
- go: And
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ And performs a bitwise AND operation between two vectors.
- go: MaskedAnd
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedAnd performs a masked bitwise AND operation between two vectors.
+ docUnmasked: !string |-
+ And performs a bitwise AND operation between two vectors.
- go: Or
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Or performs a bitwise OR operation between two vectors.
- go: MaskedOr
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedOr performs a masked bitwise OR operation between two vectors.
+ docUnmasked: !string |-
+ Or performs a bitwise OR operation between two vectors.
- go: AndNot
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ AndNot performs a bitwise AND NOT operation between two vectors.
- go: MaskedAndNot
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedAndNot performs a masked bitwise AND NOT operation between two vectors.
+ docUnmasked: !string |-
+ AndNot performs a bitwise AND NOT operation between two vectors.
- go: Xor
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Xor performs a bitwise XOR operation between two vectors.
- go: MaskedXor
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedXor performs a masked bitwise XOR operation between two vectors.
+ docUnmasked: !string |-
+ Xor performs a bitwise XOR operation between two vectors.
# We also have PTEST and VPTERNLOG, those should be hidden from the users
# and only appear in rewrite rules.
\ No newline at end of file
diff --git a/internal/simdgen/ops/Compares/categories.yaml b/internal/simdgen/ops/Compares/categories.yaml
index 027c8e8..1e01e36 100644
--- a/internal/simdgen/ops/Compares/categories.yaml
+++ b/internal/simdgen/ops/Compares/categories.yaml
@@ -10,77 +10,126 @@
constImm: 0
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 0 if it has;"
+ documentation: !string |-
+ Equal compares for equality.
+ Const Immediate = 0.
- go: Less
constImm: 1
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 1 if it has;"
+ documentation: !string |-
+ Less compares for less than.
+ Const Immediate = 1.
- go: LessEqual
constImm: 2
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 2 if it has;"
+ documentation: !string |-
+ LessEqual compares for less than or equal.
+ Const Immediate = 2.
- go: IsNan # For float only.
constImm: 3
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 3 if it has; Returns mask element True if either one of the input\\'s element is Nan; Please use this method as x\\.IsNan\\(x\\) to check x only;"
+ documentation: !string |-
+ IsNan checks if elements are NaN. Use as x.IsNan(x).
+ Const Immediate = 3.
- go: NotEqual
constImm: 4
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 4 if it has;"
+ documentation: !string |-
+ NotEqual compares for inequality.
+ Const Immediate = 4.
- go: GreaterEqual
constImm: 5
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 5 if it has;"
+ documentation: !string |-
+ GreaterEqual compares for greater than or equal.
+ Const Immediate = 5.
- go: Greater
constImm: 6
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 6 if it has;"
+ documentation: !string |-
+ Greater compares for greater than.
+ Const Immediate = 6.

- go: MaskedEqual
constImm: 0
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 0 if it has;"
+ documentation: !string |-
+ MaskedEqual compares for equality, masked.
+ Const Immediate = 0.
+ docUnmasked: !string |-
+ Equal compares for equality.
+ Const Immediate = 0.
- go: MaskedLess
constImm: 1
masked: "true"
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 1 if it has;"
+ documentation: !string |-
+ MaskedLess compares for less than, masked.
+ Const Immediate = 1.
+ docUnmasked: !string |-
+ Less compares for less than.
+ Const Immediate = 1.
- go: MaskedLessEqual
constImm: 2
masked: "true"
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 2 if it has;"
+ documentation: !string |-
+ MaskedLessEqual compares for less than or equal, masked.
+ Const Immediate = 2.
+ docUnmasked: !string |-
+ LessEqual compares for less than or equal.
+ Const Immediate = 2.
- go: MaskedIsNan # For float only.
constImm: 3
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 3 if it has; Returns mask element True if either one of the input\\'s element is Nan; Please use this method as x\\.IsNan\\(x\\) to check x only;"
+ documentation: !string |-
+ MaskedIsNan checks if elements are NaN, masked. Use as x.IsNan(x).
+ Const Immediate = 3.
+ docUnmasked: !string |-
+ IsNan checks if elements are NaN. Use as x.IsNan(x).
+ Const Immediate = 3.
- go: MaskedNotEqual
constImm: 4
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Predicate immediate is 4 if it has;"
+ documentation: !string |-
+ MaskedNotEqual compares for inequality, masked.
+ Const Immediate = 4.
+ docUnmasked: !string |-
+ NotEqual compares for inequality.
+ Const Immediate = 4.
- go: MaskedGreaterEqual
constImm: 5
masked: "true"
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 5 if it has;"
+ documentation: !string |-
+ MaskedGreaterEqual compares for greater than or equal, masked.
+ Const Immediate = 5.
+ docUnmasked: !string |-
+ GreaterEqual compares for greater than or equal.
+ Const Immediate = 5.
- go: MaskedGreater
constImm: 6
masked: "true"
commutative: "false"
extension: "AVX.*"
- documentation: "Predicate immediate is 6 if it has;"
\ No newline at end of file
+ documentation: !string |-
+ MaskedGreater compares for greater than, masked.
+ Const Immediate = 6.
+ docUnmasked: !string |-
+ Greater compares for greater than.
+ Const Immediate = 6.
\ No newline at end of file
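
The mapping between the constImm values and the comparison names in this file can be summarized as a scalar model. This is a minimal Go sketch under stated assumptions: compareLane and isNanLane are hypothetical names, integer lanes are used for the ordering predicates so that NaN behavior is not modeled, and isNanLane follows the wording of the removed documentation line ("True if either one of the input's element is Nan"), which is why the docs recommend calling x.IsNan(x).

```go
package main

import (
	"fmt"
	"math"
)

// Predicate immediates as listed in the Compares categories.
const (
	immEqual        = 0
	immLess         = 1
	immLessEqual    = 2
	immIsNan        = 3 // float only, modeled by isNanLane below
	immNotEqual     = 4
	immGreaterEqual = 5
	immGreater      = 6
)

// compareLane is a hypothetical scalar model of one integer lane.
func compareLane(imm int, a, b int32) bool {
	switch imm {
	case immEqual:
		return a == b
	case immLess:
		return a < b
	case immLessEqual:
		return a <= b
	case immNotEqual:
		return a != b
	case immGreaterEqual:
		return a >= b
	case immGreater:
		return a > b
	default:
		panic("predicate not modeled for integer lanes")
	}
}

// isNanLane reports true if either input is NaN, so passing the same
// value twice checks that value alone.
func isNanLane(a, b float64) bool {
	return math.IsNaN(a) || math.IsNaN(b)
}

func main() {
	fmt.Println(compareLane(immLess, 1, 2))
	fmt.Println(isNanLane(1.0, math.NaN()))
}
```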
diff --git a/internal/simdgen/ops/FPonlyArith/categories.yaml b/internal/simdgen/ops/FPonlyArith/categories.yaml
index e486225..3f60e84 100644
--- a/internal/simdgen/ops/FPonlyArith/categories.yaml
+++ b/internal/simdgen/ops/FPonlyArith/categories.yaml
@@ -2,136 +2,274 @@
- go: Div
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ Div divides elements of two vectors.
- go: MaskedDiv
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedDiv divides elements of two vectors, masked.
+ docUnmasked: !string |-
+ Div divides elements of two vectors.
- go: Sqrt
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ Sqrt computes the square root of each element.
- go: MaskedSqrt
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedSqrt computes the square root of each element, masked.
+ docUnmasked: !string |-
+ Sqrt computes the square root of each element.
- go: ApproximateReciprocal
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ ApproximateReciprocal computes an approximate reciprocal of each element.
- go: MaskedApproximateReciprocal
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedApproximateReciprocal computes an approximate reciprocal of each element, masked.
+ docUnmasked: !string |-
+ ApproximateReciprocal computes an approximate reciprocal of each element.
- go: ApproximateReciprocalOfSqrt
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ ApproximateReciprocalOfSqrt computes an approximate reciprocal of the square root of each element.
- go: MaskedApproximateReciprocalOfSqrt
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedApproximateReciprocalOfSqrt computes an approximate reciprocal of the square root of each element, masked.
+ docUnmasked: !string |-
+ ApproximateReciprocalOfSqrt computes an approximate reciprocal of the square root of each element.
- go: MaskedMulByPowOf2 # This operation is all after AVX512, the unmasked version will be generated.
commutative: "false"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedMulByPowOf2 multiplies elements by a power of 2, masked.
+ docUnmasked: !string |-
+ MulByPowOf2 multiplies elements by a power of 2.

- go: Round
commutative: "false"
extension: "AVX.*"
constImm: 0
+ documentation: !string |-
+ Round rounds elements to the nearest integer.
+ Const Immediate = 0.
- go: MaskedRoundWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 0
masked: "true"
+ documentation: !string |-
+ MaskedRoundWithPrecision rounds elements with specified precision, masked.
+ Const Immediate = 0.
+ docUnmasked: !string |-
+ RoundWithPrecision rounds elements with specified precision.
+ Const Immediate = 0.
- go: MaskedRoundSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 8
masked: "true"
+ documentation: !string |-
+ MaskedRoundSuppressExceptionWithPrecision rounds elements with specified precision, suppressing exceptions, masked.
+ Const Immediate = 8.
+ docUnmasked: !string |-
+ RoundSuppressExceptionWithPrecision rounds elements with specified precision, suppressing exceptions.
+ Const Immediate = 8.
- go: MaskedDiffWithRoundWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 0
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithRoundWithPrecision computes the difference after rounding with specified precision, masked.
+ Const Immediate = 0.
+ docUnmasked: !string |-
+ DiffWithRoundWithPrecision computes the difference after rounding with specified precision.
+ Const Immediate = 0.
- go: MaskedDiffWithRoundSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 8
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithRoundSuppressExceptionWithPrecision computes the difference after rounding with specified precision, suppressing exceptions, masked.
+ Const Immediate = 8.
+ docUnmasked: !string |-
+ DiffWithRoundSuppressExceptionWithPrecision computes the difference after rounding with specified precision, suppressing exceptions.
+ Const Immediate = 8.

- go: Floor
commutative: "false"
extension: "AVX.*"
constImm: 1
+ documentation: !string |-
+ Floor rounds elements down to the nearest integer.
+ Const Immediate = 1.
- go: MaskedFloorWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 1
masked: "true"
+ documentation: !string |-
+ MaskedFloorWithPrecision rounds elements down with specified precision, masked.
+ Const Immediate = 1.
+ docUnmasked: !string |-
+ FloorWithPrecision rounds elements down with specified precision.
+ Const Immediate = 1.
- go: MaskedFloorSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 9
masked: "true"
+ documentation: !string |-
+ MaskedFloorSuppressExceptionWithPrecision rounds elements down with specified precision, suppressing exceptions, masked.
+ Const Immediate = 9.
+ docUnmasked: !string |-
+ FloorSuppressExceptionWithPrecision rounds elements down with specified precision, suppressing exceptions.
+ Const Immediate = 9.
- go: MaskedDiffWithFloorWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 1
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithFloorWithPrecision computes the difference after flooring with specified precision, masked.
+ Const Immediate = 1.
+ docUnmasked: !string |-
+ DiffWithFloorWithPrecision computes the difference after flooring with specified precision.
+ Const Immediate = 1.
- go: MaskedDiffWithFloorSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 9
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithFloorSuppressExceptionWithPrecision computes the difference after flooring with specified precision, suppressing exceptions, masked.
+ Const Immediate = 9.
+ docUnmasked: !string |-
+ DiffWithFloorSuppressExceptionWithPrecision computes the difference after flooring with specified precision, suppressing exceptions.
+ Const Immediate = 9.

- go: Ceil
commutative: "false"
extension: "AVX.*"
constImm: 2
+ documentation: !string |-
+ Ceil rounds elements up to the nearest integer.
+ Const Immediate = 2.
- go: MaskedCeilWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 2
masked: "true"
+ documentation: !string |-
+ MaskedCeilWithPrecision rounds elements up with specified precision, masked.
+ Const Immediate = 2.
+ docUnmasked: !string |-
+ CeilWithPrecision rounds elements up with specified precision.
+ Const Immediate = 2.
- go: MaskedCeilSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 10
masked: "true"
+ documentation: !string |-
+ MaskedCeilSuppressExceptionWithPrecision rounds elements up with specified precision, suppressing exceptions, masked.
+ Const Immediate = 10.
+ docUnmasked: !string |-
+ CeilSuppressExceptionWithPrecision rounds elements up with specified precision, suppressing exceptions.
+ Const Immediate = 10.
- go: MaskedDiffWithCeilWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 2
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithCeilWithPrecision computes the difference after ceiling with specified precision, masked.
+ Const Immediate = 2.
+ docUnmasked: !string |-
+ DiffWithCeilWithPrecision computes the difference after ceiling with specified precision.
+ Const Immediate = 2.
- go: MaskedDiffWithCeilSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 10
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithCeilSuppressExceptionWithPrecision computes the difference after ceiling with specified precision, suppressing exceptions, masked.
+ Const Immediate = 10.
+ docUnmasked: !string |-
+ DiffWithCeilSuppressExceptionWithPrecision computes the difference after ceiling with specified precision, suppressing exceptions.
+ Const Immediate = 10.

- go: Trunc
commutative: "false"
extension: "AVX.*"
constImm: 3
+ documentation: !string |-
+ Trunc truncates elements towards zero.
+ Const Immediate = 3.
- go: MaskedTruncWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 3
masked: "true"
+ documentation: !string |-
+ MaskedTruncWithPrecision truncates elements with specified precision, masked.
+ Const Immediate = 3.
+ docUnmasked: !string |-
+ TruncWithPrecision truncates elements with specified precision.
+ Const Immediate = 3.
- go: MaskedTruncSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 11
masked: "true"
+ documentation: !string |-
+ MaskedTruncSuppressExceptionWithPrecision truncates elements with specified precision, suppressing exceptions, masked.
+ Const Immediate = 11.
+ docUnmasked: !string |-
+ TruncSuppressExceptionWithPrecision truncates elements with specified precision, suppressing exceptions.
+ Const Immediate = 11.
- go: MaskedDiffWithTruncWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 3
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithTruncWithPrecision computes the difference after truncating with specified precision, masked.
+ Const Immediate = 3.
+ docUnmasked: !string |-
+ DiffWithTruncWithPrecision computes the difference after truncating with specified precision.
+ Const Immediate = 3.
- go: MaskedDiffWithTruncSuppressExceptionWithPrecision
commutative: "false"
extension: "AVX.*"
constImm: 11
masked: "true"
+ documentation: !string |-
+ MaskedDiffWithTruncSuppressExceptionWithPrecision computes the difference after truncating with specified precision, suppressing exceptions, masked.
+ Const Immediate = 11.
+ docUnmasked: !string |-
+ DiffWithTruncSuppressExceptionWithPrecision computes the difference after truncating with specified precision, suppressing exceptions.
+ Const Immediate = 11.

- go: AddSub
commutative: "false"
- extension: "AVX.*"
\ No newline at end of file
+ extension: "AVX.*"
+ documentation: !string |-
+ AddSub alternatingly adds and subtracts elements of two vectors.
\ No newline at end of file
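
The immediates in this file follow a visible pattern: Round, Floor, Ceil and Trunc use 0 to 3, and each SuppressException variant adds 8. The following is a minimal Go sketch of that encoding, not simdgen code. decodeImm and applyMode are hypothetical names, the precision bits are not modeled, and treating mode 0 as round-half-to-even is an assumption based on the IEEE 754 default rather than something the diff states.

```go
package main

import (
	"fmt"
	"math"
)

type roundMode int

// Base modes, matching the constImm values 0 to 3 in the diff.
const (
	roundNearest roundMode = iota
	roundFloor
	roundCeil
	roundTrunc
)

// suppressExceptionBit is the difference between each plain immediate and
// its SuppressException counterpart (0 vs 8, 1 vs 9, 2 vs 10, 3 vs 11).
const suppressExceptionBit = 8

// decodeImm splits a documented immediate into its mode and suppress flag.
func decodeImm(imm int) (roundMode, bool) {
	return roundMode(imm & 3), imm&suppressExceptionBit != 0
}

// applyMode is a scalar model of one lane at integer precision.
func applyMode(mode roundMode, x float64) float64 {
	switch mode {
	case roundFloor:
		return math.Floor(x)
	case roundCeil:
		return math.Ceil(x)
	case roundTrunc:
		return math.Trunc(x)
	default:
		return math.RoundToEven(x) // assumption: ties go to even
	}
}

func main() {
	mode, suppress := decodeImm(10)
	fmt.Println(mode == roundCeil, suppress)
	fmt.Println(applyMode(roundTrunc, -1.5))
}
```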
diff --git a/internal/simdgen/ops/IntOnlyArith/categories.yaml b/internal/simdgen/ops/IntOnlyArith/categories.yaml
index c74b57c..01a55cc 100644
--- a/internal/simdgen/ops/IntOnlyArith/categories.yaml
+++ b/internal/simdgen/ops/IntOnlyArith/categories.yaml
@@ -2,27 +2,45 @@
- go: Average
commutative: "true"
extension: "AVX.*" # VPAVGB/W are available across various AVX versions
+ documentation: !string |-
+ Average computes the rounded average of corresponding elements.
- go: MaskedAverage
commutative: "true"
masked: "true"
extension: "AVX512.*" # Masked operations are typically AVX512
+ documentation: !string |-
+ MaskedAverage computes the rounded average of corresponding elements, masked.
+ docUnmasked: !string |-
+ Average computes the rounded average of corresponding elements.

- go: Absolute
commutative: "false"
# Unary operation, not commutative
extension: "AVX.*" # VPABSB/W/D are AVX, VPABSQ is AVX512
+ documentation: !string |-
+ Absolute computes the absolute value of each element.
- go: MaskedAbsolute
commutative: "false"
masked: "true"
extension: "AVX512.*"
+ documentation: !string |-
+ MaskedAbsolute computes the absolute value of each element, masked.
+ docUnmasked: !string |-
+ Absolute computes the absolute value of each element.

- go: Sign
# Applies sign of second operand to first: sign(val, sign_src)
commutative: "false"
extension: "AVX.*"
+ documentation: !string |-
+ Sign applies the sign of the second operand to the first.
# Sign does not have masked version

- go: MaskedPopCount
commutative: "false"
masked: "true"
- extension: "AVX512.*" # VPOPCNT instructions are AVX512 (BITALG or VPOPCNTDQ)
\ No newline at end of file
+ extension: "AVX512.*" # VPOPCNT instructions are AVX512 (BITALG or VPOPCNTDQ)
+ documentation: !string |-
+ MaskedPopCount counts the number of set bits in each element, masked.
+ docUnmasked: !string |-
+ PopCount counts the number of set bits in each element.
\ No newline at end of file
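
"Rounded average" and "number of set bits" are easy to misread, so a scalar reference may help. This is a minimal Go sketch under stated assumptions: average8 and popCount8 are hypothetical names, unsigned 8-bit lanes are assumed (the VPAVGB case mentioned in the YAML comment), and the rounding is taken to be (a + b + 1) >> 1.

```go
package main

import (
	"fmt"
	"math/bits"
)

// average8 models one unsigned byte lane of a rounded average. The sum is
// computed in 16 bits so that it cannot overflow before the shift.
func average8(a, b uint8) uint8 {
	return uint8((uint16(a) + uint16(b) + 1) >> 1)
}

// popCount8 models one byte lane of PopCount.
func popCount8(x uint8) int {
	return bits.OnesCount8(x)
}

func main() {
	fmt.Println(average8(1, 2))
	fmt.Println(popCount8(0b10110000))
}
```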
diff --git a/internal/simdgen/ops/MLOps/categories.yaml b/internal/simdgen/ops/MLOps/categories.yaml
index 01eaf0d..075f836 100644
--- a/internal/simdgen/ops/MLOps/categories.yaml
+++ b/internal/simdgen/ops/MLOps/categories.yaml
@@ -2,24 +2,34 @@
- go: PairDotProd
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply the elements and add the pairs together"
+ documentation: !string |-
+ PairDotProd multiplies and adds adjacent pairs of elements.
- go: MaskedPairDotProd
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply the elements and add the pairs together"
+ documentation: !string |-
+ MaskedPairDotProd multiplies and adds adjacent pairs of elements, masked.
+ docUnmasked: !string |-
+ PairDotProd multiplies and adds adjacent pairs of elements.
- go: SaturatedPairDotProd
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply the elements and add the pairs together with saturation"
+ documentation: !string |-
+ SaturatedPairDotProd multiplies and adds adjacent pairs of elements with saturation.
- go: MaskedSaturatedPairDotProd
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply the elements and add the pairs together with saturation"
+ documentation: !string |-
+ MaskedSaturatedPairDotProd multiplies and adds adjacent pairs of elements with saturation, masked.
+ docUnmasked: !string |-
+ SaturatedPairDotProd multiplies and adds adjacent pairs of elements with saturation.

# QuadDotProd, i.e. VPDPBUSD(S) are operations with src/dst on the same register, we are not supporting this as of now.
- go: DotProdBroadcast
commutative: "true"
extension: "AVX.*"
- documentation: "Multiply all the elements and add them together; the result is a broadcast of the dot product; imm8 = 127;"
+ documentation: !string |-
+ DotProdBroadcast multiplies all elements and broadcasts the sum.
+ Const Immediate = 127.
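
The phrase "multiplies and adds adjacent pairs" can be made concrete with a scalar reference. This is a minimal Go sketch, not simdgen code: pairDotProd is a hypothetical name, and the shapes (four int16 inputs producing two int32 results) are an assumption chosen for illustration.

```go
package main

import "fmt"

// pairDotProd multiplies corresponding elements and then adds each
// adjacent pair of products, widening to int32 to hold the sum.
func pairDotProd(x, y [4]int16) [2]int32 {
	var r [2]int32
	for i := range r {
		r[i] = int32(x[2*i])*int32(y[2*i]) + int32(x[2*i+1])*int32(y[2*i+1])
	}
	return r
}

func main() {
	fmt.Println(pairDotProd([4]int16{1, 2, 3, 4}, [4]int16{5, 6, 7, 8}))
}
```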
diff --git a/internal/simdgen/ops/MinMax/categories.yaml b/internal/simdgen/ops/MinMax/categories.yaml
index d513195..b61a85d 100644
--- a/internal/simdgen/ops/MinMax/categories.yaml
+++ b/internal/simdgen/ops/MinMax/categories.yaml
@@ -2,14 +2,26 @@
- go: Max
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Max computes the maximum of corresponding elements.
- go: MaskedMax
commutative: "true"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedMax computes the maximum of corresponding elements, masked.
+ docUnmasked: !string |-
+ Max computes the maximum of corresponding elements.
- go: Min
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Min computes the minimum of corresponding elements.
- go: MaskedMin
commutative: "true"
masked: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedMin computes the minimum of corresponding elements, masked.
+ docUnmasked: !string |-
+ Min computes the minimum of corresponding elements.
diff --git a/internal/simdgen/ops/Mul/categories.yaml b/internal/simdgen/ops/Mul/categories.yaml
index 0ef6cf5..0fe6a1d 100644
--- a/internal/simdgen/ops/Mul/categories.yaml
+++ b/internal/simdgen/ops/Mul/categories.yaml
@@ -2,34 +2,55 @@
- go: Mul
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ Mul multiplies corresponding elements of two vectors.
- go: MulEvenWiden
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the even index elements from the two sources of size X at index i, store the result of size 2X at index i/2"
+ documentation: !string |-
+ MulEvenWiden multiplies even-indexed elements, widening the result.
+ Result[i] = v1.Even[i] * v2.Even[i].
- go: MulHigh
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the elements from the two sources of size X at index i, store the high X bits of the result of size 2X at index i"
+ documentation: !string |-
+ MulHigh multiplies elements and stores the high part of the result.
- go: MulLow
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the elements from the two sources of size X at index i, store the low X bits of the result of size 2X at index i"
+ documentation: !string |-
+ MulLow multiplies elements and stores the low part of the result.
- go: MaskedMul
masked: "true"
commutative: "true"
extension: "AVX.*"
+ documentation: !string |-
+ MaskedMul multiplies corresponding elements of two vectors, masked.
+ docUnmasked: !string |-
+ Mul multiplies corresponding elements of two vectors.
- go: MaskedMulEvenWiden
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the even index elements from the two sources of size X at index i, store the result of size 2X at index i/2"
+ documentation: !string |-
+ MaskedMulEvenWiden multiplies even-indexed elements, widening the result, masked.
+ Result[i] = v1.Even[i] * v2.Even[i].
+ docUnmasked: !string |-
+ MulEvenWiden multiplies even-indexed elements, widening the result.
+ Result[i] = v1.Even[i] * v2.Even[i].
- go: MaskedMulHigh
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the elements from the two sources of size X at index i, store the high X bits of the result of size 2X at index i"
+ documentation: !string |-
+ MaskedMulHigh multiplies elements and stores the high part of the result, masked.
+ docUnmasked: !string |-
+ MulHigh multiplies elements and stores the high part of the result.
- go: MaskedMulLow
masked: "true"
commutative: "true"
extension: "AVX.*"
- documentation: "Multiplies the elements from the two sources of size X at index i, store the low X bits of the result of size 2X at index i"
\ No newline at end of file
+ documentation: !string |-
+ MaskedMulLow multiplies elements and stores the low part of the result, masked.
+ docUnmasked: !string |-
+ MulLow multiplies elements and stores the low part of the result.
\ No newline at end of file
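
The new Mul documentation is shorter than the lines it replaces, which spelled out the element sizes (inputs of size X, a product of size 2X). The following is a minimal Go sketch of those semantics as the removed lines described them. mulHigh16, mulLow16 and mulEvenWiden are hypothetical names, and the int16 and int32 lane widths are assumptions chosen for illustration.

```go
package main

import "fmt"

// mulHigh16 keeps the high 16 bits of the 32-bit product.
func mulHigh16(a, b int16) int16 {
	return int16((int32(a) * int32(b)) >> 16)
}

// mulLow16 keeps the low 16 bits of the 32-bit product.
func mulLow16(a, b int16) int16 {
	return int16(int32(a) * int32(b))
}

// mulEvenWiden multiplies the even-indexed elements of x and y and stores
// the double-width product at index i/2, as the removed line put it.
func mulEvenWiden(x, y [4]int32) [2]int64 {
	var r [2]int64
	for i := range r {
		r[i] = int64(x[2*i]) * int64(y[2*i])
	}
	return r
}

func main() {
	fmt.Println(mulHigh16(0x4000, 4), mulLow16(0x4000, 4))
	fmt.Println(mulEvenWiden([4]int32{1 << 20, 7, 3, 9}, [4]int32{1 << 20, 7, 5, 9}))
}
```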

Change information

Files:
  • M internal/simdgen/categories.yaml
  • M internal/simdgen/gen_simdTypes.go
  • M internal/simdgen/gen_utility.go
  • M internal/simdgen/godefs.go
  • M internal/simdgen/ops/AddSub/categories.yaml
  • M internal/simdgen/ops/BitwiseLogic/categories.yaml
  • M internal/simdgen/ops/Compares/categories.yaml
  • M internal/simdgen/ops/FPonlyArith/categories.yaml
  • M internal/simdgen/ops/IntOnlyArith/categories.yaml
  • M internal/simdgen/ops/MLOps/categories.yaml
  • M internal/simdgen/ops/MinMax/categories.yaml
  • M internal/simdgen/ops/Mul/categories.yaml
Change size: L
Delta: 12 files changed, 695 insertions(+), 66 deletions(-)

Related details

Attention set is empty
Submit Requirements:
  • Code-Review: not satisfied
  • No-Unresolved-Comments: satisfied
  • Review-Enforcement: not satisfied
  • TryBots-Pass: not satisfied
Gerrit-MessageType: newchange
Gerrit-Project: arch
Gerrit-Branch: master
Gerrit-Change-Id: If3323c3a23b0d2197390d1a239bdcbedd60615d2
Gerrit-Change-Number: 679955
Gerrit-PatchSet: 1
Gerrit-Owner: Junyang Shao <shaoj...@google.com>

Junyang Shao (Gerrit)

Jun 7, 2025, 10:19:45 PM
to goph...@pubsubhelper.golang.org, golang-co...@googlegroups.com
Attention needed from David Chase

Junyang Shao uploaded new patchset

Junyang Shao uploaded patch set #2 to this change.

Related details

Attention is currently required from:
  • David Chase
Submit Requirements:
  • Code-Review: not satisfied
  • No-Unresolved-Comments: satisfied
  • Review-Enforcement: not satisfied
  • TryBots-Pass: not satisfied
Gerrit-MessageType: newpatchset
Gerrit-Project: arch
Gerrit-Branch: master
Gerrit-Change-Id: If3323c3a23b0d2197390d1a239bdcbedd60615d2
Gerrit-Change-Number: 679955
Gerrit-PatchSet: 2
Gerrit-Owner: Junyang Shao <shaoj...@google.com>
Gerrit-Reviewer: David Chase <drc...@google.com>
Gerrit-Attention: David Chase <drc...@google.com>