@@ -862,9 +862,9 @@ Likewise, these represent the [WebAssembly SIMD](https://github.com/WebAssembly/
862862 Performs the bitwise `!a` operation on each lane.
863863
864864* ```ts
865- function v128.bitselect(a: v128, b: v128, mask: v128): v128
865+ function v128.bitselect(v1: v128, v2: v128, mask: v128): v128
866866 ```
867- Selects bits of either vector according to the specified mask.
867+ Selects bits of either vector according to the specified mask. Selects from `v1` if the bit in `mask` is `1`, otherwise from `v2`.
868868
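The selection rule above can be illustrated with a scalar model on plain byte arrays. This is only a sketch in ordinary TypeScript, not the AssemblyScript builtin; `bitselectBytes` is a hypothetical helper name introduced for illustration.

```ts
// Scalar model of the bitselect rule (hypothetical helper, not the builtin):
// each result bit comes from v1 where the mask bit is 1, otherwise from v2.
function bitselectBytes(v1: Uint8Array, v2: Uint8Array, mask: Uint8Array): Uint8Array {
  const out = new Uint8Array(v1.length);
  for (let i = 0; i < v1.length; i++) {
    out[i] = (v1[i] & mask[i]) | (v2[i] & ~mask[i]);
  }
  return out;
}

const selected = bitselectBytes(
  new Uint8Array(16).fill(0xaa), // v1
  new Uint8Array(16).fill(0x55), // v2
  new Uint8Array(16).fill(0xf0)  // mask: high nibble from v1, low nibble from v2
);
// Every byte of `selected` is 0xa5.
```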
869869* ```ts
870870 function v128.any_true(a: v128): bool
@@ -1279,7 +1279,7 @@ Likewise, these represent the [WebAssembly SIMD](https://github.com/WebAssembly/
12791279* ```ts
12801280 function v128.q15mulr_sat<T>(a: v128, b: v128): v128
12811281 ```
1282- <details><summary>Performs the lane-wise saturating rounding multiplication in Q15 format.</summary>
1282+ <details><summary>Performs the lane-wise saturating rounding multiplication in Q15 format ((a[i] * b[i] + (1 << (Q - 1))) >> Q where Q=15).</summary>
12831283
12841284 | T | Instruction
12851285 |-----|-------------
@@ -1348,6 +1348,155 @@ Likewise, these represent the [WebAssembly SIMD](https://github.com/WebAssembly/
13481348 ```
13491349 Initializes a 128-bit vector from two 64-bit float values.
13501350
1351+ #### Relaxed SIMD 🦄
1352+
1353+ The following instructions represent the [WebAssembly Relaxed SIMD](https://github.com/WebAssembly/relaxed-simd) specification. They must be enabled with `--enable relaxed-simd`.
1354+
1355+ * ```ts
1356+ function v128.relaxed_swizzle(a: v128, s: v128): v128
1357+ ```
1358+ Selects 8-bit lanes from `a` using indices in `s`. An index `s[i]` in the range \[0-15] selects lane `a[s[i]]`.
1359+
1360+ Unlike `v128.swizzle`, the result of an out-of-bounds index is implementation-defined, depending on hardware capabilities: either `0` or `a[s[i]%16]`.
1361+
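The two permitted outcomes for an out-of-bounds index can be modeled on plain byte arrays. This is a scalar sketch in ordinary TypeScript that follows only the description above; `relaxedSwizzleModel` and its `oob` parameter are hypothetical names, and real hardware picks one behavior rather than exposing a switch.

```ts
// Which of the two documented out-of-bounds behaviors to model (hypothetical switch).
type OutOfBounds = "zero" | "wrap";

// Scalar model of the relaxed swizzle description; not the AssemblyScript builtin.
function relaxedSwizzleModel(a: Uint8Array, s: Uint8Array, oob: OutOfBounds): Uint8Array {
  const out = new Uint8Array(16);
  for (let i = 0; i < 16; i++) {
    const idx = s[i];
    if (idx < 16) {
      out[i] = a[idx];                          // in bounds: same result everywhere
    } else {
      out[i] = oob === "zero" ? 0 : a[idx % 16]; // out of bounds: either outcome
    }
  }
  return out;
}

const lanes = Uint8Array.from({ length: 16 }, (_, i) => 100 + i);
const indices = new Uint8Array(16);
indices[0] = 3;   // in bounds
indices[1] = 17;  // out of bounds
const zeroed = relaxedSwizzleModel(lanes, indices, "zero");
const wrapped = relaxedSwizzleModel(lanes, indices, "wrap");
```

Portable code should only rely on results for in-bounds indices, since only those agree across both outcomes.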
1362+ * ```ts
1363+ function v128.relaxed_trunc<T>(a: v128): v128
1364+ ```
1365+ <details><summary>Truncates each lane of a vector from 32-bit floating point to a 32-bit signed or unsigned integer as indicated by T.</summary>
1366+
1367+ | T | Instruction
1368+ |----------|-------------
1369+ | i32 | i32x4.relaxed_trunc_f32x4_s
1370+ | u32 | i32x4.relaxed_trunc_f32x4_u
1371+ </details>
1372+
1373+ Unlike `v128.trunc_sat`, the result for lanes whose value is out of bounds of the target type is implementation-defined, depending on hardware capabilities:
1374+ - If the input lane contains `NaN`, the result is either `0` or the respective maximum integer value.
1375+ - If the input lane contains a value otherwise out of bounds of the target type, the result is either the saturated result or the maximum integer value.
1376+
1377+ * ```ts
1378+ function v128.relaxed_trunc_zero<T>(a: v128): v128
1379+ ```
1380+ <details><summary>Truncates each lane of a vector from 64-bit floating point to a 32-bit signed or unsigned integer as indicated by T. Unused higher integer lanes of the result are initialized to zero.</summary>
1381+
1382+ | T | Instruction
1383+ |----------|-------------
1384+ | i32 | i32x4.relaxed_trunc_f64x2_s_zero
1385+ | u32 | i32x4.relaxed_trunc_f64x2_u_zero
1386+ </details>
1387+
1388+ Unlike `v128.trunc_sat_zero`, the result for lanes whose value is out of bounds of the target type is implementation-defined, depending on hardware capabilities:
1389+ - If the input lane contains `NaN`, the result is either `0` or the respective maximum integer value.
1390+ - If the input lane contains a value otherwise out of bounds of the target type, the result is either the saturated result or the maximum integer value.
1391+
1392+ * ```ts
1393+ function v128.relaxed_madd<T>(a: v128, b: v128, c: v128): v128
1394+ ```
1395+ <details><summary>Performs the fused multiply-add operation (a * b + c) on 32- or 64-bit floating point lanes as indicated by T.</summary>
1396+
1397+ | T | Instruction
1398+ |----------|-------------
1399+ | f32 | f32x4.relaxed_madd
1400+ | f64 | f64x2.relaxed_madd
1401+ </details>
1402+
1403+ The result is implementation-defined, depending on hardware capabilities:
1404+ - Either `a * b` is rounded once and the final result rounded again, or
1405+ - The expression is evaluated with higher precision and only rounded once.
1406+
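The difference between the two outcomes can be seen in a scalar f32 model built on `Math.fround`. This is a sketch in ordinary TypeScript, not the builtin; the helper names are hypothetical, and the single-rounding variant only approximates a true fused operation by evaluating in float64 first.

```ts
// Outcome 1: the product is rounded to f32, then the sum is rounded to f32 again.
function maddTwoRoundings(a: number, b: number, c: number): number {
  return Math.fround(Math.fround(a * b) + c);
}

// Outcome 2 (approximation): for f32 inputs the product is exact in float64
// (barring overflow or underflow), so only the sum is rounded before narrowing.
function maddSingleRounding(a: number, b: number, c: number): number {
  return Math.fround(a * b + c);
}

// Inputs chosen so the low bit of the product is lost when rounding to f32.
const x = 1 + 2 ** -12;     // exactly representable as f32
const y = 1 + 2 ** -12;
const z = -(1 + 2 ** -11);  // exactly representable as f32

const twice = maddTwoRoundings(x, y, z);   // 0
const once = maddSingleRounding(x, y, z);  // 2 ** -24
```

Code that must produce bit-identical results on every engine should therefore avoid depending on the low-order bits of a relaxed multiply-add.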
1407+ * ```ts
1408+ function v128.relaxed_nmadd<T>(a: v128, b: v128, c: v128): v128
1409+ ```
1410+ <details><summary>Performs the fused negative multiply-add operation (-(a * b) + c) on 32- or 64-bit floating point lanes as indicated by T.</summary>
1411+
1412+ | T | Instruction
1413+ |----------|-------------
1414+ | f32 | f32x4.relaxed_nmadd
1415+ | f64 | f64x2.relaxed_nmadd
1416+ </details>
1417+
1418+ The result is implementation-defined, depending on hardware capabilities:
1419+ - Either `a * b` is rounded once and the final result rounded again, or
1420+ - The expression is evaluated with higher precision and only rounded once.
1421+
1422+ * ```ts
1423+ function v128.relaxed_laneselect<T>(a: v128, b: v128, m: v128): v128
1424+ ```
1425+ <details><summary>Selects 8-, 16-, 32- or 64-bit integer lanes as indicated by T from a or b based on masks in m.</summary>
1426+
1427+ | T | Instruction
1428+ |----------|-------------
1429+ | i8, u8 | i8x16.relaxed_laneselect
1430+ | i16, u16 | i16x8.relaxed_laneselect
1431+ | i32, u32 | i32x4.relaxed_laneselect
1432+ | i64, u64 | i64x2.relaxed_laneselect
1433+ </details>
1434+
1435+ Behaves like `v128.bitselect` if each mask lane in `m` has all bits either set (result is `a[i]`) or unset (result is `b[i]`). Otherwise the result is implementation-defined, depending on hardware capabilities: if the most significant bit of `m[i]` is set, the result is either `bitselect(a[i], b[i], m[i])` or `a[i]`; otherwise the result is `b[i]`.
1436+
1437+ * ```ts
1438+ function v128.relaxed_min<T>(a: v128, b: v128): v128
1439+ ```
1440+ <details><summary>Computes the minimum of each 32- or 64-bit floating point lane as indicated by T.</summary>
1441+
1442+ | T | Instruction
1443+ |----------|-------------
1444+ | f32 | f32x4.relaxed_min
1445+ | f64 | f64x2.relaxed_min
1446+ </details>
1447+
1448+ Unlike `v128.min`, the result is implementation-defined if either value is `NaN` or if the two values are `-0.0` and `+0.0`, depending on hardware capabilities: either `a[i]` or `b[i]`.
1449+
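The cases in which the result is not fixed can be enumerated with a small scalar model. This is a sketch in ordinary TypeScript on JavaScript numbers, following only the description above; `relaxedMinOutcomes` is a hypothetical helper that returns every result an implementation may produce for one lane.

```ts
// Returns the permitted results for one lane (hypothetical helper, not the builtin).
function relaxedMinOutcomes(a: number, b: number): number[] {
  const hasNaN = Number.isNaN(a) || Number.isNaN(b);
  // -0 === 0 is true in JavaScript, so Object.is distinguishes the signs.
  const mixedZeros = a === 0 && b === 0 && !Object.is(a, b);
  if (hasNaN || mixedZeros) {
    return [a, b]; // implementation-defined: either input may be returned
  }
  return [Math.min(a, b)]; // otherwise the ordinary minimum
}

const ordinary = relaxedMinOutcomes(1, 2);    // one possible result
const withNaN = relaxedMinOutcomes(NaN, 2);   // two possible results
const zeros = relaxedMinOutcomes(-0, 0);      // two possible results
```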
1450+ * ```ts
1451+ function v128.relaxed_max<T>(a: v128, b: v128): v128
1452+ ```
1453+ <details><summary>Computes the maximum of each 32- or 64-bit floating point lane as indicated by T.</summary>
1454+
1455+ | T | Instruction
1456+ |----------|-------------
1457+ | f32 | f32x4.relaxed_max
1458+ | f64 | f64x2.relaxed_max
1459+ </details>
1460+
1461+ Unlike `v128.max`, the result is implementation-defined if either value is `NaN` or if the two values are `-0.0` and `+0.0`, depending on hardware capabilities: either `a[i]` or `b[i]`.
1462+
1463+ * ```ts
1464+ function v128.relaxed_q15mulr<T>(a: v128, b: v128): v128
1465+ ```
1466+ <details><summary>Performs the lane-wise rounding multiplication in Q15 format ((a[i] * b[i] + (1 << (Q - 1))) >> Q where Q=15).</summary>
1467+
1468+ | T | Instruction
1469+ |----------|-------------
1470+ | i16 | i16x8.relaxed_q15mulr_s
1471+ </details>
1472+
1473+ Unlike `v128.q15mulr_sat`, the result is implementation-defined if both inputs are the minimum signed value: either the minimum or maximum signed value.
1474+
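The formula in the summary, and the one input pair where the relaxed variant may differ, can be worked through for a single 16-bit lane. This is a scalar sketch in ordinary TypeScript, not the builtin; the helper names are hypothetical.

```ts
const Q = 15;

// The raw Q15 rounding product from the summary, before narrowing to 16 bits.
function q15mulrRaw(a: number, b: number): number {
  return (a * b + (1 << (Q - 1))) >> Q;
}

// Narrowing by saturation, as v128.q15mulr_sat is described to do.
function q15mulrSaturated(a: number, b: number): number {
  return Math.min(32767, Math.max(-32768, q15mulrRaw(a, b)));
}

// Narrowing by wrapping to 16 bits, the other outcome the relaxed variant allows.
function q15mulrWrapped(a: number, b: number): number {
  return (q15mulrRaw(a, b) << 16) >> 16;
}

const half = 16384;                                // 0.5 in Q15
const quarter = q15mulrRaw(half, half);            // 8192, i.e. 0.25 in Q15
const satEdge = q15mulrSaturated(-32768, -32768);  // 32767 (maximum)
const wrapEdge = q15mulrWrapped(-32768, -32768);   // -32768 (minimum)
```

Only `-32768 * -32768` produces a raw value (32768) outside the 16-bit signed range, which is why it is the single implementation-defined case.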
1475+ * ```ts
1476+ function v128.relaxed_dot<T>(a: v128, b: v128): v128
1477+ ```
1478+ <details><summary>Computes the dot product of two 8-bit integer lanes each, yielding lanes one size wider than the input.</summary>
1479+
1480+ | T | Instruction
1481+ |----------|-------------
1482+ | i16 | i16x8.relaxed_dot_i8x16_i7x16_s
1483+ </details>
1484+
1485+ Unlike `v128.dot`, if the most significant bit of `b[i]` is set, whether `b[i]` is interpreted as signed or unsigned by the intermediate multiplication is implementation-defined.
1486+
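The effect of the signed or unsigned interpretation of `b[i]` can be shown with a scalar model over typed arrays. This is a sketch in ordinary TypeScript that follows only the description above; `relaxedDotModel` and its `bAsUnsigned` switch are hypothetical, and overflow of the pairwise sum is outside what the text covers, so the example uses values that stay in range.

```ts
// Scalar model: each 16-bit output lane is the sum of two adjacent 8-bit products.
// `bAsUnsigned` selects how lanes of b with the top bit set are interpreted.
function relaxedDotModel(a: Int8Array, b: Int8Array, bAsUnsigned: boolean): Int16Array {
  const out = new Int16Array(8);
  for (let i = 0; i < 8; i++) {
    let sum = 0;
    for (let j = 0; j < 2; j++) {
      const k = 2 * i + j;
      const bv = bAsUnsigned ? b[k] & 0xff : b[k];
      sum += a[k] * bv;
    }
    out[i] = sum; // storing into Int16Array wraps; not exercised by this example
  }
  return out;
}

const va = new Int8Array(16).fill(2);
const vb = new Int8Array(16).fill(3);
vb[0] = -1; // bit pattern 0xff: most significant bit set

const signedResult = relaxedDotModel(va, vb, false);   // lane 0: 2*(-1) + 2*3 = 4
const unsignedResult = relaxedDotModel(va, vb, true);  // lane 0: 2*255 + 2*3 = 516
```

Keeping every lane of `b` within the 7-bit range \[0-127] avoids the ambiguity, since both interpretations then agree.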
1487+ * ```ts
1488+ function v128.relaxed_dot_add<T>(a: v128, b: v128, c: v128): v128
1489+ ```
1490+ <details><summary>Computes the dot product of two 8-bit integer lanes each, yielding lanes two sizes wider than the input with the lanes of c accumulated into the result.</summary>
1491+
1492+ | T | Instruction
1493+ |----------|-------------
1494+ | i32 | i32x4.relaxed_dot_i8x16_i7x16_add_s
1495+ </details>
1496+
1497+ Unlike `v128.dot`, if the most significant bit of `b[i]` is set, whether `b[i]` is interpreted as signed or unsigned by the intermediate multiplication is implementation-defined.
1498+
1499+
13511500### Inline instructions
13521501
13531502In addition to using the generic builtins above, most WebAssembly instructions can be written directly in AssemblyScript code. For example, the following is equivalent: