v28.0.0 - Mesh Shaders, Immediates, and More!
Mesh shaders have been a long time coming. See the tracking issue for more information. They are now fully supported on Vulkan, and supported on Metal and DX12 with passthrough shaders. WGSL parsing and rewriting is supported, meaning they can be used through WESL or naga_oil.
Mesh shader pipelines replace the standard vertex shader pipelines and allow new ways to render meshes. They are ideal for meshlet rendering, a form of rendering where small groups of triangles are handled together, for both culling and rendering.
They are compute-like shaders, and generate primitives which are passed directly to the rasterizer, rather than having a list of vertices generated individually and then using a static index buffer. This means that certain computations on nearby groups of triangles can be done together, the relationship between vertices and primitives is more programmable, and you can even pass non-interpolated per-primitive data to the fragment shader, independent of vertices.
Mesh shaders are very versatile, and are powerful enough to replace vertex shaders, tessellation shaders, and geometry shaders on their own or with task shaders.
A full example of mesh shaders in use can be seen in the mesh_shader example. For the full specification of mesh shaders in wgpu, go to docs/api-specs/mesh_shading.md. Below is a small snippet of shader code demonstrating their usage:
@task
@payload(taskPayload)
@workgroup_size(1)
fn ts_main() -> @builtin(mesh_task_size) vec3<u32> {
// Task shaders can use workgroup variables like compute shaders
workgroupData = 1.0;
// Pass some data to all mesh shaders dispatched by this workgroup
taskPayload.colorMask = vec4(1.0, 1.0, 0.0, 1.0);
taskPayload.visible = 1;
// Dispatch a mesh shader grid with one workgroup
return vec3(1, 1, 1);
}
@mesh(mesh_output)
@payload(taskPayload)
@workgroup_size(1)
fn ms_main(@builtin(local_invocation_index) index: u32, @builtin(global_invocation_id) id: vec3<u32>) {
// Set how many outputs this workgroup will generate
mesh_output.vertex_count = 3;
mesh_output.primitive_count = 1;
// Can also use workgroup variables
workgroupData = 2.0;
// Set vertex outputs
mesh_output.vertices[0].position = positions[0];
mesh_output.vertices[0].color = colors[0] * taskPayload.colorMask;
mesh_output.vertices[1].position = positions[1];
mesh_output.vertices[1].color = colors[1] * taskPayload.colorMask;
mesh_output.vertices[2].position = positions[2];
mesh_output.vertices[2].color = colors[2] * taskPayload.colorMask;
// Set the vertex indices for the only primitive
mesh_output.primitives[0].indices = vec3<u32>(0, 1, 2);
// Cull it if the data passed by the task shader says to
mesh_output.primitives[0].cull = taskPayload.visible == 0;
// Give a noninterpolated per-primitive vec4 to the fragment shader
mesh_output.primitives[0].colorMask = vec4<f32>(1.0, 0.0, 1.0, 1.0);
}
This was a monumental effort from many different people, but it was championed by @inner-daemons, without whom it would not have happened. Thank you @cwfitzgerald for doing the bulk of the code review. Finally thank you @ColinTimBarndt for coordinating the testing effort.
Reviewers:
- @cwfitzgerald
- @jimblandy
- @ErichDonGubler
wgpu Contributions:
- Metal implementation in wgpu-hal. By @inner-daemons in #8139.
- DX12 implementation in wgpu-hal. By @inner-daemons in #8110.
- Vulkan implementation in wgpu-hal. By @inner-daemons in #7089.
- wgpu/wgpu-core implementation. By @inner-daemons in #7345.
- New mesh shader limits and validation. By @inner-daemons in #8507.
naga Contributions:
- Naga IR implementation. By @inner-daemons in #8104.
- wgsl-in implementation in naga. By @inner-daemons in #8370.
- spv-out implementation in naga. By @inner-daemons in #8456.
- wgsl-out implementation in naga. By @Slightlyclueless in #8481.
- Allow barriers in mesh/task shaders. By @inner-daemons in #8749
Testing Assistance:
- @ColinTimBarndt
- @AdamK2003
- @Mhowser
- @9291Sam
- 3 more testers who wished to remain anonymous.
Thank you to everyone who made this happen!
The vulkan backend now uses gpu-allocator, the allocator already used in the dx12 backend. This allows the allocator to be configured the same way in both backends, converging their behavior.
This also brings the Device::generate_allocator_report feature to the vulkan backend.
By @DeltaEvo in #8158.
BREAKING CHANGE: enumerate_adapters is now async:
- pub fn enumerate_adapters(&self, backends: Backends) -> Vec<Adapter> {
+ pub fn enumerate_adapters(&self, backends: Backends) -> impl Future<Output = Vec<Adapter>> {
This yields two benefits:
- This method is now implemented on non-native using the standard Adapter::request_adapter(…), making enumerate_adapters a portable surface. This was previously a nontrivial pain point when an application wanted to do some of its own filtering of adapters.
- This method can now be implemented in custom backends.
By @R-Cramer4 in #8230
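As a rough illustration of what the async signature means for callers, the sketch below awaits an adapter list from synchronous code and then filters it. Everything in it is a stand-in that uses only the standard library: `FakeAdapter`, the free function `enumerate_adapters`, and the tiny `block_on` executor are hypothetical, and a real application would call the method on its wgpu instance and would more likely use an existing executor.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Unparks the blocked thread when the future can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal executor: polls `fut` on the current thread until it completes.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical stand-in for an adapter (not `wgpu::Adapter`).
#[derive(Debug, Clone, PartialEq)]
pub struct FakeAdapter {
    pub name: &'static str,
    pub discrete: bool,
}

/// Hypothetical stand-in that only mimics the new "returns a future" shape.
pub async fn enumerate_adapters() -> Vec<FakeAdapter> {
    vec![
        FakeAdapter { name: "integrated", discrete: false },
        FakeAdapter { name: "discrete", discrete: true },
    ]
}

/// Application-side filtering happens after the list has been awaited.
pub fn discrete_adapter_names() -> Vec<&'static str> {
    block_on(enumerate_adapters())
        .into_iter()
        .filter(|adapter| adapter.discrete)
        .map(|adapter| adapter.name)
        .collect()
}
```

Inside an async function the `block_on` call would simply become `.await`.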
In the case where a renderpass unconditionally writes to all pixels in the rendertarget, Load can cause unnecessary memory traffic, and Clear can spend time unnecessarily clearing the rendertargets. DontCare is a new LoadOp which will leave the contents of the rendertarget undefined. Because this could lead to undefined behavior, this API requires that the user gives an unsafe token to use the api.
While you can use this unconditionally, on platforms where DontCare is not available, it will internally use a different load op.
load: LoadOp::DontCare(unsafe { wgpu::LoadOpDontCare::enabled() })
By @cwfitzgerald in #8549
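The "unsafe token" idea can be sketched without wgpu. In this minimal model the token's only constructor is an `unsafe fn`, so choosing the variant that carries it forces the caller to write an `unsafe` block. `DontCareToken`, `ToyLoadOp`, and `describe` are hypothetical names used for illustration and are not wgpu's definitions.

```rust
/// Hypothetical token type. Outside this module it can only be obtained
/// through the `unsafe` constructor below, because the field is private.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DontCareToken {
    _private: (),
}

impl DontCareToken {
    /// # Safety
    /// The caller promises that every pixel of the target is written
    /// before any pixel is read.
    pub unsafe fn enabled() -> Self {
        Self { _private: () }
    }
}

/// Toy load operation modelled on the three options discussed above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToyLoadOp {
    Clear([f32; 4]),
    Load,
    DontCare(DontCareToken),
}

/// Summarizes what each load operation guarantees about existing contents.
pub fn describe(op: ToyLoadOp) -> &'static str {
    match op {
        ToyLoadOp::Clear(_) => "clear to a known color",
        ToyLoadOp::Load => "preserve existing contents",
        ToyLoadOp::DontCare(_) => "contents undefined until written",
    }
}
```

The token carries no data; its purpose is to make the safety promise visible at the call site.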
SamplerDescriptor::mipmap_filter now takes a MipmapFilterMode instead of a FilterMode. This is a breaking change that aligns wgpu with the spec.
SamplerDescriptor {
...
- mipmap_filter: FilterMode::Nearest
+ mipmap_filter: MipmapFilterMode::Nearest
...
}
By @sagudev in #8314.
Multiview is a feature that allows rendering the same content to multiple layers of a texture. This is useful primarily in VR where you wish to display almost identical content to 2 views, just with a different perspective. Instead of using 2 draw calls or 2 instances for each object, you can use this feature.
Multiview is also called view instancing in DX12 or vertex amplification in Metal.
Multiview has been reworked, adding support for Metal and DX12, and adding testing and validation to wgpu itself. This change also introduces a view bitmask, the new multiview_mask field in RenderPassDescriptor, which allows a render pass to render to multiple non-adjacent layers when using the SELECTIVE_MULTIVIEW feature. If you don't use multiview, you can set this field to None.
- wgpu::RenderPassDescriptor {
- label: None,
- color_attachments: &color_attachments,
- depth_stencil_attachment: None,
- timestamp_writes: None,
- occlusion_query_set: None,
- }
+ wgpu::RenderPassDescriptor {
+ label: None,
+ color_attachments: &color_attachments,
+ depth_stencil_attachment: None,
+ timestamp_writes: None,
+ occlusion_query_set: None,
+ multiview_mask: NonZero::new(3),
+ }
One other breaking change worth noting is that in WGSL @builtin(view_index) now requires a type of u32, where previously it required i32.
By @inner-daemons in #8206.
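To make the bitmask concrete, here is a small sketch that builds a mask from a list of layer indices, with bit N set meaning layer N is rendered. The helper name `multiview_mask_from_layers` is hypothetical, and the sketch assumes the mask is a 32-bit value wrapped in `Option<NonZeroU32>`, which is what the `NonZero::new(3)` in the diff above suggests.

```rust
use std::num::NonZeroU32;

/// Builds a view bitmask from layer indices (hypothetical helper).
/// Returns `None` when no layers are given, i.e. multiview is not used.
pub fn multiview_mask_from_layers(layers: &[u32]) -> Option<NonZeroU32> {
    let mut bits = 0u32;
    for &layer in layers {
        // A 32-bit mask can only address layers 0 through 31.
        assert!(layer < 32, "layer index {layer} does not fit in a 32-bit mask");
        bits |= 1 << layer;
    }
    NonZeroU32::new(bits)
}
```

Under this reading, the value `3` from the diff selects layers 0 and 1, while a mask of `5` would select the non-adjacent layers 0 and 2.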
- device.push_error_scope(wgpu::ErrorFilter::Validation);
+ let scope = device.push_error_scope(wgpu::ErrorFilter::Validation);
// ... perform operations on the device ...
- let error: Option<Error> = device.pop_error_scope().await;
+ let error: Option<Error> = scope.pop().await;
Device error scopes now operate on a per-thread basis. This allows them to be used easily within multithreaded contexts, without having the error scope capture errors from other threads.
When the std feature is not enabled, we have no way to differentiate between threads, so error scopes revert to being global operations.
By @cwfitzgerald in #8685
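The per-thread behavior can be pictured with a toy model built on `thread_local!`. This is only an illustration of the semantics described above, not wgpu's implementation: `push_scope`, `report_error`, `ScopeGuard`, and `demo` are hypothetical names, errors are plain strings, and `pop` is synchronous here whereas the real call is awaited.

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // Each thread owns its own stack of open scopes (toy model).
    static SCOPES: RefCell<Vec<Option<String>>> = RefCell::new(Vec::new());
}

/// Hypothetical guard returned when a scope is pushed.
pub struct ScopeGuard {
    depth: usize,
}

/// Opens a new scope on the current thread.
pub fn push_scope() -> ScopeGuard {
    SCOPES.with(|scopes| {
        let mut scopes = scopes.borrow_mut();
        scopes.push(None);
        ScopeGuard { depth: scopes.len() }
    })
}

impl ScopeGuard {
    /// Closes the scope and returns the first error it captured, if any.
    pub fn pop(self) -> Option<String> {
        SCOPES.with(|scopes| {
            let mut scopes = scopes.borrow_mut();
            assert_eq!(scopes.len(), self.depth, "scopes must be popped in LIFO order");
            scopes.pop().flatten()
        })
    }
}

/// Records an error in the innermost scope of the *current* thread only.
pub fn report_error(message: &str) {
    SCOPES.with(|scopes| {
        if let Some(slot) = scopes.borrow_mut().last_mut() {
            if slot.is_none() {
                *slot = Some(message.to_string());
            }
        }
    });
}

/// Returns (error seen by the main scope, error seen by the worker scope).
pub fn demo() -> (Option<String>, Option<String>) {
    let main_scope = push_scope();
    let worker = thread::spawn(|| {
        let scope = push_scope();
        report_error("worker failure");
        scope.pop()
    });
    let worker_error = worker.join().unwrap();
    (main_scope.pop(), worker_error)
}
```

In this model the error raised on the worker thread is captured by the worker's scope and never reaches the scope opened on the calling thread.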
We have received complaints about wgpu being way too log spammy at log levels info/warn/error. We have adjusted our log policy and changed logging such that info and above should be silent unless some exceptional event happens. Our new log policy is as follows:
- Error: if we can’t (for some reason, usually a bug) communicate an error any other way.
- Warning: similar, but there may be one-shot warnings about usage that is almost certainly sub-optimal.
- Info: do not use
- Debug: Used for interesting events happening inside wgpu.
- Trace: Used for all events that might be useful to either wgpu or application developers.
By @cwfitzgerald in #8579.
As the "immediate data" api is getting close to stabilization in the WebGPU specification, we're bringing our implementation in line with what the spec dictates.
First, in the PipelineLayoutDescriptor, you now pass a unified size for all stages:
- push_constant_ranges: &[wgpu::PushConstantRange {
- stages: wgpu::ShaderStages::VERTEX_FRAGMENT,
- range: 0..12,
- }]
+ immediate_size: 12,
Second, on the command encoder you no longer specify a shader stage; uploads apply to all shader stages that use immediate data.
- rpass.set_push_constants(wgpu::ShaderStages::FRAGMENT, 0, bytes);
+ rpass.set_immediates(0, bytes);
Third, immediates are now declared with the immediate address space instead of the push_constant address space. Due to a known issue on DX12 it is advised to always use a structure for your immediates until that issue is fixed.
- var<push_constant> my_pc: MyPushConstant;
+ var<immediate> my_imm: MyImmediate;
Finally, our implementation currently still zero-initializes the immediate data range you declared in the pipeline layout. This is not spec compliant, and failing to populate immediate "slots" that are used in the shader will be a validation error in a future version. See the proposal for details on determining which slots are populated in a given shader.
By @cwfitzgerald in #8724.
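Since uploads are now a single byte slice shared by all stages, it can help to keep one CPU-side struct whose serialized size matches the declared immediate_size. The sketch below is a minimal, std-only illustration: the struct `MyImmediate` and its fields are hypothetical, chosen so that three 4-byte scalars add up to the 12 bytes used in the diff above, and native byte order is assumed.

```rust
/// Matches the `immediate_size: 12` used in the pipeline layout example.
pub const IMMEDIATE_SIZE: u32 = 12;

/// Hypothetical CPU-side mirror of a WGSL struct with three 4-byte scalars:
/// `struct MyImmediate { time: f32, scale: f32, layer: u32 }`.
pub struct MyImmediate {
    pub time: f32,
    pub scale: f32,
    pub layer: u32,
}

impl MyImmediate {
    /// Serializes the fields in declaration order using native endianness.
    pub fn to_bytes(&self) -> [u8; IMMEDIATE_SIZE as usize] {
        let mut bytes = [0u8; IMMEDIATE_SIZE as usize];
        bytes[0..4].copy_from_slice(&self.time.to_ne_bytes());
        bytes[4..8].copy_from_slice(&self.scale.to_ne_bytes());
        bytes[8..12].copy_from_slice(&self.layer.to_ne_bytes());
        bytes
    }
}
```

The resulting array could then be passed as the `bytes` argument in a call like `rpass.set_immediates(0, &bytes)`. Scalars are used deliberately here: vector members such as `vec3<f32>` carry stricter alignment in WGSL and would change the struct's size.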
To bring our code in line with the WebGPU spec, we have moved information about subgroup size from limits to adapter info. Limits was not the correct place for this anyway, and we had some code special casing those limits.
Additionally we have renamed the fields to match the spec.
- let min = limits.min_subgroup_size;
+ let min = info.subgroup_min_size;
- let max = limits.max_subgroup_size;
+ let max = info.subgroup_max_size;
By @cwfitzgerald in #8609.
- Added support for transient textures on Vulkan and Metal. By @opstic in #8247
- Implement shader triangle barycentric coordinate builtins. By @atlv24 in #8320.
- Added support for binding arrays of storage textures on Metal. By @msvbg in #8464
- Added support for multisampled texture arrays on Vulkan through adapter feature MULTISAMPLE_ARRAY. By @LaylBongers in #8571.
- Added get_configuration to wgpu::Surface, that returns the current configuration of wgpu::Surface. By @sagudev in #8664.
- Add wgpu_core::Global::create_bind_group_layout_error. By @ErichDonGubler in #8650.
- Require new enable extensions when using ray queries and position fetch (wgpu_ray_query, wgpu_ray_query_vertex_return). By @Vecvec in #8545.
- Texture now has from_custom. By @R-Cramer4 in #8315.
- Using both the wgpu command encoding APIs and CommandEncoder::as_hal_mut on the same encoder will now result in a panic.
- Allow include_spirv! and include_spirv_raw! macros to be used in constants and statics. By @clarfonthey in #8250.
- Added support for rendering onto multi-planar textures. By @noituri in #8307.
- Validation errors from CommandEncoder::finish() will report the label of the invalid encoder. By @kpreid in #8449.
- Corrected documentation of the minimum alignment of the end of a mapped range of a buffer (it is 4, not 8). By @kpreid in #8450.
- util::StagingBelt now takes a Device when it is created instead of when it is used. By @kpreid in #8462.
- wgpu_hal::vulkan::Texture API changes to handle externally-created textures and memory more flexibly. By @s-ol in #8512, #8521.
- Render passes are now validated against the maxColorAttachmentBytesPerSample limit. By @andyleiserson in #8697.
- Prevent UB with invalid ray query calls on spirv. By @Vecvec in #8390.
- Update the set of binding_array capabilities. In most cases, they are set automatically from wgpu features, and this change should not be user-visible. By @andyleiserson in #8671.
- Naga now accepts the var<function> syntax for declaring local variables. By @andyleiserson in #8710.
- Fixed a bug where mapping sub-ranges of a buffer on web would fail with OperationError: GPUBuffer.getMappedRange: GetMappedRange range extends beyond buffer's mapped range. By @ryankaplan in #8349
- Reject fragment shader output locations > max_color_attachments limit. By @ErichDonGubler in #8316.
- WebGPU device requests now support the required limits maxColorAttachments and maxColorAttachmentBytesPerSample. By @evilpie in #8328
- Reject binding indices that exceed wgpu_types::Limits::max_bindings_per_bind_group when deriving a bind group layout for a pipeline. By @jimblandy in #8325.
- Removed three features from wgpu-hal which did nothing useful: "cargo-clippy", "gpu-allocator", and "rustc-hash". By @kpreid in #8357.
- wgpu_types::PollError now always implements the Error trait. By @kpreid in #8384.
- The texture subresources used by the color attachments of a render pass are no longer allowed to overlap when accessed via different texture views. By @andyleiserson in #8402.
- The STORAGE_READ_ONLY texture usage is now permitted to coexist with other read-only usages. By @andyleiserson in #8490.
- Validate that buffers are unmapped in write_buffer calls. By @ErichDonGubler in #8454.
- Shorten critical section inside present such that the snatch write lock is no longer held during present, where it prevented other work from happening on other threads. By @cwfitzgerald in #8608.
- The || and && operators now "short circuit", i.e., do not evaluate the RHS if the result can be determined from just the LHS. By @andyleiserson in #7339.
- Fix a bug that resulted in the Metal error program scope variable must reside in constant address space in some cases. By @teoxoy in #8311.
- Handle rayQueryTerminate in spv-out instead of ignoring it. By @Vecvec in #8581.
- Align copies between textures and buffers via a single intermediate buffer per copy when D3D12_FEATURE_DATA_D3D12_OPTIONS13.UnrestrictedBufferTextureCopyPitchSupported is false. By @ErichDonGubler in #7721.
- Fix detection of Int64 Buffer/Texture atomic features. By @cwfitzgerald in #8667.
- Fixed a validation error regarding atomic memory semantics. By @atlv24 in #8391.
- Fixed a variety of feature detection related bugs. By @inner-daemons in #8439.
- Fixed a bug where the texture aspect was not passed through when calling copy_texture_to_buffer in WebGPU, causing the copy to fail for depth/stencil textures. By @Tim-Evans-Seequent in #8445.
- Fix race when downloading texture from compute shader pass. By @SpeedCrash100 in #8527
- Fix double window class registration when dynamic libraries are used. By @Azorlogh in #8548
- Fix context loss on device initialization on GL3.3-4.1 contexts. By @cwfitzgerald in #8674.
- VertexFormat::Unorm10_10_10_2 can now be used on gl backends. By @mooori in #8717.
- DropCallbacks are now called after dropping all other fields of their parent structs. By @jerzywilczek in #8353
v27.0.4
This release includes wgpu-hal version 27.0.4. All other crates remain at their previous versions.
- Remove fragile dependency constraint on ordered-float that prevented semver-compatible changes above 5.0.0. By @kpreid in #8371.
- Work around extremely poor frame pacing from AMD and Nvidia cards on Windows in Fifo and FifoRelaxed present modes. This is due to the drivers implicitly using a DXGI (Direct3D) swapchain to implement these modes and it having vastly different timing properties. See https://github.com/gfx-rs/wgpu/issues/8310 and https://github.com/gfx-rs/wgpu/issues/8354 for more information. By @cwfitzgerald in #8420.
v26.0.6
This release includes wgpu-hal version 26.0.6. All other crates remain at their previous versions.
- Work around extremely poor frame pacing from AMD and Nvidia cards on Windows in Fifo and FifoRelaxed present modes. This is due to the drivers implicitly using a DXGI (Direct3D) swapchain to implement these modes and it having vastly different timing properties. See https://github.com/gfx-rs/wgpu/issues/8310 and https://github.com/gfx-rs/wgpu/issues/8354 for more information. By @cwfitzgerald in #8420.
v27.0.3
This release includes naga, wgpu-core and wgpu-hal version 27.0.3. All other crates remain at their previous versions.
- Fix a bug that resulted in the Metal error program scope variable must reside in constant address space in some cases. Backport of #8311 by @teoxoy.
- Remove an assertion that causes problems if CommandEncoder::as_hal_mut is used. By @andyleiserson in #8387.
v26.0.5
This release includes wgpu-hal version 26.0.5. All other crates remain at their previous versions.
v27.0.0
You may schedule buffer mapping and a submission-complete callback to run automatically after you submit, directly from encoders, command buffers, and passes.
// Record some GPU work so the submission isn't empty and touches `buffer`.
encoder.clear_buffer(&buffer, 0, None);
// Defer mapping until this encoder is submitted.
encoder.map_buffer_on_submit(&buffer, wgpu::MapMode::Read, 0..size, |result| { .. });
// Fires after the command buffer's work is finished.
encoder.on_submitted_work_done(|| { .. });
// Automatically calls `map_async` and `on_submitted_work_done` after this submission finishes.
queue.submit([encoder.finish()]);
Available on CommandEncoder, CommandBuffer, RenderPass, and ComputePass.
By @cwfitzgerald in #8125.
By enabling DirectComposition support, the dx12 backend can now support transparent windows.
This creates a single IDCompositionVisual over the entire window that is used by the mfSurface. If a user wants to manage the composition tree themselves, they should create their own device and composition, and pass the relevant visual down into wgpu via SurfaceTargetUnsafe::CompositionVisual.
let instance = wgpu::Instance::new(&wgpu::InstanceDescriptor {
backend_options: wgpu::BackendOptions {
dx12: wgpu::Dx12BackendOptions {
presentation_system: wgpu::Dx12SwapchainKind::DxgiFromVisual,
..
},
..
},
..
});
By @n1ght-hunter in #7550.
We have merged the acceleration structure feature into the RayQuery feature. This is to help work around an AMD driver bug and reduce the feature complexity of ray tracing. In the future when ray tracing pipelines are implemented, if either feature is enabled, acceleration structures will be available.
- Features::EXPERIMENTAL_RAY_TRACING_ACCELERATION_STRUCTURE
+ Features::EXPERIMENTAL_RAY_QUERY
By @Vecvec in #7913.
We have added Features::EXPERIMENTAL_PRECOMPILED_SHADERS, replacing existing passthrough types with a unified CreateShaderModuleDescriptorPassthrough which allows passing multiple shader codes for different backends. By @SupaMaggie70Incorporated in #7834
Difference for SPIR-V passthrough:
- device.create_shader_module_passthrough(wgpu::ShaderModuleDescriptorPassthrough::SpirV(
- wgpu::ShaderModuleDescriptorSpirV {
- label: None,
- source: spirv_code,
- },
- ))
+ device.create_shader_module_passthrough(wgpu::ShaderModuleDescriptorPassthrough {
+ entry_point: "main".into(),
+ label: None,
+ spirv: Some(spirv_code),
+ ..Default::default()
})
This allows using precompiled shaders without manually checking which backend's code to pass, for example if you have shaders precompiled for both DXIL and SPIR-V.
Buffer::get_mapped_range(), Buffer::get_mapped_range_mut(), and Queue::write_buffer_with() now return guard objects without any lifetimes. This makes it significantly easier to store these types in structs, which is useful for building utilities that build the contents of a buffer over time.
- let buffer_mapping_ref: wgpu::BufferView<'_> = buffer.get_mapped_range(..);
- let buffer_mapping_mut: wgpu::BufferViewMut<'_> = buffer.get_mapped_range_mut(..);
- let queue_write_with: wgpu::QueueWriteBufferView<'_> = queue.write_buffer_with(..);
+ let buffer_mapping_ref: wgpu::BufferView = buffer.get_mapped_range(..);
+ let buffer_mapping_mut: wgpu::BufferViewMut = buffer.get_mapped_range_mut(..);
+ let queue_write_with: wgpu::QueueWriteBufferView = queue.write_buffer_with(..);
By @sagudev in #8046 and @cwfitzgerald in #8070.
We want to be able to expose potentially experimental features to our users before we have ensured that they are fully sound to use. As such, we now require any feature that is prefixed with EXPERIMENTAL to have a special unsafe token enabled in the device descriptor acknowledging that the features may still have bugs in them and to report any they find.
adapter.request_device(&wgpu::DeviceDescriptor {
features: wgpu::Features::EXPERIMENTAL_MESH_SHADER,
experimental_features: unsafe { wgpu::ExperimentalFeatures::enabled() },
..
})
By @cwfitzgerald in #8163.
We have removed Features::MULTI_DRAW_INDIRECT as it was unconditionally available on all platforms. RenderPass::multi_draw_indirect is now available if the device supports downlevel flag DownlevelFlags::INDIRECT_EXECUTION.
The Features::MULTI_DRAW_INDIRECT_COUNT feature can be used to determine if multi-draw is supported natively on the device. This is helpful to know if you are using spirv-passthrough and gl_DrawID in your shaders.
By @cwfitzgerald in #8162.
We removed wgpu::PollType::WaitForSubmissionIndex and added fields to wgpu::PollType::Wait in order to express timeouts.
Before/after for wgpu::PollType::Wait:
-device.poll(wgpu::PollType::Wait).unwrap();
-device.poll(wgpu::PollType::wait_indefinitely()).unwrap();
+device.poll(wgpu::PollType::Wait {
+ submission_index: None, // Wait for most recent submission
+ timeout: Some(std::time::Duration::from_secs(60)), // Previous behavior, but more likely you want `None` instead.
+ })
+ .unwrap();
Before/after for wgpu::PollType::WaitForSubmissionIndex:
-device.poll(wgpu::PollType::WaitForSubmissionIndex(index_to_wait_on))
+device.poll(wgpu::PollType::Wait {
+ submission_index: Some(index_to_wait_on),
+ timeout: Some(std::time::Duration::from_secs(60)), // Previous behavior, but more likely you want `None` instead.
+ })
+ .unwrap();
Previously, wgpu::PollType::WaitForSubmissionIndex and wgpu::PollType::Wait had a hard-coded timeout of 60 seconds.
To wait indefinitely on the latest submission, you can also use the wait_indefinitely convenience function:
device.poll(wgpu::PollType::wait_indefinitely());
- Added mesh shader support to wgpu, with examples. Requires passthrough. By @SupaMaggie70Incorporated in #7345.
- Added support for external textures based on WebGPU's GPUExternalTexture. These allow shaders to transparently operate on potentially multiplanar source texture data in either RGB or YCbCr formats via WGSL's texture_external type. This is gated behind the Features::EXTERNAL_TEXTURE feature, which is currently only supported on DX12. By @jamienicol in #4386.
- wgpu::Device::poll can now specify a timeout via wgpu::PollType::Wait. By @wumpf in #8282 & #8285
- Expose naga::front::wgsl::UnimplementedEnableExtension. By @ErichDonGubler in #8237.
- Command encoding now happens when CommandEncoder::finish is called, not when the individual operations are requested. This does not affect the API, but may affect performance characteristics. By @andyleiserson in #8220.
- Prevent resources for acceleration structures being created if acceleration structures are not enabled. By @Vecvec in #8036.
- Validate that each push_debug_group pairs with exactly one pop_debug_group. By @andyleiserson in #8048.
- set_viewport now requires that the supplied minimum depth value is less than the maximum depth value. By @andyleiserson in #8040.
- Validation of copy_texture_to_buffer, copy_buffer_to_texture, and copy_texture_to_texture operations more closely follows the WebGPU specification. By @andyleiserson in various PRs.
  - Copies within the same texture must not overlap.
  - Copies of multisampled or depth/stencil formats must span an entire subresource (layer).
  - Copies of depth/stencil formats must be 4B aligned.
  - For texture-buffer copies, bytes_per_row on the buffer side must be 256B-aligned, even if the transfer is a single row.
- The offset for set_vertex_buffer and set_index_buffer must be 4B aligned. By @andyleiserson in #7929.
- The offset and size of bindings are validated as fitting within the underlying buffer in more cases. By @andyleiserson in #7911.
- The function you pass to Device::on_uncaptured_error() must now implement Sync in addition to Send, and be wrapped in Arc instead of Box. In exchange for this, it is no longer possible for calling wgpu functions while in that callback to cause a deadlock (not that we encourage you to actually do that). By @kpreid in #8011.
- Make a compacted hal acceleration structure inherit a label from the base BLAS. By @Vecvec in #8103.
- The limits requested for a device must now satisfy min_subgroup_size <= max_subgroup_size. By @andyleiserson in #8085.
- Improve errors when buffer mapping is done incorrectly. Allow aliasing immutable BufferViews. By @cwfitzgerald in #8150.
- Require new F16_IN_F32 downlevel flag for quantizeToF16, pack2x16float, and unpack2x16float in WGSL input. By @aleiserson in #8130.
- The error message for non-copyable depth/stencil formats no longer mentions the aspect when it is not relevant. By @reima in #8156.
- Track the initialization status of buffer memory correctly when copy_texture_to_buffer skips over padding space between rows or layers, or when the start/end of a texture-buffer transfer is not 4B aligned. By @andyleiserson in #8099.
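Two of the alignment rules in the list above are easy to check ahead of time on the CPU. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the constants 256 and 4 are copied from the list rather than taken from wgpu's own definitions.

```rust
/// Row pitch alignment for texture-buffer copies, per the list above.
pub const BYTES_PER_ROW_ALIGNMENT: u32 = 256;
/// Offset alignment for vertex and index buffers, per the list above.
pub const BUFFER_OFFSET_ALIGNMENT: u64 = 4;

/// Rounds the tightly packed row size up to the next multiple of 256 bytes.
/// This applies even when only a single row is transferred.
pub fn padded_bytes_per_row(width_texels: u32, bytes_per_texel: u32) -> u32 {
    let unpadded = width_texels * bytes_per_texel;
    unpadded.div_ceil(BYTES_PER_ROW_ALIGNMENT) * BYTES_PER_ROW_ALIGNMENT
}

/// Returns true if `offset` satisfies the 4-byte alignment rule.
pub fn is_valid_buffer_offset(offset: u64) -> bool {
    offset % BUFFER_OFFSET_ALIGNMENT == 0
}
```

For example, a 100-texel row of a 4-byte format occupies 400 bytes tightly packed, so the buffer side of the copy needs a row pitch of 512 bytes.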
- naga now requires that no type be larger than 1 GB. This limit may be lowered in the future; feedback on an appropriate value for the limit is welcome. By @andyleiserson in #7950.
- If the shader source contains control characters, naga now replaces them with U+FFFD ("replacement character") in diagnostic output. By @andyleiserson in #8049.
- Add f16 IO polyfill on Vulkan backend to enable SHADER_F16 use without requiring storageInputOutput16. By @cryvosh in #7884.
- For custom Naga backend authors: naga::proc::Namer now accepts reserved keywords using two new dedicated types, proc::{KeywordSet, CaseInsensitiveKeywordSet}. By @kpreid in #8136.
- BREAKING: Previously the WGSL storage-texture format rg11b10float was incorrectly accepted and generated by naga. naga now only accepts and generates the correct name rg11b10ufloat instead. By @ErikWDev in #8219.
- The source() method of ShaderError no longer reports the error as its own source. By @andyleiserson in #8258.
- naga correctly ingests SPIR-V that uses descriptor runtime indexing, which in turn is correctly converted into WGSL's binding array. By @hasenbanck in #8256.
- naga correctly ingests SPIR-V that loads from multi-sampled textures, which in turn is correctly converted into WGSL's texture_multisampled_2d and load operations. By @hasenbanck in #8270.
- naga implements OpImageGather and OpImageDrefGather operations when ingesting SPIR-V. By @hasenbanck in #8280.
- Allow disabling waiting for latency waitable object. By @marcpabst in #7400
- Add mesh shader support, including to the example. By @SupaMaggie70Incorporated in #8110
- Validate that effective buffer binding size is aligned to 4 when creating bind groups with buffer entries. By @ErichDonGubler in #8041.
- Create an event per wait to prevent 60 second hangs in certain multithreaded scenarios. By @Vecvec in #8273.
- Fixed a bug where access to matrices with 2 rows would not work in some cases. By @andyleiserson in #7438.
- Fixed unwrap failed in context creation for some Android devices. By @uael in #8024.
- Fixed wrong color format+space being reported versus what is hardcoded in create_swapchain(). By @MarijnS95 in #8226.
- [wgsl-in] Allow a trailing comma in @blend_src(…) attributes. By @ErichDonGubler in #8137.
- [wgsl-in] Allow a trailing comma in the list of case values inside a switch. By @reima in #8165.
- Escape, rather than strip, identifiers with Unicode. By @ErichDonGubler in #7995.
- Clarify that subgroup barriers require both the SUBGROUP and SUBGROUP_BARRIER features / capabilities. By @andyleiserson in #8203.