Moka 0.12.15
- Fixed a bug where re-inserting an expired entry could cause it to lose its expiration time and remain in the cache indefinitely when using a custom Expiry policy with per-entry expiration. (#582 by @jiangzhe, #581 by @atrocities, reported in #575):
  - This occurred when an entry that had expired but not yet been evicted was re-inserted, and expire_after_update returned None. This primarily affected users who only override expire_after_create, since the default expire_after_update returns duration_until_expiry, which is None for expired entries.
  - This bug was introduced by the changes in v0.12.13 (#549 and #564).
  - Subtle behavior change:
    - Before this fix, re-inserting an expired entry was treated as an update, so Expiry::expire_after_update was called.
    - After this fix, re-inserting an expired entry is treated as a creation, so Expiry::expire_after_create is called instead.
    - This may change the expiration time of re-inserted entries, depending on your Expiry trait implementation.
- Fixed flaky tests cht::segment::tests::drop_many_values and drop_many_values_concurrent that were failing on high-core-count machines (#586):
  - These tests were using a CPU-dependent segment count, causing inconsistent bucket array shrinking behavior of the internal segmented hash map across different machines.
  - Changed these tests to use a fixed segment count (4) for consistent results.
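The behavior change for re-inserted expired entries can be illustrated with a small standalone model. This is only a sketch: `ExpiryModel`, `Slot`, `CreateOnlyPolicy`, and both `ttl_*` functions are hypothetical names invented for illustration, and the trait is a simplified stand-in rather than moka's actual `Expiry` trait (whose methods take keys, values, and timestamps).

```rust
use std::time::Duration;

/// Toy stand-in for a per-entry expiry policy (hypothetical, not moka's trait).
trait ExpiryModel {
    /// Default: no per-entry expiration on creation.
    fn expire_after_create(&self) -> Option<Duration> {
        None
    }
    /// Default: keep whatever time is left, mirroring the release note's
    /// description of the default `expire_after_update`.
    fn expire_after_update(&self, duration_until_expiry: Option<Duration>) -> Option<Duration> {
        duration_until_expiry
    }
}

/// A policy that only overrides `expire_after_create`, the case the notes
/// describe as primarily affected.
struct CreateOnlyPolicy;

impl ExpiryModel for CreateOnlyPolicy {
    fn expire_after_create(&self) -> Option<Duration> {
        Some(Duration::from_secs(10))
    }
}

/// State of the cache slot at the moment of the insert (hypothetical).
enum Slot {
    Vacant,
    Live { remaining: Duration },
    ExpiredNotEvicted,
}

/// Modeled old behavior: an expired-but-present entry counts as an update,
/// and its remaining duration is `None`.
fn ttl_before_fix<P: ExpiryModel>(policy: &P, slot: &Slot) -> Option<Duration> {
    match slot {
        Slot::Vacant => policy.expire_after_create(),
        Slot::Live { remaining } => policy.expire_after_update(Some(*remaining)),
        Slot::ExpiredNotEvicted => policy.expire_after_update(None),
    }
}

/// Modeled new behavior: an expired-but-present entry counts as a creation.
fn ttl_after_fix<P: ExpiryModel>(policy: &P, slot: &Slot) -> Option<Duration> {
    match slot {
        Slot::Vacant | Slot::ExpiredNotEvicted => policy.expire_after_create(),
        Slot::Live { remaining } => policy.expire_after_update(Some(*remaining)),
    }
}
```

In this model, `ttl_before_fix` yields `None` for `Slot::ExpiredNotEvicted` (the entry would never expire), while `ttl_after_fix` yields the ten-second duration from `expire_after_create`.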
v4.5.2
- PDF word splitting in extracted text: Pdfium's text extraction inserted spurious spaces mid-word (e.g. "s hall a b e active" instead of "shall be active"). Added selective page-level respacing: pages with detected broken word spacing are re-extracted using character-level gap analysis (font_size × 0.33 threshold). Clean pages use the fast single-call path. Reduces garbled lines from 406 to 0 on the ISO 21111-10 test document with no performance impact.
- Markdown underscore escaping: Underscores in extracted text (e.g. CTC_ARP_01) were incorrectly escaped as CTC\_ARP\_01 throughout the markdown output. Underscore escaping has been removed entirely since extracted PDF text contains literal identifiers, not markdown formatting.
- Page header/footer leakage: Running headers like ISO 21111-10:2021(E) and copyright footers leaked into the document body. Added fuzzy alphanumeric matching to detect repeated header/footer text even when spacing or character extraction varies across pages.
- R batch function spurious NULL argument: R wrapper batch functions passed an extra NULL positional argument to native Rust functions, causing "unused argument" errors on all batch operations.
- Elixir Windows ORT DLL staging: ONNX Runtime DLL was only staged in target/release/ but not in priv/native/ where the BEAM VM loads NIFs. OCR/layout/embedding features now work correctly on Windows CI.
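The character-level gap analysis mentioned in the PDF word-splitting fix can be sketched roughly as follows. Only the font_size × 0.33 threshold comes from the release note; the `Glyph` struct, its fields, the `respace` function, and the choice to discard extractor-reported spaces are assumptions made for illustration, not kreuzberg's actual implementation.

```rust
/// One extracted glyph with its horizontal extent in page units
/// (hypothetical type; field names are assumptions).
struct Glyph {
    ch: char,
    left: f32,
    right: f32,
}

/// Rebuilds a line of text from glyph geometry: a space is emitted only when
/// the horizontal gap between consecutive glyphs exceeds `font_size * 0.33`.
/// Whitespace glyphs reported by the extractor are ignored (an assumption),
/// so spurious mid-word spaces do not survive.
fn respace(glyphs: &[Glyph], font_size: f32) -> String {
    let threshold = font_size * 0.33;
    let mut out = String::new();
    let mut prev_right: Option<f32> = None;

    for g in glyphs {
        if g.ch.is_whitespace() {
            continue;
        }
        if let Some(right) = prev_right {
            if g.left - right > threshold {
                out.push(' ');
            }
        }
        out.push(g.ch);
        prev_right = Some(g.right);
    }
    out
}
```

With a font size of 10, a 0.5-unit gap between glyphs stays inside a word, while a 4.0-unit gap becomes a word boundary.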
- General extraction result caching: All file types (PDF, Office, HTML, archives, etc.) are now cached — not just OCR results. Repeated extractions of the same file with the same config return instantly from cache.
- Cache namespace isolation: New cache_namespace field on ExtractionConfig enables multi-tenant cache isolation on shared filesystems. Available via --cache-namespace CLI flag and across all language bindings.
- Per-request cache TTL: New cache_ttl_secs field on ExtractionConfig overrides the global TTL for individual extractions. Set to 0 to skip cache entirely. Available via --cache-ttl-secs CLI flag.
- Cache namespace deletion: delete_namespace() removes all cache entries under a namespace. get_stats_filtered() returns per-namespace statistics.
- Multi-worker cleanup safety: Cache cleanup no longer triggers excessively when multiple worker pods share the same cache directory.
- Bundled eng.traineddata: English OCR works out of the box with zero runtime configuration (~4MB bundled at build time).
- Tessdata in cache warm: kreuzberg-cli cache warm now downloads all tessdata_fast language files (~120 languages) to KREUZBERG_CACHE_DIR/tessdata/, giving full Tesseract language support without system packages.
- Tessdata in cache manifest: kreuzberg-cli cache manifest now includes all tessdata files with source URLs, enabling --sync-cache to download tessdata alongside models.
- KREUZBERG_CACHE_DIR/tessdata resolution: resolve_tessdata_path() now checks KREUZBERG_CACHE_DIR/tessdata and the bundled build path before falling back to system paths.
- CLI embed command: Generate vector embeddings from text via kreuzberg embed --text "..." --preset balanced.
- CLI chunk command: Split text into chunks via kreuzberg chunk --text "..." --chunk-size 512.
- CLI completions command: Generate shell completions for bash, zsh, fish, powershell.
- CLI --log-level global flag: Override RUST_LOG via kreuzberg --log-level debug extract doc.pdf.
- CLI extraction overrides: 27 flags exposed via ExtractionOverrides struct with #[command(flatten)].
- CLI colored output: Text output uses anstyle for colored headers, labels, success values, and dim separators. Respects NO_COLOR env var.
- API POST /detect, GET /version, GET /cache/manifest, POST /cache/warm: New REST endpoints.
- MCP get_version, cache_manifest, cache_warm, embed_text, chunk_text: New MCP tools.
- Pipeline table extraction tracing: Zero-cost tracing::trace! and tracing::debug! logging throughout layout detection and table extraction.
- TATR model availability check: Layout detection returns an error if table regions are detected but the TATR model is unavailable.
- CLI batch flags: Batch command now supports all extraction override flags via shared ExtractionOverrides struct.
- CLI config architecture: Replaced 13-parameter function with ExtractionOverrides struct using #[command(flatten)].
- MCP tool architecture: Removed dead tools/ trait-based duplicates; all tools implemented directly in server.rs.
- CLI validation: OCR backend values, chunk size/overlap bounds, DPI range, layout confidence validated.
- API validation: Embedding preset names and chunk bounds checked.
- MCP validation: Empty paths rejected, chunk bounds checked, embedding preset validated.
- Chunk overlap auto-clamping: When --chunk-size is smaller than default overlap, overlap is automatically clamped to size/4.
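The overlap auto-clamping rule reduces to a one-line computation. The sketch below is an assumption-laden illustration: `effective_overlap` is a hypothetical function name, the default overlap is passed in because the notes do not state its value, and integer division is assumed for size/4.

```rust
/// Returns the overlap to use for a given chunk size.
/// If the chunk size is smaller than the default overlap, the overlap is
/// clamped to a quarter of the chunk size; otherwise the default is kept.
/// (Hypothetical helper; not kreuzberg's actual function.)
fn effective_overlap(chunk_size: usize, default_overlap: usize) -> usize {
    if chunk_size < default_overlap {
        chunk_size / 4 // integer division is an assumption
    } else {
        default_overlap
    }
}
```

For instance, with an assumed default overlap of 200, a chunk size of 100 gives an overlap of 25, while a chunk size of 512 keeps the default.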
See full changelog: https://github.com/kreuzberg-dev/kreuzberg/blob/main/CHANGELOG.md
v4.5.1
See CHANGELOG.md for release notes.
Benchmark Results 2026-03-21 (d062479)
Comparative benchmark results from workflow run 23359982805.
Commit: d0624792f343e8dab8c7468dcc2a2c4930741157 Date: 2026-03-21
v4.5.0
See CHANGELOG.md for full release notes.
v4.5.0
- ONNX-based document layout detection: New layout config field enables document layout analysis using RT-DETR v2 with 17 element classes. Supports "fast" and "accurate" presets with auto-downloaded models. Available across all language bindings.
- SLANet table structure recognition: Detected Table regions are processed by SLANet-plus for neural HTML structure recovery, producing markdown tables with colspan/rowspan support.
- Layout-enhanced heading detection: Layout model SectionHeader and Title regions guide heading detection in both structure tree and heuristic extraction.
- Multi-backend OCR pipeline: New OcrPipelineConfig enables quality-based fallback across OCR backends (e.g., Tesseract then PaddleOCR) with configurable priority, language, and backend-specific settings.
- OCR quality thresholds: New OcrQualityThresholds config with 16 tunable parameters for OCR output quality assessment and fallback decisions.
- OCR auto-rotate: New OcrConfig.auto_rotate flag (default: false) for automatic page rotation detection. Handles 0/90/180/270 degree rotations.
- PaddleOCR v2 model tier system: New model_tier field with "mobile" (default, ~21MB, fast) and "server" (~172MB, highest accuracy). Both use unified multilingual models.
- AccelerationConfig for GPU/execution provider control: Fine-grained control over ONNX execution providers (CPU, CoreML, CUDA, TensorRT) for layout detection and table recognition. Typed across all bindings.
- ConcurrencyConfig for thread limiting: New max_threads field caps Rayon, ONNX intra-op threads, and batch concurrency to a single limit.
- EmailConfig for MSG fallback codepage: Configurable fallback codepage for MSG files lacking a codepage property.
- Per-file extraction configuration (FileExtractionConfig): Per-file config overrides in batch operations.
- Opt-in single-column pseudo tables: New allow_single_column_tables on PdfConfig.
- Experimental: pdf_oxide text extraction backend: Pure Rust PDF text extraction as an alternative to pdfium.
- CLI cache warm command: Eagerly downloads all PaddleOCR and layout detection models.
- CLI cache manifest command: Outputs a JSON manifest of all expected model files with SHA256 checksums.
- ChunkSizing configuration: sizing_type, sizing_model, and sizing_cache_dir fields exposed in ChunkingConfig across all bindings.
- Chunk heading context: New HeadingContext type in ChunkMetadata providing heading level and text.
- ModelManifestEntry type and manifest()/ensure_all_models() methods: Public API for querying and eagerly downloading model cache manifests.
- SF1 structural quality metrics in benchmark CI: SF1 quality scores now computed alongside TF1.
- Layout preset default: Changed from "fast" to "accurate". The Fast variant has been removed.
- PaddleOCR default model tier: Changed from "server" to "mobile". Mobile models provide equivalent quality on standard documents while being 3-5x faster.
- PaddleOCR v2 models: All models updated to v2 generation (PP-OCRv5 detection, PP-LCNet classification, unified multilingual recognition).
- Unified multilingual recognition models: PP-OCRv5 unified server (84MB) and mobile (16.5MB) models replace per-script English and Chinese models.
- Batch API unification: _with_configs batch functions removed; per-file FileExtractionConfig is now an optional parameter on the unified batch functions.
- Layout pipeline no longer forces heuristic extraction: Structure tree extraction proceeds normally when layout detection is enabled.
- Global ONNX model caching: Layout detection and SLANet models are cached globally and reused across extractions.
- Vendored text embedding pipeline: Replaced fastembed dependency with vendored engine using ONNX Runtime directly.
- Embedding embed() now takes &self instead of &mut self: Enables parallel embedding generation.
- L2 normalization parallelized: Embedding batches >= 64 vectors now use multi-threaded normalization.
- Language-agnostic section pattern recognition: Headings ending with a period are now allowed when they match structural patterns.
- Layout classification guards: Heading overrides from the layout model now have word count limits, punctuation checks, and body-font-size validation.
- Strong typing across bindings: Replaced weak Dictionary/Map/array types with strongly typed config classes in C#, Java, and PHP.
- fastembed dependency: Replaced by vendored embedding engine using ONNX Runtime directly.
- EmbeddingModelType::FastEmbed variant: Use Preset or Custom variants instead.
- C# FFI struct layout mismatch (#538): Fixed CExtractionResult struct layout between Rust and C# that made the C# library completely broken in 4.4.6.
- PDF force_ocr without explicit OCR config (#495): force_ocr=true was silently ignored when no ocr config block was provided.
- PDF image extraction (#511): Extracted images returned raw compressed data instead of properly decoded image bytes.
- Node.js extractFileInWorker mime_type passthrough (#523): MIME type was silently injected into PDF password config instead of being forwarded.
- DOCX parser type inference failure (#519): The zip 8.2.0 dependency introduced type ambiguity in DOCX and XML parsers.
- Python py.typed and .pyi missing from sdist: Type stubs now included in both wheel and sdist.
- PDF broken CMap word spacing: Geometric validation now vetoes false word boundaries in PDFs with broken font CMaps.
- PDF structure tree heading trust: Structure tree heading tags (H1-H6) are now trusted as author-intent metadata.
- PDF structure tree extraction performance: Text and style maps now built in a single pass.
- OCR Picture regions suppressing text: Layout-detected Picture regions now preserve embedded text.
- Non-transitive sort comparators: Spatial reading-order sorts now use discrete row buckets.
- Page furniture over-stripping: Added bulk and per-paragraph guards to prevent aggressive furniture stripping.
- KREUZBERG_CACHE_DIR not respected by all caches: All caches now honor the environment variable.
- MSG PT_STRING8 encoding: MSG files now correctly decode ANSI string properties using the declared Windows code page.
- SLANet-Plus ONNX model: Re-exported with shape fix, resolving inference failures on macOS CoreML.
- TATR model panic in batch processing: Now falls back gracefully to heuristic table extraction.
- Docker musl builds: Alpine/musl Docker images now link against the system ONNX Runtime library.
- FFI batch functions null handling: C#/Java FFI batch functions now accept NULL for file_config_jsons.
opentelemetry-otlp 0.31.1
- feat(OTLP): add tls-ring, tls-aws-lc, and tls-provider-agnostic feature flags [patch release v0.31.1] by @lalitb in https://github.com/open-telemetry/opentelemetry-rust/pull/3426
Full Changelog: https://github.com/open-telemetry/opentelemetry-rust/compare/v0.31.0...opentelemetry-otlp-0.31.1
v29.0.0
Surface::get_current_texture no longer returns Result<SurfaceTexture, SurfaceError>. Instead, it returns a single CurrentSurfaceTexture enum that represents all possible outcomes as variants. SurfaceError has been removed, and the suboptimal field on SurfaceTexture has been replaced by a dedicated Suboptimal variant.
match surface.get_current_texture() {
    wgpu::CurrentSurfaceTexture::Success(frame) => { /* render */ }
    wgpu::CurrentSurfaceTexture::Timeout
    | wgpu::CurrentSurfaceTexture::Occluded => { /* skip frame */ }
    wgpu::CurrentSurfaceTexture::Outdated
    | wgpu::CurrentSurfaceTexture::Suboptimal(_) => { /* reconfigure surface */ }
    wgpu::CurrentSurfaceTexture::Lost => { /* reconfigure surface, or recreate device if device lost */ }
    wgpu::CurrentSurfaceTexture::Validation => {
        /* Only happens if there is a validation error and you
           have registered an error scope or uncaptured error handler. */
    }
}
By @cwfitzgerald, @Wumpf, and @emilk in #9141 and #9257.
A display handle represents a connection to the platform's display server (e.g. a Wayland or X11 connection on Linux). This is distinct from a window — a display handle is the system-level connection through which windows are created and managed.
InstanceDescriptor's convenience constructors (an implementation of Default and the static from_env_or_default method) have been removed. In their place are new static methods that force recognition of whether a display handle is used:
- new_with_display_handle
- new_with_display_handle_from_env
- new_without_display_handle
- new_without_display_handle_from_env
If you are using winit, this can be populated using EventLoop::owned_display_handle.
- InstanceDescriptor::default();
- InstanceDescriptor::from_env_or_default();
+ InstanceDescriptor::new_with_display_handle(Box::new(event_loop.owned_display_handle()));
+ InstanceDescriptor::new_with_display_handle_from_env(Box::new(event_loop.owned_display_handle()));
Additionally, DisplayHandle is now optional when creating a surface if a display handle was already passed to InstanceDescriptor. This means that once you've provided the display handle at instance creation time, you no longer need to pass it again for each surface you create.
By @MarijnS95 in #8782
This allows gaps in bind group layouts and adds full support for unbinding, bringing us into compliance with the WebGPU spec. As a result, PipelineLayoutDescriptor's bind_group_layouts field now has the type &[Option<&BindGroupLayout>]. To migrate, wrap bind group layout references in Some:
let pl_desc = wgpu::PipelineLayoutDescriptor {
label: None,
bind_group_layouts: &[
- &bind_group_layout
+ Some(&bind_group_layout)
],
immediate_size: 0,
};
By @teoxoy in #9034.
wgpu now has a new MSRV policy. This release has an MSRV of 1.87. This is lower than v27's 1.88 and v28's 1.92. Going forward, we will only bump wgpu's MSRV if it has tangible benefits for the code, and we will never bump to an MSRV higher than stable - 3. So if stable is at 1.97 and 1.94 brought benefit to our code, we could bump it no higher than 1.94. As before, MSRV bumps will always be breaking changes.
By @cwfitzgerald in #8999.
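The "stable - 3" ceiling in the MSRV policy is simple arithmetic on Rust minor versions. A minimal sketch, with hypothetical helper names that are not part of wgpu:

```rust
/// Highest MSRV (as a Rust 1.x minor version) the policy permits, given the
/// current stable minor version. Hypothetical helper for illustration only.
fn highest_allowed_msrv(stable_minor: u32) -> u32 {
    stable_minor.saturating_sub(3)
}

/// Whether bumping the MSRV to `proposed_minor` stays within the policy's
/// ceiling. The policy's other condition (a tangible benefit to the code)
/// is a human judgment and is not modeled here.
fn bump_within_ceiling(stable_minor: u32, proposed_minor: u32) -> bool {
    proposed_minor <= highest_allowed_msrv(stable_minor)
}
```

This matches the example in the text: with stable at 1.97, the ceiling is 1.94, so a bump to 1.94 is within bounds and a bump to 1.95 is not.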
To ensure memory safety when accessing mapped GPU memory, MapMode::Write buffer mappings (BufferViewMut and also QueueWriteBufferView) can no longer be dereferenced to Rust &mut [u8]. Instead, they must be used through the new pointer type wgpu::WriteOnly<[u8]>, which does not allow reading at all.
WriteOnly<[u8]> is designed to offer similar functionality to &mut [u8] and have almost no performance overhead, but you will probably need to make some changes for anything more complicated than get_mapped_range_mut().copy_from_slice(my_data); in particular, replacing view[start..end] with view.slice(start..end).
By @kpreid in #9042.
The depth_write_enabled and depth_compare members of DepthStencilState are now optional, and may be omitted when they do not apply, to match WebGPU.
depth_write_enabled is applicable, and must be Some, if format has a depth aspect, i.e., is a depth or depth/stencil format. Otherwise, a value of None best reflects that it does not apply, although Some(false) is also accepted.
depth_compare is applicable, and must be Some, if depth_write_enabled is Some(true), or if depth_fail_op for either stencil face is not Keep. Otherwise, a value of None best reflects that it does not apply, although Some(CompareFunction::Always) is also accepted.
There is also a new constructor DepthStencilState::stencil which may be used instead of a struct literal for stencil operations.
Example 1: A configuration that does a depth test and writes updated values:
depth_stencil: Some(wgpu::DepthStencilState {
format: wgpu::TextureFormat::Depth32Float,
- depth_write_enabled: true,
- depth_compare: wgpu::CompareFunction::Less,
+ depth_write_enabled: Some(true),
+ depth_compare: Some(wgpu::CompareFunction::Less),
stencil: wgpu::StencilState::default(),
bias: wgpu::DepthBiasState::default(),
}),
Example 2: A configuration with only stencil:
depth_stencil: Some(wgpu::DepthStencilState {
format: wgpu::TextureFormat::Stencil8,
- depth_write_enabled: false,
- depth_compare: wgpu::CompareFunction::Always,
+ depth_write_enabled: None,
+ depth_compare: None,
stencil: wgpu::StencilState::default(),
bias: wgpu::DepthBiasState::default(),
}),
Example 3: The previous example written using the new stencil() constructor:
depth_stencil: Some(wgpu::DepthStencilState::stencil(
wgpu::TextureFormat::Stencil8,
wgpu::StencilState::default(),
)),
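The applicability rules for depth_write_enabled and depth_compare can be restated as a small standalone check. This is a toy model, not wgpu's validation code: `DepthStencilModel`, `StencilOp`, `Compare`, and `validate` are hypothetical names, and only the two rules stated above are modeled.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum StencilOp {
    Keep,
    Replace,
}

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Compare {
    Less,
    Always,
}

/// Simplified stand-in for the relevant parts of a depth/stencil state.
struct DepthStencilModel {
    format_has_depth_aspect: bool,
    depth_write_enabled: Option<bool>,
    depth_compare: Option<Compare>,
    front_depth_fail_op: StencilOp,
    back_depth_fail_op: StencilOp,
}

/// Checks the two "must be Some" rules described in the text.
fn validate(state: &DepthStencilModel) -> Result<(), &'static str> {
    // Rule 1: a format with a depth aspect requires depth_write_enabled.
    if state.format_has_depth_aspect && state.depth_write_enabled.is_none() {
        return Err("depth_write_enabled must be Some for a depth format");
    }
    // Rule 2: depth_compare is required when depth writes are on, or when
    // either stencil face has a depth_fail_op other than Keep.
    let compare_required = state.depth_write_enabled == Some(true)
        || state.front_depth_fail_op != StencilOp::Keep
        || state.back_depth_fail_op != StencilOp::Keep;
    if compare_required && state.depth_compare.is_none() {
        return Err("depth_compare must be Some in this configuration");
    }
    Ok(())
}
```

Under this model, the stencil-only configuration from Example 2 (both fields None, both depth_fail_op values Keep) passes, while a depth format with depth_write_enabled left as None is rejected.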
Added support for loading a specific DirectX 12 Agility SDK runtime via the Independent Devices API. The Agility SDK lets applications ship a newer D3D12 runtime alongside their binary, unlocking the latest D3D12 features without waiting for an OS update.
Configure it programmatically:
let options = wgpu::Dx12BackendOptions {
agility_sdk: Some(wgpu::Dx12AgilitySDK {
sdk_version: 619,
sdk_path: "path/to/sdk/bin/x64".into(),
}),
..Default::default()
};
Or via environment variables:
WGPU_DX12_AGILITY_SDK_PATH=path/to/sdk/bin/x64
WGPU_DX12_AGILITY_SDK_VERSION=619
The sdk_version must match the version of the D3D12Core.dll in the provided path exactly, or loading will fail.
If the Agility SDK fails to load (e.g. version mismatch, missing DLL, or unsupported OS), wgpu logs a warning and falls back to the system D3D12 runtime.
By @cwfitzgerald in #9130.
WGSL shaders using @builtin(primitive_index) must now request it with enable primitive_index;. The SHADER_PRIMITIVE_INDEX feature has been renamed to PRIMITIVE_INDEX and moved from FeaturesWGPU to FeaturesWebGPU. By @inner-daemons in #8879 and @andyleiserson in #9101.
- device.features().contains(wgpu::FeaturesWGPU::SHADER_PRIMITIVE_INDEX)
+ device.features().contains(wgpu::FeaturesWebGPU::PRIMITIVE_INDEX)
// WGSL shaders must now include this directive:
enable primitive_index;
Migrated from the max_inter_stage_shader_components limit to max_inter_stage_shader_variables, following the latest WebGPU spec. Components counted individual scalars (e.g. a vec4 = 4 components), while variables counts locations (e.g. a vec4 = 1 variable). This changes validation in a way that should not affect most programs. By @ErichDonGubler in #8652, #8792.
- limits.max_inter_stage_shader_components
+ limits.max_inter_stage_shader_variables
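The difference between counting components and counting variables can be shown with a short sketch. The `IoType` enum and both counting functions are hypothetical and deliberately simplified: they cover only user-defined scalar and vector outputs and ignore anything else the real limit may account for.

```rust
/// Shape of one inter-stage output (hypothetical, simplified).
#[derive(Clone, Copy)]
enum IoType {
    Scalar,
    Vec2,
    Vec3,
    Vec4,
}

impl IoType {
    /// Number of individual scalars in this type.
    fn components(self) -> u32 {
        match self {
            IoType::Scalar => 1,
            IoType::Vec2 => 2,
            IoType::Vec3 => 3,
            IoType::Vec4 => 4,
        }
    }
}

/// Old-style count: every scalar component counts separately.
fn count_components(outputs: &[IoType]) -> u32 {
    outputs.iter().map(|t| t.components()).sum()
}

/// New-style count: each output occupies one location, whatever its width.
fn count_variables(outputs: &[IoType]) -> u32 {
    outputs.len() as u32
}
```

For a shader stage that outputs a vec4, a vec2, and a scalar, this model counts 7 components but only 3 variables.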
- Use clearer field names for StageError::InvalidWorkgroupSize. By @ErichDonGubler in #9192.
- Added TLAS binding array support via ACCELERATION_STRUCTURE_BINDING_ARRAY. By @kvark in #8923.
- Added wgpu-naga-bridge crate with conversions between naga and wgpu-types (features to capabilities, storage format mapping, shader stage mapping). By @atlv24 in #9201.
- Added support for cooperative load/store operations in shaders. Currently only WGSL on the input and SPIR-V, METAL, and WGSL on the output are supported. By @kvark in #8251.
- Added support for per-vertex attributes in fragment shaders. Currently only WGSL input is supported, and only SPIR-V or WGSL output is supported. By @atlv24 in #8821.
- Added support for no-perspective barycentric coordinates. By @atlv24 in #8852.
- Added support for obtaining AdapterInfo from Device. By @sagudev in #8807.
- Added Limits::or_worse_values_from. By @atlv24 in #8870.
- Added Features::FLOAT32_BLENDABLE on Vulkan and Metal. By @timokoesters in #8963 and @andyleiserson in #9032.
- Added Dx12BackendOptions::force_shader_model to allow using advanced features in passthrough shaders without bundling DXC. By @inner-daemons in #8984.
- Changed passthrough shaders to not require an entry point parameter, so that the same shader module may be used in multiple entry points. Also added support for metallib passthrough. By @inner-daemons in #8886.
- Added Dx12Compiler::Auto to automatically use static or dynamic DXC if available, before falling back to FXC. By @inner-daemons in #8882.
- Added support for insert_debug_marker, push_debug_group and pop_debug_group on WebGPU. By @evilpie in #9017.
- Added support for @builtin(draw_index) to the Vulkan backend. By @inner-daemons in #8883.
- Added TextureFormat::channels method to get some information about which color channels are covered by the texture format. By @TornaxO7 in #9167
- BREAKING: Add V6_8 variant to DxcShaderModel and naga::back::hlsl::ShaderModel. By @inner-daemons in #8882 and @ErichDonGubler in #9083.
- BREAKING: Add V6_9 variant to DxcShaderModel and naga::back::hlsl::ShaderModel. By @ErichDonGubler in #9083.
- Initial wgsl-in ray tracing pipelines. By @Vecvec in #8570.
- wgsl-out ray tracing pipelines. By @Vecvec in #8970.
- Allow parsing shaders which make use of SPV_KHR_non_semantic_info for debug info. Also removes naga::front::spv::SUPPORTED_EXT_SETS. By @inner-daemons in #8827.
- Added memory decorations for storage buffers: coherent, supported on all native backends, and volatile, only on Vulkan and GL. By @atlv24 in #9168.
- Made the following available in const contexts; by @ErichDonGubler in #8943:
  - naga
    - Arena::len
    - Arena::is_empty
    - Range::first_and_last
    - front::wgsl::Frontend::set_options
    - ir::Block::is_empty
    - ir::Block::len
- Added GlDebugFns option in GlBackendOptions to control OpenGL debug functions (glPushDebugGroup, glPopDebugGroup, glObjectLabel, etc.). Automatically disables them on Mali GPUs to work around a driver crash. By @Xavientois in #8931.
- Added support for insert_debug_marker, push_debug_group and pop_debug_group. By @evilpie in #9017.
- Added support for begin_occlusion_query and end_occlusion_query. By @evilpie in #9039.
- Tracing now uses the .metal extension for Metal source files, instead of .msl. By @inner-daemons in #8880.
- BREAKING: Several error APIs were changed by @ErichDonGubler in #9073 and #9205:
  - BufferAccessError:
    - Split the OutOfBoundsOverrun variant into new OutOfBoundsStartOffsetOverrun and OutOfBoundsEndOffsetOverrun variants.
    - Removed the NegativeRange variant in favor of new MapStartOffsetUnderrun and MapStartOffsetOverrun variants.
  - Split the TransferError::BufferOverrun variant into new BufferStartOffsetOverrun and BufferEndOffsetOverrun variants.
  - ImmediateUploadError:
    - Removed the TooLarge variant in favor of new StartOffsetOverrun and EndOffsetOverrun variants.
    - Removed the Unaligned variant in favor of new StartOffsetUnaligned and SizeUnaligned variants.
    - Added the ValueStartIndexOverrun and ValueEndIndexOverrun variants.
- The various "max resources per stage" limits are now capped at 100, so that their total remains below max_bindings_per_bind_group, as required by WebGPU. By @andyleiserson in #9118.
- The max_uniform_buffer_binding_size and max_storage_buffer_binding_size limits are now u64 instead of u32, to match WebGPU. By @wingertge in #9146.
- The main 3 native backends now report their limits properly. By @teoxoy in #9196.
- Naga and wgpu now reject shaders with an enable directive for functionality that is not available, even if that functionality is not used by the shader. By @andyleiserson in #8913.
- Prevent UB from incorrectly using ray queries on HLSL. By @Vecvec in #8763.
- Added support for dual-source blending in SPIR-V shaders. By @andyleiserson in #8865.
- Added supported_capabilities to all backends. By @inner-daemons in #9068.
- Updated codespan-reporting to 0.13. By @cwfitzgerald in #9243.
- Use autogenerated objc2 bindings internally, which should resolve a lot of leaks and unsoundness. By @madsmtm in #5641.
- Implements ray-tracing acceleration structures for the Metal backend. By @lichtso in #8071.
- Remove mutex for MTLCommandQueue because the Metal object is thread-safe. By @andyleiserson in #9217.
- Expose the GPU.wgslLanguageFeatures property. By @andyleiserson in #8884.
- GPUFeatureName now includes all wgpu extensions. Feature names for extensions should be written with a wgpu- prefix, although unprefixed names that were accepted previously are still accepted. By @andyleiserson in #9163.
- Make ordered texture and buffer uses hal specific. By @NiklasEi in #8924.
- Tracing support has been restored. By @andyleiserson in #8429.
- Pipelines using passthrough shaders now correctly require explicit pipeline layout. By @inner-daemons in #8881.
- Allow using a shader that defines I/O for dual-source blending in a pipeline that does not make use of it. By @andyleiserson in #8856.
- Improve validation of dual-source blending, by @andyleiserson in #9200:
  - Validate structs with @blend_src members whether or not they are used by an entry point.
  - Dual-source blending is not supported when there are multiple color attachments.
  - TypeFlags::IO_SHAREABLE is not set for structs other than @blend_src structs.
- Validate strip_index_format isn't None and equals index buffer format for indexed drawing with strip topology. By @beicause in #8850.
- BREAKING: Renamed EXPERIMENTAL_PASSTHROUGH_SHADERS to PASSTHROUGH_SHADERS and made this no longer an experimental feature. By @inner-daemons in #9054.
- BREAKING: End offsets in trace and player commands are now represented using offset + size instead. By @ErichDonGubler in #9073.
- Validate some uncaught cases where buffer transfer operations could overflow when computing an end offset. By @ErichDonGubler in #9073.
- Fix local_invocation_id and local_invocation_index being written multiple times in HLSL/MSL backends, and naming conflicts when users name variables __local_invocation_id or __local_invocation_index. By @inner-daemons in #9099.
- Added internal labels to validation GPU objects and timestamp normalization code to improve clarity in graphics debuggers. By @szostid in #9094
- Fix multi-planar texture copying. By @noituri in #9069
- The validator checks that override-sized arrays have a positive size, if overrides have been resolved. By @andyleiserson in #8822.
- Fix some cases where f16 constants were not working. By @andyleiserson in #8816.
- Use wrapping arithmetic when evaluating constant expressions involving u32. By @andyleiserson in #8912.
- Fix missing side effects from sequence expressions in GLSL. By @Vipitis in #8787.
- Naga now enforces the @must_use attribute on WGSL built-in functions, when applicable. You can waive the error with a phony assignment, e.g., _ = subgroupElect(). By @andyleiserson in #8713.
- Reject zero-value construction of a runtime-sized array with a validation error. Previously it would crash in the HLSL backend. By @mooori in #8741.
- Reject splat vector construction if the argument type does not match the type of the vector's scalar. Previously it would succeed. By @mooori in #8829.
- Fixed workgroupUniformLoad incorrectly returning an atomic when called on an atomic; it now returns the inner T as per the spec. By @cryvosh in #8791.
- Fixed constant evaluation for sign() builtin to return zero when the argument is zero. By @mandryskowski in #8942.
- Allow array generation to compile with the macOS 10.12 Metal compiler. By @madsmtm in #8953
- Naga now detects bitwise shifts by a constant exceeding the operand bit width at compile time, and disallows scalar-by-vector and vector-by-scalar shifts in constant evaluation. By @andyleiserson in #8907.
- Naga uses wrapping arithmetic when evaluating dot products on concrete integer types (u32 and i32). By @BKDaugherty in #9142.
- Disallow negation of a matrix in WGSL. By @andyleiserson in #9157.
- Fix evaluation order of compound assignment (e.g. +=) LHS and RHS. By @andyleiserson in #9181.
- Fixed invalid MSL when float16-format vertex input data was accessed via an f16-type variable in a vertex shader. By @andyleiserson in #9166.
- Fixed validation of the texture format in GPUDepthStencilState when neither depth nor stencil is actually enabled. By @andyleiserson in #8766.
- Check that depth bias is not used with non-triangle topologies. By @andyleiserson in #8856.
- Check that if the shader outputs frag_depth, then the pipeline must have a depth attachment. By @andyleiserson in #8856.
- Fix incorrect acceptance of some swizzle selectors that are not valid for their operand, e.g. const v = vec2<i32>(); let r = v.xyz. By @andyleiserson in #8949.
const v = vec2<i32>(); let r = v.xyz. By @andyleiserson in #8949. - Fixed calculation of the total number of bindings in a pipeline layout when validating against device limits. By @andyleiserson in #8997.
- Reject non-constructible types (runtime- and override-sized arrays, and structs containing non-constructible types) in more places where they should not be allowed. By @andyleiserson in #8873.
- The query set type for an occlusion query is now validated when opening the render pass, in addition to within the call to beginOcclusionQuery. By @andyleiserson in #9086.
- Require that the blend factor is One when the blend operation is Min or Max. The BlendFactorOnUnsupportedTarget error is now reported within ColorStateError rather than directly in CreateRenderPipelineError. By @andyleiserson in #9110.
- Fixed a variety of mesh shader SPIR-V writer issues from the original implementation. By @inner-daemons in #8756
- Offset the vertex buffer device address when building a BLAS instead of using the first_vertex field. By @Vecvec in #9220
- Remove incorrect ordered texture uses. By @NiklasEi in #8924.
- Fix one-second delay when switching a wgpu app to the foreground. By @emilk in #9141
- Work around Metal driver bug with atomic textures. By @atlv24 in #9185
- Fix setting an immediate for a Mesh shader. By @waywardmonkeys in #9254
- DisplayHandle should now be passed to InstanceDescriptor for correct EGL initialization on Wayland. By @MarijnS95 in #8012. Note that the existing workaround to create surfaces before the adapter is no longer valid.
- Changing shader constants now correctly recompiles the shader. By @DerSchmale in #8291.
- The GL backend now tries to take advantage of the GL_EXT_multisampled_render_to_texture extension when applicable to skip the multi-sample resolve operation. By @opstic in #8536.
- Expanded documentation of QuerySet, QueryType, and resolve_query_set() describing how to use queries. By @kpreid in #8776.