8 hours ago
moka

Moka 0.12.15

Version 0.12.15

Fixed

  • Fixed a bug where re-inserting an expired entry could cause it to lose its expiration time and remain in the cache indefinitely when using a custom Expiry policy with per-entry expiration. (#582 by @jiangzhe, #581 by @atrocities, reported in #575):
    • This occurred when an entry that had expired but not yet been evicted was re-inserted, and expire_after_update returned None. This primarily affected users who only override expire_after_create, since the default expire_after_update returns duration_until_expiry, which is None for expired entries.
    • This bug was introduced by the changes in v0.12.13 (#549 and #564).
    • Subtle behavior change:
      • Before this fix, re-inserting an expired entry was treated as an update, so Expiry::expire_after_update was called.
      • After this fix, re-inserting an expired entry is treated as a creation, so Expiry::expire_after_create is called instead.
      • This may change the expiration time of re-inserted entries, depending on your Expiry trait implementation.
  • Fixed flaky tests cht::segment::tests::drop_many_values and drop_many_values_concurrent that were failing on high-core-count machines (#586):
    • These tests were using a CPU-dependent segment count, causing inconsistent bucket array shrinking behavior of the internal segmented hash map across different machines.
    • Changed these tests to use a fixed segment count (4) for consistent results.
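
The expiry fix above can be illustrated without the moka crate. The following is a minimal sketch under stated assumptions: the `Expiry` trait here is a cut-down, hypothetical stand-in that uses integer ticks instead of moka's real signature, and `deadline_after_insert` is an illustrative helper, not a moka function. It also assumes every stored entry has a deadline.

```rust
/// Hypothetical, simplified stand-in for an expiry policy; time is a plain tick counter.
trait Expiry {
    /// Called when an entry is created. `None` means "never expires".
    fn expire_after_create(&self) -> Option<u64>;

    /// Called when a live entry is updated. Like the default described above,
    /// this keeps whatever time the entry had left.
    fn expire_after_update(&self, remaining: Option<u64>) -> Option<u64> {
        remaining
    }
}

/// A policy that only overrides `expire_after_create`.
struct TenTicks;

impl Expiry for TenTicks {
    fn expire_after_create(&self) -> Option<u64> {
        Some(10)
    }
}

/// Returns the new deadline for an insert at tick `now`.
/// `existing` is the deadline of an entry already stored under the key, if any.
fn deadline_after_insert(policy: &impl Expiry, existing: Option<u64>, now: u64) -> Option<u64> {
    match existing {
        // Live entry: a genuine update, so the remaining time is passed on.
        Some(deadline) if deadline > now => policy
            .expire_after_update(Some(deadline - now))
            .map(|ttl| now + ttl),
        // No entry, or expired but not yet evicted: treated as a creation.
        _ => policy.expire_after_create().map(|ttl| now + ttl),
    }
}
```

In this model, re-inserting at tick 15 an entry whose deadline was tick 10 yields a fresh deadline of tick 25. Routing that case through `expire_after_update` with no remaining time would instead return `None`, which is the "never expires" outcome the fix removes.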

Changed

  • Disabled flaky GC-dependent tests by default using run_flaky_tests cfg (#584):
    • These tests rely on epoch-based garbage collection (crossbeam-epoch) timing that is not guaranteed, causing intermittent failures.
    • Fixed #539 and #580.
    • To run these tests, set RUSTFLAGS='--cfg run_flaky_tests'.
10 hours ago
kreuzberg

v4.5.2

Fixed

  • PDF word splitting in extracted text: Pdfium's text extraction inserted spurious spaces mid-word (e.g. "s hall a b e active" instead of "shall be active"). Added selective page-level respacing: pages with detected broken word spacing are re-extracted using character-level gap analysis (font_size × 0.33 threshold). Clean pages use the fast single-call path. Reduces garbled lines from 406 to 0 on the ISO 21111-10 test document with no performance impact.
  • Markdown underscore escaping: Underscores in extracted text (e.g. CTC_ARP_01) were incorrectly escaped as CTC\_ARP\_01 throughout the markdown output. Underscore escaping has been removed entirely since extracted PDF text contains literal identifiers, not markdown formatting.
  • Page header/footer leakage: Running headers like ISO 21111-10:2021(E) and copyright footers leaked into the document body. Added fuzzy alphanumeric matching to detect repeated header/footer text even when spacing or character extraction varies across pages.
  • R batch function spurious NULL argument: R wrapper batch functions passed an extra NULL positional argument to native Rust functions, causing "unused argument" errors on all batch operations.
  • Elixir Windows ORT DLL staging: ONNX Runtime DLL was only staged in target/release/ but not in priv/native/ where the BEAM VM loads NIFs. OCR/layout/embedding features now work correctly on Windows CI.
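
The character-level gap analysis mentioned in the PDF word splitting fix can be sketched as follows. This is an illustration only: the `Glyph` struct and `respace` function are hypothetical names, and the sketch assumes glyphs arrive in left-to-right order on a single line. Only the `font_size × 0.33` threshold comes from the notes above.

```rust
/// Hypothetical glyph record: a character plus its horizontal extent on the page.
struct Glyph {
    ch: char,
    x_start: f32,
    x_end: f32,
}

/// Rebuilds a line of text, inserting a space only where the horizontal gap
/// between consecutive glyphs exceeds `font_size * 0.33`.
fn respace(glyphs: &[Glyph], font_size: f32) -> String {
    let threshold = font_size * 0.33;
    let mut out = String::new();
    for (i, g) in glyphs.iter().enumerate() {
        // Compare this glyph's left edge with the previous glyph's right edge.
        if i > 0 && g.x_start - glyphs[i - 1].x_end > threshold {
            out.push(' ');
        }
        out.push(g.ch);
    }
    out
}
```

Small kerning gaps stay below the threshold and produce no space, so a word is only split where the glyphs are visibly separated.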

Added

  • General extraction result caching: All file types (PDF, Office, HTML, archives, etc.) are now cached — not just OCR results. Repeated extractions of the same file with the same config return instantly from cache.
  • Cache namespace isolation: New cache_namespace field on ExtractionConfig enables multi-tenant cache isolation on shared filesystems. Available via --cache-namespace CLI flag and across all language bindings.
  • Per-request cache TTL: New cache_ttl_secs field on ExtractionConfig overrides the global TTL for individual extractions. Set to 0 to skip cache entirely. Available via --cache-ttl-secs CLI flag.
  • Cache namespace deletion: delete_namespace() removes all cache entries under a namespace. get_stats_filtered() returns per-namespace statistics.
  • Multi-worker cleanup safety: Cache cleanup no longer triggers excessively when multiple worker pods share the same cache directory.
  • Bundled eng.traineddata: English OCR works out of the box with zero runtime configuration (~4MB bundled at build time).
  • Tessdata in cache warm: kreuzberg-cli cache warm now downloads all tessdata_fast language files (~120 languages) to KREUZBERG_CACHE_DIR/tessdata/, giving full Tesseract language support without system packages.
  • Tessdata in cache manifest: kreuzberg-cli cache manifest now includes all tessdata files with source URLs, enabling --sync-cache to download tessdata alongside models.
  • KREUZBERG_CACHE_DIR/tessdata resolution: resolve_tessdata_path() now checks KREUZBERG_CACHE_DIR/tessdata and the bundled build path before falling back to system paths.
  • CLI embed command: Generate vector embeddings from text via kreuzberg embed --text "..." --preset balanced.
  • CLI chunk command: Split text into chunks via kreuzberg chunk --text "..." --chunk-size 512.
  • CLI completions command: Generate shell completions for bash, zsh, fish, powershell.
  • CLI --log-level global flag: Override RUST_LOG via kreuzberg --log-level debug extract doc.pdf.
  • CLI extraction overrides: 27 flags exposed via ExtractionOverrides struct with #[command(flatten)].
  • CLI colored output: Text output uses anstyle for colored headers, labels, success values, and dim separators. Respects NO_COLOR env var.
  • API POST /detect, GET /version, GET /cache/manifest, POST /cache/warm: New REST endpoints.
  • MCP get_version, cache_manifest, cache_warm, embed_text, chunk_text: New MCP tools.
  • Pipeline table extraction tracing: Zero-cost tracing::trace! and tracing::debug! logging throughout layout detection and table extraction.
  • TATR model availability check: Layout detection returns an error if table regions are detected but the TATR model is unavailable.

Changed

  • CLI batch flags: Batch command now supports all extraction override flags via shared ExtractionOverrides struct.
  • CLI config architecture: Replaced 13-parameter function with ExtractionOverrides struct using #[command(flatten)].
  • MCP tool architecture: Removed dead tools/ trait-based duplicates; all tools implemented directly in server.rs.

Improved

  • CLI validation: OCR backend values, chunk size/overlap bounds, DPI range, layout confidence validated.
  • API validation: Embedding preset names and chunk bounds checked.
  • MCP validation: Empty paths rejected, chunk bounds checked, embedding preset validated.
  • Chunk overlap auto-clamping: When --chunk-size is smaller than default overlap, overlap is automatically clamped to size/4.
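
The auto-clamping rule can be written as a small function. This is a sketch, not the CLI's actual code: `effective_overlap` is a hypothetical name, and the default overlap is taken as a parameter because its value is not stated here.

```rust
/// Returns the overlap to use for a given chunk size.
/// `default_overlap` is a parameter because the notes do not state the default value.
fn effective_overlap(chunk_size: usize, default_overlap: usize) -> usize {
    if chunk_size < default_overlap {
        // Chunk is smaller than the default overlap: clamp to a quarter of the size.
        chunk_size / 4
    } else {
        default_overlap
    }
}
```

For example, with an assumed default overlap of 200, a chunk size of 100 gives an overlap of 25, while a chunk size of 512 keeps the default.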

See full changelog: https://github.com/kreuzberg-dev/kreuzberg/blob/main/CHANGELOG.md

1 day ago
kreuzberg

v4.5.1

See CHANGELOG.md for release notes.

1 day ago
kreuzberg

Benchmark Results 2026-03-21 (d062479)

Comparative benchmark results from workflow run 23359982805.

Commit: d0624792f343e8dab8c7468dcc2a2c4930741157 Date: 2026-03-21

1 day ago
kreuzberg

v4.5.0

See CHANGELOG.md for full release notes.

1 day ago
kreuzberg

v4.5.0

Added

  • ONNX-based document layout detection: New layout config field enables document layout analysis using RT-DETR v2 with 17 element classes. Supports "fast" and "accurate" presets with auto-downloaded models. Available across all language bindings.
  • SLANet table structure recognition: Detected Table regions are processed by SLANet-plus for neural HTML structure recovery, producing markdown tables with colspan/rowspan support.
  • Layout-enhanced heading detection: Layout model SectionHeader and Title regions guide heading detection in both structure tree and heuristic extraction.
  • Multi-backend OCR pipeline: New OcrPipelineConfig enables quality-based fallback across OCR backends (e.g., Tesseract then PaddleOCR) with configurable priority, language, and backend-specific settings.
  • OCR quality thresholds: New OcrQualityThresholds config with 16 tunable parameters for OCR output quality assessment and fallback decisions.
  • OCR auto-rotate: New OcrConfig.auto_rotate flag (default: false) for automatic page rotation detection. Handles 0/90/180/270 degree rotations.
  • PaddleOCR v2 model tier system: New model_tier field with "mobile" (default, ~21MB, fast) and "server" (~172MB, highest accuracy). Both use unified multilingual models.
  • AccelerationConfig for GPU/execution provider control: Fine-grained control over ONNX execution providers (CPU, CoreML, CUDA, TensorRT) for layout detection and table recognition. Typed across all bindings.
  • ConcurrencyConfig for thread limiting: New max_threads field caps Rayon, ONNX intra-op threads, and batch concurrency to a single limit.
  • EmailConfig for MSG fallback codepage: Configurable fallback codepage for MSG files lacking a codepage property.
  • Per-file extraction configuration (FileExtractionConfig): Per-file config overrides in batch operations.
  • Opt-in single-column pseudo tables: New allow_single_column_tables on PdfConfig.
  • Experimental: pdf_oxide text extraction backend: Pure Rust PDF text extraction as an alternative to pdfium.
  • CLI cache warm command: Eagerly downloads all PaddleOCR and layout detection models.
  • CLI cache manifest command: Outputs a JSON manifest of all expected model files with SHA256 checksums.
  • ChunkSizing configuration: sizing_type, sizing_model, and sizing_cache_dir fields exposed in ChunkingConfig across all bindings.
  • Chunk heading context: New HeadingContext type in ChunkMetadata providing heading level and text.
  • ModelManifestEntry type and manifest() / ensure_all_models() methods: Public API for querying and eagerly downloading model cache manifests.
  • SF1 structural quality metrics in benchmark CI: SF1 quality scores now computed alongside TF1.

Changed

  • Layout preset default: Changed from "fast" to "accurate". The Fast variant has been removed.
  • PaddleOCR default model tier: Changed from "server" to "mobile". Mobile models provide equivalent quality on standard documents while being 3-5x faster.
  • PaddleOCR v2 models: All models updated to v2 generation (PP-OCRv5 detection, PP-LCNet classification, unified multilingual recognition).
  • Unified multilingual recognition models: PP-OCRv5 unified server (84MB) and mobile (16.5MB) models replace per-script English and Chinese models.
  • Batch API unification: _with_configs batch functions removed; per-file FileExtractionConfig is now an optional parameter on the unified batch functions.
  • Layout pipeline no longer forces heuristic extraction: Structure tree extraction proceeds normally when layout detection is enabled.
  • Global ONNX model caching: Layout detection and SLANet models are cached globally and reused across extractions.
  • Vendored text embedding pipeline: Replaced fastembed dependency with vendored engine using ONNX Runtime directly.
  • Embedding embed() now takes &self instead of &mut self: Enables parallel embedding generation.
  • L2 normalization parallelized: Embedding batches >= 64 vectors now use multi-threaded normalization.
  • Language-agnostic section pattern recognition: Headings ending with a period are now allowed when they match structural patterns.
  • Layout classification guards: Heading overrides from the layout model now have word count limits, punctuation checks, and body-font-size validation.
  • Strong typing across bindings: Replaced weak Dictionary/Map/array types with strongly typed config classes in C#, Java, and PHP.
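
The thresholded parallel L2 normalization described above can be sketched with standard-library threads. This is an assumption-laden illustration, not the library's implementation: the function names and the worker count of 4 are hypothetical, and only the threshold of 64 vectors comes from the notes.

```rust
use std::thread;

/// Scales one vector in place to unit L2 norm; zero vectors are left untouched.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Normalizes a batch, going parallel only at or above 64 vectors.
/// The worker count of 4 is an arbitrary choice for this sketch.
fn normalize_batch(batch: &mut [Vec<f32>]) {
    const PARALLEL_THRESHOLD: usize = 64;
    const WORKERS: usize = 4;
    if batch.len() < PARALLEL_THRESHOLD {
        batch.iter_mut().for_each(|v| l2_normalize(v));
        return;
    }
    // Split the batch into disjoint parts so each thread owns its slice.
    let per_worker = batch.len().div_ceil(WORKERS);
    thread::scope(|s| {
        for part in batch.chunks_mut(per_worker) {
            s.spawn(move || part.iter_mut().for_each(|v| l2_normalize(v)));
        }
    });
}
```

Small batches stay on the calling thread, since spawning threads would cost more than the normalization itself.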

Removed

  • fastembed dependency: Replaced by vendored embedding engine using ONNX Runtime directly.
  • EmbeddingModelType::FastEmbed variant: Use Preset or Custom variants instead.

Fixed

  • C# FFI struct layout mismatch (#538): Fixed CExtractionResult struct layout between Rust and C# that made the C# library completely broken in 4.4.6.
  • PDF force_ocr without explicit OCR config (#495): force_ocr=true was silently ignored when no ocr config block was provided.
  • PDF image extraction (#511): Extracted images returned raw compressed data instead of properly decoded image bytes.
  • Node.js extractFileInWorker mime_type passthrough (#523): MIME type was silently injected into PDF password config instead of being forwarded.
  • DOCX parser type inference failure (#519): The zip 8.2.0 dependency introduced type ambiguity in DOCX and XML parsers.
  • Python py.typed and .pyi missing from sdist: Type stubs now included in both wheel and sdist.
  • PDF broken CMap word spacing: Geometric validation now vetoes false word boundaries in PDFs with broken font CMaps.
  • PDF structure tree heading trust: Structure tree heading tags (H1-H6) are now trusted as author-intent metadata.
  • PDF structure tree extraction performance: Text and style maps now built in a single pass.
  • OCR Picture regions suppressing text: Layout-detected Picture regions now preserve embedded text.
  • Non-transitive sort comparators: Spatial reading-order sorts now use discrete row buckets.
  • Page furniture over-stripping: Added bulk and per-paragraph guards to prevent aggressive furniture stripping.
  • KREUZBERG_CACHE_DIR not respected by all caches: All caches now honor the environment variable.
  • MSG PT_STRING8 encoding: MSG files now correctly decode ANSI string properties using the declared Windows code page.
  • SLANet-Plus ONNX model: Re-exported with shape fix, resolving inference failures on macOS CoreML.
  • TATR model panic in batch processing: Now falls back gracefully to heuristic table extraction.
  • Docker musl builds: Alpine/musl Docker images now link against the system ONNX Runtime library.
  • FFI batch functions null handling: C#/Java FFI batch functions now accept NULL for file_config_jsons.
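
The row-bucket approach from the non-transitive sort comparator fix can be sketched as follows. A comparator that treats two blocks as "same row" when their vertical distance is under a tolerance is not transitive (A can match B and B match C while A does not match C), which sort routines do not tolerate. Assigning each block to a discrete bucket gives a consistent total order. The `Block` struct, the function name, and the `row_height` parameter are hypothetical.

```rust
/// Hypothetical text block with the position of its top-left corner.
struct Block {
    text: &'static str,
    x: f32,
    y: f32,
}

/// Sorts blocks into reading order: by row bucket first, then left to right.
/// `row_height` is the bucket size; picking it is left to the caller here.
fn sort_reading_order(blocks: &mut [Block], row_height: f32) {
    blocks.sort_by(|a, b| {
        // Integer bucket indices compare consistently, unlike a tolerance check.
        let row_a = (a.y / row_height).floor() as i64;
        let row_b = (b.y / row_height).floor() as i64;
        row_a.cmp(&row_b).then(a.x.total_cmp(&b.x))
    });
}
```
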
2 days ago
delta
2 days ago
opentelemetry-rust

opentelemetry-otlp 0.31.1

What's Changed

Full Changelog: https://github.com/open-telemetry/opentelemetry-rust/compare/v0.31.0...opentelemetry-otlp-0.31.1

3 days ago
wgpu

v29.0.0

Major Changes

Surface::get_current_texture now returns CurrentSurfaceTexture enum

Surface::get_current_texture no longer returns Result<SurfaceTexture, SurfaceError>. Instead, it returns a single CurrentSurfaceTexture enum that represents all possible outcomes as variants. SurfaceError has been removed, and the suboptimal field on SurfaceTexture has been replaced by a dedicated Suboptimal variant.

match surface.get_current_texture() {
    wgpu::CurrentSurfaceTexture::Success(frame) => { /* render */ }
    wgpu::CurrentSurfaceTexture::Timeout
      | wgpu::CurrentSurfaceTexture::Occluded => { /* skip frame */ }
    wgpu::CurrentSurfaceTexture::Outdated
      | wgpu::CurrentSurfaceTexture::Suboptimal(frame) => { /* reconfigure surface */ }
    wgpu::CurrentSurfaceTexture::Lost => { /* reconfigure surface, or recreate device if device lost */ }
    wgpu::CurrentSurfaceTexture::Validation => {
        /* Only happens if there is a validation error and you
           have registered an error scope or uncaptured error handler. */
    }
}

By @cwfitzgerald, @Wumpf, and @emilk in #9141 and #9257.

InstanceDescriptor initialization APIs and display handle changes

A display handle represents a connection to the platform's display server (e.g. a Wayland or X11 connection on Linux). This is distinct from a window — a display handle is the system-level connection through which windows are created and managed.

InstanceDescriptor's convenience constructors (an implementation of Default and the static from_env_or_default method) have been removed. In their place are new static methods that force recognition of whether a display handle is used:

  • new_with_display_handle
  • new_with_display_handle_from_env
  • new_without_display_handle
  • new_without_display_handle_from_env

If you are using winit, this can be populated using EventLoop::owned_display_handle.

- InstanceDescriptor::default();
- InstanceDescriptor::from_env_or_default();
+ InstanceDescriptor::new_with_display_handle(Box::new(event_loop.owned_display_handle()));
+ InstanceDescriptor::new_with_display_handle_from_env(Box::new(event_loop.owned_display_handle()));

Additionally, DisplayHandle is now optional when creating a surface if a display handle was already passed to InstanceDescriptor. This means that once you've provided the display handle at instance creation time, you no longer need to pass it again for each surface you create.

By @MarijnS95 in #8782

Bind group layouts now optional in PipelineLayoutDescriptor

This allows gaps in bind group layouts and adds full support for unbinding, bringing us into compliance with the WebGPU spec. As a result, PipelineLayoutDescriptor's bind_group_layouts field now has type &[Option<&BindGroupLayout>]. To migrate, wrap bind group layout references in Some:

  let pl_desc = wgpu::PipelineLayoutDescriptor {
      label: None,
      bind_group_layouts: &[
-         &bind_group_layout
+         Some(&bind_group_layout)
      ],
      immediate_size: 0,
  };

By @teoxoy in #9034.

MSRV update

wgpu now has a new MSRV policy. This release has an MSRV of 1.87. This is lower than v27's 1.88 and v28's 1.92. Going forward, we will only bump wgpu's MSRV if it has tangible benefits for the code, and we will never bump to an MSRV higher than stable - 3. So if stable is at 1.97 and 1.94 brought benefit to our code, we could bump it no higher than 1.94. As before, MSRV bumps will always be breaking changes.

By @cwfitzgerald in #8999.
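
The "stable - 3" ceiling is simple arithmetic on minor version numbers. As a small sketch (both function names are hypothetical, and versions are reduced to their minor component):

```rust
/// Highest MSRV minor version allowed by the "stable - 3" rule,
/// given the minor version of the current stable compiler.
fn max_allowed_msrv(stable_minor: u32) -> u32 {
    stable_minor.saturating_sub(3)
}

/// Whether bumping the MSRV to `candidate_minor` is within the policy ceiling.
fn bump_within_policy(candidate_minor: u32, stable_minor: u32) -> bool {
    candidate_minor <= max_allowed_msrv(stable_minor)
}
```

With stable at 1.97, a bump to 1.94 is within the ceiling and a bump to 1.95 is not. The policy's other condition, a tangible benefit to the code, is a judgment call that this sketch does not model.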

WriteOnly

To ensure memory safety when accessing mapped GPU memory, MapMode::Write buffer mappings (BufferViewMut and also QueueWriteBufferView) can no longer be dereferenced to Rust &mut [u8]. Instead, they must be used through the new pointer type wgpu::WriteOnly<[u8]>, which does not allow reading at all.

WriteOnly<[u8]> is designed to offer similar functionality to &mut [u8] and have almost no performance overhead, but you will probably need to make some changes for anything more complicated than get_mapped_range_mut().copy_from_slice(my_data); in particular, replacing view[start..end] with view.slice(start..end).

By @kpreid in #9042.
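
The idea behind a write-only view can be shown with a toy type. This is not wgpu's `WriteOnly`; `WriteOnlyBytes` is a hypothetical stand-in written against the standard library only, with just the two operations named above (`slice` and `copy_from_slice`).

```rust
use std::ops::Range;

/// Toy write-only view over a byte buffer. It deliberately has no `Deref`,
/// `Index`, or getter, so callers cannot read the underlying bytes.
struct WriteOnlyBytes<'a> {
    bytes: &'a mut [u8],
}

impl<'a> WriteOnlyBytes<'a> {
    fn new(bytes: &'a mut [u8]) -> Self {
        Self { bytes }
    }

    /// Narrows the view to a sub-range, standing in for `view[start..end]`.
    fn slice(&mut self, range: Range<usize>) -> WriteOnlyBytes<'_> {
        WriteOnlyBytes { bytes: &mut self.bytes[range] }
    }

    /// Overwrites the whole view; panics if the lengths differ.
    fn copy_from_slice(&mut self, src: &[u8]) {
        self.bytes.copy_from_slice(src);
    }
}
```

A call such as `view.slice(2..6).copy_from_slice(&data)` then replaces the indexing form, and because the type exposes no read path, code that tries to read through the view does not compile.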

Depth/stencil state changes

The depth_write_enabled and depth_compare members of DepthStencilState are now optional, and may be omitted when they do not apply, to match WebGPU.

depth_write_enabled is applicable, and must be Some, if format has a depth aspect, i.e., is a depth or depth/stencil format. Otherwise, a value of None best reflects that it does not apply, although Some(false) is also accepted.

depth_compare is applicable, and must be Some, if depth_write_enabled is Some(true), or if depth_fail_op for either stencil face is not Keep. Otherwise, a value of None best reflects that it does not apply, although Some(CompareFunction::Always) is also accepted.

There is also a new constructor DepthStencilState::stencil which may be used instead of a struct literal for stencil operations.

Example 1: A configuration that does a depth test and writes updated values:

 depth_stencil: Some(wgpu::DepthStencilState {
     format: wgpu::TextureFormat::Depth32Float,
-    depth_write_enabled: true,
-    depth_compare: wgpu::CompareFunction::Less,
+    depth_write_enabled: Some(true),
+    depth_compare: Some(wgpu::CompareFunction::Less),
     stencil: wgpu::StencilState::default(),
     bias: wgpu::DepthBiasState::default(),
 }),

Example 2: A configuration with only stencil:

 depth_stencil: Some(wgpu::DepthStencilState {
     format: wgpu::TextureFormat::Stencil8,
-    depth_write_enabled: false,
-    depth_compare: wgpu::CompareFunction::Always,
+    depth_write_enabled: None,
+    depth_compare: None,
     stencil: wgpu::StencilState::default(),
     bias: wgpu::DepthBiasState::default(),
 }),

Example 3: The previous example written using the new stencil() constructor:

depth_stencil: Some(wgpu::DepthStencilState::stencil(
    wgpu::TextureFormat::Stencil8,
    wgpu::StencilState::default(),
)),
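
The two "must be Some" rules for depth_write_enabled and depth_compare can be expressed as a small check. This is a hypothetical model for illustration, not wgpu's validation code: `DepthStencilModel` reduces the real struct to booleans, and the error strings are made up.

```rust
/// Hypothetical, trimmed-down mirror of the fields involved in the rules.
struct DepthStencilModel {
    format_has_depth: bool,
    depth_write_enabled: Option<bool>,
    depth_compare_set: bool,
    /// True if `depth_fail_op` is something other than Keep on either stencil face.
    depth_fail_op_non_keep: bool,
}

/// Checks the two "must be Some" rules and reports the first violation.
fn validate(state: &DepthStencilModel) -> Result<(), &'static str> {
    if state.format_has_depth && state.depth_write_enabled.is_none() {
        return Err("depth_write_enabled must be Some for a format with a depth aspect");
    }
    let compare_required =
        state.depth_write_enabled == Some(true) || state.depth_fail_op_non_keep;
    if compare_required && !state.depth_compare_set {
        return Err("depth_compare must be Some");
    }
    Ok(())
}
```

Under this model, Example 1 (depth format, write enabled, compare set) and Example 2 (stencil only, both fields None) both pass, while a depth format with writes enabled and no compare function is rejected.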

D3D12 Agility SDK support

Added support for loading a specific DirectX 12 Agility SDK runtime via the Independent Devices API. The Agility SDK lets applications ship a newer D3D12 runtime alongside their binary, unlocking the latest D3D12 features without waiting for an OS update.

Configure it programmatically:

let options = wgpu::Dx12BackendOptions {
    agility_sdk: Some(wgpu::Dx12AgilitySDK {
        sdk_version: 619,
        sdk_path: "path/to/sdk/bin/x64".into(),
    }),
    ..Default::default()
};

Or via environment variables:

WGPU_DX12_AGILITY_SDK_PATH=path/to/sdk/bin/x64
WGPU_DX12_AGILITY_SDK_VERSION=619

The sdk_version must match the version of the D3D12Core.dll in the provided path exactly, or loading will fail.

If the Agility SDK fails to load (e.g. version mismatch, missing DLL, or unsupported OS), wgpu logs a warning and falls back to the system D3D12 runtime.

By @cwfitzgerald in #9130.

primitive_index is now a WGSL enable extension

WGSL shaders using @builtin(primitive_index) must now request it with enable primitive_index;. The SHADER_PRIMITIVE_INDEX feature has been renamed to PRIMITIVE_INDEX and moved from FeaturesWGPU to FeaturesWebGPU. By @inner-daemons in #8879 and @andyleiserson in #9101.

- device.features().contains(wgpu::FeaturesWGPU::SHADER_PRIMITIVE_INDEX)
+ device.features().contains(wgpu::FeaturesWebGPU::PRIMITIVE_INDEX)
// WGSL shaders must now include this directive:
enable primitive_index;

maxInterStageShaderComponents replaced by maxInterStageShaderVariables

Migrated from the max_inter_stage_shader_components limit to max_inter_stage_shader_variables, following the latest WebGPU spec. Components counted individual scalars (e.g. a vec4 = 4 components), while variables counts locations (e.g. a vec4 = 1 variable). This changes validation in a way that should not affect most programs. By @ErichDonGubler in #8652, #8792.

- limits.max_inter_stage_shader_components
+ limits.max_inter_stage_shader_variables
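
The difference between the two counting schemes can be shown with a toy model. The `IoType` enum and both functions are hypothetical, and the sketch ignores built-ins and any other details of the real validation.

```rust
/// Inter-stage value types, reduced to what matters for counting.
#[derive(Clone, Copy)]
enum IoType {
    Scalar,
    Vec2,
    Vec3,
    Vec4,
}

impl IoType {
    /// Number of scalar components in the type.
    fn components(self) -> u32 {
        match self {
            IoType::Scalar => 1,
            IoType::Vec2 => 2,
            IoType::Vec3 => 3,
            IoType::Vec4 => 4,
        }
    }
}

/// Old-style count: every scalar component is counted.
fn count_components(io: &[IoType]) -> u32 {
    io.iter().map(|t| t.components()).sum()
}

/// New-style count: each value occupies one location, whatever its width.
fn count_variables(io: &[IoType]) -> u32 {
    io.len() as u32
}
```

A stage interface of one vec4, one vec2, and one scalar counts as 7 components under the old limit but 3 variables under the new one.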

Other Breaking Changes

  • Use clearer field names for StageError::InvalidWorkgroupSize. By @ErichDonGubler in #9192.

New Features

General

  • Added TLAS binding array support via ACCELERATION_STRUCTURE_BINDING_ARRAY. By @kvark in #8923.
  • Added wgpu-naga-bridge crate with conversions between naga and wgpu-types (features to capabilities, storage format mapping, shader stage mapping). By @atlv24 in #9201.
  • Added support for cooperative load/store operations in shaders. Currently only WGSL on the input and SPIR-V, METAL, and WGSL on the output are supported. By @kvark in #8251.
  • Added support for per-vertex attributes in fragment shaders. Currently only WGSL input is supported, and only SPIR-V or WGSL output is supported. By @atlv24 in #8821.
  • Added support for no-perspective barycentric coordinates. By @atlv24 in #8852.
  • Added support for obtaining AdapterInfo from Device. By @sagudev in #8807.
  • Added Limits::or_worse_values_from. By @atlv24 in #8870.
  • Added Features::FLOAT32_BLENDABLE on Vulkan and Metal. By @timokoesters in #8963 and @andyleiserson in #9032.
  • Added Dx12BackendOptions::force_shader_model to allow using advanced features in passthrough shaders without bundling DXC. By @inner-daemons in #8984.
  • Changed passthrough shaders to not require an entry point parameter, so that the same shader module may be used in multiple entry points. Also added support for metallib passthrough. By @inner-daemons in #8886.
  • Added Dx12Compiler::Auto to automatically use static or dynamic DXC if available, before falling back to FXC. By @inner-daemons in #8882.
  • Added support for insert_debug_marker, push_debug_group and pop_debug_group on WebGPU. By @evilpie in #9017.
  • Added support for @builtin(draw_index) to the Vulkan backend. By @inner-daemons in #8883.
  • Added TextureFormat::channels method to get some information about which color channels are covered by the texture format. By @TornaxO7 in #9167
  • BREAKING: Add V6_8 variant to DxcShaderModel and naga::back::hlsl::ShaderModel. By @inner-daemons in #8882 and @ErichDonGubler in #9083.
  • BREAKING: Add V6_9 variant to DxcShaderModel and naga::back::hlsl::ShaderModel. By @ErichDonGubler in #9083.

naga

  • Initial wgsl-in ray tracing pipelines. By @Vecvec in #8570.
  • wgsl-out ray tracing pipelines. By @Vecvec in #8970.
  • Allow parsing shaders which make use of SPV_KHR_non_semantic_info for debug info. Also removes naga::front::spv::SUPPORTED_EXT_SETS. By @inner-daemons in #8827.
  • Added memory decorations for storage buffers: coherent, supported on all native backends, and volatile, only on Vulkan and GL. By @atlv24 in #9168.
  • Made the following available in const contexts; by @ErichDonGubler in #8943:
    • naga
      • Arena::len
      • Arena::is_empty
      • Range::first_and_last
      • front::wgsl::Frontend::set_options
      • ir::Block::is_empty
      • ir::Block::len

GLES

  • Added GlDebugFns option in GlBackendOptions to control OpenGL debug functions (glPushDebugGroup, glPopDebugGroup, glObjectLabel, etc.). Automatically disables them on Mali GPUs to work around a driver crash. By @Xavientois in #8931.

WebGPU

  • Added support for insert_debug_marker, push_debug_group and pop_debug_group. By @evilpie in #9017.
  • Added support for begin_occlusion_query and end_occlusion_query. By @evilpie in #9039.

Changes

General

  • Tracing now uses the .metal extension for metal source files, instead of .msl. By @inner-daemons in #8880.
  • BREAKING: Several error APIs were changed by @ErichDonGubler in #9073 and #9205:
    • BufferAccessError:
      • Split the OutOfBoundsOverrun variant into new OutOfBoundsStartOffsetOverrun and OutOfBoundsEndOffsetOverrun variants.
      • Removed the NegativeRange variant in favor of new MapStartOffsetUnderrun and MapStartOffsetOverrun variants.
    • Split the TransferError::BufferOverrun variant into new BufferStartOffsetOverrun and BufferEndOffsetOverrun variants.
    • ImmediateUploadError:
      • Removed the TooLarge variant in favor of new StartOffsetOverrun and EndOffsetOverrun variants.
      • Removed the Unaligned variant in favor of new StartOffsetUnaligned and SizeUnaligned variants.
      • Added the ValueStartIndexOverrun and ValueEndIndexOverrun variants.
  • The various "max resources per stage" limits are now capped at 100, so that their total remains below max_bindings_per_bind_group, as required by WebGPU. By @andyleiserson in #9118.
  • The max_uniform_buffer_binding_size and max_storage_buffer_binding_size limits are now u64 instead of u32, to match WebGPU. By @wingertge in #9146.
  • The main 3 native backends now report their limits properly. By @teoxoy in #9196.

naga

  • Naga and wgpu now reject shaders with an enable directive for functionality that is not available, even if that functionality is not used by the shader. By @andyleiserson in #8913.
  • Prevent UB from incorrectly using ray queries on HLSL. By @Vecvec in #8763.
  • Added support for dual-source blending in SPIR-V shaders. By @andyleiserson in #8865.
  • Added supported_capabilities to all backends. By @inner-daemons in #9068.
  • Updated codespan-reporting to 0.13. By @cwfitzgerald in #9243.

Metal

  • Use autogenerated objc2 bindings internally, which should resolve a lot of leaks and unsoundness. By @madsmtm in #5641.
  • Implemented ray-tracing acceleration structures for the Metal backend. By @lichtso in #8071.
  • Remove mutex for MTLCommandQueue because the Metal object is thread-safe. By @andyleiserson in #9217.

deno_webgpu

  • Expose the GPU.wgslLanguageFeatures property. By @andyleiserson in #8884.
  • GPUFeatureName now includes all wgpu extensions. Feature names for extensions should be written with a wgpu- prefix, although unprefixed names that were accepted previously are still accepted. By @andyleiserson in #9163.

Hal

  • Make ordered texture and buffer uses hal specific. By @NiklasEi in #8924.

Bug Fixes

General

  • Tracing support has been restored. By @andyleiserson in #8429.
  • Pipelines using passthrough shaders now correctly require explicit pipeline layout. By @inner-daemons in #8881.
  • Allow using a shader that defines I/O for dual-source blending in a pipeline that does not make use of it. By @andyleiserson in #8856.
  • Improve validation of dual-source blending, by @andyleiserson in #9200:
    • Validate structs with @blend_src members whether or not they are used by an entry point.
    • Dual-source blending is not supported when there are multiple color attachments.
    • TypeFlags::IO_SHAREABLE is not set for structs other than @blend_src structs.
  • Validate strip_index_format isn't None and equals index buffer format for indexed drawing with strip topology. By @beicause in #8850.
  • BREAKING: Renamed EXPERIMENTAL_PASSTHROUGH_SHADERS to PASSTHROUGH_SHADERS and made this no longer an experimental feature. By @inner-daemons in #9054.
  • BREAKING: End offsets in trace and player commands are now represented using offset + size instead. By @ErichDonGubler in #9073.
  • Validate some uncaught cases where buffer transfer operations could overflow when computing an end offset. By @ErichDonGubler in #9073.
  • Fix local_invocation_id and local_invocation_index being written multiple times in HLSL/MSL backends, and naming conflicts when users name variables __local_invocation_id or __local_invocation_index. By @inner-daemons in #9099.
  • Added internal labels to validation GPU objects and timestamp normalization code to improve clarity in graphics debuggers. By @szostid in #9094
  • Fix multi-planar texture copying. By @noituri in #9069
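
The overflow case mentioned for buffer transfer operations arises when an end offset is computed as offset + size. A minimal sketch of guarding that computation with checked arithmetic follows. The function name and error strings are hypothetical, and this is not wgpu's validation code.

```rust
/// Validates a copy described as `offset + size` against a buffer of `buffer_size` bytes.
/// Returns the exclusive end offset on success.
fn checked_end_offset(offset: u64, size: u64, buffer_size: u64) -> Result<u64, &'static str> {
    // `checked_add` returns None on overflow instead of wrapping around.
    let end = offset
        .checked_add(size)
        .ok_or("offset + size overflows u64")?;
    if end > buffer_size {
        return Err("end offset is past the end of the buffer");
    }
    Ok(end)
}
```

Without the overflow check, a wrapping add could produce a small end offset that passes the bounds comparison even though the requested range is invalid.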

naga

  • The validator checks that override-sized arrays have a positive size, if overrides have been resolved. By @andyleiserson in #8822.
  • Fix some cases where f16 constants were not working. By @andyleiserson in #8816.
  • Use wrapping arithmetic when evaluating constant expressions involving u32. By @andyleiserson in #8912.
  • Fix missing side effects from sequence expressions in GLSL. By @Vipitis in #8787.
  • Naga now enforces the @must_use attribute on WGSL built-in functions, when applicable. You can waive the error with a phony assignment, e.g., _ = subgroupElect(). By @andyleiserson in #8713.
  • Reject zero-value construction of a runtime-sized array with a validation error. Previously it would crash in the HLSL backend. By @mooori in #8741.
  • Reject splat vector construction if the argument type does not match the type of the vector's scalar. Previously it would succeed. By @mooori in #8829.
  • Fixed workgroupUniformLoad incorrectly returning an atomic when called on an atomic; it now returns the inner T as per the spec. By @cryvosh in #8791.
  • Fixed constant evaluation for sign() builtin to return zero when the argument is zero. By @mandryskowski in #8942.
  • Allow array generation to compile with the macOS 10.12 Metal compiler. By @madsmtm in #8953
  • Naga now detects bitwise shifts by a constant exceeding the operand bit width at compile time, and disallows scalar-by-vector and vector-by-scalar shifts in constant evaluation. By @andyleiserson in #8907.
  • Naga uses wrapping arithmetic when evaluating dot products on concrete integer types (u32 and i32). By @BKDaugherty in #9142.
  • Disallow negation of a matrix in WGSL. By @andyleiserson in #9157.
  • Fix evaluation order of compound assignment (e.g. +=) LHS and RHS. By @andyleiserson in #9181.
  • Fixed invalid MSL when float16-format vertex input data was accessed via an f16-type variable in a vertex shader. By @andyleiserson in #9166.
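
Two of the constant-evaluation behaviors above, wrapping integer dot products and sign() returning zero for a zero argument, can be mirrored in plain Rust. These are illustrative functions only, not naga's code. Note that Rust's own `f32::signum` returns 1.0 for positive zero, so it is not used here.

```rust
/// Dot product of two u32 vectors where every multiply and add wraps on overflow.
fn wrapping_dot_u32(a: &[u32], b: &[u32]) -> u32 {
    a.iter()
        .zip(b)
        .fold(0u32, |acc, (x, y)| acc.wrapping_add(x.wrapping_mul(*y)))
}

/// Sign function that returns 0 for a zero argument.
/// NaN also falls through to 0.0 in this sketch.
fn sign(x: f32) -> f32 {
    if x > 0.0 {
        1.0
    } else if x < 0.0 {
        -1.0
    } else {
        0.0
    }
}
```
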

Validation

  • Fixed validation of the texture format in GPUDepthStencilState when neither depth nor stencil is actually enabled. By @andyleiserson in #8766.
  • Check that depth bias is not used with non-triangle topologies. By @andyleiserson in #8856.
  • Check that if the shader outputs frag_depth, then the pipeline must have a depth attachment. By @andyleiserson in #8856.
  • Fix incorrect acceptance of some swizzle selectors that are not valid for their operand, e.g. const v = vec2<i32>(); let r = v.xyz. By @andyleiserson in #8949.
  • Fixed calculation of the total number of bindings in a pipeline layout when validating against device limits. By @andyleiserson in #8997.
  • Reject non-constructible types (runtime- and override-sized arrays, and structs containing non-constructible types) in more places where they should not be allowed. By @andyleiserson in #8873.
  • The query set type for an occlusion query is now validated when opening the render pass, in addition to within the call to beginOcclusionQuery. By @andyleiserson in #9086.
  • Require that the blend factor is One when the blend operation is Min or Max. The BlendFactorOnUnsupportedTarget error is now reported within ColorStateError rather than directly in CreateRenderPipelineError. By @andyleiserson in #9110.
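
The Min/Max blend rule can be sketched as a predicate. The enums below are hypothetical, cut-down versions that list only a few variants, and the function is an illustration rather than wgpu's validation code. It assumes the rule applies to both the source and destination factors, as in the WebGPU spec.

```rust
/// Cut-down blend operations; the real enum has more variants.
#[derive(Clone, Copy, PartialEq)]
enum BlendOp {
    Add,
    Min,
    Max,
}

/// Cut-down blend factors; the real enum has more variants.
#[derive(Clone, Copy, PartialEq)]
enum BlendFactor {
    Zero,
    One,
    SrcAlpha,
}

/// Returns true if the factors are allowed for the given operation.
fn blend_component_valid(op: BlendOp, src: BlendFactor, dst: BlendFactor) -> bool {
    match op {
        // Min and Max ignore the factors, so both must be One.
        BlendOp::Min | BlendOp::Max => src == BlendFactor::One && dst == BlendFactor::One,
        BlendOp::Add => true,
    }
}
```
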

Vulkan

  • Fixed a variety of mesh shader SPIR-V writer issues from the original implementation. By @inner-daemons in #8756
  • Offset the vertex buffer device address when building a BLAS instead of using the first_vertex field. By @Vecvec in #9220
  • Remove incorrect ordered texture uses. By @NiklasEi in #8924.

Metal / macOS

  • Fix one-second delay when switching a wgpu app to the foreground. By @emilk in #9141
  • Work around Metal driver bug with atomic textures. By @atlv24 in #9185
  • Fix setting an immediate for a Mesh shader. By @waywardmonkeys in #9254

GLES

  • DisplayHandle should now be passed to InstanceDescriptor for correct EGL initialization on Wayland. By @MarijnS95 in #8012. Note that the existing workaround to create surfaces before the adapter is no longer valid.
  • Changing shader constants now correctly recompiles the shader. By @DerSchmale in #8291.

Performance

GLES

  • The GL backend now tries to take advantage of the GL_EXT_multisampled_render_to_texture extension when applicable to skip the multi-sample resolve operation. By @opstic in #8536.

Documentation

General

  • Expanded documentation of QuerySet, QueryType, and resolve_query_set() describing how to use queries. By @kpreid in #8776.