7 hours ago
pdf_oxide

v0.3.44 | Pluggable cryptographic provider — FIPS 140-3 compliance for

Highlights

  • pdf_oxide::crypto::CryptoProvider trait — new abstraction that decouples PDF encryption and signature paths from any one cryptography crate. Two providers ship out of the box:
    • RustCryptoProvider (default): pure-Rust stack as before (sha2, aes, rsa, p256, p384, getrandom, md-5, sha1). Permits every algorithm PDF specs reference, including the legacy MD5+RC4 path required by ISO 32000-1 R≤4 documents.
    • AwsLcProvider (opt-in via --features fips): backed by aws-lc-rs, FIPS 140-3 validated since 2024. Refuses MD5 / SHA-1-for-signing / RC4 with Error::AlgorithmNotPermitted and a clear remediation message.
  • Single source of randomness. src/encryption/algorithms.rs's former SHA-256(uuid_v4 || timestamp_ns || …) cascade is replaced with crypto::active().random_bytes() — under the default provider this is getrandom::fill() (OS entropy pool); under FIPS it's aws_lc_rs::rand::SystemRandom. Cryptographically suitable for AES-256 file keys and salts; auditable.
  • Closes #236.

Architecture

Three sub-traits compose into CryptoProvider:

  • Hasher — incremental hashing (update / finalize).
  • SymmetricCipher — AES-128/256-CBC (PKCS#7 + no-padding) and RC4.
  • SignatureVerifier — RSA-PKCS#1-v1.5, RSA-PSS, ECDSA P-256/P-384.

Plus an opaque Signer handle so HSM / PKCS#11 / Cloud KMS backends can plug in via SigningKeyMaterial (which is #[non_exhaustive] — future variants for HSM slots etc. are not breaking changes).

The is_legacy_allowed() policy bit lets each provider declare whether MD5 / SHA-1-sign / RC4 are permitted. PDF Standard Security R≤4 documents are gated at EncryptionHandler::new: under a FIPS provider they fail with a remediation message ("re-encrypt at R=6 or build pdf_oxide without the 'fips' feature so the default 'rust-crypto' provider stays active") rather than panic deep inside the cipher path.
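The up-front policy gate described above can be sketched roughly as follows. This is a minimal illustration, not pdf_oxide's actual code: the CryptoPolicy trait, the DefaultLike / FipsLike types, the check_revision function, and the local Error enum are all hypothetical names invented for the example. Only is_legacy_allowed() and the AlgorithmNotPermitted variant name come from the notes.

```rust
/// Hypothetical error type; only the variant name mirrors the release notes.
#[derive(Debug, PartialEq)]
pub enum Error {
    AlgorithmNotPermitted(String),
}

/// Hypothetical policy surface. The real CryptoProvider also carries the
/// hashing, cipher, and signature sub-traits, which are omitted here.
pub trait CryptoPolicy {
    fn name(&self) -> &'static str;
    /// True if MD5 / SHA-1-for-signing / RC4 may be used.
    fn is_legacy_allowed(&self) -> bool;
}

/// Stand-in for a provider that permits legacy algorithms.
pub struct DefaultLike;
/// Stand-in for a provider that refuses legacy algorithms.
pub struct FipsLike;

impl CryptoPolicy for DefaultLike {
    fn name(&self) -> &'static str { "rust-crypto" }
    fn is_legacy_allowed(&self) -> bool { true }
}

impl CryptoPolicy for FipsLike {
    fn name(&self) -> &'static str { "fips" }
    fn is_legacy_allowed(&self) -> bool { false }
}

/// Check the security handler revision once, before any cipher work starts,
/// so a disallowed document fails early with a readable message.
pub fn check_revision(provider: &dyn CryptoPolicy, revision: u8) -> Result<(), Error> {
    if revision <= 4 && !provider.is_legacy_allowed() {
        return Err(Error::AlgorithmNotPermitted(format!(
            "R={revision} requires MD5+RC4, which provider '{}' refuses; re-encrypt at R=6",
            provider.name()
        )));
    }
    Ok(())
}
```

With these stand-ins, check_revision(&FipsLike, 4) returns the error while check_revision(&FipsLike, 6) and check_revision(&DefaultLike, 4) both succeed.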

Usage

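use std::sync::Arc;
use pdf_oxide::crypto::{set_provider, AwsLcProvider};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Select the FIPS provider (requires building with --features fips).
    set_provider(Arc::new(AwsLcProvider::new()))?;
    let doc = pdf_oxide::PdfDocument::open("encrypted-r6.pdf")?;
    Ok(())
}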

See docs/CRYPTO_PROVIDERS.md for the algorithm coverage matrix, custom-provider walkthrough (sovereign-jurisdiction algorithms, HSMs), and the legacy-PDF policy table.

CI

  • New fips job in .github/workflows/ci.yml builds with --features fips, runs the 11-test AwsLcProvider suite including a cross_provider_aes_compat check that asserts the FIPS and rust-crypto AES paths produce byte-identical output, and enforces clippy -D warnings under the FIPS feature.

Release

  • New .github/workflows/release-fips.yml workflow (manually triggered) builds and publishes parallel FIPS distributions on every package index, all from the same Rust source compiled with --features fips so each binary contains only AWS-LC's FIPS-validated module:

    Ecosystem  Package                                 Install
    PyPI       pdf_oxide_fips                          pip install pdf_oxide_fips==0.3.44
    npm        pdf-oxide-fips                          npm install pdf-oxide-fips@0.3.44
    NuGet      PdfOxide.Fips                           dotnet add package PdfOxide.Fips --version 0.3.44
    Go         github.com/yfedoseev/pdf_oxide/go-fips  go get github.com/yfedoseev/pdf_oxide/go-fips@v0.3.44

    Platform matrix in v0.3.44 (every binding × every platform):

    Platform        Python  npm  NuGet  Go
    Linux x86_64    ✓       ✓    ✓      ✓
    Linux aarch64   ✓       ✓    ✓      ✓
    macOS x86_64    ✓       ✓    ✓      ✓
    macOS arm64     ✓       ✓    ✓      ✓
    Windows x86_64  ✓       ✓    ✓      ✓

    All distributions move in lockstep with the regular release — FIPS and default variants of the same release tag are byte-equal in their non-crypto code paths. Per-platform smoke tests in the workflow confirm the FIPS provider is reachable AND crypto_use_fips() (or equivalent) flips the active provider as expected — catches API mismatches before publishing.

    Why pdf_oxide_fips (underscore) for Python: PyPI normalizes hyphens / underscores to the same canonical form per PEP 503 (pip install pdf_oxide_fips and pip install pdf-oxide-fips resolve to the same package). Using underscore in pyproject.toml makes the wheel filename and the import pdf_oxide path identical to the default distribution — only the package name differs.
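The name normalization that makes both spellings resolve to the same package can be sketched as below. This follows the rule PEP 503 describes (lowercase, with runs of '-', '_' and '.' collapsed to a single '-'). The function name normalize_pep503 is a made-up helper for illustration and is not part of pdf_oxide or pip.

```rust
/// Normalize a project name as PEP 503 describes: lowercase it and
/// collapse every run of '-', '_' or '.' into a single '-'.
pub fn normalize_pep503(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut in_separator_run = false;
    for ch in name.chars() {
        if matches!(ch, '-' | '_' | '.') {
            // Emit one '-' for the whole run of separators.
            if !in_separator_run {
                out.push('-');
            }
            in_separator_run = true;
        } else {
            out.extend(ch.to_lowercase());
            in_separator_run = false;
        }
    }
    out
}
```

Both "pdf_oxide_fips" and "pdf-oxide-fips" normalize to "pdf-oxide-fips" under this rule.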

    Why parallel distributions instead of pip install pdf_oxide[fips]: Python extras (PEP 508) can add Python dependencies but cannot swap the compiled .so baked inside a wheel. The industry pattern (cryptography, pyOpenSSL) ships separate FIPS distributions; we follow suit.

    Why a go-fips submodule path: Go modules are import-path-bound, so users pick at go get time:

    go get github.com/yfedoseev/pdf_oxide/go            # default
    go get github.com/yfedoseev/pdf_oxide/go-fips       # FIPS
    

    Both submodules re-export the same Go API; only the linked native static lib differs.

Fixes

  • Restore manylinux_2_28 glibc floor for Python wheels. 0.3.42 and 0.3.43 published only manylinux_2_35 Linux glibc wheels because the release workflow ran maturin build directly on ubuntu-latest (Ubuntu 24.04, glibc 2.39), letting the runner's glibc set the wheel tag. That excluded Amazon Linux 2023 / AWS Lambda Python (glibc 2.34), RHEL 8, Ubuntu 20.04 and Debian 11 — pip rejected the wheel and fell back to a source build that OOM-killed rustup-init inside the Lambda build container. Reported by @potatochipcoconut on PR #463. Both release.yml (default wheels) and release-fips.yml (pdf_oxide_fips wheels) now build the Linux glibc wheels via PyO3/maturin-action inside the manylinux_2_28 container, and a CI guard step fails the job if a manylinux_2_28 wheel is not produced for either Linux target — preventing this regression from recurring. The 0.3.21 baseline (originally added in #284) is restored.
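Why a manylinux_2_35 wheel is rejected on a glibc 2.34 system while a manylinux_2_28 wheel is accepted comes down to a version comparison. The sketch below illustrates that comparison under the PEP 600 tag scheme. parse_manylinux and wheel_is_installable are hypothetical helpers written for this example, and pip's real compatibility logic checks more than this (architecture, Python ABI, and so on).

```rust
/// Parse a PEP 600 style tag such as "manylinux_2_28" or
/// "manylinux_2_28_x86_64" into its (major, minor) glibc floor.
pub fn parse_manylinux(tag: &str) -> Option<(u32, u32)> {
    let rest = tag.strip_prefix("manylinux_")?;
    // At most three parts: major, minor, and an optional architecture suffix.
    let mut parts = rest.splitn(3, '_');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}

/// A wheel tagged manylinux_X_Y is a candidate only when the system
/// glibc version is at least X.Y. Unrecognized tags are rejected.
pub fn wheel_is_installable(tag: &str, system_glibc: (u32, u32)) -> bool {
    match parse_manylinux(tag) {
        Some(required) => system_glibc >= required,
        None => false,
    }
}
```

For a system with glibc 2.34 (the Amazon Linux 2023 figure quoted above), wheel_is_installable("manylinux_2_35", (2, 34)) is false and wheel_is_installable("manylinux_2_28", (2, 34)) is true.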

Performance — extract_pages_to_bytes 12–54× faster

Extraction of page ranges from large PDFs is now bound by serialisation work instead of redundant document rebuilds and tree walks. Closes #474, reported by community contributor @potatochipcoconut, whose careful root-cause writeup (chunk-by-chunk timings, comparison against PyMuPDF's doc.select(), and a profiling-grade reproduction case from an AWS Lambda IDP pipeline) made this fix possible.

Measured on the public 1112-page / 38 MB Artificial Intelligence — A Modern Approach corpus (pdfs_slow2/) on an idle laptop:

Workload                          0.3.43                0.3.44               Speedup
extract_pages_to_bytes(0..300)    7301 ms / 36 MB out   382 ms / 12 MB out   19× + 3× smaller
extract_pages_to_bytes(0..50)     7983 ms / 36 MB out   155 ms / 4 MB out    51× + 9× smaller
Sequential 23 × 50-page chunks    ~3 min                1542 ms total        ~120×

Extrapolating to the reporter's 12k-page / 50 MB document chunked into four 3000-page slices: an AWS Lambda invocation that previously timed out at 900 s after two chunks now finishes the entire four-chunk batch in roughly 30 s.

Root causes

All in src/editor/document_editor.rs + src/document.rs:

  1. Triple full-document rewrite. extract_pages_to_bytes serialised the whole doc, re-parsed the bytes, removed pages one at a time, and serialised again — three full passes when one would do. Replaced with a non-mutating in-place trimmed page_order, restored after the save (even on Err).
  2. Garbage collector walked the original page tree. The trimmed /Pages dict was rebuilt locally inside write_full_to_writer, but collect_reachable_ids() started its BFS from the unmodified catalog and pulled in every dropped page's resources — so the output never shrank no matter how few pages were kept. Fixed by staging the trimmed /Pages dict in modified_objects before the save; the GC walker already prefers staged dicts over source.
  3. get_page_ref(i) in a 0..n loop is O(n²). Each call walks the page tree from the root and stops at the i-th leaf, so collecting all n leaf refs walks 1 + 2 + … + n nodes. New helper PdfDocument::all_page_refs() does it in one DFS. The flat-tree common case (root /Pages whose /Count matches Kids.len()) reads the ref array straight out of /Kids without touching individual leaves at all.
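The difference between the per-index lookup and the single traversal in item 3 can be illustrated on a toy page tree. The Node enum and both function bodies below are simplified assumptions made for this sketch. They borrow the names get_page_ref and all_page_refs from the notes but are not pdf_oxide's implementation, and the visited counter exists only to make the cost visible.

```rust
/// Toy stand-in for the /Pages tree (hypothetical, not pdf_oxide's model).
pub enum Node {
    /// Intermediate /Pages node with its /Kids.
    Pages(Vec<Node>),
    /// Leaf page, identified by an object number.
    Page(u32),
}

/// Per-index lookup: walks leaves from the root until the index-th one.
/// `visited` accumulates how many leaves were touched across calls.
pub fn get_page_ref(root: &Node, index: usize, visited: &mut usize) -> Option<u32> {
    fn walk(node: &Node, remaining: &mut usize, visited: &mut usize) -> Option<u32> {
        match node {
            Node::Page(id) => {
                *visited += 1;
                if *remaining == 0 {
                    Some(*id)
                } else {
                    *remaining -= 1;
                    None
                }
            }
            Node::Pages(kids) => {
                for kid in kids {
                    if let Some(id) = walk(kid, remaining, visited) {
                        return Some(id);
                    }
                }
                None
            }
        }
    }
    let mut remaining = index;
    walk(root, &mut remaining, visited)
}

/// Single depth-first traversal that collects every leaf exactly once.
pub fn all_page_refs(root: &Node) -> Vec<u32> {
    let mut refs = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        match node {
            Node::Page(id) => refs.push(*id),
            // Push kids in reverse so they pop in document order.
            Node::Pages(kids) => stack.extend(kids.iter().rev()),
        }
    }
    refs
}
```

On a 100-leaf tree, calling get_page_ref for every index touches 1 + 2 + … + 100 = 5050 leaves in total, whereas all_page_refs touches each leaf once and returns the same list.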

The same n² loop pattern was lurking in five other call sites on the reporter's hot path (their pipeline does PDF/A validate + convert before the chunked extract). All five collapsed to a single all_page_refs() call:

  • src/outline.rs: find_page_index (O(n²) per outline entry → O(n³) on documents with bookmarks).
  • src/editor/document_editor.rs line ~4275 — page-ref → index map for partial form-flatten.
  • src/editor/document_editor.rs line ~4505 — same map for get_form_fields().
  • src/compliance/validators.rs: validate_fonts (doc.validate_pdf_a('2b')).
  • src/compliance/converter.rs — per-page /AA strip (doc.convert_to_pdfa('2b')).
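Several of the call sites listed above need to go from a page reference back to its index. Once all leaf refs are available as one list, that lookup can be served from a map built in a single pass. The sketch below assumes page refs are plain u32 object numbers. page_index_map and resolve_targets are hypothetical helper names, not functions from the crate.

```rust
use std::collections::HashMap;

/// Build a page-ref -> page-index map in one pass over the leaf list.
pub fn page_index_map(page_refs: &[u32]) -> HashMap<u32, usize> {
    page_refs
        .iter()
        .enumerate()
        .map(|(index, &page_ref)| (page_ref, index))
        .collect()
}

/// Resolve many targets (for example outline destinations) with O(1)
/// lookups each, instead of re-walking the page tree per target.
pub fn resolve_targets(targets: &[u32], map: &HashMap<u32, usize>) -> Vec<Option<usize>> {
    targets.iter().map(|target| map.get(target).copied()).collect()
}
```

Building the map costs one pass over n pages, after which m lookups cost O(m) rather than O(m · n).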

New API

Two additions, both directly requested by @potatochipcoconut in #474; both available in Rust and Python (the other bindings can be added on demand):

# Batch extraction — same single-call efficiency, ergonomic for
# the chunked-for-OCR / chunked-for-S3 pattern.
chunks = doc.extract_page_ranges_to_bytes(
    [(0, 3000), (3000, 6000), (6000, 9000), (9000, 12000)]
)

# In-place selection — equivalent to PyMuPDF's doc.select(...).
# After this call, the document holds only the listed pages,
# in the order given. doc.save() / doc.save_to_bytes() then
# emit only those pages with garbage-collected resources.
doc.select_pages([1, 4, 7, 99])
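The range list passed to the batch call can be generated rather than written by hand. The helper below is a hypothetical convenience for callers, not part of the pdf_oxide API, and it assumes the ranges are half-open (start, end) pairs as the example above suggests.

```rust
/// Split `page_count` pages into consecutive half-open (start, end) ranges
/// of at most `chunk_size` pages each. The last range may be shorter.
pub fn chunk_ranges(page_count: usize, chunk_size: usize) -> Vec<(usize, usize)> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    (0..page_count)
        .step_by(chunk_size)
        .map(|start| (start, (start + chunk_size).min(page_count)))
        .collect()
}
```

chunk_ranges(12000, 3000) yields the four ranges shown in the batch example, and chunk_ranges(1112, 50) yields 23 ranges, the last being (1100, 1112).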

Known limitation

PDFs whose /Pages root publishes shared /Resources used by all leaf pages (typical of high-resolution book scans, atypical of office documents with subset fonts) still produce full-size chunk output: GC correctly preserves resources reachable from kept pages, and a single shared resource pool stays reachable as long as any kept page references it. The principled fix is per-page resource sub-setting — parsing each kept page's content stream to determine which fonts / XObjects are actually used and emitting a minimal /Resources for that page. That is a feature, not a bug fix, and is deferred from this release. The wall-clock speedup (12–54×) holds regardless.
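The limitation follows directly from how reachability-based garbage collection works. The sketch below runs a breadth-first walk over a toy object graph. The graph layout and the reachable function are assumptions made for illustration and do not mirror collect_reachable_ids().

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first reachability over a toy object graph, given as a map from
/// object id to the ids it references. Returns every id reachable from `roots`.
pub fn reachable(graph: &HashMap<u32, Vec<u32>>, roots: &[u32]) -> HashSet<u32> {
    let mut seen: HashSet<u32> = roots.iter().copied().collect();
    let mut queue: VecDeque<u32> = roots.iter().copied().collect();
    while let Some(id) = queue.pop_front() {
        if let Some(children) = graph.get(&id) {
            for &next in children {
                // insert() returns true only the first time an id is seen.
                if seen.insert(next) {
                    queue.push_back(next);
                }
            }
        }
    }
    seen
}
```

If pages 1 and 2 each reference their own font object, keeping only page 1 drops page 2's font. If both pages reference one shared pool object that in turn references every image, keeping page 1 keeps the whole pool, which is the full-size output described above.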

Tests

  • 5050 lib tests pass under --features python,fips (5039 default + 11 FIPS-only).
  • 119 encryption tests still pass byte-equal post-rewire to the trait.
  • 69 signatures tests still pass byte-equal post-rewire.
  • Hash vectors validated against NIST FIPS 180-4 for SHA-256/384/512 and RFC 1321 / 3174 for MD5 / SHA-1.
  • New regression tests cover the issue #474 workflow: test_extract_pages_chunked_sequential (4 sequential chunks on the same DocumentEditor, source observably unchanged between calls), test_extract_pages_non_sequential (out-of-order indices [3, 0, 4]), test_extract_page_ranges_to_bytes_batch, test_select_pages_in_place, and test_select_pages_out_of_range.

Known follow-ups (v0.3.45)

  • AwsLcProvider RSA-PKCS#1 v1.5 verify-from-digest (#475): AwsLcProvider::verify_rsa_pkcs1v15 is currently a stub; PDF/CMS signatures using RSA-PKCS#1 v1.5 return SignerVerify::Unknown instead of verifying under FIPS. Blocked on aws-lc-rs exposing a stable RSA_PKCS1_PRIM_VERIFY API. RustCryptoProvider (default) is not affected.
  • AwsLcProvider signing wiring — signing calls are currently routed to RustCryptoProvider. Full AWS-LC signing integration lands in v0.3.45.
  • musllinux Python wheels for the FIPS variant — FIPS musllinux wheels (Alpine / musl libc) require a musl-targeted aws-lc-fips-sys build; work in progress.

Installation

Rust (crates.io)

cargo add pdf_oxide

Python (PyPI)

pip install pdf_oxide

JavaScript/WASM (npm)

npm install pdf-oxide-wasm

CLI (Homebrew)

brew install yfedoseev/tap/pdf-oxide

CLI (Scoop — Windows)

scoop bucket add pdf-oxide https://github.com/yfedoseev/scoop-pdf-oxide
scoop install pdf-oxide

CLI (Shell installer)

curl -fsSL https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/install.sh | sh

CLI (cargo-binstall)

cargo binstall pdf_oxide_cli

MCP Server (for AI assistants)

cargo install pdf_oxide_mcp

Pre-built Binaries

Download archives for Linux, macOS, and Windows from the assets below. Each archive includes both pdf-oxide (CLI) and pdf-oxide-mcp (MCP server).

Platform Support

Platform  Architecture           Archive
Linux     x86_64 (glibc)         pdf_oxide-linux-x86_64-*.tar.gz
Linux     x86_64 (musl)          pdf_oxide-linux-x86_64-musl-*.tar.gz
Linux     ARM64                  pdf_oxide-linux-aarch64-*.tar.gz
macOS     x86_64 (Intel)         pdf_oxide-macos-x86_64-*.tar.gz
macOS     ARM64 (Apple Silicon)  pdf_oxide-macos-aarch64-*.tar.gz
Windows   x86_64                 pdf_oxide-windows-x86_64-*.zip

Changelog

See CHANGELOG.md for full details.

18 hours ago
rbatis

v4.9.4

What changed?

  • Added let tx: RBatisTxExecutorGuard = tx.auto_commit(); to support automatic commit/rollback. For example:
async fn transaction(tx: RBatisTxExecutor) -> Result<(), Error> {
    let tx = tx.auto_commit(); // defer commit or rollback
    log::info!("transaction [{}] start", tx.tx_id());
    let _ = Activity::insert(
        &tx,
        &Activity {
            id: Some("3".into()),
            name: Some("3".into()),
            pc_link: Some("3".into()),
            h5_link: Some("3".into()),
            pc_banner_img: None,
            h5_banner_img: None,
            sort: None,
            status: Some(3),
            remark: Some("3".into()),
            create_time: Some(DateTime::now()),
            version: Some(1),
            delete_flag: Some(1),
        },
    )
    .await;
    // If neither commit nor rollback is called, tx.done is false.
    // tx.commit().await?;
    Ok(())
}
1 day ago
ntex-mqtt

ntex-mqtt v8.0.0-beta.5

Changes

  • Use new codec api with BytePages support
1 day ago
quick-xml

v0.39.3 - Fix panic when parse malformed DTD

Bug Fixes

  • #950: Fix subtraction with overflow when parsing a malformed DTD in some cases. Note that the validity of the DTD is currently not checked, so the returned Event::DocType may contain the malformed DTD.
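For readers unfamiliar with this class of panic: in debug builds, subtracting a larger unsigned integer from a smaller one panics with "attempt to subtract with overflow". A common remedy is checked arithmetic. The function below is a generic illustration with a made-up name and is not the code changed in quick-xml.

```rust
/// Length of the span between two byte offsets, or None when `end`
/// precedes `start` (as can happen when the input is malformed).
pub fn span_len(start: usize, end: usize) -> Option<usize> {
    // checked_sub returns None instead of panicking on underflow.
    end.checked_sub(start)
}
```

A caller can then turn the None case into a parse error or a fallback instead of crashing.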

Full Changelog: https://github.com/tafia/quick-xml/compare/v0.39.2...v0.39.3

1 day ago
egui

0.34.2: Text layout and selection fixes



egui is an easy-to-use immediate mode GUI for Rust that runs on both web and native.

Try it now: https://www.egui.rs/

egui development is sponsored by Rerun, a startup building an SDK for visualizing streams of multimodal data.

⭐ Added

  • Add regression test for O(n²) word boundary scan #8077 by @hallyhaa

🐛 Fixed

  • Fix wrong color of last glyph of selected text #8075 by @emilk
  • Fix text selection of centered and right-aligned text #8076 by @emilk
  • Fix Context::is_pointer_over_egui and Context::egui_wants_pointer_input #8081 by @emilk
  • Fix centered & right aligned TextEdit #8082 by @lucasmerlin

🚀 Performance

  • Optimize text selection performance for large documents #7917 by @rustbasic
1 day ago
tokio

Tokio v1.52.2

1.52.2 (May 4th, 2026)

This release reverts the LIFO slot stealing change introduced in 1.51.0 (#7431), due to its performance impact. (#8100)

2 days ago
tokio

Tokio v1.51.2

1.51.2 (May 4th, 2026)

This release reverts the LIFO slot stealing change introduced in 1.51.0 (#7431), due to its performance impact. (#8100)

2 days ago
pdf_oxide

v0.3.43 | Cross-binding parity, WASI build target, and a basket of issue fixes.

Highlights

  • render_page_fit() now ships in all five bindings (Rust core + Python, Node.js / TypeScript, C#, Go). Picks the largest DPI such that both rendered dimensions fit inside a target pixel box, preserving aspect ratio. No more "what DPI hits 1024×768?" math on the caller's side. Fixes #441, closes #448.
  • Idiomatic page iteration parity across bindings. Rust gets page_indices(), Python gets .pages, Node.js gets [Symbol.asyncIterator] (the sync [Symbol.iterator] was already there). C# Pages and Go Pages() were already shipped. Closes #447.
  • WASI build target: cargo build --target wasm32-wasip1 now builds the lib cleanly on stable Rust. Unblocks @RALaBarge's external pdf-oxide-wasi stdin→stdout wrapper and any other consumer wanting to embed pdf_oxide in a sandboxed WASI runtime. CI now gates that the WASI build stays green. Closes #214.
  • Spurious-table fix on dense word grids — Roland's #405 lands via cherry-pick. A new has_split_modal_column_groups validator inspects the column co-occurrence graph across modal rows and rejects candidates whose populated columns split into two or more disconnected components — the signature of two adjacent text flows mis-clustered as one table. Composes cleanly with v0.3.42's Table::is_real_grid filter. Validated against the 86-PDF cross-build corpus: 888 / 888 byte-equal — zero observable change on common documents, the gate's value is in the safety net for adversarial cases.
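The DPI selection that render_page_fit() is described as doing in the first highlight can be worked out from the fact that PDF page dimensions are expressed in points, with 72 points per inch. The sketch below is an assumption about the arithmetic only. fit_dpi is a hypothetical name, and the shipped function may round or clamp differently.

```rust
/// Largest DPI at which a page of the given size (in PDF points, 72 per
/// inch) still fits inside a pixel box, preserving aspect ratio.
/// Page dimensions are assumed to be positive.
pub fn fit_dpi(page_w_pt: f64, page_h_pt: f64, box_w_px: f64, box_h_px: f64) -> f64 {
    // Rendered pixels = points / 72 * dpi, so solve for dpi on each axis.
    let dpi_for_width = box_w_px * 72.0 / page_w_pt;
    let dpi_for_height = box_h_px * 72.0 / page_h_pt;
    // The tighter axis decides; the other axis then has room to spare.
    dpi_for_width.min(dpi_for_height)
}
```

For a US Letter page (612 × 792 points) and a 1024 × 768 box, the height is the limiting axis: the result is 768 × 72 / 792 ≈ 69.8 DPI, which renders to roughly 593 × 768 pixels.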

Fixes

  • #456: PdfDocument::open(path) now populates source_bytes, unblocking convert_to_pdf_a(), the C FFI pdf_document_get_source_bytes, and any other API that re-reads the in-memory copy. Path-loaded documents previously got an empty Vec<u8> and hit "Invalid PDF header: File is empty (0 bytes read)" from the PDF/A converter. Reported by @potatochipcoconut on PR #445.
  • #451 — Standard14 PostScript fonts with no open-source equivalent (Symbol, ZapfDingbats) are now downgraded from hard FontNotEmbedded errors to a new KnownUnembeddableFont warning during PDF/A conversion. A document that's otherwise compliant no longer fails solely because of one symbolic font.
  • #395 — closed; verified the off-by-one C# ExceptionMapper fix in v0.3.38 actually resolves the reported RenderPage SignatureException [8500]. Added a Rust regression test that opens @gevorgter's exact reproducer PDF and asserts render_page succeeds. The fixture is pinned in pdf_oxide_tests.
  • #462 — dropped the scripts/modernize_stubs.py post-processor and the python_version = "3.8" setting from rylai.toml. Rylai's default already emits PEP-585 / PEP-604 syntax with from __future__ import annotations at the top, so post-processing was duplicate work in opposite directions. Runtime support for Python 3.8/3.9 is unaffected — .pyi stubs are type-checker artifacts, never imported at runtime. Reported by @monchin with a clean diagnosis of the root cause.

Behavior changes

  • PdfDocument::open(path) now reads the file once into memory rather than streaming via BufReader<File>. The doc comment already promised "Reads the entire file into memory"; this makes it true. Memory usage on open() is now equivalent to from_bytes(std::fs::read(path)?). Required by #456; the streaming reader was a partial optimisation no caller could rely on (every code path that touched source_bytes already required the in-memory copy).
  • PdfReader enum collapsed to a single in-memory variant — removed unused File variant. std::io::{Read, Seek, BufRead, …} imports are no longer cfg-gated, which is what unblocked the wasm32-wasip1 build target.

Dependencies

  • Batch-applied 9 dependabot bumps onto release/v0.3.43: CI workflows (golangci-lint-action v7→v9, setup-go 5.5→6.4, setup-node 4.4→6.4, github-script SHA refresh, scorecard-action 2.4.0→2.4.3), Go (testify 1.8→1.11 — was declared but unimported, dropped entirely), JS (rimraf 5→6 — @types/node deferred to a follow-up after a TypeScript-strict shake-out), Python (onnx ≥1.14→≥1.19.1).
  • The RustCrypto 0.8 stack (pkcs8 0.11, spki 0.8, der 0.8, digest 0.11, crypto-common 0.2, block-buffer 0.12) stays pinned — rsa 0.10 and p256/p384 0.14 are still RC upstream. See the existing pin note at Cargo.toml:185-187.

Internal

  • New wasm32-wasip1 build smoke check in .github/workflows/ci.yml alongside the existing wasm32-unknown-unknown job.
  • Regenerated SBOMs (pdf_oxide_cli/sbom.cdx.json, pdf_oxide_mcp/sbom.cdx.json) for 0.3.43.
  • New regression tests:
    • tests/test_issue_456_path_open_source_bytes.rs
    • tests/test_issue_447_page_indices.rs
    • tests/test_issue_395_render_page.rs
  • New unit tests on compliance::converter::downgrade_known_unembeddable_fonts.

Validation

86-PDF stratified corpus comparison (academic, mixed, forms, government, newspapers, theses, plus the three #211 fixtures), 888 sampled (pdf, page, method) triples across extract_text, to_plain_text, to_markdown, to_html:

  • v0.3.43 vs v0.3.42 — 888 / 888 byte-equal, zero deltas
  • v0.3.43 vs PyPI v0.3.41 — 860 equal, 28 reorder/de-dup, 0 real content losses (same profile as v0.3.42's regression report)

Community contributors

This release exists because of the community. Special thanks to:

  • @RolandWArnold — landed the spurious-table fix in #405. After iterating away from an earlier density-gate framing, the shipped form is has_split_modal_column_groups: a connected-component check on the column co-occurrence graph across modal rows that flags two-flow grids the regular-row-ratio gate accepts. Roland's doc-comment explicitly flags it as a heuristic, making it easy to revisit later. The fix composes with v0.3.42's struct-tree-aware reading-order rewire without any merge conflict.
  • @RALaBarge — built an external WASI binary wrapper for pdf_oxide (pdf-oxide-wasi) and reported in #214 that it required nightly Rust because of an internal ceil_char_boundary call. That call was already removed; this release fixes the second hidden blocker (cfg-gated std::io imports) and adds CI gating so the WASI target stays green.
  • @gevorgter — flagged two rendering-area gaps: the C# binding's misleading SignatureException on RenderPage (#395, fixed in v0.3.38, regression-guarded here) and the lack of a pixel-dimension render API (#441, closed by render_page_fit shipping in all five bindings).
  • @potatochipcoconut — surfaced the convert_to_pdf_a failure on path-loaded documents while testing PR #445; the investigation traced it to the empty source_bytes field and produced the one-line fix in this release (#456).
  • @monchin — pointed out (#462) that scripts/modernize_stubs.py was redundant work because rylai itself controls the typing flavour via its python_version setting, and noted that office/barcodes/ocr feature alignment between rylai.toml and the released wheel is worth a follow-up. The cleaner stub pipeline ships in this release.
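The connected-component idea behind has_split_modal_column_groups, credited to @RolandWArnold above, can be sketched with a small union-find. This is a simplified illustration under stated assumptions: each row is reduced to the list of column indices it populates, the selection of modal rows is skipped, and has_split_column_groups is a hypothetical name rather than the shipped validator.

```rust
use std::collections::HashSet;

/// Returns true when the populated columns fall into two or more groups
/// that never share a row, the signature of two text flows side by side.
/// Every column index in `rows` must be below `column_count`.
pub fn has_split_column_groups(rows: &[Vec<usize>], column_count: usize) -> bool {
    /// Find the representative of `x`, shortening paths along the way.
    fn find(parent: &mut [usize], mut x: usize) -> usize {
        while parent[x] != x {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        x
    }

    let mut parent: Vec<usize> = (0..column_count).collect();
    let mut populated = vec![false; column_count];

    for row in rows {
        for &col in row {
            populated[col] = true;
        }
        // Columns that appear together in one row belong to one group.
        for pair in row.windows(2) {
            let a = find(&mut parent, pair[0]);
            let b = find(&mut parent, pair[1]);
            parent[a] = b;
        }
    }

    // Count distinct group representatives among populated columns.
    let mut roots = HashSet::new();
    for col in 0..column_count {
        if populated[col] {
            roots.insert(find(&mut parent, col));
        }
    }
    roots.len() >= 2
}
```

Rows that alternate between columns {0, 1} and columns {2, 3} produce two disconnected groups and are rejected, while a grid in which any row spans both sides stays in a single group and passes.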

Installation

Rust (crates.io)

cargo add pdf_oxide

Python (PyPI)

pip install pdf_oxide

JavaScript/WASM (npm)

npm install pdf-oxide-wasm

CLI (Homebrew)

brew install yfedoseev/tap/pdf-oxide

CLI (Scoop — Windows)

scoop bucket add pdf-oxide https://github.com/yfedoseev/scoop-pdf-oxide
scoop install pdf-oxide

CLI (Shell installer)

curl -fsSL https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/install.sh | sh

CLI (cargo-binstall)

cargo binstall pdf_oxide_cli

MCP Server (for AI assistants)

cargo install pdf_oxide_mcp

Pre-built Binaries

Download archives for Linux, macOS, and Windows from the assets below. Each archive includes both pdf-oxide (CLI) and pdf-oxide-mcp (MCP server).

Platform Support

Platform  Architecture           Archive
Linux     x86_64 (glibc)         pdf_oxide-linux-x86_64-*.tar.gz
Linux     x86_64 (musl)          pdf_oxide-linux-x86_64-musl-*.tar.gz
Linux     ARM64                  pdf_oxide-linux-aarch64-*.tar.gz
macOS     x86_64 (Intel)         pdf_oxide-macos-x86_64-*.tar.gz
macOS     ARM64 (Apple Silicon)  pdf_oxide-macos-aarch64-*.tar.gz
Windows   x86_64                 pdf_oxide-windows-x86_64-*.zip

Changelog

See CHANGELOG.md for full details.

2 days ago
rbatis

v4.9.3

What changed?

  • Deprecated MssqlTableMapper, MysqlTableMapper, PGTableMapper, and SqliteTableMapper.
  • Updated the table sync plugin to use simpler code. For example:
let rb = RBatis::new();
// init returns a Result; propagate a bad driver or URL instead of ignoring it.
rb.init(rbdc_sqlite::SqliteDriver {}, "sqlite://target/sqlite.db")?;
let conn = rb.acquire().await?;
// Sync the "activity" table from the fields of this sample value.
_ = RBatis::sync(
    &conn,
    &rb,
    &Activity {
        id: Some(String::new()),
        name: Some(String::new()),
        pc_link: Some(String::new()),
        h5_link: Some(String::new()),
        pc_banner_img: Some(String::new()),
        h5_banner_img: Some(String::new()),
        sort: Some(String::new()),
        status: Some(0),
        remark: Some(String::new()),
        create_time: Some(DateTime::now()),
        version: Some(0),
        delete_flag: Some(0),
    },
    "activity",
)
.await;