@faceless-photolib/backend-webgpu

WebGPU backend for faceless-photolib — WGSL codegen and pipeline/bind-group descriptor construction from the render-graph IR, with one shading-language codebase for browser, Node (Dawn), and Expo/RN.

The WebGPU backend for faceless-photolib — a headless, color-managed, GPU-accelerated image-editing engine.

It lowers the engine's frozen render-graph IR to WebGPU: per-pass WGSL codegen plus device-independent pipeline / bind-group descriptors. The GPU work is authored once in WGSL behind a thin, structural GpuBackend port, so the same code runs on browser native WebGPU, Node (Dawn via webgpu), and Expo/React Native (react-native-wgpu/Dawn). When no usable adapter is present the backend returns backend-unavailable — never a silent CPU fallback.

Install

pnpm add @faceless-photolib/backend-webgpu

Usage

import {
  createBackend,
  generateWgsl,
  planGraph,
  renderPipelineFor,
  INTERMEDIATE_FORMAT,
} from "@faceless-photolib/backend-webgpu";
import type { CompiledRenderGraph } from "@faceless-photolib/render-graph";

// Acquire a device and build the live backend. Absence of an adapter is a
// `backend-unavailable` Result, not a throw or a CPU fallback.
const result = await createBackend();
if (result.kind === "ok") {
  const backend = result.value; // GpuBackend: { kind, info(), render(), dispose() }
  // backend.render(graph) returns a Resource<RenderResult> (loading / error / ready).
}

// Codegen + descriptors are computed without a device, so they are fully
// inspectable and testable. Lower one pass to a complete WGSL module:
declare const graph: CompiledRenderGraph;
const pass = graph.passes[0];
const wgsl = generateWgsl(pass); // Result<string> — ok, or `not-implemented` per pass
if (wgsl.kind === "ok") {
  const descriptor = renderPipelineFor(pass, wgsl.value);
  // descriptor.targetFormat === INTERMEDIATE_FORMAT ("rgba16float")
}

// Or lower the whole graph at once (validates every pass is lowerable first):
const plan = planGraph(graph); // Result<RenderPlan>

API

| Export | Description |
| --- | --- |
| createBackend() | Acquires a WebGPU device and builds the live GpuBackend; missing adapter → backend-unavailable. |
| createUninitializedBackend() | A synchronous GpuBackend handle that reports unavailable until a device is acquired. |
| acquireBackend() | The underlying device-acquisition routine returning Result<GpuBackend>. |
| makeDeviceBackend(device, state) | Builds the live GpuBackend over an already-acquired device. |
| planGraph(graph) | Lowers every pass of a CompiledRenderGraph to its pipeline descriptor, returning Result<RenderPlan>. |
| executePlanUnverified(device, plan, state) | The device-UNVERIFIED encode path (pipeline build per pass); awaited by the engine-runtime orchestrator. |
| generateWgsl(pass) | Lowers one RenderPass to a complete WGSL module (Result<string>). |
| generateSourceWgsl / generateBlendWgsl / generateAdjustmentWgsl / generateConversionWgsl / generateLut3dWgsl / generateOutputTransformWgsl | Per-pass WGSL generators for each render-graph node kind. |
| renderPipelineFor(pass, code) | Builds a device-independent RenderPipelineDescriptor for a pass + its WGSL. |
| bindGroupLayoutFor(pass) | Builds the BindGroupLayoutDescriptor matching the WGSL @group(0) bindings. |
| inputCount(pass) | The number of distinct image inputs a pass consumes. |
| lut3dTextureData(descriptor) | Packs a 3D-LUT descriptor into the rgba16float texture-upload data. |
| probeGpuApi() | Detects a usable navigator.gpu, returning a present/absent probe result. |
| INTERMEDIATE_FORMAT | The intermediate connection-space render-target format ("rgba16float"). |
| transferFnDecls / collectTransferDecls / encodeFnName / decodeFnName / transferFnTag | WGSL transfer-function (EOTF/OETF) declaration helpers. |

Types are re-exported for the conformance harness and engine-runtime orchestrator: RenderPlan, RenderPipelineDescriptor, BindGroupLayoutDescriptor, BindGroupLayoutEntry, and the structural WebGPU surface (GpuApi, GpuAdapter, GpuDevice, GpuNavigator, GpuApiProbe).

The widened v1 passes (fill, text, colorTransform, resample, mask, clip) and the ACES output-transform tone-scale are surfaced loudly as not-implemented rather than rendering a silent identity. The widened passes are realized in the CPU reference backend today; the ACES tone-scale has no realization in any of this package's dependencies yet.

License

MIT

API reference

33 public exports · 33 documented · generated from source.

acquireBackend (function)
acquireBackend(): Promise<Result<GpuBackend>>

Acquire a WebGPU adapter + device and build the backend. The absence of an adapter is `backend-unavailable` (never a throw, never a CPU fallback). A device that is acquired but immediately lost is wired so the next render reports `error`.

bindGroupLayoutFor (function)
bindGroupLayoutFor(pass: RenderPass): BindGroupLayoutDescriptor

Build the bind-group-layout descriptor for a pass, matching the `@group(0)` binding numbers the WGSL codegen emits:

- single-input fragment passes: 0 = src texture, 1 = sampler
- blend: 0 = backdrop texture, 1 = sampler, 2 = source texture
- lut3d: 0 = src texture, 1 = sampler, 2 = 3D LUT texture, 3 = LUT sampler

`outputTransform` has no lowerable WGSL (the pass rejects), so it has no pipeline; we still describe its single-input layout for symmetry, but callers never build a pipeline for it.
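The binding numbering described here can be sketched as a small lookup. This is an illustrative sketch only, not the package's implementation: `SketchPassKind`, `SketchEntry`, and `sketchBindings` are hypothetical names, and the real `BindGroupLayoutEntry` shape is not shown in this document.

```typescript
// Hypothetical, simplified pass kinds; the real RenderPass union is richer.
type SketchPassKind = "single-input" | "blend" | "lut3d";

// Hypothetical entry shape: a binding number plus what is bound there.
interface SketchEntry {
  readonly binding: number;
  readonly resource: "texture-2d" | "texture-3d" | "sampler";
  readonly role: string;
}

// Returns the @group(0) entries in binding order for one pass kind.
function sketchBindings(kind: SketchPassKind): readonly SketchEntry[] {
  // Bindings 0 and 1 are shared by every kind: a 2D texture and its sampler.
  const base: SketchEntry[] = [
    { binding: 0, resource: "texture-2d", role: kind === "blend" ? "backdrop" : "src" },
    { binding: 1, resource: "sampler", role: "sampler" },
  ];
  switch (kind) {
    case "single-input":
      return base;
    case "blend":
      return [...base, { binding: 2, resource: "texture-2d", role: "source" }];
    case "lut3d":
      return [
        ...base,
        { binding: 2, resource: "texture-3d", role: "lut" },
        { binding: 3, resource: "sampler", role: "lut sampler" },
      ];
  }
}
```

Keeping bindings 0 and 1 identical across kinds means a single-input fragment and a blend fragment can share the same sampling prologue; only the extra bindings differ.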

collectTransferDecls (function)
collectTransferDecls(fns: readonly ({ kind: "linear"; } | { kind: "gamma"; exponent: number; } | { kind: "sRGB"; } | { kind: "rec709"; } | { kind: "rec2020"; } | { kind: "pq"; } | { kind: "hlg"; } | { kind: "logC3"; } | { kind: "logC4"; } | { ...; } | { ...; } | { ...; } | { ...; })[]): string

Emit the deduplicated transfer-fn declarations for a set of transfer fns. The `fp_log10` prelude is emitted exactly once and each unique curve tag's decode+encode helper pair exactly once — so combining several conversions into one WGSL module (e.g. the adjustment sandwich's two conversions) cannot produce a duplicate-definition compile error.
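A minimal sketch of the deduplication idea, assuming each curve is identified by a stable tag. `SketchCurve`, `SKETCH_PRELUDE`, and `sketchCollectDecls` are hypothetical; the real prelude body and helper text are not shown in this document, so the WGSL strings below are placeholders.

```typescript
// Hypothetical stand-in for one curve's generated helpers.
interface SketchCurve {
  readonly tag: string;          // stable identifier suffix, e.g. "srgb"
  readonly declarations: string; // WGSL text for the decode + encode pair
}

// Placeholder prelude (log10 via log2); the package's actual body may differ.
const SKETCH_PRELUDE =
  "fn fp_log10(x: f32) -> f32 { return log2(x) * 0.30102999566; }";

// Emits the prelude once, then each unique tag's declarations once,
// preserving first-seen order.
function sketchCollectDecls(curves: readonly SketchCurve[]): string {
  const seen = new Set<string>();
  const parts: string[] = [SKETCH_PRELUDE];
  for (const curve of curves) {
    if (seen.has(curve.tag)) continue; // same curve already declared
    seen.add(curve.tag);
    parts.push(curve.declarations);
  }
  return parts.join("\n");
}
```

With this shape, two conversions that both reference the same curve contribute one helper pair to the combined module, which is what avoids a duplicate-definition error.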

createBackend (function)
createBackend(): Promise<Result<GpuBackend>>

Acquire a WebGPU device + build the backend. Absence of an adapter is `backend-unavailable`.

createUninitializedBackend (function)
createUninitializedBackend(): GpuBackend

A synchronous handle that reports unavailable until a device is acquired (port shape only).

decodeFnName (function)
decodeFnName(fn: { kind: "linear"; } | { kind: "gamma"; exponent: number; } | { kind: "sRGB"; } | { kind: "rec709"; } | { kind: "rec2020"; } | { kind: "pq"; } | { kind: "hlg"; } | { kind: "logC3"; } | { kind: "logC4"; } | { ...; } | { ...; } | { ...; } | { ...; }): string

Name of the generated decode helper for a transfer fn.

encodeFnName (function)
encodeFnName(fn: { kind: "linear"; } | { kind: "gamma"; exponent: number; } | { kind: "sRGB"; } | { kind: "rec709"; } | { kind: "rec2020"; } | { kind: "pq"; } | { kind: "hlg"; } | { kind: "logC3"; } | { kind: "logC4"; } | { ...; } | { ...; } | { ...; } | { ...; }): string

Name of the generated encode helper for a transfer fn.

executePlanUnverified (function)
executePlanUnverified(device: GpuDevice, plan: RenderPlan, state: DeviceState): Promise<Resource<{ width: number; height: number; colorSpace: string & $brand<"ColorSpaceId">; pixels: Float32Array<...>; }>>

The device-UNVERIFIED GPU execution: encode each pass's pipeline into ping-pong `rgba16float` targets and read back the output as f32 RGBA. NOT executed on this host. Returns a `Resource` so a future async port (or the engine-runtime orchestrator) can adopt it; here it is awaited by no one and exists to make the encode path real rather than a stub.

generateAdjustmentWgsl (function)
generateAdjustmentWgsl(desc: AdjustmentDescriptor): Result<string>

Generate the full WGSL fragment-shader module for an adjustment pass. The fragment runs ONLY the effect math on the source buffer (already in the working space — the surrounding `colorConversion` passes are separate IR nodes the render-graph compiler emits; re-wrapping here would double-convert). Alpha passes through untouched (these are tonal/color RGB adjustments). Unknown effect → `rejected("not-implemented")` + beacon; out-of-range params → `invalid-request` (the forwarded failure). Never a silent identity.

generateBlendWgsl (function)
generateBlendWgsl(desc: BlendDescriptor): string

Generate the full WGSL fragment-shader module for a blend pass. Reads two inputs (backdrop = group 0 binding 0, source = binding 2) as premultiplied RGBA and writes the premultiplied composite. `desc.opacity`/`desc.fillOpacity` are baked as constants (they are part of the pass identity / Merkle key). Fires the same `warnDegraded` beacon the CPU reference does for an uncalibrated Special-8 Fill response (D5) — never silently.

generateConversionWgsl (function)
generateConversionWgsl(desc: ColorConversionDescriptor): string

Generate a full WGSL fragment-shader module for a standalone colorConversion pass. The fragment reads the single input texture, converts the RGB, and writes straight RGBA (alpha passed through). Used by the per-pass pipeline.

generateLut3dWgsl (function)
generateLut3dWgsl(desc: Lut3dDescriptor): string

Generate the full WGSL fragment-shader module for a lut3d pass. The 3D LUT is bound at group 0 binding 2 (`texture_3d<f32>`), the source at binding 0. Domain min/max are baked numerically.

generateOutputTransformWgsl (function)
generateOutputTransformWgsl(desc: OutputTransformDescriptor): Result<string>

Output-transform WGSL codegen (backend-webgpu; D3). The output transform is the ACES **Display + View** transform, NOT a plain colorspace convert (D3), so its RRT/ODT tone-scale is intrinsic to the node's meaning.

That tone-scale math has **no realization in any of this package's dependencies**: `@…/color` emits only the `{version, display}` descriptor (no RRT/ODT coefficients), the CPU reference backend's render is itself stubbed (no golden output to parity-test against), and no RRT/ODT tone-scale coefficients are pinned in the research corpus (only transfer functions and primaries are). Emitting a gamut+transfer shader without the tone-scale would present the node as having run while its defining step did not, which is the exact silent-identity substitution the gpu-backend spec forbids, and writing the tone-scale ourselves would mean inventing unpinned color science (project.md §5 forbids that).

Therefore BOTH versions return `rejected("not-implemented")` + a beacon, by design and by force of the no-invented-color-math rule; never a partial or identity pass. ACES 2.0 is additionally gated to a later phase (D3). The display-space *encode* (gamut rotation + display transfer fn) is already faithfully covered by the `colorConversion` lowering as its own node; this module intentionally has no display-encode generator (it would be dead code).

generateSourceWgsl (function)
generateSourceWgsl(): string

Source-pass WGSL codegen (backend-webgpu). A `source` pass uploads the asset (resolved by content hash) to an input texture and samples it into the connection-space buffer. The fragment is a straight sample (the asset is already in the layer's declared color space; the compiler emits a separate `colorConversion` pass to bring it into the connection space when needed — this pass does no color math, by design). NOTE: GPU execution is UNVERIFIED on this host.

generateWgsl (function)
generateWgsl(pass: RenderPass): Result<string>

Per-pass WGSL dispatcher (backend-webgpu; D2). Lowers one backend-agnostic `RenderPass` from the frozen render-graph IR into a complete WGSL module (vertex + fragment). One module + one pipeline per pass, matching the IR's one-effect-per-pass structure.

Returns `Result<string>` uniformly: `source`, `blend`, `colorConversion`, and `lut3d` always lower (wrapped in `ok`); `adjustment` and `outputTransform` already return `Result` (an unknown adjustment effect or the unrealized ACES tone-scale surface as `not-implemented` + beacon, never a silent identity). `match().exhaustive()` on `pass.node` means adding a new `RenderPass` variant to the frozen union is a compile error here until it is handled, never a silent miss.

The widened v1 passes (`fill`, `text`, `colorTransform`, `resample`, `mask`, `clip`) are realized first in the CPU reference backend (the golden source). Their WGSL lowering is a later phase (GPU execution is out of scope on the GPU-less host this was built on), so they return `not-implemented` + a beacon here rather than a silent identity; the CPU backend renders them today.
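The exhaustiveness guarantee can be illustrated without any library. The sketch below uses a plain `switch` with a `never` check in place of `match().exhaustive()`; `DispatchResult`, `DispatchNode`, and `sketchGenerate` are hypothetical, and the shape of the rejected variant is an assumption (this document only shows `kind === "ok"`).

```typescript
// Assumed Result shape; only the "ok" variant is confirmed by the usage example.
type DispatchResult<T> =
  | { kind: "ok"; value: T }
  | { kind: "rejected"; reason: "not-implemented"; detail: string };

// Hypothetical three-member subset of the node union.
type DispatchNode = { kind: "source" } | { kind: "blend" } | { kind: "fill" };

function sketchGenerate(node: DispatchNode): DispatchResult<string> {
  switch (node.kind) {
    case "source":
      return { kind: "ok", value: "// WGSL for a source pass" };
    case "blend":
      return { kind: "ok", value: "// WGSL for a blend pass" };
    case "fill":
      // Not lowered yet: reject loudly instead of emitting an identity shader.
      return { kind: "rejected", reason: "not-implemented", detail: "fill" };
    default: {
      // If a new variant is added to DispatchNode and not handled above,
      // this assignment stops compiling.
      const unhandled: never = node;
      throw new Error(`unhandled node: ${JSON.stringify(unhandled)}`);
    }
  }
}
```

Adding a fourth member to `DispatchNode` without a matching `case` turns the `never` assignment into a type error, which is the same compile-time safety net the dispatcher relies on.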

inputCount (function)
inputCount(pass: RenderPass): number

The number of distinct *image inputs* a pass consumes (used by the render orchestrator to wire the right intermediate buffers in). `source` reads its uploaded asset (1 external input), `blend` reads backdrop + source (2), single-input effects read 1, and the `lut3d` pass additionally binds a 3D LUT texture (not counted here — it is a resource, not a graph-edge input).
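As a rough sketch of the counting rule, assuming a simplified set of node kinds (`SketchKind` and `sketchInputCount` are hypothetical names, and the real function takes a full `RenderPass`):

```typescript
// Hypothetical subset of node kinds named in this document.
type SketchKind = "source" | "blend" | "adjustment" | "colorConversion" | "lut3d";

// Counts image inputs only; bound resources such as the 3D LUT are excluded.
function sketchInputCount(kind: SketchKind): number {
  switch (kind) {
    case "blend":
      return 2; // backdrop + source
    case "source":          // the uploaded asset (one external input)
    case "adjustment":
    case "colorConversion":
    case "lut3d":           // the LUT texture is a resource, not a graph edge
      return 1;
  }
}
```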

lut3dTextureData (function)
lut3dTextureData(desc: Lut3dDescriptor): Float32Array<ArrayBufferLike>

Build the row-major `Float32Array` upload buffer for the 3D LUT texture. The descriptor `data` is already R-fastest RGBA in exactly the order WebGPU's `writeTexture` expects for a 3D texture of extent (size, size, size) with R→x, G→y, B→z — so this is a faithful copy with a length assertion.
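The copy-with-assertion and the R-fastest layout can be sketched as follows. `SketchLut`, `sketchLutTextureData`, and `sketchTexelOffset` are hypothetical names, and the descriptor fields (`size`, `data`) are assumed from the description rather than taken from the real `Lut3dDescriptor`.

```typescript
// Assumed descriptor shape: a cubic LUT with `size` samples per axis.
interface SketchLut {
  readonly size: number;
  readonly data: readonly number[]; // RGBA values, R varying fastest
}

// Copies the LUT into an upload buffer after checking its length.
function sketchLutTextureData(lut: SketchLut): Float32Array {
  const expected = lut.size ** 3 * 4; // size^3 texels, 4 channels each
  if (lut.data.length !== expected) {
    throw new RangeError(`expected ${expected} values, got ${lut.data.length}`);
  }
  return Float32Array.from(lut.data);
}

// Offset of the RGBA texel at LUT coordinate (r, g, b), with R mapped to x,
// G to y, and B to z, so x varies fastest and z slowest.
function sketchTexelOffset(size: number, r: number, g: number, b: number): number {
  return ((b * size + g) * size + r) * 4;
}
```

Because the source order already matches the texture layout, no reordering is needed; the length check is the only guard against a malformed descriptor.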

makeDeviceBackend (function)
makeDeviceBackend(device: GpuDevice, state: DeviceState): GpuBackend

Build the live backend over an acquired device. `render` is synchronous (the frozen port shape), so it cannot await GPU readback; it validates the graph by lowering it, surfaces a lost device as `Resource.error`, and otherwise reports the work as in-flight (`loading`) after building the plan. The actual encode/submit/map-readback is device-UNVERIFIED and lives in `executePlanUnverified`, dispatched here but not awaited by the sync port.

planGraph (function)
planGraph(graph: CompiledRenderGraph): Result<RenderPlan>

Lower every pass of a compiled graph to its pipeline descriptor. A pass that cannot be lowered (an unsupported adjustment effect, or the unrealized ACES output transform) fails the whole plan loudly with the forwarded failure — never a silently dropped pass.
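The fail-loudly behaviour amounts to a traversal that stops at the first failure and forwards it unchanged. A minimal generic sketch, assuming a two-variant `Result` (`PlanResult` and `sketchPlan` are hypothetical, and the real `RenderPlan` carries more than a list of descriptors):

```typescript
// Assumed Result shape; the real failure variants are not shown in this document.
type PlanResult<T> =
  | { kind: "ok"; value: T }
  | { kind: "rejected"; reason: string };

// Lowers every pass in order. The first failure aborts the whole plan and is
// returned as-is, so no pass is ever silently dropped.
function sketchPlan<P, D>(
  passes: readonly P[],
  lower: (pass: P) => PlanResult<D>,
): PlanResult<readonly D[]> {
  const descriptors: D[] = [];
  for (const pass of passes) {
    const lowered = lower(pass);
    if (lowered.kind !== "ok") return lowered; // forward the failure
    descriptors.push(lowered.value);
  }
  return { kind: "ok", value: descriptors };
}
```

Since no device is involved, a caller can run this kind of validation up front and only acquire GPU resources once the whole graph is known to be lowerable.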

probeGpuApi (function)
probeGpuApi(): GpuApiProbe

Probe the host for a WebGPU entry point.

renderPipelineFor (function)
renderPipelineFor(pass: RenderPass, code: string): RenderPipelineDescriptor

Build the render-pipeline descriptor for a pass given its generated WGSL. The full-screen triangle (`vs_main`) + the pass fragment (`fs_main`) write to a single `rgba16float` target.
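For readers unfamiliar with the full-screen-triangle technique, here is a sketch of how three vertices can be derived from the vertex index alone. The WGSL in `SKETCH_VS_MAIN` is an assumption about what such a `vs_main` might look like, not the code this package emits; `sketchTriangleVertex` mirrors the same arithmetic in TypeScript so it can be checked.

```typescript
// Clip-space position for vertex i in 0..2. The triangle (-1,-1), (3,-1),
// (-1,3) covers the whole [-1,1] x [-1,1] viewport; the excess is clipped.
function sketchTriangleVertex(i: number): readonly [number, number] {
  const x = ((i << 1) & 2) * 2 - 1;
  const y = (i & 2) * 2 - 1;
  return [x, y];
}

// Hypothetical WGSL using the same bit arithmetic on @builtin(vertex_index).
const SKETCH_VS_MAIN = `
struct VsOut {
  @builtin(position) pos: vec4<f32>,
  @location(0) uv: vec2<f32>,
}

@vertex
fn vs_main(@builtin(vertex_index) i: u32) -> VsOut {
  let x = f32((i << 1u) & 2u) * 2.0 - 1.0;
  let y = f32(i & 2u) * 2.0 - 1.0;
  var out: VsOut;
  out.pos = vec4<f32>(x, y, 0.0, 1.0);
  // Map clip space to texture coordinates with a top-left origin.
  out.uv = vec2<f32>((x + 1.0) * 0.5, (1.0 - y) * 0.5);
  return out;
}
`;
```

A single oversized triangle needs no vertex buffer and avoids the diagonal seam of a two-triangle quad, which is why it is a common choice for per-pass fragment work.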

transferFnDecls (function)
transferFnDecls(fn: { kind: "linear"; } | { kind: "gamma"; exponent: number; } | { kind: "sRGB"; } | { kind: "rec709"; } | { kind: "rec2020"; } | { kind: "pq"; } | { kind: "hlg"; } | { kind: "logC3"; } | { kind: "logC4"; } | { ...; } | { ...; } | { ...; } | { ...; }): string

Emit the WGSL declarations (the log10 prelude + the named decode/encode helpers) for a transfer fn. `linear` still emits real `return x;` helpers (no silent omission) so a conversion always has a callable function.

transferFnTag (function)
transferFnTag(fn: { kind: "linear"; } | { kind: "gamma"; exponent: number; } | { kind: "sRGB"; } | { kind: "rec709"; } | { kind: "rec2020"; } | { kind: "pq"; } | { kind: "hlg"; } | { kind: "logC3"; } | { kind: "logC4"; } | { ...; } | { ...; } | { ...; } | { ...; }): string

A stable WGSL identifier suffix for a transfer fn (so two different curves yield two different helper names, but the same curve always reuses one).
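One way such a tag could be derived is sketched below. The naming scheme (lowercased kind, with a gamma exponent spelled into the suffix) is purely an assumption for illustration; `SketchTransferFn` and `sketchTag` are hypothetical, and the package's actual tags are not shown in this document.

```typescript
// Hypothetical subset of the transfer-fn union from the signatures above.
type SketchTransferFn =
  | { kind: "linear" }
  | { kind: "sRGB" }
  | { kind: "pq" }
  | { kind: "gamma"; exponent: number };

// Produces an identifier-safe suffix: same curve, same tag; different curve
// (including a different gamma exponent), different tag.
function sketchTag(fn: SketchTransferFn): string {
  if (fn.kind === "gamma") {
    // "." and "-" are not valid in identifiers, so spell them out.
    const digits = String(fn.exponent).replace(/[^0-9A-Za-z]/g, (c) =>
      c === "." ? "p" : c === "-" ? "m" : "_",
    );
    return `gamma_${digits}`;
  }
  return fn.kind.toLowerCase();
}
```

For example, under this assumed scheme a gamma curve with exponent 2.2 and one with exponent 2.4 get distinct suffixes, so their helpers can coexist in one module.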

BindGroupLayoutDescriptor (interface)
interface BindGroupLayoutDescriptor

A bind-group-layout descriptor for one pass.

GpuAdapter (interface)
interface GpuAdapter

Adapter handle; `requestDevice` resolves to a usable `GpuDevice` or rejects.

GpuApi (interface)
interface GpuApi

The `navigator.gpu` entry point. `requestAdapter` may resolve to `null`.

GpuDevice (interface)
interface GpuDevice

Minimal device surface. Methods accept the descriptor shapes in `pipeline.ts`.

GpuNavigator (interface)
interface GpuNavigator

A `navigator`-shaped object that MAY expose `.gpu`.

RenderPipelineDescriptor (interface)
interface RenderPipelineDescriptor

A render-pipeline descriptor (device-independent; structurally WebGPU-ish).

RenderPlan (interface)
interface RenderPlan

The lowered plan for a graph: one pipeline descriptor per pass, in execution order, plus the output pass id. Built without a device, so a graph can be fully validated (every pass lowerable?) before any GPU work.

BindGroupLayoutEntry (type)
type BindGroupLayoutEntry

A single bind-group-layout entry (subset of GPUBindGroupLayoutEntry).

GpuApiProbe (type)
type GpuApiProbe

Read `navigator.gpu` off `globalThis` without assuming a browser. Node 22 exposes a global `navigator` (without `.gpu` unless a Dawn binding installs one), so we probe `globalThis.navigator?.gpu` rather than referencing the `navigator` global directly (which would be a ReferenceError where absent). Returns a named union variant — never a bare `null`/`undefined` crossing the package's own API surface.
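The probing approach can be sketched like this. The variant names follow the "present/absent" wording used for `probeGpuApi()` in the export table, but the exact fields of `GpuApiProbe` are not shown in this document, so `SketchProbe`, `SketchHost`, and `sketchProbe` are hypothetical.

```typescript
// Assumed named-union result; no bare null/undefined leaves the function.
type SketchProbe = { kind: "present"; gpu: object } | { kind: "absent" };

// Anything navigator-shaped: both `navigator` and `.gpu` may be missing.
interface SketchHost {
  readonly navigator?: { readonly gpu?: unknown };
}

// Reads navigator.gpu through a host object (globalThis by default), so a
// missing `navigator` yields "absent" instead of a ReferenceError.
function sketchProbe(
  host: SketchHost = globalThis as unknown as SketchHost,
): SketchProbe {
  const gpu = host.navigator?.gpu;
  return typeof gpu === "object" && gpu !== null
    ? { kind: "present", gpu }
    : { kind: "absent" };
}
```

Taking the host as a parameter also makes the probe easy to exercise in tests without touching the real global object.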

INTERMEDIATE_FORMAT (const)
INTERMEDIATE_FORMAT: GpuTextureFormat

The intermediate render-target format for connection-space buffers (D3).
