blackbar

Local-first PDF redaction for lawyers and anyone handling confidential documents. Deterministic detection finds names, parties, amounts, dates and identifiers in seconds — no AI required — and every export is machine-verified to be extraction-safe. Nothing leaves your machine.

100% local · no telemetry · no cloud · source on Codeberg

Workflow

Five steps, all on your machine

01 Upload

Drop a PDF into the local web app. The file never touches a network.

02 Detect: seconds, no AI

Detection combines pattern rules, contract-structure extraction (parties, defined-term aliases, signature blocks), named-entity recognition (NER) and your watchlist. A 100-page contract takes about 15 seconds.
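As a rough illustration of the pattern-rule layer only, here is a minimal Python sketch. The rule set, the `Proposal` class and the `detect` function are hypothetical names chosen for this example, not Blackbar's actual rules or API; contract-structure extraction, NER and the watchlist are left out.

```python
import re
from dataclasses import dataclass

# Hypothetical rules for illustration; not Blackbar's real rule set.
PATTERNS = {
    "amount": re.compile(
        r"(?:USD|EUR|GBP|\$|€|£)\s?(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d{2})?"
    ),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


@dataclass(frozen=True)
class Proposal:
    category: str
    text: str
    start: int  # character offset where the match begins
    end: int    # character offset just past the match


def detect(text: str) -> list[Proposal]:
    """Run every rule over the text and return matches in reading order."""
    proposals = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            proposals.append(
                Proposal(category, match.group(), match.start(), match.end())
            )
    return sorted(proposals, key=lambda p: p.start)
```

Because the rules are plain regular expressions, the same input always yields the same proposals, which is the sense in which this layer is deterministic.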

03 Deep scan: optional

One button adds a local LLM (via Ollama) for messy or unusual documents. It is slower, never required, and also runs entirely on your machine.

04 Review

Every detection is a proposal until you accept it. Draw or resize boxes, reject false positives, or redact every occurrence of any term with one click.
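One way to picture the one-click "every occurrence" action is a whitespace-tolerant, case-insensitive search over the text of each page. This is a sketch under that assumption: `occurrences` is a hypothetical helper, and it returns character offsets rather than the page coordinates a real redaction box would need.

```python
import re


def occurrences(page_texts: list[str], term: str) -> list[tuple[int, int, int]]:
    """Return (page_number, start, end) for every match of term.

    Pages are numbered from 1. Runs of whitespace in the term match any
    whitespace in the page, so a name split across a line break is still found.
    """
    words = [re.escape(word) for word in term.split()]
    if not words:
        return []
    pattern = re.compile(r"\s+".join(words), re.IGNORECASE)
    return [
        (page_number, match.start(), match.end())
        for page_number, text in enumerate(page_texts, start=1)
        for match in pattern.finditer(text)
    ]
```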

05 Export: verified

Content is removed from the file, not just covered. Export is refused unless two independent extractors confirm that the accepted text is gone.

Result

What you get

The same contract page, three ways. These are real outputs generated by Blackbar from a synthetic test contract.

[Image: original contract page with names, addresses and amounts visible]
Before. The original page: parties, addresses, amounts, dates, matter IDs all exposed.
[Image: redacted contract page with black bars over sensitive text]
After. True visual redaction — the text underneath is removed from the file, and metadata is scrubbed. Copy-paste recovers nothing.
[Image: redacted page where each black bar carries a label like Person 1 or Organization 2]
After, labeled. Optional alias labels ("Person 1", "Organization 2") keep the document readable for translation or AI analysis. The mapping is recorded in a manifest only you hold.
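The alias idea can be sketched as a small mapping that hands out one stable label per distinct entity. The `AliasMap` class and its methods are hypothetical names for this example, and the normalization rule (collapse whitespace, ignore case) is an assumption rather than Blackbar's documented behavior.

```python
from collections import defaultdict


class AliasMap:
    """Assigns labels such as 'Person 1' and remembers what each one stands for."""

    def __init__(self) -> None:
        self._labels: dict[tuple[str, str], str] = {}
        self._originals: dict[str, str] = {}
        self._counters: defaultdict[str, int] = defaultdict(int)

    def label_for(self, category: str, text: str) -> str:
        # Assumed normalization: the same name with different spacing or
        # casing maps to the same label.
        key = (category, " ".join(text.split()).casefold())
        if key not in self._labels:
            self._counters[category] += 1
            label = f"{category} {self._counters[category]}"
            self._labels[key] = label
            self._originals[label] = text  # first-seen spelling
        return self._labels[key]

    def as_manifest(self) -> dict[str, str]:
        """Label-to-original mapping, suitable for a private manifest."""
        return dict(self._originals)
```

Numbering is per category, so the first person and the first organization both get the number 1, and a repeated name reuses its earlier label.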

The app

Review everything before it ships

A three-pane workspace: proposals grouped by category on the left (with the detection source of each), the document with live redaction boxes in the middle, and the server-rendered redacted preview on the right.

[Screenshot: Blackbar review workspace with proposal sidebar, PDF with black redaction bars, and redacted preview pane]
[Screenshot: Blackbar upload screen with drag-and-drop zone]

How to

Install and run

Requirements: macOS or Linux, Python 3.10+, Node 18+. Ollama is optional — without it, Deep scan is simply unavailable and everything else works.

1 — Get the code

git clone https://codeberg.org/russkysong/blackbar.git
cd blackbar

2 — Set up (once)

python3 -m venv .venv
.venv/bin/pip install -e '.[dev]'
.venv/bin/python -m spacy download en_core_web_md
cd frontend && npm install && npm run build && cd ..

3 — Run the review app

.venv/bin/redactor serve
# then open http://127.0.0.1:8000 — or double-click index.html in the project folder

Or: headless, one command

.venv/bin/redactor redact contract.pdf                  # fast, no AI
.venv/bin/redactor redact contract.pdf --deep           # + local LLM deep scan
.venv/bin/redactor redact contract.pdf --alias-labels   # "Person 1" labels + alias map

Each export produces the redacted PDF plus a JSON manifest: SHA-256 of input and output, every proposal and your decision on it, the alias map, and the verification verdict.
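To make the manifest contents concrete, here is a minimal sketch of how such a record could be assembled. The field names (`input_sha256`, `proposals`, `verification` and so on) and the `build_manifest` helper are assumptions made for this example; the actual manifest schema is defined by the repository, not by this snippet.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(
    input_pdf: bytes,
    output_pdf: bytes,
    decisions: list[dict],
    alias_map: dict[str, str],
    verified: bool,
) -> str:
    """Serialize a manifest as JSON. All field names here are hypothetical."""
    manifest = {
        "input_sha256": sha256_hex(input_pdf),
        "output_sha256": sha256_hex(output_pdf),
        "proposals": decisions,   # each proposal with the reviewer's decision
        "alias_map": alias_map,   # e.g. {"Person 1": "Jane Doe"}
        "verification": "pass" if verified else "fail",
    }
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Recording both hashes lets anyone holding the manifest confirm later that a given redacted file was produced from a given input.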

Why trust it

Safety is checked, not promised

Verified exports

After redaction the file is re-opened and searched by two independent PDF text extractors. If anything you accepted can still be extracted, the export is refused and the output file is destroyed.
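The check described above can be sketched as follows. The extractors are passed in as plain callables so that the example does not assume any particular PDF library; `verify_export`, `find_leaks` and `normalize` are hypothetical names, and the matching rule (case-insensitive, whitespace-collapsed substring search) is an assumption for illustration.

```python
from typing import Callable, Iterable


def normalize(text: str) -> str:
    # Collapse whitespace and case so a line break or different casing
    # cannot hide a leaked term.
    return " ".join(text.split()).casefold()


def find_leaks(extracted: str, accepted_terms: Iterable[str]) -> list[str]:
    """Return the accepted terms that still appear in the extracted text."""
    haystack = normalize(extracted)
    return [
        term for term in accepted_terms
        if normalize(term) and normalize(term) in haystack
    ]


def verify_export(
    pdf_path: str,
    accepted_terms: list[str],
    extractors: dict[str, Callable[[str], str]],
) -> dict[str, list[str]]:
    """Run every extractor on the exported file.

    Returns leaked terms keyed by extractor name. An empty dict means no
    extractor recovered any accepted term, so the export may proceed.
    """
    leaks = {}
    for name, extract in extractors.items():
        found = find_leaks(extract(pdf_path), accepted_terms)
        if found:
            leaks[name] = found
    return leaks
```

Using two extractors with different parsing code guards against the case where one of them simply fails to see text that another tool could still recover.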

Removal, not covering

Black bars are burned in after the underlying text is deleted from the content stream. Document metadata and annotations are scrubbed.

Scanned pages can't slip through

Image-only pages (where text detection is blind) are flagged and block export until you explicitly acknowledge them.
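A minimal sketch of this gate, assuming each page has already been summarized as its extractable text plus a count of embedded images. `PageInfo`, `image_only_pages` and `can_export` are hypothetical names, and the heuristic (images present, no text) is an assumption about how such pages might be recognized.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    number: int        # 1-based page number
    text: str          # text layer as extracted from the page
    image_count: int   # number of embedded images on the page


def image_only_pages(pages: list[PageInfo]) -> list[int]:
    """Pages that contain images but no extractable text."""
    return [
        page.number
        for page in pages
        if page.image_count > 0 and not page.text.strip()
    ]


def can_export(pages: list[PageInfo], acknowledged: set[int]) -> bool:
    """Export is allowed only once every image-only page is acknowledged."""
    return set(image_only_pages(pages)) <= acknowledged
```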

Honest numbers

Measured on a labeled test corpus: recall 1.000, precision 0.954, and a 100-page contract processed in 14.9 seconds. The methodology and eval harness ship in the repository.
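For readers unfamiliar with the two metrics, here is a short sketch of how precision and recall are conventionally computed from a set of predicted spans and a set of labeled ones. This is the standard definition, not a copy of the project's eval harness; `precision_recall` and the span tuples are illustrative.

```python
def precision_recall(predicted: set, gold: set) -> tuple[float, float]:
    """Precision = share of predictions that are correct.
    Recall = share of labeled items that were found.
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return precision, recall


# Spans are (page, start, end) tuples in this toy example.
gold = {(1, 0, 8), (1, 40, 52), (2, 5, 15)}
predicted = {(1, 0, 8), (1, 40, 52), (2, 5, 15), (2, 30, 34)}
precision, recall = precision_recall(predicted, gold)
```

In the toy example every labeled span is found, so recall is 1.0, while one of the four predictions is a false positive, so precision is 0.75. For a redaction tool a missed span is the costly error, which is why recall is the number to watch first.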