Data Cleanup

After long-term use, a finance app inevitably accumulates "orphan data" in its local database or attachment folders: tag links to deleted transactions, attachment files without DB rows, sub-categories whose parent was deleted, and so on. None of it breaks daily use, but it takes up space, slows down sync, and can eventually surface as broken-reference errors.

Starting with App 3.2.0 / Cloud 1.3.0, BeeCount ships a manual data cleanup tool: it scans for all orphan data, lists it by group, and lets the user confirm before deleting. Nothing runs automatically and nothing is silently removed; every cleanup is visible.

There's one tool on the App side and one in the Web admin console, each responsible for its own data layer.

App side: local data cleanup

How to open

"Me" → "Data Management" → "Data Cleanup"

Scope

Only cleans data on the current device: Drift / SQLite database rows + attachments / icon files in the app sandbox + sync_queue pending change records. Does not touch BeeCount Cloud server data.

Workflow

The page starts scanning on open (you can also tap the refresh icon to rescan manually).
Detected orphans are listed in three groups:
- Database orphans (Type A — 10 kinds)
- Disk file orphans (Type B — 3 kinds)
- Sync queue orphans (Type C — 1 kind)
Each record shows a title + subtitle (e.g. "Budget #5 · ¥3000 · ledger deleted"); records can be selected one by one or by group.
Footer shows "N selected, ~X MB reclaimable".
Tap "Clean Selected" → confirmation dialog → execute → auto rescan.

App-side detection list

Type A: database reference broken

ID	Type	Description
A1	Budget pointing to deleted ledger	`budgets.ledger_id` no longer in `ledgers`
A2	Attachment pointing to deleted tx	`transaction_attachments.transaction_id` missing
A3	Tag link pointing to deleted tx	`transaction_tags.transaction_id` missing
A4	Tag link pointing to deleted tag	`transaction_tags.tag_id` missing
A5	Tx account missing	`tx.account_id` or `to_account_id` not in `accounts`
A6	Tx category missing	`tx.category_id` not in `categories`
A7	Sub-category parent missing	`categories.parent_id` not found
A8	Budget category missing	`budgets.category_id` not found
A9	Shared sub-category parent missing	`shared_ledger_categories.parent_sync_id` missing
A10	Tag override missing tx	`transaction_tag_overrides` whose tx is gone
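
Each Type A check boils down to an anti-join: find rows whose reference column points at an id that no longer exists. The sketch below illustrates the idea for A3 using Python's standard `sqlite3` module. The `transactions` table name, the `id` columns, and both function names are assumptions made for illustration; this is not BeeCount's actual schema or implementation.

```python
import sqlite3


def find_orphan_tag_links(conn: sqlite3.Connection) -> list[tuple[int, int]]:
    """Return (link id, missing transaction_id) pairs for tag links whose transaction is gone."""
    return conn.execute(
        """
        SELECT tt.id, tt.transaction_id
        FROM transaction_tags AS tt
        LEFT JOIN transactions AS t ON t.id = tt.transaction_id
        WHERE t.id IS NULL          -- no matching transaction row
        ORDER BY tt.id
        """
    ).fetchall()


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one deliberately orphaned tag link."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE transactions (id INTEGER PRIMARY KEY);
        CREATE TABLE transaction_tags (
            id INTEGER PRIMARY KEY,
            transaction_id INTEGER NOT NULL,
            tag_id INTEGER NOT NULL
        );
        INSERT INTO transactions (id) VALUES (1), (2);
        -- Link 12 points at transaction 3, which does not exist.
        INSERT INTO transaction_tags (id, transaction_id, tag_id)
        VALUES (10, 1, 100), (11, 2, 100), (12, 3, 101);
        """
    )
    return conn
```

The same pattern would cover the other Type A rows by swapping the child table, the reference column, and the parent table.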

Type B: disk file orphans

ID	Type	Description
B1	Attachment file with no DB ref	File under `attachments/` not referenced by any `transaction_attachments` row
B2	Custom category icon with no DB ref	File under `category_icons/` not referenced by any `categories.custom_icon_path`
B3	Shared category icon cache with no ref	Shared ledger icon sha256 cache no longer referenced
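
Type B checks work in the opposite direction: walk a folder and report every file that no database row references. A minimal sketch follows, assuming the referenced paths have already been loaded from the database as paths relative to the scanned folder; the function names are hypothetical. It also shows how a "~X MB reclaimable" figure could be derived from the file sizes.

```python
from pathlib import Path


def find_orphan_files(root: Path, referenced: set[str]) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for files under root that nothing references."""
    orphans = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if rel not in referenced:
            orphans.append((rel, path.stat().st_size))
    return orphans


def reclaimable_mb(orphans: list[tuple[str, int]]) -> float:
    """Sum orphan sizes and convert bytes to megabytes."""
    return sum(size for _, size in orphans) / (1024 * 1024)
```

Comparing relative POSIX-style paths keeps the check independent of where the app sandbox is mounted on a given device.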

Type C: sync state orphans

ID	Type	Description
C1	`sync_queue` orphan entity	`local_changes` row whose target entity has been deleted

Common situations

Lots of A2 / A3: cascade cleanup did not keep up after transactions were bulk-deleted
B1 taking up hundreds of MB: attachment files were left behind when transactions were deleted (older versions didn't auto-clean them)
A9 + B3: you used a shared ledger and then left or were removed, but the local mirror data is still on the device

Notes

Back up before cleaning

We recommend exporting CSV or triggering a cloud sync first. Cleanup is irreversible, and the deletions are not uploaded to cloud backup.

Failure handling

In rare cases (e.g. attachment file held by another process) a delete can fail. The failure list will show the specific reason — just rescan and try again.

Web side: admin data cleanup

How to open

Top-right avatar dropdown → "Admin · Data Cleanup" (admin only)

Direct link: https://your-deploy-domain/app/admin/data-cleanup

Admin only

Regular users will not see this entry in the avatar menu. Self-host admins need to set their account's is_admin = true; see the BeeCount Cloud deployment doc.

Scope

Cleans server-side orphan data across all users: Postgres / SQLite database rows + files under /data/attachments + sync_changes anomalies.

Completely independent from the App tool — the Web tool cleans the server, the App tool cleans the device. They neither conflict nor substitute for each other.

Workflow

Same as the App side: scan → three groups → select → single / batch delete → confirmation → auto rescan.

Web-side detection list

Type A: database reference broken

ID	Type	Description
A1	Tx category missing	`tx_missing_category` — `read_tx_projection` row whose category is deleted
A2	Tx account missing	`tx_missing_account`
A3	Tx from-account missing (transfer)	`tx_missing_from_account`
A4	Tx to-account missing (transfer)	`tx_missing_to_account`
A5	Budget category missing	`budget_missing_category`
A6	sync_changes orphan entity	`sync_change_missing_entity`: `sync_changes` row whose target entity no longer exists; deleting it drops that LWW (last-write-wins) write

Type B: attachments / files

ID	Type	Description
B1	AttachmentFile with no ref	`AttachmentFile` row not referenced by any tx / category icon
B2	Attachment file missing	`AttachmentFile.storage_path` points to a file no longer on disk
B3	Disk file without DB row	File present under `/data/attachments` with no `AttachmentFile` row (auto-skips `profile-avatars/`)
B4	Tx attachment reference broken	`fileId` in tx `attachments_json` does not exist in `AttachmentFile`
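
B2 and B3 are two halves of the same comparison: the set of `AttachmentFile.storage_path` values versus the set of files actually on disk. The sketch below assumes `storage_path` is stored relative to the attachments root, which the source does not confirm; `diff_storage` and `SKIPPED_DIRS` are hypothetical names. It also skips `profile-avatars/`, as the tool is described as doing.

```python
from pathlib import Path

# Top-level folders that are never reported as orphans (assumed constant name).
SKIPPED_DIRS = {"profile-avatars"}


def diff_storage(root: Path, storage_paths: set[str]) -> tuple[list[str], list[str]]:
    """Compare database storage paths with files on disk.

    Returns (missing_on_disk, untracked_on_disk), corresponding to B2 and B3.
    """
    on_disk = set()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        if rel.parts[0] in SKIPPED_DIRS:
            continue  # avatars are managed elsewhere, never orphans
        on_disk.add(rel.as_posix())
    missing_on_disk = sorted(storage_paths - on_disk)    # B2: row exists, file gone
    untracked_on_disk = sorted(on_disk - storage_paths)  # B3: file exists, no row
    return missing_on_disk, untracked_on_disk
```

Because the skip happens before the set difference, files under `profile-avatars/` can never appear in the B3 result.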

Notes

Affects all users

Web-side cleanup acts on all users' data. Deleting a sync_changes row is not propagated back to clients — it's equivalent to dropping that LWW write. Deleting a storage file makes any client request for that attachment return 404.

Back up first

Admins should run a data backup (rsync the whole data/ directory, or sqlite3 .backup / pg_dump) before cleanup.
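
For SQLite deployments, the backup can also be scripted. The sketch below uses `sqlite3.Connection.backup` from Python's standard library, which relies on the same SQLite online backup API as the `sqlite3` shell's `.backup` command. The function name and the file paths are placeholders, not BeeCount Cloud defaults.

```python
import sqlite3


def backup_sqlite(src_path: str, dest_path: str) -> None:
    """Copy a SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # copies all pages, safe while the source is in use
    finally:
        dest.close()
        src.close()
```

A call such as `backup_sqlite("data/app.db", "backup/app.db")` (hypothetical paths) would run before opening the cleanup page. This covers only the database; the attachment files still need a separate copy, for example with rsync.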

Profile-avatars protection

The tool auto-skips profile-avatars/, so user avatars are not misidentified as orphan files. This fix shipped in 1.3.1; older builds may have wrongly cleaned avatars, so please upgrade if you're on an older version.

Design principles

Both tools follow the same principles:

Manual trigger: no auto-runs, no silent background cleanup. All deletes are explicit user actions.
Fully visible: every orphan has a title + subtitle, so the user knows exactly what they're deleting.
Group + select: single record or per-group select-all, for fine-grained control.
Double confirmation: tapping clean opens a confirmation dialog noting that the action is irreversible.
Failure tolerance: one failure in a batch doesn't stop the rest; the failure list shows each specific reason.
Size summary: file orphans show their size, which helps you decide whether cleaning is worth it.
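
The failure-tolerance principle can be illustrated with a short sketch for file deletes: each delete is attempted independently, and errors are collected rather than raised. This is an illustration under assumed names (`delete_batch` is hypothetical), not the tools' actual code.

```python
from pathlib import Path


def delete_batch(paths: list[Path]) -> tuple[list[Path], list[tuple[Path, str]]]:
    """Delete every file in paths, collecting failures instead of stopping at the first one."""
    deleted = []
    failures = []
    for path in paths:
        try:
            path.unlink()
            deleted.append(path)
        except OSError as exc:
            # Keep the specific reason so it can be shown in a failure list.
            failures.append((path, exc.strerror or str(exc)))
    return deleted, failures
```

Returning both lists lets the caller report the failures and then rescan, which matches the "rescan and try again" advice for failed deletes.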

When to use

Run it periodically (e.g. quarterly), or in these situations:

App disk usage spiking noticeably without the actual dataset growing
After bulk-deleting transactions / ledgers / categories
After leaving / being kicked from a shared ledger and wanting to clear local mirrors
Self-host admin notices the storage directory growing unexpectedly
A sync after upgrade reports strange reference errors

Privacy

The App tool runs fully offline; scan results are not uploaded anywhere.
The Web tool runs on your own BeeCount Cloud deployment; data stays on your server.
The BeeCount team does not collect any user data via these tools — both are pure local / local-deploy computation.
