adze.uk — The Digital Toolshed

Data Retention — Policies, Automation & Archival

How to build a data retention strategy — what to keep, when to delete, automated lifecycle policies, and compliance-aware approaches.

Keep What Matters, Delete What Doesn't

Data retention isn't just about storage space — it's about intentionality. Keeping everything forever is expensive, messy, and potentially a liability. Deleting too aggressively loses things you'll wish you had.

A good retention policy answers three questions for every category of data:

  1. How long must it be kept? (legal, regulatory, or practical minimum)
  2. How long should it be kept? (useful lifetime)
  3. When and how should it be deleted? (automated, verified, documented)

Retention by Data Type

Data Type               | Minimum Retention   | Recommended      | Notes
Personal photos/videos  | Forever             | Forever          | Irreplaceable
Financial records       | 6 years (HMRC)      | 7 years          | Tax compliance
Business correspondence | 3 years             | 5 years          | Contract disputes
Medical records         | Lifetime            | Lifetime         | Personal health history
Code repositories       | Active lifetime     | Active + 2 years | Archive completed projects
Server logs             | 90 days             | 1 year           | Security forensics
Application logs        | 30 days             | 90 days          | Debugging
Backup snapshots        | Per rotation policy | 7d/4w/12m        | See Backup Strategies
Container images        | Current + previous  | 3 versions       | Rollback capability
Browser bookmarks       | Forever             | Forever          | Tiny storage, high value
Email                   | Varies              | 3–7 years        | Then archive selectively
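
To make a table like this enforceable, the retention periods can live in a small lookup that scripts consult. The following is a minimal sketch, not a complete policy engine: the function names and the category labels (server-logs, application-logs) are made up for illustration, and only the two log rows from the table are encoded.

```shell
# Recommended retention in days for two categories from the table.
# Returns 1 (and prints nothing) for "keep forever" or unknown categories.
retention_days() {
    case "$1" in
        server-logs)      echo 365 ;;   # 1 year
        application-logs) echo 90 ;;    # 90 days
        *)                return 1 ;;
    esac
}

# List files under a directory that are past the category's retention.
# -mtime +N matches files last modified more than N full days ago.
expired_files() {
    category=$1
    dir=$2
    days=$(retention_days "$category") || return 1
    find "$dir" -type f -mtime +"$days" -print
}
```

Calling expired_files application-logs /var/log/myapp (a hypothetical path) only prints candidates, so the list can be reviewed before anything is deleted.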

Automated Lifecycle Policies

Restic Backup Rotation

# Keep: 7 daily, 4 weekly, 12 monthly, 5 yearly
restic forget \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 12 \
  --keep-yearly 5 \
  --prune
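
Because forget with --prune is destructive, it can help to preview the result first. The sketch below wraps the same policy in small functions; the function names and the RESTIC_BIN variable are illustrative and not part of restic, and it assumes the repository and password are already configured (for example through RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE).

```shell
# Path to the restic binary; overridable for testing.
RESTIC_BIN="${RESTIC_BIN:-restic}"

# Run "restic forget" with the retention policy above plus any extra flags.
forget_with_policy() {
    "$RESTIC_BIN" forget \
        --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --keep-yearly 5 \
        "$@"
}

# Show which snapshots would be removed, without changing the repository.
preview_forget() {
    forget_with_policy --dry-run
}

# Apply the policy, reclaim space, then check repository integrity.
apply_forget() {
    forget_with_policy --prune && "$RESTIC_BIN" check
}
```

Running preview_forget first and apply_forget only once the list looks right keeps an accidental policy change from silently removing snapshots.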

Log Rotation with Logrotate

# /etc/logrotate.d/docker-logs
/var/lib/docker/containers/*/*.log {
    rotate 7
    daily
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}

Docker System Cleanup

# Remove stopped containers, unused networks, dangling images, build cache
docker system prune -f

# Also remove unused volumes (careful: this deletes data; on Docker 23.0 and later only anonymous volumes are removed)
docker system prune --volumes -f

# Automate via cron (weekly)
0 2 * * 0 /usr/bin/docker system prune -f >> /var/log/docker-prune.log 2>&1
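
The system-wide prune only removes dangling images. If the aim is to clear out old tagged images as well, an age filter is one option. This is a sketch under assumptions: the function name, the DOCKER_BIN variable, and the 30-day default are illustrative choices, not recommendations from Docker.

```shell
# Path to the docker binary; overridable for testing.
DOCKER_BIN="${DOCKER_BIN:-docker}"

# Remove images that no container uses and that were created more than
# max_age ago (default 720h, i.e. 30 days).
prune_old_images() {
    max_age=${1:-720h}
    "$DOCKER_BIN" image prune --all --force --filter "until=$max_age"
}
```

Note that --all also removes older tagged images kept for rollback once they pass the cutoff and no container references them, so the cutoff should be longer than the rollback window you want.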

Storage Tiering

Not all data needs the same performance or redundancy:

Tier    | Storage Type                | What Goes Here                                  | Retention
Hot     | NVMe SSD (RAID 1)           | Active databases, current projects, containers  | Always available
Warm    | HDD (RAID 5/6/SHR)          | Media library, backups, archives <2 years       | Months to years
Cold    | External HDD, off-site      | Old archives, completed projects                | Years to decades
Glacier | M-DISC, tape, cloud archive | Irreplaceable data, legal holds                 | Decades
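
Moving data from the warm tier to the cold tier can be scripted by file age. The sketch below assumes both tiers are mounted as ordinary directories and uses roughly two years (730 days) as the cutoff, matching the "archives <2 years" row; the function name and arguments are illustrative.

```shell
# Move files not modified for more than max_age_days from a warm-tier
# directory to a cold-tier directory, preserving relative paths.
# Limitation: file names containing newlines are not handled.
demote_to_cold() {
    warm=$1
    cold=$2
    max_age_days=${3:-730}   # roughly 2 years
    ( cd "$warm" && find . -type f -mtime +"$max_age_days" -print ) |
    while IFS= read -r rel; do
        mkdir -p "$cold/$(dirname "$rel")" &&
        mv "$warm/$rel" "$cold/$rel"
    done
}
```

For example, demote_to_cold /data/archive /mnt/cold/archive would relocate old archive files, where /mnt/cold is a hypothetical mount point for the external drive.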

GDPR Considerations for Self-Hosters

If you host services used by others (even family), GDPR may apply:

  • Right to erasure — You must be able to delete a user's data on request
  • Data minimisation — Don't collect data you don't need
  • Storage limitation — Don't keep personal data longer than necessary
  • Documentation — Know what data you hold and why

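For a purely personal or family setup, GDPR's household exemption (Article 2(2)(c)) generally applies, so these principles are good practice rather than legal obligations. For anything public-facing or commercial, they are legally binding.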

File Organisation for Retention

A clear folder structure makes retention policies enforceable:

/data/
  /active/          # Current work, hot tier
  /archive/
    /2024/          # Completed year
    /2025/          # Completed year
  /media/
    /photos/        # Organised by year/month
    /music/         # Permanent
    /video/         # Permanent
  /backup/
    /daily/         # 7-day rotation
    /weekly/        # 4-week rotation
    /monthly/       # 12-month rotation
  /temp/            # Auto-delete after 30 days
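
A layout like this is easy to create in one go. The following sketch builds the skeleton under a configurable root; the function name is illustrative, and the per-year archive directories are left out on the assumption that they are added as each year completes.

```shell
# Create the retention-oriented directory skeleton under a root
# (default /data). Existing directories are left untouched.
create_data_layout() {
    root=${1:-/data}
    for d in active archive \
             media/photos media/music media/video \
             backup/daily backup/weekly backup/monthly \
             temp; do
        mkdir -p "$root/$d" || return 1
    done
}
```

Because mkdir -p is idempotent, the function can be rerun safely after adding a new directory to the list.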

Auto-Delete Temp Files

# Cron: delete files in /data/temp older than 30 days
0 3 * * * find /data/temp -type f -mtime +30 -delete
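
The one-line cron job deletes without showing what it removed and leaves empty directories behind. A slightly fuller sketch is shown here, with a preview step and a check afterwards; the function names are illustrative, and -delete, -empty, and -mindepth are GNU/BSD find extensions rather than POSIX.

```shell
# Print the files a cleanup would delete, without changing anything.
stale_files() {
    find "$1" -type f -mtime +"${2:-30}" -print
}

# Delete stale files, remove directories left empty, then verify that
# no stale files remain. Returns non-zero if any survived.
purge_stale() {
    dir=$1
    days=${2:-30}
    find "$dir" -type f -mtime +"$days" -delete
    # -mindepth 1 keeps the top-level directory itself in place.
    find "$dir" -mindepth 1 -type d -empty -delete
    remaining=$(stale_files "$dir" "$days")
    [ -z "$remaining" ]
}
```

Running stale_files /data/temp by hand before scheduling purge_stale /data/temp in cron gives a chance to catch a wrong path or cutoff.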

Deletion Verification

When you delete data, verify it's actually gone:

  • Logical deletion — File removed from filesystem (recoverable with forensic tools)
  • Secure deletion — Overwritten with random data (shred, srm)
  • Crypto-erasure — Encrypted volume key destroyed (instant, complete erasure of all content)

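For most purposes, logical deletion + encrypted storage is sufficient. If the volume is LUKS-encrypted, destroying the LUKS header, which holds the key slots, makes all data on the volume unrecoverable, provided no backup of the header exists. Overwriting tools such as shred are unreliable on SSDs and on copy-on-write or journaling filesystems, where old copies of the data can survive the overwrite.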

Building Your Retention Policy

  1. Inventory — List all data types you store
  2. Classify — Assign each type a retention period with justification
  3. Automate — Set up rotation, cleanup, and archival scripts
  4. Document — Write it down (even a simple text file)
  5. Review — Revisit annually, because needs, storage costs, and regulations all change
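
The inventory step can start from a quick per-directory summary. This is a rough sketch that assumes your data sits under a single root such as /data; the function name and output format are illustrative.

```shell
# Print one tab-separated line per top-level directory under a root:
# name, file count, and total size in KiB.
inventory() {
    root=${1:-/data}
    for d in "$root"/*/; do
        [ -d "$d" ] || continue
        count=$(find "$d" -type f | wc -l | tr -d ' ')
        size=$(du -sk "$d" | cut -f1)
        printf '%s\tfiles=%s\tkib=%s\n' "$(basename "$d")" "$count" "$size"
    done
}
```

Saving the output to a dated text file each year gives the annual review something concrete to compare against.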

The goal is intentionality. A deliberate policy, even a simple one, is infinitely better than "keep everything and hope for the best."
