Data Retention — Policies, Automation & Archival
How to build a data retention strategy — what to keep, when to delete, automated lifecycle policies, and compliance-aware approaches.
Keep What Matters, Delete What Doesn't
Data retention isn't just about storage space — it's about intentionality. Keeping everything forever is expensive, messy, and potentially a liability. Deleting too aggressively loses things you'll wish you had.
A good retention policy answers three questions for every category of data:
- How long must it be kept? (legal, regulatory, or practical minimum)
- How long should it be kept? (useful lifetime)
- When and how should it be deleted? (automated, verified, documented)
Retention by Data Type
| Data Type | Minimum Retention | Recommended | Notes |
|---|---|---|---|
| Personal photos/videos | Forever | Forever | Irreplaceable |
| Financial records | 6 years (HMRC) | 7 years | Tax compliance |
| Business correspondence | 3 years | 5 years | Contract disputes |
| Medical records | Lifetime | Lifetime | Personal health history |
| Code repositories | Active lifetime | Active + 2 years | Archive completed projects |
| Server logs | 90 days | 1 year | Security forensics |
| Application logs | 30 days | 90 days | Debugging |
| Backup snapshots | Per rotation policy | 7d/4w/12m | See Backup Strategies |
| Container images | Current + previous | 3 versions | Rollback capability |
| Browser bookmarks | Forever | Forever | Tiny storage, high value |
| | Varies | 3–7 years | Then archive selectively |
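The table can be encoded as a small lookup so that cleanup scripts draw on one definition of each retention period. The following is only a sketch: the function name `retention_days` and the category labels are hypothetical, and the day counts are the Recommended column converted at 365 days per year.

```sh
#!/bin/sh
# Hypothetical helper: map a data category to its recommended retention in days.
# Category labels are illustrative; periods mirror the "Recommended" column above.
retention_days() {
    case "$1" in
        app-logs)       echo 90 ;;      # 90 days
        server-logs)    echo 365 ;;     # 1 year
        correspondence) echo 1825 ;;    # 5 years
        financial)      echo 2555 ;;    # 7 years
        photos|medical|bookmarks) echo forever ;;
        *) echo "unknown category: $1" >&2; return 1 ;;
    esac
}
```

A cleanup job could then call `retention_days app-logs` and pass the result to `find -mtime`, skipping any category that returns `forever`.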
Automated Lifecycle Policies
Restic Backup Rotation
# Keep: 7 daily, 4 weekly, 12 monthly, 5 yearly
restic forget \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 12 \
  --keep-yearly 5 \
  --prune
Log Rotation with Logrotate
# /etc/logrotate.d/docker-logs
/var/lib/docker/containers/*/*.log {
    rotate 7
    daily
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
Docker System Cleanup
# Remove stopped containers, unused networks, dangling images, build cache
docker system prune -f
# Also remove unused volumes (careful!)
docker system prune --volumes -f
# Automate via cron (weekly)
0 2 * * 0 /usr/bin/docker system prune -f >> /var/log/docker-prune.log 2>&1
Storage Tiering
Not all data needs the same performance or redundancy:
| Tier | Storage Type | What Goes Here | Retention |
|---|---|---|---|
| Hot | NVMe SSD (RAID 1) | Active databases, current projects, containers | Always available |
| Warm | HDD (RAID 5/6/SHR) | Media library, backups, archives <2 years | Months to years |
| Cold | External HDD, off-site | Old archives, completed projects | Years to decades |
| Glacier | M-DISC, tape, cloud archive | Irreplaceable data, legal holds | Decades |
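Moving data down a tier can be scripted by file age. The sketch below is one possible approach, not a prescribed method: the function name `demote_old_files`, the directories, and the age threshold are all assumptions supplied by the caller, and it expects the destination to be an absolute path.

```sh
#!/bin/sh
# Sketch: move files not modified for more than $3 days from a warmer
# directory ($1) to a colder one ($2), preserving relative paths.
# $2 must be an absolute path because the function changes directory.
# File names containing newlines are not handled.
demote_old_files() {
    src=$1; dst=$2; days=$3
    (
        # Work inside the source so find prints paths relative to it
        cd "$src" || exit 1
        find . -type f -mtime +"$days" -print | while IFS= read -r rel; do
            mkdir -p "$dst/$(dirname "$rel")"
            mv "$rel" "$dst/$rel"
        done
    )
}
```

For example, `demote_old_files /data/archive /mnt/cold/archive 730` would move anything untouched for roughly two years, where `/mnt/cold` is a hypothetical mount point for the cold tier.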
GDPR Considerations for Self-Hosters
If you host services used by others (even family), GDPR may apply:
- Right to erasure — You must be able to delete a user's data on request
- Data minimisation — Don't collect data you don't need
- Storage limitation — Don't keep personal data longer than necessary
- Documentation — Know what data you hold and why
For a purely personal or family setup, GDPR's household exemption generally applies, so these points are simply good practice. For anything public-facing, they are legal obligations.
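An erasure request is easier to honour when each user's files live in one place. As a rough sketch only, assuming a hypothetical layout with one directory per user under a base directory, a helper might delete that directory and record that the request was fulfilled. The name `erase_user` and its arguments are illustrative, and data held in databases or backups would need separate handling.

```sh
#!/bin/sh
# Sketch: delete one user's directory and log the erasure.
# Assumes each user's data lives in its own directory under $1 (hypothetical layout).
erase_user() {
    base=$1; user=$2; log=$3
    # Reject empty names, path separators, and dot entries so nothing outside $base is removed
    case "$user" in
        ''|*/*|.|..) echo "invalid user name" >&2; return 1 ;;
    esac
    [ -d "$base/$user" ] || { echo "no data for $user" >&2; return 1; }
    rm -rf -- "$base/$user" || return 1
    # Record the action itself, not the erased content
    printf '%s erased user=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$user" >> "$log"
}
```

The log line doubles as the documentation the list above asks for. Keep it limited to what is needed to show the request was handled.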
File Organisation for Retention
A clear folder structure makes retention policies enforceable:
/data/
  /active/        # Current work, hot tier
  /archive/
    /2024/        # Completed year
    /2025/        # Completed year
  /media/
    /photos/      # Organised by year/month
    /music/       # Permanent
    /video/       # Permanent
  /backup/
    /daily/       # 7-day rotation
    /weekly/      # 4-week rotation
    /monthly/     # 12-month rotation
  /temp/          # Auto-delete after 30 days
Auto-Delete Temp Files
# Cron: delete files in /data/temp older than 30 days
0 3 * * * find /data/temp -type f -mtime +30 -delete
Deletion Verification
When you delete data, check that it is actually gone. Deletion comes in three strengths:
- Logical deletion — File removed from filesystem (recoverable with forensic tools)
- Secure deletion — Overwritten with random data (shred, srm)
- Crypto-erasure — Encrypted volume key destroyed (instant, complete erasure of all content)
For most purposes, logical deletion on encrypted storage is sufficient. If the volume is LUKS-encrypted, destroying the LUKS header (which holds the key slots) makes all data on the volume unrecoverable, provided no backup of the header exists.
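For the secure deletion case, the overwrite and the check can be combined in one step. This is a minimal sketch assuming GNU `shred` is installed; the wrapper name `secure_delete` is hypothetical. Overwriting in place is only meaningful where the storage rewrites the same blocks, so SSDs and copy-on-write filesystems may still hold old copies.

```sh
#!/bin/sh
# Sketch: overwrite a file with shred, remove it, then confirm the path is gone.
secure_delete() {
    target=$1
    [ -f "$target" ] || { echo "not a regular file: $target" >&2; return 1; }
    # -u removes the file after overwriting its contents
    shred -u -- "$target" || return 1
    # Verification step: succeed only if the path no longer exists
    [ ! -e "$target" ]
}
```

The function returns non-zero if the overwrite fails or the file is somehow still present, so a calling script can stop and report instead of assuming success.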
Building Your Retention Policy
- Inventory — List all data types you store
- Classify — Assign each type a retention period with justification
- Automate — Set up rotation, cleanup, and archival scripts
- Document — Write it down (even a simple text file)
- Review — Revisit annually — needs change, storage costs change, regulations change
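The Inventory step can start from a quick count of what is stored and how much of it is old. The sketch below is one way to do that, with a hypothetical function name `inventory`; the root directory and age threshold are whatever the caller chooses.

```sh
#!/bin/sh
# Sketch: for each top-level directory under $1, print its name, the total
# number of files, and the number of files older than $2 days (tab-separated).
# File names containing newlines would be miscounted.
inventory() {
    root=$1; days=$2
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        total=$(find "$dir" -type f | wc -l | tr -d ' ')
        stale=$(find "$dir" -type f -mtime +"$days" | wc -l | tr -d ' ')
        printf '%s\t%d\t%d\n' "$(basename "$dir")" "$total" "$stale"
    done
}
```

Running `inventory /data 365` against the folder layout described earlier would give one line per top-level folder, which is a reasonable starting point for the Classify step.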
The goal is intentionality. A deliberate policy, even a simple one, is infinitely better than "keep everything and hope for the best."