Implementing Soft Deletes in SQLAlchemy: Keeping Audit Trails Without Orphaning Foreign Keys in Multi-Tenant Systems
I’ve shipped soft deletes three times now. The first two times, I broke production.
The first attempt was naive: add an is_deleted boolean to every model, filter it in queries, call it a day. That lasted until a foreign key constraint violation happened at 2 AM because I’d soft-deleted a parent record while child records still pointed to it. The second attempt used database views to hide deleted records—which worked until I realized the views were breaking SQLAlchemy’s relationship loading and my audit queries were silently returning incomplete data.
CitizenApp needed soft deletes for regulatory compliance (finance customers demand audit trails), but also needed referential integrity and point-in-time recovery without corrupting the schema. Here’s the production pattern I built after those failures.
Why Soft Deletes Matter in Multi-Tenant SaaS
Hard deletes are a luxury you lose the moment you touch financial or healthcare data. A user deletes a record by mistake, your compliance officer asks “what was it?”, and if you have no audit trail, you’re either admitting negligence or restoring from backups like it’s 1997.
But soft deletes done wrong are worse than hard deletes:
- Orphaned foreign keys: Child records point to non-existent logical parents if you don’t handle cascades carefully
- Schema pollution: Every query needs
WHERE deleted_at IS NULLor you get zombies in your results - Referential integrity lies: The database thinks your relationship is valid; your application doesn’t
I prefer a polymorphic mixin approach combined with PostgreSQL views. It’s transparent to the application, preserves foreign key constraints, and enables temporal queries without schema hacks.
The Core Pattern: SoftDeleteMixin
Here’s the mixin that powers every soft-delete model in CitizenApp:
from datetime import datetime
from sqlalchemy import Column, DateTime, String, text
from sqlalchemy.orm import declarative_mixin, declared_attr
from sqlalchemy.ext.hybrid import hybrid_property
@declarative_mixin
class SoftDeleteMixin:
"""
Provides soft-delete behavior with audit trail and point-in-time recovery.
Why a mixin? Because inheritance hierarchies in SQLAlchemy become unwieldy.
Mixins let you opt-in per model and keep the schema clean.
"""
@declared_attr
def deleted_at(cls) -> Column:
return Column(DateTime(timezone=True), nullable=True, default=None)
@declared_attr
def deleted_by_user_id(cls) -> Column:
return Column(String(36), nullable=True, default=None)
@hybrid_property
def is_deleted(self) -> bool:
"""
Hybrid property works in both Python and SQL context.
This matters for filtering in relationships.
"""
return self.deleted_at is not None
@is_deleted.expression
def is_deleted(cls):
return cls.deleted_at.isnot(None)
def soft_delete(self, user_id: str, session) -> None:
"""
Mark as deleted instead of removing. Preserves all relationships.
Why pass user_id explicitly? Multi-tenant systems need to know
*who* deleted it, not just *when*. Compliance reports demand this.
"""
self.deleted_at = datetime.utcnow()
self.deleted_by_user_id = user_id
session.add(self)
def restore(self, session) -> None:
"""
Point-in-time recovery. This saved CitizenApp from a data loss incident
where a user script mass-deleted documents by mistake.
"""
self.deleted_at = None
self.deleted_by_user_id = None
session.add(self)
Now apply it to models:
from sqlalchemy import Column, String, Integer, ForeignKey
from sqlalchemy.orm import relationship
class Workspace(Base):
__tablename__ = "workspaces"
id = Column(String(36), primary_key=True)
name = Column(String(255), nullable=False)
tenant_id = Column(String(36), ForeignKey("tenants.id"), nullable=False)
# Mix in soft-delete behavior
deleted_at = Column(DateTime(timezone=True), nullable=True)
deleted_by_user_id = Column(String(36), nullable=True)
class Document(Base):
__tablename__ = "documents"
id = Column(String(36), primary_key=True)
workspace_id = Column(
String(36),
ForeignKey("workspaces.id"), # ← This still enforces FK integrity
nullable=False
)
content = Column(String)
tenant_id = Column(String(36), ForeignKey("tenants.id"), nullable=False)
deleted_at = Column(DateTime(timezone=True), nullable=True)
deleted_by_user_id = Column(String(36), nullable=True)
# Relationship with automatic filtering
workspace = relationship(
"Workspace",
lazy="joined",
# Only load workspaces that aren't deleted
primaryjoin="and_(Document.workspace_id==Workspace.id, Workspace.deleted_at.is_(None))",
foreign_keys=[workspace_id],
)
The Problem: Invisible Deletes and Orphaned Relationships
The code above works until you query documents and expect the workspace relationship to be there. If the workspace is soft-deleted, SQLAlchemy won’t load it (because of the primaryjoin condition), but the foreign key constraint still allows the record to exist.
This breaks assumptions:
# This works fine
doc = session.query(Document).filter(Document.id == "doc-1").first()
print(doc.workspace) # Returns Workspace instance
# But if workspace is soft-deleted...
workspace = session.query(Workspace).filter(Workspace.id == "ws-1").first()
workspace.soft_delete("user-123", session)
session.commit()
# Now this silently returns None
doc = session.query(Document).filter(Document.id == "doc-1").first()
print(doc.workspace) # None! But the FK still points to a logically-deleted row.
I prefer using database views to make soft deletes transparent and correct at the constraint level.
Using PostgreSQL Views for Transparent Soft Deletes
Create a view that only exposes non-deleted records. The foreign key constraint now operates against the view, not the table:
-- Create the base view (non-deleted records only)
CREATE VIEW workspaces_active AS
SELECT * FROM workspaces
WHERE deleted_at IS NULL;
-- Constraint on the FK now uses the view
ALTER TABLE documents
DROP CONSTRAINT documents_workspace_id_fkey,
ADD CONSTRAINT documents_workspace_id_fkey
FOREIGN KEY (workspace_id) REFERENCES workspaces_active(id);
But SQLAlchemy doesn’t have native support for mapping to views with constraints. Instead, use selectable and explicit relationships:
from sqlalchemy import select
# Define the active view in Python
active_workspaces = select(Workspace).where(Workspace.deleted_at.is_(None)).subquery()
class Document(Base):
__tablename__ = "documents"
id = Column(String(36), primary_key=True)
workspace_id = Column(String(36), ForeignKey("workspaces.id"), nullable=False)
content = Column(String)
tenant_id = Column(String(36), ForeignKey("tenants.id"), nullable=False)
deleted_at = Column(DateTime(timezone=True), nullable=True)
deleted_by_user_id = Column(String(36), nullable=True)
# Relationship automatically filters deleted records
workspace = relationship(
"Workspace",
lazy="joined",
foreign_keys=[workspace_id],
viewonly=False, # Allows cascade deletes through the view
)
Querying Including Deleted Records (Audit Mode)
You’ll need to query the full history sometimes:
from sqlalchemy import or_
class DocumentRepository:
def get_with_history(self, doc_id: str, session) -> list[Document]:
"""
Return document and all its versions (including soft-deleted).
Used for audit trails and compliance reporting.
"""
return (
session.query(Document)
.filter(Document.id == doc_id)
.order_by(Document.deleted_at.desc())
.all()
)
def get_active(self, doc_id: str, session) -> Document:
"""
Normal query path—only active records.
This is the DEFAULT and it should be hard to accidentally get deleted data.
"""
return (
session.query(Document)
.filter(
Document.id == doc_id,
Document.deleted_at.is_(None)
)
.first()
)
def get_as_of_time(self, doc_id: str, timestamp: datetime, session) -> Document:
"""
Point-in-time recovery. Fetch state of document at a specific timestamp.
"""
return (
session.query(Document)
.filter(
Document.id == doc_id,
or_(
Document.deleted_at.is_(None),
Document.deleted_at > timestamp
)
)
.first()
)
Gotcha: Cascade Deletes and Event Listeners
Here’s what burned me: If you define a cascade delete on a relationship, soft-delete breaks it silently.
# DON'T do this with soft deletes
workspace = relationship(
"Workspace",
cascade="all, delete", # ← This won't trigger soft_delete()
)
Instead, use event listeners to handle cascades at the application level:
from sqlalchemy import event
from sqlalchemy.orm import Session
@event.listens_for(Workspace