Dirty Checking
What Is Dirty Checking?
Dirty checking is the process of determining which fields of an entity have changed since it was loaded from the database. When you update an entity, the ORM needs to decide:
- Whether to execute an UPDATE statement at all
- Which columns to include in the UPDATE statement
Storm's entities are stateless and immutable by design: plain Kotlin data classes or Java records with no proxies, no bytecode manipulation, and no hidden state. This design simplifies the dirty checking logic and allows for high performance.
Instead of tracking changes implicitly, Storm:
- Observes entity state when you read from the database
- Compares entity state when you call update() within the same transaction
- Generates the appropriate UPDATE statement based on the configured mode
Observed state is stored in the transaction context, not on the entity itself. This keeps entities simple and predictable while still providing intelligent update behavior.
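The observe-then-compare idea can be sketched in plain Kotlin. This is a conceptual model only: `TransactionContext`, `observe`, `isDirty`, and `clear` are hypothetical names chosen for illustration and are not Storm's API. The sketch assumes that an entity with no observed state is treated as dirty.

```kotlin
// Conceptual model only: these types are hypothetical and are not Storm's API.
data class User(val id: Int, val email: String, val name: String)

// Holds the state observed at read time, keyed by primary key.
// The entity itself carries no tracking state.
class TransactionContext {
    private val observed = mutableMapOf<Int, User>()

    // Called when an entity is read from the database.
    fun observe(user: User) {
        observed[user.id] = user
    }

    // Returns true when an UPDATE is needed. An entity that was never
    // observed is treated as dirty, because there is nothing to compare against.
    fun isDirty(user: User): Boolean {
        val before = observed[user.id] ?: return true
        return before != user
    }

    // Called when the transaction ends: observed state does not outlive it.
    fun clear() = observed.clear()
}

fun main() {
    val tx = TransactionContext()
    val user = User(1, "a@b.com", "Alice")
    tx.observe(user)

    println(tx.isDirty(user))                     // false
    println(tx.isDirty(user.copy(name = "Bob")))  // true

    tx.clear()
    println(tx.isDirty(user))                     // true
}
```

Because the map lives in the context object rather than on `User`, the entity stays a plain immutable data class.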
┌─────────────────────────────────────────────────────────────────┐
│                        Transaction Scope                        │
│                                                                 │
│   ┌─────────┐         ┌──────────────┐         ┌─────────┐      │
│   │  READ   │────────▶│   Observed   │────────▶│ UPDATE  │      │
│   │ Entity  │         │    State     │         │ Called  │      │
│   └─────────┘         │   (cached)   │         └────┬────┘      │
│                       └──────────────┘              │           │
│                              │                      │           │
│                              ▼                      ▼           │
│                        ┌──────────────────────────────┐         │
│                        │    Compare current entity    │         │
│                        │     with observed state      │         │
│                        └──────────────┬───────────────┘         │
│                                       │                         │
│                     ┌─────────────────┼─────────────────┐       │
│                     ▼                 ▼                 ▼       │
│                ┌──────────┐      ┌──────────┐      ┌──────────┐ │
│                │ No change│      │   Some   │      │   Some   │ │
│                │ detected │      │ changed  │      │ changed  │ │
│                └────┬─────┘      │ (ENTITY) │      │ (FIELD)  │ │
│                     │            └────┬─────┘      └────┬─────┘ │
│                     ▼                 ▼                 ▼       │
│                Skip UPDATE     Full-row UPDATE   Partial UPDATE │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Key insight: Dirty checking in Storm is scoped to a single transaction. Once the transaction commits, all observed state is discarded. This keeps memory usage predictable and avoids the complexity of managing detached entities.
Entity cache misses can affect dirty checking behavior. When an entity is not found in the cache, Storm falls back to a full-row update. See Entity Cache for cache retention configuration.
Update Modes
Storm supports three update modes, each representing a different trade-off between SQL efficiency, batching potential, and write amplification:
| Mode | Dirty Check | UPDATE Behavior | SQL Variability |
|---|---|---|---|
| OFF | None | Always update all columns | Single shape |
| ENTITY | Entity-level | Skip if unchanged; full row if any changed | Single shape |
| FIELD | Field-level | Update only changed columns | Multiple shapes |
The selected update mode controls:
- Whether an UPDATE is executed (can be skipped if nothing changed)
- What gets updated (all columns vs. only changed columns)
- How predictable the generated SQL is (affects batching and caching)
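The three modes can be summarized as a single decision function. The following is a minimal sketch, not Storm's internal logic: `columnsToUpdate` and the map-based representation of entity state are assumptions made for illustration. It also assumes that a missing observed state leads to a full-row update, as described for entity cache misses.

```kotlin
// Hypothetical names for illustration; this is not Storm's internal API.
enum class UpdateMode { OFF, ENTITY, FIELD }

// Returns the columns to write, or an empty list when the UPDATE can be skipped.
// `observed` is null when no state was recorded for the entity (for example, a cache miss).
fun columnsToUpdate(
    mode: UpdateMode,
    observed: Map<String, Any?>?,
    current: Map<String, Any?>,
): List<String> {
    val allColumns = current.keys.toList()
    // OFF never compares; without observed state there is nothing to compare.
    if (mode == UpdateMode.OFF || observed == null) return allColumns
    val changed = allColumns.filter { observed[it] != current[it] }
    return when {
        changed.isEmpty() -> emptyList()            // ENTITY and FIELD skip the UPDATE
        mode == UpdateMode.ENTITY -> allColumns     // full-row UPDATE, stable SQL shape
        else -> changed                             // FIELD: only the changed columns
    }
}

fun main() {
    val observed: Map<String, Any?> =
        mapOf("email" to "a@b.com", "name" to "Alice", "city_id" to 7)
    val current = observed + ("name" to "Bob")
    for (mode in UpdateMode.values()) {
        println("$mode -> ${columnsToUpdate(mode, observed, current)}")
    }
}
```

With only `name` changed, OFF and ENTITY both select every column, while FIELD selects `name` alone. When nothing has changed, OFF still selects every column and the other two modes return an empty list.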
Choosing the Right Mode
               ┌─────────────────────────────────────┐
               │ What kind of workload do you have?  │
               └─────────────────┬───────────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Batch/ETL    │     │  Typical CRUD   │     │   Wide tables   │
│   processing    │     │   application   │     │   or hot rows   │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ Use: OFF        │     │ Use: ENTITY     │     │ Use: FIELD      │
│                 │     │ (default)       │     │                 │
│ Maximum batch   │     │ Good balance of │     │ Minimal write   │
│ efficiency      │     │ efficiency and  │     │ amplification   │
│                 │     │ simplicity      │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
UpdateMode.OFF
In OFF mode, Storm bypasses dirty checking entirely. Every call to update() generates a full UPDATE statement that writes all columns, regardless of whether values have actually changed.
val user = orm.find(User_.id eq 1)
val updatedUser = user.copy(name = "New Name")
// Always generates: UPDATE user SET email=?, name=?, city_id=? WHERE id=?
// All columns are included, even though only 'name' changed
orm update updatedUser
No comparison is performed; every update is unconditional.
When to Use OFF Mode
Batch processing and ETL: When importing or transforming large datasets, you often want predictable, unconditional writes. OFF mode gives you maximum batching efficiency because every UPDATE has the same shape.
// Processing 10,000 records - all UPDATEs have identical structure
// JDBC can batch them efficiently
userRepository.update(users.map { processUser(it) })
Simple applications: If your entities are small and updates are infrequent, the overhead of dirty checking may not be worth the complexity. OFF mode keeps things straightforward.
Characteristics
- Single, stable SQL shape (enables efficient JDBC batching)
- No comparison overhead (nothing is compared before the UPDATE)
- Maximum predictability
Trade-offs
- Updates may write unchanged values to the database
- Cannot skip unnecessary UPDATEs
- May cause more database trigger activity if triggers fire on any UPDATE
UpdateMode.ENTITY (Default)
ENTITY mode is Storm's default and provides a balanced approach. Storm compares the entity passed to update() against the state observed when that entity was read. Based on this comparison:
- If it is the same instance: No UPDATE is executed
- If no field changed: No UPDATE is executed (individual fields are compared only when needed)
- If any field changed: A full-row UPDATE is executed (all columns are written)
val user = orm.get(User_.id eq 1) // Storm observes: {id=1, email="a@b.com", name="Alice"}
// Scenario 1: No changes
orm update user // No SQL executed - entity unchanged
// Scenario 2: Any field changed
val updated = user.copy(name = "Bob")
orm update updated // UPDATE user SET email=?, name=?, city_id=? WHERE id=?
// Full row update, even though only 'name' changed
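The difference between the "same instance" and "no field changed" cases maps onto Kotlin's two equality operators. The sketch below uses a hypothetical `User` data class and a hypothetical `needsUpdate` helper. It illustrates the comparison order only and is not Storm's implementation.

```kotlin
// Hypothetical entity for illustration.
data class User(val id: Int, val email: String, val name: String)

// Cheap identity check first (===), structural comparison (==) only if it fails.
fun needsUpdate(observed: User, current: User): Boolean =
    !(current === observed || current == observed)

fun main() {
    val observed = User(1, "a@b.com", "Alice")

    val sameInstance = observed
    val equalCopy = observed.copy()            // new instance, same field values
    val changed = observed.copy(name = "Bob")  // new instance, different name

    println(needsUpdate(observed, sameInstance))  // false: same instance
    println(equalCopy === observed)               // false: a different instance...
    println(needsUpdate(observed, equalCopy))     // false: ...but no field differs
    println(needsUpdate(observed, changed))       // true: a full-row UPDATE would follow
}
```

Since `copy()` always creates a new instance, an unchanged copy is caught by the structural comparison, not by the identity check.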
Why Full-Row Updates?
You might wonder: if Storm knows only name changed, why update all columns? The answer is batching efficiency.
When multiple entities of the same type are updated in a transaction, JDBC can batch them together only if they have the same SQL shape. With ENTITY mode, all UPDATEs for a given entity type look identical, enabling efficient batching:
// All updates have identical SQL shape - JDBC batches them
val users = userRepository.findAll(User_.city eq city)
userRepository.update(users.map { it.copy(lastLogin = now()) })
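To see why SQL shape matters for batching, recall that a JDBC batch collects parameter sets for one prepared statement, so every distinct SQL text needs its own statement and its own batch. The sketch below counts batches for full-row and column-specific updates. `updateSql` and `batchesBySql` are hypothetical helpers written for this illustration and do not reflect how Storm generates SQL.

```kotlin
// Hypothetical helper: builds the UPDATE text for a given set of columns.
fun updateSql(table: String, columns: List<String>): String =
    "UPDATE $table SET " + columns.joinToString(", ") { "$it=?" } + " WHERE id=?"

// Groups pending updates by SQL text and counts the rows in each group.
// Each distinct text needs its own prepared statement, so each key is one batch.
fun batchesBySql(table: String, updates: List<List<String>>): Map<String, Int> =
    updates.groupingBy { updateSql(table, it) }.eachCount()

fun main() {
    val allColumns = listOf("email", "name", "city_id")

    // Full-row updates: three entities share one SQL shape, so one batch of three.
    val fullRow = List(3) { allColumns }
    println(batchesBySql("user", fullRow))

    // Column-specific updates: three entities, three shapes, three batches of one.
    val partial = listOf(listOf("name"), listOf("email"), listOf("name", "email"))
    println(batchesBySql("user", partial).size)
}
```

The full-row case produces a single map entry with a count of three, while the column-specific case produces three entries with a count of one each.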
When to Use ENTITY Mode
Most CRUD applications: ENTITY mode provides the right balance for typical web applications. It avoids unnecessary database round-trips when nothing changed, while maintaining predictable SQL patterns.
Read-modify-write patterns: When you load an entity and pass it back to update without modifications, ENTITY mode skips the UPDATE entirely.
val user = orm.get(User_.id eq userId)
// No changes made - UPDATE is skipped
orm update user
// Conditional modification - UPDATE only if actually changed
val updated = if (shouldUpdate) user.copy(name = "New Name") else user
orm update updated
Characteristics
- UPDATE suppression when nothing changed
- Stable SQL shape per entity (enables batching)
- Low memory overhead (stores one copy of observed state per entity)
- Minimal CPU overhead (single comparison per update)
Trade-offs
- Writes unchanged columns when any field is dirty
- Requires storing observed state in memory during transaction
UpdateMode.FIELD
FIELD mode provides the most granular control. Storm compares each field individually and generates UPDATE statements that include only the columns that actually changed. Like ENTITY mode, if no fields changed, Storm skips the UPDATE entirely.
val user = orm.get(User_.id eq 1) // {id=1, email="a@b.com", name="Alice", bio="...", settings="..."}
// Only name changed
val updated = user.copy(name = "Bob")
orm update updated
// UPDATE user SET name=? WHERE id=?
// Multiple fields changed
val updated2 = user.copy(name = "Bob", email = "bob@example.com")
orm update updated2
// UPDATE user SET name=?, email=? WHERE id=?
Why Use Field-Level Updates?
Reduced write amplification: When you have wide tables (many columns) but typically only change a few fields, FIELD mode avoids writing unchanged data. This can significantly reduce I/O, especially for tables with large TEXT or BLOB columns.
// Article has 20 columns including large 'content' field
// But we're only updating the view count
val article = orm.find(Article_.id eq articleId)
orm update article.copy(viewCount = article.viewCount + 1)
// UPDATE article SET view_count=? WHERE id=?
// The large 'content' column is NOT written
Reduced database overhead: Updating fewer columns reduces redo/undo log volume, replication payload size, and avoids rewriting large column values unnecessarily.
Reduced trigger activity: If your database has column-specific triggers, FIELD mode ensures they only fire when their columns actually change.
Understanding SQL Shape Variability
The trade-off with FIELD mode is that it generates different SQL statements depending on which fields changed:
┌──────────────────────────────────────────────────────────────────┐
│ FIELD Mode SQL Shapes │
├──────────────────────────────────────────────────────────────────┤
│ │
│ Change: name only │
│ SQL: UPDATE user SET name=? WHERE id=? │
│ │
│ Change: email only │
│ SQL: UPDATE user SET email=? WHERE id=? │
│ │
│ Change: name + email │