Insert if missing, update if present: one of the most common write patterns, and one JPA never got a first-class answer for. Storm delegates it to the database, where it is atomic.
A webhook delivers user records, possibly more than once. An import job synchronizes rows from an external system. In both cases the same logical record may arrive repeatedly, and the write must be idempotent: create the row the first time, update it every time after, and survive two deliveries arriving at once.
JPA has no upsert operation. merge() sounds like one but is not: it copies detached state into the session and still decides insert-or-update by looking first. So most codebases end up with this:
// The common JPA workaround: check, then write. Two statements, one race. @Transactional fun importUser(email: String, name: String, city: City) { val existing = userRepository.findByEmail(email) if (existing == null) { userRepository.save(User(email = email, name = name, city = city)) } else { existing.name = name existing.city = city } }
-- two statements with a gap between them SELECT u.* FROM user u WHERE u.email = ? INSERT INTO user (email, name, city_id) VALUES (?, ?, ?) -- two concurrent imports both see "no row" in the gap: -- one insert succeeds, the other dies on the unique constraint
The check and the write are separate statements, so two concurrent requests can both find nothing and both insert. One wins, the other throws DataIntegrityViolationException, and now you are writing retry logic or taking pessimistic locks for a write that the database could have resolved on its own. The alternative is a hand-written native query per database dialect.
Storm treats upsert as a first-class repository operation and generates the native conflict syntax of whatever database you are on:
// One statement, resolved by the database, safe under concurrency val user = orm upsert User( email = "alice@example.com", name = "Alice", city = city, ) // user.id is populated whether the row was inserted or updated
-- PostgreSQL dialect INSERT INTO "user" (email, name, city_id) VALUES (?, ?, ?) ON CONFLICT (email) DO UPDATE SET name = excluded.name, city_id = excluded.city_id -- MySQL/MariaDB: INSERT ... ON DUPLICATE KEY UPDATE -- Oracle, SQL Server, H2: MERGE INTO ...
The database resolves the conflict atomically in one statement: no race window, no retry logic, no application-level locking. Conflict detection rides on the table's unique constraints, in this case the unique index on email. The returned entity carries the database-generated id either way.
storm-postgresql), and a primary key or unique constraint for the database to detect the conflict. On MySQL and MariaDB, any unique constraint can trigger the update branch, so be deliberate when a table has several. See Upserts for the failure modes.Synchronization jobs rarely process one record. Passing a list combines JDBC batching with the native upsert syntax, which is dramatically faster than a loop:
// Synchronize an external source: insert the new, update the known val users = imported.map { row -> User(email = row.email, name = row.name, city = resolveCity(row.city)) } orm upsert users // one batched statement, not a loop
Upserts participate in transactions like every other write, so multi-step imports stay atomic:
// Upserts compose with transactions like any other write transaction { val city = orm insert City(name = "Sunnyvale", population = 161_884) val user = orm upsert User(email = "alice@example.com", name = "Alice", city = city) }
| JPA with Hibernate | Storm | |
|---|---|---|
| Upsert operation | None; find-then-save or a native query per dialect | orm upsert entity, one statement |
| Concurrency | Race window between check and write | Atomic; the database resolves the conflict |
| Portability | Native SQL differs per database | Dialects generate ON CONFLICT, ON DUPLICATE KEY, or MERGE |
| Batches | Loop over single writes | One batched upsert statement |
The reference documentation covers the mechanics in depth: