Tutorials/Solving the N+1 problem

Solving the N+1 problem

The most common performance bug in JPA applications, and how Storm removes the conditions that create it. Side by side, with the SQL each approach actually runs.

Series · JPA to Storm7 min readKotlin

01The task

Render a list of 50 users with the name of the city each user lives in. Two tables, one foreign key, no aggregation. This is the simplest possible relational read, and it is where most JPA applications quietly issue 51 queries instead of one.

02The JPA way

A typical JPA mapping marks the relation LAZY so that loading a user does not always drag in its city. Note the shape JPA needs from the Kotlin class: open (via the compiler plugin), mutable, with nullable defaults, and no data class:

User.kt Kotlin · JPA

// Open, mutable, nullable: the shape JPA needs (via the kotlin-jpa plugin).
@Entity
class User(
    @Id @GeneratedValue
    var id: Int? = null,

    var email: String = "",
    var name: String = "",

    @ManyToOne(fetch = FetchType.LAZY)
    var city: City? = null,
)

The repository call and the rendering loop both look harmless:

UserService.kt Kotlin · JPA Show SQL

val users = userRepository.findAll()   // 1 query
for (user in users) {
    render(user.city!!.name)         // + 1 query per uninitialized proxy
}

↳ generated sql

-- findAll(): one query for the users ...
SELECT u.id, u.email, u.name, u.city_id FROM user u

-- ... then one more per city while rendering
SELECT c.id, c.name FROM city c WHERE c.id = ?
SELECT c.id, c.name FROM city c WHERE c.id = ?
SELECT c.id, c.name FROM city c WHERE c.id = ?
-- 50 users in 50 cities: 51 round trips

The list query fetches users only. Each user.city holds a proxy, and the first name access on each proxy triggers its own SELECT. Nothing in the code, the types, or the compiler warns you; you find out from the query log, or from production latency. Switching to EAGER does not fix the list case either: JPQL list queries still fetch each city with a follow-up select, it just happens before your code runs. And if a proxy escapes the session before being touched, you get the famous LazyInitializationException instead of a result.

03The standard fixes

JPA has two well-known remedies, and they work:

UserRepository.kt Kotlin · JPA

interface UserRepository : JpaRepository<User, Int> {

    // Fix 1: JOIN FETCH, applied one JPQL query at a time
    @Query("SELECT u FROM User u JOIN FETCH u.city")
    fun findAllWithCity(): List<User>

    // Fix 2: an entity graph, applied one call site at a time
    @EntityGraph(attributePaths = ["city"])
    override fun findAll(): List<User>
}

Both share the same limitation: they are opt-in, per query or per call site. The entity declares one loading behavior, individual queries override it, and the compiler verifies none of it. Every new query is a fresh opportunity to forget the fetch clause, which is why the N+1 problem keeps returning to codebases that have already fixed it several times.

The JPA snippets above follow the shape you will find in Baeldung's guide to the N+1 problem, which is the reference most teams reach for. JOIN FETCH and entity graphs are the right fixes within JPA. This page is about what changes when the problem cannot occur by accident in the first place.

04The Storm way

Storm moves the loading decision out of individual queries and into the model. A foreign key field declared with its plain entity type is always loaded, in the same query, via a join. There is no lazy variant of it, so there is nothing to forget:

Entities.kt Kotlin · Storm

// Loading policy is part of the model, not of each query.
data class City(
    @PK val id: Int = 0,
    val name: String,
) : Entity<Int>

data class User(
    @PK val id: Int = 0,
    val email: String,
    val name: String,
    @FK val city: City,   // plain field: joined in the same query, always
) : Entity<Int>

The same list-and-render code now runs one query, every time, for every caller:

UserService.kt Kotlin · Storm Show SQL

val users = userRepository.findAll()   // 1 query, cities included
for (user in users) {
    render(user.city.name)             // plain field access, no database call
}

↳ generated sql

-- findAll() loads the city graph in the same round trip
SELECT u.id, u.email, u.name, c.id, c.name
FROM "user" u
INNER JOIN city c ON u.city_id = c.id
-- 50 users in 50 cities: 1 round trip

The result is a list of plain, immutable data classes. user.city is a field holding a City, not a proxy holding a session reference. You can serialize it, cache it, or hand it to another thread; there is no session to outlive, so LazyInitializationException does not exist in Storm.

The join does not bloat memory either. The join result carries the city columns on every row, but relations in the returned graph that represent the same id share a single instance: 50 users spread over 5 cities hydrate into exactly 5 City objects, each constructed once and shared across the list. Immutability is what makes that sharing safe.

05When you do not want the join

Sometimes the join is genuinely wasted work, and lazy loading is the right call. Storm makes that a type, not a runtime behavior. Declare the field as Ref<City> and Storm reads only the foreign key column; loading the city becomes an explicit method call that is visible in code review:

Entities.kt Kotlin · Storm Show SQL

data class User(
    @PK val id: Int = 0,
    val email: String,
    val name: String,
    @FK val city: Ref<City>,   // Ref field: reads city_id, joins nothing
) : Entity<Int>

val user = userRepository.getById(1)   // no join issued
val city = user.city.fetch()           // loading is a visible, deliberate call

↳ generated sql

-- getById(1): no join, the ref holds the foreign key
SELECT u.id, u.email, u.name, u.city_id FROM "user" u WHERE u.id = ?

-- fetch(): runs only where your code says so
SELECT c.id, c.name FROM city c WHERE c.id = ?

This is the whole model: City means loaded, Ref<City> means not loaded. The decision is written once, in the entity, and every query in the codebase behaves the same way. If a loop calls fetch() per iteration, the N+1 is right there in the code where a reviewer can see it, not hidden inside a getter.

The instance sharing from the previous section applies to refs as well. Within a transaction, refs that represent the same id resolve to one shared instance through the entity cache, so the data behind them is loaded once per distinct id, not once per row: fetching the cities of 50 users spread over 5 cities issues at most 5 queries. And fetching the same ref twice never queries twice.

06Prove it in a test

Because query shapes are deterministic, "no N+1" is not a code-review hope, it is an assertion. Storm's test module captures the SQL a block generates, so a regression that adds a query fails the build:

UserQueryTest.kt Kotlin · storm-test

@StormTest(scripts = ["/schema.sql", "/data.sql"])
class UserQueryTest {

    @Test
    fun `users and cities load in a single query`(orm: ORMTemplate, capture: SqlCapture) {
        val users = capture.execute { orm.entity<User>().findAll() }

        users.forEach { render(it.city.name) }   // walk the graph freely
        capture.count(Operation.SELECT) shouldBe 1
    }
}

@StormTest spins up an in-memory H2 database, runs your schema scripts, and injects the ORMTemplate and SqlCapture parameters. See Testing for the full setup, including running the same test against PostgreSQL with Testcontainers.

07Side by side

	JPA with Hibernate	Storm
Where loading is decided	Per query (JOIN FETCH, entity graphs) or at access time through proxies	In the entity model: a `City` field joins, a `Ref<City>` field does not
N+1 risk	Present by default; every lazy association is a potential 1 + N	Absent by default; extra queries happen only where `fetch()` is written
Failure mode	`LazyInitializationException` when a proxy outlives its session	None; results are plain records with no session to outlive
Verifying behavior	Inspect `show_sql` logging by hand	Assert query counts in a unit test with `SqlCapture`

08Keep going

The reference documentation covers the mechanics in depth:

Refs Relationships Queries Hydration Testing

Get started → Star on GitHub