baweaver

Rails: The Sharp Parts. A Polymorphic Type Is Not a Foreign Key


Last time we made a T::Struct the only thing allowed to cross a pack boundary: typed, inert, and reviewable in five seconds.

This article is about polymorphic associations and the sharp edges I’ve had to contend with, especially around delegators when applying the strangler fig refactoring pattern. I still have an open Rails issue for the delegator bug that I need to find a way to land.

To not bury the lede, my personal answer to polymorphic associations at scale is simple: don’t.

Note: GitLab bans them in their developer docs, and Bill Karwin’s SQL Antipatterns named them an antipattern back in 2010. This isn’t a novel position.

The rest of this article covers why I hold that opinion, which sharp edges exist, and what you’re trading when you use one.

Aside: The patterns in this series come from my time working in large Rails monoliths (1M+ lines of code, 10+ years of history, hundreds to thousands of engineers). The failures below happen at any size, but in a small app they’re fixable in an afternoon. At scale, with thirty teams writing to the same polymorphic table, you lose the ability to even find all the places that need fixing.

Two Columns and No Constraint

Let’s go back to our theater app, which has events, orders, and seats. Say that we wanted the ability to leave notes on all three. The textbook answer is one polymorphic Note that can belong to any of them.

We’d add one using t.references :notable, polymorphic: true. Note that it doesn’t give you one column; it gives you two:

def schema_columns
  Note
    .columns
    # Keep only the two columns the polymorphic reference added.
    .select { |c| c.name.include?("notable") }
    .map { |c| "#{c.name}: #{c.sql_type}" }
end
# => ["notable_type: varchar(255)", "notable_id: bigint"]

It also builds you a composite index, and the order of the columns in that index is going to matter:

def schema_index
  ActiveRecord::Base.connection.indexes(:notes).map { |index|
    "#{index.name} on #{index.columns.inspect}"
  }
end
# => ["index_notes_on_notable on [\"notable_type\", \"notable_id\"]"]

Now ask the database what foreign keys protect that table:

def schema_foreign_keys
  ActiveRecord::Base.connection.foreign_keys(:notes)
end
# => []

Nothing. notable_id references nothing the database can enforce, because it can’t. A foreign key points at one table, and notable_id points at events or orders or seats depending on a string sitting in the type column right next to it.

A Find Is Two Clauses

Now suppose we have three notes, one for each of the owning tables (events, orders, and seats). Each table has its own ID sequence, so all three parent rows end up with an id of 1, and all three notes end up with a notable_id of 1:

def notes_share_id
  Note
    .order(:id)
    .map { |n| {id: n.id, notable_type: n.notable_type, notable_id: n.notable_id} }
end
# => All three notes have notable_id=1, distinguished only by notable_type

The only way to tell these notes apart is the type string, which makes it load-bearing. When you pass ActiveRecord a record, it fills in both clauses automatically:

def well_formed_find(event)
  Note.where(notable: event).to_sql
end
# => SELECT ... WHERE notable_type = 'Event' AND notable_id = 1

This works when the type and id are both present. But what happens if we forget the type? The query stops meaning “the notes for this event” and starts meaning “every note whose parent has an id of 1,” which here is three rows from three different parents.
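
To make the ambiguity concrete, here is a minimal plain-Ruby sketch with no Rails involved. The NOTES rows and both finder methods are hypothetical stand-ins for the table and the two queries described above, not anything ActiveRecord provides:

```ruby
# Hypothetical stand-in for the notes table: three notes, three parents,
# and every parent has id 1 because each parent table has its own sequence.
NOTES = [
  {id: 1, notable_type: "Event", notable_id: 1, body: "doors at 7"},
  {id: 2, notable_type: "Order", notable_id: 1, body: "paid by card"},
  {id: 3, notable_type: "Seat",  notable_id: 1, body: "aisle seat"}
].freeze

# Both clauses: the equivalent of notable_type = ? AND notable_id = ?
def notes_for(type, id)
  NOTES.select { |n| n[:notable_type] == type && n[:notable_id] == id }
end

# Type forgotten: the equivalent of notable_id = ? on its own.
def notes_for_id_only(id)
  NOTES.select { |n| n[:notable_id] == id }
end

notes_for("Event", 1).map { |n| n[:notable_type] }
# => ["Event"]
notes_for_id_only(1).map { |n| n[:notable_type] }
# => ["Event", "Order", "Seat"]
```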

The Type Comes From Asking an Object Its Class

How does ActiveRecord know the type? In the case of an event, where does "Event" come from? It’s not from the column. ActiveRecord derives it when the query is built, by looking at the value and asking what class it is. You can find the relevant code in PolymorphicArrayValue#klass:

def klass(value)
  if value.is_a?(Base)
    value.class
  elsif value.is_a?(Relation)
    value.model
  end
end

If klass(value) returns nil, the type is nil, and the predicate builder emits a query missing the notable_type clause entirely. This happens for a value that isn’t an ActiveRecord::Base or an ActiveRecord::Relation. That’s not an issue, until it is.
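
The consequence of that nil can be sketched in plain Ruby. This is a simplified stand-in rather than Rails’ own predicate builder: FakeBase, type_for, polymorphic_conditions, and LooksLikeAnEvent are all hypothetical names, and only the record branch of the check is modeled:

```ruby
# Stand-in for ActiveRecord::Base so the sketch runs without Rails.
class FakeBase
  attr_reader :id

  def initialize(id)
    @id = id
  end
end

class Event < FakeBase; end

# Mirrors the shape of the klass check quoted above (records only).
def type_for(value)
  value.class.name if value.is_a?(FakeBase)
end

# The type clause is only added when a type could be derived.
def polymorphic_conditions(value)
  conditions = {}
  type = type_for(value)
  conditions[:notable_type] = type if type
  conditions[:notable_id] = value.id
  conditions
end

# Responds to id like a record, but is not a FakeBase.
LooksLikeAnEvent = Struct.new(:id)

polymorphic_conditions(Event.new(1))
# => {notable_type: "Event", notable_id: 1}
polymorphic_conditions(LooksLikeAnEvent.new(1))
# => {notable_id: 1}
```

Nothing raises and nothing warns: the second call simply produces a weaker set of conditions.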

Failure One: The Delegator Drops the Type

When does this become an issue? Well let’s say you’re refactoring Event and using a delegator to override some methods while everything else passes through:

class EventProxy < SimpleDelegator
  def price_cents
    0 # pretend this calls the new pricing service
  end
end

It’s an easy way to override a few methods, but there’s a problem here: a SimpleDelegator doesn’t forward class, is_a?, or kind_of?. Those answers come from the wrapper, not the wrapped object:

def proxy_identity(proxy)
  {
    class: proxy.class,
    is_a_ar_base: proxy.is_a?(ActiveRecord::Base),
    kind_of_event: proxy.kind_of?(Event),
    responds_to_getobj: proxy.respond_to?(:__getobj__),
    wrapped_class: proxy.__getobj__.class
  }
end
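
You can see the same identity split without Rails at all. The sketch below uses only Ruby’s standard delegate library, with a plain Struct standing in for the Event model (that stand-in is an assumption for illustration):

```ruby
require "delegate"

# Plain Struct standing in for the Event model; no Rails required.
Event = Struct.new(:id, :name)

class EventProxy < SimpleDelegator
  def price_cents
    0
  end
end

event = Event.new(1, "Opening Night")
proxy = EventProxy.new(event)

proxy.name              # => "Opening Night" (forwarded to the event)
proxy.price_cents       # => 0 (overridden on the proxy)
proxy.class             # => EventProxy
proxy.is_a?(Event)      # => false
proxy.kind_of?(Event)   # => false
proxy.__getobj__.class  # => Event
proxy == event          # => true
```

That last line is part of what makes this hard to spot: the proxy compares equal to the record it wraps, so equality checks keep passing while class checks quietly fail.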

proxy.is_a?(ActiveRecord::Base) returns false, and that’s the same check PolymorphicArrayValue#klass performs. Feed the proxy to the same find that worked a section ago and you’ll get a surprising result:

def proxy_find(proxy)
  Note.where(notable: proxy).to_sql
end
# => SELECT ... WHERE notable_id = 1   (no type clause!)

The type clause is gone! And since all three notes share a notable_id of 1, the query for this one event’s notes hands you every note with that id, which in our example is every note in the table:

def proxy_leak(event, proxy)
  {
    event_notes_count: event.notes.count,
    proxy_query_count: Note.where(notable: proxy).count,
    leaked_notes: Note.where(notable: proxy).order(:id).map { |n|
      {id: n.id, type: n.notable_type, body: n.body}
    }
  }
end

That means you’re getting back notes from three different parents when you asked for one.

What makes this especially pesky is that writes through the proxy work fine:

def proxy_writes_fine(proxy)
  # Writing through the proxy works: the association knows the real class.
  proxy.notes.create!(body: "written through proxy")
  new_note = proxy.notes.last
  {
    count: proxy.notes.count,
    type_is_correct: new_note.notable_type == "Event",
    body: new_note.body
  }
end

So our proxy creates notes as expected, counts them through the association, and passes every test that builds data through the owning record. The only thing that breaks is a where(notable: proxy), which probably lives in a consumer’s pack, in a query you never wrote, in a reporting job you’ve never heard of.

The local stopgap is to unwrap before you query:

def fix_unwrap(proxy)
  sql = Note.where(notable: proxy.__getobj__).to_sql
  count = Note.where(notable: proxy.__getobj__).count
  {sql:, count:}
end

But stopgaps rely on people knowing they’re necessary, and that’s not a solution, that’s at best a wish and a prayer.
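
If you do lean on the stopgap, one option is to centralize the unwrapping in a single helper so call sites don’t each have to remember it. This is a minimal sketch using only the standard delegate library; unwrap_delegator is a hypothetical name, and Event here is a plain Struct stand-in:

```ruby
require "delegate"

# Hypothetical helper: peel off every delegator layer before a value is
# used as a query argument. Non-delegators are returned unchanged.
def unwrap_delegator(value)
  value = value.__getobj__ while value.is_a?(Delegator)
  value
end

Event = Struct.new(:id)
class EventProxy < SimpleDelegator; end

event  = Event.new(1)
nested = EventProxy.new(EventProxy.new(event))

unwrap_delegator(nested).class  # => Event
unwrap_delegator(event).class   # => Event
```

In a Rails codebase the call would look something like Note.where(notable: unwrap_delegator(proxy)). It still depends on every caller using the helper, so it narrows the problem rather than removing it.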

There’s also a performance cost here. The index was built on ["notable_type", "notable_id"], type first. A composite index is generally only usable from its leading column inward. The well-formed query has both columns and uses that index:

WHERE notable_type = 'Event' AND notable_id = 1
  => Using index: index_notes_on_notable (notable_type, notable_id)

The broken query has only notable_id, the second column, so the planner typically can’t use the index and falls back to a table scan:

WHERE notable_id = 1
  => Full table scan (leading index column notable_type not supplied)

So the delegator costs you correctness and the index in a single move.
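
The leading-column rule is easier to picture with a toy model. The sketch below treats the index as a sorted array of [notable_type, notable_id] pairs. It is an illustration only: INDEX, seek, and scan are hypothetical names, and real query planners are considerably more sophisticated than a binary search.

```ruby
# Hypothetical stand-in for the composite index: entries sorted by
# [notable_type, notable_id], the same column order as the real index.
INDEX = [
  ["Event", 1], ["Event", 2], ["Event", 3],
  ["Order", 1], ["Order", 2],
  ["Seat", 1], ["Seat", 4]
].sort.freeze

# Both columns known: binary search can seek straight to the entry.
def seek(type, id)
  INDEX.bsearch { |entry| [type, id] <=> entry }
end

# Only the second column known: the sort order is no help, so every
# entry has to be examined.
def scan(id)
  INDEX.select { |_type, entry_id| entry_id == id }
end

seek("Order", 1)  # => ["Order", 1]
scan(1)           # => [["Event", 1], ["Order", 1], ["Seat", 1]]
```

Matching ids are scattered through the sorted entries, one cluster per type, which is why knowing only the id gives the search nothing to narrow down.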

Failure Two: The Type Is Whatever the Request Says

What happens when the notable_type column comes from user input? A polymorphic association is often submitted as two params (a _type and an _id), and that string gets resolved back into a class via polymorphic_class_for which calls .constantize on whatever’s stored in the column. Set it to something that isn’t a class and Rails tries anyway:

def injected_type
  Note.new(notable_type: "Admin", notable_id: 1).notable
end
# => NameError: uninitialized constant Admin

A value that does resolve is worse. All our notes have notable_id of 1, so changing notable_type to "Order" makes Rails look up id 1 in the orders table instead:

def type_confusion(event)
  Note.new(notable_type: "Order", notable_id: event.id).notable
end
# => #<Order id: 1>

You could even change the type on an existing note to repoint it at a different model entirely:

def repoint(note)
  before = note.notable.class
  note.update!(notable_type: "Order")
  [before, note.reload.notable.class]
end
# => [Event, Order]

This requires notable_type to be permitted in strong params (or for the write to bypass them, which backfills and console sessions do). The constantize/NameError issue above fires regardless.

A regular belongs_to backed by a foreign key constraint cannot be manipulated like this, because the foreign key points at one table and the database checks it. A polymorphic association has no built-in guard. Rails has no belongs_to :notable, polymorphic: true, types: [...] (the belongs_to API accepts no types option).
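
If you’re stuck with a polymorphic column, you can at least build the guard yourself. Here is a minimal plain-Ruby sketch of an explicit allowlist that resolves a submitted type by lookup instead of constantize. NOTABLE_TYPES and resolve_notable_type are hypothetical names, and the Structs stand in for the real models:

```ruby
# Hypothetical stand-ins for the three parent models.
Event = Struct.new(:id)
Order = Struct.new(:id)
Seat  = Struct.new(:id)

# Explicit allowlist: the only strings that may ever become a class.
NOTABLE_TYPES = {"Event" => Event, "Order" => Order, "Seat" => Seat}.freeze

# Resolve a submitted type by hash lookup, never by constantize.
def resolve_notable_type(type_param)
  NOTABLE_TYPES.fetch(type_param.to_s) do
    raise ArgumentError, "unsupported notable_type: #{type_param.inspect}"
  end
end

resolve_notable_type("Event")  # => Event

begin
  resolve_notable_type("Admin")
rescue ArgumentError => e
  e.message  # => "unsupported notable_type: \"Admin\""
end
```

On the model side, a validation along the lines of validates :notable_type, inclusion: { in: NOTABLE_TYPES.keys } would cover writes that go through validations. Neither stops the type confusion between two allowed types shown above, so the caller still has to check that this user may attach a note to that specific parent.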

Failure Three: The Join You Can’t Write

What if you need to filter or sort notes by a column on the parent record? You can find by a polymorphic association, but you cannot join through one:

def join_through_notable
  Note.joins(:notable).to_sql
rescue ActiveRecord::EagerLoadPolymorphicError => e
  e.message
end
# => "Cannot eagerly load the polymorphic association :notable"

There’s no single table to join to. notable is events or orders or seats depending on the row, and SQL needs the table named up front. Your fallback is preload, which fires a separate query per distinct type:

def preload_fanout
  total = 0
  sub = ActiveSupport::Notifications.subscribe("sql.active_record") do |*, payload|
    total += 1 unless payload[:name] == "SCHEMA"
  end
  Note.preload(:notable).to_a.each(&:notable)
  ActiveSupport::Notifications.unsubscribe(sub)
  total
end
# => 4 (one for notes, then one each for events, orders, seats)
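
The fan-out follows directly from the data shape. As a rough plain-Ruby sketch (not Rails’ preloader; sample_notes and preload_plan are hypothetical names), grouping the parent ids by type shows why each distinct type needs its own follow-up query:

```ruby
# Hypothetical note rows; only the two polymorphic columns matter here.
def sample_notes
  [
    {notable_type: "Event", notable_id: 1},
    {notable_type: "Event", notable_id: 2},
    {notable_type: "Order", notable_id: 1},
    {notable_type: "Seat",  notable_id: 1}
  ]
end

# Groups parent ids by type. Each key lives in a different table, so
# each key needs its own follow-up query.
def preload_plan(notes)
  notes
    .group_by { |note| note[:notable_type] }
    .transform_values { |rows| rows.map { |note| note[:notable_id] }.uniq }
end

plan = preload_plan(sample_notes)
# => {"Event" => [1, 2], "Order" => [1], "Seat" => [1]}
plan.size + 1
# => 4 (one query for notes, then one per distinct type)
```

The query count grows with the number of parent types, not the number of notes, so every new team that adopts the shared table adds a query to every preload.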

Failure Four: The Strings That Went Stale

What happens when you rename or namespace a model? Each notable_type string is written once, at insert time, and changing a class’s name will not update the rows already stored:

def stale_after_namespace(event)
  # Simulate: the class was "Event", now it's namespaced as "Calendar::Event"
  old_name = Event.polymorphic_name
  # After the move, the new class has a different polymorphic_name
  new_name = "Calendar::Event"
  # Old notes still say "Event", new queries ask for "Calendar::Event"
  old_count = Note.where(notable_type: old_name, notable_id: event.id).count
  new_count = Note.where(notable_type: new_name, notable_id: event.id).count
  {old_name:, new_name:, old_count:, new_count:}
end
# => old notes found under old name, zero under new name

Every note written before the change will still point to "Event" while the new ones point at "Calendar::Event", and you won’t know until someone tries to reference an old row.

Regular belongs_to associations aren’t free from this either: you still need to update the table name (or set self.table_name) and potentially rename the foreign key column. But the stored data doesn’t need a migration, because it’s an integer pointing at a row, not a string pointing at a class name.
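
With a polymorphic column, the rename has to ship with a data backfill. The sketch below simulates one in plain Ruby. legacy_rows, count_for, and backfill_type are hypothetical names; in a Rails app the same step would more likely be a data migration along the lines of Note.where(notable_type: "Event").update_all(notable_type: "Calendar::Event"), timed carefully against the deploy that renames the class.

```ruby
# Hypothetical rows written before the rename.
def legacy_rows
  [
    {id: 1, notable_type: "Event", notable_id: 1},
    {id: 2, notable_type: "Event", notable_id: 2},
    {id: 3, notable_type: "Order", notable_id: 1}
  ]
end

def count_for(rows, type)
  rows.count { |row| row[:notable_type] == type }
end

# Rewrites every stored type string from the old name to the new one
# and returns the updated rows. Other types are left untouched.
def backfill_type(rows, from:, to:)
  rows.map { |row| row[:notable_type] == from ? row.merge(notable_type: to) : row }
end

before = legacy_rows
after  = backfill_type(before, from: "Event", to: "Calendar::Event")

count_for(before, "Calendar::Event")  # => 0
count_for(after, "Calendar::Event")   # => 2
count_for(after, "Event")             # => 0
count_for(after, "Order")             # => 1
```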

Failure Five: The Orphans Nothing Cleans Up

What happens when you delete the parent record? dependent: :destroy handles this in the normal Rails path, so if you’re going through the model the children get cleaned up.

The gap is when you bypass Rails: a raw DELETE, a bulk migration, or a concurrent request that deletes the parent between another request’s read and write. With a foreign key, the database either rejects the delete that would create an orphan or cascades it (ON DELETE CASCADE), regardless of how the row was removed. Without one, you’re relying on every code path going through ActiveRecord callbacks:

def orphaned_note
  doomed = Event.create!
  Note.create!(notable: doomed, body: "orphan")
  # delete skips callbacks (like a raw DELETE), so dependent: :destroy never runs.
  doomed.delete
  note = Note.where(notable_type: "Event", notable_id: doomed.id).first
  {present: note.present?, notable: note&.notable}
end
# => {present: true, notable: nil}

In practice this means periodic cleanup jobs to find notes whose notable_id points at nothing, because the database can’t tell you.
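
The core of such a cleanup job is an anti-join you have to write by hand, once per type. A minimal plain-Ruby sketch of the logic, assuming the parent ids have already been loaded per type (orphaned_notes and the sample data are hypothetical):

```ruby
require "set"

# Returns the notes whose (type, id) pair matches no known parent.
# A type with no entry in parent_ids_by_type counts as orphaned too.
def orphaned_notes(notes, parent_ids_by_type)
  notes.reject do |note|
    parent_ids_by_type.fetch(note[:notable_type], Set.new).include?(note[:notable_id])
  end
end

notes = [
  {id: 1, notable_type: "Event", notable_id: 1},
  {id: 2, notable_type: "Event", notable_id: 7},  # parent was deleted
  {id: 3, notable_type: "Order", notable_id: 1},
  {id: 4, notable_type: "Venue", notable_id: 1}   # stale or unknown type
]
parents = {"Event" => Set[1, 2], "Order" => Set[1]}

orphaned_notes(notes, parents).map { |note| note[:id] }
# => [2, 4]
```

Note that the job has to know the full list of parent types up front, which is exactly the list the schema never wrote down.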

What To Reach For Instead

The problem with a shared polymorphic table is that it’s over-centralization: several teams writing their identity into one table, making it impossible to extract any of them independently.

The simplest fix is the one GitLab recommends: each parent gets its own child table with a foreign key. Events get event_notes, Orders get order_notes. Yes, it’s duplication. The payoff is that each table has a foreign key the database enforces, and when Events becomes its own service the event_notes table goes with it because nobody else was writing to it.

For a small, fixed set of parents on one shared table, use an exclusive belongs_to with a CHECK constraint. One nullable foreign key per parent, and a CHECK that exactly one is populated:

def exclusive_notes_schema
  <<~MIGRATION
    create_table :exclusive_notes do |t|
      t.references :event, foreign_key: true
      t.references :order, foreign_key: true
      t.references :seat,  foreign_key: true
      t.text :body
      t.check_constraint(
        "(event_id IS NOT NULL) + (order_id IS NOT NULL) + (seat_id IS NOT NULL) = 1",
        name: "notes_exactly_one_owner"
      )
    end
  MIGRATION
end

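With this schema in place, the database enforces the rules for us. (The CHECK above adds booleans together, which works on SQLite and MySQL; PostgreSQL won’t add booleans, so there you’d write num_nonnulls(event_id, order_id, seat_id) = 1 instead.) One parent is accepted, two or zero fail the CHECK, and a non-existent parent fails the foreign key: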

def exclusive_belongs_to_demo(event, order)
  results = []

  # One foreign key set, the others nil: passes the CHECK.
  results << try_create("one real owner") { ExclusiveNote.create!(event_id: event.id, body: "ok") }

  # Two foreign keys set: violates the exactly-one-owner CHECK.
  results << try_create("two owners") { ExclusiveNote.create!(event_id: event.id, order_id: order.id, body: "bad") }

  # No foreign keys set: also violates the CHECK.
  results << try_create("zero owners") { ExclusiveNote.create!(body: "bad") }

  # Foreign key pointing at a non-existent row: rejected by the FK constraint.
  results << try_create("ghost owner") { ExclusiveNote.create!(event_id: 9999, body: "bad") }

  results
end
# Output:
#   one real owner => accepted
#   two owners     => rejected by CHECK
#   zero owners    => rejected by CHECK
#   ghost owner    => rejected by FK

def try_create(label)
  yield
  "#{label} => accepted"
rescue ActiveRecord::StatementInvalid, ActiveRecord::CheckViolation => e
  reason = if e.message.downcase.include?("check")
    "rejected by CHECK"
  else
    "rejected by FK"
  end
  "#{label} => #{reason}"
end
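
You may also want the same rule at the application level so users get a friendly validation error before the database has to reject the row. Here is a plain-Ruby sketch of the check; OWNER_KEYS and exactly_one_owner? are hypothetical names, and the CHECK constraint remains the real authority:

```ruby
# The three mutually exclusive owner columns from the schema above.
OWNER_KEYS = %i[event_id order_id seat_id].freeze

# True only when exactly one owner column is populated.
def exactly_one_owner?(attrs)
  OWNER_KEYS.count { |key| !attrs[key].nil? } == 1
end

exactly_one_owner?(event_id: 1)               # => true
exactly_one_owner?(event_id: 1, order_id: 2)  # => false
exactly_one_owner?({})                        # => false
```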

A word on delegated_type, since it will come up. It improves the Ruby ergonomics over a bare polymorphic belongs_to, but underneath it’s the same belongs_to ..., polymorphic: true with a *_type string column and no foreign key. It does accept a types: list, which looks like the allowlist Failure Two said doesn’t exist, but that list only generates scopes and predicate helpers (message?, comment?). It’s never checked on assignment or write, so the column still accepts whatever string you put in it. Reach for delegated_type for ergonomics, not for safety.

If you need a unified feed across packs (all notes regardless of parent type), compose it at the read layer the way the queries article built the reservation view: each pack exposes a by-ids query, a coordinator assembles the feed.
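
As a rough sketch of that shape in plain Ruby: every name below (FeedItem, EventNotes, OrderNotes, notes_feed) is hypothetical, the rows are hard-coded where a real pack would query its own table, and created_at is an integer purely to keep the example small.

```ruby
FeedItem = Struct.new(:source, :parent_id, :body, :created_at, keyword_init: true)

# Hypothetical pack-level query. A real one would read event_notes
# behind the pack's public API.
module EventNotes
  def self.by_event_ids(ids)
    rows = [
      {event_id: 1, body: "doors at 7", created_at: 100},
      {event_id: 2, body: "sold out", created_at: 300}
    ]
    rows.select { |row| ids.include?(row[:event_id]) }.map do |row|
      FeedItem.new(source: :event, parent_id: row[:event_id],
                   body: row[:body], created_at: row[:created_at])
    end
  end
end

# Hypothetical pack-level query over order_notes.
module OrderNotes
  def self.by_order_ids(ids)
    rows = [{order_id: 1, body: "paid by card", created_at: 200}]
    rows.select { |row| ids.include?(row[:order_id]) }.map do |row|
      FeedItem.new(source: :order, parent_id: row[:order_id],
                   body: row[:body], created_at: row[:created_at])
    end
  end
end

# Coordinator: asks each pack for its items, then merges newest first.
def notes_feed(event_ids:, order_ids:)
  items = EventNotes.by_event_ids(event_ids) + OrderNotes.by_order_ids(order_ids)
  items.sort_by { |item| -item.created_at }
end

notes_feed(event_ids: [1, 2], order_ids: [1]).map(&:body)
# => ["sold out", "paid by card", "doors at 7"]
```

The unification lives in the coordinator’s return type instead of in a shared table, so each pack keeps a table nobody else writes to.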

Wrapping Up

Rails makes polymorphic associations easy, and easy wins early. The cost shows up in five distinct ways: a delegator drops the type clause and ships wrong rows, user input steers the type wherever it likes, joins don’t work, renames leave stored strings stale, and deleted parents leave orphans the database can’t see.

A foreign key doesn’t care what Ruby thinks an object’s class is, it doesn’t care what string someone put in a column, and it doesn’t go stale when you rename things. It’s a constraint the database checks on every write regardless of what code path got you there, and that’s what makes the difference between a relationship you can trust and one you have to constantly audit.
