baweaver

Rails: The Sharp Parts. Callbacks Are Not Invariants


Last time we pulled apart indexes and found that add_index writes the suggestion, not the plan, and confusing the two leads to production surprises. Buried in the article before that, on locks, was an aside:

Single-ingress writes are worth their weight in gold… They eliminate callbacks, allow for easier instrumentation, clearer optimization paths, and much easier debugging.

That’s a bold assertion, but for me a very warranted one: model callbacks with domain behavior should be removed, and replaced with explicit objects that have exactly one ingress.

Why do I have that opinion? Of all the sharp parts in Rails, callbacks have led to the most entanglement of otherwise unrelated code in surprising ways that frequently cause outages at scale. Magic has a cost, and while it feels good to write in the moment and seems clear, that cost will come due whether weeks, months, or even years later.

The Same Model, Two Directions

Consider two scenarios on the same model.

Example one starts with a backfill task:

Order.where(region: nil).find_each do |order|
  order.update!(region: "us-west-2")
end

The model had an after_commit that synced any changed order to the CRM, so forty thousand orders re-synced, the webhook queue backed up for half an hour, and whatever unfortunate partners we’re working with probably rate-limited us. Nobody wrote “sync to the CRM” in that backfill task; the model did, implicitly, whether or not we knew it.

Example two. A different engineer runs the same backfill with update_all:

Order.where(region: nil).update_all(region: "unknown")

It’s reasonably fast and because of the update_all it’s not going to trigger a sync storm like the last one, so we’ve avoided the callback cost here, right?

Well that same Order model was also maintaining an OpenSearch search index using a callback, and now thousands of orders are silently falling out of searches for the next few weeks until a customer asks why they can’t find their orders.

Both examples made the same mistake, albeit from opposite directions. The first didn’t consider what callbacks might run on an update, and the second didn’t consider what callbacks won’t run if you skip them. In code review that shared belief sounds like “put it in a callback so it always happens, no matter who writes the record.” That sentence is wrong twice: it doesn’t always happen, and you can’t see when it does.
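Both directions fit in a few lines of plain Ruby. This is a toy stand-in, not ActiveRecord: ToyOrder, its synced list, and its update_all are hypothetical names, and the only thing it models is that the per-record path runs the callback while the bulk path writes around it.

```ruby
# Toy stand-in for a model with one callback; not ActiveRecord.
class ToyOrder
  @rows = []
  @synced = []

  class << self
    attr_reader :rows, :synced

    # Bulk path: writes every row directly and runs no callbacks.
    def update_all(attrs)
      rows.each { |row| row.attributes.merge!(attrs) }
      rows.size
    end
  end

  attr_reader :attributes

  def initialize(attrs)
    @attributes = attrs
    self.class.rows << self
  end

  # Per-record path: writes, then runs the callback for this row.
  def update!(attrs)
    attributes.merge!(attrs)
    sync_to_crm
    true
  end

  private

  # Stands in for the after_commit that nobody at the call site can see.
  def sync_to_crm
    self.class.synced << attributes[:id]
  end
end

3.times { |id| ToyOrder.new(id: id, region: nil) }

# Example one: the loop syncs every row it touches.
ToyOrder.rows.each { |order| order.update!(region: "us-west-2") }
ToyOrder.synced.size # => 3

# Example two: the bulk write changes every row and syncs none of them.
ToyOrder.update_all(region: "unknown")
ToyOrder.synced.size # => 3
```

Neither call site mentions the sync, which is the point: whether it runs is decided by which method you picked, not by anything you wrote.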

What save Actually Runs

save reads like a verb, an atomic action, but it’s a program. When you call it, ActiveRecord runs the compiled callback chain for :save (and :create or :update, and :validation, and later :commit) with your record threaded through every entry (Rails: ActiveRecord::Callbacks). Your before_save blocks are entries in that chain, and so are entries you never wrote and may not even know about, because association macros register callbacks too.

Two models, zero user callbacks:

Note: Code samples in this article run against a shared test schema. The self.table_name lines map models to those tables; in a real app these would be separate migrations.

class Author < ActiveRecord::Base
  self.table_name = "events"
  has_many :posts, class_name: "Post", foreign_key: :event_id, dependent: :destroy
end

class Post < ActiveRecord::Base
  self.table_name = "seats"
  belongs_to :author, class_name: "Author", foreign_key: :event_id, touch: true, counter_cache: true
end

def census_demo
  {
    post_save_filters: Post._save_callbacks.map { _1.filter.to_s },
    post_count: census(Post),
    author_count: census(Author)
  }
end

# => {
#   post_save_filters: ["autosave_associated_records_for_author"],
#   post_count: 5,
#   author_count: 6
# }

That lone save entry is autosave, the feature where saving a record also saves any loaded associated records, and belongs_to registered it on your behalf. Two lines of associations and zero callbacks of your own still leave eleven registered, and a typical model adds five to ten more with domain behavior.

Aside: run that same census on ActiveRecord 7.2 and you get 6 and 8. The framework’s own contribution to your chain changes across upgrades, which means the program behind save shifts under you even when your model’s file doesn’t.

A callback is control flow attached to persistence that the call site can’t see, written by an author who can’t see the call sites: two blind spots pointed at each other, and everything below is a consequence of that.
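To make “save is a program” concrete, here is a minimal sketch of a callback chain in plain Ruby. ToyModel, save_chain, and this belongs_to are hypothetical and far simpler than what Rails compiles, but they show the shape: a macro registers a chain entry as a side effect of being declared, and save walks the chain before it writes.

```ruby
# Toy model of a callback chain; hypothetical names, not ActiveRecord internals.
class ToyModel
  # Each subclass gets its own chain.
  def self.save_chain
    @save_chain ||= []
  end

  def self.before_save(name, &block)
    save_chain << [name, block]
  end

  # A macro that registers a chain entry just by being declared.
  def self.belongs_to(name)
    before_save(:"autosave_#{name}") { |record| record.log << "autosaved #{name}" }
  end

  def log
    @log ||= []
  end

  # "save" is a loop over the chain, then the write.
  def save
    self.class.save_chain.each { |_name, block| block.call(self) }
    log << "row written"
    true
  end
end

class ToyPost < ToyModel
  belongs_to :author
end

ToyPost.save_chain.map(&:first) # => [:autosave_author]

post = ToyPost.new
post.save
post.log # => ["autosaved author", "row written"]
```

ToyPost never called before_save, and its chain still has an entry in it.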

Failure One: The Paths That Skip the Chain

So how does the assumption break? The assumption developers carry is “always happens.” Rails never said that, and the API surface shows exactly where it doesn’t:

| Method | Callbacks | Validations |
| --- | --- | --- |
| save / save! / update / update! | yes | yes |
| destroy / destroy! | yes | no (destroy never validates) |
| save(validate: false) | yes | no |
| update_attribute | yes | no |
| touch | after_touch / after_commit only | no |
| update_column / update_columns | no | no |
| update_all / delete / delete_all | no | no |
| insert_all / upsert_all / touch_all | no | no |

It’s documented behavior, and the skip methods exist because callbacks exist: they’re the pressure-release valve Rails ships for when the chain is too slow, too loud, or too dangerous to run. Your coworkers use them, Rails uses them internally, and the Example Two engineer used one on purpose.

The unassuming update_attribute (singular) runs the callbacks but skips validations:

class ValidatedSeat < ActiveRecord::Base
  self.table_name = "seats"
  validates :reserved_by, exclusion: {in: ["invalid"]}
end

def update_attribute_demo
  seat = ValidatedSeat.create!(reserved_by: "fine", external_ref: "REF-#{SecureRandom.hex(4)}")
  # update_attribute runs the callbacks but skips the validation
  seat.update_attribute(:reserved_by, "invalid")
  ValidatedSeat.find(seat.id).reserved_by
  # => "invalid"
end

Now we have the forbidden value in the database despite the exclusion validation and every callback having fired.

Next are the bulk operations, which skip the entire chain: not just the callbacks you wrote, but the ones Rails registered for you. That includes counter_cache, so the first insert_all or delete against a counted association leaves a number that’s going to be real confusing for a while:

event = Event.create!(name: "RubyConf")
# chain runs, counter goes to 1
Seat.create!(event: event)
# chain skipped
Seat.insert_all([{event_id: event.id}] * 3)

event.reload.seats_count
# => 1
event.seats.count
# => 4

That leaves you with a counter cache that disagrees with the table and stays wrong until something recounts it. If you want another fun one, gems like PaperTrail that record changes through callbacks miss these writes too, meaning your audit logs are going to be missing some pretty significant events.

In the index article I said constraints are preferred because they’re enforced on every write path, and this is the other half of that argument: a unique index has no insert_all-shaped hole in it, and a callback is an invariant with a published bypass list, which is a convention with good marketing. That’s the half where callbacks silently don’t run. The next failure is worse: callbacks that silently do.

Failure Two: Firing When Nobody Asked

Failure One was about callbacks that don’t fire when you expect them to. This failure is the opposite: callbacks that fire when nobody asked them to, taking locks on rows you never touched:

class TouchEvent < ActiveRecord::Base
  self.table_name = "events"
  after_touch { audit(:touched) }

  def audit(action) = AppLogger.info("[AUDIT] #{self.class.name}##{id} #{action}")
end

class TouchSeat < ActiveRecord::Base
  self.table_name = "seats"
  belongs_to :event, class_name: "TouchEvent", foreign_key: :event_id, touch: true, optional: true
end

Creating a seat fires in this order:

seat   before_save   transaction_open? => true
seat   after_save    transaction_open? => true
event  after_touch   transaction_open? => true
seat   after_commit  transaction_open? => false

The Event row was written, and its row lock taken, inside a transaction the Event author never sees and the Seat caller never asked about. That’s touch: true doing its job (bumping updated_at so view caches expire), but the side effect is a hidden write to a parent row on every child save.

Hidden writes come with hidden locks. The trace appends a write to events at the end of every seat save, meaning every seat save locks the seat row first and its parent event second. Anything that updates an event and then its seats takes the same locks in the opposite order, event first and seat second, and under any load? Well there’s a fun deadlock to debug. You can’t order the execution of locks you can’t see, and callbacks make them effectively invisible.
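The lock-order problem can be written down as data. This sketch treats each code path as the ordered list of things it locks and reports any pair that two paths take in opposite orders. The function name is hypothetical, and table names stand in for individual rows, which is a simplification.

```ruby
# Each path is the ordered list of tables whose rows it locks.
# Returns every pair that path_a and path_b acquire in opposite orders.
def lock_order_inversions(path_a, path_b)
  path_a.combination(2).select do |first, second|
    first_pos = path_b.index(first)
    second_pos = path_b.index(second)
    first_pos && second_pos && second_pos < first_pos
  end
end

seat_save    = [:seats, :events] # the seat row, then the hidden touch on its event
event_update = [:events, :seats] # the event row, then each of its seats

lock_order_inversions(seat_save, event_update)      # => [[:seats, :events]]
lock_order_inversions(seat_save, [:seats, :events]) # => []
```

The check is trivial once both paths are written down. The hard part is that touch: true never appears in the seat_save path anyone would write from reading the call site.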

The chain also fires on reads. after_find fires once per row loaded and after_initialize fires on every instantiation, so a default assigned there (self.priority = true) marks a freshly loaded record dirty: changed? returns true, and any code holding a polite save if changed? starts writing during what should be a read.
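A toy dirty tracker shows how a read turns into a write. ToyRecord is hypothetical and compares whole attribute hashes rather than tracking changes the way ActiveModel::Dirty does, but the effect is the same: a default assigned on every instantiation makes a freshly loaded row look changed.

```ruby
# Toy dirty tracking; not ActiveModel::Dirty.
class ToyRecord
  def initialize(row)
    @original = row.dup
    @attributes = row.dup
    after_initialize
  end

  def changed?
    @attributes != @original
  end

  private

  # A "default" that runs on every instantiation, including loads.
  def after_initialize
    @attributes[:priority] = true
  end
end

# A row read from the database with priority already stored as false.
ToyRecord.new({id: 1, priority: false}).changed? # => true

# Only rows that already match the default stay clean.
ToyRecord.new({id: 2, priority: true}).changed?  # => false
```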

The folk fix for accidental re-entry and cycles is update_column inside the callback to break the chain, which works by skipping the rest of it. The cure for Failure Two is a fresh case of Failure One.

Failure Three: Validations Are Callbacks Too

Validations read like declarations, but they execute as entries in the same chain (the census counted a :validation kind for a reason). They run on every save, every create, and every call to valid?, and before_validation gets to rewrite what they see, which means more hidden side effects.

Validations always run before the before_save callbacks, no matter where either is declared in the class, which leaves a path for invalid values to sneak through after the check has already passed:

class MutatingSeat < ActiveRecord::Base
  self.table_name = "seats"
  validates :reserved_by, exclusion: {in: ["invalid"]}
  before_save { self.reserved_by = "invalid" }
end

def mutation_past_validation
  seat = MutatingSeat.new(reserved_by: "fine", external_ref: "MV#{SecureRandom.hex(4)}")
  seat.valid?
  # => true (validations passed!)

  # before_save rewrites reserved_by after validation has already run
  seat.save!
  reloaded = MutatingSeat.find(seat.id)

  reloaded.reserved_by
  # => "invalid"

  reloaded.valid?
  # => false

  {
    value: reloaded.reserved_by,
    valid: reloaded.valid?
  }
end

Validations do not guard the write. They guard a moment before it, and several other callbacks still run between that moment and the write, with no way for the validation to see what they do.

Beyond ordering, validations have a query cost. A uniqueness validation is a SELECT asking “does this exist yet” before every INSERT, and a transitive validation (one that reads other records to validate this one) is another query on top:

def validation_query_fanout(count: 50)
  event = Event.create!(name: "Fanout#{SecureRandom.hex(4)}", capacity: 100_000)

  validated = count_statements_for {
    count.times do |n|
      CapacitySeat.create!(event: event, external_ref: "VF#{SecureRandom.hex(6)}")
    end
  }

  plain = count_statements_for {
    count.times { |n| Seat.create!(external_ref: "PF#{SecureRandom.hex(6)}") }
  }

  {validated: validated, plain: plain}
end

def count_statements_for(&block)
  total = 0
  # Count every statement except schema introspection queries
  sub = ActiveSupport::Notifications.subscribe("sql.active_record") do |event|
    next if event.payload[:name] == "SCHEMA"

    total += 1
  end

  block.call
  total
ensure
  # Unsubscribe even if the block raises, so the counter doesn't leak
  ActiveSupport::Notifications.unsubscribe(sub) if sub
end

Running fifty creates through that model against fifty through a bare one, with a subscriber counting every statement:

50 plain creates:      150 statements (BEGIN, INSERT, COMMIT per create)
50 validated creates:  250 statements (the same, plus 100 SELECTs)

That means two extra reads per row on your hottest table. Scale that to fifty thousand rows and you’re paying a hundred thousand extra SELECTs, and the capacity COUNT scans more rows for every successful insert.
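The arithmetic is small enough to check by hand, but writing it down makes the scaling obvious. The per-create figures are taken from the run above; the constant and method names are just for illustration.

```ruby
BASE_STATEMENTS = 3    # BEGIN, INSERT, COMMIT
VALIDATION_SELECTS = 2 # extra reads per validated create, per the run above

def statements_for(rows, validated:)
  per_row = BASE_STATEMENTS + (validated ? VALIDATION_SELECTS : 0)
  rows * per_row
end

statements_for(50, validated: false) # => 150
statements_for(50, validated: true)  # => 250

# The overhead grows linearly with the number of rows written.
statements_for(50_000, validated: true) - statements_for(50_000, validated: false)
# => 100000
```

This counts statements only. It says nothing about the capacity COUNT getting slower as the table grows, which is a cost on top of the linear one.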

But there’s one more surprise: uniqueness checks are vulnerable to race conditions. Two concurrent requests can both run the SELECT before either has inserted, both see no conflict, and both INSERT. The Rails guides say this outright when they tell you to back the validation with a unique index. The validation produces the friendly error message, the index enforces the rule. And all of this runs at a specific point in the transaction lifecycle, which brings us to the next failure.
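The race is easy to reproduce deterministically if you interleave the steps by hand. This is a sketch with an in-memory ToyTable (hypothetical, no database involved): both requests run their “does this exist yet” check before either insert lands, once without a unique index and once with one.

```ruby
class DuplicateKey < StandardError; end

# In-memory stand-in for a table, optionally with a unique index.
class ToyTable
  attr_reader :rows

  def initialize(unique_index: false)
    @rows = []
    @unique_index = unique_index
  end

  def exists?(ref)
    rows.include?(ref)
  end

  def insert(ref)
    raise DuplicateKey, ref if @unique_index && rows.include?(ref)

    rows << ref
  end
end

# Both requests run their validation SELECT before either INSERT lands.
# Returns how many inserts succeeded.
def racing_creates(table, ref)
  a_sees_free = !table.exists?(ref)
  b_sees_free = !table.exists?(ref)

  [a_sees_free, b_sees_free].count do |sees_free|
    next false unless sees_free

    table.insert(ref)
    true
  rescue DuplicateKey
    false
  end
end

unguarded = ToyTable.new
racing_creates(unguarded, "REF-1") # => 2
unguarded.rows                     # => ["REF-1", "REF-1"]

indexed = ToyTable.new(unique_index: true)
racing_creates(indexed, "REF-1")   # => 1
indexed.rows                       # => ["REF-1"]
```

Both validations passed in both runs. The only thing that changed the outcome was the index.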

Failure Four: The Transaction Cuts the Chain in Half

The previous failures were about which callbacks fire. This one is about when. Every callback lives on one side or the other of COMMIT, and each side breaks differently:

class TransactionCheckSeat < ActiveRecord::Base
  self.table_name = "seats"
  after_save   { AppLogger.info([:after_save, self.class.lease_connection.transaction_open?]) }   # => true
  after_commit { AppLogger.info([:after_commit, self.class.lease_connection.transaction_open?]) }  # => false
end

after_save runs inside the transaction, so anything you do there which the database can’t undo won’t be undone:

class EmailSeat < ActiveRecord::Base
  self.table_name = "seats"
  after_save   :send_reservation_email
  after_commit :confirm_reservation

  def send_reservation_email
    ReservationMailer.confirmed(self).deliver_later
  end

  def confirm_reservation
    ReservationMailer.final_confirmation(self).deliver_later
  end
end

Now suppose something later in the same transaction raises and the whole thing rolls back. The email went out, after_commit never fired, and the seat doesn’t exist. Your customer is now confirmed for a reservation the database has no record of. The callback didn’t lie, exactly: it ran when the Ruby succeeded, and nobody told it the write didn’t.
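A toy version of that sequence, assuming the surrounding transaction rolls back after after_save has already run. ToyTransaction is hypothetical and models only two things: rows that land on commit, and deferred work that runs after it.

```ruby
# Toy transaction; not ActiveRecord's.
class ToyTransaction
  class Rollback < StandardError; end

  attr_reader :committed_rows

  def initialize
    @committed_rows = []
  end

  # Pending rows and deferred work only land if the block returns normally.
  def run
    pending = []
    deferred = []
    yield pending, deferred
    @committed_rows.concat(pending)
    deferred.each(&:call)
    true
  rescue Rollback
    false
  end
end

outbox = [] # stand-in for "emails actually sent"
tx = ToyTransaction.new

committed = tx.run do |rows, after_commit|
  rows << :seat
  # after_save position: fires immediately, inside the transaction
  outbox << :sent_from_after_save
  # after_commit position: waits for the commit
  after_commit << -> { outbox << :sent_from_after_commit }
  # something later in the same transaction fails
  raise ToyTransaction::Rollback
end

committed         # => false
tx.committed_rows # => []
outbox            # => [:sent_from_after_save]
```

The row is gone and the deferred work never ran, but the side effect from the after_save position is already out the door.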

Being inside the transaction also means IO in a callback holds your row locks open for the duration of that IO. The geocoder gem suggests after_validation :geocode, which turns every save into an HTTP call mid-transaction. Running inside the transaction is also behind one of the oldest job-queue bugs in Rails: enqueue from after_save, and a fast worker looks up the row before your COMMIT lands.

“Fine, after_commit for everything.” Is it? Now you’re holding the other set of edges.

It fires at the outermost commit, once, no matter how nested you are. It coalesces: create and update in one transaction fires only after_create_commit, and the update’s observers never hear about it:

def lifecycle_coalescing_demo
  fires = []
  klass = Class.new(Seat) do
    after_create_commit { fires << :created }
    after_update_commit { fires << :updated }
  end

  klass.transaction do
    seat = klass.create!(external_ref: "LC#{SecureRandom.hex(4)}")
    seat.update!(reserved_by: "x")
  end
  # => [:created]  (the update's observers never hear about it)
  fires
end

It can’t object to anything (the data is already committed), and for most of Rails history multiple after_commit callbacks ran in reverse declaration order, a behavior that only flipped when you opted into the 7.1 framework defaults.

One good primitive came out of this: since Rails 7.2, ActiveRecord.after_all_transactions_commit { ... } lets any code defer work until the real outermost commit (7.2 release notes), and it’s the piece the fix below is built on.

Failure Five: The Chain Is a State Machine Nobody Wrote Down

None of the above is fatal at one callback, and nobody has one callback. Here’s a model that looks reasonable until you count the paths through it:

class CompositeOrder < ActiveRecord::Base
  self.table_name = "orders"

  before_validation :normalize_email
  before_save :compute_totals, if: :line_items_changed?
  before_save :apply_loyalty_tier, unless: :imported?
  after_save :update_search_index
  after_save :recalculate_inventory, if: :saved_change_to_status?
  after_commit :sync_to_crm
  after_commit :notify_fulfillment, if: -> { saved_change_to_status?(to: "paid") }
  # plus :touch on the customer, plus a counter cache,
  # plus two more in concerns/syncable.rb that you'll find next month

  private

  def normalize_email
    self.email = email&.strip&.downcase
  end

  def compute_totals = nil
  def apply_loyalty_tier = nil
  def update_search_index = nil
  def recalculate_inventory = nil
  def sync_to_crm = nil
  def notify_fulfillment = nil
  def line_items_changed? = false
  def imported? = false
  def saved_change_to_status?(**) = false
end

Conditionals present a real headache: there are two paths for every condition, and with four that means sixteen different paths a single save could result in depending on what attributes changed. The after_commit conditions specifically (saved_change_to_status?) evaluate against ActiveModel::Dirty state that reflects the coalesced view of the transaction, not the individual save, which means intermediate transitions are invisible to them. The execution order is also in declaration order, or rather file load order when Concerns are involved, and prepend: true is happy to cut in line. The framework forces that option in one documented spot: dependent: :destroy registers its own before_destroy the moment the association is declared, so a guard written below it runs after the children are gone:

class DestroyGuardEvent < ActiveRecord::Base
  self.table_name = "events"
  has_many :seats, foreign_key: :event_id, dependent: :destroy
  # declared second, runs second
  before_destroy :refuse_if_seated

  def refuse_if_seated
    throw(:abort) if seats.exists?
  end
end

# event = DestroyGuardEvent.create!(name: "guarded")
# Seat.create!(event_id: event.id)
# event.destroy  # => destroyed! seats deleted before the guard ran

Add prepend: true and the same destroy aborts with the seat intact. We’ve inadvertently created an ad-hoc state machine, and any before_ callback in it can silently veto the whole write:

class BlockedSeat < ActiveRecord::Base
  self.table_name = "seats"
  before_save { throw :abort }
end

def blocked_save_demo
  blocked = BlockedSeat.new(external_ref: "BK#{SecureRandom.hex(4)}")
  save_result = blocked.save
  # => false
  save_bang = (blocked.save! rescue $!.class)
  # => ActiveRecord::RecordNotSaved
  {save: save_result, save_bang: save_bang}
end

The caller gets false back and probably doesn’t check how it got there. The cost shows up in test suites too, full of skip_callback and stubbed mailers because the chain is too expensive to run on every factory create. Commands don’t have this problem; they test like any other class.

What Gets to Stay

With all that said, not every callback needs to go. A callback that’s a pure function of the record’s own attributes (no IO, no clock, no other rows, no other processes) can’t race, can’t outlive a rollback incorrectly, and can’t lock a parent row, and skipping it via update_column produces a formatting bug rather than a consistency hole. Downcasing an email qualifies; so does deriving a slug. As of Rails 7.1, the ones that only clean up a single attribute’s own value can be declared with normalizes:

class NormalizedUser < ActiveRecord::Base
  self.table_name = "orders"
  normalizes :region, with: -> (val) { val&.strip&.downcase }
end

# NormalizedUser.new(region: "  US-WEST  ").region
# => "us-west"

(Rails: ActiveRecord::Normalization)

The test for whether a callback can stay: does it read or write anything beyond the record’s own attributes? If it does, it goes.

One Door In

A mutation should be visible, positioned relative to the commit, bulk-capable, and announce that it ran. ActiveSupport::Notifications gives us the announcement half, and a base class gives us the single door:

class ApplicationCommand
  def self.call(...)
    new(...).call
  end

  private_class_method :new

  def call
    ActiveSupport::Notifications.instrument(event_name, payload) do
      execute
    end
  end

  private

  def execute
    raise NotImplementedError, "#{self.class.name} must define #execute"
  end

  def payload = {}

  def event_name
    mod = self.class.module_parent_name
    demod = self.class.name&.demodulize&.underscore || "anonymous"

    "#{demod}.#{mod&.underscore || "unknown"}"
  end
end

def private_constructor_demo
  Seats::ReserveSeat.new(seat_id: 1, by: "x")
  # => raises NoMethodError: private method 'new' called
rescue NoMethodError => e
  e.message
end

def forgetful_demo
  Seats::Forgetful.call
  # => raises NotImplementedError: Seats::Forgetful must define #execute
rescue NotImplementedError => e
  e.message
end

When you later want logging, metrics, or tracing around every mutation in the app, they attach to the published events as subscribers. Here’s what a real command looks like built on this base:

module Seats
  class ReserveSeat < ApplicationCommand
    class AlreadyReserved < StandardError
    end

    def initialize(seat_id:, by:)
      @seat_id = seat_id
      @by = by
    end

    private

    attr_reader :seat_id, :by
    def payload = {seat_id:, by:}

    def execute
      seat = reserve
      announce(seat)
      seat
    end

    def reserve
      seat = Seat.find(seat_id)
      seat.with_lock do
        raise AlreadyReserved, "seat #{seat_id} is already reserved" if seat.reserved?

        seat.update!(reserved: true, reserved_by: by)
        Ledger::RecordReservation.call(seat: seat, by: by)
      end

      seat
    end

    def announce(seat)
      ReservationMailer.confirmed(seat).deliver_later
      Webhooks::Emit.call(event: :seat_reserved, record: seat)
    end
  end
end

instrument publishes on every invocation, including when the block raises, with the exception in the payload:

def instrumentation_output
  seat = Seat.create!(external_ref: "IO#{SecureRandom.hex(4)}")
  events = []
  sub = ActiveSupport::Notifications.subscribe(/\.seats\z/) do |event|
    events << {name: event.name, payload: event.payload.except(:exception_object), ms: event.duration.round(1)}
  end

  Seats::ReserveSeat.call(seat_id: seat.id, by: "brandon")
  begin
    Seats::ReserveSeat.call(seat_id: seat.id, by: "someone-else")
  rescue Seats::ReserveSeat::AlreadyReserved
  end

  ActiveSupport::Notifications.unsubscribe(sub)
  events
end
# => [
#   { name: "reserve_seat.seats", payload: { seat_id: 1, by: "brandon" }, ms: 7.4 },
#   { name: "reserve_seat.seats", payload: { seat_id: 1, by: "someone-else",
#       exception: ["Seats::ReserveSeat::AlreadyReserved", "seat 1 is already reserved"] }, ms: 0.5 }
# ]

Even the writes that didn’t happen leave a trace.
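If you want to poke at the mechanism without booting Rails, the same shape fits in plain Ruby. TinyNotifier here is a hypothetical stand-in for ActiveSupport::Notifications, reduced to the one behavior that matters: it publishes whether the block returns or raises, then lets the exception continue to the caller.

```ruby
# Hypothetical stand-in for ActiveSupport::Notifications, plain Ruby only.
module TinyNotifier
  @subscribers = []

  def self.subscribe(&block)
    @subscribers << block
  end

  # Publishes whether the block returns or raises; a raise is recorded
  # in the payload and then re-raised to the caller.
  def self.instrument(name, payload)
    yield
  rescue StandardError => error
    payload = payload.merge(exception: [error.class.name, error.message])
    raise
  ensure
    @subscribers.each { |subscriber| subscriber.call(name, payload) }
  end
end

class TinyCommand
  def self.call(...)
    new(...).call
  end

  private_class_method :new

  def call
    TinyNotifier.instrument(self.class.name, payload) { execute }
  end
end

# A command that always refuses, to exercise the failure path.
class Refuse < TinyCommand
  def initialize(reason:)
    @reason = reason
  end

  private

  def payload
    {reason: @reason}
  end

  def execute
    raise ArgumentError, @reason
  end
end

events = []
TinyNotifier.subscribe { |name, payload| events << [name, payload] }

begin
  Refuse.call(reason: "seat taken")
rescue ArgumentError
  # the caller still sees the failure
end

events
# => [["Refuse", {reason: "seat taken", exception: ["ArgumentError", "seat taken"]}]]
```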

One fair objection is that announce still isn’t transactional. If the process dies between the commit and the mailer call, the side effect is lost. That’s true, and at scale the answer is a transactional outbox that writes the event inside the transaction and delivers it asynchronously. For most apps the visibility alone (the side effect is in one file, not scattered across a chain) is the win; the outbox is there when you need guaranteed delivery.
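For reference, the outbox idea reduces to two steps: record the event in the same transaction as the write, then have a separate worker deliver whatever committed. The sketch below uses an in-memory ToyDatabase with a snapshot-and-restore transaction; every name in it is hypothetical, and a real outbox would be a table drained by a background job.

```ruby
# In-memory stand-in for a database with two tables; hypothetical names.
class ToyDatabase
  attr_reader :seats, :outbox

  def initialize
    @seats = {}
    @outbox = []
  end

  # All-or-nothing: on any error both tables are restored, then the error re-raised.
  def transaction
    seats_before = @seats.dup
    outbox_before = @outbox.dup
    yield
  rescue StandardError
    @seats = seats_before
    @outbox = outbox_before
    raise
  end
end

# The write and the event that describes it commit together or not at all.
def reserve_seat(db, seat_id:, by:)
  db.transaction do
    raise ArgumentError, "seat #{seat_id} is already reserved" if db.seats.key?(seat_id)

    db.seats[seat_id] = by
    db.outbox << {event: :seat_reserved, seat_id: seat_id, delivered: false}
  end
end

# Separate worker: delivers only what committed, and skips rows already delivered.
def drain_outbox(db, deliveries)
  db.outbox.reject { |row| row[:delivered] }.each do |row|
    deliveries << row[:event]
    row[:delivered] = true
  end
end

db = ToyDatabase.new
reserve_seat(db, seat_id: 1, by: "brandon")

deliveries = []
drain_outbox(db, deliveries)
drain_outbox(db, deliveries) # nothing left to deliver the second time
deliveries # => [:seat_reserved]
```

If the process dies after the commit, the undelivered row is still sitting in the outbox for the next drain, which is the guarantee the inline announce can’t give you.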

Bulk gets the same shape with a different verb, condensed to its differences:

class ImportSeats < ApplicationCommand
  def initialize(rows:) = @rows = rows

  private

  def payload = {count: @rows.size}

  def execute
    Seat.insert_all(@rows)

    # insert_all doesn't return records on MySQL (no RETURNING clause);
    # re-query to get the ActiveRecord objects for downstream use.
    seats = Seat.where(external_ref: @rows.pluck(:external_ref))

    SearchIndex::IndexSeats.call(seats: seats)
    ImportMailer.completed(seats.count).deliver_later

    seats
  end
end

One class per verb, one way in, and everything else that touches seats reads.

Making It Enforced, Not Polite

The commands above are meant to live under packs/seats/app/public/, and that path is deliberate. Packwerk splits a Rails app into packs, folders with declared dependency boundaries enforced at CI, and treats each pack’s app/public directory as that pack’s API: reference anything outside it from another pack and CI fails with a privacy violation. So commands are the API and the model isn’t in it:

packs/seats/
├── package.yml          # enforce_privacy: true
└── app/
    ├── public/
    │   └── seats/
    │       ├── reserve_seat.rb
    │       ├── release_seat.rb
    │       └── import_seats.rb
    └── models/
        └── seat.rb      # private to the pack

Every write announces itself through the base’s instrumentation, including refused ones:

def notification_subscriber_demo
  events = []
  sub = ActiveSupport::Notifications.subscribe(/\.seats\z/) do |event|
    events << "[WRITE] #{event.name} #{event.payload} #{event.duration.round}ms"
  end

  seat = Seat.create!(external_ref: "NS#{SecureRandom.hex(4)}")

  Seats::ReserveSeat.call(seat_id: seat.id, by: "test")
  ActiveSupport::Notifications.unsubscribe(sub)

  events
end
# => ["[WRITE] reserve_seat.seats {seat_id: 1, by: \"test\"} 7ms"]

Logging, metrics, and APM tracing attach as subscribers rather than edits to three hundred command files. This is event-driven architecture at the application level, and it trades one kind of coupling for another. Callbacks couple implicitly (the subscriber is invisible to the publisher and to static tools). Events couple explicitly (the subscriber is a constant reference Packwerk can check, and the event name is a grep-able string), but Pack B still reacts to Pack A’s writes without Pack A knowing.

  • You gain discoverability (subscribers are declared, not hidden in association macros)
  • You gain replaceability (swap a subscriber without touching the publisher)
  • You pay in indirection (following a notification to its subscriber takes a search, not a stack trace)
  • You pay in ordering uncertainty (subscribers fire in registration order, which is load order)

For most apps the discoverability wins outright. If the indirection becomes painful, that usually means the boundary is in the wrong place, not that the mechanism is wrong.

None of this holds without enforcement. A convention is only as strong as the mechanisms that guarantee it, and without guardrails standards become requests that get summarily ignored.

One: Packwerk privacy. The commands already live in app/public, so enforce_privacy: true makes the model unreachable from outside the pack. A direct seat.update! from another pack fails CI with a privacy violation before it merges (Packwerk).

Two: RuboCop, a linter, accepts custom rules called cops, and this pattern earns two of them. If you haven’t written a cop before, ASTs in Ruby: Pattern Matching covers how to work with Ruby’s syntax tree programmatically. The first cop: no mutations outside command files.

module ArticleCops
  # Only flags mutation calls on constants (Model.create!) or instance
  # variables/self (@order.save!, save!). Skips local variables like
  # set.delete or hash.update which aren't ActiveRecord.
  class WriteOutsideCommand < RuboCop::Cop::Base
    def on_send(node)
      return unless mutation?(node)
      return unless model_receiver?(node)
      return if processed_source.file_path.include?("app/public")

      add_offense(node, message: "Mutate ActiveRecord models through a command in app/public.")
    end

    private

    def mutation?(node)
      %i[
        save
        save!
        update
        update!
        update_column
        update_columns
        update_all
        destroy
        destroy!
        delete
        delete_all
        insert_all
        upsert_all
        create
        create!
      ]
        .include?(node.method_name)
    end

    def model_receiver?(node)
      receiver = node.receiver

      # bare save! (implicit self)
      return true if receiver.nil?

      # User.create!
      return true if receiver.const_type?

      # @order.save!
      return true if receiver.ivar_type?

      # self.update!
      return true if receiver.self_type?

      false
    end
  end
end

# The below is only necessary to prove the cop works inline in tests.
# In a real project you'd drop the class into .rubocop/ and run normally.
def write_outside_command_cop_check(source, file_path: "app/services/foo.rb")
  registry = RuboCop::Cop::Registry.new([ArticleCops::WriteOutsideCommand])
  config = RuboCop::Config.new({"ArticleCops/WriteOutsideCommand" => {"Enabled" => true}}, "")
  team = RuboCop::Cop::Team.mobilize(registry, config)

  source_obj = RuboCop::ProcessedSource.new(source, RUBY_VERSION.to_f, file_path)
  result = team.investigate(source_obj)

  result.offenses
end
# write_outside_command_cop_check("Order.create!(name: 'x')", file_path: "app/services/foo.rb")
# => [#<RuboCop::Cop::Offense: Mutate ActiveRecord models through a command in app/public.>]
#
# write_outside_command_cop_check("Order.create!(name: 'x')", file_path: "app/public/orders/capture.rb")
# => []

The second cop guarantees the one-entrant rule. A command defines a private execute method, and no public methods outside of the base call.

module ArticleCops
  class CommandSingleEntrant < RuboCop::Cop::Base
    def on_class(class_node)
      return unless class_node.parent_class&.source == "ApplicationCommand"

      visibility = :public
      body = class_node.body

      nodes = if body.nil?
        []
      elsif body.begin_type?
        body.children
      else
        [body]
      end

      nodes.each do |node|
        if node.send_type? && %i[public private protected].include?(node.method_name)
          visibility = node.method_name
        end

        next unless node.def_type?
        next if node.method_name == :initialize

        if node.method_name == :call || visibility == :public
          add_offense(
            node,
            message: "Commands keep one door: define a private execute, never define call or another public method."
          )
        end
      end
    end
  end
end

# Same harness as above — only needed for inline verification.
def command_single_entrant_cop_check(source)
  registry = RuboCop::Cop::Registry.new([ArticleCops::CommandSingleEntrant])
  config = RuboCop::Config.new({"ArticleCops/CommandSingleEntrant" => {"Enabled" => true}}, "")
  team = RuboCop::Cop::Team.mobilize(registry, config)

  source_obj = RuboCop::ProcessedSource.new(source, RUBY_VERSION.to_f, "test.rb")
  result = team.investigate(source_obj)

  result.offenses
end
# command_single_entrant_cop_check("class Bad < ApplicationCommand\n  def extra\n  end\nend")
# => [#<RuboCop::Cop::Offense: Commands keep one door...>]

The first cop only flags calls on constants, ivars, and self, so set.delete passes through cleanly. The second misses class << self tricks. Both catch the common mistakes and a rubocop:disable handles the rest.

Three: database constraints under everything, because commands are still application code. Unique indexes, CHECKs, foreign keys, NOT NULL. When someone bypasses the commands anyway, and someone will, the constraint converts silent corruption into an exception.

What’s left in the model after all this? Schema, associations, scopes, normalizes, and validations for caller convenience. The database holds the invariants, the commands hold the behavior.

Strangling the Chain in an Existing App

The problem with these patterns is by the time you consider them your app has a lot of legacy to contend with, meaning this isn’t a clean greenfield project we can change however we want. The name of the game in these scenarios is progressive enhancement, or from another vantage the strangler fig pattern: grow a replacement around the old system, hollow it out, and replace it over time (Fowler, 2004). The goal with large migrations on old codebases is to make them obvious, mechanical, isolatable, rollback-able (because you’re going to need it), and progressive so a team currently underwater has time to adapt.

Start by making the invisible visible:

def census_strangler
  census_results = ActiveRecord::Base.descendants.filter_map do |model|
    next if model.abstract_class?

    count = %i[validation save create update destroy commit touch].sum do |kind|
      model.send(:"_#{kind}_callbacks").to_a.size
    end

    [model.name, count] if count.positive?
  end

  census_results.sort_by(&:last).reverse.first(20)
end

The top of that list is your starting point.

Take the callbacks from your worst models and divide them into three buckets: pure-self (keep, or convert to normalizes), cross-model writes (into a command, inside the transaction), and IO (into a command, below the transaction). From there build commands around the current behavior first, copying the bodies inline in the same order the current chain runs. Avoid the instinct to refactor or improve, as a refactor and a behavior change in one PR is a regression waiting to happen.
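The sorting rule is mechanical enough to write as a function. In this sketch the Callback struct and its two traits are hypothetical: you’d fill them in by reading each callback’s body. The trait values for the three names borrowed from CompositeOrder are assumptions, since the article stubs those methods out.

```ruby
# Hypothetical traits you would fill in by reading each callback's body.
Callback = Struct.new(:name, :touches_other_rows, :does_io, keyword_init: true)

# IO wins over everything else: it has to move below the transaction
# even if the callback also writes other rows.
def bucket_for(callback)
  return :command_below_transaction if callback.does_io
  return :command_inside_transaction if callback.touches_other_rows

  :keep_or_normalize
end

# Assumed traits, for illustration only.
callbacks = [
  Callback.new(name: :normalize_email, touches_other_rows: false, does_io: false),
  Callback.new(name: :recalculate_inventory, touches_other_rows: true, does_io: false),
  Callback.new(name: :sync_to_crm, touches_other_rows: false, does_io: true)
]

callbacks.to_h { |callback| [callback.name, bucket_for(callback)] }
# => {
#   normalize_email: :keep_or_normalize,
#   recalculate_inventory: :command_inside_transaction,
#   sync_to_crm: :command_below_transaction
# }
```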

Seriously: Minimal, surgical, copy-paste changes when doing this. That one-liner is not nearly as obvious or non-impacting as you think and it won’t be worth it. Ask me how I know, I’ve learned that one the hard way more times than I’d care to admit.

From there route the call sites and turn the cop on for those files. Deletion only comes afterwards, with a safety net underneath it.

Thankfully routing is the safe part: the update! inside the command still fires those callbacks, so behavior doesn’t change. Deletion, on the other hand, is more complicated and involves a deploy, and deploys are not known for being fast in most applications. That’s where feature flagging comes in, and there are several options to help make these cutovers cleaner. Flipper is a common Ruby implementation, with gates for a single actor, a group, or a percentage of actors; with it the legacy callback gets a guard and the command gets its mirror:

class FlipperOrder < ActiveRecord::Base
  self.table_name = "orders"
  include Flipper::Identifier

  after_commit :sync_to_crm, unless: -> { Flipper.enabled?(:orders_capture_via_command, self) }

  def sync_to_crm
    Crm::SyncOrder.call(order: self)
  end
end

module Orders
  class CaptureOrder < ApplicationCommand
    def initialize(order:) = @order = order

    private

    def payload = {order_id: @order.id}

    def execute
      @order.update!(region: "captured")
      announce(@order)
      @order
    end

    def announce(order)
      if Flipper.enabled?(:orders_capture_via_command, order)
        Crm::SyncOrder.call(order: order)
      end
    end
  end
end

The guard and the command check the same flag on the same actor, so each order syncs exactly once regardless of which path it takes. If something goes wrong, Flipper.disable(:orders_capture_via_command) moves everyone back to the callback path instantly with no deploy.

The rollout itself is one line:

Flipper.enable_percentage_of_actors(:orders_capture_via_command, 5)
Flipper[:orders_capture_via_command].percentage_of_actors_value
# => 5
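A percentage-of-actors gate suits this kind of cutover because the decision is a function of the feature and the actor, not a coin flip per request. The sketch below is a plain-Ruby stand-in rather than Flipper's own code: the hashing scheme (CRC32 over the feature name plus the actor id) is an assumption about how such a gate can be built, and `in_rollout?` and the id format are hypothetical.

```ruby
require "zlib"

# Hypothetical stand-in for a percentage-of-actors gate; not Flipper's API.
# Assumes the bucket comes from a stable hash of feature name + actor id.
def in_rollout?(feature, actor_id, percentage)
  bucket = Zlib.crc32("#{feature}#{actor_id}") % 100
  bucket < percentage
end

feature   = :orders_capture_via_command
order_ids = (1..1_000).map { |n| "Order;#{n}" } # illustrative id format

at_5  = order_ids.select { |id| in_rollout?(feature, id, 5) }
at_25 = order_ids.select { |id| in_rollout?(feature, id, 25) }

# Same actor, same answer: asking again gives the same cohort.
stable = at_5 == order_ids.select { |id| in_rollout?(feature, id, 5) }
puts "stable across calls: #{stable}"

# Raising the percentage only adds actors; nobody already enabled drops out.
monotonic = (at_5 - at_25).empty?
puts "5% cohort is inside the 25% cohort: #{monotonic}"
```

Under that assumption, an order that moved to the command path at 5% stays there at 25%, which is what makes comparing cohorts over a week meaningful.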

Walk the percentage up over a week, compare the outputs between cohorts, and once it’s at a hundred for a while delete the callback, the guard, and the flag in one commit. One important ordering note: turn the cop on for a model’s call sites before enabling the flag, otherwise flag-on orders hitting un-routed paths skip both syncs.
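The ordering note is easier to trust once the four combinations are written out. The following is a plain-Ruby simulation, not the real models: `run_callback`, `legacy_write`, and `command_write` are hypothetical stand-ins for the guarded after_commit, an un-routed call site, and a call site routed through the command.

```ruby
# Plain-Ruby simulation of the guarded callback and its mirror in the command.
# All names are hypothetical stand-ins; nothing here touches Rails or Flipper.

# Stand-in for the guarded after_commit: it syncs only while the flag is off.
def run_callback(flag_on, syncs)
  syncs << :callback unless flag_on
end

# Un-routed call site: writes directly, so only the callback can sync.
def legacy_write(flag_on)
  syncs = []
  run_callback(flag_on, syncs) # the direct update! fires the callback
  syncs
end

# Routed call site: the command's update! still fires the callback,
# then the command syncs itself when the flag is on.
def command_write(flag_on)
  syncs = []
  run_callback(flag_on, syncs) # update! inside the command
  syncs << :command if flag_on # mirror of the guard
  syncs
end

[false, true].each do |flag_on|
  puts "flag=#{flag_on} routed=true  -> #{command_write(flag_on).inspect}"
  puts "flag=#{flag_on} routed=false -> #{legacy_write(flag_on).inspect}"
end
```

Three of the four cells sync exactly once. The fourth, flag on with an un-routed call site, syncs zero times, and that is the cell the cop closes off before the flag is enabled.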

A model with thirty callbacks won’t convert in one pass. Carve it verb by verb, one flag per verb, and the census count ticks down with each extraction.
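To watch that count tick down, the census output can be saved and compared between runs. A minimal sketch, assuming each snapshot is stored as an array of `[model_name, count]` pairs; `census_delta` is a hypothetical helper and the numbers are made up.

```ruby
# Hypothetical helper: compares two census snapshots, each an array of
# [model_name, callback_count] pairs.
def census_delta(before, after)
  before_counts = before.to_h
  after_counts  = after.to_h

  # A model missing from one snapshot is treated as having zero callbacks.
  deltas = (before_counts.keys | after_counts.keys).filter_map do |model|
    change = after_counts.fetch(model, 0) - before_counts.fetch(model, 0)
    [model, change] unless change.zero?
  end

  deltas.sort_by(&:last) # biggest reductions first
end

# Made-up numbers for illustration only.
last_week = [["Order", 30], ["Invoice", 12], ["Customer", 9]]
this_week = [["Order", 27], ["Invoice", 12], ["Customer", 10]]

census_delta(last_week, this_week).each do |model, change|
  puts format("%-10s %+d", model, change)
end
```

Because a missing model counts as zero, the snapshots should cover every model rather than just the top twenty, or a model sliding out of the list will look like a finished extraction. A positive number is a new callback arriving, which could reasonably fail a CI check.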

Counting the Cost

None of this is free. A command is a class per verb where a callback was a line, and on a small app with one team it’s ceremony. Callbacks are fine until the second write path or the second team shows up. The 37signals school will tell you disciplined callbacks scale further than I’m giving them credit for (“Vanilla Rails is plenty”, “Globals, Callbacks and Other Sacrileges”), and inside one cohesive team they’re not wrong. The weakness of that position is that it assumes every developer has read the lore guide and understood the sharp parts. In a 15+ year-old monolith with 500+ engineers across 30+ teams, with varying skill levels, language familiarity, incentives, and deadlines, guidance on a page doesn’t hold. Enforcement that relies on developers reading docs is soft at best. We need mechanisms that make the wrong thing hard, not docs that ask nicely.

The pattern also requires enforcement to survive (that’s what the cops and Packwerk are for), and it doesn’t cover Rails’ own callbacks. counter_cache, touch, autosave, and dependent: stay; those are framework bookkeeping. The rule applies to your domain behavior: emails, webhooks, ledger writes, cross-model consistency.

So Where Does the Logic Go?

For quick reference, here’s where each kind of callback behavior lands in the new world:

| The callback was doing | Put it |
| --- | --- |
| Normalizing the record’s own attributes | normalizes (7.1+), or keep the pure before_validation |
| Defaulting a column | Database default, or the command |
| Maintaining a counter / timestamp | counter_cache / touch are fine; they’re framework bookkeeping |
| Writing to another model | A command, inside the transaction |
| Email / job / webhook / HTTP | A command, after the transaction block |
| Search index / cache invalidation | A command, after the transaction block, with a bulk verb |
| Audit / version history | A command, or database-level capture; callback gems sample, they don’t record |
| Reacting to another pack’s write | Subscribe to its command’s notification, or be called by it explicitly |
| Cross-record checks (uniqueness, capacity) | A unique index or CHECK for the truth; a guard in the command for the friendly error |
| Enforcing “this must always be true” | Database constraint, with commands as the only door |
| Vetoing a save (throw :abort) | A guard at the top of the command, where it raises something with a name |
| Cascade cleanup | Foreign key ON DELETE, or a command |
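The throw :abort row is the one that changes how callers experience a failure, so it’s worth a sketch. This is plain Ruby with hypothetical names throughout (`Orders::AlreadyShipped`, `Orders::CancelOrder`, and a Struct standing in for the model); in a real app the command would inherit from ApplicationCommand and persist with update!.

```ruby
module Orders
  # A named error tells the caller which rule refused the write. A callback's
  # throw :abort only surfaces as a false return from save, or as a generic
  # ActiveRecord::RecordNotSaved from save!.
  class AlreadyShipped < StandardError; end

  # Plain-Ruby stand-in for a command; not the article's ApplicationCommand.
  class CancelOrder
    def self.call(order:)
      new(order: order).call
    end

    def initialize(order:)
      @order = order
    end

    def call
      # The guard sits at the top, before any write happens.
      if @order.status == "shipped"
        raise AlreadyShipped, "order #{@order.id} has already shipped"
      end

      @order.status = "cancelled" # stand-in for @order.update!(...)
      @order
    end
  end
end

# Struct standing in for an ActiveRecord model.
Order = Struct.new(:id, :status, keyword_init: true)

puts Orders::CancelOrder.call(order: Order.new(id: 1, status: "paid")).status

begin
  Orders::CancelOrder.call(order: Order.new(id: 2, status: "shipped"))
rescue Orders::AlreadyShipped => e
  puts "refused: #{e.message}"
end
```

The caller can rescue the specific error and respond to it, and the refusal shows up by name in logs and error trackers instead of as an anonymous failed save.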

Wrapping Up

Callbacks attach behavior to persistence in a place the caller can’t see and the author can’t control. The fix is to move that behavior into explicit commands with one door, enforce the boundary with tooling, and let the database hold the real invariants.

Next time we’re going to name the pattern we just built half of: CQRS (Command Query Responsibility Segregation), the idea that reads and writes should flow through separate interfaces. We don’t need separate databases for that, just separate objects in the same app. Commands already own the writes; next we add Queries to own the reads, and ActiveRecord stops leaking out of packs entirely.