Spectator Sport, a brief introduction to an upcoming Rails plugin

Hi! 👋 I’m Ben Sheldon. I’m the author of GoodJob, an Active Job backend that I’ll humbly share is mildly popular and known for its broad features and ease of use. I’m working on a new plugin for Rails: ✨

Spectator Sport creates and replays video-like recordings of your live, production website, via a self-hosted Ruby on Rails Engine that lives in your application.

Spectator Sport uses the rrweb library to create recordings of your website’s DOM as your users interact with it, from the perspective of their web browser’s screen (HTML, CSS, images, dynamic content, mouse movements and clicks, navigation). These recordings are stored in your Active Record database for replay by developers and administrators to analyze user behavior, reproduce bugs, and make building for the web more engaging, satisfying, and fun.

Here’s a proof-of-concept demo. It’s very, very, very, very rough and early, but everyone I have demoed it for says “wow, that is totally different and much better than I imagined it when you first explained it to me”: https://spectator-sport-demo-1ca285490d99.herokuapp.com

🚧 🚧 This gem is very early in its development lifecycle and will undergo significant changes on its journey to v1.0. I would love your feedback and help in co-developing it; fyi, it’s going to be so much better than it is right now.

You can help:

Who is this gem for?

I’m writing this on the weekend following Rails World 2024, on the eve of Rails 8’s release. The Rails team is tackling the hard problems of making it easy to deploy your website into production with tools like Kamal, and addressing operational complexity with the Solid suite. What comes next?

Spectator Sport intends to transform your relationship with your website after it is deployed to the web and real people have the opportunity to use it:

  • See how people are actually using your website, directly.
  • Remove the necessity of defining funnel metrics or analytics up front, or the necessity of interpreting user behavior through a limited lens of aggregated or averaged numbers.
  • As a developer and product-maker, more fully engage your sympathetic senses, in addition to your analytical senses, to ultimately be more effective and fulfilled when building for the web.

Launching a website kinda sucks. I’ve been a solopreneur, a web-marketing consultant, a “founding” engineer and a “growth” engineer at VC-backed startups and product labs, and a participant and mentor in entrepreneur communities and mastermind and accountability groups for 19 years. People are not ok.

The fast development feedback of imagining and building the thing locally, the dopamine rush of making something nice, one step at a time, for other people to use… that all drops off a cliff once, well, you put it out there for other people to use. The post-launch and release feedback: it’s not there.

It sucks! People feel it! They’re confused, they’re sad, sometimes mad, looking for help, wanting to be seen by others, spinning on valueless technical changes, sharing tangential hot takes and engagement bait. Developers are going anywhere but directly to the people they’re building for. One reason, I believe, is that their visitors’ and users’ activity on their website is largely invisible and unknowable, and the only way to see it is through a foggy, confusing, and deeply unsatisfying window of abstract metrics and aggregation.

Building for the web should be a spectator sport. More than only a fantasy game of metrics and aggregates and guesses and spread-table gambles. It should be fun and engaging and direct. We should simply be able to observe and react and cheer and cry and fall on the floor and get up and make it better and go again. Believe.

There are constraints on what I can do to achieve this vision with this gem. I’m focused on building for Ruby on Rails, and specifically on hobbyists, solopreneurs, small teams, SMBs (small and midsize businesses), and unique applications, including:

  • applications with limited budgets or the inability (because of geography or policy) to contract or procure a 3rd party service.
  • applications in government or healthcare or on an internal intranet with unique data or privacy constraints that don’t have the budget for a BAA (business associate agreement) or other compliance contracts.
  • applications for which operational simplicity is paramount and that don’t have the resources to operate a more complex self-hosted solution.

We have the technology

Browser recording isn’t new. Fullstory was my introduction to it nearly a decade ago; there are also Tealeaf and Sentry and PostHog and Highlight and Matomo and many others, some of which are no-cost self-hostable as a separate service, though often with complex dependencies. Many of them use rrweb too.

I believe Spectator Sport is the first no-cost, self-hostable browser-recording tool that works anywhere your application runs (Heroku being the narrowest target I can imagine). Tell me what I’m missing!

If my adjectives themselves aren’t compelling and your website already has massive scale and a rich revenue stream and/or no concerns about 3rd-party data subprocessors, I highly recommend checking out PostHog (just $0.005 per recording!) or Sentry (enterprise gated, but integrated into Sentry’s other features which are fantastic).

A good job, again

I mentioned in my introduction that my other gem, GoodJob, is well-regarded. I think we can do it again with Spectator Sport:

  • Focus on solving a class of problems developers experience over a long period of time, not building a specific technology tool and calling it a day.
  • Serve the vastly more numerous solo and full-stack dev teams with limited time and budgets, who will benefit from something tailored to their largely consistent needs (easy, good, inexpensive) and are nice and appreciative when you deliver, rather than the very small number of experienced folks with big budgets and unique needs who inexplicably have time on their hands to be outspoken in telling you it will never work for them.
  • Provide a wide offering of polished features, using boring, existing tech to do the complex bits (like Postgres advisory locks in GoodJob, or rrweb in Spectator Sport). The value comes from the usability of the integration. A full-featured, cleanly designed web dashboard really impresses too; Dark Mode is the epitome of a non-trivial feature to maintain that demonstrates care.
  • Maintain a narrow compatibility matrix, focus on “omakase” Rails (Active Record, Active Storage, etc.) with a sensible EOL policy. Complexity kills. Relational databases are great. Squeeze the hell out of the system you have.
  • Be exceptionally responsive and supportive of developers who need help and meet them where they are. Be personally present because the library can’t speak for itself. Make mistakes, change direction, communicate intent, move forward.
  • Keep the cost of change low, release frequently, build up, iterate, document and test and provide deprecation notices, follow SemVer, and defer application-breaking changes as long as possible.

I do want to try one thing new compared to GoodJob: I want Spectator Sport to be compatible with Postgres and MySQL and SQLite. I believe it’s possible.

Front-running the criticism

Here are the things I have worked through myself when thinking about Spectator Sport, and talked about with others:

Is it creepy? Yes, a little. There is overlap with advertising and marketing and “growth” tech, and many service providers market browser recording as a premium capability with premium prices and sell it hard. Realistically, I have implemented enough dynamic form validations in my career that I no longer imagine any inherent sanctity in an unsubmitted form input on a random website. Conceptually, Spectator Sport observes your website as it is operated by a user; it does not observe the user. Every webpage deserves to be a place, and this just happens to be your CCTV camera pointed at it, for training purposes.

Is it a replacement for usability research? No, of course not. Spectator Sport can only show you half of the picture (or less) that you get from real usability research. When you do real usability research and ask a subject to do something on your website, you ask them to explain what they’re doing, in their own words, based on their own understanding of the task and what they see through their own eyes. Browser recordings alone can’t give you all that. You still have to fill in the blanks in the story.

Is it safe? I think so. I intend all user input to be masked by default, be secure by default, and provide comprehensive documentation that explains both the why and the how to lock down what’s stored and who can access it. Spectator Sport is shipping the DOM to your own database, and it’s likely the same data already lives in the database in a more structured way, and is already reflected back through your application too.

Does it use a lot of storage? Not as much as you might fear. If people’s big scaling worry for GoodJob was “it will be too slow,” I expect Spectator Sport’s will be “it will be too big.” I’ve been running the proof of concept on my own websites, and 1.5k recordings took up ~500MB of storage in Postgres (roughly 340KB per recording). Retention periods can be configured, and data can be compressed and offloaded to Active Storage. I believe it is ok, and worth the squeeze.

Can it do xyz? Maybe. Open an issue on GitHub. I’d love to discuss it with you.

Wouldn’t you rather do something with AI? I dunno, man. I freaking love watching recordings of my websites being driven by people and thinking about how to make the website easier and better for them. I think this is an immensely satisfying missing piece of building for the web, and I think you will too.

Tell me what I’m missing or overlooking!

The call to action, a second time, at the bottom

Something I learned a long time ago, from watching browser recordings (true story!), is that visitors will go deep below the hero’s call-to-action, read all the lovely explanatory content, get to the bottom… and bounce because the call to action wasn’t reinforced.

So, please:


Seeing like a Rails and Ruby platform team

When I’m not hacking on GoodJob, I work at GitHub, where I’m the engineering manager of the “Ruby Architecture” team, which is filled with fantastic rubyists. Our team mission is to:

Make it easy for GitHub engineers to create, deliver, and operate best-of-class Ruby and Rails applications, and share the best of it with the world.

This is an adaptation of a post I published internally at GitHub, and its ensuing discussions, to explain what a team like ours does when we’re supporting other teams and giving technical feedback. I imagine this is similar to other big companies’ Rails and Ruby platform teams, like Shopify’s “Ruby Infrastructure” team. I hope this is useful in thinking about your own Rails and Ruby work, experience, and career development focuses.

Before you “architecture”

The rest of this post is a big ol’ list of deep Ruby technical topics. To avoid premature optimization and architecture astronautics, I want to just quickly ground some expectations of any technical change you lead:

  • Is it clear what it does, especially to others, who may be yourself in the future?
  • Does it follow established patterns and precedent throughout the codebase, and is it internally consistent with itself?
  • Does it accomplish the business goal? Does it work?
  • Does it not prevent other components from accomplishing their business goals? Does it not break or negatively impact other stuff?

I write these things out because it’s very common, as a technical feature goes through multiple reviews and revisions, to lose sight of its original goals or purpose or business constraints. So set yourself up for success by being clear on that stuff up front, and push back (or go deeper) if someone tells you something needs to change for technical reasons but it compromises your intended non-technical outcome.

Architecting Ruby, the list

A brief note about my authority on this. The following list comes out of my experience working on a big Rails and Ruby monolith at GitHub, which has largely co-evolved with Rails and Ruby over the past 15+ years, and alongside 1k+ other engineers. (I’m also a consultant, and worked in a lot of software labs, and untangled a lot of other people’s applications too; and not-Rails stuff too.) Many members of the team are core maintainers of Rails and Ruby, and we treat the Rails Framework as an extension of our application. Our team is responsible for integrating upstream changes in Rails and Ruby within GitHub’s vast monolith. We upgrade and integrate Rails and Ruby main/dev/trunk changes weekly! (Never repeat, never forget.) This continuous practice produces a deep familiarity with how change happens, and where friction builds up between an application and its upstream dependencies. Performing these upgrades over and over leads to experience, and repeated experience leads to intuition.

(btw, please reach out if your company has a practice of continuously upgrading Rails main/dev/trunk and running it in production. GitHub and Shopify and Gusto are trying to form a club and we want you in it.)

There is a general order here, from most important to least, in broad strokes. Remember, nothing here is intrinsically bad or must never be done; but when you do these things, there should be a well-considered decision point.

  • Global namespace and Library/Dependency Privacy Violations
    • Avoid monkeypatching or reaching into private methods or objects.
    • The most appropriate place to make changes is upstream.
  • Safety, Security
    • Avoiding thread safety issues, like globally held objects and privacy violations, not leaking data between requests, or retaining big objects in memory. Profile, profile, profile.
    • Seeking object locality (or avoiding globalness) by storing objects on instances of controllers and jobs (or their attributes) and embracing the natural lifecycles provided by the framework. Frequently a developer desires not to call SomeObject.new at the usage site, but to have a DSL-like callable method already available in the appropriate scope (e.g. current_some_object). We love a good DSL, and they can be difficult to get right.
  • Code Loading, Autoloading, and Reloading
    • Code autoloading is one of the most important design constraints in Rails: it can vastly affect inner-loop development (the “hands-on-keyboard” part) and production availability because of its impact on application boot speed.
    • Designing for code loading and autoloading is critical to overall design, file placement (app vs lib vs config), and dependency interactions.
  • Internal to the Ruby VM constraints
    • Even though Ruby makes it easy to introspect the runtime (descendants or subclasses or ObjectSpace), these shouldn’t be used outside of application boot or exception handling (and sparingly even then); they may have performance implications or be overly nuanced and non-deterministic in their output. Using caller to introspect the Ruby call stack is a particularly expensive operation.
    • While infrequent and not-obvious, some patterns can massively de-optimize the Ruby VM with either localized or global effects. The Ruby VM (or accelerators like YJIT) are unable to optimize certain code patterns, and some patterns may cause VM-internal caches to churn inefficiently or to retain objects and their references unnecessarily (this can get tricky so please partner with us!). You probably want examples:
      • OpenStruct (though there’s probably no reason to use it at all)
      • eval and class_/instance_eval
      • Modifying singleton classes (using extend on objects) (example)
      • Anything that adds to the callstack (call-wrapping, complicated delegation)
      • (handwaves) Things that YJIT isn’t yet optimized for, and things that deoptimize object shapes: the result of new fast paths being introduced, which means there are now slow paths that didn’t previously exist.
      • Native extensions that don’t release the interpreter lock
      • Metaprogramming generally
      • None of these are intrinsically bad (except OpenStruct and poorly done native extensions), and framework- and platform-level code definitely makes use of them. They’re also constantly changing because of upstream Ruby work. And they may be ok in isolation but become a problem when copied as a pattern or introduced as part of the platform for broad consumption. Something John Hawthorn has said:

        A thought experiment I like to try is asking myself how I would implement this in another language without [Ruby magic]… Adding that constraint can help unblock thinking of simpler, more “normal” approaches without expensive metaprogramming.

  • External to the Ruby VM constraints and dependencies (memory, compute, file descriptors, database connections, etc.)
    • Database stuff alone is a lot. The design prompt everyone is largely working from is “how does one architect an efficient, stateless application that sits between an end-user client and stateful data sources and manages bidirectional transformations of data?” Sounds hard when you put it that way, right?
    • Thinking about resource lifecycles, pooling, and how they interact across the various concurrency models available to us (process forking, threads, etc.). We do expect the frameworks and platform libraries we choose to keep these out of mind for most development tasks 😅
  • Design of the thing, for use
    • Rails’s model of “convention over configuration” frequently means that how an object is structured and where it’s placed can have an outsized impact on how it behaves: e.g. within App, Lib, Rack Middleware, Other Library Middleware (Faraday, jobs system, etc.), Rails Configuration/Initialization/Railties, and more!
    • …and how those conventions relate to Maintainability, Developer Usability, and Conceptual Integrity.
    • Sometimes what may appear as simply an aesthetic decision can have a functional impact.
    • Identifying atypical or disordered usage patterns. Sometimes a desired behavior can be more of a happy accident than an enforced intention, and it might change upstream because no one expected it to be used that way.
  • Dependency Stewardship
    • In addition to Rails and Ruby, our monolith depends on hundreds of gems, double that in transitive gem dependencies, and several other runtimes and system libraries.
    • The nature of a monolith is that we go together. If some dependency isn’t compatible with the latest Rails or Ruby, or any other dependency upgrade, we must adapt. We work upstream, we patch locally, and worst case, we remove and replace the dependency with something more maintainable. All of this takes time and effort and resources.
    • We want to choose dependencies that are well-maintained: their maintainers proactively respond to upstream changes, are responsive to issues and PRs, and importantly in Ruby, are nice. (And to whom we are nice too!) That’s more important than benchmarks.
    • And dependencies should be well architected too, obvs.
  • Automating and Scaling: Packwerk, Sorbet, Rubocop
    • We do our best to encode our knowledge and shape the application through tooling; that’s how our team scales! We send our custom rules upstream, too.
    • But it’s complicated! Sometimes developers focus on designing their code in response to the automated tooling and end up with a less effective design, or even introduce global risks and impacts to the application. At worst, a developer might glaze over the linter’s intent by smuggling their design through a spelling or arrangement the linter doesn’t recognize 💀 Unfortunately, the most important things are often the most abstract, arguable, and difficult to detect or automatically warn about. We regret when we do have to tell folks that an approach is untenable in a PR, or even after the fact when we notice production metrics have degraded.
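The thread-safety and object-locality points above can be illustrated in plain, framework-free Ruby. This is a minimal sketch (the RequestHandler class and its names are hypothetical, not part of Rails): state scoped to the object handling the current unit of work can't leak between concurrent requests, while a mutable global can.

```ruby
# Risky: one global that every thread reads and writes.
$current_user = nil

# Safer: a fresh handler instance per "request" (as Rails gives you a
# fresh controller or job instance), so state is never shared across threads.
class RequestHandler
  attr_reader :current_user

  def initialize(user)
    @current_user = user # instance-local; dies with the handler
  end
end

# Simulate five concurrent requests, each with its own user.
threads = 5.times.map do |i|
  Thread.new { RequestHandler.new("user-#{i}").current_user }
end
results = threads.map(&:value).sort
# => ["user-0", "user-1", "user-2", "user-3", "user-4"]
```

With the global-variable approach, the same five threads would race on a single `$current_user` and could observe each other's values; the instance-scoped version has nothing to race on.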

A conclusion about lists

I like making lists of things; I find them helpful. I also realize that not everyone experiences lists the same way I do. For me, the purpose of a good laundry list is to be a quick reminder (“don’t forget to wash the handkerchiefs”) and not an exhaustive list of actionable instructions (“the exact and best temperature to wash this t-shirt and that pair of jeans”). So please reach out to me (Mastodon / Twitter/X) if:

  • You think there is something that should be added to the list, or explained in more detail
  • You’re curious how something in the list might apply to a specific thing you have

I’d love to chat. Thanks for reading!


The secret to perfectly calculate Rails database connection pool size

Ruby on Rails maintains a pool of database connections for Active Record. When a database connection is needed for querying the database, usually one per thread (though that’s changing to per-transaction), a connection is checked out of the pool, used, and then returned to the pool. The size of the pool is configured in config/database.yml. The default, as of Rails 7.2, is pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>.
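The checkout/return cycle can be sketched in plain Ruby. This is a toy illustration of the mechanics, not Rails’s actual implementation (TinyPool is a made-up class): connections are created lazily up to a limit, checked out, used, and checked back in for reuse.

```ruby
# A toy connection pool illustrating lazy creation and checkout/checkin.
class TinyPool
  def initialize(size:, &factory)
    @size = size        # maximum connections that may ever exist
    @factory = factory  # how to create a connection, called lazily
    @created = 0
    @idle = Queue.new   # connections waiting to be reused
    @mutex = Mutex.new
  end

  # Reuse an idle connection, lazily create one under the limit,
  # or block until another thread checks one back in.
  def checkout
    return @idle.pop(true) unless @idle.empty?
    @mutex.synchronize do
      if @created < @size
        @created += 1
        return @factory.call
      end
    end
    @idle.pop
  end

  def checkin(conn)
    @idle.push(conn)
  end

  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn)
  end
end

# Connections are created lazily: three sequential uses reuse one connection,
# no matter how large the pool's limit is.
created = 0
pool = TinyPool.new(size: 100) { created += 1; Object.new }
3.times { pool.with_connection { |conn| conn } }
created # => 1
```

This is the intuition behind the advice below: the pool size is a ceiling, not a reservation, so a large limit costs nothing until connections are actually needed.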

The database connection pool size is frequently misconfigured. A lot. How to calculate the database connection pool size is one of the most common questions I get on GoodJob (Hi! I’m the author of GoodJob 👋). I have spent an embarrassingly large amount of time trying to come up with a precise pool size calculator and give advice to take into account Puma threads, and GoodJob async jobs, and load_async queries and everything that might be asking for a database connection at the same time. It’s nearly impossible to get the number exactly right.

If the connection pool is misconfigured to be too small, it can slow down web requests and jobs while they wait for a connection to become available, or raise ActiveRecord::ConnectionTimeoutError if a connection isn’t available within a reasonable amount of time (5 seconds by default). That’s bad! We never want that to happen. Here’s what you should do:

✨ The secret to perfectly calculate Rails database connection pool size: Don’t! Set the pool size to a very large, constant number, and never worry about it again. E.g. pool: 100, and remove the reference to RAILS_MAX_THREADS entirely:

# config/database.yml
default: &default
  # ...
  pool: 100 # <-- that's it 👍
  # ...

WAIT, WHAT?! Why? I described that bad things happen if the pool size is too small. Here’s the trick: it’s impossible to set the connection pool size to be too big. You can’t do it! That’s why it’s always better to set a number that’s too large. And the best number is one that can never be too small regardless of how you configure (and inevitably reconfigure) your application. Here’s why:

  • Database connections are lazily created and added to the pool as they’re needed. Your Rails application will never create more database connections than it needs. And the database connection pool reaper removes idle and unused connections from the pool. The pool will never be larger than it needs to be.
  • It’s possible you may run out of available database connections at the database. For example, Heroku’s new Essential-0 Postgres database only has 20 database connections available globally. But any problems you run into won’t be because the database connection pool is too big; they’ll be because your application is using too many concurrent database connections.
  • If you find yourself in a situation where your application is using too many concurrent database connections, you should be configuring and re-sizing the things using database connections concurrently, not the database connection pool itself:
    • Configure the number of Puma threads
    • Configure the number of GoodJob async threads (Solid Queue now has similar functionality too!)
    • Configure the load_async thread pool
    • Configure anything else using a background thread making database queries
    • Configure the number of parallel processes/Puma workers/dynos/containers you’re using, which the database connection pool does not affect anyway.
  • If you still don’t have enough database connections at the database, then you should increase the number of database connections at the database. Which means scaling your database, or using a connection multiplexer like PgBouncer. Judoscale has a nice calculator to estimate the number of connections you’ll need at the database (which again, is not the pool size).
  • If, in an incredibly rare case, your application concurrency is very, very spiky and you worry that idle database connections are sitting in the connection pool for too long before they are automatically removed by the connection pool reaper, then configure that:
    • idle_timeout: number of seconds that a connection will be kept unused in the pool before it is automatically disconnected (default: 5 minutes). Set this to zero to keep connections forever.
    • reaping_frequency: number of seconds between invocations of the database connection pool reaper to disconnect and remove unused connections from the pool (default: 1 minute)
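Both of those knobs live alongside pool in config/database.yml; the values here are illustrative, not recommendations:

```yaml
# config/database.yml
default: &default
  # ...
  pool: 100
  idle_timeout: 30       # disconnect connections idle for 30+ seconds (default: 300)
  reaping_frequency: 15  # run the connection reaper every 15 seconds (default: 60)
```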

I know this is wild advice, but it’s based on facts and experience. Even the Rails maintainers intend to remove this configuration option entirely:

…we want the pool not to have a limit by default anymore.

So please, stop sweating the precise, exact, perfect database connection pool value. Set it to something really big, that can never be too small, and never worry about it again.


The Novice Problem

Brandon Weaver’s “Beyond Senior - Metric Obsessions” has been stuck in my mind ever since we caught up at a SF Ruby Meetup and chatted about rules-adherence as a general problem:

…by definition a vast majority of your engineers are likely to be concentrated more towards the novice end of the spectrum, and will frequently over rate themselves on this scale.

If folks in the novice to advanced beginner stages are known for a rigid adherence to rules and almost legalistic approach to them what do you think might happen if you give them a giant list of metrics [, coding rules, linter warnings, dependency violations, or type-checking errors]?

Will they exercise discretion and nuance? Will they have the ability to prioritize based on that information? Will they make appropriate tradeoffs? [No.]

This is coming from the Dreyfus Model of Skills Acquisition, which is like Shuhari but with more levels:

  1. Novice:
    • “rigid adherence to taught rules or plans”
    • no exercise of “discretionary judgment”
  2. Advanced beginner
    • limited “situational perception”
    • all aspects of work treated separately with equal importance
  3. Competent
    • “coping with crowdedness” (multiple activities, accumulation of information)
    • some perception of actions in relation to goals
    • deliberate planning
    • formulates routines
  4. Proficient
    • holistic view of situation
    • prioritizes importance of aspects
    • “perceives deviations from the normal pattern”
    • employs maxims for guidance, with meanings that adapt to the situation at hand
  5. Expert
    • transcends reliance on rules, guidelines, and maxims
    • “intuitive grasp of situations based on deep, tacit understanding”
    • has “vision of what is possible”
    • uses “analytical approaches” in new situations or in case of problems


Notes from Carrierwave to Active Storage

I recently migrated Day of the Shirt, my graphic t-shirt sale aggregator, from storing image attachments with Carrierwave to Active Storage. It went ok! 👍

There were a couple of things driving this migration, though Carrierwave had served me very well for nearly a decade:

  • For budgetary reasons, I was moving the storage service from S3 to Digital Ocean Spaces. I knew I’d be doing some sort of data migration regardless.
  • I was using some monkeypatches of Carrierwave v2 that weren’t compatible with Carrierwave v3. So I knew I’d have to dig into the internals anyways if I wanted to stay up to date.
  • I generally trust Rails, and by extension Active Storage, to be reliable stewards when I take them on as a dependency.

And I had a couple of requirements to work through, largely because images are the content of Day of the Shirt, with dozens or hundreds displayed on a single page:

  • For budget (slash performance), I need to link directly to image assets. No proxying or redirecting through the Rails app.
  • For SEO, I need to customize the image filenames so they are relevant to the content.
  • For performance (slash availability), I need to pre-process image transformations (convert, scale, crop) before they are published. Dozens of new designs can go up on the homepage at once.
  • For availability, I need to validate that the images are (1) transformable and (2) actually transformed before they are published; invalid or missing images are unacceptable.

How’d it go? Great! 🎉 I am now fully switched over to Active Storage. It’s working really well and I was able to meet all of my requirements. Active Storage is very nice, as nice as Carrierwave.

But the errata? Yes, that’s why I’m writing the blog post, and probably why you’re reading. To document all of the stuff I did that wasn’t in the very excellent Active Storage Rails Guide. Let’s go through it:

Direct Linking to images is possible via the method described in this excellent post from Florin Lipan: “Serving Active Storage uploads through a CDN with Rails direct routes”.

Customizing Active Storage filenames is possible with a monkeypatch (maybe someday it will be possible directly). The patch simply adds the specified filename to the end of what otherwise would be a random string; and it seems durable through variants such that the variant extensions will be updated properly when the format is transformed (e.g. from a .png to a .jpg):

# config/initializers/active_storage.rb
module MonkeypatchBlobKey
  def key
    # => hhw3kzc7wcqyglwi7alno9o5yf2v/the-image-filename.png
    self[:key] ||= File.join(self.class.generate_unique_secure_token(length: ActiveStorage::Blob::MINIMUM_TOKEN_LENGTH), filename.to_s)
  end
end

ActiveSupport.on_load(:active_storage_blob) do
  ActiveStorage::Blob.prepend MonkeypatchBlobKey
end

Preprocessing variants required tapping into some private methods to get the variant names back out of the system. Here’s an example of processing all of the variants when the attachment changes. Beware: attachments happen in an after_commit, which is good, but means that I had to introduce a published state to the record to ensure it was not visible until the variants were processed (there is a preprocessed: option to process individual variants async in a background job but that, unfortunately, doesn’t meet my needs for synchronizing them all at once):


class Shirt < ApplicationRecord
  has_one_attached :graphic do |attachable|
    attachable.variant :full, format: :jpg
    attachable.variant :large, resize_to_limit: [1024, 1024], format: :jpg
    attachable.variant :square, resize_to_fill: [300, 300], format: :jpg
    attachable.variant :thumb, resize_to_fill: [100, 100], format: :jpg
  end

  after_commit :process_graphic_variants, if: ->(shirt) { shirt.graphic&.blob&.saved_changes? }, on: [:create, :update]

  def process_graphic_variants
    attachment_variants(:graphic).each do |variant|
      graphic.variant(variant).processed
    end
    update(published: true)
  end

  # All of the named variants for an attachment
  # @param attachment [Symbol] the name of the attachment
  # @return Array[Symbol] the names of the variants
  def attachment_variants(attachment)
    send(attachment).attachment.send(:named_variants).keys
  end
end

Validating variants was easy with a very nice and well-named gem: active_storage_validations. It works really well.

You will have N+1 queries wherever you forget to add with_attached_* scopes. Unfortunately, Active Storage’s schema is laid out in such a way that it emits queries against the same model/table even when everything is loading correctly, so you may get false positives from N+1 detection too. You can see why clearly in the next example, with its doubly-nested blob association.

Active Storage’s schema is a beast. I get that it’s gone through a lot of changes, and Named Variants are an amazing hack once you see how they’ve been implemented. And it’s wild. You can see that in how the scope for with_attached_* is generated:

includes("#{name}_attachment": { blob: {
  variant_records: { image_attachment: :blob },
  preview_image_attachment: { blob: { variant_records: { image_attachment: :blob } } }
} })

I originally thought that when eager-loading through an association (e.g. Merchant.includes(:shirts)) I’d have to do something like this (🫠):

Merchant.includes(shirts: { graphic_attachment: { blob: {
  variant_records: { image_attachment: :blob },
  preview_image_attachment: { blob: { variant_records: { image_attachment: :blob } } }
} } })

…but fortunately this seems to work too (💅):

Merchant.includes(:shirts).merge(Shirt.with_attached_graphic)

That’s everything. All in all I’m very happy with the migration 🌅


On the importance of Rails code reloading and autoloading

I’ve elevated this to a “strongly held belief”: code reloading and autoloading are the most important design constraints when designing or architecting for Ruby on Rails.

  • Code reloading is what powers the “make a code change, refresh the browser, see the result” development loop.
  • Code autoloading is what allows Rails to boot in milliseconds (if you’ve designed for it!) to run generators and application scripts and a single targeted test for tight test-driven-development loops.

When autoloading and reloading just works, it probably isn’t something you think about. When code autoloading and reloading doesn’t work or works poorly, as it has on numerous apps across my career and consulting, it can be maddening:

  • Spending hours “debugging” some code only to realize that your changes were never being run at all.
  • Waiting tens of excruciatingly boring seconds to run a simple test or watching the browser churn away while it slowly waits for a response from the development server.
  • Feeling like you can write the code yourself each time faster than running a scaffold/template generator, repetitively over and over again.

Code reloading and autoloading not working correctly is a huge pain. It’s not great, at all!

The history of code reloading and autoloading came up recently in the Rails Performance Slack. A developer working on an old Rails application asked what Spork was (a forking preloader), and whether it was necessary (not necessarily). As a Rails developer who is increasingly aware of my age, er, experience (I started working with Rails in 2012, long after it first launched in 2004, but it’s still been a minute), I realized I had something to share.

Over the years, various strategies have been taken to make the development loop faster, because it’s that important. Those strategies usually boil down to:

  • Separating the (static) framework code from the (changing, under-development) application code, and loading just in time only what’s needed for the part of the application currently running.
  • Booting the framework code once, since it’s unlikely to change, and then only (re-)loading the application code when invoking a command or running a test.

There have been various approaches to doing this:

  • Forking preloaders (Spork, though Spring is the more contemporary version): load the framework code in a process once, then fork a subprocess when you invoke a command and reload just the application code. Sometimes state gets out of sync (application code or state pollutes the primary process) and things get weird and confusing. This is why you’ll hear people hating on Spring or complaining, “I wasted all day on a development bug, and it turns out I just needed to restart Spring” (the Rails world’s analogue of “it was DNS all along”).
  • Bootsnap, though it uses a caching strategy rather than process forking, serves a similar purpose of speeding up an application’s code loading time. The adoption of Bootsnap, and much, much faster CPUs in general, has largely replaced the usage of Spring in applications (though Spring is still okay!).
  • The Zeitwerk autoloader also plays a role in this history because it, too, tries to “solve” the necessity of separating the framework code (which changes infrequently) from the application code (which is actively being changed) during development, to produce faster feedback cycles. Zeitwerk replaced the previous autoloader built into Rails, whose lineage seems to date all the way back to the Rails 2.0 era. Tell me the history / raison d’être of the original autoloader if you know it!

Look, a lot of labor has gone into this stuff. It’s important! And it’s easy to get wrong and produce a slow and disordered application where development is a pain. It happens! A lot!

I wish I could leave this post with some actionable nugget, but it’s really more like: please take care. Some rules of thumb:

  • Don’t reference, don’t access, don’t use or touch any constants in app/, or allow them to be referenced (looking at you, custom Rack Middleware) unless you’re doing so from another constant in app/ (or somewhere that you know is autoloaded).
  • Take care with config/initializers/ and ensure you’re making the most of ActiveSupport.on_load hooks. Rails may even be missing some load hooks, so make an upstream PR if you need to configure an autoloaded object and you can’t. It’s super common to run into trouble; in writing this blog post alone, I discovered a problem with a gem I use.
  • If you’re writing library code, become familiar with the configuration-class-initializer-attribute-pattern dance (my name for it), which is how something like config.action_view.something = :the_thing gets lifted and constantized into ActionView::Base.something #=> TheThing.
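If the on_load hook pattern feels abstract, here’s a toy, plain-Ruby reimplementation of the lazy-load-hook mechanism. This is illustrative only: in real Rails you’d use ActiveSupport.on_load and ActiveSupport.run_load_hooks, and the hook and class names below are made up.

```ruby
# A toy version of ActiveSupport's lazy load hooks: blocks registered
# with on_load are deferred until run_load_hooks announces the class,
# so initializers never reference the constant before it loads.
module LazyHooks
  @hooks  = Hash.new { |h, k| h[k] = [] }
  @loaded = {}

  def self.on_load(name, &block)
    if @loaded.key?(name)
      @loaded[name].class_eval(&block)  # already loaded: run immediately
    else
      @hooks[name] << block             # not yet loaded: defer
    end
  end

  def self.run_load_hooks(name, base)
    @loaded[name] = base
    @hooks[name].each { |blk| base.class_eval(&blk) }
  end
end

# An "initializer" can safely configure a class before it's loaded:
LazyHooks.on_load(:framework_base) { def self.configured? = true }

class FrameworkBase; end
LazyHooks.run_load_hooks(:framework_base, FrameworkBase)

FrameworkBase.configured? # => true
```

The point is the inversion: the initializer registers intent, and the framework runs it at the right moment, which is exactly what you want in config/initializers/.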

You might find luck with this bin/autoload-check script, which I adapted from something John Hawthorn originally wrote. It gives output like:

❌ Autoloaded constants were referenced during boot.
These files/constants were autoloaded during the boot process,
which will result in inconsistent behavior and will slow down and
may break development mode. Remove references to these constants
from code loaded at boot.

🚨 ActionView::Base (action_view) referenced by config/initializers/field_error.rb:3:in `<main>'
🚨 ActiveJob::Base (active_job)   referenced by config/initializers/good_job.rb:7:in `block in <main>'
🚨 ActiveRecord::Base (active_record)
                                         /Users/bensheldon/.rbenv/versions/3.3.3/lib/ruby/gems/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/base.rb:338:in `<module:ActiveRecord>'
                                         /Users/bensheldon/.rbenv/versions/3.3.3/lib/ruby/gems/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/base.rb:15:in `<main>'
                                         .....

Introducing GoodJob v4

GoodJob version 4.0 has been released! 🎉 GoodJob v4 has breaking changes that should be addressed through the transitional v3.99 release, but if you’ve kept up with v3.x releases and migrations, you’re likely ready to upgrade 🚀

The README has an upgrade guide. If you’d like to leave feedback about this release, please comment on the GitHub Discussions post 📣

If you’re not familiar with GoodJob, you can read the introductory blog post from four years ago. We’ve come pretty far.

Breaking changes to job schema

GoodJob v4 changes how job and job-execution records are stored in the database: it moves from jobs and executions being commingled in the good_jobs table to Jobs (still in good_jobs) having many discrete Execution records in the good_job_executions table.

To safely upgrade, all unfinished jobs must use the new schema relationship, which is tracked in the good_jobs.is_discrete column. This change was transparently introduced in GoodJob v3.15.4 (April 2023), so your application is likely ready to upgrade already if you have kept up with GoodJob updates and migrations. You can check by running v3.99’s GoodJob.v4_ready? in production, or by running the following SQL query on the production database and checking that it returns zero: SELECT COUNT(*) FROM "good_jobs" WHERE finished_at IS NULL AND is_discrete IS NOT TRUE. If not all unfinished jobs are stored in the new format, either wait to upgrade until those jobs finish or discard them. If you upgrade prematurely to v4 without allowing those jobs to finish, they may never be performed.

Other notable changes

GoodJob v4:

  • Only supports Rails 6.1+, CRuby 3.0+, JRuby 9.4+, and Postgres 12+. Rails 6.0, CRuby 2.6 and 2.7, and JRuby 9.3 are no longer supported.
  • Changes job priority to give smaller numbers higher priority (default: 0), in accordance with Active Job’s definition of priority.
  • Enqueues and executes jobs via the GoodJob::Job model instead of GoodJob::Execution.
  • Changes the behavior of config.good_job.cleanup_interval_jobs, GOOD_JOB_CLEANUP_INTERVAL_JOBS, config.good_job.cleanup_interval_seconds, and GOOD_JOB_CLEANUP_INTERVAL_SECONDS: setting them to nil no longer disables count- or time-based cleanups. Instead, set them to false to disable cleanups, or to -1 to run a cleanup after every job execution.
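Of these, the priority flip is the change most likely to bite. Here’s a plain-Ruby sketch of the v4 ordering rule (smaller number wins, matching Active Job’s convention); the job names and priority values below are hypothetical:

```ruby
# GoodJob v4 follows Active Job's convention: lower priority values
# are performed first, and the default priority is 0.
jobs = [
  { name: "NewsletterJob",    priority: 10 },  # runs last
  { name: "DefaultJob",       priority: 0 },   # the new default
  { name: "PasswordResetJob", priority: -5 },  # runs first
]

execution_order = jobs.sort_by { |job| job[:priority] }.map { |job| job[:name] }
# => ["PasswordResetJob", "DefaultJob", "NewsletterJob"]
```

If your app assumed v3’s larger-number-wins behavior, you’ll want to negate your priorities when upgrading.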

New Features

GoodJob v4 does not introduce any new features on its own. In the 110 releases since GoodJob v3.0 (June 2022), these new features and improvements have been introduced:

  • Batches
  • Bulk enqueueing, including support for Active Job’s perform_all_later.
  • Labelled jobs
  • Throttling added to Concurrency Controls
  • Improvements to the Web Dashboard, including Dark Mode, a performance dashboard, improved UI, and customizable templates.
  • Storage of error backtraces. Improved handling of job error conditions, including signal interruptions. Added GoodJob.current_thread_running? and GoodJob.current_thread_shutting_down? to support job iteration.
  • Ordered Queues, queue_select_limit and further options for configuring queue order and performance.
  • Improvements to Cron / Repeating Jobs.
  • Operational improvements including systemd integration, improved health checks.

A huge thank you to 88 (!) GoodJob v3.x contributors 🙇🏻 @afn, @ain2108, @aisayo, @Ajmal, @aki77, @alec-c4, @AndersGM, @andyatkinson, @andynu, @arnaudlevy, @baka-san, @benoittgt, @bforma, @BilalBudhani, @binarygit, @bkeepers, @blafri, @blumhardts, @ckdake, @cmcinnes-mdsol, @coreyaus, @DanielHeath, @defkode, @dixpac, @Earlopain, @eric-christian, @erick-tmr, @esasse, @francois-ferrandis, @frans-k, @gap777, @grncdr, @hahwul, @hidenba, @hss-mateus, @Intrepidd, @isaac, @jgrau, @jklina, @jmarsh24, @jpcamara, @jrochkind, @julienanne, @julik, @LucasKendi, @luizkowalski, @maestromac, @marckohlbrugge, @maxim, @mec, @metalelf0, @michaelglass, @mitchellhenke, @mkrfowler, @morgoth, @Mr0grog, @mthadley, @namiwang, @nickcampbell18, @padde, @patriciomacadden, @paul, @Pauloparakleto, @pgvsalamander, @remy727, @rrunyon, @saksham-jain, @sam1el, @sasha-id, @SebouChu, @segiddins, @SemihCag, @shouichi, @simi, @sparshalc, @stas, @steveroot, @TAGraves, @tagrudev, @thepry, @ur5us, @WailanTirajoh, @yenshirak, @ylansegal, @yshmarov, @zarqman


Rails Strict Locals, undefined local_assigns, and reserved keywords

Update: This has been mostly fixed upstream in Rails (Rails 8.0, I think) and documented in the Rails Guides.

Huge thank you to Vojta Drbohlav in the Rails Performance Slack for helping me figure this out! 🙇

Things I learned today about a new-to-me Ruby on Rails feature:

  • Rails 7.1 added a feature called “Strict Locals” that uses a magic comment in ERB templates to declare required and optional local variables. It looks like this: <%# locals: (shirts:, class: nil) %>, which means that when rendering the partial, it must be provided a shirts parameter and may be provided an optional class parameter. Having ERB templates act more like callable functions with explicit signatures is a nice feature.
  • When using Rails Strict Locals, the local_assigns variable is not defined. You can’t use local_assigns. You’ll see an error that looks like ActionView::Template::Error (undefined local variable or method 'local_assigns' for an instance of #<Class:0x0000000130536cc8>). This has been fixed 🎉 though local_assigns doesn’t expose default values.
  • This is a problem if your template has locals that are also Ruby reserved keywords like class or if, which could be accessed with local_assigns[:class] until you start using Strict Locals. To access locals named with reserved keywords in your ERB template when using Strict Locals, use binding.local_variable_get(:the_variable_name), e.g., binding.local_variable_get(:class) or binding.local_variable_get(:if). This is still necessary if you want to access reserved-keyword locals that have defaults, because the defaults don’t show up in local_assigns.
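The reserved-keyword workaround is plain Binding behavior, not ERB magic, so you can try it anywhere (the values below are arbitrary):

```ruby
# Binding#local_variable_set/get work even for names that are Ruby
# reserved keywords, which is why they help inside Strict Locals
# templates: `if = "x"` won't parse, but the binding API doesn't care.
b = binding
b.local_variable_set(:if, "condition-styled value")
b.local_variable_get(:if) # => "condition-styled value"
```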

Recently, June 2024

  • I finished reading The Poppy War trilogy. It got tiresome by the end. I liked Babel much more, and I’m probably reading Yellowface next. I also read Exit Interview, which was another thrilling entry in the canon of “fantastically brilliant not-men who work in tech for whom it really should go better but unfortunately and predictably doesn’t”.
  • We saw “Film is dead. Long live film!” at the Roxie. It was enjoyable, reminded me of my big-Cable-Access-TV-energy days, and also gave far too little screen time to hearing from the film collectors’ wives (yes, exactly) and children. I thought this was my first movie theater visit since Covid, but Angelina reminded me we saw the Barbie Movie in a theater.
  • I played Animal Well until the credits rolled, and then I read the spoilers and have been going for completionism. Though I don’t imagine I’ll get there before fully losing interest.
  • I joined the Program Committee for RubyConf in Chicago in November. We’re trying to get the Call for Papers/Speakers released this week. Should be a good one.
  • I started working on the GoodJob major version 4 release. It’s simply doing all the breaking changes that were previously warned about. A deprecation-warning made is debt unpaid. It’s not so bad.
  • With my friend Rob, I started volunteering on the backend of Knock for Democracy. I occasionally see people try to make a Tolstoy-inspired statement like “Healthy applications are all the same, but unhealthy ones are each unhealthy in their own way”. But that’s not true! All the apps that need some TLC have the exact same problems as all the others. Surprise me for once!
  • At work I’m nearly, nearly done with performance feedback and promotion packets and calibrations and all that. It’s the stuff that I truly enjoy as a manager, and also is terrible because of everything that’s outside my control. I also got asked to draw up a Ruby sponsorship budget for the year, which is the most normal administrative thing I think I’ve ever been asked in my (checks watch) 12 years of working in Bay Area tech.
  • I think the days of mobile apps for Day of the Shirt are numbered. I haven’t updated them ever since Apple rejected an update for the olde “Section 4.2: Minimum Functionality” thing, after like 8 freaking years in the App Store already. I did the olde “talk to someone on the phone at Apple who unhelpfully can’t tell me what would be enough functionality.” And so they’ve just been sitting and recently I got an email from a user that it won’t install on their new Android phone. So that sucks. It was a pain (when I was doing it) to develop 3 separate versions of Day of the Shirt: web, iOS, and Android, so maybe this is a sign to probably just commit to the web.
  • A triple recipe of bolognese is too much for my pot to handle.

A comment on Second Systems

I recently left this comment on a Pragmatic Engineer review of Fred Brooks’s The Mythical Man-Month in “What Changed in 50 Years of Computing: Part 2”. This is what I reacted to:

Software design and “the second-system effect”

Brooks covers an interesting phenomenon in Chapter 5: “The Second-System Effect.” He states that architects tend to design their first system well, but they over-engineer the second one, and carry this over-engineering habit on to future systems.

“This second system is the most dangerous system a [person] ever designs. When [they] do this and [their] third and later ones, [their] prior experiences will confirm each other as to the general characteristics of such systems, and their differences will identify those parts of [their] experience that are particular and not generalizable.”

“The general tendency is to over-design the second system, using all the ideas and frills that were sidetracked on the first one.”

I can see this observation making sense at a time when:

  • Designing a system took nearly a year
  • System designs were not circulated, pre-internet
  • Architects were separated from “implementers”

Today, all these assumptions are wrong:

  • Designing systems takes weeks, not years
  • System designs are commonly written down and critiqued by others. We cover more in the article, Engineering Planning with RFCs, Design Documents and ADRs
  • “Hands-off architects” are increasingly rare at startups, scaleups, and Big Tech

As a result, engineers design more than one or two systems in a year and get more feedback, so this “second-system effect” is likely nonexistent.

And this was my comment/reaction:

I think the Second-System Effect is still very present.

I would say it most frequently manifests as a result of not recognizing Gall’s Law: “all complex systems that work evolved from simpler systems that worked.”

What trips people up is usually that they start from a place of “X feature is hard to achieve in the current system,” and then they start designing/architecting for that feature without recognizing all of the other table-stakes necessities and Chesterton’s Fences of the current system, which are only recognized and bolted on late in the implementation, when it is more difficult and complicated.

The phrase “10 years of experience, or 1 year of experience 10 times” comes to mind when thinking of people who only have the experience of implementing a new system once and trivially, and do not have the experience of growing and supporting and maintaining a system they designed over a meaningful lifecycle.

Which also reminds me of a recent callback to a Will Larson review of Kent Beck’s Tidy First about growing software:

I really enjoyed this book, and reading it I flipped between three predominant thoughts:

  • I’ve never seen a team that works this way–do any teams work this way?
  • Most of the ways I’ve seen teams work fall into the “never tidy” camp, which is sort of an implicit belief that software can’t get much better except by replacing it entirely. Which is a bit depressing, if you really think about it
  • Wouldn’t it be inspiring to live in a world where your team believes that software can actually improve without replacing it entirely?