Frontier novelty

From Benj Edwards’s “10 things I learned from burning myself out with AI coding agents”, describing the challenge of maintaining novelty during AI coding:

Due to what might poetically be called “preconceived notions” baked into a coding model’s neural network (more technically, statistical semantic associations), it can be difficult to get AI agents to create truly novel things, even if you carefully spell out what you want.
For example, I spent four days trying to get Claude Code to create an Atari 800 version of my HTML game Violent Checkers, but it had trouble because in the game’s design, the squares on the checkerboard don’t matter beyond their starting positions. No matter how many times I told the agent (and made notes in my Claude project files), it would come back to trying to center the pieces to the squares, snap them within squares, or use the squares as a logical basis of the game’s calculations when they should really just form a background image.

To get around this in the Atari 800 version, I started over and told Claude that I was creating a game with a UFO (instead of a circular checker piece) flying over a field of adjacent squares—never once mentioning the words “checker,” “checkerboard,” or “checkers.” With that approach, I got the results I wanted.

I’ve experienced something similar in my own AI-assisted game design. In my case, it’s my perennial desire to have a SimCity-like city builder that operates on “parcels” rather than tiles or zones and, oh boy, the LLM really chafes at it. Is it weird enough to work?

GoodJob, Solid Queue, Sidekiq, Active Job, in 2026

Hey, I’m Ben, the author of GoodJob. Last year at RailsConf I hosted a panel discussion between myself, Solid Queue’s Rosa Gutiérrez, Sidekiq’s Mike Perham, and Karafka and Shoryuken’s Maciej Mensfeld called “The Past, Present, and Future of Background Jobs”. You can watch that video here.

In this post, I’m writing my personal perspective on how you, dear developer, might decide what background job backend to choose for Rails and Active Job. But everything in context. And oh boy, it’s all context.

A sober look at how technical decisions are actually made

I’ve worked in software engineering a long time and seen a lot of bullshit justifications for… doing stuff.

  • It’s “more modern”, meaning newer, I guess.
  • It has more features, or a unique architecture.
  • Everybody’s doing it. Or maybe, this particular someone is doing it. Or nobody is doing it, except for the successful ones.

All of this has some flavor of: I want to be different than everybody else, but also the same as the people who are succeeding.

There’s an assumption that the generation of knowledge is efficient. Meaning that all possible problems and solutions have been rigorously measured and stress-tested and their upsides and downsides unearthed; that people of equivalent smarts have spent equivalent time running out equivalent unflawed methodologies and sharing the results across equivalent channels. And our work is to place that preexisting knowledge on the platters of a scale and note which way it tips.

Also true: familiarity breeds contempt. The grass is always greener. And despite things being inherently complex, where the goal should be less “work your way out of the job” and more “don’t work yourself into an early grave”, your managers—whether actual people or your own intrusive thoughts—will not accept that.

Alternatively, I know from such evergreen classics as Glass’s Facts and Fallacies of Software Engineering and Weinberg’s The Psychology of Computer Programming that it’s rare to deeply track our own technical and operational bottlenecks, the full business contexts, the skills and resources of the team and the broader talent market, and integrate those details into the decision-making process. Most folks start with the solution they want, and work backwards in their argumentation through whatever technical decision process their engineering organization requires. No one shows their work ’cause there’s no work to show; they read the assignment and the answer just popped out.

But not you, of course.

There’s another dimension to this that I think is really important to note: Decision making for new projects is vastly different than existing projects.

  • With New Projects, the decisions don’t really deeply matter cause nobody knows how it’s going to go and everything is speculative. But also because of bikeshedding and the Law of Triviality those decisions become the most forcefully argued over. It’s really hard for a room of technical people to plainly say they just want to do it one way cause they know it, or try something different for the hell of it and then figure out an effective social compromise instead of debating to death some hardboiled technical justification.
  • With Existing Projects, you have some data and some context and continuity to base your needs and your decisions on. But there is also an inexorable pressure to do something different and you end up picking a solution-at-hand rather than more deeply digging into a problem area where the solutions are more elusive in the sense of “I’m searching for my keys under the streetlight not because I lost them nearby but because the light is better”.

Bringing it back to Rails and Active Job, let’s look at a really big differentiator.

Omakase

Solid Queue is a default gem in Rails. If you run rails new, you get Solid Queue. That’s a strong signal that it will work, and that the Rails team confidently stands behind it as much as anything else in the Rails framework. Solid Queue is Ruby on Rails’ chef’s choice made for you; it’s omakase.

“Eliminate valueless choices” is a powerful principle of the Rails Manifesto. Solid Queue is a good choice, especially if you don’t have to choose. This alone is an incredibly powerful reason to use Solid Queue. You probably chose Rails for the lore. There’s a manifesto. And it’s actually good. Conceptual integrity is important. Make the center hold.

But if you’ve been following the Rails discourse for a while, you’ll know not everyone agrees that omakase is best, or best for them. Lots of people quietly (and loudly!) go their own way. Callbacks are the worst, they comment underneath every single Reddit post. Nobody actually uses ActionMailer. Why isn’t there a Services directory? That pattern encourages methods that are too long! All those Concerns, are you for real? Nobody does it that way anymore. It’s not modern.

More recently 37signals has started releasing source code for several of their applications, and yes, they’re for real and yes, they do still do it that way. But Rails is a big framework (“Batteries included”) and you don’t have to use everything (“No one paradigm”). You can swap something out and remain lore canon.

And really, Solid Queue slots into a real sweet spot. It’s both new and it’s the default. Solid Queue can occupy the unicorn stall that’s both excitingly cutting edge and boringly conservative all at once. This will be a different discussion 5 years from now even if nothing has changed other than the year on the calendar.

But, maybe you do need to order off the menu! And we should dig into that.

Incomparables

Given the infinite rainbow of the human experience, there has to be somebody who does not collapse their categories. I have yet to meet them.

People want a simple decision tree that starts with “Postgres, MySQL, SQLite or Redis?” It’s not so simple.

  • Postgres on Heroku? Digital Ocean? Neon? Fly’s umpteenth attempt at it? Self-managed? How? Co-located?
  • MySQL? MariaDB? Behind Vitess? Planetscale?
  • SQLite? Litestream? Did you apply the good patches? How’s your IO?
  • Redis? HA? How? You didn’t choose “Flex” did you?

And any one of these can have the DBA who swears by that one deep-cut config and won’t budge. And while there is wisdom in sinking engineer-years into completely rearchitecting your system to sidestep a problematic technical peer—oh, I’ve done it—there’s no way you’re gonna write those words in the justification doc. You’re slapping down “more modern” and your junior (or junior-minded) colleague is tweeting “more modern” and HackerNews will cite that a decade from now.

Knowledge is not efficient. You’ll always find another subfloor and several sheer walls of complexity below and beyond whichever one you just pried open. This stuff is not enumerable, so any good decision tree you’ll receive will, on its surface, appear to be a gross and useless simplification despite the experience that went into it.

In conclusion

As we’ve deeply explored together, you find yourself in one of two scenarios regarding background jobs:

  • The backend doesn’t matter. Pick whatever, or pick nothing and use the default. Give whatever nonsense justification you want that serves you best. No shame, all respect if you can recognize this for yourself. You can stop reading now (I know, we love the lore).
  • It does actually matter, a lot. And successfully navigating this will be the crowning achievement of your career.

If you, dear developer, find yourself in the second bucket, here’s my advice.

  • If performance matters, like actually matters, use Sidekiq Enterprise or Karafka. But it’s complicated, you comment. It costs money. No one uses it anymore. It’s not modern. Shush. Sidekiq Enterprise and Karafka are, hands down, the best for performance outside of building it in-house (a legit choice we won’t explore here). You can make it complicated by trying to precisely nail down what is meant by “performance” (throughput, latency, resources, spiky, steady, whatever), but if you use the word superlatively–“most performant”–your choice is made.
  • Otherwise:
    • If you’re using MySQL or SQLite, use Solid Queue. It’s your best option for those databases, and it’s really good in an absolute sense too. And it’s getting even better. It should have GoodJob-esque batches soon. No shade, all love.
    • If you’re using Postgres, use GoodJob. It’s full featured, and those features are implemented simple-like because they use Postgres-only features; there’s no workaround complexity for other databases. Want to break all the rules of encapsulation and fiddle with the job data? It’s one table. FANG uses GoodJob. It’s modern.

Because I’m close to GoodJob (familiarity!) I know the limitations too. You’ll have to go around PgBouncer. One guy profiled Postgres several years ago and pointed out where in the performance stratosphere Advisory Locks fall apart. It’s all true.

And there is a lot of relational-database arcana that I don’t think GoodJob nor Solid Queue have documented that you could do. Like vacuuming your tuples and stuff like that. I don’t do it, but some people swear by it. Every system falls apart if you run it hot or weird for long enough, and relational databases are no different than Redis or Kafka.

I could laundry-list any number of downsides and problems people have had with GoodJob… and you probably won’t experience them. Rosa and Mike and Maciej can tell you the same about their own backends. If and when you do experience something bad, it’s likely you’d experience something else bad on any other job backend and then you’d have to work through the problem and it’s likely that problem is unique to you, either because of your unique setup, or a trivial, personal failure to have simply googled it earlier.

And to solve that problem, you’ll probably do some research. And ideally you have a strong community of peers that think like you think, and work on similar systems to the one you work on, and you talk to them and sometimes help solve their problems too. They won’t have your specific answer, but they’ll have some suggestions about what to look at, what to try, or what to google, and maybe the change you make is to change your Active Job backend, but it probably isn’t.

So the takeaway here is that today in 2026 there are a small number of really, really good Active Job backends to choose from, and the most important consideration is that you make friends along the way.

Tricks to work around nested form elements, for Rails

I recently migrated GoodJob’s Dashboard from Rails UJS (“Unobtrusive Javascript”) to Turbo. One of the most significant changes for me in moving from UJS to Turbo is that UJS’s data-method is not functionally replaceable with Turbo’s data-turbo-method. This -method attribute allows you to make a button or link act like a form submission without using a <form> element.

I learned some stuff, but first let’s back up even further.

HTML <form>s are hard

There are three practical things, and one conceptual, that are going to challenge us here:

You cannot nest a <form> element within another <form> element. When rendering the page, the browser will remove it or ignore it; regardless, it won’t work. Doubly annoying because Chrome’s DevTools will simply remove the element; you have to use view-source to witness your mistake.

Some designs would really benefit from a form nested in another form. For example: You have a screen with a bunch of inputs to update a record, and you want to put a “Destroy” button visually adjacent to the “Update” button. Or, for example: you are displaying a list of records that is wrapped entirely in one big form element so that each item can be checked/unchecked to apply actions upon multiple records and you want to be able to have buttons to perform actions on records individually in the list too. Remember, you always want to put destructive or mutating actions behind a POST button rather than a GET link. It can be tricky. Here’s an example of a design challenge in the GoodJob Dashboard:

A screenshot from the GoodJob dashboard

HTML forms only support two HTTP methods: GET and POST. All the others (PUT, PATCH, DELETE) are valid HTTP methods, but you can’t use them in an HTML form. Rails works around this with its form helpers by adding a hidden input named _method that puts the unsupported method in the form-data, or by using Javascript fetch or XMLHttpRequest.
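Rack ships a real middleware for this convention (Rack::MethodOverride); the toy function below only sketches the idea and is not Rails’ or Rack’s actual code: if a POST body carries _method, treat the request as that verb.

```ruby
require "uri"

# Toy sketch of the _method override convention (not Rack's real
# implementation): a POST whose body carries _method is treated as
# that HTTP verb instead.
def effective_method(request_method, body)
  return request_method unless request_method == "POST"

  override = URI.decode_www_form(body.to_s).to_h["_method"]
  override ? override.upcase : "POST"
end

effective_method("POST", "_method=delete&id=42") # => "DELETE"
effective_method("GET",  "_method=delete")       # => "GET"
```

Note that the override only applies to POSTs; a GET carrying _method stays a GET, which is also how the real middleware behaves.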

You need a CSRF token. The form payload must contain the CSRF token, which Rails form helpers include as a hidden form element. If the payload doesn’t have a CSRF token, Rails will reject it. The CSRF token really is important; please don’t disable it. Maybe that’ll go away someday, but not today.

Conceptually, I want to leave you with this: Despite Ruby on Rails being a very good monolithic framework, we must clearly distinguish the HTML Documents and HTTP payloads from the Ruby bits and helpers. In this case, we’ll be focusing more on:

  • The actual HTML document that the browser will render… over the ERB and Ruby Helpers that can make it easier to author that document.
  • The precise HTTP request and form-data payload that a form or javascript produces to send back to the server… over the Form Helpers and superficial Turbo interface.
  • How that HTTP request and form-data payload is processed by the Rails and Rack-middleware stack… over the pretty little Ruby hashes and objects you’ll access inside your Controller Actions.

The intent here is not to throw shade at Ruby on Rails. I want to break through what I have sometimes seen from developers who treat Rails like a native app SDK that paints UI on their screen and tightly responds to user input through OS interrupts or something. I want to elevate the HTML Document and HTTP requests that are buzzing back and forth and the limitations and opportunities therein.

Throw out your UJS data-method

As I mentioned in the intro, migrating from UJS to Turbo has some changes. In UJS, the simplest thing to do was use data-method and data-params for anything extra:

<a href="/posts/draft" data-method="put" data-params="something=extra" class="button">Set Draft</a>

…and that would generate a form-like HTTP request with all the bits: the _method, your CSRF token, and the extra data attributes:

POST /posts/draft
_method=put
authenticity_token=alsdkfjasldkjfasdljkf
something=extra

But it seems like Turbo is giving up on data-turbo-method, and Turbo never implemented a data-params equivalent. So we have to do something different.

Alternatives for your consideration

Depending on what you need to change about the payload, you have some options:

If you only want to know if someone clicked Button B instead of Button A, then use a different commit value for the button; commit is just the arbitrary name the Rails framework chose to put in the form-data:

<%= form_with model: @post do |f| %>
  <%= f.button "Publish", name: "commit", value: "publish" %>
  <%= f.button "Save Draft", name: "commit", value: "draft" %>
<% end %>

…and then do your controller logic based on the params[:commit] value. Buttons can have any single name and value and it goes into the form-data.
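A framework-free sketch of that controller-side branching; the action names (:publish, :save_draft) are illustrative and not from the original post, and the symbol keys stand in for Rails’ indifferent params:

```ruby
# Pick a behavior based on which button's commit value arrived in the
# form-data. In a real Rails controller this would live in an action and
# call real model methods; here the return values just name the branch.
def action_for(params)
  case params[:commit]
  when "publish" then :publish
  when "draft"   then :save_draft
  else                :plain_update
  end
end

action_for(commit: "publish") # => :publish
action_for(commit: "draft")   # => :save_draft
```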

We can overload this if we only need to change the method. E.g. if we want to have our Delete button alongside, we can use our knowledge of the magic _method overloading to set the method to be treated like a DELETE.

<%= form_with model: @post do |f| %>
  <%= f.button "Update" %>
  <%= f.button "Delete", name: "_method", value: "delete" %>
<% end %>

This produces an HTTP payload that looks like what we want, though it includes all the other form inputs:

POST /posts/42
_method=put
authenticity_token=alsdkfjasldkjfasdljkf
title=What was in the form
body=Anything else in the form
_method=delete

This works perfectly fine because the form URL for both the update and destroy actions is the same, and it’s only the _method value that’s different, and who cares about some extra form-data we won’t use.

But what about the duplicate _method keys? Rails (or maybe Rack), when it comes to duplicate keys, will expose only the last value in the params hash. Only if you name the key with square brackets, like foo[], does it become an array of values. This isn’t HTTP, but rather a convention of Rails/Rack for parsing form-data into Ruby data objects.
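Here’s a toy re-implementation of that convention, just to make the behavior concrete (Rack’s real parser is Rack::Utils.parse_nested_query; this sketch is not its actual code):

```ruby
require "uri"

# Plain duplicate keys keep only the last value, while "foo[]" keys
# accumulate into an array -- the Rails/Rack form-data convention.
def parse_form_data(body)
  URI.decode_www_form(body).each_with_object({}) do |(key, value), params|
    if key.end_with?("[]")
      (params[key.delete_suffix("[]")] ||= []) << value # arrays accumulate
    else
      params[key] = value # later duplicates overwrite earlier ones
    end
  end
end

parse_form_data("_method=put&_method=delete") # => {"_method"=>"delete"}
parse_form_data("ids[]=1&ids[]=2")            # => {"ids"=>["1", "2"]}
```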

If we want a different URL, we can use HTML’s formaction= attribute to change where the Form is posted:

<%= f.button "Fancy Delete", name: "_method", value: "delete", formaction: fancy_post_path(@post) %>

…which leads to HTML that looks like:

<form action="/posts/42" method="POST">
  <input type="hidden" name="_method" value="put">
  
  <button>Save</button> <!-- the regular form button -->
  <button name="_method" value="delete" formaction="/posts/42/fancy">
    Fancy Delete
  </button>
</form>

Pretty sweet!

One more tool in our toolbox: In addition to formaction which changes the payload target URL, we can redirect our button to an entirely different form using the form= attribute on the button and targeting the other form by id:

<form action="/posts/42" method="POST">
  <input type="hidden" name="_method" value="put">

  <button>Save</button> <!-- the regular form button -->
  <button name="commit" value="delete" form="fancy_delete_form">
    Fancy Delete
  </button>
</form>

<form id="fancy_delete_form" action="/posts/42/fancy" method="POST">
  <input type="hidden" name="_method" value="put">
  <input type="hidden" name="foo" value="bar">
</form>

In this case, we can include several more values in our payload in the external form. The downside here is that the HTTP payload won’t include any of the data from the primary form, only the inputs within the targeted form=.

Almost but not quite equivalent

We can cover these scenarios:

  • Want to change a method or single value or single URL: add some attributes to the button.
  • Want to submit multiple different values to a different place: use the form= attribute.
  • Want to submit both the original form’s data and multiple other values…. I don’t know how to do that easily. With UJS you could slot those key-values into data-params , but Turbo doesn’t have an equivalent. I encounter this very very rarely. I guess… put them in the query parameters?… Stimulus?

Conflicted and commingled

More than a decade ago, I was seated on the jury of a civil trial for “complex litigation”. I’ll try to keep this quick, but the case does come to mind more frequently than I would have imagined at the time.

In this trial, the plaintiff, a pharmaceutical company, was suing the defendant, a chemistry professor, for fraud. The chemistry professor, as part of his day job at a university, would create a bunch of novel molecules (put a carbon there, or an extra hydrogen here) that the university would test for various interesting bio-medical properties, and then license to pharmaceutical companies for commercialization.

In this specific instance, the pharmaceutical company licensed a molecule from the university, and then turned around and invited the chemistry professor to join their scientific advisory board and gave the professor some company stock/equity. Fast forward several years and that molecule is now a very valuable drug. And the university also licensed a different molecule to a different company that the chemistry professor also had some equity in.

The alleged fraud was that the chemistry professor didn’t tell that first pharmaceutical company about the other molecule or the other pharmaceutical company… and should have? That was why it went to litigation.

The big picture wasn’t why it stuck in my mind so much as what we did during the six week jury trial: we read a lot of really, really old emails that the plaintiffs would paint as evidence of ill-intent and the defense would explain that nothing was very serious in the first place. I’m making the following exchange up, but thematically it was this over and over again:

Plaintiffs: You must not have been taking your duties very seriously when you replied: “I hope the wine is good winkie-face.”

Defense: Can you read the whole exchange?

Plaintiffs: This was in reply to the scientific advisory board chair writing “The meeting will be brief and then we’ll have dinner together at [nice restaurant]”

Defense: Now this very serious advisory board that you allege our client was taking so unseriously several years ago that you are suing him for fraud today. Was there any written agenda or minutes or notes from those meetings?

Plaintiffs: No.

Lots and lots of that: some personal correspondence recontextualized in a trial against a background of potentially billions of pharmaceutical dollars at stake. Of what I learned:

  • Poe’s Law is everywhere: don’t be cute anywhere it might turn into discovery.
  • Conflicts of Interest and commingling of personal and employer and contractual stuff can cause lots of problems. And also that creating conflicts of interest is a strategy for muddying the water.
  • Complexity causes problems.

On that last point, during the trial there was an entire plaintiff subplot of “but maybe actually it’s the same molecule.” The plaintiffs spent a whole day at least explaining benzene rings. Which was all of a theme to make the chemistry professor appear extra deceptive, like “it’s just one more carbon atom, how dumb do you think we are?!” And then at the end after several objections the judge was like “I’ve already ruled that they’re different molecules and the university owns them and can license them to whoever they want.” Any potential for confusion seemed to be taken advantage of in arguments to set a tone.

Anyways, I’ve been telling this story a lot to people in the context of Ruby Central / Rubygems drama. Usually with the explainer of “totally random, but did I ever tell you about the really long jury trial I was part of? It was a lot.”

Rails 103 Early Hints could be better, maybe doesn’t matter

I recently went on a brief deep dive into 103 Early Hints because I looked at a Shakapacker PR for adding 103 Early Hints support. Here’s what I learned.

Briefly, 103 Early Hints is a status code for an HTTP response that happens before a regular HTTP response with content like HTML. The frontrunning response hints to the browser what additional assets (javascript, css) the browser will have to load when it renders the subsequent HTTP response with all the content. The idea being that the browser could load those resources while waiting for the full content response to be transmitted, and thus load and render the complete page with all its assets faster overall.

If you look at a response that includes 103 Early Hints, it looks like 2 responses:

HTTP/2 103
link: </application.css>; as=style; rel=preload,</application.js>; as=script; rel=modulepreload

HTTP/2 200
date: Fri, 17 Oct 2025 15:07:24 GMT
content-type: text/html; charset=utf-8
link: </application.css>; as=style; rel=preload,</application.js>; as=script; rel=modulepreload

<html> 
... the content

I keep writing “103 Early Hints” because Early Hints, the status-code response (103), gets confused with the Link header of a content response, which serves the same purpose (hinting what assets will need to be loaded) and has near-identical content: the 103 Early Hints header is usually the same Link value that the actual-content response header has. Because of this conceptual collision, it’s tough to google for and there are various confused StackOverflow responses.

Eileen Uchitelle built out the original implementation in Rails. It’s good. It can be better. It also maybe doesn’t matter. I’ll tell you how and why.

It can be better

There are two ways that the Rails implementation of 103 Early Hints can be better:

  1. There should only be one 103 Early Hints response.
  2. The 103 Early Hints response should be emitted in a before_action instead of near the tail-end of the response.

There should only be one 103 Early Hint response. According to the RFC, there can be multiple 103 responses, but according to the Browsers, they only look at the first 103 response.

A server might send multiple 103 responses, for example, following a redirect. Browsers only process the first early hints response, and this response must be discarded if the request results in a cross-origin redirect. — MDN

Chrome ignores the second and following Early Hints responses. Chrome only handles the first Early Hints response so that Chrome doesn’t apply inconsistent security policies (e.g. Content-Security-Policy). — Chromium Docs

Rails emits a 103 Early Hint response each and every time your application calls javascript_include_tag, stylesheet_link_tag, or preload_link_tag.

Instead, it would be better if the application could accumulate multiple asset links and then flush them to a single 103 Early Hint response all together.

Aside: it’s really, really cool how 103 Early Hints responses in Rack/Puma/Rails are emitted in the middle of handling a response. The webserver puts a lambda/callable into the Rack Environment, and then the application calls that lambda with the contents of the 103 Early Hints response, and that causes the webserver to write the content to the socket. Here’s how it’s done in Puma, in pseudocode:

# In the Puma webserver
request.env["rack.early_hints"] = lambda do |early_hints_str|
  fast_write_str socket, "HTTP/1.1 103 Early Hints\r\n#{early_hints_str}\r\n" 
end

# In the application
request.env["rack.early_hints"]&.call("link: </application.css>; as=style; rel=preload,</application.js>; as=script; rel=modulepreload")

The 103 Early Hint response should be emitted in a before_action instead of near the tail-end of the response. As mentioned, the 103 Early Hint response gets triggered when using javascript_include_tag, stylesheet_link_tag, or preload_link_tag. Those usually are used in a Rails Layout erb file.

In Rails, Layouts get rendered last, after the view is rendered, which means that 103 Early Hints get emitted when the response is almost done being constructed: after the controller action, after the database queries, after most of the HTML has been rendered to a string.

Instead, it would be better if the 103 Early Hints response were emitted in a before_action, before any slow database queries or view rendering happen. The purpose of the 103 Early Hint is to be early. I’ve done this myself, manually constructing the links and flushing them through request.send_early_hints; it’s not difficult, but it would be nice if it were easier.
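Here’s a runnable sketch of that manual approach, with a made-up FakeRequest standing in for ActionDispatch::Request (send_early_hints is the real Rails method name; everything else here is illustrative scaffolding):

```ruby
# FakeRequest mimics the shape of the real request object: it forwards
# the links to the webserver's rack.early_hints callable, if one exists.
class FakeRequest
  def initialize(env)
    @env = env
  end

  def send_early_hints(links)
    @env["rack.early_hints"]&.call(links)
  end
end

flushed = []
request = FakeRequest.new("rack.early_hints" => ->(links) { flushed << links })

# What a before_action could do: flush all the asset hints once, up front,
# before any database queries or view rendering.
request.send_early_hints(
  "link" => "</application.css>; rel=preload; as=style,</application.js>; rel=modulepreload; as=script"
)

flushed.length # => 1
```

In a real app, the before_action would build those link strings from your asset helpers rather than hardcoding them.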

It maybe doesn’t matter

I can’t actually get 103 Early Hints to be returned all the way to me in any of my production environments. Likely because there is a network device, reverse proxy, load balancer, CDN, or something that’s blocking them.

  • 👎 Heroku with Router 2.0 and custom domain
  • 👎 Heroku behind Cloudfront
  • 👎 Digital Ocean App Platform behind Cloudflare
  • 👎 AWS ECS+Fargate behind an ALB (this one actually breaks the website: HTTP/2 stream 1 was not closed cleanly)

I can see them working locally, using Puma or Puma behind Thruster, but in production… nada. Obviously this isn’t a comprehensive list of production environments, but they’re the ones I am using.

If you want to see them locally:


# Run Puma with early hints. Or use `early_hints` DSL directive in puma.rb
$ bin/rails s --early-hints

# Make a request, this works locally or against a production target
$ curl -s -k -v --http2 localhost:3000 2>&1 | grep -A 5 -E '103 Early Hints|HTTP/2 103'

< HTTP/1.1 103 Early Hints
< link: </assets/application-316caf93b23ca4756d151eaa97d8122c7173f8bdfea91203603e56621193c19e.css>; rel=preload; as=style; nopush
<
< HTTP/1.1 103 Early Hints
< link: </vite-dev/assets/application-WvRi4PrU.js>; rel=modulepreload; as=script; crossorigin=anonymous; nopush
<
< HTTP/1.1 103 Early Hints
< Link: </vite-dev/assets/index-ilXdZXkf.js>; rel=modulepreload; as=script; crossorigin=anonymous
<

And if you want to see 103 Early Hints… anywhere… good luck! I have yet to find an example of a website that serves them.

# Basecamp
$ curl -s -k -v --http2 https://basecamp.com 2>&1 | grep -A 5 -E '103 Early Hints|HTTP/2 103'
# nothing

# GitHub
$ curl -s -k -v --http2 https://github.com 2>&1 | grep -A 5 -E '103 Early Hints|HTTP/2 103'
# nothing

# Shopify
$ curl -s -k -v --http2 https://www.shopify.com 2>&1 | grep -A 5 -E '103 Early Hints|HTTP/2 103'
# nothing

# Google
$ curl -s -k -v --http2 https://www.google.com 2>&1 | grep -A 5 -E '103 Early Hints|HTTP/2 103'
# nothing

# Someone's tester for 103 Early Hints
$ curl -s -k -v --http2 https://code103.hotmann.de 2>&1 | grep -A 5 -E '103 Early Hints|HTTP/2 103'
< HTTP/2 103
< link: </app.min.css>; as=style; rel=preload
<
# ... ok, that returns something

Hanami and loading code, faster

I’ll be giving a talk in November at the SF Ruby Conference (tickets on sale now!). My talk is about speeding up your application’s development cycle by taking a critical eye at your application’s development boot, which all boils down to: do less. In Ruby, the easiest, though not the simplest, is to load less code. So yeah, autoloading.

To expand my horizons and hopefully give a better talk, I branched out beyond my experience with Ruby on Rails to talk to Tim Riley about Hanami and how it handles code loading during development.

The following are my notes; it’s not a critical review of Hanami, and it only looks into a very narrow topic: code loading and development performance.

Ruby, and analogously Rails

Ruby has a global namespace; constants (classes, modules, CONSTANTS) are global singletons. When your code (or some code you’re loading—Ruby calls each file it loads a “feature” identified by its filesystem path) defines a constant, Ruby is evaluating everything about the constant: the class body, class attributes, basically anything that isn’t in a block or a method definition. And so any constants that are referenced in the code also need to be loaded and evaluated, and class ancestors, and their code and so forth. That’s the main reason booting an application is slow: doing stuff just to load the code that defines all the constants so the program can run.

The name of the game in development, where you want to run a single test or browse a single route or open the CLI, is load less. If you can just avoid loading the constant, you can avoid loading the file the constant is defined in, and avoid loading all of its other dependencies and references until later, when you really need them (or never, in development).
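Ruby ships a primitive for exactly this deferral: autoload, which fancier autoloaders (like Rails’ Zeitwerk) build upon. A runnable sketch, where MyService and its file are made up for the example:

```ruby
require "tmpdir"

# Write an example class definition to a temporary file so the sketch is
# self-contained; in a real app this file would live in your source tree.
dir = Dir.mktmpdir
File.write(File.join(dir, "my_service.rb"), <<~RUBY)
  class MyService
    def self.call
      "did the thing"
    end
  end
RUBY

# Registers the constant-to-file mapping; the file is not evaluated yet.
autoload :MyService, File.join(dir, "my_service.rb")

# The first reference loads and evaluates the file, then runs the method.
MyService.call # => "did the thing"
```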

The most common strategy for deferring stuff is: use a string as a stand-in for the constant, and only later, when you really need the constant, convert the string. An example is Rails routes, where you’ll write to: "mycontroller#index" and not MyController. At some point mycontroller gets constantized to MyController, but that’s later, when you hit that particular route. Another example is Active Record association definitions, where you’ll use class_name: "MyModel" instead of class_name: MyModel, and the string only gets constantized when you call record.my_models.
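In plain Ruby, that string-to-constant conversion looks something like this (Object.const_get is roughly what Rails’ String#constantize does under the hood):

```ruby
# A string stands in for the constant, so nothing needs to be loaded yet.
handler = "MyController"

# Later, the file defining the constant gets loaded (in a real app,
# Zeitwerk would autoload it on first reference):
class MyController
  def index = "rendered index"
end

# Only when the route is actually hit does the string become a constant.
klass = Object.const_get(handler)
puts klass.new.index # => rendered index
```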

In Rails, a lot of development-performance repair work is identifying places where a constant shouldn’t be directly referenced and should instead use some stand-in until it’s really needed. It can be confusing, because sometimes you can use a configuration string to refer to a constant and sometimes you must use the constant itself; it’s inconsistent.

In Hanami, (nearly) everything has a string key

Hanami’s approach: make all the application components referenceable by a string, called a key (Hanami does quite a bit more than this; I just mean in regards to code loading). Objects declare which keys they depend upon, and the framework injects those dependencies. So instead of writing this:

class MyClass
  cattr_accessor :api_client
  self.api_client = ApiClient.new # <-- loads that constant

  def transmit_something
    MyClass.api_client.transmit("something")
  end
end

…you would instead use Hanami’s Deps and write:

class MyClass
  include Deps["api_client"] # <-- injects the object

  def transmit_something
    api_client.transmit("something")
  end
end

Keys are global, and keys whose objects have been loaded live in Hanami.app.keys. If the key’s object hasn’t been loaded yet, it will be converted from a string to… whatever (not just constants)… when it’s needed to execute. Individual objects can be accessed with Hanami.app["thekey"] when debugging, but normal code should get them injected from Deps. By convention, keys match a class name, but they don’t have to. This is powered by dry-system.
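A toy sketch of the idea (not Hanami’s or dry-system’s actual implementation): a container maps string keys to builder blocks, and a block only runs, loading whatever code it loads, the first time its key is resolved:

```ruby
# Toy container: string keys map to builder blocks; a block runs only on
# the first resolution of its key, and the result is memoized.
class ToyContainer
  def initialize
    @registry = {}
  end

  def register(key, &block)
    @registry[key] = { block: block, instance: nil }
  end

  def [](key)
    entry = @registry.fetch(key)
    entry[:instance] ||= entry[:block].call
  end
end

container = ToyContainer.new
container.register("api_client") do
  puts "building api_client" # in a real app, this is where code gets loaded
  Object.new
end

client = container["api_client"] # built on first access...
same = container["api_client"]   # ...and memoized afterwards
puts client.equal?(same) # => true
```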

Not everything has to have a key. Functional components in Hanami have a key, but classes that embody a bit of data (in Hanami these are called Structs) do not have entries in the app container, and therefore don’t have keys.

If you have something functional coming from outside Hanami, like the ApiClient in the code above, or from a non-Hanami-specific gem or wherever, you can give it a key and define its lifecycle within the application via a Provider.

Briefly, some commentary: common Rails development discourse includes “Rails is too magic”, leveled because the Rails framework can work out which constants you mean without your directly referencing them (e.g. has_many :comments implies there’s an Active Record Comment), and “just use a PORO” (plain old Ruby object), offered when a developer is painfully jamming everything into narrow Rails framework primitives. With Hanami:

  • Hanami has quite a bit of “here’s a string, now it’s an object 🪄”, but it is applied consistently everywhere and has some nice benefits beyond brevity, like overriding dependencies.
  • Everything does sorta have to be fit into the framework, but there’s an explicit interface for doing so.

Assorted notes in this general theme

  • Providers are like “Rails initializers but with more juice” – they register components in the container. They have lifecycle hooks (prepare, start, stop) for managing resources. They’re lazily loaded and can have namespace capabilities for organizing related components.
  • Hanami encourages namespacing over Rails’ flat structure. “Slices” provide first-class support for modularizing applications like Rails Engines. Each slice has its own container and can have its own providers, creating bounded contexts.
  • Hanami uses Zeitwerk for code loading.
  • The dev server uses Guard to restart Puma in development. Because everything is so modularized, that’s good enough.
  • Code is lazy-loaded in development but fully pre-loaded in production.

Where things are going

In the Hanami Discord, Tim shared a proposal for building out a plugin system for Hanami… and to me it looks a lot like Railties and ActiveSupport lazy load hooks:

Using your grant, I propose to implement this Hanami extensions API. The end goal will be to:

  • Allow all first-party “framework extension code” to move from the core Hanami gem back into the respective Hanami subsystem gems (e.g. the core Hanami gem should no longer have specific extension logic for views).
  • Allow third-party gems to integrate with Hanami on an equal footing to the first-party gems.

This will require building at least some of the following:

  • Ability for extensions to be detected by or registered with the Hanami framework.
  • Ability to enhance or replace Hanami CLI commands.
  • Ability to register new configuration settings on the Hanami app.
  • Hooks for extending core Hanami classes.
  • Hooks for adding logic to Hanami’s app boot process.
  • Adjustments to first-party Hanami gems to allow their classes to be used in an un-extended state when required.
  • A separate “extension” gem that can allow Hanami extensions to register their extensions without depending on the main Hanami gem.

And how this all started

Ending on what I originally shared with Tim to start our discussion, which I share partly cause I think it’s funny how easily I can type out 500 words today on a thesis like “why code loading in Ruby is hard”:

Making boot fast; don’t load the code unless you need it

Don’t load code until/unless you need it. DEFINITELY don’t create database connections or make any HTTP calls or invoke other services. How Rails does it: Rails autoloads as much as possible (framework, plugin/extension, and application code), either via Ruby’s autoload or Zeitwerk. The architecture challenge is: how do you set up configuration properties so that when the code is loaded (and all the different pieces of framework/plugin/extension/application get their fingers on it), it is configured with the properties y’all ultimately want on it? There are two mechanisms:

  • A configuration hash, intended to be made up (somewhat) of primitives that are dependency-free and thus don’t load a bunch of code themselves.
  • A callback hook placed within autoloaded code, that one can register against and use to pull data out of configuration (framework/plugin/extension) or override/overload behavior (your application), and that is only triggered when the code is loaded for real. Extensions put this in a Railtie; maybe you put it in an initializer.

The practical problems are:

  • Ideally everything would be stateless, just pull values from configuration, and get torn down after every request/transaction/task, but also:
    • Some objects are long-lived, and you don’t want to constantly be tearing them down.
    • Sometimes locality of properties is nice, and it would be annoying to be like “either use this locally assigned value OR use this value from really far away in this super deep config object”.
    • Hopefully that config object is thread- and fiber-safe if you’re gonna be changing it later and you’re not really sure what’s happening right then in your application lifecycle.
  • A hook doesn’t exist in the place that you want to hook into, so you either have to:
    • go upstream and get a hook added, which is annoying (just hook every class and feature, why not?!), or
    • load the code prematurely so you can directly modify it.
  • Something else (framework/plugin/extension/application) prematurely loads the code (chaotically or intentionally), before you add your own configuration or before you register a hook callback, and the behavior is stateful or has to be backed out (example: it’s configuration for connections in a connection pool, and early invocation fills the pool with connection objects carrying premature configuration; to re-configure you have to drain the pool of the old prematurely configured connections, and maybe that’s hard).
  • Examples of pain:
    • Devise.
      • Its route helper (devise_for) loads your Active Record model when routes load, which in < Rails 8.0 was when your app boots, which is premature otherwise.
      • Changing the layout of Devise controllers. They don’t have load hooks (maybe they should?). You can subclass them and manually mount them in your app, but that’s annoying.
    • Every initializer where you try to assign config and maybe it won’t work cause something else already hooked it and loaded it and it’s baked.
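The callback-hook mechanism described above can be sketched in plain Ruby. This is a toy version of the ActiveSupport lazy-load-hook pattern (ActiveSupport.on_load is the real thing; this is not its implementation):

```ruby
# Toy lazy load hooks: register blocks early, run them when the class loads.
module LazyHooks
  @hooks = Hash.new { |hash, key| hash[key] = [] }

  class << self
    # Register a block at boot, before the target class is loaded.
    def on_load(name, &block)
      @hooks[name] << block
    end

    # Run the registered blocks once the class is finally evaluated.
    def run_load_hooks(name, base)
      @hooks[name].each { |block| base.class_eval(&block) }
    end
  end
end

# At boot, in an initializer, long before MyModel exists:
LazyHooks.on_load(:my_model) do
  def shout = "LOADED!"
end

# Much later, when the file defining the class is finally loaded:
class MyModel; end
LazyHooks.run_load_hooks(:my_model, MyModel)

puts MyModel.new.shout # => LOADED!
```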

How Hanami does it:

@inouire in the Rails Discord shared a couple of links about Hanami’s way of handling this:

  • Dependency container and components: https://guides.hanamirb.org/v2.2/app/container-and-components/
  • Autoloading: https://guides.hanamirb.org/v2.2/app/autoloading/
  • Lazy booting: https://guides.hanamirb.org/v2.2/app/booting/

Hanami questions from Ben:

  • Components are singletons that are pure-ish functions? Do they get torn down / recreated on every request, or does the same object exist for the lifetime of the application?
  • Is there a pattern of assigning properties to class variables? Seems like most stuff is pure-ish functions. How do you handle objects that you want to be long-lived, like Twitter::Client.new or something?
  • I didn’t see plugins/extensions. Are you required to subclass and overload a component, or can you poke around in an existing class/component? Can I defer poking around in a component until it’s loaded? (like an autoload hook)
  • Are there any patterns you see people do, that would slow down their Hanami app’s boot, that you wish they didn’t do?

Serializing ViewComponent for Active Job and Turbo Broadcast Later

I recently started using ViewComponent. I’ve been gradually removing non-omakase libraries from my Rails applications over the past decade, but ViewComponent is alright. I was strongly motivated by Boring Rails’ “Hotwire components that refresh themselves”, cause matching up all the dom ids and stream targets between views/partials and… wherever you put your Stream and Broadcast renderers is a pain.

You might also know me as the GoodJob author. So of course I wanted to have my Hotwire components refresh themselves later and move stream broadcast rendering into a background job. I want to simply call MessagesComponent.add_message(message) and have it broadcast an update later to the correct stream and target, which are all nicely and compactly stored inside of the View Component:

class MessagesComponent < ApplicationComponent
  def self.add_message(message)
    user = message.user
    Turbo::StreamsChannel.broadcast_action_later_to(
      user, :message_list,
      action: :append,
      target: ActionView::RecordIdentifier.dom_id(user, :messages),
      renderable: MessageComponent.serializable(message: message), # <- that right there
      layout: false
    )
  end

  def initialize(user:, messages:)
    @user = user
    @messages = messages
  end

  erb_template <<~HTML
    <%= helpers.turbo_stream_from @user, :message_list %>
    <div id="<%= dom_id(@user, :messages) %>">
      <%= render MessageComponent.with_collection @messages %>
    </div>
  HTML
end

That’s a simple example.

Making a renderable work later

The ViewComponent team can be really proud of adding first-class support to Rails for a library like ViewComponent. Rails already supported views and partials, and now it also supports any object that quacks like a renderable.

For ViewComponent to be compatible with Turbo broadcasting later, those View Components need to be serializable by Active Job. That’s because Turbo Rails’ broadcast_*_later_to takes the arguments it was passed and serializes them into a job so they can be run elsewhere better/faster/stronger.

To serialize a ViewComponent, we need to collect its initialization arguments so that we can reconstitute it in that elsewhere place where the job is executed and the ViewComponent is re-initialized. To initialize a ViewComponent, you call new, which calls its initialize method. To patch into that, there are a couple of different strategies I thought of taking:

  • Make the developer figure out which properties of an existing ViewComponent (ivars, attributes) should be grabbed and how to do that.
  • Prepend a module in front of ViewComponent#initialize. Our module would always have to be at the top of the ancestors hierarchy, because subclasses might overload initialize themselves, so we’d need an inherited callback that prepends the module (again) every time that happens.
  • Simply initialize the ViewComponent via another, more easily interceptable method, when you want it to be serializable.
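The second strategy might look something like this toy sketch (hypothetical code, not ViewComponent’s):

```ruby
# Capture initialize arguments so they could later be serialized.
module CaptureInit
  def initialize(*args, **kwargs)
    @captured_args = [args, kwargs]
    super
  end
end

class Base
  prepend CaptureInit

  # Re-prepend on every subclass so CaptureInit stays ahead of any
  # subclass-defined initialize in the ancestor chain.
  def self.inherited(subclass)
    super
    subclass.prepend(CaptureInit)
  end
end

class Child < Base
  def initialize(name:)
    @name = name
  end
end

c = Child.new(name: "x")
puts c.instance_variable_get(:@captured_args).last[:name] # => x
```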

I respect that ViewComponent really wanted a ViewComponent to be just like any other Ruby object that you create with new and initialize , but it makes this particular goal, serialization, rather difficult. You can maybe see the ViewComponent maintainers ran into a few problems with initialization themselves: a collection of ViewComponents can optionally have each member initialized with an iteration number, but to do that ViewComponent has to introspect the initialize parameters to determine if the object implements the iteration parameter to decide whether to send it 🫠 That parameter introspection also means that we can’t simply prepend a redefined generic initialize(*args, **kwargs) because that would break the collection feature. Not great 💛

So, given the compromises I’m willing to make between ergonomics and complexity and performance, given my abilities, and my experience, and what I know at this time… I decided to simply make a new initializing class method, named serializable. If I want my ViewComponent to be serializable, I initialize it with MyComponent.serializable(foo, bar:).

# frozen_string_literal: true
# config/initializers/view_component.rb
#
# Instantiate a ViewComponents that is (optionally) serializable by Active Job
# but otherwise behaves like a normal ViewComponent. This allows it to be passed
# as a renderable into `broadcast_action_later_to`.
#
# To use, include the `ViewComponent::Serializable` concern:
#
#  class ApplicationComponent < ViewComponent::Base
#    include ViewComponent::Serializable
#  end
#
# And then call `serializable` instead of `new` when instantiating:
#
#   Turbo::StreamsChannel.broadcast_action_later_to(
#     :admin, client, :messages,
#     action: :update,
#     target: ActionView::RecordIdentifier.dom_id(client, :messages),
#     renderable: MessageComponent.serializable(message: message)
#   )
#
module ViewComponent
  module Serializable
    extend ActiveSupport::Concern

    included do
      attr_reader :serializable_args
    end

    class_methods do
      def serializable(*args)
        new(*args).tap do |instance|
          instance.instance_variable_set(:@serializable_args, args)
        end
      end
      ruby2_keywords(:serializable)
    end
  end
end

class ViewComponentSerializer < ActiveJob::Serializers::ObjectSerializer
  def serialize?(argument)
    argument.is_a?(ViewComponent::Base) && argument.respond_to?(:serializable_args)
  end

  def serialize(view_component)
    super(
      "component" => view_component.class.name,
      "arguments" => ActiveJob::Arguments.serialize(view_component.serializable_args),
    )
  end

  def deserialize(hash)
    hash["component"].safe_constantize&.new(*ActiveJob::Arguments.deserialize(hash["arguments"]))
  end

  ActiveJob::Serializers.add_serializers(self)
end

Real talk: I haven’t packaged this into a gem. I didn’t want to maintain it for everyone, and there’s some View Component features (like collections) it doesn’t handle yet because I haven’t used them (yet). I think this sort of thing is first class behavior for the current state of Rails and Active Job and Turbo, and I’d rather the library maintainers figure out what the best balance of ergonomics, complexity, and performance is for them. I’ve been gently poking them about it in their Slack; they’re great and I believe we can arrive at something even better than this patch I’m running with myself for now 💖

Notes from building a “who is doing what right now on our website?” presence feature with Action Cable

A screenshot of my application with little presence indicators decorating content

I recently was heads down building a “presence” feature for the case and communications management part of my startup’s admin dashboard. The idea being that our internal staff can see what their colleagues are working on, better collaborate together as a team with overlapping responsibilities, and reduce duplicative work.

The following is more my notes than a cohesive narrative. But maybe you’ll get something out of it.

Big props

In building this feature, I got a lot of value from:

  • Basecamp’s Campfire app, recently open sourced, which has a sorta similar feature.
  • Rob Race’s Developer Notes about building a Presence Feature
  • AI slop, largely JetBrains’ Junie agent. Not because it contributed code to the final feature, but because I had the agent try to implement it from scratch 3 different times, and while none of them fully worked (let alone met my quality standards or covered all the edges), it helped sharpen the outlines and common shapes and surfaced some API methods to click into that I wasn’t aware of. It made the difference between undirected poking around and being like “ok, this is gonna require no more than 5 objects in various places working together; let’s go!”

The big idea

The feature I wanted to build would track multiple presence keys at the same time. So if someone is on a deep page (/admin/clients/1/messages), they’d be present for that specific client, any client, as well as the dashboard as a whole.
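One way to derive that set of keys, as a hypothetical sketch (presence_keys is my made-up helper name, not code from the app):

```ruby
# Hypothetical helper: derive increasingly general presence keys from a
# request path, most specific first.
def presence_keys(path)
  segments = path.delete_prefix("/").split("/")
  segments.each_index.map { |i| "/" + segments[0..i].join("/") }.reverse
end

puts presence_keys("/admin/clients/1/messages")
# => /admin/clients/1/messages
#    /admin/clients/1
#    /admin/clients
#    /admin
```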

I also wanted to keep separate “track my presence” and “display everyone’s presence”.

What I ended up with was:

  1. Client in the browser subscribes to the PresenceChannel with a key param. It also sets up a setInterval heartbeat to send down a touch message every 30 seconds. This is a Stimulus controller that uses the Turbo cable connection, cause it’s there.
  2. On the server, the PresenceChannel has connected, disconnected, and touch actions and stores the key passed during connect. It writes to an Active Record model UserPresence and calls increment, decrement, and touch respectively.
  3. The Active Record model persists all these things atomically (Postgres!) and then triggers vanilla Turbo Stream Broadcast Laters (GoodJob!).
  4. The frontend visually is all done with vanilla Turbo Stream Broadcasts over the vanilla Turbo::StreamsChannel appending to and removing unique dom elements that are avatars of the present users.

It works! I’m happy with it.

Ok, let’s get some grumbles out.

Action Cable could have a bit more conceptual integrity

I once built some Action Cable powered features about 7 years ago, before Turbo Broadcast Streams, and it wasn’t my favorite. Since then, Turbo Broadcast Streams totally redeemed my feelings about Action Cable… and then I had to go real deep again on Action Cable to build this Presence feature.

At first I thought it was me, “why am I not just getting this?”, but as I became more familiar I came to the conclusion: nah, there’s just a lot of conceptual… noise… in the interface. I get it, it’s complicated.

In the browser/client: you have a Connection; a Connection “opens” and “closes”, but also “reconnects” (reopens?). Then you create a Subscription on the Connection by subscribing to a named Channel (which is a backend/server concept); Subscriptions have a “connected” callback when the “subscription has been successfully completed” (subscribed?) and a “disconnected” callback “when the client has disconnected with the server” (a Connection disconnect). If the Connection closes, reconnects, and reopens, then the Channel’s disconnected and reconnected callbacks are triggered again. Subscriptions can also be “rejected”. You can see some of this drift in the message types key/value constants.

…as a concrete example: you don’t connection.subscribe(channelName, ...), you consumer.subscriptions.create(channelName, ...) (oh jeez, it’s called Consumer). Turbo Rails tries to clean up some of this: you can call cable.subscribeTo(channelName, ...) to subscribe to a Channel using Turbo Stream Broadcasts’ existing connection. But even that is compromised, because you don’t subscribeTo a channel; you subscribeTo by passing an object of { channel: channelName, params: paramsForChannelSubscribe }. Here’s an example from Campfire.

On the server, I have accepted that the Connection/Channel/Streams trio challenges me, which is probably because of the inherent complexity of multiplexing Streams (no, not Turbo “Streams”, Action Cable “Streams”) over Channels that are themselves multiplexed over connection(s); it makes my head spin. Channels connect Streams, one Broadcasts on Streams, and one can also transmit on a Channel to a specific client, and often one does broadcast(channel, payload) where channel may actually be the name of a Stream. My intuition is that Streams were bolted onto Action Cable’s Channel implementation rather than being part of the initial conception, though it all landed in Rails at once.

I’m a pedantic person, and it’s tiring for me to write about this stuff with precision. Active Storage named variants, with its record-blob-variant-blob-record, has an analogous vibe of “I guess it works and I have a hard time looking directly at it”.

I have immense compassion and sympathy and empathy for trying to wrangle something as complex as Action Cable. And also fyi, it is a lot.

Testing

  • You’ll need to isolate and reset Action Cable after individual tests to prevent queries from being made after the transaction rollback or the changing of the pinned database connection: ActionCable.server.restart
  • If you see deadlocks, pg.exec freezes, or Active Record gives you undefined method 'count' for nil (because the query result object is nil), that’s a sign that the database connection is being used out-of-order/unsafely asynchronously/all whack.

Page lifecycle

Live and die by the Browser Page Lifecycle API.

Even with data-turbo-permanent, Stimulus controllers and turbo-cable-streams JavaScript get disconnected and reconnected. Notice that there is a lot of use of nextTick/nextFrame to try to smooth over it.

  • hotwired/turbo: [turbo-cable-stream-source does not work as permanent](https://github.com/hotwired/turbo/issues/868#issuecomment-1419631586)
  • Miles Woodroffe: “Out of body experience with turbo” about DOM connect/disconnects during Turbo Drive

And general nits that force more delicate coding than should otherwise be necessary.

I ended up making a whole new custom element, data-permanent-cable-stream-source. All that just to wait a tick before actually unsubscribing the channel, in case the element is reconnected to the page again by data-turbo-permanent. What does that mean for unload events? Beats me for now.

What am I doing about it?

All this work did generate some upstream issues and PRs. I mostly worked around them in my own app, but maybe we’ll roll the rock uphill a little bit:

Notes, right?

Yep, these are my notes. Maybe they’re helpful. No big denouement. The feature works, I’m happy with it, my teammates are happy, and I probably wouldn’t have attempted it at all if I didn’t have such positive thoughts about Action Cable going in, even if the work itself got deeply into the weeds.

Building deterministic, reproducible assets with Sprockets

This is a story that begins with airplane wifi, and ends with the recognition that everything is related in web development.

While on slow airplane wifi, I was syncing this blog’s git repo, and it was taking forever. That was surprising because this blog is mostly text, which I expected shouldn’t require many bits to transfer for Git. Looking more deeply into it (I had a 4-hour flight), I discovered that the vast majority of the bits were in the git branch of built assets that gets deployed to GitHub Pages (gh-pages) when I build my Rails app into a static site with Parklife. And the bits in that branch were assets (css, javascript, and a few icons and fonts) built by Sprockets, whose contents were changing every time the blog was built and published. What changed?

  • Sprockets creates a file manifest that is randomly named ".sprockets-manifest-#{SecureRandom.hex(16)}.json".
  • Within the file manifest, there is an entry for every file built by Sprockets, that includes that original asset’s mtime—when the file on the filesystem was last touched, even if the contents didn’t change.
  • By default, Sprockets generates gzipped .gz copies of compressible assets, and it includes the uncompressed file’s mtime in the gzipped file’s header, producing different binary content even though the compressed payloads’ contents didn’t change.

Do I need that? Let’s go through it.

The Sprockets Manifest

The Sprockets Manifest is pretty cool (I mean public/assets/.sprockets-manifest-*.json, not app/assets/config/manifest.js, which is different). The manifest is how Sprockets is able to add unique cache-breaking digests to each file while still remembering what the file was originally named. When building assets on a server with a persisted filesystem, Sprockets also uses the manifest to keep old versions of files around: bin/rails assets:clean will keep the last 3 versions of built assets, which is helpful for blue-green deployments. Heroku also has a bunch of custom stuff powered by this to make deployments seamless.

But none of that is applicable to me and this blog, which gets built from scratch and committed to Git. Or, for that matter, to some of my other Rails apps that I build with Docker; it would be nice not to bust their cached image layers unnecessarily 💅

The following is a monkeypatch that works with Sprockets right now, though I’m hoping to eventually propose it as a configuration option upstream (as others have proposed).

# config/initializers/sprockets.rb
module SprocketsManifestExt
  def generate_manifest_path
    # Always generate the same filename
    ".sprockets-manifest-#{'0' * 32}.json"
  end

  def save
    # Use the epoch as the mtime for everything
    zero_time = Time.at(0).utc
    @data["files"].each do |(_path, asset)|
      asset["mtime"] = zero_time
    end

    super
  end
end

Sprockets::Manifest.prepend SprocketsManifestExt

Now, if you’re like me (on a plane), you might be curious why mtime is tracked so obsessively. I have worked alongside several people in my career with content-addressable storage obsessions; the idea being: focus on the contents, not the container. And mtime is very much a concern of the container. But Sprockets makes the case that “Compiling assets is slow,” so I can see how quickly checking when a file was modified is useful in a lot of cases… just not mine.

Let’s move on.

GZip, but maybe you don’t need it

So… everything in web development is connected. While wondering why new copies of every .gz file were being committed on every build, I remembered what my buddy Rob recently did in Rails: Make ActiveSupport::Gzip.compress deterministic.

I have some tests of code that uses ActiveSupport::Gzip.compress that have been flaky for a long time, and I recently discovered this is because the output of that method includes the timestamp of when it was compressed. If two calls with the same input happen during different seconds, you get different output (which is why my flaky tests fail to compare correctly).

GZip takes a parameter called mtime, which is stored in the gzip header and restores the timestamp of the compressed file(s) when they are uncompressed. It changes the contents of the gzipped file, because the timestamp is stored inside the file itself, but it doesn’t affect the mtime of the gzipped file container.
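You can watch that header value round-trip with Ruby’s Zlib (a sketch of my own, not Sprockets code): the mtime travels inside the compressed bytes, independent of the .gz file’s own filesystem timestamp:

```ruby
require "zlib"
require "stringio"

io = StringIO.new
gz = Zlib::GzipWriter.new(io)
gz.mtime = Time.at(12_345) # stored in the gzip header, not on the filesystem
gz.write("hello")
gz.close

# Reading the stream back recovers the header mtime, wherever the
# .gz bytes have traveled in the meantime.
reader = Zlib::GzipReader.new(StringIO.new(io.string))
reader.read
puts reader.mtime == Time.at(12_345) # true: the header value round-trips
```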

So in the case of Sprockets, if the modification date of the uncompressed asset changes, regardless of whether its contents have changed, a new and different (according to git or Docker) gzipped file will be generated. This was really bloating up my git repo.

Props to Rack maintainer Richard Schneeman, who dug further down this hole previously, admirably asking the zlib group themselves for advice. The commentary mentioned the nginx docs, which I assume means ngx_http_gzip_static_module, which says:

The files can be compressed using the gzip command, or any other compatible one. It is recommended that the modification date and time of the original and compressed files be the same.

But that’s not the GZip#mtime value stored inside the contents of the gzip file; that’s the mtime of the .gz file container. Sprockets sets that too, with File.utime.

It’s easy enough to patch the mtime to the “unknown” value of 0:

# config/initializers/sprockets.rb
module SprocketsGzipExt
  def compress(file, _target)
    # Pass 0 ("no timestamp available") as the gzip header mtime
    # instead of the source asset's mtime
    archiver.call(file, source, 0)
    nil
  end
end

Sprockets::Utils::Gzip.prepend SprocketsGzipExt

…though if you’re in my shoes, you might not even need these gzipped assets. As far as I can tell, only Nginx makes use of them, via the non-default ngx_http_gzip_static_module; Apache requires some complicated RewriteRules; Puma doesn’t serve them; CDNs don’t request them. Maybe turn them off? 🤷

# config/initializers/sprockets.rb
Rails.application.configure do
  config.assets.gzip = false
end

Fun fact: that configuration was undocumented

Maybe please don’t even pass mtime to gzip for web assets

All of this stuff about file modification dates reminded me of another thing I had previously rabbit-holed on: poorly behaved conditional requests in RSS readers. The bad behavior involved inappropriately caching responses whose Last-Modified HTTP header changed even though their contents didn’t. And how do webservers generate the Last-Modified header value? That’s right: from the file’s mtime, the one that can be set by File.utime!
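Since this whole chain runs on the filesystem mtime, it’s worth seeing how cheap that container-level timestamp is to manipulate. A tiny sketch with Ruby’s File.utime (the temp file here is just for illustration):

```ruby
require "tempfile"

file = Tempfile.new("asset")
file.write("body { color: red; }")
file.close

# Rewind the container's mtime to the epoch without touching its contents.
epoch = Time.at(0)
File.utime(epoch, epoch, file.path) # (atime, mtime, path)

puts File.mtime(file.path) == epoch # true: this is the value a webserver
                                    # would report as Last-Modified
```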

…but not the one set by GZip#mtime=. I cannot find any evidence anywhere that the value stored in the contents of the gzip file matters. Nada. All it does is make the gzip file’s contents different, because of that one tiny value being included. I can’t imagine anything caring about the original mtime when it’s unzipped that wasn’t already transmitted via the Last-Modified HTTP header. What am I missing?

From the evidence I have, it seems like developers set GZip#mtime=… because it’s an option? I couldn’t find a reason in the Sprockets history. I noticed that Rack::Deflater does the same, for reasons I haven’t figured out from its history either. This behavior probably isn’t busting a lot of content-based caches unnecessarily, but it surely busts some. So maybe don’t do it unless you need to.

