Whatever you do, don’t autoload Rails lib/

Update: Rails v7.1 will introduce a new configuration method, config.autoload_lib, to make it safer and easier to autoload the lib/ directory and explicitly exclude subdirectories from autoloading. When released, this advice may no longer be relevant, though I imagine it will still be possible for developers to twist themselves into knots and cause outages with autoloading overrides.
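For reference, the new configuration is expected to look something like this sketch (based on the Rails pull request; the final API may differ):

# config/application.rb (Rails 7.1+)
# Autoload lib/, except subdirectories that should not be autoloaded
config.autoload_lib(ignore: %w(assets tasks))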

One of the most common problems I encounter when consulting on Rails projects is that developers have previously added lib/ to autoload paths and then twisted themselves into knots creating error-prone, project-specific conventions for un-autoloading a subset of the files in lib/.

Don’t do it. Don’t add your Rails project’s lib/ to autoload paths.

How does this happen?

A growing Rails application will accumulate a lot of Ruby classes and files that don’t cleanly fit into the default app/ directories of controllers, helpers, jobs, or models. Developers should also be creating new directories in app/* to organize like-with-like files (your app/services/ or app/merchants/, etc.). That’s ok!

But frequently there are one-off classes that don’t seem to rise to the level of their own directory in app/. From looking through the cruft of projects like Mastodon or applications I’ve worked on, these files look like:

  • A lone form builder
  • POROs (“Plain Old Ruby Objects”) like PhoneNumberFormatter, ZipCodes, Seeder, or Pagination: objects that serve a single purpose and are largely singletons/identity objects within the application.
  • Boilerplate classes for 3rd party gems, e.g. ApplicationResponder for the responders gem.

That these files accumulate in a project is a fact of life. When choosing where to put them, that’s when things can go wrong.

In a newly built Rails project, lib/ looks like the natural place for these. But lib/ has a downside: it is not autoloaded. This can come as a surprise even to experienced developers, who are accustomed to the convenience of autoloaded files in app/. It’s not difficult to add an explicit require statement to application.rb or an initializer, but that may not be one’s first thought.
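For example, a minimal sketch of that explicit require (phone_number_formatter.rb is a hypothetical file in lib/):

# config/application.rb
# Explicitly require a single lib/ file rather than autoloading all of lib/
require_relative "../lib/phone_number_formatter"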

That’s when people jump to googling “how to autoload lib/”. Don’t do it! lib/ should not be autoloaded.

The problem with autoloading lib/ is that files will subsequently be added to lib/ that should not be autoloaded: files that should only be loaded in a certain environment or context, or deferred, for behavioral, performance, or memory reasons. If your project has already enabled autoloading on lib/, it’s likely you’ll then add additional configuration to un-autoload those new files. These overrides and counter-overrides accumulate over time, become difficult to understand and unwind, and cause breakage when someone’s intuition about what will or won’t be loaded in a certain environment or context turns out to be wrong.
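To illustrate, here is a sketch of the kind of configuration that accumulates (the ignored paths are hypothetical):

# config/application.rb (don't do this!)
# First, lib/ is autoloaded wholesale...
config.autoload_paths << Rails.root.join("lib")

# ...then directories are carved back out, one surprise at a time
Rails.autoloaders.main.ignore(Rails.root.join("lib/tasks"))
Rails.autoloaders.main.ignore(Rails.root.join("lib/generators"))
Rails.autoloaders.main.ignore(Rails.root.join("lib/middleware"))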

What should you do instead?

An omakase solution

DHH writes:

lib/ is intended to be for non-app specific library code that just happens to live in the app for now (usually pending extraction into open source or whatever). Everything app specific that’s part of the domain model should live in app/models (that directory is for POROs as much as ARs)… Stuff like a generic PhoneNumberFormatter is exactly what lib/ is intended for. And if it’s app specific, for some reason, then app/models is fine.

The omakase solution is to manually require files from lib/ or use app/models generically to mean “Domain Models” rather than solely Active Record models. That’s great! Do that.

A best practice

Xavier Noria, Zeitwerk’s creator writes:

The best practice to accomplish that nowadays is to move that code to app/lib. Only the Ruby code you want to reload; tasks or other auxiliary files are OK in lib.

Sidekiq’s Problems and Troubleshooting explains:

lib/ directory will only cause pain. Move the code to app/lib/ and make sure the code inside follows the class/filename conventions.

The best practice is to create an app/lib/ directory to house these files. Mastodon does it, as do many others.
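For example, a hypothetical PORO moved into app/lib/, following Zeitwerk’s class/filename conventions:

# app/lib/phone_number_formatter.rb
# Autoloaded and reloaded like any other app/ subdirectory; no require needed
class PhoneNumberFormatter
  def format(number)
    # ...
  end
end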

This “best practice” is not without contention, as is usual for anything in Rails that deviates from omakase, like RSpec instead of Minitest or FactoryBot instead of Fixtures. But creating app/lib as a convention for Rails apps works for me and many others.

Really, don’t autoload lib/

Whatever path you take, don’t take the path of autoloading lib/.


I read "The Constant Rabbit" by Jasper Fforde

| Review | ★★★★★

An “Event” has caused rabbits to become anthropomorphic. This exchange is the book in a walnut-shell:

So while we ate the excellent walnut cake that the Venerable Bunty’s mother’s sister’s daughter’s husband’s son had baked, Venerable Bunty and Connie told us about life inside the colonies, which despite the lack of freedom and limited space were the only areas within the United Kingdom that ran themselves entirely on rabbit socio-egalitarian principles.

‘It’s occasionally aggressive and often uncompromising,’ said Finkle, ‘but from what I’ve seen of both systems, a country run on rabbit principles would be a step forward – although to be honest, I’m not sure we’d be neurologically suited to the regime. While most humans are wired to be reasonably decent, a few are wired to be utter shits – and they do tend to tip the balance.’

‘The decent humans are generally supportive of doing the right thing,’ said the Venerable Bunty, ‘but never take it much farther than that. You’re trashing the ecosystem for no reason other than a deluded sense of anthropocentric manifest destiny, and until you stop talking around the issue and actually feel some genuine guilt, there’ll be no change.’

‘Shame, for want of a better word, is good,’ said Finkle. ‘Shame is right, shame works. Shame is the gateway emotion to increased self-criticism, which leads to realisation, an apology, outrage and eventually meaningful action. We’re not holding our breaths that any appreciable numbers can be arsed to make the journey along that difficult chain of emotional honesty – many good people get past realisation, only to then get horribly stuck at apology – but we live in hope.’

‘I understand,’ I said, having felt that I too had yet to make the jump to apology.

‘It’s further evidence of satire being the engine of the Event,’ said Connie, ‘although if that’s true, we’re not sure for whose benefit.’

‘Certainly not humans’,’ said Finkle, ‘since satire is meant to highlight faults in a humorous way to achieve betterment, and if anything, the presence of rabbits has actually made humans worse.’

‘Maybe it’s the default position of humans when they feel threatened,’ I ventured, ‘although if I’m honest, I know a lot of people who claim to have “nothing against rabbits” but tacitly do nothing against the overt leporiphobia that surrounds them.’

‘Or maybe it’s just satire for comedy’s sake and nothing else,’ added Connie, ‘or even more useless, satire that provokes a few guffaws but only low to middling outrage – but is coupled with more talk and no action. A sort of  . . . empty cleverness.’

‘Maybe a small puff in the right moral direction is the best that could be hoped for,’ added Finkle thoughtfully. ‘Perhaps that’s what satire does – not change things wholesale but nudge the collective consciousness in a direction that favours justice and equality. Is there any more walnut cake?’

‘I’m afraid I had the last slice,’ I said, ‘but I did ask if anyone else wanted it.’


I read "The Dawn of Everything" by David Graeber and David Wengrow

| Review | ★★★★

If there is a particular story we should be telling, a big question we should be asking of human history (instead of the ‘origins of social inequality’), is it precisely this: how did we find ourselves stuck in just one form of social reality, and how did relations based ultimately on violence and domination come to be normalized within it?

What happens if we treat the rejection of urban life, or of slavery, [or of certain technologies] in certain times and places as something just as significant as the emergence of those same phenomena in others.

What is the purpose of all this new knowledge, if not to reshape our conceptions of who we are and what we might yet become? If not, in other words, to rediscover the meaning of our third basic freedom: the freedom to create new and different forms of social reality?

I imagine I’m already on board with David Graeber’s political project, so while I greatly enjoyed it, I found it too long by about a third.

The overall thrust is that people are much more interesting and creative than we give them credit for, and there’s a lot (too much for me in this book) of historical evidence that this is the case. And that it’s bunk to claim that increasing social complexity and scale requires an authoritarian state or bureaucracy. I guess it’s an argument to unstick the “End of History”-framing we’re mired in.

Of various things I learned / was confronted with:

  • Indulging children is a Native American practice. Makes sense, since it’s a common theme in Kim Stanley Robinson books, in which the Haudenosaunee make frequent appearances too.
  • Roman-style property ownership (which we inherit) is pretty fucked up when stared at directly, being based on a patriarch’s relations with household slaves.
  • It seems like a pretty legit critique of Western society to point out that there are a lot of legitimate ways people are free to harm other people during their everyday life, and that’s got to be pretty warpy.
  • Spending more time imagining and debating the society and politics you want to live in… probably makes for a better society and politics. One of those “if it’s hard, do it a lot” sorts of things. And if that sounds annoying in the context of the present, that’s probably because we’ve severely narrowed the scope of debate and possibility.

There’s a lot of history and anthropology to boil down:

…the key point to remember is that we are not talking here about ‘freedom’ as an abstract ideal or formal principle (as in ‘Liberty, Equality and Fraternity!’). Over the course of these chapters we have instead talked about basic forms of social liberty which one might actually put into practice:

  1. the freedom to move away or relocate from one’s surroundings;
  2. the freedom to ignore or disobey commands issued by others; and
  3. the freedom to shape entirely new social realities, or shift back and forth between different ones

….three elementary principles of domination:

  1. control of violence (or sovereignty),
  2. control of knowledge, and
  3. charismatic politics

…and a lot of historical and anthropological critique:

Environmental determinists have an unfortunate tendency to treat humans as little more than automata, living out some economist’s fantasy of rational calculation. To be fair, they don’t deny that human beings are quirky and imaginative creatures – they just seem to reason that, in the long run, this fact makes very little difference.

For much of the twentieth century, anthropologists tended to describe the societies they studied in ahistorical terms, as living in a kind of eternal present. Some of this was an effect of the colonial situation under which much ethnographic research was carried out. The British Empire, for instance, maintained a system of indirect rule in various parts of Africa, India and the Middle East where local institutions like royal courts, earth shrines, associations of clan elders, men’s houses and the like were maintained in place, indeed fixed by legislation. Major political change – forming a political party, say, or leading a prophetic movement – was in turn entirely illegal, and anyone who tried to do such things was likely to be put in prison. This obviously made it easier to describe the people anthropologists studied as having a way of life that was timeless and unchanging.

….

Social science has been largely a study of the ways in which human beings are not free: the way that our actions and understandings might be said to be determined by forces outside our control. Any account which appears to show human beings collectively shaping their own destiny, or even expressing freedom for its own sake, will likely be written off as illusory, awaiting ‘real’ scientific explanation; or if none is forthcoming (why do people dance?), as outside the scope of social theory entirely. This is one reason why most ‘big histories’ place such a strong focus on technology. Dividing up the human past according to the primary material from which tools and weapons were made (Stone Age, Bronze Age, Iron Age) or else describing it as a series of revolutionary breakthroughs (Agricultural Revolution, Urban Revolution, Industrial Revolution), they then assume that the technologies themselves largely determine the shape that human societies will take for centuries to come – or at least until the next abrupt and unexpected breakthrough comes along to change everything again.

Choosing to describe history the other way round, as a series of abrupt technological revolutions, each followed by long periods when we were prisoners of our own creations, has consequences. Ultimately it is a way of representing our species as decidedly less thoughtful, less creative, less free than we actually turn out to have been. It means not describing history as a continual series of new ideas and innovations, technical or otherwise, during which different communities made collective decisions about which technologies they saw fit to apply to everyday purposes, and which to keep confined to the domain of experimentation or ritual play. What is true of technological creativity is, of course, even more true of social creativity. One of the most striking patterns we discovered while researching this book – indeed, one of the patterns that felt most like a genuine breakthrough to us – was how, time and again in human history, that zone of ritual play has also acted as a site of social experimentation – even, in some ways, as an encyclopaedia of social possibilities.


How GoodJob’s mountable Rails Engine delivers Javascript importmaps and frontend assets

GoodJob is a multithreaded, Postgres-based ActiveJob backend for Ruby on Rails.

GoodJob includes a full-featured (though optional) web dashboard to monitor and administer background jobs. The web dashboard is included in the good_job gem as a mountable Rails Engine.

As the maintainer of GoodJob, I want to make gem development easier for myself by innovating as little as possible. That’s why GoodJob builds on top of Active Record and Concurrent::Ruby.

But, the frontend can be a beast. When thinking about how to build a full-featured web dashboard packaged within a Rails Engine within a gem, I had three goals:

  1. Be asset pipeline agnostic with zero dependencies. As of Rails ~7.0, a Rails developer can choose between several different asset pipeline tools (Sprockets, Webpacker/Shakapacker, esbuild/jsbundling, etc.). That’s too many! I want to ensure what I package with GoodJob is compatible with all of them. I also don’t want to affect the parent application at all; everything must be independent and self-contained.
  2. Allow easy patching/debugging. I want the GoodJob web dashboard to work when using the Git repo directly in a project’s Gemfile or simply bundle open good_job to debug a problem.
  3. Write contemporary frontend code. I want to use Bootstrap UI, Stimulus, Rails UJS, and write modular JavaScript with imports. Maybe even Turbo!

And of course, I want GoodJob to be secure, performant, and a joy to develop and use for myself and the developer community.

Read on for how I achieved it all (mostly!) with GoodJob.

What I didn’t do

Here are all the things I considered but decided not to do:

  • Nope: Manually construct/inline a small number of javascript files: I did not want to manually build a javascript file, copy-pasting various 3rd-party libraries into a single file and then writing my code at the bottom. This seemed laborious and error-prone, especially when I would need to update a library; and my IDE doesn’t work well with large files, so writing my own code would be difficult.
  • Nope: Precompile javascript in the repository or on gem build: I did not want to force a pre-commit step to build javascript, or to only package built javascript into the gem, because that would make patching and debugging difficult. Over my career I’ve struggled to contribute to a number of otherwise fantastic gems that use this workflow pattern.
  • Nope: Compile javascript in the application: Rails has too many different asset pipeline patterns right now for me to consider supporting them all. I consider this more a result of a highly fragmented frontend ecosystem than an indictment of Rails. I can’t imagine supporting all of the different options and whatever else shows up in the future at the same time. (I’m in awe of the gems that do; nice work rails_admin!)

What I did do

As I wrote earlier: “innovate as little as possible”:

Serve vanilla JS and CSS out of vanilla Routes/Controller

GoodJob has explicit routes for frontend assets that wire up to a controller that serves those assets statically with render file:. Let’s break that down…

In my Rails Engine’s router, I define a frontend scope and two get routes. The first route, modules, is for JavaScript modules that will go into the importmap. The second route, static, is for JavaScript and CSS that I’ll link/script directly in the HTML head.

# config/routes.rb
scope :frontend, controller: :frontends do
  get "modules/:name", action: :module, as: :frontend_module, constraints: { format: 'js' }
  get "static/:name", action: :static, as: :frontend_static, constraints: { format: %w[css js] }
end

In the matching controller, I create constants that define hashes of the files that can be matched and served, which I store in an app/frontend directory in my Rails Engine. I want the paths to be explicit for security reasons: passing any sort of dynamic file path through the URL could create a path traversal vulnerability. All of the frontend assets are stored in app/frontend and served out of this controller:

# app/controllers/good_job/frontends_controller.rb

module GoodJob
  class FrontendsController < ActionController::Base # rubocop:disable Rails/ApplicationController
    STATIC_ASSETS = {
      css: {
        bootstrap: GoodJob::Engine.root.join("app", "frontend", "good_job", "vendor", "bootstrap", "bootstrap.min.css"),
        style: GoodJob::Engine.root.join("app", "frontend", "good_job", "style.css"),
      },
      js: {
        bootstrap: GoodJob::Engine.root.join("app", "frontend", "good_job", "vendor", "bootstrap", "bootstrap.bundle.min.js"),
        chartjs: GoodJob::Engine.root.join("app", "frontend", "good_job", "vendor", "chartjs", "chart.min.js"),
        es_module_shims: GoodJob::Engine.root.join("app", "frontend", "good_job", "vendor", "es_module_shims.js"),
        rails_ujs: GoodJob::Engine.root.join("app", "frontend", "good_job", "vendor", "rails_ujs.js"),
      },
    }.freeze

    # Additional JS modules that don't live in app/frontend/good_job/modules
    MODULE_OVERRIDES = {
      application: GoodJob::Engine.root.join("app", "frontend", "good_job", "application.js"),
      stimulus: GoodJob::Engine.root.join("app", "frontend", "good_job", "vendor", "stimulus.js"),
    }.freeze

    def self.js_modules
      @_js_modules ||= GoodJob::Engine.root.join("app", "frontend", "good_job", "modules").children.select(&:file?).each_with_object({}) do |file, modules|
        key = File.basename(file.basename.to_s, ".js").to_sym
        modules[key] = file
      end.merge(MODULE_OVERRIDES)
    end

    # Necessary to serve JavaScript to the browser
    skip_after_action :verify_same_origin_request, raise: false
    before_action { expires_in 1.year, public: true }

    def static
      render file: STATIC_ASSETS.dig(params[:format].to_sym, params[:name].to_sym) || raise(ActionController::RoutingError, 'Not Found')
    end

    def module
      raise(ActionController::RoutingError, 'Not Found') if params[:format] != "js"

      render file: self.class.js_modules[params[:name].to_sym] || raise(ActionController::RoutingError, 'Not Found')
    end
  end
end

One downside of this is that I’m unable to use Sass or TypeScript or anything that isn’t vanilla CSS or JavaScript. So far I haven’t missed that too much; Bootstrap brings a very comprehensive design system, and RubyMine is pretty good at hinting JavaScript on its own.

Another downside is that I package several hundred kilobytes of frontend code within my gem. This increases the gem size, which is a real bummer if an application isn’t mounting the dashboard. I’ve considered separating the optional dashboard into a separate gem, but I’m deferring that until anyone notices that it’s problematic (so far so good!).

Manually link assets and construct a JS importmap in my Engine’s layout <head>

Having created the routes and controller actions, I can simply link the static files in the layout HTML head:

<!-- app/views/layouts/good_job/application.html.erb -->
<head>
  <!-- ... -->
  <%# Note: Do not use asset tag helpers to avoid paths being overridden by config.asset_host %>
  <%= tag.link rel: "stylesheet", href: frontend_static_path(:bootstrap, format: :css, v: GoodJob::VERSION, locale: nil), nonce: content_security_policy_nonce %>
  <%= tag.link rel: "stylesheet", href: frontend_static_path(:style, format: :css, v: GoodJob::VERSION, locale: nil), nonce: content_security_policy_nonce %>

  <%= tag.script "", src: frontend_static_path(:bootstrap, format: :js, v: GoodJob::VERSION, locale: nil), nonce: content_security_policy_nonce %>
  <%= tag.script "", src: frontend_static_path(:chartjs, format: :js, v: GoodJob::VERSION, locale: nil), nonce: content_security_policy_nonce %>
  <%= tag.script "", src: frontend_static_path(:rails_ujs, format: :js, v: GoodJob::VERSION, locale: nil), nonce: content_security_policy_nonce %>

Beneath this, I manually construct the JSON the browser expects for importmaps:

<!-- Link es_module_shims -->
<%= tag.script "", src: frontend_static_path(:es_module_shims, format: :js, v: GoodJob::VERSION, locale: nil), async: true, nonce: content_security_policy_nonce %>

<!-- Construct the importmaps JSON object -->
<% importmaps = GoodJob::FrontendsController.js_modules.keys.index_with { |module_name| frontend_module_path(module_name, format: :js, locale: nil, v: GoodJob::VERSION) } %>
<%= tag.script({ imports: importmaps }.to_json.html_safe, type: "importmap", nonce: content_security_policy_nonce) %>

<!-- Import the entrypoint: application.js -->
<%= tag.script "", type: "module", nonce: content_security_policy_nonce do %> import "application"; <% end %>

That’s it!

I’ll admit, serving frontend assets using render file: is boring, but I experienced a thrill the first time I wired up the importmap and it just worked. Writing small JavaScript modules and using import directives is really nice. I recently added Stimulus, and I’m feeling optimistic that I could reliably implement Turbo in my gem’s Rails Engine, fully decoupled from the parent application.

I hope this post about GoodJob inspires you to build full-featured web frontends for your own gems and libraries.


Recently, March 12, 2023

  • Work has been complicated, recently. Layoffs, as a general idea, were announced a month ago; it was the same week I came down with a bad cold. I’ve been fairly low energy since and have had trouble differentiating the two. I’m supremely proud and confident that my team is doing the most important work possible. We’ll see!
  • The week prior to all of this, my dad came to visit and stay with us. Having an easier time hosting family was one of our goals in getting a 2nd bedroom. Success.
  • Wow, it’s nearly been a year since I left my last job. I’ve had a number of former colleagues asking for help in leaving, in addition to talking with folks being pushed out: I was surprised to see Code for America finally kill Brigades, and really twist the knife too by forcing groups to rename themselves.
  • GoodJob is great! I’ve been thinking about replacing Advisory Locks with a lock strategy that’s more compatible with PgBouncer. But that will probably be a 2-year project at least of incrementally crabwalking towards that goal while avoiding breaking changes. Rubygems.org just adopted GoodJob; I am humbled.
  • On other projects, I’ve been trying to lower costs. My S3 data-transfer bill went from $10 to $50 a month, which I’m not happy about; scrapers are the worst 🤷‍♀️ I’ve also been experimenting with Dokku for packing some smaller projects together (paying $15 once rather than $12 per app), though the VM locked up one Saturday night and I had to reboot it, which is exactly why I don’t want to manage my own servers.
  • My brother and I have been planning a Celebration of Life for my mom.
  • I’m so happy to finally be back on Daylight Savings Time.

Service Object Objects in Ruby

For anyone that follows me on social media, I’ll sometimes get into a Coplien-esque funk of “I don’t wanna write Classes. I want to write Objects!”. I don’t want to negotiate an industrial-relations policy for instances of Person in the current scope. I want to imagine the joy and misery Alice and Bob will experience working together right now.

I was thinking of that recently when commenting on Reddit about Caleb Hearth’s “The Decree Design Pattern”, which ended up in the superset of these two thoughts:

  • heck yeah! if it’s globally distinct it should be globally referenceable
  • oh, oh no, I don’t like looking at that particular Ruby code

This was my comment to try to personally come to terms with those thoughts, iteratively:

# a consistent callable
my_decree = -> { do_something }

# ok, but globally scoped
MY_DECREE = -> { do_something }

# ok, but without the shouty all-caps
module MyDecree
  def self.call 
    do_something 
  end 
end

# ok, but what about when it gets really complex
class MyDecree 
  def self.call(variable)
    new(variable).call 
  end
  
  def initialize(variable)
    @variable = variable
  end

  def call
    do_something
    do_something_else(@variable)
    do_even_more
  end

  def do_even_more
    # something really complicated....
  end
end

From the outside, object perspective, these all have the same interchangeable interface (.call) and, except for the first one, are accessible everywhere. That’s great, from my perspective! Though I guess it’s a pretty short blog post to say:

  • Decrees are globally discrete and call-able objects
  • The implementation is up to you!

Unfortunately, the moment the internals come into play, it gets messy. But I don’t think that should take away from the external perspective.


Slop and call

In my role as Engineering Manager, I frequently play Keeper of the Process. Having worked effectively alongside plenty of agile #noplanning people (RIP Andrew), and carrying the scars of dysfunctional processes (oh, PRDs and OGSM), it feels historically out of character to lean into OKR scores and target dates. And I think I’ve made my peace with it.

When I was in high school, my friend’s dad Gary (RIP Gary) retired and bought a championship pool table. The pool table went in their living room and everything else came out. Nothing else fit. The room was a pool table and a stereo, which Gary kept tuned to classic jazz. We played a lot of pool and listened to a lot of Charles Mingus.

The two games I remember playing most were 2-ball “English” and 9-ball. English is a “called” game: you have to say which ball and hole you’re aiming for before making the shot. 9-ball is played “slop”: as long as you hit the lowest-numbered ball first, it doesn’t matter which ball goes into which hole.

Both games have their techniques. Playing English, I got really good at fine ball handling and putting sidespin on the ball (that’s the “English”), shooting with narrow intent. With 9-ball, I learned to do a lot of what we called a “textbook” shot (I dunno why we gave only this one shot that name; we were 17). The shot was to bounce the ball off of as many alternating rails as possible until it eventually walked itself into a pocket. Just slam it, really.

The point is, both of them were ok ways to play. They were just different. It’s fine.


Prevent CDN poisoning from Fat GET/HEAD Requests in Ruby on Rails

There are many different flavors of web cache poisoning discovered by Security Researcher James Kettle. Read on for an explanation of one I’ve run across…

What is a Fat GET/HEAD Request? A GET or HEAD request is “fat” when it has a request body. It’s unexpected! Typically one sees a request body with a POST or PUT request, because the body contains form data. The HTTP specification leaves the meaning of a request body on a GET or HEAD request undefined: you can do it, and it’s up to the application to figure out what it means. Sometimes it’s bad!

You can get a sense of the applications that intentionally support Fat Requests (and how grumpy it makes some people) by reading through this Postman issue.

Fat Requests can lead to CDN and cache poisoning in Rails. CDNs and caching web proxies (like Varnish) are frequently configured to cache the response from a GET or HEAD request based solely on the request’s URL and not the contents of the request body (they don’t cache POSTs or PUTs at all). If an application isn’t deliberately handling the request body, it may cause unexpected content to be cached and served.

For example, you have a /search endpoint:

  • GET /search shows a landing page with some explanatory content
  • GET /search?q=foo shows the search results for “foo”.

Here’s what a Fat Request looks like:

    GET /search     <== the url for the landing page
    
    q=verybadstuff  <== oh, but with a request body
    

In a Rails Controller, parameters (alias params) merges query parameters (that’s the URL values) with request parameters (that’s the body values) into a single data structure. If your controller uses the presence of params[:q] to determine whether to show the landing page or the search results, it’s possible that when someone sends that Fat Request, your CDN may cache and subsequently serve the results for verybadstuff every time someone visits the /search landing page. That’s bad!
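Here’s a sketch of a vulnerable controller (hypothetical; Search.run stands in for your actual search logic):

# app/controllers/searches_controller.rb (hypothetical)
class SearchesController < ApplicationController
  def show
    if params[:q].present?
      # On a Fat Request, params[:q] can come from the request body,
      # but the CDN caches this response under the bare /search URL
      @results = Search.run(params[:q])
      render :results
    else
      render :landing
    end
  end
end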

Here’s how to Curl it:

curl -XGET -H "Content-Type: application/x-www-form-urlencoded" -d "q=verybadstuff" http://localhost:3000/search

Here are 3 ways to fix it…

Solution #1: Fix at the CDN

The most straightforward place to fix this should be at the caching layer, but it’s not always easy.

With Cloudflare, you could rewrite the GET request’s Content-Type header if it is application/x-www-form-urlencoded or multipart/form-data. Or use a Cloudflare Worker to drop the request body.

Varnish makes it easy to drop the request body for any GET request.

Other CDNs or proxies may be easier or more difficult. It depends!

Update via Mr0grog: AWS CloudFront returns a 403 by default.

Solution #2: Deliberately use query_parameters

Rails provides three different methods for accessing parameters:

  • query_parameters for the values in the request URL
  • request_parameters for the values in the request body
  • parameters (alias params) for the problematic combination of them both. Values in query_parameters take precedence over values in request_parameters when they are merged together.

Developers could be diligent and make sure to only use query_parameters in #index, #show, or other GET-routed actions. Here’s an example from the git-scm project.
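A minimal sketch of that diligence, reusing the hypothetical search controller from above:

# Only read the query string; body values on a Fat Request are ignored
def show
  q = request.query_parameters[:q]
  if q.present?
    @results = Search.run(q)
    render :results
  else
    render :landing
  end
end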

Solution #3: Patch Rails

Changes were proposed in Rails to not have parameters merge in the body values for GET and HEAD requests; it was rejected because it’s more a problem with the upstream cache than it is with Rails.

You can patch your own version of Rails. Here’s an example that patches the method in ActionDispatch::Request:

# config/initializers/sanitize_fat_requests.rb
module SanitizeFatRequests
  def parameters
    # Return the memoized parameters if they've already been computed
    params = get_header("action_dispatch.request.parameters")
    return params if params

    if get? || head?
      # For GET and HEAD requests, merge only query and path parameters,
      # ignoring any values submitted in the request body
      params = query_parameters.dup
      params.merge!(path_parameters)
      set_header("action_dispatch.request.parameters", params)
      params
    else
      super
    end
  end
  alias :params :parameters
end

ActionDispatch::Request.include(SanitizeFatRequests)

# Some RSpec tests to verify this
require 'rails_helper'

RSpec.describe SanitizeFatRequests, type: :request do
  it 'does not merge body params in GET requests' do
    get "/search", headers: {'CONTENT_TYPE' => 'application/x-www-form-urlencoded'}, env: {'rack.input': StringIO.new('q=verybadstuff') }

    # verify that the request is correctly shaped because
    # the test helpers don't expect this kind of request
    expect(request.request_parameters).to eq("q" => "verybadstuff")
    expect(request.parameters).to eq({"action"=>"panlexicon", "controller"=>"search"})

    # the behavioral expectation
    expect(response.body).not_to include "verybadstuff"
  end
end

Introducing GoodJob Bulk and Batch

GoodJob is a multithreaded, Postgres-based ActiveJob backend for Ruby on Rails. I recently released two new features:

  • GoodJob::Bulk to optimize enqueuing large numbers of jobs (released in GoodJob v3.9)
  • GoodJob::Batch to coordinate parallelized sets of jobs (released in GoodJob v3.10)

Big thanks to @julik, @mollerhoj, @v2kovac, @danielwestendorf, @jrochkind, @mperham and others for your help and counsel!

Bulk enqueue

GoodJob’s Bulk-enqueue functionality can buffer and enqueue multiple jobs at once, using a single INSERT statement. This can be more performant when enqueuing a large number of jobs.

I was inspired by a discussion within a Rails pull request to implement perform_all_later within Active Job. I wanted to support the way most people enqueue Active Job jobs, with perform_later, and also encourage people to work directly with Active Job instances.

# perform_later within a block
active_jobs = GoodJob::Bulk.enqueue do
  MyJob.perform_later
  AnotherJob.perform_later
end
# or with Active Job instances
active_jobs = [MyJob.new, AnotherJob.new]
GoodJob::Bulk.enqueue(active_jobs)

Releasing Bulk functionality was a two-step process: I initially implemented it while working on Batch functionality, and then, with @julik’s initiative and help, we extracted and polished it to be used on its own.

Batches

GoodJob’s Batch functionality coordinates parallelized sets of jobs. The ability to coordinate a set of jobs, and to run callbacks during lifecycle events, has been a highly demanded feature. Most people who talked to me about job batches were familiar with Sidekiq Pro’s batch functionality, which I didn’t want to simply recreate (Sidekiq Pro is excellent!). So I’ve been collecting use cases and thinking about what’s most in the spirit of Rails, Active Job, and Postgres:

  • Batches are mutable, database-backed objects with foreign-key relationships to sets of job records.
  • Batches have properties which use Active Job’s serializer, so they can contain and rehydrate any GlobalID object, like Active Record models.
  • Batches have callbacks, which are themselves Active Job jobs.

Here’s a simple example:

GoodJob::Batch.enqueue(on_finish: MyBatchCallbackJob, user: current_user) do
  MyJob.perform_later
  OtherJob.perform_later
end

# When these jobs have finished, it will enqueue your `MyBatchCallbackJob.perform_later(batch, options)`
class MyBatchCallbackJob < ApplicationJob
  # Callback jobs must accept a `batch` and `params` argument
  def perform(batch, params)
    # The batch object will contain the Batch's properties, which are mutable
    batch.properties[:user] # => <User id: 1, ...>
    # Params is a hash containing additional context (more may be added in the future)
    params[:event] # => :finish, :success, :discard
  end
end

There’s more depth and examples in the GoodJob Batch documentation.

Please help!

Batches are definitely a work in progress, and I’d love your feedback:

  • What is the Batch functionality missing? Tell me your use cases.
  • Help improve the Web Dashboard UI (it’s rough but functional!)
  • Find bugs! I’m sure there are some edge cases I overlooked.

Framing open source contributions at work

Excerpts from the excellent RailsConf 2022 keynote: The Success of Ruby on Rails by Eileen Uchitelle [reformatted from the transcript]:

Upgrading is one of the easiest ways to find an area of Rails that can benefit from your contributions. Fixing an issue in a recent release has a high likelihood of being merged.

Running off Rails Main is another way to find contributions to Rails. If you don’t want to run your Main in production, you could run it in a separate CI build. Shopify, GitHub and Basecamp run it.

Running off Main may be harder than running off a release because features and bug fixes are a little in flux sometimes. If you are running off of Main, a feature added to the Main branch could be removed without deprecation. This is a worthwhile risk to take on because it lowers the overall risk of an upgrade. When you run off Main, you’re less likely to fall behind upgrading because it becomes part of your weekly or monthly maintenance. Upgrading becomes routine, second nature rather than novel and scary. Changes are easy to isolate. It’s just slightly less polished. Like I said, I still think it’s pretty stable.

Another way to find places to contribute is look at your own applications.

  • Do you have monkey patches on Rails code that fix bugs or change behavior? Instead of leaving them there, upstream the fix and delete the monkey patch.
  • Is there infrastructure-level code that doesn’t really pertain to your product? It’s possible this could be a great addition to Rails. When I wrote the [multiple] database [support] in Rails, it came from GitHub’s monolith. It made perfect sense because it was getting in the way of upgrades, didn’t expose any intellectual property, had nothing to do with product features, and was something many applications could benefit from.
  • Lastly and most importantly, keep showing up.

… Ultimately, if more companies treated the framework as an extension of the application, it would result in higher resilience and stability. Investment in Rails ensures your foundation will not crumble under the weight of your application. Treating it as an unimportant part of your application is a mistake and many, many leaders make this mistake.

…leaders see funding open source [as] risky because they don’t actually understand the work. … Often, leaders worry that if there’s a team working in open source, other teams are going to be jealous or resentful that that team is doing “fun” work. …

Maintainers need to make changes, deal with security incidents and also handle criticism from many people. Maintaining and contributing to open source requires a different skill set than product work. That doesn’t make it any less essential.

…Many product companies don’t like words like “research” and “experimental.” They can imply work without goals. Use words like “investment,” and demonstrate the direct value [it] will bring. Make sure it is measurable and will make the application and product more successful. A great example of measurable work is a change that improves performance. If you can tie contributions to direct customer improvements, it’s easier to show leadership.

…As I started contributing more and more and peeling back the layers of Rails, the impact is limitless. I started looking at how applications stretched the boundaries of what Rails was meant to do.

…Ultimately, I want you to contribute to Rails because it’s going to enable you to build a better company and product. The benefits of investing in Rails go far beyond improving the framework.

Investing in Rails will build up the skills of your engineering team. They will develop better communication skills, learn to navigate criticism, build debugging skills, and learn how the framework functions. It will teach engineers about the inner workings and [how to] catch bugs.

Monkey patching is far more dangerous than I think most realize. They break with upgrades and cause security incidents. When you write a monkey patch, you maintain a portion of Rails code. Wouldn’t it have been better to patch it upstream rather than taking on that risk and responsibility?

It will give your engineering team the skills they need to make better technical decisions. You’re ensuring that Rails benefits your application and the company for the long-term.

…Contributing to Rails is only not essential if you don’t care about the direction the framework is headed in. We should be contributing because we care about the changes.

We want to ensure our apps are upgradeable, performant and stable.

Investment in Rails also means you won’t have to rewrite your application in a few years because Rails no longer supports what you need. When you fail to invest in your tools, you end up unable to upgrade. Your engineering team is miserable. The codebase is a mess and writing features is impossible. You’re forced into a rewrite; your engineers want to write Rails, and you can no longer let them do that. You have to build a bunch of features before your site falls over.

It’s not Rails’ fault you made the decision to invest elsewhere.

If you build contributing into your culture, the benefits are clear:

  • Your engineering teams’ skills will improve.
  • Rails will evolve with your application because you’re helping decide how it needs to change.
  • Your application will be more resilient because there’s low tech debt and your foundation is stable. Active investment prevents your application from degrading.

Building a team to invest in Rails is proactive. Rewriting an application is reactive. Which one do you think is better for business in the long run?