Living Parklife with Rails, coming from Jekyll

I recently migrated this blog from Jekyll to Ben Pickles’s Parklife and Ruby on Rails, still hosted as a static website on GitHub Pages. I’m pretty happy with the experience.

I’m writing this not because I feel any sense of advocacy (do what you want!) but to write down the reasons for myself. Maybe they’ll rhyme for you.

Here’s this blog’s repo if you want to see: https://github.com/bensheldon/island94.org

Background

I’ve been blogging here for 20 years and this blog has been through it all: Drupal, WordPress, Middleman, Jekyll, and now Parklife+Rails.

For the past decade the blog has largely been in Markdown files, which I don’t intend to change. Over the past 2 years I’ve also exported 15 years of Pinboard/del.icio.us bookmarks, and my Kindle book highlights, into Markdown-managed files too. I’ve also dialed in some GitHub Actions- and Apple Shortcuts-powered integrations. I’m really happy with Markdown files in a git repo, scripted with Ruby.

…but there’s more than just Ruby.

Mastery

I’m heavily invested in the Ruby on Rails ecosystem. I think it’s fair to say I have mastery in Rails: I’m comfortable building applications with it, navigating and extending the framework code, intuiting the conceptual vision of the core team, and being involved in the life of the community, where I’ve earned some positive social capital to spend as needed.

I don’t have that in Jekyll. I am definitely handy in Jekyll: I’ve built plugins, and I’ve done some wild stuff with Liquid. But I’m not involved with Jekyll in the everyday sense like I am with Rails. I feel that when I go to make changes to my blog. There’s a little bit of friction mentally switching over to Liquid, and to Jekyll’s particular utilities that are similar to, but not the same as, Action View and Active Support. Jekyll is great; it’s me and the complexity of my website that have changed.

(I do still maintain some other Jekyll websites; no complaints elsewhere.)

Parklife with Ruby on Rails

I hope I’m not diminishing Parklife by writing that it isn’t functionally much more than wget. It’s in Ruby and mounts+crawls a Rack-based web application, does a little bit to rewrite the base URLs, and spits out a directory of static HTML files. That’s it! It’s great!

It was pretty easy for me to make a lightweight Ruby on Rails app that loaded up all my Markdown-formatted content and frontmatter, and spat it all out again through Controllers and ERB Views.

This blog is about 7k pages. For a complete website artifact:

  • Jekyll: takes about 20 seconds to build
  • Parklife with Rails: takes about 20 seconds to build

In addition to the productivity win for me of being able to work with the ERB and Action View helpers I’m familiar with, I also find my development loop with Parklife and Rails is faster than Jekyll: I don’t have to rebuild the entire application to see the result of a single code or template change. I use the Rails development server to develop, not involving Parklife at all. On my M1 MBP a cold boot of Rails takes less than a second, a code reload is less than 100ms, and most pages render in under 10ms.

With Jekyll, even with --incremental, most development changes required a 10+ second rebuild. Not my favorite.

The technically novel bits

  1. If you want to trigger a Rails code reload from any arbitrary set of files, like a directory of markdown files, use ActiveSupport::FileUpdateChecker (which has a kind of complicated set of arguments):

    # config/application.rb
    # FileUpdateChecker.new(files, dirs, &block): the first argument is an
    # array of individual files to watch; the second is a hash of
    # directories => watched file extensions within them
    self.reloaders << ActiveSupport::FileUpdateChecker.new([], {
      "_posts" => ["md", "markdown"],
      "_bookmarks" => ["md", "markdown"],
    }) do
      Rails.application.reload_routes!
    end
    
  2. Each of my blog posts has a list of historical redirects stored in its frontmatter (a legacy of so many framework changes). I had to think about how to do a catch-all route to render a static meta-refresh template:

    # config/routes.rb
    Rails.application.routes.draw do
      # Send any path that matches a known historical redirect (assuming
      # Redirect.all returns a Hash keyed by the redirected-from path) to a
      # controller that renders a static meta-refresh page
      get "*path", to: "redirects#show", constraints: ->(req) { Redirect.all.key? req.path.sub(%r{\A/}, "").sub(%r{/\z}, "") }
      # ...all the other routes
    end
    
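For completeness, the controller side of that catch-all is tiny. Here’s a hypothetical sketch (the names and exact rendering are illustrative, not my literal code):

    # app/controllers/redirects_controller.rb
    class RedirectsController < ApplicationController
      def show
        key = request.path.sub(%r{\A/}, "").sub(%r{/\z}, "")
        @destination = Redirect.all.fetch(key)
        # the view contains something like:
        # <meta http-equiv="refresh" content="0; url=<%= @destination %>">
        render :show
      end
    end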

In conclusion

Here’s this blog’s Parkfile. I did a little bit of convenience monkeypatching of things I intend to contribute upstream to Parklife. I dunno, maybe you’ll like the Parklife too.

How I’m thinking about AI (LLMs)

With AI, in my context we’re talking about LLMs (Large Language Models), which I simplify down to “text generator”: they take text as input, and they output text.

I wrote this to share with some folks I’m collaborating with on building AI-augmented workflows. I’ve struggled to find something that is condensed and whose opinionations match my own. So I wrote it myself.

The following explanation is intended to be accurate, but not particularly precise. For example, there is ChatGPT the product, there is an LLM at the bottom, and in the middle there are other functions and capabilities; the same goes for Claude or AWS Nova or Llama. These things are more than *just* LLMs, but they are also not much more than an LLM. Some of these tools can also interpret images and documents and audio and video. To do so, they pass those inputs through specialized functions like OCR (optical character recognition), voice-recognition, and image-recognition tools, and then those results are turned into more text input. And some of them can take “Actions” with “Agents”, which is still based on text output, just structured and fed into something else. It’s text text text.

(also, if something is particularly wrong, let me know please)

A little about LLMs

The language around LLMs and “AI” is fucked up with hype and inappropriate metaphors. But the general idea is there are two key phases to keep track of:

  1. Training: Baking the model. At which point it’s done. I don’t know anyone who is actually building models. Everyone is using something like OpenAI or Claude or Llama. And even while these things can be “fine tuned”, I don’t know anyone doing it; operating at the model level requires input data on the order of tens of thousands of inputs/examples.
  2. Prompting: Using the model, giving input and getting output. This is everything the vast majority of developers are doing.

That’s it. Those are the only two buckets you need to think about.

1. Training

The way AI models get made is to first collect trillions of pages of written text (an example is Common Crawl, which scrapes the Internet), and then use machine learning to identify probabilistic patterns that can be represented by only several billion variables (floating point numbers). This is called “Pre-Training”. At this point, you can say “Based on the input data, it’s probabilistically likely that the word after ‘eeny meeny miny’ is ‘moe’.”

Then there is the phase of “Fine Tuning” which makes sure that longer strings of text input are completed in ways that are intended (never right or wrong, just intended or expected). For example, if the text input is “Write me a Haiku about frogs” you expect a short haiku about frogs and not a treatise on the magic of the written word or amphibians. Fine tuning is largely accomplished by tens of thousands of workers in Africa and South Asia reading examples of inputs and outputs and clicking 👍 or 👎 on their screen. This is then fed back into machine learning models to say, of the billion variables, which variables should get a little more or less oomph when they’re calculating the output. Fine Tuning requires tens of thousands of these scored examples; again, this is probabilistic-scale stuff. This can also be called RLHF (Reinforcement Learning from Human Feedback), though that sometimes also refers to few-shot prompting, which is Prompt-phase (“Learning” is a nonsense word in the AI domain; it has zero salience without clarifying which phase you’re talking about). A lot of the interesting fine-tuning, imo, comes from getting these text generators to:

  • Read like a human chatting with you, rather than textual diarrhea
  • Produce structured output, like valid JSON, rather than textual diarrhea

Note: You can mentally slot in words like “parameters”, “dimensions”, “weights” and “layers” into all this. Also, whenever someone says “we don’t really know how they work” what they really mean is “there’s a lot of variables and I didn’t specifically look at them all”. But that’s no different than being given an Excel spreadsheet with several VLOOKUPs and functions and saying “sure, that looks ok” and copy-pasting the report on to your boss; I mean, you could figure it all out, but it seems to work and you’re a busy person.

Ok, now we’re done with training. The model at this point is baked and no further modification takes place: no memory, no storage, no “learning” in the sense of a biological process. From this point forward it operates as a function: input in, output out, no side effects.

Here’s how AWS Bedrock, which is how I imagine lots of companies are using AI in their product, describes all this:

After delivery of a model from a model provider [Anthropic, OpenAI, Meta] to AWS, Amazon Bedrock will perform a deep copy of a model provider’s inference and training software into those accounts for deployment. Because the model providers don’t have access to those accounts, they don’t have access to Amazon Bedrock logs or to customer prompts and completions.

See! It’s all just dead artifacts uploaded into S3, that are then loaded onto EC2 on-demand. A fancy lambda! Nothing more.

2. Prompting

Prompting is when we give the model input, and then it gives back some output. That’s it. Unless we are specifically collecting trillions of documents, or doing fine-tuning against thousands of examples (which we are NOT!), we are simply writing a prompt, and having the model generate some text based on it. It riffs. The output can be called “completions” because they’re just that: More words.

(Fun fact: how to get the LLM to stop writing words is a hard problem to solve)

Note: You might sometimes see prompting called model “testing” (as opposed to model building or training). That’s because you’re powering up the artifact to put some words through it. Testing testing is called “Evaluations” (“Evals” for short) and like all test test regimes the lukewarm debate I hear from everybody is “we aren’t but should we?”

Writing prompts

This is the work! Unfortunately the language used to describe all of this is truly and totally fucked. By which I mean that words like “learning” and “thought” and “memory” and even “training” are reused over and over.

It’s all about simply writing a prompt that boops the resulting text generator output into the shape you want. It’s all snoot-booping, all the time.

Going back to the Training data, let’s make some conceptual distinctions:

  • Content: specific facts, statements and assertions that were (possibly) encoded into those billions of probabilities from the training data
  • Structure: the overall probability that a string of words (really fragments of words) comes out again looking like something we expect, which has been adjusted via Fine Tuning

Remember, this is just a probabilistic text generator. So there are probabilistic facts, and probabilistic structure. And that probabilistic part is why we have words like “hallucination” and “slop” and “safety”. There’s no there there. It’s just probabilities. There’s no guarantee that a particular fact has been captured in those billions of variables. It’s just a text generator. And it’s been trained on a lot of dumb shit people write. It’s just a text generator. Don’t trust it.

So on to some prompting strategies:

  • Zero-Shot Prompting: This just means to ask something open-ended and the AI returns something that probabilistically follows:

    Classify the sentiment of the following review as positive, neutral, or negative: “The quality is amazing, and it exceeded my expectations”

  • Few-Shot (or one-shot/multi-shot) Prompting: This just means to provide one or more examples of “expected” completions in the prompt (remember, this is all prompt, not fine-tuning) to try to narrow down what could probabilistically follow:

    Task: Classify the sentiment of the following reviews as positive, neutral, or negative.
    Examples:

    1. “I absolutely adore this product. It’s fantastic!” - positive
    2. “It’s okay, not the best I’ve used.” - neutral
    3. “This is terrible. I regret buying it.” - negative
      Now classify this review:
    4. “The quality is amazing, and it exceeded my expectations” - [it’s blank, for the model to finish]

Note: Zero/One/Few/Multi-Shot is sometimes called “Learning” instead of “Prompting”. This is a terrible name, because there is no learning (the models are dead!), but it is one of those things where the most assume-good-intent explanation is that, over the course of the prompt and its incrementally generated completion, the output assumes the desired shape.

  • Chain of Thought Prompting: The idea here is that the prompt includes a description of how a human might explain what they were doing to complete the prompt. And that boops the completion into filling out those steps, and arriving at a more expected answer:

    Classify the sentiment of the following review as positive, neutral, or negative.
    “I absolutely adore this product. It’s fantastic!”
    Analysis 1: How does it describe the product?
    Analysis 2: How does it describe the functionality of the product?
    Analysis 3: How does it describe their relationship with the product?
    Analysis 4: How does it describe how friends, family, or others relate to the product?
    Overall: Is it positive, neutral, or negative?

Note: again, there is no “thought” happening. The point of the prompt is to boop the text completion into giving a more expected answer. There are some new models (as of late 2024) that are supposed to do Chain-of-Thought implicitly; afaik there is just a hidden/unshown prompt that says “break this down into steps”, and an intermediate output that is fed into another hidden/unshown prompt, and then the output of that is shown to you. That’s why they cost more, because it’s invoking the LLM twice on your behalf.

  • Chain Prompting: This simply means that you take the output of one prompt, and then feed that into a new prompt. This can be useful to isolate a specific prompt and output. It might also be necessary because of length: LLMs can only operate on so many words, so if you need to summarize a long document in a prompt, you’d need to first break it down into smaller chunks, use the LLM to summarize each chunk, and then combine the summaries into a new prompt for the LLM to summarize in turn.
  • RAG (Retrieval Augmented Generation) Prompting: This means that you look up some info in a database, and then insert that into the prompt before handing it to the LLM. Everything is prompt, there is only prompt. (There’s a sketch of both of these techniques below.)
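
To make those two concrete, here’s a minimal Ruby sketch. Everything named here is hypothetical: llm_complete stands in for whichever LLM API you call, and find_relevant_documents stands in for whatever lookup you do (a SQL query, an embeddings search, etc.):

    # Chain Prompting: summarize chunks, then summarize the summaries
    chunk_summaries = long_document.scan(/.{1,4000}/m).map do |chunk|
      llm_complete("Summarize this text in one paragraph:\n\n#{chunk}")
    end
    summary = llm_complete("Combine these partial summaries into one summary:\n\n#{chunk_summaries.join("\n\n")}")

    # RAG Prompting: look the facts up first, then insert them into the prompt
    context = find_relevant_documents(question).join("\n---\n")
    answer = llm_complete(<<~PROMPT)
      Answer the question using only the context below.

      Context:
      #{context}

      Question: #{question}
    PROMPT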

Note: “Embeddings” are a way of search-indexing your text. LLMs take all those trillions of documents and probabilistically boil them down to several billion variables. Embeddings boil a single piece of text down even further, to a couple thousand variables (floating point numbers). Creating an embedding means providing a piece of text, and getting back the values of those thousand floating point numbers that probabilistically describe that text (big brain idea: it is the document’s location in a thousand-dimensional space). That lets you compute across multiple documents: “given this document within the n-dimensional space, what are its closest neighboring documents semantically/probabilistically?” Embeddings are useful when you want to do RAG Prompting to pull out relevant documents and insert their text into your prompt before it’s fed to the LLM to generate output.
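
As a toy illustration of “closest neighbors” (in practice you’d use something like pgvector; embed here is a hypothetical call to an embeddings API):

    # Cosine similarity between two embedding vectors (arrays of floats)
    def cosine_similarity(a, b)
      dot = a.zip(b).sum { |x, y| x * y }
      dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
    end

    query_embedding = embed("How do frogs breathe?")
    documents.max_by { |document| cosine_similarity(query_embedding, document.embedding) }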

  • Cues and Nudges: There are certain phrases, like “no yapping” or “take a deep breath”, that change the output. I don’t think there is anything delightful about this; it’s simply trying to boop up the variables you want in your output, and words are the only inputs you have. I’m sure there will someday be better ways to do it, but whatever works.

A strong opinion about zero-shot prompting

Don’t do it! I think it’s totally fine if you just want to ask a question and try to intuit the extent to which the model has been trained or tuned on the particular domain you’re curious about. But you should put ZERO stock in the answer as something factual.

If you need facts, you must provide the facts as part of your prompt. That means:

  • Providing a giant pile of text as content, or breaking it down (like via embeddings) and injecting smaller chunks via RAG
  • Providing as input any and all facts you ever expect to get back out in the output

It’s ok to summarize, extract, translate, or classify sentiment. The only reason it’s ok to zero-shot code is because it’s machine-verifiable (you run it). Otherwise, you must verify! Or don’t do it at all.

Including Rails View Helpers is a concern

If you’re currently maintaining a Ruby on Rails codebase, I want you to do a quick regex code search in your editor:

include .*Helper

Did you get any hits? Do any of those constants point back to your app/helpers directory? That could be a problem.

Never include a module from app/helpers into anything in your application. Don’t do it.

  • Modules defined in app/helpers should exclusively be View Helpers. Every module in the app/helpers directory is automatically included into Views/Partials, and available within Controllers via the helpers proxy, e.g. helpers.the_method in Controllers or ApplicationController.helpers.the_method anywhere else (there’s an example after this list).
  • Including View Helpers into other files (Controllers, Models, etc.) creates a risk that some methods may not be safely callable because they depend on View Context that isn’t present. (They’re also hell to type with Sorbet.)
  • If you do have includable mixins (“bucket of methods”) that do make sense to be included into lots of different classes (Controllers, Models, Views, etc.), make them a concern and don’t put them in app/helpers.
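
To make the distinction concrete, here’s a hypothetical sketch (the names are illustrative):

# app/helpers/users_helper.rb — a true View Helper; it depends on view context
module UsersHelper
  def user_display_name(user)
    content_tag(:span, user.name, class: "user-name")
  end
end

# app/controllers/users_controller.rb
class UsersController < ApplicationController
  # include UsersHelper # ❌ never do this

  def show
    @user = User.find(params[:id])
    # ✅ the helpers proxy supplies a real view context
    @page_title = helpers.user_display_name(@user)
  end
end

# ✅ ...and anywhere else:
ApplicationController.helpers.user_display_name(user)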

Some general background

Rails has always had View Helpers. Prior to Rails 2 (~2007), only the ApplicationHelper was included into all controller views; other helpers had to be added manually. Rails 2 changed the defaults via helper :all and config.action_controller.include_all_helpers to always include all Helpers in all Views.

Rails 4.0 (2013) introduced Concerns, which formalized conventions around extracting shared behaviors into module mix-ins.

Rails 5.0 (2016) introduced the Action Controller helpers proxy, and came with a clear summary of the problem that I’ve observed too:

It is a common pattern in the Rails community that when people want to use any kind of helper that is defined inside app/helpers they includes the helper module inside the controller like:

module UserHelper
  def my_user_helper
    # ...
  end
end

class UsersController < ApplicationController
  include UserHelper

  def index
    render inline: my_user_helper
  end
end

This has problem because the helper can’t access anything that is
defined in the view level context class.

Also all public methods of the helper become available in the controller
what can lead to undesirable methods being routed and behaving as
actions.

Also if you helper depends on other helpers or even Action View helpers
you need to include each one of these dependencies in your controller
otherwise your helper is not going to work.

Some specific background

This has come up as a problem at my day job, GitHub. GitHub has the unique experience of being one of the oldest and largest Ruby on Rails monoliths and it’s full of opportunities to identify friction, waste, and toil.

Disordered usage of View Helpers and the app/views directory became very visible as we’ve been typing our monolith with Sorbet. Typing module mixins in Sorbet is itself inherently difficult, but View Helpers had accumulated a significant amount of T.unsafe escape-hatches, and in understanding why… we discovered that explicitly including View Helpers in lots of different types of classes was a cause.

What’s the alternative?

I analyzed the different types of modules that were being created, and came up with this list:

  • Concerns are shared behaviors that may be optionally included into multiple other classes/objects when that behavior is desired. We can further break down:
    • Application-level Concerns are agnostic about the kind of object they are included into (could be a controller, or model, or a job, or a PORO)
    • Component-level Concerns are intended to only be mixed into a specific kind of object, like a controller with expectations that controller-methods are available to be used in that concern (like an http request object, or other view helpers like path helpers)
  • Dependencies are non-shared behaviors that have been extracted into a module from a specific, singular class to improve behavioral cohesion, and are then included back into that one, specific class or object.
  • View Helpers are intended to be used across Views (or Controllers via the helpers view-proxy method, or ApplicationController.helpers anywhere else) for formatting and presentation purposes, and have access to other view helpers and http request objects. These are the only modules that should go in app/helpers.

And this is what you might do about them:

  • Stop and remove include MyHelper from Controllers. Instead, you can access any View Helper method in a controller via helpers.the_name_of_the_method
  • Move Concerns and Dependencies out of app/helpers. If what is currently in app/helpers is not a View Helper, move it:
    • Application-level Concerns should be moved into app/concerns
    • Component-level Concerns should be moved into their appropriate app/controllers/concerns or your_package/models/concerns, etc.
    • Dependencies should be moved to an appropriate place in their namespace hierarchy. e.g. if the module is only included into ApplicationController, it should be named ApplicationController::TheBehavior and live in app/controllers/application_controller/the_behavior.rb
  • Never include a module from app/helpers anywhere. Don’t do it.
  • Use the Controller helpers proxy or ApplicationController.helpers.the_helper_method to access helpers (like ActionView::Helpers::DateHelper) in Controller or other Object contexts.
  • Invert the relationship between Helpers and Concerns. If you have behavior that you want available to lots of different kinds of components and views, start by creating a Concern, and then include that Concern into a View Helper or ApplicationHelper. Don’t go the other direction. (There’s a sketch at the end of this list.)
  • Invert the relationship between Views and Controllers. If you have a private method that is specific to a single controller, and you want to expose that method to the controller’s views, you can expose that method to the views directly using helper_method :the_method_name . Use this sparingly, because it extends singleton View objects which deoptimizes the Ruby VM; but really, don’t twist yourself into knots to avoid it either, that’s what it’s there for.
  • (optional but recommended) Rename the constant too, not just move it, when it’s not a View Helper. Naming things is hard, but *Helper is… not very descriptive. It’s the placement in app/helpers that brings the automatic behavior, so it’s not technically a problem to have a SomethingHelper that isn’t a View Helper living in app/controllers/concerns… but it is confusing to have non-helpers named SomethingHelper. Some suggestions for renaming Concerns and Dependencies:
    • Use the “-able” suffix to turn the behavior or capability into an adjective. e.g. SoftDeletable
    • Append Dependency to the end, like AbilityDependency
    • If you’re out of ideas, use Methods or Mixin, like UserMethods or UserMixin.
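
Here’s that Helper/Concern inversion as a hypothetical sketch (names are illustrative):

# app/concerns/time_formattable.rb — the shared behavior, as a Concern
module TimeFormattable
  extend ActiveSupport::Concern

  def short_date(time)
    time.strftime("%b %-d, %Y")
  end
end

# app/helpers/application_helper.rb — the Concern is included INTO the View Helper
module ApplicationHelper
  include TimeFormattable
end

# app/models/report.rb — ...and into anything else that needs the behavior
class Report < ApplicationRecord
  include TimeFormattable
end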

Keep your secrets.yml in Rails 7.2+

Ruby on Rails v7.1 deprecated, and v7.2 removed, support for Rails.application.secrets and config/secrets.yml in favor of Encrypted Credentials. You don’t have to go along with that! I like the Secrets functionality because it allows for consolidating and normalizing ENV values in a single configuration file with ERB (Encrypted Credentials doesn’t).

It’s extremely simple to reimplement the same behavior using config_for and the knowledge that methods defined in application.rb show up as methods on Rails.application:

# config/application.rb

module ExampleApp
  class Application < Rails::Application
    # ....
    config.secrets = config_for(:secrets) # loads from config/secrets.yml
    config.secret_key_base = config.secrets[:secret_key_base]

    def secrets
      config.secrets
    end
  end
end

That is all you need to continue using a secrets.yml file that looks like this:

# config/secrets.yml

defaults: &defaults
  default_host: <%= ENV.fetch('DEFAULT_HOST', 'localhost:3000') %>
  twilio_api_key: <%= ENV.fetch('TWILIO_API_KEY', 'fake') %>
  mailgun_secret: <%= ENV.fetch('MAILGUN_SECRET', 'fake') %>

development:
  <<: *defaults
  secret_key_base: 79c6d24d26e856bc2549766552ff7b542f54897b932717391bf705e35cf028c851d5bdf96f381dc41472839fcdc8a1221ff04eb4c8c5fbef62a6d22747f079d7

test:
  <<: *defaults
  secret_key_base: 0b3abfc0c362bab4dd6d0a28fcfea3f52f076f8d421106ec6a7ebe831ab9e4dc010a61d49e41a45f8f49e9fc85dd8e5bf3a53ce7a3925afa78e05b078b31c2a5

# Do not keep production secrets in the repository,
# instead read values from the environment.
production:
  <<: *defaults
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
  default_host: <%= ENV['DEFAULT_HOST'] || (ENV['HEROKU_APP_NAME'] ? "#{ENV['HEROKU_APP_NAME']}.herokuapp.com": nil) %>

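With that in place, usage works like it always did. config_for evaluates the ERB, selects the section matching the current Rails.env, and returns an ActiveSupport::OrderedOptions:

Rails.application.secrets[:twilio_api_key]  # => "fake" in development
Rails.application.secrets.default_host      # => "localhost:3000"
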
Note: This only works for secrets.yml not secrets.enc.yml which was called “Encrypted Secrets.” If you’re using “Encrypted Secrets” then you should definitely move over to the Encrypted Credentials feature.

A mostly technical reflection on Disaster Relief Assistance for Immigrants

“Meteors are not needed less than mountains”

— Robinson Jeffers, “Shine, Perishing Republic”

I recently kicked off a new outside project to build on my experience building GetCalFresh, a digital welfare assister that’s helped millions of Californians successfully apply for billions of dollars of food assistance from CalFresh/SNAP. While going through my contemporaneous notes from that time, I realized I had never written about another project I was deeply involved with: Disaster Relief Assistance for Immigrants (DRAI) during the COVID-19 pandemic.

Code for America published a little bit about DRAI in “Dismantling the Invisible Wall”:

DRAI was a modest but crucial lifeline for undocumented Californians. The program’s goal was to distribute $500 bank cards to 150,000 undocumented adults who had experienced some adverse impact from the pandemic—from those who lost wages or jobs and had kids home from school needing care, to those who had gotten the coronavirus or had to take care of a family member who did. The California Department of Social Services (CDSS) selected twelve community-based organizations (CBOs) that usually provide legal assistance to undocumented communities to distribute these funds. … Ultimately, the project succeeded in distributing every single bank card to community members, putting a total of $75 million dollars directly into peoples’ pockets.

I wanted to write about my experiences with the technical bits, as I was the lead engineer at Code for America building that technical system to facilitate distribution of these vital funds during a global pandemic. This included building:

  • The digital intake form, which was used by frontline assisters to read a script and collect information
  • A review flow for supervisors to approve applications, screening out incomplete and duplicated applications
  • A disbursement flow to issue and activate payment cards and discourage theft of funds
  • Table stakes operational faff, like user accounts and supervisory dashboards, and audit logs and all the wayfinding and chrome and umwelt and pastiche for people to navigate from place to place in the system with as little intentional training as possible.

I no longer work at Code for America, but the code is on GitHub and I figured it would be fun to fill in some of the technical story. My screenshots contain faked seed data.

Some of the human bits

My pandemic timeline starts on March 13, 2020. That is the date San Francisco went into lockdown, as well as the date purchase offers were due on the home where I now live as a first-time homebuyer. That week was also fairly tumultuous at Code for America because it was in the last days leading up to CfA’s first-ever Summit conference to be held in Washington DC, and it looked like a contentious debate from my desk’s view outside the glass-walled conference rooms.

I was an engineer and manager on GetCalFresh then, and those first months of the pandemic were hard. Application intake quadrupled from ~2,000 a day to over 8,000 per day. Because the end-to-end process can take up to 45 days (even longer as government offices were overwhelmed), we were supporting an active caseload of more than 300,000 applicants every day. Code for America was then a strict pair-programming shop modeled on Pivotal Labs, and we were figuring out how best to move that culture to a remote one. And I remember, darkly, that all of us with people-management responsibilities each put together a “Continuity Plan” to document responsibilities and fallbacks should anything happen to ourselves or our reports.

My notes for DRAI from the period are somewhat sketchy about exact dates:

  • Late March CfA was in talks to build the DRAI portal
  • April 14 the first commit was laid down by myself and another GCF engineer
  • May 18 applicants could start applying through community partners
  • End of June all funds were committed
  • End of July all funds were distributed
  • August final reporting
  • September began tearing it down

The initial team was pulled from GetCalFresh. A program manager, a product manager, a designer, a client researcher, and two engineers, one of them me. We were a high-trust group having already been closely working on GetCalFresh.

There was some initial strain in responsibilities. On the very mature project GetCalFresh, we had highly structured responsibilities: the PM and researcher decide what to build, the designer makes it usable, the program manager checks correctness, and the engineers make it work. On GetCalFresh I had to learn to work that way: “Respect for colleagues is trust in their expertise”. I’d come from labs and fellowships and early-stage startups where as a developer you just did and questioned… well, everything. So that was an adjustment back to like “um, I don’t see anyone doing this. is it ok for me to? I didn’t hear back so I’m doing the thing.” Given the short timeline and the evolving requirements, it was largely the designer working on the intake form (which was a beast!), and myself…. designing everything else, at least in the initial weeks: entities, roles, stages, states, and how users of the system would navigate through them. A state named “disbursement” is exactly the sort of thing that’s my fault.

A screenshot of the DRAI dashboard with various states of committed and disbursed

Of the requirements, I remember they were… vague, which is a function of the timeline, and also par for the course. I remember disambiguating “manager” from “supervisor”, asking what exactly constituted a duplicate identity and hard and soft matching and transparency (having GetCalFresh data was an invaluable resource here), and whether anyone cared to look at the specifics about how we implemented auth and 2FA and audit logging though they were (waves hands) required. Like most government-led projects during my CfA tenure, a lot more effort in the requirements was spent on ineligibility criteria than on how eligible people are expected to access the benefit or escalate when they run into trouble. And there were many calls with the frontline community organizations to understand what they needed to support and supervise their workers and administer the program. And this was during lockdown, remember.

All work is human work. My second engineer’s father, a doctor, passed away, so she stepped back and two more GetCalFresh engineers joined us. We also had a lot of help from other folks at CfA like data science and Client Success. Our CTO was totally ok whenever I asked for tens of thousands of dollars of expenditure, and our Director of Engineering had previously at no small effort streamlined our secure infrastructure with Aptible. And folks outside of CfA; a former CfA fellow and engineer at Twilio was able to get us a Short Code in under a week. Code for America has a sometimes unbearable abundance of financial and human capital, and I’m grateful and proud we were able to access it.

I wasn’t sleeping very well at the time, and had a lot of manic energy. I’d be writing Google Docs at 4am. I had started working on GoodJob just the month before. My wife and I were in the midst of IVF treatment cycles. My mom would get her first cancer diagnosis. We closed on our first home; the movers wore tyvek suits and masks. It was the middle of the pandemic. It was a time!

Of vocation, I can’t imagine working on anything better, either!

Pronunciation guide: We initially named it DAFI (Disaster Assistance For Immigrants; daffy like the duck) but then when it shifted to DRAI, no one ever was consistent in pronouncing it dry or dray. 🦆🌵🚚🤷

Onto the technical bits

Boring-ass Rails. I shouldn’t be shocked by now, with so many proof points over my career, but I am: how simple it is, with experience, to respond to rapidly shifting requirements in fairly-vanilla Rails: CRUD, fat controllers, fat models, fat jobs, validation contexts, ERB, form builders, UJS, system tests, seeds, factories…. It’s not particularly difficult or mentally stimulating; it’s simply hands-on-keyboard time, with the satisfaction of being done with that feature or capability and on to the next one, and the next one, and the next one. (aside: there’s a “Patterns vs Platform” thought floating around here)

A walking skeleton. It’s ingrained in me that you rails new and then immediately set up CI and a production deployment pipeline… but when I paired on it with my second engineer for the first time, they were like “I wouldn’t have thought of doing that.” So I mention it.

An expert system. Code for America did not do expert systems, and the contemporary design system, Honeycrisp, was woefully inadequate as it was intended for big, fresh, juicy, low-cognition wizards and not tight, dry, high-density, familiarized workflows and reporting. I got to design all that! Which in practice meant rolling out the same thing I did at Pantheon, where I also managed the usability testing program (startups!), so I felt confident in what we laid down: 🎶 Tabs and dropdowns, tables and lists; horizontal, vertical, click, click, click! 🎶

A screenshot of the dashboard with various tabs and tables

Building the plane while flying it. The most joyous, stressful part of the whole thing was that we launched the Portal with only the intake section, and then each week we frantically fixed the bugs and built the next step in the workflow process. The tabs are numbered in the order we built them, and we just had them disabled in the UI with their contents not even designed yet. I remember surprising so many folks cause we’d get an inbound email with a bug report, I’d zing back out an email confirming it, us engineers would pair on the fix, deploy, and send another email back out to confirm within minutes. Expectations in this field are so low. I think we had a telephone bridge and sent emails on Friday with like “You can now click on Step 3 in the portal”; everyone was so wired! 😰 Also, in this metaphor, the plane itself held no enduring value; we were building it to safely land at the destination.

A screenshot of the intake form with numbered tabs along the screen

ApplicationTexter. There was space (not a lot!) for some experimentation from experience building GetCalFresh. I have a deeper dive to write about this, but briefly: we used Action Mailer to also format and send SMS messages via a custom Twilio-based delivery method. It made sending SMS messages look the same, in code, as sending an email. I’m really proud to have upstreamed one part of it: Action Mailer deliver callbacks.
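
I’ll save the deep dive, but the shape of a custom delivery method is roughly this (a hypothetical sketch, not the actual DRAI code; the settings names are illustrative):

# A delivery method is any object that responds to deliver!(mail)
class TwilioDeliveryMethod
  def initialize(settings)
    @settings = settings
  end

  def deliver!(mail)
    client = Twilio::REST::Client.new(@settings[:account_sid], @settings[:auth_token])
    client.messages.create(from: @settings[:from], to: mail.to.first, body: mail.body.raw_source)
  end
end

# config/initializers/application_texter.rb
ActionMailer::Base.add_delivery_method(:twilio, TwilioDeliveryMethod,
  account_sid: ENV["TWILIO_ACCOUNT_SID"],
  auth_token: ENV["TWILIO_AUTH_TOKEN"],
  from: ENV["TWILIO_FROM_NUMBER"])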

Streaming CSV. One of the needs was to generate 10k+ row CSVs so that the administering organizations could do analysis and oversight. We were able to stream them directly from Postgres.
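
The trick, roughly (a sketch assuming Postgres and the pg gem; the query is illustrative):

# In a controller action: stream rows straight out of Postgres with COPY,
# never building the whole CSV in memory
def index
  response.headers["Content-Type"] = "text/csv"
  response.headers["Content-Disposition"] = 'attachment; filename="applications.csv"'

  self.response_body = Enumerator.new do |lines|
    connection = ActiveRecord::Base.connection.raw_connection
    connection.copy_data("COPY (SELECT * FROM applications) TO STDOUT WITH (FORMAT CSV, HEADER)") do
      while (line = connection.get_copy_data)
        lines << line
      end
    end
  end
end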

Typhoeus. One of the most unique aspects of DRAI for me was that our system was activating payment cards. A supervisor would type in the number on the physical card packaging, and then the system generated a unique activation code and sent it to the card issuer, Blackhawk (who were just the best partners 💖), who’d assign it to the card, and then we’d send a message to the client with the code. Everything worked great in the validation environment with curl, but we could never get it to work with net/http or any other Ruby HTTP library. We spent way too long poking at it, together with Blackhawk engineers. And then, because we had to move on, we just used curl via Typhoeus. Never figured that one out.

Localization. While there wasn’t a public, client-facing part of the system, there was a lot of localization we did to make it easier for bilingual assisters to read off the application to an applicant. This meant we had a lot of mixed language pages which I had limited experience with until then, as well as incomplete translations because we were building and pushing and translating all at the same time. I had a particularly difficult time with Arabic, which is a right-to-left language and we had to patch translate to get missing-translations working properly, and I remember difficulty getting all form-inputs to be LTR regardless of the surrounding content. We did make some awesome i18n-tasks tooling for importing and exporting translations to CSV and then into Google Sheets for translators and back again.

Multiparameters. Active Record’s worst dark magic is multiparameter attributes, which is how Active Record can decompose a single Date attribute into a 3-part datefield form and back again. It’s wild! And a huge source of 500 errors when users not-unreasonably choose invalid values (“February 31”). But I also not unreasonably believe that decomposing dates is just good UX, so there’s a patch for that.
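
For the unfamiliar, here’s what multiparameter attributes look like on the wire, and why invalid dates blow up (Post and published_on are illustrative):

# A 3-part date field posts params like:
#   "published_on(1i)" => "2020", "published_on(2i)" => "2", "published_on(3i)" => "31"
# and Active Record recombines them into a Date during assignment:
Post.new("published_on(1i)" => "2020", "published_on(2i)" => "2", "published_on(3i)" => "31")
# => raises ActiveRecord::MultiparameterAssignmentErrors, because there is no February 31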

Metabase. Metabase was one of the tools I had previously introduced to Code for America, and it was invaluable on this project. We were using the paper_trail gem for auditability, which produced a very rich event stream. I got real good at COUNT FILTER, JSONB querying, and some neat Metabase features like trendlines, variables, and reusing SQL questions. I’m particularly proud of how many other people were able to build wicked-good dashboards with Metabase.

Rufus-scheduler. This was my first project using rufus-scheduler as a container-compatible replacement for cron on VMs. It worked really well, and that experience was a source of my initial reluctance to build such functionality into GoodJob. (I did eventually relent in GoodJob and am happy with that too).

Factories and Seeds. A quality-of-life thing: we spent some intentional time optimizing FactoryBot factories with associations and traits to make test setup as direct as possible. And having comprehensive seeds made it no-fuss to reset one’s development database, or set up a Heroku Review App for acceptance.

Overusing Scenic Views and SQL Counts. There were several features near the end of the project that we fit in without significantly rearranging the data structure, specifically around reporting precision and a fifo-waitlisting feature. There were several messy Scenic-powered Views that used Window Functions and other complex queries; this was the source of the only significant outage I remember: joining a relation to a database view that resulted in a table scan that kicked over the database. There were also several places where we made complex association counters as optimized database queries rather than counter caches, in the spirit of this, that were messy but ok; 50/50 would do again.

Telecom Outages. Most out of our control: there was a major telecom outage during the most critical part of the project that required a lot of recovery work to ensure SMS activation codes were delivered.

ZIP to FIPS. Ok, not truly technical but when I was writing this and refamiliarizing myself with the codebase, I was reminded of how much of the work is interpreting between geographic data systems and the overrides on overrides because different parties are using different data sets from different providers and provenances. Yeeesh! This is why you save your innovation tokens by writing boring-ass Rails… to spend it on this nonsense.

Open Source. I don’t imagine I would even be writing this if the project code wasn’t publicly available on GitHub. I’ve frequently linked to bits when folks on Reddit ask for Rails examples, and for every story here I can find a reference: a PR, a commit, a piece of code (the ticket backlog was in Pivotal Tracker, RIP). In my contemporaneous notes, I discovered I had written “Week of March 9, 2020… Org Open Source Strategy Plan… license… contributing… marketing… talent”; clearly that didn’t happen at large (someday GetCalFresh, someday?) but it probably had an effect in the small, for which I am grateful to the many people who let it happen.

Spectator Sport, a brief introduction to an upcoming Rails plugin

Hi! 👋 I’m Ben Sheldon. I’m the author of GoodJob, an Active Job backend that I’ll humbly share is mildly popular and known for its broad features and ease of use. I’m working on a new plugin for Rails: ✨

Spectator Sport creates and replays video-like recordings of your live, production website, via a self-hosted Ruby on Rails Engine that lives in your application.

Spectator Sport uses the rrweb library to create recordings of your website’s DOM as your users interact with it, from the perspective of their web browser’s screen (html, css, images, dynamic content, mouse movements and clicks, navigation). These recordings are stored in your Active Record database for replay by developers and administrators to analyze user behavior, reproduce bugs, and make building for the web more engaging, satisfying, and fun.

Here’s a proof of concept demo. It’s very, very, very, very rough and early, but everyone I have demoed it for says “wow, that is totally different and much better than I imagined it when you first explained it to me”: https://spectator-sport-demo-1ca285490d99.herokuapp.com

🚧 🚧 This gem is very early in its development lifecycle and will undergo significant changes on its journey to v1.0. I would love your feedback and help in co-developing it; fyi, it’s going to be so much better than it is right now.

You can help:

Who is this gem for?

I’m writing this on the weekend following Rails World 2024, on the eve of Rails 8’s release. The Rails team is tackling the hard problems of making it easy to deploy your website into production with tools like Kamal, and addressing operational complexity with the Solid suite. What comes next?

Spectator Sport intends to transform your relationship with your website after it is deployed to the web and real people have the opportunity to use it:

  • See how people are actually using your website, directly.
  • Remove the necessity of defining funnel metrics or analytics up front, or the necessity of interpreting user behavior through a limited lens of aggregated or averaged numbers.
  • As a developer and product-maker, more fully engage your sympathetic senses, in addition to your analytical senses, to ultimately be more effective and fulfilled when building for the web.

Launching a website kinda sucks. I’ve been a solopreneur, and web-marketing consultant, and “founding” engineer, and “growth” engineer at VC-backed startups and product labs, and a participant and mentor in entrepreneur communities and mastermind and accountability groups for 19 years. People are not ok.

The fast development feedback of imagining and building the thing locally, the dopamine rush of making something nice, one step at a time, for other people to use… that all drops off a cliff once, well, you put it out there for other people to use. The post-launch and release feedback: it’s not there.

It sucks! People feel it! They’re confused, they’re sad, sometimes mad, looking for help, wanting to be seen by others, spinning on valueless technical changes, sharing tangential hot takes and engagement bait. Developers are going anywhere but directly to the people they’re building for. One reason, I believe, is because their visitors’ and users’ activity on their website is largely invisible and unknowable, and the only way to see it is through a foggy, confusing and deeply unsatisfying window of abstract metrics and aggregation.

Building for the web should be a spectator sport. More than only a fantasy game of metrics and aggregates and guesses and spread-table gambles. It should be fun and engaging and direct. We should simply be able to observe and react and cheer and cry and fall on the floor and get up and make it better and go again. Believe.

There are constraints to what I can do to achieve this vision, with this gem. I’m focused on building for Ruby on Rails. And specifically hobbyists, solopreneurs, small teams, SMBs (small and midsize businesses), and unique applications, including:

  • applications with limited budgets or the inability (because of geography or policy) to contract or procure a 3rd party service.
  • applications in government or healthcare or on an internal intranet with unique data or privacy constraints that don’t have the budget for a BAA (business associate agreement) or other compliance contracts
  • applications for which operational simplicity is paramount and don’t have the resources to operate a more complex self-hosted solution

We have the technology

Browser recording isn’t new. Fullstory was my own introduction to it nearly a decade ago; there’s also Tealeaf and Sentry and PostHog and Highlight and Matomo and many others, some of which are no-cost self-hostable as a separate service, though often with complex dependencies. Many of them use rrweb too.

I believe Spectator Sport is the first no-cost, self-hostable browser-recording tool that works anywhere your application runs (Heroku being the narrowest target I can imagine). Tell me what I’m missing!

If my adjectives themselves aren’t compelling and your website already has massive scale and a rich revenue stream and/or no concerns about 3rd-party data subprocessors, I highly recommend checking out PostHog (just $0.005 per recording!) or Sentry (enterprise gated, but integrated into Sentry’s other features which are fantastic).

A good job, again

I mentioned in my introduction that my other gem, GoodJob, is well-regarded. I think we can do it again with Spectator Sport:

  • Focus on solving a class of problems developers experience over a long period of time, not building a specific technology tool and calling it a day.
  • Serve the vastly more numerous solo and full-stack dev teams with limited time and budgets, who will benefit from something tailored to their largely consistent needs (easy, good, inexpensive) and are nice and appreciative when you deliver, rather than the very small number of experienced folks with big budgets and unique needs who inexplicably have time on their hands to be outspoken in telling you it will never work for them.
  • Provide a wide offering of polished features, using boring, existing tech to do the complex bits (like Postgres advisory locks in GoodJob, or rrweb in Spectator Sport). The value comes from the usability of the integration. A full-featured, cleanly designed web dashboard really impresses too; Dark Mode is the epitome of a non-trivial feature to maintain that demonstrates care.
  • Maintain a narrow compatibility matrix, focus on “omakase” Rails (Active Record, Active Storage, etc.) with a sensible EOL policy. Complexity kills. Relational databases are great. Squeeze the hell out of the system you have.
  • Be exceptionally responsive and supportive of developers who need help and meet them where they are. Be personally present because the library can’t speak for itself. Make mistakes, change direction, communicate intent, move forward.
  • Keep the cost of change low, release frequently, build up, iterate, document and test and provide deprecation notices, follow SemVer, and defer application-breaking changes as long as possible.

I do want to try one thing new compared to GoodJob: I want Spectator Sport to be compatible with Postgres and MySQL and SQLite. I believe it’s possible.

Front-running the criticism

Here are the things I have worked through myself when thinking about Spectator Sport, and talked about with others:

Is it creepy? Yes, a little. There is overlap with advertising and marketing and “growth” tech, and many service providers market browser recording as a premium capability with premium prices and sell it hard. Realistically, I have implemented enough dynamic form validations in my career that I no longer imagine any inherent sanctity in an unsubmitted form input on a random website. Conceptually, Spectator Sport observes your website as it is operated by a user; it does not observe the user. Every webpage deserves to be a place, and this just happens to be your CCTV camera pointed at it, for training purposes.

Is it a replacement for usability research? No, of course not. Spectator Sport can only show you half of the picture (or less) that you get from real usability research. When you do real usability research and ask a subject to do something on your website, you ask them to explain what they’re doing, in their own words, based on their own understanding of the task and what they see through their own eyes. Browser recordings alone can’t give you all that. You still have to fill in the blanks in the story.

Is it safe? I think so. I intend for all user input to be masked by default, for the gem to be secure by default, and to provide comprehensive documentation that explains both the why and the how of locking down what’s stored and who can access it. Spectator Sport is shipping the DOM to your own database, and it’s likely the same data already lives in the database in a more structured way, and is already reflected back through your application too.

Does it use a lot of storage? Not as much as you might fear. If people’s big scaling worry for GoodJob was “it will be too slow”, I think Spectator Sport’s will be “it will be too big”. I’ve been running the proof of concept on my own websites, and 1.5k recordings took up ~500MB of storage in Postgres. Retention periods can be configured, and data can be compressed and offloaded to Active Storage. I believe it is ok, and worth the squeeze.

Can it do xyz? Maybe. Open an issue on GitHub. I’d love to discuss it with you.

Wouldn’t you rather do something with AI? I dunno, man. I freaking love watching recordings of my websites being driven by people and thinking about how to make the website easier and better for them. I think this is an immensely satisfying missing piece of building for the web, and I think you will too.

Tell me what I’m missing or overlooking!

The call to action, a second time, at the bottom

Something I learned a long time ago, from watching browser recordings (true story!), is that visitors will go deep below the hero’s call-to-action, read all the lovely explanatory content, get to the bottom… and bounce because the call to action wasn’t reinforced.

So, please:

Seeing like a Rails and Ruby platform team

When I’m not hacking on GoodJob, I work at GitHub, where I’m the engineering manager of the “Ruby Architecture” team, which is filled with fantastic rubyists. Our team mission is to:

Make it easy for GitHub engineers to create, deliver, and operate best-of-class Ruby and Rails applications, and share the best of it with the world.

This is an adaptation of a post I published internally at GitHub, and its ensuing discussions, to explain what a team like ours does when we’re supporting other teams and giving technical feedback. I imagine this is similar to other big companies’ Rails and Ruby platform teams, like Shopify’s “Ruby Infrastructure” team. I hope this is useful in thinking about your own Rails and Ruby work, experience, and career development focuses.

Before you “architecture”

The rest of this post is a big ol’ list of deep Ruby technical topics. To avoid premature optimization and architecture astronautics, I want to just quickly ground some expectations of any technical change you lead:

  • Is it clear what it does, especially to others, who may be yourself in the future?
  • Does it follow established patterns and precedent throughout the codebase, and is it internally consistent with itself?
  • Does it accomplish the business goal? Does it work?
  • Does it not prevent other components from accomplishing their business goals? Does it not break or negatively impact other stuff?

I write these things out because it’s very common, as a technical feature goes through multiple reviews and revisions, to lose sight of its original goals or purpose or business constraints. So set yourself up for success by being clear on that stuff up front, and push back (or go deeper) if someone tells you something needs to change for technical reasons but it compromises your intended non-technical outcome.

Architecting Ruby, the list

A brief note about my authority on this. The following list comes out of my experience working on a big Rails and Ruby monolith at GitHub, which has largely co-evolved with Rails and Ruby over the past 15+ years, alongside 1k+ other engineers. (I’m also a consultant, and have worked in a lot of software labs and untangled a lot of other people’s applications; and not-Rails stuff too.) Many members of the team are core maintainers of Rails and Ruby, and we treat the Rails framework as an extension of our application. Our team is responsible for integrating upstream changes in Rails and Ruby within GitHub’s vast monolith. We upgrade and integrate Rails and Ruby main/dev/trunk changes weekly! (Never repeat, never forget.) This continuous practice produces a deep familiarity with how change happens, and where friction builds up between an application and its upstream dependencies. Performing these upgrades over and over leads to experience, and repeated experience leads to intuition.

(btw, please reach out if your company has a practice of continuously upgrading Rails main/dev/trunk and running it in production. GitHub and Shopify and Gusto are trying to form a club and we want you in it.)

There is a general order here, from most important to least in broad strokes. Remember, nothing here is intrinsically bad or should never be done; but in those situations there should be well-considered decision points.

  • Global namespace and Library/Dependency Privacy Violations
    • Avoid monkeypatching or reaching into private methods or objects.
    • The most appropriate place to make changes is upstream.
  • Safety, Security
    • Avoiding thread safety issues, like globally held objects and privacy violations, not leaking data between requests, or retaining big objects in memory. Profile, profile, profile.
    • Seeking object locality (or avoiding globalness) by storing objects on instances of controllers and jobs (or their attributes) and embracing the natural lifecycles provided by the framework. Frequently a developer desires not to call SomeObject.new at the usage-site, but to have a DSL-like callable method already ready in the appropriate scope (e.g. current_some_object; there’s a sketch at the end of this list). We love a good DSL, and they can be difficult to get right.
  • Code Loading, Autoloading, and Reloading
    • Code autoloading is one of the most important design-constraints in Rails that can vastly affect inner-loop development (the “hands-on-keyboard” part) and production availability because of impact to application boot speed.
    • Designing for code loading and autoloading is critical to design, file placement (app vs lib vs config), and dependency interactions
  • Internal to the Ruby VM constraints
    • Even though Ruby makes it easy to introspect the runtime (descendants or subclasses or ObjectSpace), these shouldn’t be used outside of application boot or exception handling (and sparingly even then); they may have performance implications or be overly nuanced and non-deterministic in their output. Using caller and introspecting the Ruby callstack is a particularly expensive operation.
    • While infrequent and not-obvious, some patterns can massively de-optimize the Ruby VM with either localized or global effects. The Ruby VM (or accelerators like YJIT) are unable to optimize certain code patterns, and some patterns may cause VM-internal caches to churn inefficiently or to retain objects and their references unnecessarily (this can get tricky so please partner with us!). You probably want examples:
      • OpenStruct (though there probably isn’t a reason to use it at all)
      • eval and class_/instance_eval
      • Modifying singleton classes (using extend on objects)
      • Anything that adds to the callstack (call-wrapping, complicated delegation)
      • (handwaves) Things that YJIT isn’t yet optimized for, or things that deoptimize object shapes; these are the result of new fast paths being introduced, which means there are now slow paths that didn’t previously exist.
      • Native extensions that don’t release the interpreter lock
      • Metaprogramming generally
      • None of these are intrinsically bad (except OpenStruct and poorly done native extensions), and framework- and platform-level code definitely makes use of them. Their costs are also constantly changing because of upstream Ruby work. They may be ok in isolation but become a problem when copied as a pattern or introduced as part of the platform for broad consumption. As John Hawthorn has said:

        A thought experiment I like to try is asking myself how I would implement this in another language without [Ruby magic]… Adding that constraint can help unblock thinking of simpler, more “normal” approaches without expensive metaprogramming.

  • External to the Ruby VM constraints and dependencies (memory, compute, file descriptors, database connections, etc.)
    • Database stuff alone is a lot. The design prompt everyone is largely working from is “how does one architect an efficient, stateless application that sits between an end-user client and stateful data sources and manages bidirectional transformations of data?” Sounds hard when you put it that way, right?
    • Thinking about resource lifecycles, pooling, and how they interact across the various concurrency models available to us (process forking, threads, etc.). We do expect the frameworks and platform libraries we choose to keep these out of mind for most development tasks 😅
  • Design of the thing, for use
    • Rails’s model of “convention over configuration” frequently means that how an object is structured and where it’s placed can have an outsized impact on how it behaves: e.g. within App, Lib, Rack Middleware, Other Library Middleware (Faraday, jobs system, etc.), Rails Configuration/Initialization/Railties, and more!
    • …and how those conventions relate to Maintainability, Developer Usability, and Conceptual Integrity.
    • Sometimes what may appear as simply an aesthetic decision can have a functional impact.
    • Identifying atypical or disordered usage patterns. Sometimes a desired behavior can be more of a happy accident than an enforced intention, and it might change upstream because no one expected it to be used that way.
  • Dependency Stewardship
    • In addition to Rails and Ruby, our monolith depends on hundreds of gems, double that counting their transitive dependencies, and several other runtimes and system libraries.
    • The nature of a monolith is that we go together. If some dependency isn’t compatible with the latest Rails or Ruby, or any other dependency upgrade, we must adapt. We work upstream, we patch locally, and worst case, we remove and replace the dependency with something more maintainable. All of this takes time and effort and resources.
    • We want to choose dependencies that are well-maintained: their maintainers proactively respond to upstream changes, are responsive to issues and PRs, and importantly in Ruby, are nice. (And to whom we are nice too!) That’s more important than benchmarks.
    • And dependencies should be well architected too, obvs.
  • Automating and Scaling: Packwerk, Sorbet, Rubocop
    • We do our best to encode our knowledge and shape the application through tooling; that’s how our team scales! We send our custom rules upstream, too.
    • But it’s complicated! Sometimes developers focus on designing their code in response to the automated tooling and end up with a less effective design, or even introduce global risks and impacts to the application. At worst, a developer might sidestep the linter’s intent by smuggling their design through a spelling or arrangement the linter doesn’t recognize 💀 Unfortunately, the most important things are often the most abstract, arguable, and difficult to detect or automatically warn about. We regret when we have to tell folks that an approach is untenable in a PR, or even after the fact when we notice production metrics have degraded.
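
To make the object-locality point from “Safety, Security” above concrete, here’s a minimal sketch of a current_some_object-style method. ReportBuilder is a hypothetical domain object standing in for SomeObject:

class ReportsController < ApplicationController
  private

  # Memoized on the controller instance, so the object's lifecycle is
  # naturally scoped to a single request: nothing held globally to leak
  # between requests or threads. (ReportBuilder is hypothetical.)
  def current_report_builder
    @current_report_builder ||= ReportBuilder.new(params)
  end
end

(For request-scoped state that really does need to feel global, ActiveSupport::CurrentAttributes is the framework-provided version of this idea; it resets itself around every request and job.)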

A conclusion about lists

I like making lists of things; I find them helpful. I also realize that not everyone experiences lists the same way I do. For me, the purpose of a good laundry list is to be a quick reminder (“don’t forget to wash the handkerchiefs”) and not an exhaustive list of actionable instructions (“the exact and best temperature to wash this t-shirt and that pair of jeans”). So please reach out to me (Mastodon / Twitter/X) if:

  • You think there is something that should be added to the list, or explained in more detail
  • You’re curious how something in the list might apply to a specific thing you have

I’d love to chat. Thanks for reading!

The secret to perfectly calculate Rails database connection pool size

Ruby on Rails maintains a pool of database connections for Active Record. When a database connection is needed for querying the database, usually one per thread (though that’s changing to per-transaction), a connection is checked out of the pool, used, and then returned to the pool. The size of the pool is configured in config/database.yml. The default, as of Rails 7.2, is pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>.
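
For illustration, here’s that checkout lifecycle made explicit (Rails does the equivalent implicitly around requests and jobs):

ActiveRecord::Base.connection_pool.with_connection do |connection|
  # The connection is checked out of the pool (created lazily if needed),
  # used, and automatically returned when the block finishes.
  connection.execute("SELECT 1")
end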

The database connection pool size is frequently misconfigured. A lot. How to calculate the database connection pool size is one of the most common questions I get on GoodJob (Hi! I’m the author of GoodJob 👋). I have spent an embarrassingly large amount of time trying to come up with a precise pool size calculator and give advice to take into account Puma threads, and GoodJob async jobs, and load_async queries and everything that might be asking for a database connection at the same time. It’s nearly impossible to get the number exactly right.

If the connection pool is misconfigured to be too small, it can slow down web requests and jobs while they wait for a connection to become available, or raise ActiveRecord::ConnectionTimeoutError if a connection isn’t available within a reasonable amount of time (5 seconds by default). That’s bad! We never want that to happen. Here’s what you should do:

✨ The secret to perfectly calculate Rails database connection pool size: Don’t! Set the pool size to a very large, constant number, and never worry about it again. E.g. pool: 100, and remove the reference to RAILS_MAX_THREADS entirely:

# config/database.yml
default: &default
  # ...
  pool: 100 # <-- that's it 👍
  # ...

WAIT, WHAT?! Why? I just described the bad things that happen when the pool size is too small. Here’s the trick: it’s impossible to set the connection pool size too big. You can’t do it! That’s why it’s always better to set a number that’s too large, and the best number is one that can never be too small regardless of how you configure (and inevitably reconfigure) your application. Here’s why:

  • The pool: configuration value, despite its name, is the max size of the database connection pool.
  • Database connections are lazily created and added to the pool as they’re needed. Your Rails application will never create more database connections than it needs. And the database connection pool reaper removes idle and unused connections from the pool. The pool will never be larger than it needs to be.
  • It’s possible you may run out of available database connections at the database. For example, Heroku’s new Essentials-0 Postgres database only has 20 database connections available globally. But any problems you run into won’t be because the database connection pool is too big; they’ll be because your application is using too many concurrent database connections.
  • If you find yourself in a situation where your application is using too many concurrent database connections, you should be configuring and re-sizing the things using database connections concurrently, not the database connection pool itself:
    • Configure the number of Puma threads
    • Configure the number of GoodJob async threads (Solid Queue now has similar functionality too!)
    • Configure the load_async thread pool
    • Configure anything else using a background thread making database queries
    • Configure the number of parallel processes/Puma workers/dynos/containers you’re using, which the database connection pool size does not affect anyway.
  • If you still don’t have enough database connections at the database, then you should increase the number of database connections at the database. Which means scaling your database, or using a connection multiplexer like PgBouncer. Judoscale has a nice calculator to estimate the number of connections you’ll need at the database (which again, is not the pool size).
  • If, in an incredibly rare case, your application’s concurrency is very, very spiky and you worry that idle database connections sit in the connection pool too long before the reaper automatically removes them, then configure that (example after this list):
    • idle_timeout: number of seconds that a connection will be kept unused in the pool before it is automatically disconnected (default: 5 minutes). Set this to zero to keep connections forever.
    • reaping_frequency: number of seconds between invocations of the database connection pool reaper to disconnect and remove unused connections from the pool (default: 1 minute)
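
For example, pairing the large pool with a more aggressive reaper might look like this (these particular numbers are illustrative, not a recommendation):

# config/database.yml
default: &default
  # ...
  pool: 100
  idle_timeout: 30       # seconds a connection may sit unused before being reaped
  reaping_frequency: 15  # seconds between reaper runs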

I know this is wild advice, but it’s based on facts and experience. Even the Rails maintainers intend to remove this configuration option entirely:

…we want the pool not to have a limit by default anymore.

So please, stop sweating the precise, exact, perfect database connection pool value. Set it to something really big, that can never be too small, and never worry about it again.
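
And if you’re ever curious what the pool is actually doing at runtime, you can ask it (the numbers below are example output):

# In a console or an instrumentation hook:
ActiveRecord::Base.connection_pool.stat
# => { size: 100, connections: 3, busy: 2, dead: 0, idle: 1, waiting: 0, checkout_timeout: 5 }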

The Novice Problem

Brandon Weaver’s “Beyond Senior - Metric Obsessions” has been stuck in my mind ever since we caught up at a SF Ruby Meetup and chatted about rules-adherence as a general problem:

…by definition a vast majority of your engineers are likely to be concentrated more towards the novice end of the spectrum, and will frequently over rate themselves on this scale.

If folks in the novice to advanced beginner stages are known for a rigid adherence to rules and almost legalistic approach to them what do you think might happen if you give them a giant list of metrics [, coding rules, linter warnings, dependency violations, or type-checking errors]?

Will they exercise discretion and nuance? Will they have the ability to prioritize based on that information? Will they make appropriate tradeoffs? [No.]

This is coming from the Dreyfus model of skill acquisition, which is like Shuhari but with more levels:

  1. Novice
    • “rigid adherence to taught rules or plans”
    • no exercise of “discretionary judgment”
  2. Advanced beginner
    • limited “situational perception”
    • all aspects of work treated separately with equal importance
  3. Competent
    • “coping with crowdedness” (multiple activities, accumulation of information)
    • some perception of actions in relation to goals
    • deliberate planning
    • formulates routines
  4. Proficient
    • holistic view of situation
    • prioritizes importance of aspects
    • “perceives deviations from the normal pattern”
    • employs maxims for guidance, with meanings that adapt to the situation at hand
  5. Expert
    • transcends reliance on rules, guidelines, and maxims
    • “intuitive grasp of situations based on deep, tacit understanding”
    • has “vision of what is possible”
    • uses “analytical approaches” in new situations or in case of problems

On the importance of Rails code reloading and autoloading

I’ve elevated this to a “strongly held belief”: code reloading and autoloading are the single most important design constraint when designing or architecting for Ruby on Rails.

  • Code reloading is what powers the “make a code change, refresh the browser, see the result” development loop.
  • Code autoloading is what allows Rails to boot in milliseconds (if you’ve designed for it!) to run generators and application scripts and a single targeted test for tight test-driven-development loops.

When autoloading and reloading just works, it probably isn’t something you think about. When it doesn’t work, or works poorly, as it has in numerous apps across my career and consulting work, it can be maddening:

  • Spending hours “debugging” some code only to realize that your changes were never being run at all.
  • Waiting tens of excruciatingly boring seconds to run a simple test or watching the browser churn away while it slowly waits for a response from the development server.
  • Feeling like you could hand-write the code faster than running a scaffold/template generator, over and over again.

Code reloading and autoloading not working correctly is a huge pain. It’s not great, at all!

The history of code reloading and autoloading came up recently in the Rails Performance Slack. A developer working on an old Rails application asked what Spork was (a forking preloader), and whether it was necessary (not necessarily). As a Rails developer who is increasingly aware of my age and experience (I started working with Rails in 2012, long after it first launched in 2004, but it’s still been a minute), I realized I had something to share.

Throughout that history, various strategies have been used to make the development loop faster, because that speed is so important. Those strategies usually boil down to:

  • Separating the (static) framework code from the (changing, developed) application code and only loading, just in time, what’s needed for the part of the application that’s currently running.
  • Loading/booting the framework code that is unlikely to change, and then only (re-)loading the application code when invoking a command or running a test.

There have been various approaches to doing this:

  • Forking Preloaders (Spork, though Spring is the more contemporary version): load the framework code into a process once, then fork a subprocess when you invoke a command and reload just the application code. Sometimes things get out of sync (some application code or state pollutes the primary process), and things get weird/confusing. This is why you’ll hear people hating on Spring or complaining, “I wasted all day on a development bug, and it turns out I just needed to restart Spring” (the Rails world’s version of “it was DNS all along”).
  • Bootsnap, though it uses a caching strategy rather than forking processes, serves a similar purpose of speeding up an application’s code loading. The adoption of Bootsnap, and much, much faster CPUs in general, has largely displaced Spring (which is still okay!).
  • The Zeitwerk autoloader also plays a role in this history because it, too, tries to “solve” the necessity of separating the framework code (which changes infrequently) from the application code (which is actively being changed) during development, to produce faster feedback cycles. Zeitwerk replaced the previous autoloader built into Rails, whose lineage seems to date all the way back to circa 2005. Tell me the history / raison d’être of the original autoloader if you know it!

Look, a lot of labor has gone into this stuff. It’s important! And it’s easy to get wrong and produce a slow and disordered application where development is a pain. It happens! A lot!

I wish I could easily leave this post with some kind of nugget of something actionable to do, but it’s really more like: please take care. Some rules of thumb:

  • Don’t reference, access, use, or touch any constants in app/, or allow them to be referenced (looking at you, custom Rack middleware), unless you’re doing so from another constant in app/ (or somewhere you know is autoloaded).
  • Take care with config/initializers/ and ensure you’re making the most of ActiveSupport.on_load hooks (there’s a sketch after this list). Rails may even be missing some load hooks, so make an upstream PR if you need to configure an autoloaded object and you can’t. It’s super common to run into trouble; in writing this blog post alone, I discovered a problem with a gem I use.
  • If you’re writing library code, become familiar with the configuration-class-initializer-attribute-pattern dance (my name for it), which is how you’ll get something like config.action_view.something = :the_thing lifted and constantized into ActionView::Base.something #=> TheThing
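
Here’s a sketch of that initializer rule in practice, using the field_error_proc initializer that my autoload-check output below happens to flag. Instead of referencing ActionView::Base at the top level of the initializer (which forces an autoload during boot), defer with a load hook:

# config/initializers/field_error.rb
ActiveSupport.on_load(:action_view) do
  # `self` is ActionView::Base here, and Action View has actually
  # loaded, so it's now safe to configure.
  self.field_error_proc = ->(html_tag, _instance) { html_tag }
end

And a hedged sketch of the configuration-dance from the library author’s side, with a hypothetical MyLibrary gem and something attribute:

# lib/my_library/railtie.rb (hypothetical gem)
module MyLibrary
  class Railtie < Rails::Railtie
    # Gives applications a place to write: config.my_library.something = :the_thing
    config.my_library = ActiveSupport::OrderedOptions.new

    initializer "my_library.configure" do |app|
      ActiveSupport.on_load(:action_view) do
        # `self` is ActionView::Base; lift the configured symbol into a
        # constant only now that Action View has loaded.
        mattr_accessor :something # hypothetical attribute
        if (value = app.config.my_library.something)
          self.something = value.to_s.camelize.constantize # :the_thing => TheThing
        end
      end
    end
  end
end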

You might find luck with this bin/autoload-check script, which I adapted from something John Hawthorn originally wrote. It gives output like:

❌ Autoloaded constants were referenced during boot.
These files/constants were autoloaded during the boot process,
which will result in inconsistent behavior and will slow down and
may break development mode. Remove references to these constants
from code loaded at boot.

🚨 ActionView::Base (action_view) referenced by config/initializers/field_error.rb:3:in `<main>'
🚨 ActiveJob::Base (active_job)   referenced by config/initializers/good_job.rb:7:in `block in <main>'
🚨 ActiveRecord::Base (active_record)
                                         /Users/bensheldon/.rbenv/versions/3.3.3/lib/ruby/gems/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/base.rb:338:in `<module:ActiveRecord>'
                                         /Users/bensheldon/.rbenv/versions/3.3.3/lib/ruby/gems/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/base.rb:15:in `<main>'
                                         .....
