Wide Models and Active Record custom validation contexts

This post is a brief description of a pattern I use a lot when building features in Ruby on Rails apps and that I think needed a name:

Wide Models have many attributes (columns in the database) that are updated in multiple places in the application, but not always all at once i.e. different forms will update different subsets of attributes on the same model.

How is that not a “fat model”?

As you add more intrinsic complexity (read: features!) to your application, the goal is to spread it across a coordinated set of small, encapsulated objects (and, at a higher level, modules) just as you might spread cake batter across the bottom of a pan. Fat models are like the big clumps you get when you first pour the batter in. Refactor to break them down and spread out the logic evenly. Repeat this process and you’ll end up with a set of simple objects with well defined interfaces working together in a veritable symphony.

I dunno. I’ve seen teams take Wide Models pretty far (80+ attributes in a model) while still maintaining cohesion and developer productivity. And I’ve seen the opposite where there is a profusion of tiny service objects and any functional change must be threaded not just through a model, view and controller but also a form object and a decorator and several command objects, or where there is a large number of narrow models that all have to be joined or included nearly all of the time in the app—and it sucks to work with. I mean, find the right size for you and your team, but the main thrust here is that bigger doesn’t inherently mean worse.

This all came to mind while reading Paweł Świątkowski’s “On validations and the nature of commands”:

Recently I took part in a discussion about where to put validations. What jarred me was how some people inadvertently try to get all the validation in one fell swoop, even though the things they validate are clearly not one family of problems.

The post goes on to suggest differentiating between:

  • “input validation”, which I take to mean user-facing validation that is only necessary when the user is editing some fields concretely on a form in the app. Example: that an account’s email address is appropriately constructed.
  • “domain checks”, which I take to mean more fundamental invariants/constraints of the system. Example: that an account is uniquely identified by its email address.

I didn’t entirely agree with this advice though:

In Rails world you could use dry-validation for input validations and ActiveRecord validation for domain checks. Another approach would be to heavily use form objects (input validation) and limit model validations to actual business invariants.

I disagree because Active Record validations already have a built-in feature for selectively applying validations: Validation Contexts (the on: keyword), and specifically custom validation contexts:

You can define your own custom validation contexts for callbacks, which is useful when you want to perform validations based on specific scenarios or group certain callbacks together and run them in a specific context. A common scenario for custom contexts is when you have a multi-step form and want to perform validations per step.

I use custom validation contexts a lot. I don’t intend for this to be a tutorial on custom validation contexts, but just to give a quick example:

  • Imagine you have an Account model
  • A person can register for an account with just an email address so they can sign in with a magic link.
  • An account holder can later add a password to their account if they want to optionally sign in with a password
  • An account holder can later add a username to their account which will be displayed next to their posts and comments.

You might set up the Account model validations like this:

class Account < ApplicationRecord
  validates :email, uniqueness: true, presence: true
  # also set up uniqueness/not-null constraints in the database too
  validates :email, email_structure: true, on: [:signup_form, :update_email_form]

  validates :password, password_complexity: true, allow_blank: true
  validates :password, presence: true, password_complexity: true, on: [:add_password_form, :edit_password_form]

  validates :username, uniqueness: true, allow_blank: true
  validates :username, presence: true, on: [:add_username_form, :edit_username_form]
end
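
To trigger those contexts, pass the context name when validating or saving. A minimal sketch (the controller-ish code and params are hypothetical, but valid?(:context) and save(context:) are standard Active Record API):

account = Account.new(email: params[:email])
account.valid?(:signup_form)         # runs the always-on validations plus the :signup_form ones
account.save(context: :signup_form)  # same validations, then persists if they pass

account.password = params[:password]
account.save(context: :add_password_form)

account.save # no custom context: only the always-on validations (plus the default :create/:update context) run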

Note: it’s possible to use custom validation contexts with before_validation and after_validation callbacks, but not with others; before_save, after_commit, etc. only take the built-in contexts like on: :create.

So to wrap it up: sure, maybe it can all go in the Active Record model.

Recently, March 26, 2025

  • I am on a new work adventure. I gave my notice at GitHub and will be doing this full-time starting in April. The new job should be a nice combination of a cozy “this again” and some thrilling new.
  • I finished reading Careless People; recommend as a good sequence of business trainwrecks that will leave you wondering if this one is the penultimate trainwreck (spoiler: it’s not). Now I’m reading Wicked; I didn’t really like the beginning but it’s gotten more interesting.
  • I finished Severance. Hopefully without spoilers, the consistent plot driver seems to be “Mark (yes) sucks”. So now it’s just White Lotus, with palate cleansers of Say Yes to the Dress.
  • I have been desultorily playing Bracket City; the scoring system generates no motivation for me but it’s fun to have found another use for the decades spent training my brain to parse deeply nested hierarchical syntax. I was also told that LinkedIn has games, and other than being “Faster than 95% of CEOs” at Queens, I have already lost my streak.
  • I asked on Rails Performance Slack how to better delegate Rails model association accessors and got some good ideas.
  • My RailsConf session proposal was accepted! See you there 🙌

Recently, March 16, 2025

  • We have promoted another cat to fostering: Merlin, the cat formerly known as Gray Cat.
  • I finished the latest Bruno, Chief of Police book. I read it for the food and culture, but it has some bad descriptions of hacking in this one. I started The Midnight Library, which is as close as you can imagine to a TED talk while actually being a novel. Next is Careless People, which I’m looking forward to; hopefully as exhilarating/vicariously-traumatic as Exit Interview.
  • At work the latest is that all planning must snap to 1-month objectives. “If you don’t produce a plan, someone will produce one for you” is the advice. Super proud of the work: doing Pitchfork, kicking the tires on ruby/json, adding more knobs to Active Record. My Incident Commander shift was this week too; 2pm - 8pm really destroys the possibility of pre-dinner errands. I did go to a Mustache Harbor show while on secondary Friday night and nothing bad happened (though I was still 15 minutes from home should something have happened).
  • I bought a RailsConf supporter ticket. I submitted a panel discussion talk, so even if that’s accepted, I think I’ll still need a ticket. It’s for a good cause.
  • Some new Playdate games. Echo: The Oracle’s Scroll is the only one I’ve beaten so far, despite a fiddly jumping puzzle and a surprise ending, by which I mean I was surprised when I went to chat with an owl-person and then the credits rolled.
  • I am recovering from pink eye, again. Reinfecting yourself is a thing that can happen. The last six months or so have not been my favorite, minor ailments-wise.
  • Folks on Rails Performance Slack asked about Cursor rules, which was an opportunity for me to consolidate mine from several projects. I dunno, it’s ok.
  • After about a month, I think I’m an iPad Mini person. The screen is very not good, but I guess “the best screen is the screen you have” when said screen is bigger than a phone, but smaller than 11 inches.
  • This is the most SF press release and I can’t wait.

Addressing it directly

Lost to time in my Code for America email’s sent folder was a list of reasons why deferring to software engineers can be problematic. It included this theme, from Will Larson’s “Building personal and organizational prestige”:

In my experience, engineers confronted with a new problem often leap to creating a system to solve that problem rather than addressing it directly. I’ve found this particularly true when engineers approach a problem domain they don’t yet understand well, including building prestige.

For example, when an organization decides to invest into its engineering brand, the initial plan will often focus on project execution. It’ll include a goal for publishing frequency, ensuring content is representationally accurate across different engineering sub-domains, and how to incentivize participants to contribute. If you follow the project plan carefully, you will technically have built an engineering brand, but my experience is that it’ll be both more work and less effective than a less systematic approach.

Sometimes you just do stuff.

Flattening the curve for the safety net, five years later

It’s been 5 years since the start of the COVID-19 pandemic. From my notebook, I found a brief presentation I gave at Code for America in April, 2020 about that first month of the pandemic and the positive impact that GetCalFresh had during the initial lockdown and economic turmoil. There’s a contemporary postscript at the end too.

The idea of flattening the curve is to create time and space to build up the system capacity and avoid a catastrophic failure leading to greater social disruption and deaths.

Within the social safety net, like the healthcare system, there is a limited systemic capacity to help people. Within the social safety net, catastrophic failure is not only that people aren’t able to apply for or receive benefits because the systems to receive and process their applications are overloaded, but also that they lose trust in society and government entirely as a result.

Demand for CalFresh / SNAP / Food Stamps has massively increased over the past month. Our digital assister, GetCalFresh.org, has seen 6x the number of applicants, with a peak of over 9,000 applications per day.

The government and their contractors are beefing up the capacity of their own systems to deal with the increased volume but it’s taken them several weeks to marshal those resources.

During this time period of massive demand, these government-managed systems have suffered, leading to client-facing error messages, timeouts and service degradations.

GetCalFresh, independently operated by Code for America and funded by CDSS (California Department of Social Services) and private philanthropy, has been online, stable and accepting applications this entire time, giving CalFresh applicants a path for submitting their applications regardless of the stability or availability of the underlying government systems. GetCalFresh is able to accept and hold those applications until they can be successfully processed through the government systems, once their outage is fixed or during non-peak usage times like overnight.

GetCalFresh is a fantastic resource for Californians. And we’re seeing heavy promotion of GetCalFresh, likely because of the quality and stability of our system.

GetCalFresh is now assisting two-thirds of all statewide CalFresh applications.

And we’re maybe starting to see the government systems stabilize. Over the past 3 days we’ve observed a decrease in error rates and an increase in stability when interfacing with these government systems, which should also be comparable to how applicants would experience these government websites too. This implies that the government is successfully growing their capacity to address the increased volume of applicants.

GetCalFresh has been a critical resource in ensuring that people-in-need can get safety-net resources during this unprecedented pandemic and maintain trust between themselves, society, and government. 👍


Postscript (2025)

Here we are, 5 years later. From what I remember of putting this presentation together, it came out of a desperation to find a story, a meaning, in the grief and fear and exhaustion of that first month. It creates a narrative arc: that things were fucked, and through the specificity of our efforts, they became unfucked. I believe that discovering the tidy stories in what we have done is inarguably a necessary comfort. And such stories are, inarguably too, inadequate at giving certainty to what we must do next.

I’m immensely proud of what we accomplished during this time. It strengthens my conviction of what small, durable, cross-functional teams, supported by stable, well-funded organizations with long-term goals, can accomplish together. And every act and decision I see leading up to that, during the good times: every boring technology decision, every generalist full-stack hire, every retrospective and team norms and career ladder conversation… it was worth it, because we performed how we had previously practiced together: exemplary.

And what the fuck! I have to reflect on this in the contemporary context of DOGE and the gutting of 18F and USDS and everyone else and any sense of stability or generative capacity in our federal government and the trickle down it will have everywhere. My original presentation is rather bland in calling them “Government Systems” but in reality these are systems that have already been outsourced, for decades, to private enterprise. They fell over, badly. And us, some stupid nonprofit geeks playing house in silicon valley, we happened to be there to hold things together for 60 million Californians until the safety-net could be stood back up again. Whatever the fuck DOGE is doing is bad. To face the dangers of an uncertain world, we need more capacity in-house in government, not less. I am angry, still.

There’s so much more that must be done.

Ruby “Thread Contention” is simply GVL Queuing

There’s been a ton of fantastic posts from Jean Boussier recently explaining application shapes, instrumenting the GVL (Global VM Lock), and thoughts on removing the GVL. They’re great reads!

For the longest time, I’ve misunderstood the phrase “thread contention”. It’s a little embarrassing, given that I’m the author of GoodJob (👍), a maintainer of Concurrent Ruby, and have been doing Ruby and Rails stuff for more than a decade. But true.

I’ve been reading about thread contention for quite a while.

Through all of this, I perceived thread contention as contention: a struggle, a bunch of threads all elbowing each other to run and stomping all over each other in an inefficient, disagreeable, disorganized dogpile. But that’s not what happens at all!

Instead: when you have any number of threads in Ruby, each thread waits in an orderly queue to be handed the Ruby GVL, then they gently hold the GVL until they graciously give it up or it’s politely taken from them, and then the thread goes to the back of the queue, where they patiently wait again.

That’s what “thread contention” is in Ruby: in-order queuing for the GVL. It’s not that wild.

Let’s go deeper

I came to this realization when researching whether I should reduce GoodJob’s thread priority (I did). This came up after some exploration at GitHub, my day job, where we have a maintenance background thread that would occasionally blow out our performance target for a particular web request if the background thread happened to run at the same time that the web server (Unicorn) was responding to the web request.

Ruby threads are OS (operating system) threads. And OS threads are preemptive, meaning the OS is responsible for switching CPU execution among active threads. But, Ruby controls its GVL. Ruby itself takes a strong role in determining which threads are active for the OS by choosing which Ruby thread to hand the GVL to and when to take it back.

(Aside: Ruby 3.3 introduced M:N threads which decouples how Ruby threads map to OS threads, but ignore that wrinkle here.)

There’s a very good C-level explanation of what happens inside the Ruby VM in The Ruby Hacking Guide. But I’ll do my best to explain briefly here:

When you create a Ruby thread (Thread.new), that thread goes into the back of a queue in the Ruby VM. The thread waits until the threads ahead of it in the queue have their chance to use the GVL.

When the thread gets to the front of the queue and gets the GVL, the thread will start running its Ruby code until it gives up the GVL. That can happen for one of two reasons:

  • When the thread goes from executing Ruby to doing IO, it releases the GVL (usually; it’s mostly considered a bug in the IO library if it doesn’t). When the thread is done with its IO operation, the Thread goes to the back of the queue.
  • When the thread has been executing for longer than the length of the thread “quantum”, the Ruby VM takes back the GVL and the thread steps to the back of the queue again. The Ruby thread quantum default is 100ms (this is configurable via Thread#priority or directly as of Ruby 3.4).

That second scenario is rather interesting. When a Ruby thread starts running, the Ruby VM uses yet another background thread (at the VM level) that sleeps for 10ms (the “tick”) and then checks how long the Ruby thread has been running for. If the thread has been running for longer than the length of the quantum, the Ruby VM takes back the GVL from the active thread (“preemption”) and gives the GVL to the next thread waiting in the GVL queue. The thread that was previously executing now goes to the back of the queue. In other words: the thread quantum determines how quickly threads shuffle through the queue, and never faster than the tick.

That’s it! That’s what happens with Ruby thread contention. It’s all very orderly, it just might take longer than expected or desired.

What’s the problem

The dreaded “Tail Latency” of multithreaded behavior can happen, related to the Ruby Thread Quantum, when you have what might otherwise be a very short request, for example:

  • A request that could be 10ms because it’s making ten 1ms calls to Memcached/Redis to fetch some cached values and then returns them (IO-bound Thread)

…but when it’s running in a thread next to:

  • A request that takes 1,000ms and largely spends its time doing string manipulation, for example a background thread that is taking a bunch of complex hashes and arrays and serializing them into a payload to send to a metrics server. Or rendering slow/big/complex views for Turbo Broadcasts (CPU-bound Thread)

In this scenario, the CPU-bound thread will be very greedy with holding the GVL and it will look like this:

  1. IO-bound Thread: Starts 1ms network request and releases GVL
  2. CPU-bound Thread: Does 100ms of work on the CPU before the GVL is taken back
  3. IO-bound Thread: Gets GVL again and starts next 1ms network request and releases GVL
  4. CPU-bound Thread: Does 100ms of work on the CPU before the GVL is taken back
  5. Repeat … 8 more times…
  6. Now 1,000 ms later, the IO-bound Thread, which ideally would have taken 10ms, is finally done. That’s not good!

That’s the worst case in this simple scenario with only two threads. With more threads of different workloads, you have the potential for even more of a problem. Ivo Anjo also wrote about this. You could speed this up by lowering the overall thread quantum, or by reducing the priority of the CPU-bound thread (which lowers the thread quantum). This would cause the CPU-bound thread to be more finely sliced, but because the minimum slice is governed by the tick (10ms) you’d never get below a theoretical floor of 100ms for the IO-bound thread; 10x more than optimal.
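
To make the queuing visible, here is a minimal, self-contained sketch (not from any of the posts above): one thread churns the CPU with pure-Ruby string work while another makes ten fake 1ms “network calls” via sleep, and the IO-bound thread’s elapsed time balloons well past its ideal 10ms:

require "benchmark"

# CPU-bound: holds the GVL in quantum-sized slices while doing pure-Ruby work
cpu_bound = Thread.new do
  200_000.times { (1..100).map(&:to_s).join(",") }
end

# IO-bound: each sleep releases the GVL, then the thread rejoins the back of the queue
io_bound = Thread.new do
  10.times { sleep 0.001 }
end

puts Benchmark.realtime { io_bound.join } # ideally ~0.01s; much longer with the CPU-bound neighbor
cpu_bound.join

Lowering cpu_bound.priority (e.g. to -3) shrinks its quantum and improves the IO-bound thread’s latency, but the 10ms tick still sets the floor.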

Living Parklife with Rails, coming from Jekyll

I recently migrated this blog from Jekyll to Ben Pickles’s Parklife and Ruby on Rails, still hosted as a static website on GitHub Pages. I’m pretty happy with the experience.

I’m writing this not because I feel any sense of advocacy (do what you want!) but to write down the reasons for myself. Maybe they’ll rhyme for you.

Here’s this blog’s repo if you want to see: https://github.com/bensheldon/island94.org

Background

I’ve been blogging here for 20 years and this blog has been through it all: Drupal, Wordpress, Middleman, Jekyll, and now Parklife+Rails.

For the past decade the blog has largely been in markdown files, which I don’t intend to change. Over the past 2 years I also exported 15 years of pinboard/del.icio.us bookmarks and my Kindle book highlights into markdown-managed files too. I’ve also dialed in some GitHub Actions- and Apple Shortcuts-powered integrations. I’m really happy with Markdown files in a git repo, scripted with Ruby.

…but there’s more than just Ruby.

Mastery

I’m heavily invested in the Ruby on Rails ecosystem. I think it’s fair to say I have mastery in Rails: I’m comfortable building applications with it, navigating and extending the framework code, intuiting the conceptual vision of the core team, and being involved in the life of the community where I’ve earned some positive social capital to spend as needed.

I don’t have that in Jekyll. I am definitely handy in Jekyll. I’ve built plugins, I’ve done some wild stuff with liquid. But I’m not involved with Jekyll in the everyday sense like I am with Rails. I feel that when I go to make changes to my blog. There’s a little bit of friction mentally switching over to liquid, and Jekyll’s particular utilities that are similar to but not the same as Action View and Active Support. Jekyll is great; it’s me and the complexity of my website that’s changed.

(I do still maintain some other Jekyll websites; no complaints elsewhere.)

Parklife with Ruby on Rails

I hope I’m not diminishing Parklife by writing that it isn’t functionally much more than wget. It’s in Ruby and mounts+crawls a Rack-based web application, does a little bit to rewrite the base URLs, and spits out a directory of static HTML files. That’s it! It’s great!

It was pretty easy for me to make a lightweight Ruby on Rails app that loaded up all my markdown-formatted content and frontmatter and spat them out again through Controllers and ERB Views.

This blog is about 7k pages. For a complete website artifact:

  • Jekyll: takes about 20 seconds to build
  • Parklife with Rails: takes about 20 seconds to build

In addition to the productivity win for me of being able to work with the ERB and Action View helpers I’m familiar with, I also find my development loop with Parklife and Rails is faster than Jekyll: I don’t have to rebuild the entire application to see the result of a single code or template change. I use the Rails development server to develop, not involving Parklife at all. On my M1 MBP a cold boot of Rails takes less than a second, a code reload is less than 100ms, and most pages render in under 10ms.

With Jekyll, even with --incremental, most development changes required a 10+ second rebuild. Not my favorite.

The technically novel bits

  1. If you want to trigger a Rails code reload from any arbitrary set of files, like a directory of markdown files, you use ActiveSupport::FileUpdateChecker (which has a kind of complicated set of arguments):

    # config/application.rb
    # First argument: an array of specific files to watch (none here);
    # second argument: a hash of directories => watched file extensions.
    self.reloaders << ActiveSupport::FileUpdateChecker.new([], {
      "_posts" => ["md", "markdown"],
      "_bookmarks" => ["md", "markdown"],
    }) do
      # Routes are derived from the markdown files, so rebuild them on change
      Rails.application.reload_routes!
    end
    
  2. Each of my blog posts has a list of historical redirects stored in its frontmatter (a legacy of so many framework changes). I had to think about how to do a catch-all route to render a static meta-refresh template (a sketch of the controller side follows after the route):

    # config/routes.rb
    Rails.application.routes.draw do
      get "*path", to: "redirects#show", constraints: ->(req) { Redirect.all.key? req.path.sub(%r{\A/}, "").sub(%r{/\z}, "") }
      # ...all the other routes
    end
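
For completeness, the controller side of that catch-all is tiny. It’s roughly this shape (the names here are hypothetical, and Redirect.all is assumed to be a Hash of old-path => destination URL built from the frontmatter):

# app/controllers/redirects_controller.rb
class RedirectsController < ApplicationController
  def show
    # Same normalization as the route constraint: strip leading/trailing slashes
    @destination = Redirect.all.fetch(params[:path].delete_prefix("/").delete_suffix("/"))
  end
end

…and the view is just an immediate meta-refresh:

<%# app/views/redirects/show.html.erb %>
<meta http-equiv="refresh" content="0; url=<%= @destination %>">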
    

In conclusion

Here’s this blog’s Parkfile. I did a little bit of convenience monkeypatching of things I intend to contribute upstream to Parklife. I dunno, maybe you’ll like the Parklife too.

How I’m thinking about AI (LLMs)

With AI, in my context we’re talking about LLMs (Large Language Models), which I simplify down to “text generator”: they take text as input, and they output text.

I wrote this to share with some folks I’m collaborating with on building AI-augmented workflows. I’ve struggled to find something that is both condensed and whose opinionations match my own. So I wrote it myself.

The following explanation is intended to be accurate, but not particularly precise. For example, there is ChatGPT the product, there is an LLM at the bottom, and then in the middle there are other functions and capabilities; the same goes for Claude or AWS Nova or Llama. These things are more than *just* LLMs, but they are also not much more than an LLM. Some of these tools can also interpret images and documents and audio and video. To do so, they pass those documents through specialized functions like OCR (optical character recognition), voice-recognition, and image-recognition tools, and then those results are turned into more text input. And some of them can take “Actions” with “Agents”, which is still based on text output, just structured and fed into something else. It’s text text text.

(also, if something is particularly wrong, let me know please)

A little about LLMs

The language around LLMs and “AI” is fucked up with hype and inappropriate metaphors. But the general idea is that there are two key phases to keep track of:

  1. Training: Baking the model. At which point it’s done. I don’t know anyone who is actually building models; everyone is using something like OpenAI or Claude or Llama. And even while these things can be “fine tuned”, I don’t know anyone doing it; operating at the model level requires input data on the order of tens of thousands of inputs/examples.
  2. Prompting: Using the model, giving input and getting output. This is everything the vast majority of developers are doing.

That’s it. Those are the only two buckets you need to think about.

1. Training

The way AI models get made is to first collect trillions of pages of written text (an example is Common Crawl, which scrapes the Internet). Then use machine learning to identify probabilistic patterns that can be represented by only several billion variables (floating point numbers). This is called “Pre Training”. At this point, you can say: based on the input data, it’s probabilistically likely that the word after “eeny meeny miny” is “moe”.

Then there is the phase of “Fine Tuning” which makes sure that longer strings of text input are completed in ways that are intended (never right or wrong, just intended or expected). For example, if the text input is “Write me a Haiku about frogs” you expect a short haiku about frogs and not a treatise on the magic of the written word or amphibians. Fine tuning is largely accomplished by tens of thousands of workers in Africa and South Asia reading examples of inputs and outputs and clicking 👍 or 👎 on their screen. This is then fed back into machine learning models to say, of the billion variables, which variables should get a little more or less oomph when they’re calculating the output. Fine Tuning requires tens of thousands of these scored examples; again, this is probabilistic-scale stuff. This can also be called RLHF (Reinforcement Learning from Human Feedback), though that sometimes also refers to few-shot prompting, which is Prompt-phase (“Learning” is a nonsense word in the AI domain; it has zero salience without clarifying which phase you’re talking about). A lot of the interesting fine-tuning, imo, comes from getting these text generators to:

  • Read like a human chatting with you, rather than textual diarrhea
  • Getting structured output, like valid JSON, rather than textual diarrhea

Note: You can mentally slot in words like “parameters”, “dimensions”, “weights” and “layers” into all this. Also whenever someone says “we don’t really know how they work” what they really mean is “there’s a lot of variables and I didn’t specifically look at them all”. But that’s no different than being given an Excel spreadsheet with several VLOOKUPS and functions and saying “sure, that looks ok” and copy-pasting the report on to your boss; I mean, you could figure it all out, but it seems to work and you’re a busy person.

Ok, now we’re done with training. The model at this point is baked and no further modification takes place: no memory, no storage, no “learning” in the sense of a biological process. From this point further they operate as a function: input in, output out, no side effects.

Here’s how AWS Bedrock, which is how I imagine lots of companies are using AI in their product, describes all this:

After delivery of a model from a model provider [Anthropic, OpenAI, Meta] to AWS, Amazon Bedrock will perform a deep copy of a model provider’s inference and training software into those accounts for deployment. Because the model providers don’t have access to those accounts, they don’t have access to Amazon Bedrock logs or to customer prompts and completions.

See! It’s all just dead artifacts uploaded into S3, that are then loaded onto EC2 on-demand. A fancy lambda! Nothing more.

2. Prompting

Prompting is when we give the model input, and then it gives back some output. That’s it. Unless we are specifically collecting trillions of documents, or doing fine-tuning against thousands of examples (which we are NOT!), we are simply writing a prompt, and having the model generate some text based on it. It riffs. The output can be called “completions” because they’re just that: More words.

(Fun fact: how to get the LLM to stop writing words is a hard problem to solve)

Note: You might sometimes see prompting called model “testing” (as opposed to model building or training). That’s because you’re powering up the artifact to put some words through it. Testing-testing is called “Evaluations” (“Evals” for short) and, like all test-test regimes, the lukewarm debate I hear from everybody is “we aren’t, but should we?”

Writing prompts

This is the work! Unfortunately the language used to describe all of this is truly and totally fucked. By which I mean that words like “learning” and “thought” and “memory” and even “training” are reused over and over.

It’s all about simply writing a prompt that boops the resulting text generator output into the shape you want. It’s all snoot-booping, all the time.

Going back to the Training data, let’s make some conceptual distinctions:

  • Content: specific facts, statements and assertions that were (possibly) encoded into those billions of probabilities from the training data
  • Structure: the overall probability that a string of words (really fragments of words) comes out again looking like something we expect, which has been adjusted via Fine Tuning

Remember, this is just a probabilistic text generator. So there are probabilistic facts, and probabilistic structure. And that probabilistic part is why we have words like “hallucination” and “slop” and “safety”. There’s no there there. It’s just probabilities. There’s no guarantee that a particular fact has been captured in those billions of variables. It’s just a text generator. And it’s been trained on a lot of dumb shit people write. It’s just a text generator. Don’t trust it.

So on to some prompting strategies:

  • Zero-Shot Prompting: This just means to ask something open-ended and the AI returns something that probabilistically follows:

    Classify the sentiment of the following review as positive, neutral, or negative: “The quality is amazing, and it exceeded my expectations”

  • Few-Shot (or one-shot/multi-shot) Prompting: This just means to provide one or more examples of “expected” completions in the prompt (remember, this is all prompt, not fine-tuning) to try to narrow down what could probabilistically follow:

    Task: Classify the sentiment of the following reviews as positive, neutral, or negative.
    Examples:

    1. “I absolutely adore this product. It’s fantastic!” - positive
    2. “It’s okay, not the best I’ve used.” - neutral
    3. “This is terrible. I regret buying it.” - negative
      Now classify this review:
    4. “The quality is amazing, and it exceeded my expectations” - [it’s blank, for the model to finish]

Note: Zero/One/Few/Multi-Shot is sometimes called “Learning” instead of “Prompting”. This is a terrible name, because there is no learning (the models are dead!), but it is one of those things where the most assume-good-intent explanation is that, over the course of the prompt and its incrementally generated completion, the output assumes the desired shape.

  • Chain of Thought Prompting: The idea here is that the prompt includes a description of how a human might explain what they were doing to complete the prompt. And that boops the completion into filling out those steps, and arriving at a more expected answer:

    Classify the sentiment of the following review as positive, neutral, or negative.
    “I absolutely adore this product. It’s fantastic!”
    Analysis 1: How does it describe the product?
    Analysis 2: How does it describe the functionality of the product?
    Analysis 3: How does it describe their relationship with the product?
    Analysis 4: How does it describe how friends, family, or others relate to the product?
    Overall: Is it positive, neutral, or negative?

Note: again, there is no “thought” happening. The point of the prompt is to boop the text completion into giving a more expected answer. There are some new models (as of late 2024) that are supposed to do Chain-of-Thought implicitly; afaik there is just a hidden/unshown prompt that says “break this down into steps”, and an intermediate step that takes the output of that and feeds it into another hidden/unshown prompt, and then the output of that is shown to you. That’s why they cost more, because it’s invoking the LLM twice on your behalf.

  • Chain Prompting: This simply means that you take the output of one prompt, and then feed that into a new prompt. This can be useful to isolate a specific prompt and output. It might also be necessary because of length: LLMs can only operate on so many words, so if you need to summarize a long document, you’d need to first break it down into smaller chunks, use the LLM to summarize each chunk, and then combine the summaries into a new prompt for the LLM to summarize.
  • RAG (Retrieval Augmented Generation) Prompting: This means that you look up some info in a database, and then insert that into the prompt before handing it to the LLM. Everything is prompt, there is only prompt.

Note: “Embeddings” are a way of search-indexing your text. LLMs take all those trillions of documents and probabilistically boil them down to several billion variables. Embeddings boil a piece of text down further, to a couple thousand variables (floating point numbers). Creating an embedding means providing a piece of text, and you get back the values of those thousand floating point numbers that probabilistically describe that text (big brain idea: it is the document’s location in a thousand-dimensional space). That lets you compute across multiple documents “given this document within the n-dimensional space, what are its closest neighboring documents semantically/probabilistically?” Embeddings are useful when you want to do RAG Prompting to pull out relevant documents and insert their text into your prompt before it’s fed to the LLM to generate output (there’s a small sketch after this list).

  • Cues and Nudges: There are certain phrases, like “no yapping” or “take a deep breath”, that change the output. I don’t think there is anything delightful about this; it’s simply trying to boop up the variables you want in your output, and words are the only inputs you have. I’m sure there will someday be better ways to do it, but whatever works.
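
Here’s a small sketch of that RAG-with-embeddings flow in Ruby. Everything here is hypothetical: fetch_embedding and complete stand in for whatever embedding and completion APIs you’re calling; the only real logic is the cosine similarity used to find the nearest documents before pasting them into the prompt:

# Cosine similarity between two embedding vectors (arrays of floats)
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x**2 }) * Math.sqrt(b.sum { |x| x**2 }))
end

# Assume documents were embedded ahead of time: [{ text: "...", embedding: [...] }, ...]
def rag_answer(question, documents)
  question_embedding = fetch_embedding(question) # hypothetical embeddings API call

  nearest = documents
    .max_by(3) { |doc| cosine_similarity(question_embedding, doc[:embedding]) }
    .map { |doc| doc[:text] }

  # Everything is prompt: the retrieved text is simply pasted in ahead of the question
  complete(<<~PROMPT) # hypothetical completion/chat API call
    Answer the question using only the context below.

    Context:
    #{nearest.join("\n---\n")}

    Question: #{question}
  PROMPT
end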

A strong opinion about zero-shot prompting

Don’t do it! I think it’s totally fine if you just want to ask a question and try to intuit the extent that the model has been trained or tuned on the particular domain you’re curious about. But you should put ZERO stock in the answer as something factual.

If you need facts, you must provide the facts as part of your prompt. That means:

  • Providing a giant pile of text as content, or breaking it down (like via embeddings) and injecting smaller chunks via RAG
  • Providing any and all input you ever expect to get out of the output

It’s ok to summarize, extract, translate, or classify sentiment. The only reason it’s ok to zero-shot code is because it’s machine-verifiable (you run it). Otherwise, you must verify! Or don’t do it at all.

Including Rails View Helpers is a concern

If you’re currently maintaining a Ruby on Rails codebase, I want you to do a quick regex code search in your Editor:

include .*Helper

Did you get any hits? Do any of those constants point back to your app/helpers directory? That could be a problem.

Never include a module from app/helpers into anything in your application. Don’t do it.

  • Modules defined in app/helpers should exclusively be View Helpers. Every module in the app/helpers directory is automatically included into Views/Partials, and available within Controllers via the helpers proxy e.g. helpers.the_method in Controllers or ApplicationController.helpers.the_method anywhere else.
  • Including View Helpers into other files (Controllers, Models, etc.) creates a risk that some methods may not be safely callable because they depend on View Context that isn’t present. (They’re also hell to type with Sorbet.)
  • If you do have includable mixins (“bucket of methods”) that do make sense to be included into lots of different classes (Controllers, Models, Views, etc.), make them a concern and don’t put them in app/helpers.

Some general background

Rails has always had View Helpers. Prior to Rails 2 (~2009), only the ApplicationHelper was included into all controller views and other helpers had to be added manually. Rails 2 changed the defaults via helper :all and config.action_controller.include_all_helpers to always include all Helpers in all Views.

Rails 4.0 (2012) introduced Concerns, which formalized conventions around extracting shared behaviors into module mix-ins.

Rails 5.0 (2016) introduced the Action Controller helpers proxy, and clearly summarizes the problem that I’ve observed too:

It is a common pattern in the Rails community that when people want to use any kind of helper that is defined inside app/helpers they includes the helper module inside the controller like:

module UserHelper
  def my_user_helper
    # ...
  end
end

class UsersController < ApplicationController
  include UserHelper

  def index
    render inline: my_user_helper
  end
end

This has problem because the helper can’t access anything that is defined in the view level context class.

Also all public methods of the helper become available in the controller what can lead to undesirable methods being routed and behaving as actions.

Also if you helper depends on other helpers or even Action View helpers you need to include each one of these dependencies in your controller otherwise your helper is not going to work.

Some specific background

This has come up as a problem at my day job, GitHub. GitHub has the unique experience of being one of the oldest and largest Ruby on Rails monoliths and it’s full of opportunities to identify friction, waste, and toil.

Disordered usage of View Helpers and the app/views directory became very visible as we’ve been typing our monolith with Sorbet. Typing module mixins in Sorbet is itself inherently difficult, but View Helpers had accumulated a significant amount of T.unsafe escape-hatches and in understanding why… we discovered that explicitly including View Helpers in lots of different types of classes was a cause.

What’s the alternative?

I analyzed the different types of modules that were being created, and came up with this list:

  • Concerns are shared behaviors that may be optionally included into multiple other classes/objects when that behavior is desired. We can further break down:
    • Application-level Concerns are agnostic about the kind of object they are included into (could be a controller, or model, or a job, or a PORO)
    • Component-level Concerns are intended to only be mixed into a specific kind of object, like a controller with expectations that controller-methods are available to be used in that concern (like an http request object, or other view helpers like path helpers)
  • Dependencies are non-shared behaviors that have been extracted into a module from a specific, singular controller to improve behavioral cohesion, and is then included back into that one, specific class or object.
  • View Helpers are intended to be shared across Views (or Controllers via the helpers view-proxy method in Controllers, or ApplicationController.helpers anywhere else) for formatting and presentation purposes, and have access to other view helpers and http request objects. These are the only modules that should go in app/helpers.

And this is what you might do about them:

  • Stop and remove include MyHelper from Controllers. Instead, you can access any View Helper method in a controller via helpers.the_name_of_the_method (see the sketch after this list)
  • Move Concerns and Dependencies out of app/helpers. If what is currently in app/helpers is not a View Helper, move it:
    • Application-level Concerns should be moved into app/concerns
    • Component-level Concerns should be moved into their appropriate app/controllers/concerns or your_package/models/concerns, etc.
    • Dependencies should be moved to an appropriate place in their namespace hierarchy. e.g. if the module is only included into ApplicationController, it should be named ApplicationController::TheBehavior and live in app/controllers/application_controller/the_behavior.rb
  • Never include a module from app/helpers anywhere. Don’t do it.
  • Use the Controller helpers proxy or ApplicationController.helpers.the_helper_method to access helpers (like ActionView::Helpers::DateHelper) in Controller or other Object contexts.
  • Invert the relationship between Helpers and Concerns. If you have behavior that you want available to lots of different kinds of components and views, start by creating a Concern, and then include that Concern into a View Helper or ApplicationHelper. Don’t go the other direction.
  • Invert the relationship between Views and Controllers. If you have a private method that is specific to a single controller, and you want to expose that method to the controller’s views, you can do so directly using helper_method :the_method_name. Use this sparingly, because it extends singleton View objects which deoptimizes the Ruby VM; but really, don’t twist yourself into knots to avoid it either, that’s what it’s there for.
  • (optional but recommended) Rename the constant too, not just move it, when it’s not a View Helper. Naming things is hard, but *Helper is… not very descriptive. While it’s the placement in app/helpers that brings the automatic behavior (so it’s not technically a problem to have a SomethingHelper that isn’t a View Helper living in app/controllers/concerns), it is confusing to have non-helpers named SomethingHelper. Some suggestions for renaming Concerns and Dependencies:
    • Use the “-able” suffix to turn the behavior or capability into an adjective. e.g. SoftDeletable
    • Append Dependency to the end, like AbilityDependency
    • If you’re out of ideas, use Methods or Mixin, like UserMethods or UserMixin.

Keep your secrets.yml in Rails 7.2+

Ruby on Rails v7.1 deprecated and v7.2 removed support for Rails.application.secrets and config/secrets.yml in favor of Encrypted Credentials. You don’t have to go along with that! I like the Secrets functionality because it allows for consolidating and normalizing ENV values in a single configuration file with ERB (which Encrypted Credentials doesn’t).

It’s extremely simple to reimplement the same behavior using config_for and the knowledge that methods defined in application.rb show up as methods on Rails.application:

# config/application.rb

module ExampleApp
  class Application < Rails::Application
    # ....
    config.secrets = config_for(:secrets) # loads from config/secrets.yml
    config.secret_key_base = config.secrets[:secret_key_base]

    def secrets
      config.secrets
    end
  end
end

That is all you need to continue using a secrets.yml file that looks like this:

# config/secrets.yml

defaults: &defaults
  default_host: <%= ENV.fetch('DEFAULT_HOST', 'localhost:3000') %>
  twilio_api_key: <%= ENV.fetch('TWILIO_API_KEY', 'fake') %>
  mailgun_secret: <%= ENV.fetch('MAILGUN_SECRET', 'fake') %>

development:
  <<: *defaults
  secret_key_base: 79c6d24d26e856bc2549766552ff7b542f54897b932717391bf705e35cf028c851d5bdf96f381dc41472839fcdc8a1221ff04eb4c8c5fbef62a6d22747f079d7

test:
  <<: *defaults
  secret_key_base: 0b3abfc0c362bab4dd6d0a28fcfea3f52f076f8d421106ec6a7ebe831ab9e4dc010a61d49e41a45f8f49e9fc85dd8e5bf3a53ce7a3925afa78e05b078b31c2a5

# Do not keep production secrets in the repository,
# instead read values from the environment.
production:
  <<: *defaults
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
  default_host: <%= ENV['DEFAULT_HOST'] || (ENV['HEROKU_APP_NAME'] ? "#{ENV['HEROKU_APP_NAME']}.herokuapp.com": nil) %>
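
With the config above in place, the rest of the app keeps reading values exactly as it did before. A quick usage sketch (config_for returns an ActiveSupport::OrderedOptions, so both hash-style and method access work):

Rails.application.secrets[:twilio_api_key] # => ENV["TWILIO_API_KEY"], or "fake" when it isn't set
Rails.application.secrets.default_host     # => "localhost:3000" in development unless DEFAULT_HOST is set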

Note: This only works for secrets.yml, not secrets.enc.yml, which was called “Encrypted Secrets.” If you’re using “Encrypted Secrets” then you should definitely move over to the Encrypted Credentials feature.


Older posts