Improving jj-gpc with JSON Schema and schemars

I don’t use this tool that often, but it’s good for it to work correctly when I do!

Assumed audience: People interested in how building tools around generative AI systems actually works. Also assumes and does not explain a baseline of basic Rust knowledge, but you can probably understand the gist without that.

I just published my most recent updates to jj-gpc, a little tool I built back in November to generate more useful branch names than the form generated by jj itself.

Between when I shipped the tool and now, Ollama added support for using JSON Schema to control the output of the models you invoke, and ollama-rs, the Rust library I use that wraps Ollama, likewise added support for it by way of the lovely schemars crate.

The basic idea is: if you provide a grammar or set of constraints for the output of a model, you can get much better results when trying to get structured data out, because it’s possible to validate at generation time whether the next token is valid by checking it against the specified grammar. There are two basic approaches to this out there: using actual grammars, roughly in Backus-Naur form, and using JSON Schema with constraints on the data structures. The latter can include regexes, which makes JSON Schema roughly comparable in power to a more formal grammar.
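For a concrete sense of what such a constraint looks like, here is a minimal JSON Schema for a hyphen-separated lowercase string. This is an illustrative fragment, not the exact schema jj-gpc generates:

```json
{
  "type": "string",
  "pattern": "^[a-z]+(-[a-z]+)*$",
  "description": "A lowercase, hyphen-separated branch name"
}
```

At generation time, a runtime that supports structured output can reject any candidate token that would make the string stop matching that `pattern`.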

Now, which one is easier to write by hand is no debate: BNF wins hands down. Which one is easier to programmatically generate is something else entirely, though, because the schemars crate makes it trivial to generate JSON Schema from Rust data structures, even including fancy bits like regular expression patterns. When I say trivial, I mean it. The whole data structure I defined, after a bit of mucking around, was this:

#[derive(JsonSchema, Deserialize, Debug)]
struct Branch(
    #[schemars(regex(pattern = "^[a-z]{1,10}(-[a-z]{1,10}){2,4}$"))]
    String
);

It is a “tuple struct” which enforces a regex pattern on the String it wraps. That’s the whole thing. The regex requires lowercase words of one to ten characters each, separated by hyphens, with at least 3 and not more than 5 such words.
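As a sanity check on exactly what that regex admits, here is a dependency-free sketch of the same rule in plain Rust. The function name is mine, not part of jj-gpc:

```rust
// Checks the same shape the schema's regex enforces:
// 3 to 5 hyphen-separated words, each 1 to 10 lowercase ASCII letters.
fn is_valid_branch_name(name: &str) -> bool {
    let words: Vec<&str> = name.split('-').collect();
    (3..=5).contains(&words.len())
        && words.iter().all(|w| {
            (1..=10).contains(&w.len())
                && w.chars().all(|c| c.is_ascii_lowercase())
        })
}

fn main() {
    assert!(is_valid_branch_name("add-json-schema"));
    assert!(is_valid_branch_name("fix-the-branch-name-bug"));
    assert!(!is_valid_branch_name("two-words")); // too few words
    assert!(!is_valid_branch_name("Add-Json-Schema")); // uppercase letters
    assert!(!is_valid_branch_name("```fix-this-now")); // stray backticks
    println!("all checks passed");
}
```

The difference with constrained generation is that the model can never emit an invalid name in the first place, so there is nothing to validate or clean up after the fact.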

With that in place, I finally stopped getting occasional output like `` ```something-like-this ``, and yes, the ` ``` ` was being included in that. This constraint also meant I could get rid of the bit of inline regex-wrangling I was doing to normalize the output from the LLM.

The big takeaways for me from this exercise:

  1. Being able to use a JSON Schema like this is super handy for constraining model output. This is old news to folks who have been hacking on LLM-based tools for a while, and I have known about it for quite some time, but actually putting it into practice is something else. I thought it would probably be new to at least some of you reading, as well!

  2. The schemars library is fantastic. I used the inline version of regexes, because that was most convenient, but there are a bunch of other ways to do it, including referencing a regex already in scope by name. Generating a JSON Schema with it was basically trivial.

  3. A bonus: Avoiding “conversational” prompts can be handy. It is easy to get in the habit of treating all LLM interactions as a variant of the chat-based model most of the existing tools use, but in practice what I actually wanted here was a prompt that was completely non-conversational.