Tom Goldstein
@tomgoldsteincs
Apr 22 • 9 tweets

LLMs have low randomness: if you ask the same thing twice you get similar responses. Generator prompts are a way to boost the randomness of LLMs.

Using a few generator prompts, I had Gemini write an entire instruction tuning dataset from scratch. It outperforms popular datasets.

Let’s start with a toy example of why we need generator prompts. Suppose I want a list of different colors. So I feed this prompt to Gemini 1000 times. This does poorly - I only get 33 unique outputs from 1000 runs. I need more randomness.
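
One way to measure this kind of collapse is to count distinct responses across repeated calls. This is a minimal sketch, not the setup used for the numbers above: `sample` is a hypothetical callable standing in for whatever model client you use, and `toy_model` is a made-up sampler that mimics a low-randomness model.

```python
import random
from collections import Counter
from typing import Callable


def count_unique(sample: Callable[[str], str], prompt: str, runs: int = 1000) -> Counter:
    """Call `sample(prompt)` `runs` times and tally the normalized responses."""
    counts: Counter = Counter()
    for _ in range(runs):
        # Collapse case and whitespace so trivially different strings match.
        counts[" ".join(sample(prompt).lower().split())] += 1
    return counts


# Hypothetical stand-in for a model call: it keeps returning the same few answers.
def toy_model(prompt: str, _rng: random.Random = random.Random(0)) -> str:
    return _rng.choice(["Red", "Blue", "Green", "red ", "Blue"])


tally = count_unique(toy_model, "Name a color.")
print(len(tally), "unique outputs from", sum(tally.values()), "runs")
```

Swapping `toy_model` for a real API call gives the same unique-output count used as the yardstick here.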

A generator prompt asks the model to enumerate a long list of execution paths, and then randomizes which paths get chosen.

Here's an example. The numbers 23 and 76 are randomized each time the prompt is called.

This prompt gives me 782 unique outputs from 1000 runs.
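
A rough sketch of how such a prompt could be assembled in code. The prompt wording, the list size of 100, and the function name `make_generator_prompt` are all assumptions for illustration, not the exact prompt from the thread; the only idea taken from it is that the selection indices are redrawn on every call.

```python
import random


def make_generator_prompt(rng: random.Random, list_size: int = 100) -> str:
    """Build one prompt with freshly randomized selection indices."""
    # Two distinct positions in the enumerated list, redrawn on every call.
    first, second = rng.sample(range(1, list_size + 1), 2)
    # Hypothetical wording: enumerate many options, then pick by random index.
    return (
        f"Write a numbered list of {list_size} different colors. "
        f"Then output only the colors at positions {first} and {second}."
    )


rng = random.Random()  # pass a seed here for reproducible prompts
prompts = [make_generator_prompt(rng) for _ in range(1000)]
print(prompts[0])
```

The randomness comes from the prompt rather than from the model's sampling, so each call is steered down a different path through the enumerated list.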

Now let’s do something useful. I want to extract math questions/answers from Gemini. I’ll ask the model to enumerate a long list of different topics, then subtopics, then write me a question.
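
The nested version can be sketched the same way. Again, the wording, the list sizes, and the name `make_math_prompt` are assumptions, not the actual GenQA prompt. With 50 topics and 50 subtopics, a single template has 50 × 50 = 2,500 distinct execution paths.

```python
import random


def make_math_prompt(rng: random.Random, n_topics: int = 50, n_subtopics: int = 50) -> str:
    """Build a nested generator prompt: topic, then subtopic, then question."""
    # Independent random picks at each level of the enumeration.
    topic_idx = rng.randint(1, n_topics)
    subtopic_idx = rng.randint(1, n_subtopics)
    # Hypothetical wording, for illustration only.
    return (
        f"List {n_topics} topics in mathematics. "
        f"Select topic number {topic_idx} and list {n_subtopics} subtopics of it. "
        f"Select subtopic number {subtopic_idx} and write a question about it, "
        "followed by a worked answer."
    )


print(make_math_prompt(random.Random(0)))
```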

Run this prompt through Gemini Pro 500K times, and presto! The resulting math instructions outperform the human-curated MathInstruct dataset.

The GenQA dataset contains 11M questions in 9 splits. You can find it on HuggingFace.

...or you can read our tech report on how to quickly distill your own bespoke instruction datasets using powerful models.

Finally, shout out to the recent Magpie dataset. It creates instructions by giving an empty prompt to Llama. Our “general” split was created the same way, but using GPT-3.5 instead of Llama. We focused on generator prompts because they are controllable, but empty prompts produce great "general" instruction sets like Magpie.

There's also a related paper from Yiming Zhang, Nicholas Carlini, and coauthors on prompting strategies for helping LLMs make random choices.
