Scenarios

What is a scenario?

In Okareo, a scenario is a collection of data points, each of which is defined by an input and a corresponding expected result. A single data point of a scenario can be represented as a JSON object or a Python dict.

Scenarios describe the expected inputs and results of models, and they allow you to:

Evaluate classification, retrieval, or generation models via Okareo's evaluations.
Create synthetic data in Okareo via scenario generators.

Cookbook examples that showcase Okareo scenarios are available here:

Colab Notebook
Typescript Cookbook - Coming Soon

Note: Try creating and generating scenarios for yourself with the companion Jupyter notebook, scenarios.ipynb.

Formatting scenario data

The format of your inputs should match the format expected by your model. The format of your result is dependent on the type of evaluations you want to run on the scenario.

Classification

In classification scenarios, the results correspond to the expected category or label that the model should assign to the input. For example, a point in a classification scenario could look like the following:

{
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": "rewards"
}

Here, "rewards" indicates that the input should be classified into the "rewards" category. See the Get started with Classification page for more on classification evaluations.

Retrieval

For a retrieval evaluation, each result is a list of one or more viable document IDs that should be returned for the associated input, like the following:

{
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": ["35a4fd5b-453e-4ca6-9536-f20db7303344"]
}
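To make the role of the result list concrete, here is a minimal local sketch of checking whether a retrieval output "hits" any of the expected document IDs. The is_hit helper, the cutoff k, and the retrieved list are hypothetical illustrations and are not part of the Okareo SDK or its metrics.

```python
def is_hit(expected_ids, retrieved_ids, k=3):
    """Return True if any expected document ID appears in the top-k retrieved IDs."""
    top_k = retrieved_ids[:k]
    return any(doc_id in top_k for doc_id in expected_ids)


data_point = {
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": ["35a4fd5b-453e-4ca6-9536-f20db7303344"],
}

# Hypothetical ranked output from a retrieval model (made-up first ID)
retrieved = [
    "00000000-0000-0000-0000-000000000000",
    "35a4fd5b-453e-4ca6-9536-f20db7303344",
]

hit = is_hit(data_point["result"], retrieved, k=3)
```

Because the result is a list, any one of several viable documents counts as a hit in this sketch.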

See our Retrieval Testing guide for more details on setting up scenarios for retrieval evaluations.

Generation

Evaluation of generative models can either be referenced or reference-free. Referenced evaluations involve comparing the generative model's output to one or more references, and in such cases, the result field should contain the reference(s). For example,

{
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": "With WebBizz Rewards, customers can earn points with each purchase and avail exclusive discounts."
}

When performing referenced evaluations, the reference in the result field will be compared against the model's outputs. The content of the reference depends on your use case and can vary from written responses to edited versions of the model's outputs.

For reference-free evaluations, the result field is not strictly necessary, meaning any placeholder value can be provided, e.g.

{
    "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
    "result": "<YOUR_PLACEHOLDER_STRING_HERE>"
}

Get started on setting up such scenarios with our Generation evaluation guide.

Seed scenarios

To get started in Okareo, you will need to begin with a Seed scenario, so-called since it can serve as the "seed" for Generated scenarios. Any scenario that has been uploaded to or created in Okareo can serve as the Seed for a Generated scenario.

As of now, there are three paths to creating/designating a Seed scenario:

An uploaded file (.jsonl)
A static definition
An existing scenario (Seed or Generated)

Creating seed scenarios

To create a seed scenario with a .jsonl file, you can use the following:

Python

seed_scenario = okareo.upload_scenario_set(
    file_path='./path/to/your/file.jsonl',
    scenario_name="your_scenario_name"
)

Typescript

const data: any = await okareo.upload_scenario_set({
    file_path: "./path/to/your/file.jsonl",
    scenario_name: "your_scenario_name",
    project_id: project_id
});
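If you do not yet have a .jsonl file, the following sketch shows one way to produce one with the Python standard library: one JSON object per line, each with an input and a result field. The write_jsonl helper, the file name, and the second row's label are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(rows, path):
    """Write one JSON object per line, checking that each row has the two scenario fields."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for row in rows:
            missing = {"input", "result"} - set(row)
            if missing:
                raise ValueError(f"row is missing fields: {sorted(missing)}")
            f.write(json.dumps(row) + "\n")
    return path


rows = [
    {
        "input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
        "result": "rewards",
    },
    # hypothetical second data point, for illustration only
    {"input": "What type of products does WebBizz offer?", "result": "products"},
]

# hypothetical output location
seed_file = write_jsonl(rows, Path(tempfile.gettempdir()) / "seed_scenario.jsonl")
```

The resulting path can then be passed as file_path to the upload call shown above.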

To create a seed scenario via a static definition, you can use the following:

Python

from okareo_api_client.models import ScenarioSetCreate, ScenarioType, SeedData

# list of statically defined seed data
seed_data=[
    SeedData(input_="input1", result="result1"),
    SeedData(input_="input2", result="result2"),
    SeedData(input_="input3", result="result3")
]

# request for scenario set creation
scenario_set_create = ScenarioSetCreate(
    name="your_static_scenario_name",
    generation_type=ScenarioType.SEED,
    seed_data=seed_data
)

static_scenario = okareo.create_scenario_set(scenario_set_create)

Typescript

import { Okareo, ScenarioType, SeedData } from 'okareo-ts-sdk';

// request for scenario set creation
const static_scenario: any = await okareo.create_scenario_set({
    name: "your_static_scenario_name",
    project_id: project_id,
    generation_type: ScenarioType.SEED,
    seed_data: [
        SeedData({ input: "input1", result: "result1" }),
        SeedData({ input: "input2", result: "result2" }),
        SeedData({ input: "input3", result: "result3" })
    ]
});

Finally, to use a previously created scenario as a seed, pass its scenario_id to okareo.generate_scenarios (generate_scenario_set in the Typescript SDK):

Python

# use the previously created `static_scenario` to seed a generated scenario

new_generated_scenario = okareo.generate_scenarios(
    source_scenario=static_scenario.scenario_id,
    name="generated_seed_scenario"
)

Typescript

// use the previously created `static_scenario` to seed a generated scenario

const new_generated_scenario: any = await okareo.generate_scenario_set({
    project_id: project_id,
    name: "generated_seed_scenario",
    source_scenario_id: static_scenario.scenario_id,
    number_examples: 5,
    generation_type: ScenarioType.NEGATION
});

Generating synthetic scenarios

Assuming you have an existing scenario to use as a Seed, Okareo lets you automatically generate synthetic test cases based on a suite of scenario generators.

Generated scenarios can be a powerful tool to improve your model evaluation pipeline by allowing you to:

Create new test cases automatically
Ensure robustness to input perturbations/human error

Here, we describe our available scenario generators in more detail and offer a few examples of potential use cases. You can try these generators for yourself by checking out scenarios.ipynb.

Rephrasing

The Rephrasing generator rewords each sentence of the input while keeping the same content. This can be useful when you want to ensure that your model returns the same results under semantically identical inputs.

Example

--------Seed #0--------
WebBizz is dedicated to providing our customers with a seamless online shopping experience. Our platform is designed with user-friendly interfaces to help you browse and select the best products suitable for your needs...
-----Generated #0------
WebBizz prioritizes a smooth digital shopping journey for our customers. Our platform is tailored with straightforward interfaces for easier product browsing and selection...
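One way to use rephrased inputs is an invariance check: run your model on each seed input and its generated variant, then count how often the outputs agree. The sketch below is a local illustration only. consistency_rate and toy_classifier are hypothetical names, and the toy classifier stands in for whatever model you are evaluating.

```python
def consistency_rate(model, pairs):
    """Fraction of (seed, generated) pairs on which the model returns the same output."""
    if not pairs:
        return 0.0
    same = sum(1 for seed, generated in pairs if model(seed) == model(generated))
    return same / len(pairs)


def toy_classifier(text):
    # Hypothetical stand-in for a real model: a keyword rule
    return "shopping" if "shopping" in text.lower() else "other"


pairs = [
    (
        "WebBizz is dedicated to providing our customers with a seamless online shopping experience.",
        "WebBizz prioritizes a smooth digital shopping journey for our customers.",
    ),
]

rate = consistency_rate(toy_classifier, pairs)
```

A rate below 1.0 would point to inputs where rewording alone changes the model's answer.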

Relevant Terms

The Relevant Terms generator returns three terms based on tf-idf, meaning the terms are frequent in the document and relatively less frequent in the larger corpus of the scenario's inputs. This can be useful when you'd like to produce queries based on keywords, a typical pattern that search engine users might use.

Example

--------Seed #2--------
WebBizz places immense value on its dedicated clientele, recognizing their loyalty through the exclusive 'Premium Club' membership. This special program is designed to enrich the shopping experience, providing a suite of benefits tailored to our valued members. Among the advantages, members enjoy complimentary shipping, granting them a seamless and cost-effective way to receive their purchases. Additionally, the 'Premium Club' offers early access to sales, allowing members to avail themselves of promotional offers before they are opened to the general public.
-----Generated #0------
offers members club
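For intuition about tf-idf ranking, here is a small standard-library sketch that scores the terms of one document against a corpus and keeps the top three. It is not Okareo's implementation: the tokenizer, the scoring variant (raw term frequency times the log of inverse document frequency), the tie-breaking rule, and the three toy documents are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_terms(documents, doc_index, n=3):
    """Return the n highest-scoring tf-idf terms of documents[doc_index]."""
    tokenized = [tokenize(d) for d in documents]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    counts = Counter(tokenized[doc_index])
    total = len(tokenized[doc_index])
    scores = {
        term: (count / total) * math.log(len(documents) / doc_freq[term])
        for term, count in counts.items()
    }
    # sort by descending score, breaking ties alphabetically
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# hypothetical toy corpus
docs = [
    "club members enjoy club offers for members",
    "shipping offers are fast",
    "returns are easy for customers",
]

top = top_terms(docs, doc_index=0)
```

Terms that also appear in other documents ("offers", "for") score lower than terms unique to the first document.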

Misspellings

The Misspellings generator lets you create scenarios with human-like errors. This can be useful if your model will be used in a context where inputs are likely to be error-prone. For example, you may be evaluating a model used in a conversational context (e.g., as a customer service chatbot).

Example

--------Seed #0--------
The quick brown fox jumps over the lazy dog
-----Generated #0------
The quick brown fox jumps over the lazt dog
-----Generated #1------
The quick brown fox humps over the lazy dog

Contractions

The Contractions generator attempts to shorten words in a human-like way. Similar to Misspellings, this generator can be beneficial if your model will be seeing conversational inputs.

Example

--------Seed #0--------
The quick brown fox jumps over the lazy dog
-----Generated #0------
The quick brwn fox jumps over the lazy dog

Reverse Questions

The Reverse Question generator poses questions based on the contents of inputs in the seed scenario. This generator is particularly useful when assessing the robustness of a retrieval model.

Suppose you have a database of articles and you would like to generate questions that a user might pose to a chatbot. The Reverse Question generator can help you get coverage on a wide range of questions that potential customers might pose, allowing you to evaluate the chatbot's robustness on corner cases.

Example

--------Seed #0--------
WebBizz is dedicated to providing our customers with a seamless online shopping experience. Our platform is designed with user-friendly interfaces to help you browse and select the best products suitable for your needs. We offer a wide range of products from top brands and new entrants, ensuring diversity and quality in our offerings. Our 24/7 customer support is ready to assist you with any queries, from product details, shipping timelines, to payment methods. We also have a dedicated FAQ section addressing common concerns. Always ensure you are logged in to enjoy personalized product recommendations and faster checkout processes.
-----Generated #0------
What features does WebBizz offer to enhance the customer's online shopping experience?

Conditionals

The Conditional generator assumes that the input values are questions and rewords each question to emphasize a particular clause. This can be used in conjunction with the Reverse Question generator to further expand your test coverage in a retrieval scenario.

Example

--------Seed #4--------
What is the primary benefit of joining the WebBizz Rewards program?
-----Generated #0------
Should you decide to join the WebBizz Rewards program, what would be the primary benefit?

Synonyms

The Synonyms generator takes two scenarios as its input:

The seed scenario to modify
The synonym set scenario that defines the groups of synonyms to replace with one another

seed_scenario = okareo_client.create_scenario_set(
    ScenarioSetCreate( 
        name="seed_scenario",
        seed_data=[
            SeedData(input_="the quick brown fox jumps over the lazy dog", result="N/A"),
            SeedData(input_="the rain in spain falls mainly on the plain", result="N/A"),
        ]
    )
)

synonym_scenario = okareo_client.create_scenario_set(
    ScenarioSetCreate( 
        name="synonyms_scenario",
        seed_data=[
            SeedData(input_=["brown", "hazel"], result="N/A"),
            SeedData(input_=["lazy", "lethargic"], result="N/A"),
            SeedData(input_=["plain", "field"], result="N/A"),
        ]
    )
)

scenario_set_generate = ScenarioSetGenerate(
    source_scenario_id=seed_scenario.scenario_id,
    name="my_synonym_scenario",
    generation_type=ScenarioType.SYNONYMS,
    scenario_set_id=synonym_scenario.scenario_id,
)
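To see what kind of output to expect from the two scenarios above, the following sketch applies the same synonym groups locally. It is only an approximation under stated assumptions: swap_synonyms is a hypothetical helper, it matches whole words, and it swaps each word for the next member of its group. The Okareo generator may behave differently.

```python
import re


def swap_synonyms(text, groups):
    """Replace each whole word found in a synonym group with the next word in that group."""
    replacements = {}
    for group in groups:
        for i, word in enumerate(group):
            replacements[word] = group[(i + 1) % len(group)]
    if not replacements:
        return text
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, replacements)) + r")\b")
    # a single pass, so a replaced word is not swapped back
    return pattern.sub(lambda m: replacements[m.group(1)], text)


groups = [["brown", "hazel"], ["lazy", "lethargic"], ["plain", "field"]]
seed_inputs = [
    "the quick brown fox jumps over the lazy dog",
    "the rain in spain falls mainly on the plain",
]

swapped = [swap_synonyms(text, groups) for text in seed_inputs]
```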

Custom Generator

The Custom generator allows you to write your own prompts to generate data based on your seed scenario.

seed_scenario = okareo_client.create_scenario_set(
    ScenarioSetCreate( 
        name="seed_scenario",
        seed_data=[
            SeedData(input_="Lorem ipsum dolor sit amet", result="N/A"),
            SeedData(input_="consectetur adipiscing elit, sed do", result="N/A"),
            SeedData(input_="eiusmod tempor incididunt ut labore", result="N/A"),
        ]
    )
)

scenario_set_generate = ScenarioSetGenerate(
    source_scenario_id=seed_scenario.scenario_id,
    name="my_custom_scenario",
    generation_type=ScenarioType.CUSTOM_GENERATOR,
    generation_prompt="generate the next 5 words of 'lorem ipsum' based on the following text: {input}",
)
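The {input} placeholder in generation_prompt stands for a seed row's input. As a rough local preview of the prompts this would produce, the sketch below fills the placeholder for each seed input. How Okareo performs the substitution internally is an assumption here, and render_prompt is a hypothetical helper.

```python
def render_prompt(template, seed_input):
    """Fill the {input} placeholder with one seed input (plain string replacement)."""
    return template.replace("{input}", seed_input)


generation_prompt = (
    "generate the next 5 words of 'lorem ipsum' based on the following text: {input}"
)
seed_inputs = [
    "Lorem ipsum dolor sit amet",
    "consectetur adipiscing elit, sed do",
    "eiusmod tempor incididunt ut labore",
]

rendered = [render_prompt(generation_prompt, text) for text in seed_inputs]
```

Previewing prompts this way can help catch a missing or misspelled placeholder before you submit a generation request.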

Custom Multi-chunk Generator

The Custom Multi-chunk generator is an extension of the Custom Generator. The generator tries to group consecutive rows of your scenario together, then uses your prompt to generate new rows based on the grouped rows.

Note: The result field of your scenario must be a rank-ordered index for the Multi-chunk generator to function properly.

seed_scenario = okareo_client.create_scenario_set(
    ScenarioSetCreate( 
        name="seed_scenario_with_index",
        seed_data=[
            SeedData(input_="Lorem ipsum dolor sit amet", result="1"),
            SeedData(input_="consectetur adipiscing elit, sed do", result="2"),
            SeedData(input_="eiusmod tempor incididunt ut labore", result="3"),
        ]
    )
)

scenario_set_generate = ScenarioSetGenerate(
    source_scenario_id=seed_scenario.scenario_id,
    name="my_custom_multi_chunk_scenario",
    generation_type=ScenarioType.CUSTOM_MULTI_CHUNK_GENERATOR,
    generation_prompt="generate the next 5 words of 'lorem ipsum' based on the following chunks of text: {input}", # 'input' here corresponds to the grouped set of scenario inputs
)
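The grouping step can be pictured with a short local sketch: rows are ordered by the index in their result field, and consecutive rows are joined into chunks. The chunk size, the non-overlapping windows, and the group_consecutive helper are assumptions for illustration. The grouping strategy Okareo actually uses is not specified here.

```python
def group_consecutive(rows, chunk_size=2):
    """Order rows by their rank index and join each run of chunk_size inputs."""
    ordered = sorted(rows, key=lambda row: int(row["result"]))
    return [
        "\n".join(row["input"] for row in ordered[i:i + chunk_size])
        for i in range(0, len(ordered), chunk_size)
    ]


rows = [
    {"input": "consectetur adipiscing elit, sed do", "result": "2"},
    {"input": "Lorem ipsum dolor sit amet", "result": "1"},
    {"input": "eiusmod tempor incididunt ut labore", "result": "3"},
]

chunks = group_consecutive(rows, chunk_size=2)
```

This also shows why the index matters: without it there is no reliable way to tell which rows are consecutive.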

Generator usage

To use a scenario generator, you can use the following template:

Python

from okareo_api_client.models import ScenarioType
# assuming you have an available seed scenario `source_scenario`

okareo.generate_scenarios(
    source_scenario=source_scenario.scenario_id,
    name="generated_scenario",
    num_examples=1,
    generation_type=ScenarioType.REPHRASE_INVARIANT
)

Typescript

import { Okareo, ScenarioType } from 'okareo-ts-sdk';
// assuming you have an available seed scenario `source_scenario`

const scenario: any = await okareo.generate_scenario_set({
    project_id: project_id,
    name: "generated_scenario",
    source_scenario_id: source_scenario.scenario_id,
    number_examples: 5,
    generation_type: ScenarioType.REPHRASE_INVARIANT
});

For each input in the seed scenario, the generator will attempt to generate num_examples (number_examples in the Typescript example) variations of that input.
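Since the generator attempts a fixed number of variations per input, you can estimate an upper bound on the size of a generated scenario before running it. The sketch below is simple arithmetic with hypothetical helper names. It assumes every attempt succeeds and that each generated scenario contains only the generated rows.

```python
def max_generated_rows(num_seed_rows, num_examples):
    """Upper bound on rows produced by one generator step."""
    return num_seed_rows * num_examples


def max_rows_after_chain(num_seed_rows, examples_per_step):
    """Upper bound after feeding each step's output into the next step."""
    rows = num_seed_rows
    for num_examples in examples_per_step:
        rows = max_generated_rows(rows, num_examples)
    return rows


single_step = max_generated_rows(num_seed_rows=3, num_examples=5)
three_steps = max_rows_after_chain(num_seed_rows=3, examples_per_step=[2, 2, 2])
```

The bound grows multiplicatively when generated scenarios are used as seeds for further generators, so small per-step values are usually enough.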

The generator type is denoted by the ScenarioType enum, and the above example uses the Rephrasing generator. To use a different generator, simply change the enum to a valid ScenarioType in the table below.

Generator	`ScenarioType`	Brief Description
Rephrasing	REPHRASE_INVARIANT	Changes the wording of each sentence per `input`.
Relevant Terms	TERM_RELEVANCE_INVARIANT	Returns relevant/uniquely identifying words from `input`s.
Misspellings	COMMON_MISSPELLINGS	Adds human-like typing errors to `input`s.
Contractions	COMMON_CONTRACTIONS	Removes characters from `input`.
Reverse Questions	TEXT_REVERSE_QUESTION	Creates questions where an `input` contains the relevant answer.
Conditionals	CONDITIONAL	Changes questions in `input`s to emphasize a specific condition.
Synonyms	SYNONYMS	Replaces substrings in the source scenario with user-specified synonyms.
Custom	CUSTOM_GENERATOR	Generates data based on the user's `generation_prompt`.
Custom Multi-Chunk	CUSTOM_MULTI_CHUNK_GENERATOR	Generates data based on the user's `generation_prompt` and groups of `input`s.

Chaining generators

Composing multiple generators into a chain can help you test different model behaviors. For example, suppose you have trained a retrieval model on user questions. You might want to see if the model performs well based on keyword queries with and without errors. You might set up a chain of generators as follows:

Python

from okareo_api_client.models import ScenarioSetCreate, ScenarioType, SeedData

# static definition for retrieval questions as seed data
seed_data=[
    SeedData(input_="What type of products does WebBizz offer?", result=["75eaa363-dfcc-499f-b2af-1407b43cb133"]),
    # ...
]

# upload the seed data
scenario_set_create = ScenarioSetCreate(
    seed_data=seed_data,
    name="Chain Step #1: Seed Questions",
    generation_type=ScenarioType.SEED
)

questions_scenario = okareo.create_scenario_set(scenario_set_create)

# first generator uses uploaded scenario as seed
term_relev_scenario = okareo.generate_scenarios(
    source_scenario=questions_scenario.scenario_id,
    name="Chain Step #2: Term Relevance",
    generation_type=ScenarioType.TERM_RELEVANCE_INVARIANT
)

# second generator uses the first generator's output as a seed
misspellings_scenario = okareo.generate_scenarios(
    source_scenario=term_relev_scenario.scenario_id,
    name="Chain Step #3: Misspellings",
    generation_type=ScenarioType.COMMON_MISSPELLINGS
)

# third generator uses the second generator's output as a seed
contractions_scenario = okareo.generate_scenarios(
    source_scenario=misspellings_scenario.scenario_id,
    name="Chain Step #4: Contractions",
    generation_type=ScenarioType.COMMON_CONTRACTIONS
)

Typescript

import { Okareo, ScenarioType } from 'okareo-ts-sdk';
// static definition for retrieval questions as seed data

/* Pass data directly or upload a jsonl file in the following format:
* //filename: rag_intent_prompts.jsonl
*  {"input": "What type of products does WebBizz offer?", "result": ["75eaa363-dfcc-499f-b2af-1407b43cb133"]}
*  {"input": "input_2", "result": ["UUID_2"]}
*  {"input": "input_3", "result": ["UUID_3"]}
*  ...
*/

// upload the jsonl seed data
const upload_scenario: any = await okareo.upload_scenario_set({
    file_path: "./path/to/your/file.jsonl",
    scenario_name: "Chain Step #1: Seed Questions",
    project_id: project_id
});

// first generator uses uploaded scenario as seed
const term_relev_scenario: any = await okareo.generate_scenario_set({
    project_id: project_id,
    source_scenario_id: upload_scenario.scenario_id,
    name: "Chain Step #2: Term Relevance",
    generation_type: ScenarioType.TERM_RELEVANCE_INVARIANT
});

// second generator uses the first generator's output as a seed
const misspellings_scenario: any = await okareo.generate_scenario_set({
    project_id: project_id,
    source_scenario_id: term_relev_scenario.scenario_id,
    name: "Chain Step #3: Misspellings",
    generation_type: ScenarioType.COMMON_MISSPELLINGS
});

// third generator uses the second generator's output as a seed
const contractions_scenario: any = await okareo.generate_scenario_set({
    project_id: project_id,
    source_scenario_id: misspellings_scenario.scenario_id,
    name: "Chain Step #4: Contractions",
    generation_type: ScenarioType.COMMON_CONTRACTIONS
});

Now all the steps of the chain are available to use in evaluating your retrieval model.
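If you build chains often, the repeated calls can be folded into a small loop. The sketch below is one possible shape, not part of the SDK: run_chain is a hypothetical helper that takes the generating function as an argument (in real use you would pass okareo.generate_scenarios and ScenarioType members), and fake_generate is a stand-in so the example runs without an Okareo account.

```python
from types import SimpleNamespace


def run_chain(generate, seed_scenario_id, steps):
    """Feed each generated scenario's ID into the next generator in `steps`.

    `steps` is a list of (name, generation_type) pairs. Returns every
    scenario ID in the chain, starting with the seed.
    """
    scenario_ids = [seed_scenario_id]
    for name, generation_type in steps:
        generated = generate(
            source_scenario=scenario_ids[-1],
            name=name,
            generation_type=generation_type,
        )
        scenario_ids.append(generated.scenario_id)
    return scenario_ids


def fake_generate(source_scenario, name, generation_type):
    # Hypothetical stand-in for okareo.generate_scenarios; returns a made-up ID
    return SimpleNamespace(scenario_id=f"{source_scenario}>{generation_type}")


ids = run_chain(
    fake_generate,
    "seed",
    [
        ("Chain Step #2: Term Relevance", "TERM_RELEVANCE_INVARIANT"),
        ("Chain Step #3: Misspellings", "COMMON_MISSPELLINGS"),
    ],
)
```

Keeping every intermediate ID, rather than only the last one, matches the point above: each step of the chain remains available for evaluation.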
