Scenarios
What is a scenario?
In Okareo, a scenario is a collection of data points, each of which is defined by an input and a corresponding expected result. A single data point of a scenario can be represented as a JSON or dict object.
Scenarios describe the expected inputs and results of models, and they allow you to:
- Evaluate classification, retrieval, or generation models via Okareo's evaluations.
- Create synthetic data in Okareo via scenario generators.
Cookbook examples that showcase Okareo scenarios are available here:
- Colab Notebook
- Typescript Cookbook - Coming Soon
Try creating and generating scenarios for yourself with the companion Jupyter notebook - scenarios.ipynb
Formatting scenario data
The format of your inputs should match the format expected by your model. The format of your result depends on the type of evaluation you want to run on the scenario.
Classification
In classification scenarios, the results correspond to the expected category or label that the model should assign to the input. For example, a point in a classification scenario could look like the following:
{
"input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
"result": "rewards"
}
Here, "rewards" indicates that the input should be classified into the rewards category. See the Get started with Classification page for more on classification evaluations.
Retrieval
For a retrieval evaluation, each result is a list of one or more viable document IDs that should be returned for the associated input, like the following:
{
"input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
"result": ["35a4fd5b-453e-4ca6-9536-f20db7303344"]
}
See our Retrieval Testing guide for more details on setting up scenarios for retrieval evaluations.
Generation
Evaluation of generative models can be either referenced or reference-free. Referenced evaluations compare the generative model's output to one or more references; in such cases, the result field should contain the reference(s). For example:
{
"input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
"result": "With WebBizz Rewards, customers can earn points with each purchase and avail exclusive discounts."
}
When performing referenced evaluations, the reference in the result field will be compared against the model's outputs. The content of the reference depends on your use case and can vary from written responses to edited versions of the model's outputs.
For reference-free evaluations, the result field is not strictly necessary, so any placeholder value can be provided, e.g.:
{
"input": "Can you explain how the WebBizz Rewards loyalty program works and its benefits?",
"result": "<YOUR_PLACEHOLDER_STRING_HERE>"
}
Get started on setting up such scenarios with our Generation evaluation guide.
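The three result formats described above can be checked before a scenario is uploaded. The following is a minimal sketch using only the Python standard library; the function name `check_data_point` and the `kind` labels are hypothetical and not part of the Okareo SDK, and it assumes that classification labels and document IDs are strings.

```python
from typing import Any, Dict, List


def check_data_point(point: Dict[str, Any], kind: str) -> List[str]:
    """Return a list of problems found in one scenario data point.

    `kind` is a hypothetical label for this sketch: "classification",
    "retrieval", or "generation". An empty list means no problems were found.
    """
    problems = []
    for key in ("input", "result"):
        if key not in point:
            problems.append(f"missing '{key}' field")
    if problems:
        return problems

    result = point["result"]
    if kind == "classification":
        # The result is the expected category or label (assumed to be a string).
        if not isinstance(result, str) or not result:
            problems.append("classification result should be a non-empty label")
    elif kind == "retrieval":
        # The result is a list of one or more viable document IDs.
        is_id_list = isinstance(result, list) and all(isinstance(r, str) for r in result)
        if not is_id_list or not result:
            problems.append("retrieval result should be a non-empty list of document IDs")
    elif kind == "generation":
        # The result holds the reference(s), or any placeholder when reference-free.
        if result is None:
            problems.append("generation result should hold a reference or a placeholder")
    else:
        problems.append(f"unknown scenario kind: {kind}")
    return problems
```

Running a check like this over every data point before uploading can surface formatting mistakes, such as a retrieval result given as a bare string instead of a list.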
Seed scenarios
To get started in Okareo, you will need to begin with a Seed scenario, so-called since it can serve as the "seed" for Generated scenarios. Any scenario that has been uploaded to or created in Okareo can serve as the Seed for a Generated scenario.
As of now, there are three paths to creating/designating a Seed scenario:
- An uploaded file (.jsonl)
- A static definition
- An existing scenario (Seed or Generated)
Creating seed scenarios
To create a seed scenario with a .jsonl file, you can use the following:
- Python
seed_scenario = okareo.upload_scenario_set(
    file_path='./path/to/your/file.jsonl',
    scenario_name="your_scenario_name"
)
- Typescript
const data: any = await okareo.upload_scenario_set({
    file_path: "./path/to/your/file.jsonl",
    scenario_name: "your_scenario_name",
    project_id: project_id
});
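For the .jsonl upload path above, the file holds one JSON object per line, each with an input and a result field. As a minimal sketch, the helpers below write and read such a file with the Python standard library; `write_jsonl` and `read_jsonl` are hypothetical names for illustration and are not part of the Okareo SDK.

```python
import json
from pathlib import Path


def write_jsonl(points, file_path):
    """Write one JSON object per line, the layout a .jsonl file uses."""
    path = Path(file_path)
    with path.open("w", encoding="utf-8") as f:
        for point in points:
            f.write(json.dumps(point) + "\n")
    return path


def read_jsonl(file_path):
    """Read the file back into a list of dicts, skipping blank lines."""
    with Path(file_path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

The path returned by `write_jsonl` can then be passed as the file_path argument of upload_scenario_set, as in the example above.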
To create a seed scenario via a static definition, you can use the following:
- Python
from okareo_api_client.models import ScenarioSetCreate, ScenarioType, SeedData

# list of statically defined seed data
seed_data = [
    SeedData(input_="input1", result="result1"),
    SeedData(input_="input2", result="result2"),
    SeedData(input_="input3", result="result3")
]

# request for scenario set creation
scenario_set_create = ScenarioSetCreate(
    name="your_static_scenario_name",
    generation_type=ScenarioType.SEED,
    seed_data=seed_data
)
static_scenario = okareo.create_scenario_set(scenario_set_create)
- Typescript
import { Okareo, ScenarioType, SeedData } from 'okareo-ts-sdk';

// request for scenario set creation
const static_scenario: any = await okareo.create_scenario_set({
    name: "your_static_scenario_name",
    project_id: project_id,
    generation_type: ScenarioType.SEED,
    seed_data: [
        SeedData({ input: "input1", result: "result1" }),
        SeedData({ input: "input2", result: "result2" }),
        SeedData({ input: "input3", result: "result3" })
    ]
});
Finally, to use a previously created scenario as a seed, you can call okareo.generate_scenarios (okareo.generate_scenario_set in Typescript) with the proper scenario_id:
- Python
# use the previously created `static_scenario` to seed another generated scenario
new_generated_scenario = okareo.generate_scenarios(
    source_scenario=static_scenario.scenario_id,
    name="generated_seed_scenario"
)
- Typescript
// use the previously created `static_scenario` to seed another generated scenario
const new_generated_scenario: any = await okareo.generate_scenario_set({
    project_id: project_id,
    name: "generated_seed_scenario",
    source_scenario_id: static_scenario.scenario_id,
    number_examples: 5,
    generation_type: ScenarioType.NEGATION
});
Generating synthetic scenarios
Assuming you have an existing scenario to use as a Seed, Okareo lets you automatically generate synthetic test cases based on a suite of scenario generators.
Generated scenarios can be a powerful tool to improve your model evaluation pipeline by allowing you to:
- Create new test cases automatically
- Ensure robustness to input perturbations/human error
Here, we describe our available scenario generators in more detail and offer a few examples of potential use cases. You can try these generators for yourself by checking out scenarios.ipynb.
Rephrasing
The Rephrasing generator rewords each sentence of the input while keeping the same content. This can be useful when you want to ensure that your model returns the same results under semantically identical inputs.
Example
--------Seed #0--------
WebBizz is dedicated to providing our customers with a seamless online shopping experience. Our platform is designed with user-friendly interfaces to help you browse and select the best products suitable for your needs...
-----Generated #0------
WebBizz prioritizes a smooth digital shopping journey for our customers. Our platform is tailored with straightforward interfaces for easier product browsing and selection...
Relevant Terms
The Relevant Terms generator returns three terms based on tf-idf, meaning the terms are frequent in the document and relatively less frequent in the larger corpus of the scenario's inputs. This can be useful when you'd like to produce queries based on keywords, a typical pattern among search engine users.
Example
--------Seed #2--------
WebBizz places immense value on its dedicated clientele, recognizing their loyalty through the exclusive 'Premium Club' membership. This special program is designed to enrich the shopping experience, providing a suite of benefits tailored to our valued members. Among the advantages, members enjoy complimentary shipping, granting them a seamless and cost-effective way to receive their purchases. Additionally, the 'Premium Club' offers early access to sales, allowing members to avail themselves of promotional offers before they are opened to the general public.
-----Generated #0------
offers members club
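To make the tf-idf idea concrete, here is a minimal sketch that scores the terms of one input against the other inputs in a scenario and keeps the top three. It uses a plain textbook formulation (term frequency times log inverse document frequency) and a simple regex tokenizer; both are assumptions for illustration, since the generator's actual tokenization and weighting are not described here. The names `tokenize` and `top_terms` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_terms(inputs, index, k=3):
    """Return the k highest-scoring tf-idf terms for inputs[index]."""
    docs = [tokenize(text) for text in inputs]
    # Document frequency: in how many inputs does each term appear?
    doc_freq = Counter(term for doc in docs for term in set(doc))
    term_counts = Counter(docs[index])
    n_docs = len(docs)
    scores = {
        term: (count / len(docs[index])) * math.log(n_docs / doc_freq[term])
        for term, count in term_counts.items()
    }
    # Sort by score (highest first), breaking ties alphabetically.
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]
```

Terms that occur in every input, such as "the", score zero because their inverse document frequency is log(1), which is why the surviving terms tend to be specific to a single input.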
Misspellings
The Misspellings generator lets you create scenarios with human-like errors. This can be useful if your model will be used in a context where inputs are likely to be error-prone. For example, you may be evaluating a model used in a conversational context (e.g., as a customer service chatbot).
Example
--------Seed #0--------
The quick brown fox jumps over the lazy dog
-----Generated #0------
The quick brown fox jumps over the lazt dog
-----Generated #1------
The quick brown fox humps over the lazy dog
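Both generated examples above swap one letter for a key that sits next to it on a QWERTY keyboard (y to t, j to h). The sketch below imitates that single kind of error; it is only an illustration of the idea, not the generator's actual method, and the adjacency map `NEIGHBORS` and the function `add_typo` are hypothetical.

```python
import random

# Partial QWERTY adjacency map (an assumption for this sketch, lowercase letters only).
NEIGHBORS = {
    "q": "wa", "w": "qes", "e": "wrd", "r": "etf", "t": "ryg", "y": "tuh",
    "u": "yij", "i": "uok", "o": "ipl", "p": "o",
    "a": "qsz", "s": "awdx", "d": "sefc", "f": "drgv", "g": "fthb",
    "h": "gyjn", "j": "hukm", "k": "jil", "l": "ko",
    "z": "ax", "x": "zsc", "c": "xdv", "v": "cfb", "b": "vgn",
    "n": "bhm", "m": "nj",
}


def add_typo(text, rng):
    """Replace one randomly chosen lowercase letter with a neighboring key."""
    positions = [i for i, ch in enumerate(text) if ch in NEIGHBORS]
    if not positions:
        return text
    i = rng.choice(positions)
    return text[:i] + rng.choice(NEIGHBORS[text[i]]) + text[i + 1:]
```

Passing a seeded `random.Random` instance as `rng` keeps the output reproducible, which helps when comparing model behavior across runs.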
Contractions
The Contractions generator attempts to shorten words in a human-like way. Similar to Misspellings, this generator can be beneficial if your model will be seeing conversational inputs.
Example
--------Seed #0--------
The quick brown fox jumps over the lazy dog
-----Generated #0------
The quick brwn fox jumps over the lazy dog
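In the example above, "brown" becomes "brwn" by dropping a vowel. As a rough sketch of that one pattern, the function below removes the first interior vowel of a word. This is an assumption about what a human-like shortening might look like, not a description of how the Contractions generator works, and `contract_word` is a hypothetical name.

```python
VOWELS = set("aeiou")


def contract_word(word):
    """Drop the first interior vowel of a word, leaving short words alone."""
    if len(word) < 4:
        return word
    # Skip the first and last characters so the word stays recognizable.
    for i in range(1, len(word) - 1):
        if word[i].lower() in VOWELS:
            return word[:i] + word[i + 1:]
    return word
```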
Reverse Questions
The Reverse Question generator poses questions based on the contents of inputs in the seed scenario. This generator is particularly useful when assessing the robustness of a retrieval model.
Suppose you have a database of articles and you would like to generate questions that a user might pose to a chatbot. The Reverse Question generator can help you get coverage on a wide range of questions that potential customers might pose, allowing you to evaluate the chatbot's robustness on corner cases.
Example
--------Seed #0--------
WebBizz is dedicated to providing our customers with a seamless online shopping experience. Our platform is designed with user-friendly interfaces to help you browse and select the best products suitable for your needs. We offer a wide range of products from top brands and new entrants, ensuring diversity and quality in our offerings. Our 24/7 customer support is ready to assist you with any queries, from product details, shipping timelines, to payment methods. We also have a dedicated FAQ section addressing common concerns. Always ensure you are logged in to enjoy personalized product recommendations and faster checkout processes.
-----Generated #0------
What features does WebBizz offer to enhance the customer's online shopping experience?
Conditionals
The Conditional generator assumes that the input values are questions and rewords each question to emphasize a particular clause. This can be used in conjunction with the Reverse Question generator to further expand your test coverage in a retrieval scenario.
Example
--------Seed #4--------
What is the primary benefit of joining the WebBizz Rewards program?
-----Generated #0------
Should you decide to join the WebBizz Rewards program, what would be the primary benefit?
Generator usage
To use a scenario generator, you can use the following template:
- Python
from okareo_api_client.models import ScenarioType

# assuming you have an available seed scenario `source_scenario`
okareo.generate_scenarios(
    source_scenario=source_scenario.scenario_id,
    name="generated_scenario",
    num_examples=1,
    generation_type=ScenarioType.REPHRASE_INVARIANT
)
- Typescript
import { Okareo, ScenarioType } from 'okareo-ts-sdk';

// assuming you have an available seed scenario `source_scenario`
const scenario: any = await okareo.generate_scenario_set({
    project_id: project_id,
    name: "generated_scenario",
    source_scenario_id: source_scenario.scenario_id,
    number_examples: 5,
    generation_type: ScenarioType.REPHRASE_INVARIANT
});
For each input in the seed scenario, the generator will attempt to generate num_examples variations of that input (number_examples in the Typescript example).
The generator type is denoted by the ScenarioType enum, and the above example uses the Rephrasing generator. To use a different generator, change the enum to a valid ScenarioType from the table below.
Generator | ScenarioType | Brief Description |
---|---|---|
Rephrasing | REPHRASE_INVARIANT | Changes the wording of each sentence per input. |
Relevant Terms | TERM_RELEVANCE_INVARIANT | Returns relevant/uniquely identifying words from inputs. |
Misspellings | COMMON_MISSPELLINGS | Adds human-like typing errors to inputs. |
Contractions | COMMON_CONTRACTIONS | Removes characters from inputs. |
Reverse Questions | TEXT_REVERSE_QUESTION | Creates questions where an input contains the relevant answer. |
Conditionals | CONDITIONAL | Changes questions in inputs to emphasize a specific condition. |
Chaining generators
Composing multiple generators into a chain can help you test different model behaviors. For example, suppose you have trained a retrieval model on user questions. You might want to see if the model performs well based on keyword queries with and without errors. You might set up a chain of generators as follows:
- Python
from okareo_api_client.models import ScenarioSetCreate, ScenarioType, SeedData

# static definition for retrieval questions as seed data
seed_data = [
    SeedData(input_="What type of products does WebBizz offer?", result=["75eaa363-dfcc-499f-b2af-1407b43cb133"]),
    # ...
]

# upload the seed data
scenario_set_create = ScenarioSetCreate(
    seed_data=seed_data,
    name="Chain Step #1: Seed Questions",
    generation_type=ScenarioType.SEED
)
questions_scenario = okareo.create_scenario_set(scenario_set_create)

# first generator uses uploaded scenario as seed
term_relev_scenario = okareo.generate_scenarios(
    source_scenario=questions_scenario.scenario_id,
    name="Chain Step #2: Term Relevance",
    generation_type=ScenarioType.TERM_RELEVANCE_INVARIANT
)

# second generator uses the first generator's output as a seed
misspellings_scenario = okareo.generate_scenarios(
    source_scenario=term_relev_scenario.scenario_id,
    name="Chain Step #3: Misspellings",
    generation_type=ScenarioType.COMMON_MISSPELLINGS
)

# third generator uses the second generator's output as a seed
contractions_scenario = okareo.generate_scenarios(
    source_scenario=misspellings_scenario.scenario_id,
    name="Chain Step #4: Contractions",
    generation_type=ScenarioType.COMMON_CONTRACTIONS
)
- Typescript
import { Okareo, ScenarioType } from 'okareo-ts-sdk';

// static definition for retrieval questions as seed data
/* Pass data directly or upload a jsonl file in the following format:
 * //filename: rag_intent_prompts.jsonl
 * {"input": "What type of products does WebBizz offer?", "result": ["75eaa363-dfcc-499f-b2af-1407b43cb133"]}
 * {"input": "input_2", "result": ["UUID_2"]}
 * {"input": "input_3", "result": ["UUID_3"]}
 * ...
 */

// upload the jsonl seed data
const upload_scenario: any = await okareo.upload_scenario_set({
    file_path: "./path/to/your/file.jsonl",
    scenario_name: "Chain Step #1: Seed Questions",
    project_id: project_id
});

// first generator uses uploaded scenario as seed
const term_relev_scenario: any = await okareo.generate_scenario_set({
    source_scenario_id: upload_scenario.scenario_id,
    name: "Chain Step #2: Term Relevance",
    generation_type: ScenarioType.TERM_RELEVANCE_INVARIANT
});

// second generator uses the first generator's output as a seed
const misspellings_scenario: any = await okareo.generate_scenario_set({
    source_scenario_id: term_relev_scenario.scenario_id,
    name: "Chain Step #3: Misspellings",
    generation_type: ScenarioType.COMMON_MISSPELLINGS
});

// third generator uses the second generator's output as a seed
const contractions_scenario: any = await okareo.generate_scenario_set({
    source_scenario_id: misspellings_scenario.scenario_id,
    name: "Chain Step #4: Contractions",
    generation_type: ScenarioType.COMMON_CONTRACTIONS
});
Now all the steps of the chain are available to use in evaluating your retrieval model.
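Because every step in a chain follows the same pattern (the previous scenario's scenario_id becomes the next source), the repetition can be folded into a loop. The helper below is a minimal sketch of that idea; `run_chain` is a hypothetical function, not part of the Okareo SDK, and it assumes the `generate` callable accepts the keyword arguments used in the Python examples above (source_scenario, name, generation_type) and returns an object with a scenario_id attribute.

```python
def run_chain(generate, seed_scenario_id, steps):
    """Run generators in sequence, seeding each step with the previous output.

    `generate` is any callable that accepts the keyword arguments
    source_scenario, name, and generation_type, and returns an object
    with a `scenario_id` attribute.
    `steps` is a list of (name, generation_type) pairs.
    """
    results = []
    source_id = seed_scenario_id
    for name, generation_type in steps:
        scenario = generate(
            source_scenario=source_id,
            name=name,
            generation_type=generation_type,
        )
        results.append(scenario)
        # The next generator in the chain is seeded by this step's output.
        source_id = scenario.scenario_id
    return results
```

Under those assumptions, the chain above could be expressed by passing okareo.generate_scenarios as `generate`, questions_scenario.scenario_id as the seed, and a list of (name, ScenarioType) pairs as `steps`. Taking the callable as a parameter also makes it possible to exercise the chaining logic with a stub before making any real calls.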