forked from YiVal/YiVal
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
42 changed files
with
696 additions
and
831 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,176 @@ | ||
# This is a generated template. Modify the values as needed. | ||
|
||
custom_function: /Users/taofeng/YiVal/demo.headline_generation.headline_generation | ||
dataset: | ||
data_generators: | ||
openai_prompt_data_generator: | ||
chunk_size: 1000 | ||
input_function: | ||
description: | ||
Given an tech startup business, generate one corresponding landing | ||
page headline | ||
name: headline_generation_for_business | ||
parameters: | ||
tech_startup_business: str | ||
number_of_examples: 10 | ||
openai_model_name: gpt-4 | ||
output_path: generated_examples.pkl | ||
source_type: machine_generated | ||
description: Generated experiment config | ||
|
||
human_rating_configs: | ||
- name: clarity | ||
instructions: Rate whether the headline clearly state what the start is doing | ||
scale: [1, 5] | ||
|
||
- name: relevance | ||
instructions: Rate whether the headline is relevant | ||
scale: [1, 5] | ||
|
||
evaluators: | ||
# - evaluator_type: all | ||
# input_description: | ||
# Given an tech startup business, generate one corresponding landing | ||
# page headline | ||
# metric_calculators: [] | ||
# name: openai_elo_evaluator | ||
# openai_model_name: gpt-4 | ||
- evaluator_type: individual | ||
metric_calculators: | ||
- method: AVERAGE | ||
name: openai_prompt_based_evaluator | ||
prompt: |- | ||
You are assessing a submitted answer on a given task based on a criterion. Here is the data: | ||
- Task: Given an tech startup business, generate one corresponding landing page headline | ||
- Does the headline clearly communicate what the startup does or what problem it solves? | ||
It should be immediately clear to anyone who reads the headline what the startup's purpose is. | ||
A lack of clarity can lead to confusion and may discourage potential users or investors. | ||
[Input]: {tech_startup_business} | ||
[Result]: {raw_output} | ||
Answer the question by selecting one of the following options: | ||
A It fails to meet the criterion at all. | ||
B It somewhat meets the criterion, but there is significant room for improvement. | ||
C It meets the criterion to a satisfactory degree. | ||
D It meets the criterion very well. | ||
E It meets the criterion exceptionally well, with little to no room for improvement. | ||
display_name: clarity | ||
choices: ["A", "B", "C", "D", "E"] | ||
description: Does the headline clearly communicate what the startup does or what problem it solves? | ||
scale_description: "0-4" | ||
choice_scores: | ||
A: 0 | ||
B: 1 | ||
C: 2 | ||
D: 3 | ||
E: 4 | ||
- evaluator_type: individual | ||
metric_calculators: | ||
- method: AVERAGE | ||
name: openai_prompt_based_evaluator | ||
prompt: |- | ||
You are assessing a submitted answer on a given task based on a criterion. Here is the data: | ||
- Task: Given an tech startup business, generate one corresponding landing page headline | ||
- Is the headline relevant to the target audience? The headline should speak directly to the | ||
startup's intended users or customers, highlighting the benefits or value proposition that | ||
the startup offers. | ||
[Input]: {tech_startup_business} | ||
[Result]: {raw_output} | ||
Answer the question by selecting one of the following options: | ||
A It fails to meet the criterion at all. | ||
B It somewhat meets the criterion, but there is significant room for improvement. | ||
C It meets the criterion to a satisfactory degree. | ||
D It meets the criterion very well. | ||
E It meets the criterion exceptionally well, with little to no room for improvement. | ||
display_name: relevance | ||
description: Is the headline relevant to the target audience? | ||
scale_description: "0-4" | ||
choices: ["A", "B", "C", "D", "E"] | ||
choice_scores: | ||
A: 0 | ||
B: 1 | ||
C: 2 | ||
D: 3 | ||
E: 4 | ||
|
||
- evaluator_type: individual | ||
metric_calculators: | ||
- method: AVERAGE | ||
name: openai_prompt_based_evaluator | ||
prompt: |- | ||
You are assessing a submitted answer on a given task based on a criterion. Here is the data: | ||
- Task: Given an tech startup business, generate one corresponding landing page headline | ||
- Is the headline catchy or memorable? While it's important to be clear and relevant, | ||
a good headline should also be engaging and memorable. | ||
This can help the startup stand out in a crowded market. | ||
[Input]: {tech_startup_business} | ||
[Result]: {raw_output} | ||
Answer the question by selecting one of the following options: | ||
A It fails to meet the criterion at all. | ||
B It somewhat meets the criterion, but there is significant room for improvement. | ||
C It meets the criterion to a satisfactory degree. | ||
D It meets the criterion very well. | ||
E It meets the criterion exceptionally well, with little to no room for improvement. | ||
display_name: catchiness | ||
description: Is the headline catchy or memorable? | ||
scale_description: "0-4" | ||
choices: ["A", "B", "C", "D", "E"] | ||
choice_scores: | ||
A: 0 | ||
B: 1 | ||
C: 2 | ||
D: 3 | ||
E: 4 | ||
|
||
selection_strategy: | ||
ahp_selection: | ||
criteria: | ||
- openai_elo_evaluator | ||
- average_token_usage | ||
- average_latency | ||
- "openai_prompt_based_evaluator: clarity" | ||
- "openai_prompt_based_evaluator: relevance" | ||
- "openai_prompt_based_evaluator: catchiness" | ||
criteria_maximization: | ||
openai_elo_evaluator: true | ||
average_latency: false | ||
average_token_usage: false | ||
criteria_weights: | ||
openai_elo_evaluator: 0.3 | ||
average_latency: 0.2 | ||
average_token_usage: 0.2 | ||
"openai_prompt_based_evaluator: clarity": 0.1 | ||
"openai_prompt_based_evaluator: relevance": 0.1 | ||
"openai_prompt_based_evaluator: catchiness": 0.1 | ||
|
||
# wrapper_configs: {} | ||
|
||
# Variations allow for dynamic content during experiments. | ||
# They are identified by a globally unique name. For example, in your code, | ||
# you might reference a variation by its name, like: | ||
# variation = StringWrapper("hello", 'test_experiment') | ||
# In this config, you would define the variations associated with that name, e.g. | ||
|
||
variations: | ||
- generator_config: | ||
input_description: | ||
Given an tech startup business, generate one corresponding landing | ||
page headline | ||
input_test_cases: ["AI law firm", "AI sales agent"] | ||
number_of_variations: 5 | ||
openai_model_name: gpt-4 | ||
output_path: generated_prompt.pkl | ||
generator_name: openai_prompt_based_variation_generator | ||
name: task | ||
variations: | ||
- instantiated_value: Generate landing page headline for | ||
value: Generate landing page headline for | ||
value_type: str | ||
variation_id: null | ||
|
||
variations: | ||
- name: task | ||
variations: | ||
- instantiated_value: Generate landing page headline for | ||
value: Generate landing page headline for | ||
value_type: str | ||
variation_id: null |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!-- markdownlint-disable MD041 --> | ||
::: yival.combination_improvers.base_combination_improver |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!-- markdownlint-disable MD041 --> | ||
::: yival.data.base_reader |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!-- markdownlint-disable MD041 --> | ||
::: yival.data_generators.base_data_generator |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!-- markdownlint-disable MD041 --> | ||
::: yival.evaluators.base_evaluator |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!-- markdownlint-disable MD041 --> | ||
::: yival.result_selectors.selection_strategy |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!-- markdownlint-disable MD041 --> | ||
::: yival.states.experiment_state |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!-- markdownlint-disable MD041 --> | ||
::: yival.variation_generators.base_variation_generator |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
<!-- markdownlint-disable MD041 --> | ||
::: yival.wrappers.base_wrapper |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file was deleted.
Oops, something went wrong.
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.