diff --git a/docs/docs/deep-dive/optimizers/miprov2.md b/docs/docs/deep-dive/optimizers/miprov2.md
index df6a03e267..9bf6481960 100644
--- a/docs/docs/deep-dive/optimizers/miprov2.md
+++ b/docs/docs/deep-dive/optimizers/miprov2.md
@@ -192,6 +192,7 @@ evaluate(optimized_program, devset=devset[:])
 | `student` | `dspy.Module` | **Required** | The base program to optimize. |
 | `trainset` | `List[dspy.Example]` | **Required** | Training dataset which is used to bootstrap few-shot examples and instructions. If a separate `valset` is not specified, 80% of this training set will also be used as a validation set for evaluating new candidate prompts. |
 | `valset` | `List[dspy.Example]` | Defaults to 80% of trainset | Dataset which is used to evaluate candidate prompts. We recommend using somewhere between 50-500 examples for optimization. |
+| `teacher` | `dspy.Module` | Defaults to student | The program to run in order to bootstrap the few-shot examples. |
 | `num_trials` | `int` | `30` | Number of optimization trials to run. When `minibatch` is set to `True`, this represents the number of minibatch trials that will be run on batches of size `minibatch_size`. When minibatch is set to `False`, each trial uses a full evaluation on the training set. In both cases, we recommend setting `num_trials` to a *minimum* of .75 x # modules in program x # variables per module (2 if few-shot examples & instructions will both be optimized, 1 in the 0-shot case). |
 | `minibatch` | `bool` | `True` | Flag to enable evaluating over minibatches of data (instead of the full validation set) for evaluation each trial. |
 | `minibatch_size` | `int` | `25.0` | Size of minibatches for evaluations. |
@@ -216,4 +217,4 @@ These steps are broken down in more detail below:
 
 3. **Find an Optimized Combination of Few-Shot Examples & Instructions**. Finally, now that we've created these few-shot examples and instructions, we use Bayesian Optimization to choose which set of these would work best for each predictor in our program. This works by running a series of `num_trials` trials, where a new set of prompts are evaluated over our validation set at each trial. This helps the Bayesian Optimizer learn which combination of prompts work best over time. If `minibatch` is set to `True` (which it is by default), then the new set of prompts are only evaluated on a minibatch of size `minibatch_size` at each trial which generally allows for more efficient exploration / exploitation. The best averaging set of prompts is then evaluated on the full validation set every `minibatch_full_eval_steps` get a less noisey performance benchmark. At the end of the optimization process, the LM program with the set of prompts that performed best on the full validation set is returned.
 
-For those interested in more details, more information on `MIPROv2` along with a study on `MIPROv2` compared with other DSPy optimizers can be found in [this paper](https://arxiv.org/abs/2406.11695).
\ No newline at end of file
+For those interested in more details, more information on `MIPROv2` along with a study on `MIPROv2` compared with other DSPy optimizers can be found in [this paper](https://arxiv.org/abs/2406.11695).
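For illustration, here is a minimal sketch of how the new `teacher` argument might be passed at compile time. The `RAG` program, `my_metric`, and `trainset` names below are placeholders rather than part of this change, and the constructor call assumes the documented `MIPROv2(metric=...)` usage:

```python
import dspy
from dspy.teleprompt import MIPROv2

# Placeholder programs sharing the same signature; the teacher is only used to
# bootstrap few-shot demonstrations, while the student is the program being optimized.
student = RAG()
teacher = RAG()

optimizer = MIPROv2(metric=my_metric)  # my_metric is a placeholder metric function
optimized_program = optimizer.compile(
    student,
    trainset=trainset,   # placeholder list of dspy.Example objects
    teacher=teacher,     # new argument; falls back to the student when omitted
)
```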
diff --git a/dspy/teleprompt/mipro_optimizer_v2.py b/dspy/teleprompt/mipro_optimizer_v2.py
index 08ed6ea815..babb03cb75 100644
--- a/dspy/teleprompt/mipro_optimizer_v2.py
+++ b/dspy/teleprompt/mipro_optimizer_v2.py
@@ -95,6 +95,7 @@ def compile(
         student: Any,
         *,
         trainset: List,
+        teacher: Any = None,
         valset: Optional[List] = None,
         num_trials: int = 30,
         max_bootstrapped_demos: Optional[int] = None,
@@ -165,7 +166,7 @@ def compile(
         )
 
         # Step 1: Bootstrap few-shot examples
-        demo_candidates = self._bootstrap_fewshot_examples(program, trainset, seed)
+        demo_candidates = self._bootstrap_fewshot_examples(program, trainset, seed, teacher)
 
         # Step 2: Propose instruction candidates
         instruction_candidates = self._propose_instructions(
@@ -368,7 +369,7 @@ def _get_user_confirmation(
         return user_input == "y"
 
     def _bootstrap_fewshot_examples(
-        self, program: Any, trainset: List, seed: int
+        self, program: Any, trainset: List, seed: int, teacher: Any
     ) -> Optional[List]:
         logger.info("\n==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==")
         if self.max_bootstrapped_demos > 0:
@@ -399,6 +400,7 @@ def _bootstrap_fewshot_examples(
                 ),
                 metric=self.metric,
                 max_errors=self.max_errors,
+                teacher=teacher,
                 teacher_settings=self.teacher_settings,
                 seed=seed,
                 metric_threshold=self.metric_threshold,
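Since `teacher` is forwarded alongside the existing `teacher_settings` into the demo-bootstrapping call, one plausible use is a teacher that shares the student's architecture but runs on a stronger model. A rough sketch, assuming `dspy.Module.deepcopy()` and `set_lm()` behave as in current DSPy (the model name is illustrative):

```python
import dspy

# Assumption: copy the student's architecture and point the copy at a stronger LM,
# so bootstrapped demonstrations come from that model while the optimized prompts
# are still evaluated with the student's own LM.
teacher = student.deepcopy()               # assumes dspy.Module.deepcopy()
teacher.set_lm(dspy.LM("openai/gpt-4o"))   # assumes dspy.Module.set_lm(); model is a placeholder

optimized_program = optimizer.compile(student, trainset=trainset, teacher=teacher)
```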