Package sambo
SAMBO - Sequential and Model-Based Optimization [in Python]
Sambo is a global optimization framework for finding approximate global optima† of arbitrary high-dimensional objective functions in the least number of function evaluations. Function evaluations are considered the "expensive" resource (it can sometimes take weeks to obtain results!), so it's important to find good-enough solutions in as few steps as possible (whence sequential).
The main tools in this Python optimization toolbox are:
- function
minimize()
, a near drop-in replacement forscipy.optimize.minimize()
, - class
Optimizer
with an ask-and-tell user interface, supporting arbitrary scikit-learn-like surrogate models, with Bayesian optimization estimators like gaussian process and extra trees, built in, SamboSearchCV
, a much faster drop-in replacement for scikit-learn'sGridSearchCV
and similar exhaustive machine-learning hyper-parameter tuning methods, but compared to unpredictable stochastic methods, informed.
The algorithms and methods implemented by or used in this package are:
- simplical homology global optimization (SHGO), customizing the implementation from SciPy,
- surrogate machine learning model-based optimization,
- shuffled complex evolution (SCE-UA with improvements).
This open-source project was heavily inspired by scikit-optimize project, which now seems helplessly defunct.
The project is one of the better optimizers around according to [benchmark].
† The contained algorithms seek to minimize your objective f(x)
.
If you instead need the maximum, simply minimize -f(x)
. 💡
Sub-modules
sambo.plot
-
The module contains functions for plotting convergence, regret, partial dependence, sequence of evaluations …
Functions
def minimize(fun: Callable[[numpy.ndarray], float],
x0: tuple[float] | list[tuple[float]] | None = None,
*,
args: tuple = (),
bounds: list[tuple] | None = None,
constraints: Callable[[numpy.ndarray], bool] | scipy.optimize._constraints.NonlinearConstraint | None = None,
max_iter: int = 2147483647,
method: Literal['shgo', 'sceua', 'smbo'] = 'shgo',
tol: float = 1e-06,
n_iter_no_change: int | None = None,
y0: float | list[float] | None = None,
callback: Callable[[sambo._util.OptimizeResult], bool] | None = None,
n_jobs: int = 1,
disp: bool = False,
rng: int | numpy.random.mtrand.RandomState | numpy.random._generator.Generator | None = None,
**kwargs)-
Expand source code Browse git
def minimize( fun: Callable[[np.ndarray], float], x0: Optional[tuple[float] | list[tuple[float]]] = None, *, args: tuple = (), bounds: Optional[list[tuple]] = None, constraints: Optional[Callable[[np.ndarray], bool] | NonlinearConstraint] = None, max_iter: int = INT32_MAX, method: Literal['shgo', 'sceua', 'smbo'] = 'shgo', tol: float = FLOAT32_PRECISION, # x_tol: float = FLOAT32_PRECISION, n_iter_no_change: Optional[int] = None, y0: Optional[float | list[float]] = None, callback: Optional[Callable[[OptimizeResult], bool]] = None, n_jobs: int = 1, disp: bool = False, rng: Optional[int | np.random.RandomState | np.random.Generator] = None, **kwargs, ): """ Find approximate optimum of an objective function in the least number of evaluations. Parameters ---------- fun : Callable[[np.ndarray], float], optional Objective function to minimize. Must take a single array-like argument x (parameter combination) and return a scalar y (cost value). x0 : tuple or list[tuple], optional Initial guess(es) or starting point(s) for the optimization. args : tuple, optional Additional arguments to pass to the objective function and constraints. bounds : list[tuple], optional Bounds for parameter variables. Should be a sequence of (min, max) pairs for each dimension, or an enumeration of nominal values. For any dimension, if `min` and `max` are integers, the dimension is assumed to be _integral_. If `min` or `max` are floats, the dimension is assumed to be _real_. In all other cases including if more than two values are provided, the dimension is assumed to be an _enumeration_ of values. See _Examples_ below. .. note:: Nominals are represented as ordinals Categorical (nominal) enumerations, although often not inherently ordered, are internally represented as integral dimensions. If this appears to significantly affect your results (e.g. if your nominals span many cases), you may need to [one-hot encode] your nominal variables manually. [one-hot encode]: https://en.wikipedia.org/wiki/One-hot .. warning:: Mind the dot If optimizing your problem fails to produce expected results, make sure you're not specifying integer dimensions where real floating values would make more sense. constraints : Callable[[np.ndarray], bool], optional Function representing constraints. Must return True iff the parameter combination x satisfies the constraints. >>> minimize(..., constraints=lambda x: (lb < x <= ub)) max_iter : int, optional Maximum number of iterations allowed. method : {'shgo', 'sceua', 'smbo'}, default='shgo' Global optimization algorithm to use. Options are: * `"shgo"` – [simplical homology global optimization] (SHGO; from SciPy), * `"smbo"` – surrogate model-based optimization, for which you can pass your own `estimator=` (see `**kwargs`). * `"sceua"` – [shuffled complex evolution (SCE-UA)] (with a few tweaks, marked in the source). [simplical homology global optimization]: http://doi.org/10.1007/s10898-018-0645-y [shuffled complex evolution (SCE-UA)]: https://doi.org/10.1007/BF00939380 .. caution:: Default method SHGO is only appropriate for Lipschitz-smooth functions Smooth functions have gradients that vary gradually, while non-smooth functions exhibit abrupt changes (e.g. with nominal variables), sharp corners (e.g. function `abs()`), discontinuities (e.g. function `tan()`), or unbounded growth (e.g. function `exp()`). If your objective function is more of the latter kind, you might need to use one of the other methods. n_iter_no_change : int, default 10 Number of iterations with no improvement before stopping. tol : float, default FLOAT32_PRECISION Tolerance for convergence. Optimization stops when found optimum improvements are below this threshold. y0 : float or tuple[float], optional Initial value(s) of the objective function corresponding to `x0`. callback : Callable[[OptimizeResult], bool], optional A callback function that is called after each iteration. The optimization stops If the callback returns True or raises `StopIteration`. n_jobs : int, default 1 Number of objective function evaluations to run in parallel. Most applicate when n_candidates > 1. disp : bool, default False Display progress and intermediate results. rng : int or np.random.RandomState or np.random.Generator, optional Random number generator or seed for reproducibility. **kwargs : dict, optional Additional parameters to pass to optimization function. Popular options are: * for `method="shgo"`: `n_init` (number of initial points), * for `method="smbo"`: `n_init`, `n_candidates`, `n_models`, `estimator` (for explanation, see class `sambo.Optimizer`), * for `method="sceua"`: `n_complexes`, `complex_size` (as in [SCE-UA] algorithm), [SCE-UA]: https://doi.org/10.1007/BF00939380 Examples -------- Basic constrained 10-dimensional example: >>> from scipy.optimize import rosen >>> from sambo import minimize >>> result = minimize(rosen, bounds=[(-2, 2)] * 10, ... constraints=lambda x: sum(x) <= len(x)) >>> result message: Optimization terminated successfully. success: True fun: 0.0 x: [1 1 1 1 1 1 1 1 1 1] nfev: 1036 xv: [[-2 -2 ... -2 1] [-2 -2 ... -2 1] ... [1 1 ... 1 1] [1 1 ... 1 1]] funv: [ 1.174e+04 1.535e+04 ... 0.000e+00 0.000e+00] A more elaborate example, minimizing an objective function of three variables: one integral, one real, and one nominal variable (see `bounds=`). >>> def demand(x): ... n_roses, price, advertising_costs = x ... # Ground truth model: Demand falls with price, but grows if you advertise ... demand = 20 - 2*price + .1*advertising_costs ... return n_roses < demand >>> def objective(x): ... n_roses, price, advertising_costs = x ... production_costs = 1.5 * n_roses ... profits = n_roses * price - production_costs - advertising_costs ... return -profits >>> bounds = [ ... (0, 100), # From zero to at most roses per day ... (.5, 9.), # Price per rose sold ... (10, 20, 100), # Advertising budget ... ] >>> from sambo import minimize >>> result = minimize(fun=objective, bounds=bounds, constraints=demand) References ---------- * Endres, S.C., Sandrock, C. & Focke, W.W. A simplicial homology algorithm for Lipschitz optimisation. J Glob Optim 72, 181–217 (2018). https://doi.org/10.1007/s10898-018-0645-y * Duan, Q.Y., Gupta, V.K. & Sorooshian, S. Shuffled complex evolution approach for effective and efficient global minimization. J Optim Theory Appl 76, 501–521 (1993). https://doi.org/10.1007/BF00939380 * Koziel, Slawomir, and Leifur Leifsson. Surrogate-based modeling and optimization. New York: Springer, 2013. https://doi.org/10.1007/978-1-4614-7551-4 * Head, T., Kumar, M., Nahrstaedt, H., Louppe, G., & Shcherbatyi, I. (2021). scikit-optimize/scikit-optimize (v0.9.0). Zenodo. https://doi.org/10.5281/zenodo.5565057 """ # noqa: E501 from sambo._space import Space constraints = _sanitize_constraints(constraints) rng = _check_random_state(rng) bounds, x0, y0 = _check_bounds(bounds, x0, y0, assert_numeric=False) space = Space(bounds, constraints, rng=rng) bounds = tuple(space) fun = _Args0TransformingFunc(fun, space.inverse_transform) if constraints is not None: constraints = _Args0TransformingFunc(constraints, space.inverse_transform) if callback is not None: callback = _Args0TransformingFunc(callback, space.inverse_transform_result) if method == 'shgo': from sambo._shgo import shgo as minimize_func elif method == 'sceua': from sambo._sceua import sceua as minimize_func elif method == 'smbo': from sambo._smbo import smbo as minimize_func else: assert False, f'Invalid method= parameter: {method!r}. Pls RTFM' if n_iter_no_change is not None: # Pass this iff specified b/c algos have different default values kwargs['n_iter_no_change'] = n_iter_no_change res = minimize_func( fun, x0, args=args, bounds=bounds, constraints=constraints, max_iter=max_iter, tol=tol, callback=callback, y0=y0, n_jobs=n_jobs, disp=disp, rng=rng, **kwargs ) res = space.inverse_transform_result(res) res.space = space return res
Find approximate optimum of an objective function in the least number of evaluations.
Parameters
fun
:Callable[[np.ndarray], float]
, optional- Objective function to minimize. Must take a single array-like argument x (parameter combination) and return a scalar y (cost value).
x0
:tuple
orlist[tuple]
, optional- Initial guess(es) or starting point(s) for the optimization.
args
:tuple
, optional- Additional arguments to pass to the objective function and constraints.
bounds
:list[tuple]
, optional-
Bounds for parameter variables. Should be a sequence of (min, max) pairs for each dimension, or an enumeration of nominal values. For any dimension, if
min
andmax
are integers, the dimension is assumed to be integral. Ifmin
ormax
are floats, the dimension is assumed to be real. In all other cases including if more than two values are provided, the dimension is assumed to be an enumeration of values. See Examples below.Note: Nominals are represented as ordinals
Categorical (nominal) enumerations, although often not inherently ordered, are internally represented as integral dimensions. If this appears to significantly affect your results (e.g. if your nominals span many cases), you may need to one-hot encode your nominal variables manually.
Warning: Mind the dot
If optimizing your problem fails to produce expected results, make sure you're not specifying integer dimensions where real floating values would make more sense.
constraints
:Callable[[np.ndarray], bool]
, optional- Function representing constraints.
Must return True iff the parameter combination x satisfies the constraints.
>>> minimize(..., constraints=lambda x: (lb < x <= ub))
max_iter
:int
, optional- Maximum number of iterations allowed.
method
:{'shgo', 'sceua', 'smbo'}
, default='shgo'
-
Global optimization algorithm to use. Options are:
"shgo"
– simplical homology global optimization (SHGO; from SciPy),"smbo"
– surrogate model-based optimization, for which you can pass your ownestimator=
(see**kwargs
)."sceua"
– shuffled complex evolution (SCE-UA) (with a few tweaks, marked in the source).
Caution: Default method SHGO is only appropriate for Lipschitz-smooth functions
Smooth functions have gradients that vary gradually, while non-smooth functions exhibit abrupt changes (e.g. with nominal variables), sharp corners (e.g. function
abs()
), discontinuities (e.g. functiontan()
), or unbounded growth (e.g. functionexp()
).If your objective function is more of the latter kind, you might need to use one of the other methods.
n_iter_no_change
:int
, default10
- Number of iterations with no improvement before stopping.
tol
:float
, defaultFLOAT32_PRECISION
- Tolerance for convergence. Optimization stops when found optimum improvements are below this threshold.
y0
:float
ortuple[float]
, optional- Initial value(s) of the objective function corresponding to
x0
. callback
:Callable[[OptimizeResult], bool]
, optional- A callback function that is called after each iteration.
The optimization stops If the callback returns True or
raises
StopIteration
. n_jobs
:int
, default1
- Number of objective function evaluations to run in parallel. Most applicate when n_candidates > 1.
disp
:bool
, defaultFalse
- Display progress and intermediate results.
rng
:int
ornp.random.RandomState
ornp.random.Generator
, optional- Random number generator or seed for reproducibility.
**kwargs
:dict
, optional-
Additional parameters to pass to optimization function. Popular options are:
Examples
Basic constrained 10-dimensional example:
>>> from scipy.optimize import rosen >>> from sambo import minimize >>> result = minimize(rosen, bounds=[(-2, 2)] * 10, ... constraints=lambda x: sum(x) <= len(x)) >>> result message: Optimization terminated successfully. success: True fun: 0.0 x: [1 1 1 1 1 1 1 1 1 1] nfev: 1036 xv: [[-2 -2 ... -2 1] [-2 -2 ... -2 1] ... [1 1 ... 1 1] [1 1 ... 1 1]] funv: [ 1.174e+04 1.535e+04 ... 0.000e+00 0.000e+00]
A more elaborate example, minimizing an objective function of three variables: one integral, one real, and one nominal variable (see
bounds=
).>>> def demand(x): ... n_roses, price, advertising_costs = x ... # Ground truth model: Demand falls with price, but grows if you advertise ... demand = 20 - 2*price + .1*advertising_costs ... return n_roses < demand >>> def objective(x): ... n_roses, price, advertising_costs = x ... production_costs = 1.5 * n_roses ... profits = n_roses * price - production_costs - advertising_costs ... return -profits >>> bounds = [ ... (0, 100), # From zero to at most roses per day ... (.5, 9.), # Price per rose sold ... (10, 20, 100), # Advertising budget ... ] >>> from sambo import minimize >>> result = minimize(fun=objective, bounds=bounds, constraints=demand)
References
- Endres, S.C., Sandrock, C. & Focke, W.W. A simplicial homology algorithm for Lipschitz optimisation. J Glob Optim 72, 181–217 (2018). https://doi.org/10.1007/s10898-018-0645-y
- Duan, Q.Y., Gupta, V.K. & Sorooshian, S. Shuffled complex evolution approach for effective and efficient global minimization. J Optim Theory Appl 76, 501–521 (1993). https://doi.org/10.1007/BF00939380
- Koziel, Slawomir, and Leifur Leifsson. Surrogate-based modeling and optimization. New York: Springer, 2013. https://doi.org/10.1007/978-1-4614-7551-4
- Head, T., Kumar, M., Nahrstaedt, H., Louppe, G., & Shcherbatyi, I. (2021). scikit-optimize/scikit-optimize (v0.9.0). Zenodo. https://doi.org/10.5281/zenodo.5565057
Classes
class OptimizeResult (*args, **kwargs)
-
Expand source code Browse git
class OptimizeResult(_OptimizeResult): """ Optimization result. Most fields are inherited from `scipy.optimize.OptimizeResult`, with additional attributes: `xv`, `funv`, `model`. """ success: bool #: Whether or not the optimizer exited successfully. message: str #: More detailed cause of optimization termination. x: np.ndarray #: The solution of the optimization, `shape=(n_features,)`. fun: np.ndarray #: Value of objective function at `x`, aka the observed minimum. nfev: int #: Number of objective function evaluations. nit: int #: Number of iterations performed by the optimization algorithm. xv: np.ndarray #: All the parameter sets that have been tried, in sequence, `shape=(nfev, n_features)`. funv: np.ndarray #: Objective function values at points `xv`. model: Optional[list[_SklearnLikeRegressor]] #: The optimization model(s) used, if any.
Optimization result. Most fields are inherited from
scipy.optimize.OptimizeResult
, with additional attributes:xv
,funv
,model
.Ancestors
- scipy.optimize._optimize.OptimizeResult
- scipy._lib._util._RichResult
- builtins.dict
Class variables
var fun : numpy.ndarray
-
Value of objective function at
x
, aka the observed minimum. var funv : numpy.ndarray
-
Objective function values at points
xv
. var message : str
-
More detailed cause of optimization termination.
var model : list[sambo._util._SklearnLikeRegressor] | None
-
The optimization model(s) used, if any.
var nfev : int
-
Number of objective function evaluations.
var nit : int
-
Number of iterations performed by the optimization algorithm.
var success : bool
-
Whether or not the optimizer exited successfully.
var x : numpy.ndarray
-
The solution of the optimization,
shape=(n_features,)
. var xv : numpy.ndarray
-
All the parameter sets that have been tried, in sequence,
shape=(nfev, n_features)
.
class Optimizer (fun: Callable[[numpy.ndarray], float] | None,
x0: tuple[float] | list[tuple[float]] | None = None,
*,
args: tuple = (),
bounds: list[tuple] | None = None,
constraints: Callable[[numpy.ndarray], bool] | scipy.optimize._constraints.NonlinearConstraint | None = None,
max_iter: int = 2147483647,
n_init: int | None = None,
n_candidates: int | None = None,
n_iter_no_change: int = 10,
n_models: int = 1,
tol: float = 1e-06,
estimator: Literal['gp', 'et', 'gb'] | sambo._util._SklearnLikeRegressor = None,
y0: float | list[float] | None = None,
callback: Callable[[sambo._util.OptimizeResult], bool] | None = None,
n_jobs: int = 1,
disp: bool = False,
rng: int | numpy.random.mtrand.RandomState | numpy.random._generator.Generator | None = None)-
Expand source code Browse git
class Optimizer: """ A sequential optimizer that optimizes an objective function using a surrogate model. Parameters ---------- fun : Callable[[np.ndarray], float], optional Objective function to minimize. Must take a single array-like argument x (parameter combination) and return a scalar y (cost value). When unspecified, the Optimizer can be used iteratively in an ask-tell fashion using the methods named respectively. x0 : tuple | list[tuple], optional Initial guess(es) or starting point(s) for the optimization. args : tuple, optional Additional arguments to pass to the objective function and constraints. bounds : list[tuple], optional Bounds for the decision variables. A sequence of (min, max) pairs for each dimension. constraints : Callable[[np.ndarray], bool], optional Function representing constraints. Must return True iff the parameter combination x satisfies the constraints. max_iter : int, optional Maximum number of iterations allowed. n_init : int, optional Number of initial evaluations of the objective function before first fitting the surrogate model. n_candidates : int, optional Number of candidate solutions generated per iteration. n_iter_no_change : int, default 10 Number of iterations with no improvement before stopping. n_models : int, default 1 Number of most-recently-generated surrogate models to use for next best-point prediction. Useful for small and randomized estimators such as `"et"` with no fixed `rng=`. tol : float, default FLOAT32_PRECISION Tolerance for convergence. Optimization stops when found optimum improvements are below this threshold. estimator : {'gp', 'et', 'gb'} or scikit-learn-like regressor, default='gp' Surrogate model for the optimizer. Popular options include "gp" (Gaussian process), "et" (extra trees), or "gb" (gradient boosting). You can also provide your own regressor with a scikit-learn API, namely `fit()` and `predict()` methods. y0 : float or tuple[float], optional Initial value(s) of the objective function corresponding to `x0`. callback : Callable[[OptimizeResult], bool], optional A callback function that is called after each iteration. The optimization stops If the callback returns True or raises `StopIteration`. n_jobs : int, default 1 Number of objective function evaluations to run in parallel. Most applicate when n_candidates > 1. disp : bool, default False Display progress and intermediate results. rng : int or np.random.RandomState or np.random.Generator, optional Random number generator or seed for reproducibility. Examples -------- >>> from sambo import Optimizer >>> def objective_func(x): ... return sum(x**2) >>> optimizer = Optimizer(fun=objective_func, bounds=[(-5, 5), (-5, 5)]) >>> result = optimizer.run() Using the ask-tell interface: >>> optimizer = Optimizer(fun=None, bounds=[(-5, 5), (-5, 5)]) >>> suggested_x = optimizer.ask() >>> y = [objective_func(x) for x in suggested_x] >>> optimizer.tell(y, suggested_x) """ def __init__( self, fun: Optional[Callable[[np.ndarray], float]], x0: Optional[tuple[float] | list[tuple[float]]] = None, *, args: tuple = (), bounds: Optional[list[tuple]] = None, constraints: Optional[Callable[[np.ndarray], bool] | NonlinearConstraint] = None, max_iter: int = INT32_MAX, n_init: Optional[int] = None, n_candidates: Optional[int] = None, n_iter_no_change: int = 10, n_models: int = 1, tol: float = FLOAT32_PRECISION, estimator: Literal['gp', 'et', 'gb'] | _SklearnLikeRegressor = None, y0: Optional[float | list[float]] = None, callback: Optional[Callable[[OptimizeResult], bool]] = None, n_jobs: int = 1, disp: bool = False, rng: Optional[int | np.random.RandomState | np.random.Generator] = None, ): assert fun is None or callable(fun), fun assert x0 is not None or bounds is not None, "Either x0= or bounds= must be provided" constraints = _sanitize_constraints(constraints) assert constraints is None or callable(constraints), constraints assert isinstance(max_iter, Integral) and max_iter > 0, max_iter assert isinstance(tol, Real) and 0 <= tol, tol assert isinstance(n_iter_no_change, int) and n_iter_no_change > 0, n_iter_no_change assert callback is None or callable(callback), callback assert isinstance(n_jobs, Integral) and n_jobs != 0, n_jobs assert isinstance(rng, (Integral, np.random.RandomState, np.random.Generator, type(None))), rng assert n_init is None or isinstance(n_init, Integral) and n_init >= 0, n_init assert n_candidates is None or isinstance(n_candidates, Integral) and n_candidates > 0, n_candidates assert estimator is None or isinstance(estimator, (str, _SklearnLikeRegressor)), estimator assert isinstance(n_models, Integral) and n_models > 0, n_models bounds, x0, y0 = _check_bounds(bounds, x0, y0) rng = _check_random_state(rng) if n_init is None: n_init = 0 if not callable(fun) else min(max(1, max_iter - 20), 150 * len(bounds)) assert max_iter >= n_init, (max_iter, n_init) if n_candidates is None: n_candidates = max(1, int(np.log2(len(bounds)))) if estimator is None or isinstance(estimator, str): from sambo._estimators import _estimator_factory estimator = _estimator_factory(estimator, bounds, rng) assert isinstance(estimator, _SklearnLikeRegressor), estimator # Objective function can be None for the real-life function trials using ask-tell API fun = None if fun is None else _ParallelFuncWrapper( _ObjectiveFunctionWrapper( func=fun, max_nfev=max_iter, callback=callback, args=()), n_jobs, args, ) self.fun = fun self.x0 = x0 self.y0 = y0 self.bounds = bounds self.constraints = constraints self.max_iter = max_iter self.n_init = n_init self.n_candidates = n_candidates self.n_iter_no_change = n_iter_no_change self.tol = tol self.estimator = estimator self.estimators = [] self.n_models = n_models self.callback = callback self.n_jobs = n_jobs self.disp = disp self.rng = rng X, y = [], [] if y0 is not None: y0 = np.atleast_1d(y0) assert x0 is not None and len(x0) == len(y0), (x0, y0) x0 = np.atleast_2d(x0) assert len(x0) == len(y0), (x0, y0) X, y = list(x0), list(y0) self._X_ask = [] # Known points self._X = X self._y = y assert len(X) == len(y), (X, y) # Cache methods on the _instance_ self._init_once = lru_cache(1)(self._init_once) self.top_k = lru_cache(1)(self.top_k) def _init_once(self): assert not self.n_init or callable(self.fun), (self.n_init, self.fun) if not self.n_init: return x0, n_init = self.x0, self.n_init if self.y0 is not None: # x0, y0 already added to _X, _Y in __init__ x0, n_init = None, max(0, self.n_init - len(self.x0)) if n_init: X = _initialize_population(self.bounds, n_init, self.constraints, x0, self.rng) y = self.fun(X) self._X.extend(X) self._y.extend(y) self._fit() def _fit(self): from sklearn import clone estimator = self.estimator if self.n_models > 1 and hasattr(estimator, 'random_state'): estimator = clone(self.estimator) estimator.random_state = np.random.randint(10000000) estimator.fit(self._X, self._y) self.estimators.append(estimator) if len(self.estimators) > self.n_models: self.estimators.pop(0) self.top_k.cache_clear() def _predict(self, X): means, stds, masks = [], [], [] for estimator in self.estimators: try: mean, std = estimator.predict(X, return_std=True) except TypeError as exc: if 'return_std' not in exc.args[0]: raise mean, std = estimator.predict(X), 0 mask = np.ones_like(mean, dtype=bool) else: # Only suggest new/unknown points mask = std != 0 means.append(mean) stds.append(std) masks.append(mask) mask = np.any(masks, axis=0) mean = np.mean(means, axis=0) std = np.mean(stds, axis=0) if mask.any() and not mask.all(): X, mean, std = X[mask], mean[mask], std[mask] return X, mean, std #: Acquisition functions for selecting the best candidates from the sample. #: Currently defined keys: #: "UCB" for upper confidence bound (`mean - kappa * std`). #: [//]: # (No blank line here! bug in pdoc) #: .. note:: #: To make any use of the `kappa` parameter, it is important for the #: estimator's `predict()` method to implement `return_std=` behavior. #: All built-in estimators (`"gp"`, `"et"`, `"gb"`) do so. ACQ_FUNCS: dict = { 'UCB': _UCB, } def ask( self, n_candidates: Optional[int] = None, *, acq_func: Optional[Callable] = ACQ_FUNCS['UCB'], kappa: float | list[float] = 0, ) -> np.ndarray: """ Propose candidate solutions for the next objective evaluation based on the current surrogate model(s) and acquisition function. Parameters ---------- n_candidates : int, optional Number of candidate solutions to propose. If not specified, the default value set during initialization is used. acq_func : Callable, default ACQ_FUNCS['UCB'] Acquisition function used to guide the selection of candidate solutions. By default, upper confidence bound (i.e. `mean + kappa * std` where `mean` and `std` are surrogate models' predicted results). .. tip:: [See the source][_ghs] for how `ACQ_FUNCS['UCB']` is implemeted. The passed parameters are open to extension to accommodate alternative acquisition functions. [_ghs]: https://github.com/search?q=repo%3Asambo-optimization%2Fsambo%20ACQ_FUNCS&type=code kappa : float or list[float], default 0 The upper/lower-confidence-bound parameter, used by `acq_func`, that balances exploration vs exploitation. Can also be an array of values to use sequentially for `n_cadidates`. Returns ------- np.ndarray An array of shape `(n_candidates, n_bounds)` containing the proposed candidate solutions. Notes ----- Candidates are proposed in parallel according to `n_jobs` when `n_candidates > 1`. Examples -------- >>> candidates = optimizer.ask(n_candidates=2, kappa=2) >>> candidates array([[ 1.1, -0.2], [ 0.8, 0.1]]) """ if n_candidates is None: n_candidates = self.n_candidates assert isinstance(n_candidates, Integral) and n_candidates > 0, n_candidates assert isinstance(kappa, (Real, Iterable)), kappa self._init_once() n_points = max(10_000, 1000 * int(len(self.bounds)**1.2)) X = _sample_population(self.bounds, n_points, self.constraints, self.rng) X, mean, std = self._predict(X) criterion = acq_func(mean=mean, std=std, kappa=kappa) best_indices = np.argsort(criterion)[:, :n_candidates].flatten('F') X = X[best_indices] X = X[:n_candidates] self._X_ask.extend(map(tuple, X)) return X def tell(self, y: float | list[float], x: Optional[float | tuple[float] | list[tuple[float]]] = None): """ Provide incremental feedback to the optimizer by reporting back the objective function values (`y`) at suggested or new candidate points (`x`). This allows the optimizer to refine its underlying model(s) and better guide subsequent proposals. Parameters ---------- y : float or list[float] The observed value(s) of the objective function. x : float or list[float], optional The input point(s) corresponding to the observed objective function values `y`. If omitted, the optimizer assumes that the `y` values correspond to the most recent candidates proposed by the `ask` method (FIFO). .. warning:: The function first takes `y`, then `x`, not the other way around! Examples -------- >>> candidates = optimizer.ask(n_candidates=3) >>> ... # Evaluate candidate solutions IRL and tell it to the optimizer >>> objective_values = [1.7, 3, .8] >>> optimizer.tell(y=objective_values, x=candidates) """ y = np.atleast_1d(y) assert y.ndim == 1, 'y= should be at most 1-dimensional' if x is None: if not self._X_ask: raise RuntimeError( f'`{self.tell.__qualname__}(y, x=None)` only allowed as many ' f'times as `{self.ask.__qualname__}()` was called beforehand') for x, yval in zip(tuple(self._X_ask), y): self._X_ask.pop(0) self._X.append(x) self._y.append(yval) else: x = np.atleast_2d(x) assert len(x) == len(y), 'y= and x= (if provided) must contain the same number of items' for xi, yi in zip(x, y): try: self._X_ask.pop(self._X_ask.index(tuple(xi))) except (ValueError, IndexError): pass self._X.append(xi) self._y.append(yi) self._fit() def run(self, *, max_iter: Optional[int] = None, n_candidates: Optional[int] = None) -> OptimizeResult: """ Execute the optimization process for (at most) a specified number of iterations (function evaluations) and return the optimization result. This method performs sequential optimization by iteratively proposing candidates using method `ask()`, evaluating the objective function, and updating the optimizer state with method `tell()`. This continues until the maximum number of iterations (`max_iter`) is reached or other stopping criteria are met. This method encapsulates the entire optimization workflow, making it convenient to use when you don't need fine-grained control over individual steps (`ask` and `tell`). It cycles between exploration and exploitation by random sampling `kappa` appropriately. Parameters ---------- max_iter : int, optional The maximum number of iterations to perform. If not specified, the default value provided during initialization is used. n_candidates : int, optional Number of candidates to propose and evaluate in each iteration. If not specified, the default value provided during initialization is used. Returns ------- OptimizeResult: OptimizeResult Results of the optimization process. Examples -------- Run an optimization with a specified number of iterations: >>> result = optimizer.run(max_iter=30) >>> print(result.x, result.fun) # Best x, y """ max_iter = max_iter if max_iter is not None else 0 if self.fun is None else self.max_iter assert callable(self.fun) or max_iter == 0, "Can't run optimizer when fun==None. Can only use ask-tell API." assert n_candidates is None or isinstance(n_candidates, Integral) and n_candidates > 0, n_candidates assert max_iter is None or isinstance(max_iter, Integral) and max_iter >= 0, max_iter n_candidates = n_candidates or self.n_candidates success = True message = "Optimization hadn't been started" iteration = 0 prev_best_value = np.inf no_change = 0 try: for iteration in range(1, max_iter + 1): coefs = [self.rng.uniform(-2, 2) for i in range(n_candidates)] X = self.ask(n_candidates, kappa=coefs) y = self.fun(X) self.tell(y) best_value = min(self._y) if self.tol and prev_best_value - best_value < self.tol or prev_best_value == best_value: no_change += 1 if no_change == self.n_iter_no_change: message = 'Optimization converged (y_prev[n_iter_no_change] - y_best < tol)' break else: assert best_value < prev_best_value no_change = 0 prev_best_value = best_value if self.disp: print(f"{__package__}: {self.estimator.__class__.__name__} " f"nit:{iteration}, nfev:{self.fun.func.nfev}, " f"fun:{np.min(self._y):.5g}") except _ObjectiveFunctionWrapper.CallbackStopIteration: message = 'Optimization callback returned True' except _ObjectiveFunctionWrapper.MaximumFunctionEvaluationsReached: message = f'Maximum function evaluations reached (max_iter = {max_iter})' success = False except KeyboardInterrupt: message = 'KeyboardInterrupt' success = False if len(self._X) == 0 and self.fun is not None: # We were interrupted before ._init_once() could finish self._X = self.fun.func.xv self._y = self.fun.func.funv x, y = self.top_k(1) result = OptimizeResult( success=success, message=message, x=x, fun=y, nit=iteration, nfev=len(self._y) - (len(self.y0) if self.y0 is not None else 0), xv=np.array(self._X), funv=np.array(self._y), model=list(self.estimators), ) return result def top_k(self, k: int = 1): """ Based on their objective function values, retrieve the top-k best solutions found by the optimization process so far. Parameters ---------- k : int, default 1 The number of top solutions to retrieve. If `k` exceeds the number of evaluated solutions, all available solutions are returned. Returns ------- X : np.ndarray A list of best points with shape `(k, n_bounds)`. y : np.ndarray Objective values at points of `X`. Examples -------- Retrieve the best solution: >>> optimizer.run() >>> best_x, best_y = optimizer.top_k(1) """ assert isinstance(k, Integral) and k > 0, k best_index = np.argsort(self._y) index = slice(0, k) if k > 1 else (k - 1) return self._X[best_index[index]], self._y[best_index[index]]
A sequential optimizer that optimizes an objective function using a surrogate model.
Parameters
fun
:Callable[[np.ndarray], float]
, optional-
Objective function to minimize. Must take a single array-like argument x (parameter combination) and return a scalar y (cost value).
When unspecified, the Optimizer can be used iteratively in an ask-tell fashion using the methods named respectively.
x0
:tuple | list[tuple]
, optional- Initial guess(es) or starting point(s) for the optimization.
args
:tuple
, optional- Additional arguments to pass to the objective function and constraints.
bounds
:list[tuple]
, optional- Bounds for the decision variables. A sequence of (min, max) pairs for each dimension.
constraints
:Callable[[np.ndarray], bool]
, optional- Function representing constraints. Must return True iff the parameter combination x satisfies the constraints.
max_iter
:int
, optional- Maximum number of iterations allowed.
n_init
:int
, optional- Number of initial evaluations of the objective function before first fitting the surrogate model.
n_candidates
:int
, optional- Number of candidate solutions generated per iteration.
n_iter_no_change
:int
, default10
- Number of iterations with no improvement before stopping.
n_models
:int
, default1
- Number of most-recently-generated surrogate models to use for
next best-point prediction. Useful for small and
randomized estimators such as
"et"
with no fixedrng=
. tol
:float
, defaultFLOAT32_PRECISION
- Tolerance for convergence. Optimization stops when found optimum improvements are below this threshold.
estimator
:{'gp', 'et', 'gb'}
orscikit-learn-like regressor
, default='gp'
-
Surrogate model for the optimizer. Popular options include "gp" (Gaussian process), "et" (extra trees), or "gb" (gradient boosting).
You can also provide your own regressor with a scikit-learn API, namely
fit()
andpredict()
methods. y0
:float
ortuple[float]
, optional- Initial value(s) of the objective function corresponding to
x0
. callback
:Callable[[OptimizeResult], bool]
, optional- A callback function that is called after each iteration.
The optimization stops If the callback returns True or
raises
StopIteration
. n_jobs
:int
, default1
- Number of objective function evaluations to run in parallel. Most applicate when n_candidates > 1.
disp
:bool
, defaultFalse
- Display progress and intermediate results.
rng
:int
ornp.random.RandomState
ornp.random.Generator
, optional- Random number generator or seed for reproducibility.
Examples
>>> from sambo import Optimizer >>> def objective_func(x): ... return sum(x**2) >>> optimizer = Optimizer(fun=objective_func, bounds=[(-5, 5), (-5, 5)]) >>> result = optimizer.run()
Using the ask-tell interface:
>>> optimizer = Optimizer(fun=None, bounds=[(-5, 5), (-5, 5)]) >>> suggested_x = optimizer.ask() >>> y = [objective_func(x) for x in suggested_x] >>> optimizer.tell(y, suggested_x)
Class variables
var ACQ_FUNCS : dict
-
Acquisition functions for selecting the best candidates from the sample. Currently defined keys: "UCB" for upper confidence bound (
mean - kappa * std
).Note
To make any use of the
kappa
parameter, it is important for the estimator'spredict()
method to implementreturn_std=
behavior. All built-in estimators ("gp"
,"et"
,"gb"
) do so.
Methods
def ask(self,
n_candidates: int | None = None,
*,
acq_func: Callable | None = <function _UCB>,
kappa: float | list[float] = 0) ‑> numpy.ndarray-
Expand source code Browse git
def ask( self, n_candidates: Optional[int] = None, *, acq_func: Optional[Callable] = ACQ_FUNCS['UCB'], kappa: float | list[float] = 0, ) -> np.ndarray: """ Propose candidate solutions for the next objective evaluation based on the current surrogate model(s) and acquisition function. Parameters ---------- n_candidates : int, optional Number of candidate solutions to propose. If not specified, the default value set during initialization is used. acq_func : Callable, default ACQ_FUNCS['UCB'] Acquisition function used to guide the selection of candidate solutions. By default, upper confidence bound (i.e. `mean + kappa * std` where `mean` and `std` are surrogate models' predicted results). .. tip:: [See the source][_ghs] for how `ACQ_FUNCS['UCB']` is implemeted. The passed parameters are open to extension to accommodate alternative acquisition functions. [_ghs]: https://github.com/search?q=repo%3Asambo-optimization%2Fsambo%20ACQ_FUNCS&type=code kappa : float or list[float], default 0 The upper/lower-confidence-bound parameter, used by `acq_func`, that balances exploration vs exploitation. Can also be an array of values to use sequentially for `n_cadidates`. Returns ------- np.ndarray An array of shape `(n_candidates, n_bounds)` containing the proposed candidate solutions. Notes ----- Candidates are proposed in parallel according to `n_jobs` when `n_candidates > 1`. Examples -------- >>> candidates = optimizer.ask(n_candidates=2, kappa=2) >>> candidates array([[ 1.1, -0.2], [ 0.8, 0.1]]) """ if n_candidates is None: n_candidates = self.n_candidates assert isinstance(n_candidates, Integral) and n_candidates > 0, n_candidates assert isinstance(kappa, (Real, Iterable)), kappa self._init_once() n_points = max(10_000, 1000 * int(len(self.bounds)**1.2)) X = _sample_population(self.bounds, n_points, self.constraints, self.rng) X, mean, std = self._predict(X) criterion = acq_func(mean=mean, std=std, kappa=kappa) best_indices = np.argsort(criterion)[:, :n_candidates].flatten('F') X = X[best_indices] X = X[:n_candidates] self._X_ask.extend(map(tuple, X)) return X
Propose candidate solutions for the next objective evaluation based on the current surrogate model(s) and acquisition function.
Parameters
n_candidates
:int
, optional- Number of candidate solutions to propose. If not specified, the default value set during initialization is used.
acq_func
:Callable
, defaultACQ_FUNCS['UCB']
-
Acquisition function used to guide the selection of candidate solutions. By default, upper confidence bound (i.e.
mean + kappa * std
wheremean
andstd
are surrogate models' predicted results).Tip
See the source for how
ACQ_FUNCS['UCB']
is implemeted. The passed parameters are open to extension to accommodate alternative acquisition functions. kappa
:float
orlist[float]
, default0
-
The upper/lower-confidence-bound parameter, used by
acq_func
, that balances exploration vs exploitation.Can also be an array of values to use sequentially for
n_cadidates
.
Returns
np.ndarray
- An array of shape
(n_candidates, n_bounds)
containing the proposed candidate solutions.
Notes
Candidates are proposed in parallel according to
n_jobs
whenn_candidates > 1
.Examples
>>> candidates = optimizer.ask(n_candidates=2, kappa=2) >>> candidates array([[ 1.1, -0.2], [ 0.8, 0.1]])
def run(self, *, max_iter: int | None = None, n_candidates: int | None = None) ‑> sambo._util.OptimizeResult
-
Expand source code Browse git
def run(self, *, max_iter: Optional[int] = None, n_candidates: Optional[int] = None) -> OptimizeResult: """ Execute the optimization process for (at most) a specified number of iterations (function evaluations) and return the optimization result. This method performs sequential optimization by iteratively proposing candidates using method `ask()`, evaluating the objective function, and updating the optimizer state with method `tell()`. This continues until the maximum number of iterations (`max_iter`) is reached or other stopping criteria are met. This method encapsulates the entire optimization workflow, making it convenient to use when you don't need fine-grained control over individual steps (`ask` and `tell`). It cycles between exploration and exploitation by random sampling `kappa` appropriately. Parameters ---------- max_iter : int, optional The maximum number of iterations to perform. If not specified, the default value provided during initialization is used. n_candidates : int, optional Number of candidates to propose and evaluate in each iteration. If not specified, the default value provided during initialization is used. Returns ------- OptimizeResult: OptimizeResult Results of the optimization process. Examples -------- Run an optimization with a specified number of iterations: >>> result = optimizer.run(max_iter=30) >>> print(result.x, result.fun) # Best x, y """ max_iter = max_iter if max_iter is not None else 0 if self.fun is None else self.max_iter assert callable(self.fun) or max_iter == 0, "Can't run optimizer when fun==None. Can only use ask-tell API." assert n_candidates is None or isinstance(n_candidates, Integral) and n_candidates > 0, n_candidates assert max_iter is None or isinstance(max_iter, Integral) and max_iter >= 0, max_iter n_candidates = n_candidates or self.n_candidates success = True message = "Optimization hadn't been started" iteration = 0 prev_best_value = np.inf no_change = 0 try: for iteration in range(1, max_iter + 1): coefs = [self.rng.uniform(-2, 2) for i in range(n_candidates)] X = self.ask(n_candidates, kappa=coefs) y = self.fun(X) self.tell(y) best_value = min(self._y) if self.tol and prev_best_value - best_value < self.tol or prev_best_value == best_value: no_change += 1 if no_change == self.n_iter_no_change: message = 'Optimization converged (y_prev[n_iter_no_change] - y_best < tol)' break else: assert best_value < prev_best_value no_change = 0 prev_best_value = best_value if self.disp: print(f"{__package__}: {self.estimator.__class__.__name__} " f"nit:{iteration}, nfev:{self.fun.func.nfev}, " f"fun:{np.min(self._y):.5g}") except _ObjectiveFunctionWrapper.CallbackStopIteration: message = 'Optimization callback returned True' except _ObjectiveFunctionWrapper.MaximumFunctionEvaluationsReached: message = f'Maximum function evaluations reached (max_iter = {max_iter})' success = False except KeyboardInterrupt: message = 'KeyboardInterrupt' success = False if len(self._X) == 0 and self.fun is not None: # We were interrupted before ._init_once() could finish self._X = self.fun.func.xv self._y = self.fun.func.funv x, y = self.top_k(1) result = OptimizeResult( success=success, message=message, x=x, fun=y, nit=iteration, nfev=len(self._y) - (len(self.y0) if self.y0 is not None else 0), xv=np.array(self._X), funv=np.array(self._y), model=list(self.estimators), ) return result
Execute the optimization process for (at most) a specified number of iterations (function evaluations) and return the optimization result.
This method performs sequential optimization by iteratively proposing candidates using method
ask()
, evaluating the objective function, and updating the optimizer state with methodtell()
. This continues until the maximum number of iterations (max_iter
) is reached or other stopping criteria are met.This method encapsulates the entire optimization workflow, making it convenient to use when you don't need fine-grained control over individual steps (
ask
andtell
). It cycles between exploration and exploitation by random samplingkappa
appropriately.Parameters
max_iter
:int
, optional- The maximum number of iterations to perform. If not specified, the default value provided during initialization is used.
n_candidates
:int
, optional- Number of candidates to propose and evaluate in each iteration. If not specified, the default value provided during initialization is used.
Returns
OptimizeResult
:OptimizeResult
- Results of the optimization process.
Examples
Run an optimization with a specified number of iterations:
>>> result = optimizer.run(max_iter=30) >>> print(result.x, result.fun) # Best x, y
def tell(self,
y: float | list[float],
x: float | tuple[float] | list[tuple[float]] | None = None)-
Expand source code Browse git
def tell(self, y: float | list[float], x: Optional[float | tuple[float] | list[tuple[float]]] = None): """ Provide incremental feedback to the optimizer by reporting back the objective function values (`y`) at suggested or new candidate points (`x`). This allows the optimizer to refine its underlying model(s) and better guide subsequent proposals. Parameters ---------- y : float or list[float] The observed value(s) of the objective function. x : float or list[float], optional The input point(s) corresponding to the observed objective function values `y`. If omitted, the optimizer assumes that the `y` values correspond to the most recent candidates proposed by the `ask` method (FIFO). .. warning:: The function first takes `y`, then `x`, not the other way around! Examples -------- >>> candidates = optimizer.ask(n_candidates=3) >>> ... # Evaluate candidate solutions IRL and tell it to the optimizer >>> objective_values = [1.7, 3, .8] >>> optimizer.tell(y=objective_values, x=candidates) """ y = np.atleast_1d(y) assert y.ndim == 1, 'y= should be at most 1-dimensional' if x is None: if not self._X_ask: raise RuntimeError( f'`{self.tell.__qualname__}(y, x=None)` only allowed as many ' f'times as `{self.ask.__qualname__}()` was called beforehand') for x, yval in zip(tuple(self._X_ask), y): self._X_ask.pop(0) self._X.append(x) self._y.append(yval) else: x = np.atleast_2d(x) assert len(x) == len(y), 'y= and x= (if provided) must contain the same number of items' for xi, yi in zip(x, y): try: self._X_ask.pop(self._X_ask.index(tuple(xi))) except (ValueError, IndexError): pass self._X.append(xi) self._y.append(yi) self._fit()
Provide incremental feedback to the optimizer by reporting back the objective function values (
y
) at suggested or new candidate points (x
).This allows the optimizer to refine its underlying model(s) and better guide subsequent proposals.
Parameters
y
:float
orlist[float]
- The observed value(s) of the objective function.
x
:float
orlist[float]
, optional-
The input point(s) corresponding to the observed objective function values
y
. If omitted, the optimizer assumes that they
values correspond to the most recent candidates proposed by theask
method (FIFO).Warning
The function first takes
y
, thenx
, not the other way around!
Examples
>>> candidates = optimizer.ask(n_candidates=3) >>> ... # Evaluate candidate solutions IRL and tell it to the optimizer >>> objective_values = [1.7, 3, .8] >>> optimizer.tell(y=objective_values, x=candidates)
def top_k(self, k: int = 1)
-
Expand source code Browse git
def top_k(self, k: int = 1): """ Based on their objective function values, retrieve the top-k best solutions found by the optimization process so far. Parameters ---------- k : int, default 1 The number of top solutions to retrieve. If `k` exceeds the number of evaluated solutions, all available solutions are returned. Returns ------- X : np.ndarray A list of best points with shape `(k, n_bounds)`. y : np.ndarray Objective values at points of `X`. Examples -------- Retrieve the best solution: >>> optimizer.run() >>> best_x, best_y = optimizer.top_k(1) """ assert isinstance(k, Integral) and k > 0, k best_index = np.argsort(self._y) index = slice(0, k) if k > 1 else (k - 1) return self._X[best_index[index]], self._y[best_index[index]]
Based on their objective function values, retrieve the top-k best solutions found by the optimization process so far.
Parameters
k
:int
, default1
- The number of top solutions to retrieve.
If
k
exceeds the number of evaluated solutions, all available solutions are returned.
Returns
X
:np.ndarray
- A list of best points with shape
(k, n_bounds)
. y
:np.ndarray
- Objective values at points of
X
.
Examples
Retrieve the best solution:
>>> optimizer.run() >>> best_x, best_y = optimizer.top_k(1)
class SamboSearchCV (estimator,
param_grid: dict,
*,
max_iter: int = 100,
method: Literal['shgo', 'sceua', 'smbo'] = 'smbo',
rng: int | numpy.random.mtrand.RandomState | numpy.random._generator.Generator | None = None,
**kwargs)-
Expand source code Browse git
class SamboSearchCV(BaseSearchCV): """ SAMBO hyper-parameter search with cross-validation that can be used to **optimize hyperparameters of machine learning estimator pipelines** like those of scikit-learn. Similar to `GridSearchCV` from scikit-learn, but hopefully **much faster for large parameter spaces**. Parameters ---------- estimator : BaseEstimator The base model or pipeline to optimize parameters for. It needs to implement `fit()` and `predict()` methods. param_grid : dict Dictionary with parameters names (str) as keys and lists of parameter choices to try as values. Supports both continuous parameter ranges and discrete/string parameter enumerations. max_iter : int, optional, default=100 The maximum number of iterations for the optimization. method : {'shgo', 'sceua', 'smbo'}, optional, default='smbo' The optimization algorithm to use. See method `sambo.minimize()` for comparison. rng : int or np.random.RandomState or np.random.RandomGenerator or None, optional Random seed for reproducibility. **kwargs : dict, optional Additional parameters to pass to `BaseSearchCV` (`scoring=`, `n_jobs=`, `refit=` `cv=`, `verbose=`, `pre_dispatch=`, `error_score=`, `return_train_score=`). For explanation, see documentation on [`GridSearchCV`][skl_gridsearchcv]. [skl_gridsearchcv]: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html Attributes ---------- opt_result_ : OptimizeResult The result of the optimization process. See Also -------- 1: https://scikit-learn.org/stable/modules/grid_search.html """ def __init__( self, estimator, param_grid: dict, *, max_iter: int = 100, method: Literal['shgo', 'sceua', 'smbo'] = 'smbo', rng: Optional[int | np.random.RandomState | np.random.Generator] = None, **kwargs ): super().__init__(estimator=estimator, **kwargs) self.param_grid = param_grid self.max_iter = max_iter self.method = method self.rng = rng def _run_search(self, evaluate_candidates): import joblib @lru_cache(key=joblib.hash) # TODO: lru_cache(max_iter) objective function calls always?? def _objective(x): res = evaluate_candidates([dict(zip(self.param_grid.keys(), x))]) y = -res['mean_test_score'][-1] nonlocal it it += 1 if self.verbose: print(f'{self.__class__.__name__}: it={it}; y={y}; x={x}') return y bounds = [((sv := sorted(v))[0], sv[-1] + 1) if all(isinstance(i, Integral) for i in v) else ((sv := sorted(v))[0], sv[-1]) if all(isinstance(i, Real) for i in v) else list({i: 1 for i in v}) for v in self.param_grid.values()] kwargs = {} if self.max_iter is not None: kwargs = {'max_iter': self.max_iter} from ._minimize import minimize it = 0 self.opt_result_ = minimize( _objective, bounds=bounds, method=self.method, disp=self.verbose, rng=0, **kwargs)
SAMBO hyper-parameter search with cross-validation that can be used to optimize hyperparameters of machine learning estimator pipelines like those of scikit-learn. Similar to
GridSearchCV
from scikit-learn, but hopefully much faster for large parameter spaces.Parameters
estimator
:BaseEstimator
- The base model or pipeline to optimize parameters for.
It needs to implement
fit()
andpredict()
methods. param_grid
:dict
- Dictionary with parameters names (str) as keys and lists of parameter choices to try as values. Supports both continuous parameter ranges and discrete/string parameter enumerations.
max_iter
:int
, optional, default=100
- The maximum number of iterations for the optimization.
method
:{'shgo', 'sceua', 'smbo'}
, optional, default='smbo'
- The optimization algorithm to use. See method
minimize()
for comparison. rng
:int
ornp.random.RandomState
ornp.random.RandomGenerator
orNone
, optional- Random seed for reproducibility.
**kwargs
:dict
, optional-
Additional parameters to pass to
BaseSearchCV
(scoring=
,n_jobs=
,refit=
cv=
,verbose=
,pre_dispatch=
,error_score=
,return_train_score=
). For explanation, see documentation onGridSearchCV
.
Attributes
opt_result_
:OptimizeResult
- The result of the optimization process.
See Also
Ancestors
- sklearn.model_selection._search.BaseSearchCV
- sklearn.base.MetaEstimatorMixin
- sklearn.base.BaseEstimator
- sklearn.utils._estimator_html_repr._HTMLDocumentationLinkMixin
- sklearn.utils._metadata_requests._MetadataRequester