About params_func

params_func defines the hyperparameters used for training, including fixed values and the search space explored during hyperparameter optimization.

params_func must accept exactly one argument and return a dictionary of hyperparameters. The expected content of the dictionary depends on the type of model you are training.

params_func input

During training, an optuna.trial.Trial object is passed as the single argument of params_func. You can use this object to define the hyperparameters to optimize.

For instance, optuna.trial.Trial has the method .suggest_float, which searches for the optimal value of a float hyperparameter. All methods of optuna.trial.Trial are described in the optuna documentation.

Example:

def params_func(trial):
    return {
        'model_params': {
            'boosting': trial.suggest_categorical('boosting', ['gbdt', 'dart', 'goss']),
            'feature_fraction': trial.suggest_float('feature_fraction', 0.01, 1),
            'min_child_samples': trial.suggest_int('min_child_samples', 2, 256),
        }
    }

During hyperparameter optimization:

boosting is chosen from ['gbdt', 'dart', 'goss']
feature_fraction is sampled from a uniform distribution between 0.01 and 1
min_child_samples is chosen from the integers between 2 and 256 (both bounds included)
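To check what a params_func returns before starting a real optimization, it can be called with a stand-in for the trial object. The sketch below assumes a hypothetical StubTrial class (not part of optuna or mlexp) that implements only the three suggest methods used in the example and samples at random instead of optimizing.

```python
import random


class StubTrial:
    """Hypothetical stand-in for optuna.trial.Trial (not part of optuna or mlexp).

    It implements only the suggest_* methods used below and samples
    at random instead of running a real optimization.
    """

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def suggest_categorical(self, name, choices):
        return self._rng.choice(choices)

    def suggest_float(self, name, low, high):
        return self._rng.uniform(low, high)

    def suggest_int(self, name, low, high):
        # Both bounds are inclusive, as in optuna.
        return self._rng.randint(low, high)


def params_func(trial):
    return {
        'model_params': {
            'boosting': trial.suggest_categorical('boosting', ['gbdt', 'dart', 'goss']),
            'feature_fraction': trial.suggest_float('feature_fraction', 0.01, 1),
            'min_child_samples': trial.suggest_int('min_child_samples', 2, 256),
        }
    }


# Call params_func exactly as a trainer would: with one trial-like argument.
params = params_func(StubTrial(seed=0))
print(params)
```

With a real study, optuna creates the Trial object itself, so params_func never needs to construct one.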

params_func for LGB model

If you use mlexp.trainer.lgb_trainer, params_func must return a dictionary with the following keys:

model_params should contain a dictionary with hyperparameters for the lightgbm model, excluding dataset parameters
lgb_data_set_params should contain a dictionary with parameters for lightgbm.Dataset, excluding data and label, which are passed from the training data automatically

Example:

# random_state is assumed to be defined in the enclosing scope.
def params_func(trial):
    return {
        'model_params': {
            'objective': trial.suggest_categorical('objective', ['huber', 'fair', 'l2', 'l1', 'mape']),
            'boosting': trial.suggest_categorical('boosting', ['gbdt', 'dart', 'goss']),
            'n_jobs': -1,
            'n_estimators': 500,
            'random_state': random_state,
            'bagging_fraction': trial.suggest_float('bagging_fraction', 0.01, 1),
            'feature_fraction': trial.suggest_float('feature_fraction', 0.01, 1),
            'min_child_samples': trial.suggest_int('min_child_samples', 2, 256),
            'num_leaves': trial.suggest_int('num_leaves', 2, 256),
            'learning_rate': trial.suggest_float('learning_rate', 0.01, 1.5),
        },
        'lgb_data_set_params': {
            'feature_name': ['first_feature', 'second_feature'],
            'categorical_feature': ['first_feature'],
        },
    }
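The example above reads random_state from the surrounding scope, because params_func can take only the trial argument. One possible way to supply such fixed values explicitly is a closure, sketched below. make_params_func and _MidpointTrial are hypothetical names used for illustration and are not part of mlexp or optuna; the stub exists only so the function can be called outside of an optimization run.

```python
def make_params_func(random_state, feature_names, categorical_features):
    """Build a one-argument params_func with fixed values baked in.

    Hypothetical helper for illustration; not part of mlexp.
    """

    def params_func(trial):
        return {
            'model_params': {
                'n_jobs': -1,
                'n_estimators': 500,
                'random_state': random_state,
                'num_leaves': trial.suggest_int('num_leaves', 2, 256),
                'learning_rate': trial.suggest_float('learning_rate', 0.01, 1.5),
            },
            'lgb_data_set_params': {
                'feature_name': list(feature_names),
                'categorical_feature': list(categorical_features),
            },
        }

    return params_func


class _MidpointTrial:
    """Hypothetical stub that returns the midpoint of every range."""

    def suggest_int(self, name, low, high):
        return (low + high) // 2

    def suggest_float(self, name, low, high):
        return (low + high) / 2


params_func = make_params_func(
    random_state=42,
    feature_names=['first_feature', 'second_feature'],
    categorical_features=['first_feature'],
)

# The returned function still takes exactly one argument.
params = params_func(_MidpointTrial())
print(params['model_params']['random_state'])
```

The same effect could be achieved with functools.partial; the closure simply keeps the fixed values next to the search space.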

params_func for scikit-learn model

If you use mlexp.trainer.sklearn_trainer, params_func must return a dictionary with the following key:

model_params should contain a dictionary with hyperparameters for the scikit-learn model

Example for sklearn.linear_model.Ridge:

# random_state is assumed to be defined in the enclosing scope.
def params_func(trial):
    return {
        'model_params': {
            'random_state': random_state,
            'alpha': trial.suggest_float('alpha', 0.01, 1),
        },
    }

params_func for pytorch-lightning neural network

If you use mlexp.trainer.torch_trainer, params_func must return a dictionary with the following keys:

model_params should contain a dictionary with hyperparameters for your nn_model class, defined in the module passed to the nn_model_module parameter
EarlyStopping_params should contain a dictionary with parameters for pytorch_lightning.callbacks.EarlyStopping
trainer_params should contain a dictionary with parameters for pytorch_lightning.Trainer
data_loaders_params should contain a dictionary with parameters for your train_val_data_loaders function, defined in the module passed to the data_loaders_module parameter

Example:

def params_func(trial):
    return {
        'model_params': {
            'optimizer': 'SGD',
            'objective': 'cross_entropy',
            'lr': trial.suggest_float('lr', 0.001, 1),
            'vocab_size': 500,
            'embedding_size': 150,
            'weight_decay': 0.05,
        },
        'trainer_params': {
            'min_epochs': 2,
            'max_epochs': 10,
            'num_sanity_val_steps': 0,
            'progress_bar_refresh_rate': 0,
            'gpus': 1,
        },
        'data_loaders_params': {'return_wide_format': True},
        'EarlyStopping_params': {
            'min_delta': 0.001,
            'patience': 3,
            'monitor': 'validation',
        },
    }
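Because each trainer expects a different set of top-level keys, it can help to check the returned dictionary before launching a long run. The sketch below collects the keys listed in this document into a lookup table. REQUIRED_KEYS and missing_keys are hypothetical names, not part of mlexp, and whether mlexp itself treats every listed key as mandatory is an assumption here.

```python
# Top-level keys listed in this document for each trainer.
# Hypothetical lookup table; not part of mlexp.
REQUIRED_KEYS = {
    'lgb_trainer': {'model_params', 'lgb_data_set_params'},
    'sklearn_trainer': {'model_params'},
    'torch_trainer': {
        'model_params',
        'EarlyStopping_params',
        'trainer_params',
        'data_loaders_params',
    },
}


def missing_keys(params, trainer_name):
    """Return the sorted list of documented keys absent from params."""
    return sorted(REQUIRED_KEYS[trainer_name] - set(params))


# A dictionary for torch_trainer that forgot the early stopping settings.
torch_params = {
    'model_params': {'lr': 0.01},
    'trainer_params': {'max_epochs': 10},
    'data_loaders_params': {'return_wide_format': True},
}

print(missing_keys(torch_params, 'torch_trainer'))  # → ['EarlyStopping_params']
```

An empty list means every key documented for that trainer is present; the check does not validate the contents of the nested dictionaries.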