I have a dataset consisting of both numeric and categorical data and I want to predict adverse outcomes for patients based on their medical characteristics. I defined a prediction pipeline for my dataset like so:

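from sklearn.compose import ColumnTransformer, make_column_selector as selector
from sklearn.impute import KNNImputer, SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# dataset is a pandas DataFrame that includes a 'target' column
X = dataset.drop(columns=['target'])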
y = dataset['target']

# define categorical and numeric transformers
numeric_transformer = Pipeline(steps=[
    ('knnImputer', KNNImputer(n_neighbors=2, weights="uniform")),
    ('scaler', StandardScaler())])

categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

# dispatch object columns to the categorical_transformer and remaining columns to numeric_transformer
preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, selector(dtype_exclude="object")),
    ('cat', categorical_transformer, selector(dtype_include="object"))])

# Append classifier to preprocessing pipeline.
# Now we have a full prediction pipeline.
clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', LogisticRegression())])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

clf.fit(X_train, y_train)
print("model score: %.3f" % clf.score(X_test, y_test))


However, when running this code, I get the following warning message:

ConvergenceWarning: lbfgs failed to converge (status=1):
Increase the number of iterations (max_iter) or scale the data as shown in:
Please also refer to the documentation for alternative solver options:

    model score: 0.988


Can someone explain to me what this warning means? I am new to machine learning so am a little lost as to what I can do to improve the prediction model. As you can see from the numeric_transformer, I scaled the data through standardisation. I am also confused as to how the model score is quite high and whether this is a good or bad thing.


The warning means roughly what it says: it offers suggestions for getting the solver (the optimization algorithm) to converge.

lbfgs stands for "Limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm". It is one of the solver algorithms provided by the scikit-learn library.


The term limited-memory simply means that it stores only a few vectors, which implicitly represent the approximation to the inverse Hessian (the curvature information), instead of a full matrix.

It tends to converge better on relatively small datasets.
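
To make the solver a little less abstract, here is a minimal sketch that calls SciPy's L-BFGS-B routine directly on a standard test function. The Rosenbrock function, the starting point, and the option values are my own choices for illustration and have nothing to do with your dataset. The maxcor option is the number of correction vectors kept in memory, which is the "limited-memory" part, and maxiter plays the same role as max_iter in scikit-learn.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

# Classic starting point for the 2-D Rosenbrock function (illustrative choice).
x0 = np.array([-1.2, 1.0])

# Only 5 iterations allowed: the solver stops before reaching the minimum.
res_short = minimize(rosen, x0, jac=rosen_der, method="L-BFGS-B",
                     options={"maxcor": 10, "maxiter": 5})

# Generous iteration budget: the solver can run until its tolerances are met.
res_long = minimize(rosen, x0, jac=rosen_der, method="L-BFGS-B",
                    options={"maxcor": 10, "maxiter": 1000})

print("short run:", res_short.success, res_short.nit)
print("long run: ", res_long.success, res_long.nit, res_long.x)
```

The short run reports success=False because it ran out of iterations, which is the same situation your warning describes.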


Put simply: if the error (the value of the loss) changes only within a very small range from one iteration to the next, i.e. it is almost constant, then the algorithm is said to have converged to a solution. That solution is not necessarily the best one, as the algorithm might be stuck at a so-called "local optimum".

On the other hand, if the error is still changing noticeably, that is, the difference between the errors of successive iterations is greater than some tolerance, then we say the algorithm did not converge. This can happen even when the error itself is relatively small, as in your case, where the score was good.
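
The stopping rule described above can be sketched with a toy optimizer. This is plain gradient descent on a one-dimensional quadratic, not what lbfgs does internally; the function minimize_1d, the learning rates, and the tolerance are all made up for the illustration. The only point is the convergence test: stop when the loss changes by less than tol, otherwise give up after max_iter steps.

```python
def minimize_1d(grad, loss, x0, lr, tol=1e-6, max_iter=100):
    """Toy gradient descent; returns (x, iterations used, converged flag)."""
    x = x0
    prev = loss(x)
    for i in range(1, max_iter + 1):
        x = x - lr * grad(x)
        cur = loss(x)
        # Converged: the loss barely changed between two iterations.
        if abs(prev - cur) < tol:
            return x, i, True
        prev = cur
    # Ran out of iterations while the loss was still changing.
    return x, max_iter, False


def loss(x):
    return (x - 3.0) ** 2   # minimum at x = 3


def grad(x):
    return 2.0 * (x - 3.0)


x_fast, n_fast, ok_fast = minimize_1d(grad, loss, x0=0.0, lr=0.1)
x_slow, n_slow, ok_slow = minimize_1d(grad, loss, x0=0.0, lr=0.001)

print("lr=0.1:  ", x_fast, n_fast, ok_fast)
print("lr=0.001:", x_slow, n_slow, ok_slow)
```

With the larger step size the loop stops early because the loss has settled; with the tiny step size it uses all 100 iterations and is still moving, which is the "did not converge" case.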

Now, you need to know that the scikit-learn API sometimes lets the user specify the maximum number of iterations the algorithm may take while it searches for the solution iteratively:

LogisticRegression(... solver='lbfgs', max_iter=100 ...)


As you can see, the default solver in LogisticRegression is 'lbfgs' and the maximum number of iterations is 100 by default.


One final note: increasing the maximum number of iterations does not guarantee convergence, but it often helps.
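
As a rough sketch of the effect of max_iter, the snippet below fits the same model twice on synthetic data and records whether a ConvergenceWarning was raised. The data from make_classification, the helper fit_and_collect, and the two max_iter values are assumptions for the example, not your dataset or settings. After fitting, the n_iter_ attribute tells you how many iterations the solver actually used.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a real dataset (illustrative only).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)


def fit_and_collect(max_iter):
    """Fit with the given iteration cap and return (model, convergence warnings)."""
    model = LogisticRegression(solver="lbfgs", max_iter=max_iter)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model.fit(X, y)
    conv = [w for w in caught if issubclass(w.category, ConvergenceWarning)]
    return model, conv


model_short, warn_short = fit_and_collect(max_iter=3)
model_long, warn_long = fit_and_collect(max_iter=1000)

print("max_iter=3:   ", len(warn_short), "warning(s), n_iter_ =", model_short.n_iter_)
print("max_iter=1000:", len(warn_long), "warning(s), n_iter_ =", model_long.n_iter_)
```

If n_iter_ ends up well below the cap, the solver stopped because its tolerance was met rather than because it ran out of iterations.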


Based on your comment below, some tips to try (out of many) that might help the algorithm to converge are:

  • Increase the number of iterations, as described in this answer;
  • Try a different optimizer;
  • Scale your data;
  • Add engineered features;
  • Pre-process your data;
  • Add more data.
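
The first two tips can be applied to a pipeline like yours without rebuilding it. The sketch below uses synthetic data and a simplified two-step pipeline (both are assumptions for the example); the step name 'classifier' matches the one in your code. Parameters of a nested step are addressed as the step name, two underscores, then the parameter name.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with deliberately mismatched feature scales (illustrative only).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X = X * np.logspace(0, 4, X.shape[1])

clf = Pipeline(steps=[("scaler", StandardScaler()),
                      ("classifier", LogisticRegression())])

# Tip: raise the iteration cap on the nested estimator.
clf.set_params(classifier__max_iter=1000)
clf.fit(X, y)
lbfgs_iters = int(clf.named_steps["classifier"].n_iter_[0])
print("lbfgs iterations used:", lbfgs_iters)

# Tip: try a different solver, here 'newton-cg'.
clf.set_params(classifier__solver="newton-cg")
clf.fit(X, y)
newton_score = clf.score(X, y)
print("newton-cg training accuracy:", newton_score)
```

Which solver works best depends on the data, so treat 'newton-cg' here as one option to try rather than a recommendation.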

