Problem deploying the best estimator with sagemaker.estimator.Estimator (with sklearn custom image)

Problem description

After creating an SKLearn() instance and running HyperparameterTuner with a few hyperparameter ranges, I get the best estimator. When I try to deploy() the estimator, it gives an error in the log. Exactly the same error happens when I create a transformer and call transform() on it. It neither deploys nor transforms. What could be the problem, and how could I at least narrow it down?

I have no idea how to even begin to figure this out. Googling didn't help; nothing comes up.

Creating the SKLearn instance:

sklearn = SKLearn(
    entry_point=script_path,
    train_instance_type="ml.c4.xlarge",
    role=role,
    sagemaker_session=session,
    hyperparameters={'model': 'rfc'})

Running the tuner:

tuner = HyperparameterTuner(estimator = sklearn,
                            objective_metric_name = objective_metric_name,
                            objective_type = 'Minimize',
                            metric_definitions = metric_definitions,
                            hyperparameter_ranges = hyperparameters,
                            max_jobs = 3, # 9,
                            max_parallel_jobs = 4)

tuner.fit({'train': s3_input_train})
tuner.wait()
best_training_job = tuner.best_training_job()
the_best_estimator = sagemaker.estimator.Estimator.attach(best_training_job)

This gives a valid best training job. Everything seems fine up to this point.

The problem occurs here:

predictor = the_best_estimator.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")

or the following (triggers exactly the same problem):

rfc_transformer = the_best_estimator.transformer(1, instance_type="ml.m4.xlarge")
rfc_transformer.transform(test_location)
rfc_transformer.wait()

Here is the log with the error message (it repeats the same error many times while trying to deploy or transform; this is the beginning of the log):

................[2019-09-22 09:17:48 +0000] [17] [INFO] Starting gunicorn 19.9.0
[2019-09-22 09:17:48 +0000] [17] [INFO] Listening at: unix:/tmp/gunicorn.sock (17)
[2019-09-22 09:17:48 +0000] [17] [INFO] Using worker: gevent
[2019-09-22 09:17:48 +0000] [24] [INFO] Booting worker with pid: 24
[2019-09-22 09:17:48 +0000] [25] [INFO] Booting worker with pid: 25
[2019-09-22 09:17:48 +0000] [26] [INFO] Booting worker with pid: 26
[2019-09-22 09:17:48 +0000] [30] [INFO] Booting worker with pid: 30
2019-09-22 09:18:15,061 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
2019-09-22 09:18:15,062 INFO - sagemaker_sklearn_container.serving - Encountered an unexpected error.
[2019-09-22 09:18:15 +0000] [24] [ERROR] Error handling request /ping
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base_async.py", line 56, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/ggevent.py", line 160, in handle_request
    addr)
  File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base_async.py", line 107, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/usr/local/lib/python3.5/dist-packages/sagemaker_sklearn_container/serving.py", line 119, in main
    user_module_transformer = import_module(serving_env.module_name, serving_env.module_dir)
  File "/usr/local/lib/python3.5/dist-packages/sagemaker_sklearn_container/serving.py", line 97, in import_module
    user_module = importlib.import_module(module_name)
  File "/usr/lib/python3.5/importlib/__init__.py", line 117, in import_module
    if name.startswith('.'):
AttributeError: 'NoneType' object has no attribute 'startswith'
169.254.255.130 - - [22/Sep/2019:09:18:15 +0000] "GET /ping HTTP/1.1" 500 141 "-" "Go-http-client/1.1"

2019-09-22 09:18:15,178 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
2019-09-22 09:18:15,179 INFO - sagemaker_sklearn_container.serving - Encountered an unexpected error.
[2019-09-22 09:18:15 +0000] [30] [ERROR] Error handling request /ping
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base_async.py", line 56, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/ggevent.py", line 160, in handle_request
    addr)
  File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base_async.py", line 107, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/usr/local/lib/python3.5/dist-packages/sagemaker_sklearn_container/serving.py", line 119, in main
    user_module_transformer = import_module(serving_env.module_name, serving_env.module_dir)
  File "/usr/local/lib/python3.5/dist-packages/sagemaker_sklearn_container/serving.py", line 97, in import_module
    user_module = importlib.import_module(module_name)
  File "/usr/lib/python3.5/importlib/__init__.py", line 117, in import_module
    if name.startswith('.'):

Recommended answer

Double-check that the necessary environment variables are set. I ran into this issue when I had not set the environment variables SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT, SAGEMAKER_PROGRAM, and SAGEMAKER_SUBMIT_DIRECTORY. Inspect a working base model to see which environment variables need to be set.
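To narrow this down locally, a minimal sketch along these lines may help. It does two things: it reproduces the AttributeError from the traceback by passing None to importlib.import_module, and it checks a candidate environment dict for the three variables named above. The helper names (serving_env_from_hyperparameters, missing_serving_env, none_module_error) are hypothetical. The assumption that a framework training job records its entry point under the hyperparameters sagemaker_program and sagemaker_submit_directory as JSON-encoded strings, and that the container derives the module name from SAGEMAKER_PROGRAM, should be verified against your own training job description.

```python
import importlib
import json

# The three variables named in the answer above.
REQUIRED_SERVING_ENV = (
    "SAGEMAKER_DEFAULT_INVOCATIONS_ACCEPT",
    "SAGEMAKER_PROGRAM",
    "SAGEMAKER_SUBMIT_DIRECTORY",
)

# ASSUMPTION: hyperparameter names under which the training job stores
# the entry point script and the location of the packaged source code.
HYPERPARAMETER_TO_ENV = {
    "sagemaker_program": "SAGEMAKER_PROGRAM",
    "sagemaker_submit_directory": "SAGEMAKER_SUBMIT_DIRECTORY",
}


def serving_env_from_hyperparameters(hyperparameters):
    """Build a serving environment dict from training-job hyperparameters."""
    env = {}
    for hp_name, env_name in HYPERPARAMETER_TO_ENV.items():
        raw = hyperparameters.get(hp_name)
        if raw is None:
            continue
        try:
            # Values are assumed to be JSON-encoded, e.g. '"train.py"'.
            value = json.loads(raw)
        except (TypeError, ValueError):
            # Fall back to the raw value if it is not valid JSON.
            value = raw
        env[env_name] = str(value)
    return env


def missing_serving_env(env):
    """Return the required variable names that are absent or empty in env."""
    return [name for name in REQUIRED_SERVING_ENV if not env.get(name)]


def none_module_error():
    """Reproduce the error from the log: importing a module named None."""
    try:
        importlib.import_module(None)
    except AttributeError as exc:
        return str(exc)


if __name__ == "__main__":
    # Hypothetical hyperparameters, shaped like a training job description.
    hyperparameters = {
        "sagemaker_program": '"train.py"',
        "sagemaker_submit_directory": '"s3://bucket/source.tar.gz"',
        "model": '"rfc"',
    }
    env = serving_env_from_hyperparameters(hyperparameters)
    print("derived env:", env)
    print("still missing:", missing_serving_env(env))
    print("error when the module name is None:", none_module_error())
```

If the check reports missing variables, one option is to pass the completed dict as the env argument when building a sagemaker.model.Model from the trained artifacts. Attaching the job with the framework class (SKLearn.attach) instead of the generic Estimator may also restore the entry point, though that is worth confirming for your SDK version.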

