gcloud jobs submit prediction with --data-format=TF_RECORD fails with 'No JSON object could be decoded'


Problem description

I pushed some test data to gcloud for prediction as a binary TFRecord file. When I ran my script, I got the error ('No JSON object could be decoded', 162). What am I doing wrong?

To push a prediction job to gcloud, I use this script:

REGION=us-east1
MODEL_NAME=mymodel
VERSION=v_hopt_22
INPUT_PATH=gs://mydb/test-data.tfr
OUTPUT_PATH=gs://mydb/prediction.tfr
JOB_NAME=pred_${MODEL_NAME}_${VERSION}_b

args=" --model "$MODEL_NAME
args+=" --version "$VERSION

args+=" --data-format=TF_RECORD"
args+=" --input-paths "$INPUT_PATH
args+=" --output-path "$OUTPUT_PATH

args+=" --region "$REGION

gcloud ml-engine jobs submit prediction $JOB_NAME $args
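
The same command line can also be assembled as an argument list, which avoids relying on shell word splitting of $args. The following is a minimal sketch in Python, assuming gcloud is installed and on PATH. build_prediction_command and submit_prediction are hypothetical helper names, and only the flags from the script above are used.

```python
import subprocess


def build_prediction_command(job_name, model, version, input_paths,
                             output_path, region, data_format="TF_RECORD"):
    """Return the argv list for `gcloud ml-engine jobs submit prediction`."""
    return [
        "gcloud", "ml-engine", "jobs", "submit", "prediction", job_name,
        "--model", model,
        "--version", version,
        "--data-format=" + data_format,
        "--input-paths", input_paths,
        "--output-path", output_path,
        "--region", region,
    ]


def submit_prediction(command):
    """Run the command (assumes gcloud is installed); raises on a non-zero exit."""
    subprocess.check_call(command)


# Same values as in the shell script above.
command = build_prediction_command(
    job_name="pred_mymodel_v_hopt_22_b",
    model="mymodel",
    version="v_hopt_22",
    input_paths="gs://mydb/test-data.tfr",
    output_path="gs://mydb/prediction.tfr",
    region="us-east1",
)
print(" ".join(command))
```

Calling submit_prediction(command) would then submit the job; it is left uncalled here so that the sketch can be inspected without contacting the service.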

test-data.tfr was generated from a NumPy array, like so:

import numpy as np

filename = './Datasets/test-data.npz'
data = np.load(filename)
features = data['X'] # features[channel, example, feature]
np_features = np.swapaxes(features, 0, 1) # features[example, channel, feature]

import tensorflow as tf
import nnscoring.data as D

def floats_feature(arr):
    return tf.train.Feature(float_list=tf.train.FloatList(value=arr.flatten().tolist()))

writer = tf.python_io.TFRecordWriter("./Datasets/test-data.tfr")

for i, np_example in enumerate(np_features):
    if i%1000==0: print(i)
    tf_feature = {  
        ch: floats_feature(x)
        for ch, x in zip(D.channels, np_example)
    }
    tf_features = tf.train.Features(feature=tf_feature)
    tf_example = tf.train.Example(features=tf_features)
    writer.write(tf_example.SerializeToString())

writer.close()
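
Since the error message mentions JSON, it can be useful to confirm that the uploaded file really is a binary TFRecord container with the expected number of records. The following is a minimal sketch that walks the TFRecord framing (8-byte little-endian length, 4-byte CRC of the length, payload, 4-byte CRC of the payload) using only the standard library. count_tfrecords is a hypothetical helper name, and the CRC fields are skipped rather than verified.

```python
import struct


def count_tfrecords(path):
    """Count the records in a TFRecord file by walking its framing.

    CRC fields are skipped, not verified.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # uint64 length + uint32 CRC of the length
            if not header:
                break  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            body = f.read(length + 4)  # payload + uint32 CRC of the payload
            if len(body) < length + 4:
                raise ValueError("truncated record body")
            count += 1
    return count
```

For the file written above, count_tfrecords("./Datasets/test-data.tfr") should equal the number of examples in np_features.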

Update (following yxshi's answer):

I defined the following serving input function:

def tfrecord_serving_input_fn():
    import tensorflow as tf
    seq_length = int(dt*sr) 
    examples = tf.placeholder(tf.string, shape=())
    feat_map = {
        channel: tf.FixedLenSequenceFeature(shape=(seq_length,),
            dtype=tf.float32, allow_missing=True)
        for channel in channels
    }
    parsed = tf.parse_single_example(examples, features=feat_map)
    features = {
        channel: tf.expand_dims(tensor, -1)
        for channel, tensor in parsed.iteritems()
    }
    from collections import namedtuple
    InputFnOps = namedtuple("InputFnOps", "features labels receiver_tensors")
    tf.contrib.learn.utils.input_fn_utils.InputFnOps = InputFnOps
    return InputFnOps(features=features, labels=None, receiver_tensors=examples)
    # InputFnOps = tf.contrib.learn.utils.input_fn_utils.InputFnOps
    # return InputFnOps(features, None, parsed)
    # Error: InputFnOps has no attribute receiver_tensors

... which I pass to generate_experiment_fn like so:

export_strategies = [
    saved_model_export_utils.make_export_strategy(
        tfrecord_serving_input_fn,
        exports_to_keep=1,
        default_output_alternative_key=None,
    )
]

gen_exp_fn = generate_experiment_fn(
    train_steps_per_iteration=args.train_steps_per_iteration,
    train_steps=args.train_steps,
    export_strategies=export_strategies,
)

(Aside: note the dirty patch of InputFnOps.)

Recommended answer

It looks like the input is not correctly specified in the inference graph. To use TF_RECORD as the input data format, the inference graph must accept strings as its input placeholder. In your case, the inference code should contain something like the following:

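 # A batch of serialized tf.Example protos, one string per record.
 examples = tf.placeholder(tf.string, name='input', shape=(None,))
 with tf.name_scope('inputs'):
   # The parsing spec describes how to decode each feature; it is not the
   # tf.train.Feature values used when writing the file.
   # seq_length (floats stored per channel) is assumed to be known here.
   feature_map = {
     ch: tf.FixedLenFeature(shape=[seq_length], dtype=tf.float32)
     for ch in D.channels
   }
   parsed = tf.parse_example(examples, features=feature_map)
   # Placeholder feature names; use the actual channel names.
   f1 = parsed['feature_name_1']
   f2 = parsed['feature_name_2']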

 ...

A close example is here: https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/flowers/trainer/model.py#L253

Hope this helps.
