python - Second derivative in Keras

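For a custom loss for a NN, I use a function. Given a (t, x) pair, with both points in an interval, the NN produces the output that I need to differentiate. The problem is that I'm stuck on how to compute the second derivative using K.gradients (K being the TensorFlow backend):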

def custom_loss(input_tensor, output_tensor):
    def loss(y_true, y_pred):

        # so far, I can only get this right, naturally:
        gradient = K.gradients(output_tensor, input_tensor)

        # here I'm falling badly:

        # d_t = K.gradients(output_tensor, input_tensor)[0]
        # dd_x = K.gradient(K.gradients(output_tensor, input_tensor),
        #                   input_tensor[1])

        return gradient # obviously not useful, just for it to work
    return loss

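All my attempts, based on Input(shape=(2,)), were variations of the commented lines in the snippet above, mainly trying to find the right indexing of the resulting tensor.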

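Admittedly, I lack knowledge of how exactly tensors work. By the way, I know that in TensorFlow itself I could simply use tf.hessians, but I noticed it is simply not available when using TF as a backend.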

Best answer

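For a K.gradients() layer to work like that, you have to enclose it in a Lambda() layer, because otherwise a full Keras layer is not created and you cannot chain it or train through it. So this code will work (tested):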

import keras
from keras.models import *
from keras.layers import *
from keras import backend as K
import tensorflow as tf

def grad( y, x ):
    return Lambda( lambda z: K.gradients( z[ 0 ], z[ 1 ] ), output_shape = [1] )( [ y, x ] )

def network( i, d ):
    m = Add()( [ i, d ] )
    a = Lambda(lambda x: K.log( x ) )( m )
    return a

fixed_input = Input(tensor=tf.constant( [ 1.0 ] ) )
double = Input(tensor=tf.constant( [ 2.0 ] ) )

a = network( fixed_input, double )

b = grad( a, fixed_input )
c = grad( b, fixed_input )
d = grad( c, fixed_input )
e = grad( d, fixed_input )

model = Model( inputs = [ fixed_input, double ], outputs = [ a, b, c, d, e ] )

print( model.predict( x=None, steps = 1 ) )

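def network models f(x) = log(x + 2) at x = 1, and def grad is where the gradient calculation is done. The code outputs the function value followed by its first four derivatives.

These are the correct values for log(3), 1/3, -1/3^2, 2/3^3, -6/3^4.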
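The expected values can be cross-checked without Keras. The following is a minimal sketch in plain Python that evaluates the closed-form n-th derivative of log(x + c); the function name nth_derivative_log_shifted and its shift parameter are hypothetical and introduced only for this check.

```python
import math

def nth_derivative_log_shifted(n, x, shift=2.0):
    """Return the n-th derivative of log(x + shift) at x (n = 0 is the function itself)."""
    if n == 0:
        return math.log(x + shift)
    # For n >= 1: d^n/dx^n log(x + c) = (-1)^(n-1) * (n-1)! / (x + c)^n
    return (-1) ** (n - 1) * math.factorial(n - 1) / (x + shift) ** n

# Function value and first four derivatives at x = 1
expected = [nth_derivative_log_shifted(n, 1.0) for n in range(5)]
print(expected)
```

The printed list is approximately 1.0986, 0.3333, -0.1111, 0.0741, -0.0741, which is what the chained gradients should reproduce up to float32 precision.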

Reference TensorFlow code

For reference, the same code in plain TensorFlow (used for testing):

import tensorflow as tf

a = tf.constant( 1.0 )
a2 = tf.constant( 2.0 )

b = tf.log( a + a2 )
c = tf.gradients( b, a )
d = tf.gradients( c, a )
e = tf.gradients( d, a )
f = tf.gradients( e, a )

with tf.Session() as sess:
    print( sess.run( [ b, c, d, e, f ] ) )

It outputs the same values.

Hessians

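tf.hessians() does return the second derivative; it is shorthand for chaining two tf.gradients() calls. The Keras backend does not have hessians, though, so you have to chain two K.gradients() calls yourself.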
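To illustrate why differentiating a derivative again yields the second derivative, here is a framework-free sketch in plain Python. It does not use Keras or TensorFlow: the Dual class and the helpers log, reciprocal and d2_dx2 are hypothetical names that exist only for this illustration, and it relies on nested dual numbers (forward-mode automatic differentiation) rather than the graph differentiation that K.gradients() performs. The idea carries over, though: the first derivative is itself a differentiable quantity. The example differentiates twice with respect to x only and treats t as a plain number.

```python
import math

class Dual:
    """A number val + der*eps with eps**2 == 0; der carries the derivative."""
    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)
    __rmul__ = __mul__

def reciprocal(x):
    if isinstance(x, Dual):
        r = reciprocal(x.val)
        return Dual(r, x.der * r * r * -1.0)  # d/dx (1/x) = -1/x^2
    return 1.0 / x

def log(x):
    if isinstance(x, Dual):
        return Dual(log(x.val), x.der * reciprocal(x.val))  # d/dx log(x) = 1/x
    return math.log(x)

def d2_dx2(u, t0, x0):
    """Second partial derivative of u(t, x) with respect to x at (t0, x0)."""
    # Seed x at two nesting levels: the inner level tracks d/dx,
    # the outer level tracks d/dx of that derivative.
    x = Dual(Dual(x0, 1.0), 1.0)
    return u(t0, x).der.der

def u(t, x):
    return t * log(x + 2)

print(d2_dx2(u, 2.0, 1.0))  # analytically -t / (x + 2)^2 = -2/9
```

In Keras the same two-step structure appears as a gradient call applied to the result of another gradient call, with the x column selected from the input before the second call.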

Numerical approximation

If for some reason none of the above works, you may want to consider approximating the second derivative numerically by taking differences over a small ε distance. This essentially triples the network for each input, so besides lacking accuracy, this solution raises serious efficiency concerns. Anyway, the code (tested):

import keras
from keras.models import *
from keras.layers import *
from keras import backend as K
import tensorflow as tf

def network( i, d ):
    m = Add()( [ i, d ] )
    a = Lambda(lambda x: K.log( x ) )( m )
    return a

fixed_input = Input(tensor=tf.constant( [ 1.0 ], dtype = tf.float64 ) )
double = Input(tensor=tf.constant( [ 2.0 ], dtype = tf.float64 ) )

epsilon = Input( tensor = tf.constant( [ 1e-7 ], dtype = tf.float64 ) )
eps_reciproc = Input( tensor = tf.constant( [ 1e+7 ], dtype = tf.float64 ) )

a0 = network( Subtract()( [ fixed_input, epsilon ] ), double )
a1 = network( fixed_input, double )
a2 = network( Add()( [ fixed_input, epsilon ] ), double )

d0 = Subtract()( [ a1, a0 ] )
d1 = Subtract()( [ a2, a1 ] )

dv0 = Multiply()( [ d0, eps_reciproc ] )
dv1 = Multiply()( [ d1, eps_reciproc ] )

dd0 = Multiply()( [ Subtract()( [ dv1, dv0 ] ), eps_reciproc ] )

model = Model( inputs = [ fixed_input, double, epsilon, eps_reciproc ], outputs = [ a0, dv0, dd0 ] )
print( model.predict( x=None, steps = 1 ) )

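The model outputs the function value (evaluated at x - ε) followed by the finite-difference estimates of the first and second derivatives.

(This only goes up to the second derivative.)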
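The accuracy concern can be explored without building a model. Below is a minimal sketch in plain Python of the same second-difference formula, (f(x+h) - 2 f(x) + f(x-h)) / h^2, applied to f(x) = log(x + 2) at x = 1. The function names f and second_difference and the particular step sizes are assumptions chosen for illustration.

```python
import math

def f(x):
    return math.log(x + 2.0)

def second_difference(func, x, h):
    # Central second difference: (f(x+h) - 2*f(x) + f(x-h)) / h^2
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

exact = -1.0 / 9.0  # analytic second derivative of log(x + 2) at x = 1

# Compare a few step sizes against the analytic value
for h in (1e-2, 1e-4, 1e-7):
    approx = second_difference(f, 1.0, h)
    print(h, approx, abs(approx - exact))
```

In float64, a very small step such as 1e-7 tends to be less accurate than a moderate one, because rounding errors in the function values are divided by h^2. A step around 1e-4 is often a reasonable starting point for a second difference, though the best choice depends on the function.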
Regarding python - Second derivative in Keras, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49935778/