Question
I am adding new hardware to TensorFlow and I am having trouble with the Device Contexts.
Just to make sure I understand their purpose, from the code, they appear to manage the devices and data for a given node in the compute graph. Namely, the devices for the input and output data are specified in the Device Context, and the Executor handles passing Tensors back and forth between these devices. Every OpKernelContext appears to contain a Device Context that governs the particular execution of that given kernel.
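To make my mental model concrete, this is roughly how I expected a kernel to reach its context (a minimal sketch; op_device_context() is the OpKernelContext accessor I believe the built-in kernels use, and MyDeviceContext is a hypothetical type for my hardware):

```
#include "tensorflow/core/framework/device_base.h"
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/lib/core/errors.h"

using namespace tensorflow;

class MyDeviceReluOp : public OpKernel {
 public:
  explicit MyDeviceReluOp(OpKernelConstruction* ctx) : OpKernel(ctx) {}

  void Compute(OpKernelContext* ctx) override {
    // The executor attaches a DeviceContext to each OpKernelContext; on my
    // new device this currently comes back null, which leads to the segfault.
    DeviceContext* dc = ctx->op_device_context();
    OP_REQUIRES(ctx, dc != nullptr,
                errors::Internal("No DeviceContext attached for this op"));
    // A device-specific kernel would then downcast to its own context type
    // (hypothetical) to reach the hardware queue/stream and launch work:
    //   auto* my_dc = static_cast<MyDeviceContext*>(dc);
    //   ... enqueue this op's work on my_dc's queue ...
  }
};
```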
Currently for my new hardware, the Device Contexts are all null, which is ultimately causing a segfault. Per the style of the existing C++, I would expect that there are macros that allow me to "register" the Device Contexts for a particular Device Type (i.e. GPU, CPU), but I can't find these. My question is then: how do I get proper Device Contexts to be added to the OpKernelContexts as they are created for my device?
Note that I have not written a Device Context class specific to my hardware. I noticed that ThreadPoolDevice does not appear to have a DeviceContext implementation specific to it. I assume this is because the base class DeviceContext is implemented for ThreadPools.
I would appreciate any clarification on DeviceContexts.
Answer
DeviceContext objects serve two purposes:
At the moment, there are some StreamExecutor-specific bits that most callers can probably ignore (gpu::Stream / MaintainLifetime), because they are specific to GPU.
You need a handle to the underlying device resources in OpKernels, and the DeviceContext object holds the "stream" object that is used to compute on.
We have yet to implement an opaque handle in DeviceContext that other devices should implement as their resource, but that's what would be needed. So that's a TODO before we could get non-StreamExecutor-based devices to work, unfortunately.
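To make that concrete, a device-specific context can start out as little more than a thin wrapper around that handle. A rough sketch of what such a class could look like (MyDeviceQueue and the queue() accessor are hypothetical placeholders for whatever your driver API exposes, not existing TensorFlow types):

```
#include "tensorflow/core/framework/device_base.h"

// Hypothetical handle to the hardware's command queue / stream; a placeholder
// for whatever the device driver actually provides.
struct MyDeviceQueue;

// A minimal device-specific context: it only carries the handle that kernels
// need in order to enqueue work on the device.
class MyDeviceContext : public tensorflow::DeviceContext {
 public:
  explicit MyDeviceContext(MyDeviceQueue* queue) : queue_(queue) {}

  MyDeviceQueue* queue() const { return queue_; }

 private:
  MyDeviceQueue* queue_;  // not owned
};
```

A kernel that is handed this context (via its OpKernelContext) would downcast to MyDeviceContext and enqueue its work on queue().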
The other component is code to handle copying between the device and the CPU. You are right that DeviceContexts don't exist for CPU, because CPU is the host device and doesn't need to be treated specially from this point of view.
But as an example, we can take a look at the code for GPU. tensorflow/core/common_runtime/gpu_device_context.h is an example of the GPU device context, which implements the DeviceContext interface. The implementation of the interface is here, which delegates to the code in the GPUUtil class for actually performing the memcopies. It happens to use the StreamExecutor framework to handle the underlying copies, but your own device would use whatever APIs you have for copying between host and device.
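Here is a rough sketch of what the copy hooks of such a device context could look like for a new device; it assumes the CopyCPUTensorToDevice / CopyDeviceTensorToCPU virtuals on DeviceContext that the GPU context overrides (signatures paraphrased from memory and they may differ across versions), and the my_device_dma functions are stand-ins for your real driver API:

```
#include <cstring>

#include "tensorflow/core/framework/device_base.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/lib/core/status.h"
#include "tensorflow/core/lib/core/stringpiece.h"

// Stand-ins for the real driver-level copy routines of the new hardware.
namespace my_device_dma {
inline tensorflow::Status CopyHostToDevice(const void* src, void* dst,
                                           size_t bytes) {
  std::memcpy(dst, src, bytes);  // placeholder only; use your DMA API here
  return tensorflow::Status::OK();
}
inline tensorflow::Status CopyDeviceToHost(const void* src, void* dst,
                                           size_t bytes) {
  std::memcpy(dst, src, bytes);  // placeholder only; use your DMA API here
  return tensorflow::Status::OK();
}
}  // namespace my_device_dma

class MyDeviceContext : public tensorflow::DeviceContext {
 public:
  // Called when the executor needs a host (CPU) tensor on the device, e.g.
  // to feed a kernel whose input was produced on the CPU.
  void CopyCPUTensorToDevice(const tensorflow::Tensor* cpu_tensor,
                             tensorflow::Device* device,
                             tensorflow::Tensor* device_tensor,
                             tensorflow::StatusCallback done) const override {
    done(my_device_dma::CopyHostToDevice(
        cpu_tensor->tensor_data().data(),
        const_cast<char*>(device_tensor->tensor_data().data()),
        cpu_tensor->TotalBytes()));
  }

  // Called for the reverse direction, e.g. fetching results back to the host.
  void CopyDeviceTensorToCPU(const tensorflow::Tensor* device_tensor,
                             tensorflow::StringPiece tensor_name,
                             tensorflow::Device* device,
                             tensorflow::Tensor* cpu_tensor,
                             tensorflow::StatusCallback done) override {
    done(my_device_dma::CopyDeviceToHost(
        device_tensor->tensor_data().data(),
        const_cast<char*>(cpu_tensor->tensor_data().data()),
        device_tensor->TotalBytes()));
  }
};
```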
If you want to copy from device to device, there's a special registration for this. An example for GPU to GPU is here for the implementation, and here for the registration of that function.
At some point we might refactor this to be a bit cleaner, so the registrations are all uniform (CPU to device, device to CPU, device to device). At the moment it's a bit ad hoc.
So it's going to be a work in progress at the moment to support other devices, but we are happy to work with you and others to flesh out this support.