Problem description
I have this config:
network = {"source_embed_raw": {"class": "linear", ...}}
I want to load the parameters for the layer source_embed_raw from an existing checkpoint. In that checkpoint, the parameter has a different name (output/rec/target_embed_raw/W).
I understand that I can load parameters with preload_from_files, but I am not sure exactly how to do that in my case: because the layer names differ, simply adding a prefix does not do the job.
Recommended answer
This is currently not possible with preload_from_files in this way. I currently see these possible options:
Option 1: We could extend the logic of preload_from_files (and CustomCheckpointLoader) to allow for something like that (some generic variable/layer name mapping).
Option 2: You could rename your layer from source_embed_raw to e.g. old_model__target_embed_raw and then use preload_from_files with the prefix option. If you do not want to rename it, you could still add a layer like old_model__target_embed_raw and then use parameter sharing in source_embed_raw.
If the parameter in the checkpoint is actually called something like output/rec/target_embed_raw/..., you could create a SubnetworkLayer named old_model__output, inside it another SubnetworkLayer named rec, and inside that a layer named target_embed_raw.
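As a rough illustration of the extra-layer variant, a config could look like the sketch below. It covers only the flat case, where the checkpoint variable is named target_embed_raw/W; the nested output/rec/... case would need the SubnetworkLayer nesting described above. The checkpoint path, n_out, the "from" input and the remaining layer options are placeholders and not taken from the question. The reuse_params option is assumed here to be the parameter-sharing mechanism meant in the answer, so check it against your RETURNN version.

```python
# Hypothetical path; replace with the real old checkpoint.
old_checkpoint = "/path/to/old_model/checkpoint"

network = {
    # Extra layer that exists only to own the variables loaded from the old
    # checkpoint. With the prefix "old_model__" stripped, its variable name
    # should match "target_embed_raw/W" in the checkpoint.
    "old_model__target_embed_raw": {
        "class": "linear",
        "activation": None,
        "with_bias": False,
        "n_out": 512,    # assumption: must match the shape stored in the checkpoint
        "from": "data",  # assumption: placeholder input
    },
    # The layer actually used by the model shares its parameters with the
    # layer above (assumed option name: reuse_params).
    "source_embed_raw": {
        "class": "linear",
        "activation": None,
        "with_bias": False,
        "n_out": 512,
        "from": "data",
        "reuse_params": "old_model__target_embed_raw",
    },
}

preload_from_files = {
    "old_model": {
        "filename": old_checkpoint,
        "prefix": "old_model__",   # only layers with this prefix are loaded from the file
        "init_for_train": True,    # assumption: load when training starts from scratch
    }
}
```

The prefix keeps the old parameters clearly separated from the rest of the network, at the cost of one otherwise unused layer in the config.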
Option 3: You could write a script that simply loads the existing checkpoint and stores it as a new checkpoint with renamed variable names (this is also totally independent of RETURNN).
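A renaming script along those lines might look like the following sketch. It assumes a TensorFlow checkpoint and uses the TF1-compatible API; the function names, the paths and the prefix mapping are made up for illustration. Because the values are embedded as initializers in a temporary graph, this is only suitable for checkpoints of moderate size.

```python
def rename_variable(name, prefix_map):
    """Return name with the first matching old prefix replaced by its new prefix."""
    for old_prefix, new_prefix in prefix_map.items():
        if name.startswith(old_prefix):
            return new_prefix + name[len(old_prefix):]
    return name  # variables without a matching prefix keep their name


def rename_checkpoint(src, dst, prefix_map):
    """Copy the checkpoint at src to dst, renaming variables via prefix_map."""
    import tensorflow as tf  # imported lazily so rename_variable works without TF

    tf1 = tf.compat.v1
    reader = tf.train.load_checkpoint(src)
    with tf1.Graph().as_default(), tf1.Session() as session:
        new_vars = []
        for old_name in sorted(reader.get_variable_to_shape_map()):
            value = reader.get_tensor(old_name)  # numpy array with the stored value
            new_name = rename_variable(old_name, prefix_map)
            new_vars.append(tf1.Variable(value, name=new_name))
        session.run(tf1.global_variables_initializer())
        tf1.train.Saver(new_vars).save(session, dst)


# Example call (hypothetical paths):
# rename_checkpoint(
#     "/path/to/old_model/checkpoint",
#     "/path/to/renamed/checkpoint",
#     {"output/rec/target_embed_raw/": "source_embed_raw/"},
# )
```

The renamed checkpoint can then be loaded with preload_from_files without any prefix tricks, since the variable names already match the new layer names.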
Option 4: LinearLayer (and most other layers) lets you specify exactly how the parameters are initialized (forward_weights_init and bias_init). The parameter initialization is quite flexible; e.g. there is something like load_txt_file_initializer that can be used. Currently there is no such function to load the value directly from an existing checkpoint, but we could add that. Or you could simply implement the logic inside your config (it would only be about 5 lines of code).
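The checkpoint-reading half of that config logic could be sketched as below. The helper name load_checkpoint_param is hypothetical, and the reader is expected to be the object returned by tf.train.load_checkpoint. How the resulting array is then handed to forward_weights_init depends on which initializer forms your RETURNN version accepts, so that step is deliberately left out.

```python
def load_checkpoint_param(reader, var_name):
    """Return the stored value of var_name from a checkpoint reader.

    reader is assumed to behave like the object returned by
    tf.train.load_checkpoint(path), i.e. it offers has_tensor and get_tensor.
    """
    if not reader.has_tensor(var_name):
        raise KeyError("variable %r not found in checkpoint" % var_name)
    return reader.get_tensor(var_name)


# Usage inside a config (hypothetical path):
# import tensorflow as tf
# reader = tf.train.load_checkpoint("/path/to/old_model/checkpoint")
# embed_value = load_checkpoint_param(reader, "output/rec/target_embed_raw/W")
```

Failing early with a KeyError makes a misspelled variable name visible at config load time rather than later during training.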
Option 5: Instead of using preload_from_files, you could also use SubnetworkLayer with the load_on_init option, and then apply similar logic as in option 2.