Problem description

Q: When git pushes refs that have no common history over the Smart Protocol, can it consider root or sub-trees already in common between local and `origin` when building the thin-pack to send?

tl;dr

Consider this (uncommon) situation when working with and pushing to a remote Git repository.

- `master` points to a tree with 1110 descendant sub-trees `a[0-9]/b[0-9]/c[0-9]`.
- `origin/master` is current with the local `master` commit, i.e. identical histories. It uses the `ssh` protocol.
- A branch `squashed`. I set that branch to a new, single root commit, but with the same content/tree as `master`. This can be done with `git commit-tree`. So this branch has a single commit with no commits in common with `master`, but the root tree hash is identical: it points to the same tree object as `master` and `origin/master`. It is not important for this discussion that this is a single/squashed commit; any history rewritten back to the root commit, with no common history, will do.
- `git push origin HEAD # push squashed`

From observations of the performance of this with a large repository, and the number of objects sent, I suspect that `push`, `send-pack` and `receive-pack` and the associated thin-pack negotiation over the Smart Protocol do something like:

- `squashed` has no common history with any commit `origin` currently has.
- `squashed` points to a tree that is not only in `origin`, but is the tree for a current `HEAD` ref.

In this case the trees are identical. If a subsequent change is made in `squashed` (either an additional commit, or a new squash that changes a file in `a0`), 2 trees (`/` and `a0`) would have changed, and the other 1109 would be unchanged. The root tree has changed, which means a next-level search would be required to see whether it is worth searching for further common sub-trees. This might require a heuristic, because without comparing all sub-trees down to the leaves it is not possible to infer the number of descendant trees in common from the trees at any particular depth.

Of course, if there are multiple commits in the nothing-in-common history being pushed, this negotiation would need to be repeated for each commit.

Does it sound reasonable that the Smart API could consider already-held common sub-trees, or at the very least the root tree, as it considers each commit? Or should Git already be doing this, and there is something wrong with my client or server?

Version: `git version 2.8.2`

Answer

Checking git's source and trying it with git daemon and GIT_TRACE_PACKET says you're correct about what it's doing: git negotiates at the commit level only. If the history isn't shared, git won't detect the shared content.

If the already-held common subtrees can't be identified by already-held common commits, then to identify those subtrees it'd have to send their ids.

The thing is, for anything short of a complete readout, I can construct a plausible-sounding corner case that sends an arbitrarily large amount of redundant data, but sending every existing subtree id every time to avoid that possibility is clearly a huge loss. Don't forget that round-trip latency is horrendously expensive. So, taking the added overhead across all fetches in the aggregate, at what point do you become likely to spend more time negotiating than you save? If you're going to argue that some particular alternate method would save time overall, you're going to have to show up with hard data on actual production traffic.

Also remember that you can construct packs yourself. It's not hard: you feed object ids to `git pack-objects pack` and drop the output into `.git/objects/pack`, and congratulations, you've just fetched exactly those objects into that repo.
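The situation in the question can be reproduced on a small scale. The following is only a sketch under assumptions: the directory layout, file names, identity settings and commit messages are made up for illustration, and a handful of directories stand in for the 1110 sub-trees. It creates a root commit with `git commit-tree` that reuses the tree of `master`, then checks that the two branches share a root tree but no commit.

```shell
#!/bin/sh
set -e

# Throwaway repository; all names below are illustrative assumptions.
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "demo@example.invalid"
git config user.name "Demo"

# A small stand-in for the a[0-9]/b[0-9]/c[0-9] layout.
for a in a0 a1; do
  for b in b0 b1; do
    mkdir -p "$a/$b/c0"
    echo "$a/$b" > "$a/$b/c0/file.txt"
  done
done
git add .
git commit -q -m "initial content"

# New root commit (no -p parent) that points at the tree master already has.
tree=$(git rev-parse 'master^{tree}')
squashed_commit=$(git commit-tree "$tree" -m "squashed root commit")
git branch squashed "$squashed_commit"

master_tree=$(git rev-parse 'master^{tree}')
squashed_tree=$(git rev-parse 'squashed^{tree}')

# git merge-base exits non-zero when the two histories share no commit.
if git merge-base master squashed >/dev/null; then
  common=yes
else
  common=no
fi

echo "master tree:    $master_tree"
echo "squashed tree:  $squashed_tree"
echo "common commit:  $common"
```

With identical trees and no merge base, commit-level negotiation has nothing to report as shared, which is the case the answer describes.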
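The hand-built pack mentioned at the end of the answer might look roughly like this. It is a sketch, not what `git push` does: the repository names (`origin.git`, `local`), the file layout and the scratch files are assumptions, and the "what is missing" step is a deliberately naive complete readout of both sides, which is exactly the cost the answer warns about for a network protocol. Here both repositories are on the same disk, so the readout is cheap.

```shell
#!/bin/sh
set -e
LC_ALL=C; export LC_ALL   # stable ordering for sort and comm

# Illustrative setup: a bare "remote" and a local repository next to it.
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.invalid"
git config user.name "Demo"
mkdir -p a0/b0/c0
echo hello > a0/b0/c0/file.txt
git add .
git commit -q -m "initial content"
git remote add origin ../origin.git
git push -q origin master

# Root commit with the same tree as master, sharing no history with it.
squashed_commit=$(git commit-tree 'master^{tree}' -m "squashed root commit")

# Complete readout of both sides: every object id reachable from the new
# commit, and every object id reachable from any ref the remote has.
git rev-list --objects "$squashed_commit" | cut -d' ' -f1 | sort > ../wanted.txt
git -C ../origin.git rev-list --objects --all | cut -d' ' -f1 | sort > ../have.txt
comm -23 ../wanted.txt ../have.txt > ../missing.txt
missing_count=$(wc -l < ../missing.txt | tr -d ' ')

# Feed the missing ids to pack-objects; it prints the hash used in the
# pack file names, and writes pack-<hash>.pack and pack-<hash>.idx.
pack_hash=$(git pack-objects -q pack < ../missing.txt)
mkdir -p ../origin.git/objects/pack
mv "pack-$pack_hash.pack" "pack-$pack_hash.idx" ../origin.git/objects/pack/

# The objects are now present in the remote; point a branch at the commit.
git -C ../origin.git update-ref refs/heads/squashed "$squashed_commit"

echo "objects packed: $missing_count"
```

In this setup only the new commit object should need packing, since every tree and blob it references is already reachable from `master` in `origin.git`.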