Problem description
We have a git repo that is quite large (iOS app resources). I appreciate that git is going to be slow when working with it, but if I create a new branch, edit a couple of files (not binary ones) and push, it takes forever.
It feels like the entire repo is being pushed. I was under the impression that git would only send the diff, is that wrong? (I know git stores compressed versions of the whole file, I mean the diff between my branch and where I branched from).
If I run
    git diff --stat --cached origin/foo
then I see a short list of files that looks like what I would expect, e.g. "34 files changed, 1117 insertions(+), 72 deletions(-)". But when I push, it gets to "Writing objects: 21% (2317/10804)" and grinds to a halt, as if it's pushing all 2.4GB of binary data.
Am I missing something (I've googled it pretty hard)? Is this the expected behaviour? I'm using git 2.2.2 on OS X (Mavericks), and ssh ([email protected]).
I found a similar question here: "Git - pushing a remote branch for a large project is really slow", but it has no real answers.
Solution
You're using a "smart" transport (this is a good thing), so you do get deltas, or more specifically, "delta compression". But that's not to say that git pushes diffs.
Both push and fetch work the same way here: on a smart transport, your git calls up the remote and both ends have a mini conversation to figure out who has which repository objects, identified by SHA-1 and attached to specific labels (typically branch and tag names, although other labels are allowed as well).
For instance, in this case, your git calls up theirs and says: "I propose to have you set your branch master to SHA-1 1234567.... I see that your master is currently 333333..., here's what I think you need to get from there to 1234567...." Theirs should reply with "OK, I need some of those but I already have ...". Once your git has figured out what needs to be sent, and what is already present, your git builds a "thin pack" containing all the to-be-sent objects. (This is the "Delta compression using up to %d threads" phase.)
The resulting thin pack is then sent over the smart transport; this is where you see the "Writing objects" messages. (The entire thin pack must be sent successfully, after which the receiver "fattens it up" again using git index-pack --fix-thin and drops it into the repository.)
Exactly what data is sent depends on the objects in the thin pack. That should be just the set of commits between "what they have" and "what you're sending", plus any objects (trees and blobs) needed for those commits, plus any annotated tags you're sending and any objects needed for those, that they don't already have.
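To get a rough idea of how much data such a pack would contain before actually pushing, you can ask git to build a comparable thin pack locally and count its bytes. This is only a sketch: outgoing_pack_bytes is a hypothetical helper name, and the result approximates what git push would send rather than reproducing the push negotiation exactly.

```shell
# Sketch: approximate the size of the thin pack a push would transfer.
# outgoing_pack_bytes is a hypothetical helper name, not a git command.
#   $1: a ref the remote already has (e.g. origin/master)
#   $2: the ref you intend to push   (e.g. master)
outgoing_pack_bytes() {
    # Feed "include $2, exclude ^$1" to pack-objects as revision arguments,
    # write the resulting thin pack to stdout, and count the bytes.
    printf '%s\n^%s\n' "$2" "$1" |
        git pack-objects --revs --thin --stdout -q |
        wc -c
}
```

After a git fetch origin, running outgoing_pack_bytes origin/master master prints a byte count. A figure in the gigabytes for a handful of small text edits suggests that the outgoing commits carry a large blob somewhere.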
You can find the commits in question by using git fetch to pick up their latest information, then using git rev-list to see what commits you'd send them. For instance, if you're just going to push things on master:
    $ git fetch origin   # assuming the remote name is origin
    [wait for it to finish]
    $ git rev-list origin/master..master
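The commit list alone does not say which objects are large. As a minimal sketch, assuming the remote is called origin as above, the objects reachable from those commits can be listed with their sizes and sorted. The function name largest_outgoing_objects is made up for illustration, and the object list is only roughly what a push would send.

```shell
# Sketch: list the largest objects in the commits that would be pushed.
# largest_outgoing_objects is a hypothetical helper name.
#   $1: revision range, e.g. origin/master..master
#   $2: how many entries to show (default 10)
largest_outgoing_objects() {
    # rev-list --objects prints "<sha> <path>" per object; cat-file
    # replaces the sha with type and size and keeps the path in %(rest).
    git rev-list --objects "$1" |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sort -k2,2nr |
        head -n "${2:-10}"
}
```

For example, largest_outgoing_objects origin/master..master 5 prints the five biggest objects, one per line, as type, size in bytes, and path.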
Examining these commits may show a very large binary file that is contained in one of the middle ones, then removed again in a later commit:
$ git log --name-status origin/master..master
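On a long range, reading that output by eye gets tedious. The following sketch lists paths that are both added and deleted somewhere within the outgoing commits. It assumes a linear history (git log shows no file status for merges by default), and it does not check the order of the two events, so a file deleted and later re-added is reported as well. The name added_then_deleted is hypothetical.

```shell
# Sketch: paths that are both added (A) and deleted (D) within a range.
# added_then_deleted is a hypothetical helper name.
#   $1: revision range, e.g. origin/master..master
added_then_deleted() {
    {
        # One de-duplicated list of added paths, one of deleted paths.
        git log --diff-filter=A --name-only --pretty=format: "$1" | sort -u
        git log --diff-filter=D --name-only --pretty=format: "$1" | sort -u
    } |
        sed '/^$/d' |   # drop the blank separator lines
        sort |
        uniq -d         # keep only paths present in both lists
}
```

Any path this prints is absent from the final tree of the range, yet its blob is still part of the history being pushed.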
If one commit has
    A giantfile.bin
and then a subsequent (probably listed first in git log output) commit has
    D giantfile.bin
you're probably getting hung up sending the blob for giantfile.bin.
If that's the case, you can use git rebase -i to eliminate the commit that adds the giant binary file, so that git push won't have to send that commit.
(If your history is linear, i.e. has no merges to push, then you can also, or instead, use git format-patch to create a series of email messages that contain patches. These are suitable for emailing to someone at the other site. Not that there's someone at GitHub waiting to receive them, but you can easily examine the patch files to see if any of them are enormous.)
The pack is "thin" in that it violates a normal pack-file rule that requires any delta-compression "downstream" object to be in the pack itself. Instead, the "downstream" objects can (in fact, must) be in the repository receiving the thin pack.