Problem description
I have somehow deeply borked my entire repository (used only by me) and could use some assistance in sorting it out.
Here is what I did. I realized that in my commit history, there were some files containing credentials that I did not want just lying around. So, I decided to be legit and try to use the BFG Repo-Cleaner to fix these issues. I added all the credential files to .gitignore, and moved on to trying to scrub them out of the history. As per the documentation instructions, I executed these commands:
git clone --mirror myrepo.git
java -jar bfg.jar --delete-files stuffthatshouldbedeleted.txt myrepo.git
At this point, BFG told me that x number of files had been found and removed. Sweet.
cd myrepo.git
git reflog expire --expire=now --all
git gc --prune=now --aggressive
git push
According to the terminal logs, it updated the repo. So far so good, right? I pop into my GitHub account, and after a few clicks, find the credentials still there, file and all, in my history. I go back and try the same set of commands, but using this line instead of the file remover:
java -jar bfg.jar --replace-text passwords.txt myrepo.git
where passwords.txt is a file containing string instances of all the credentials I would like gone. Again, BFG logs indicate that there are several instances that it has fixed. I push up, check, and the credentials are still there, sitting in GitHub. I notice that the SHA-1 hashes for all of my commits have been altered, so presumably BFG did something, just not the thing I want it to do.
At this point, I give up and try to get back to work, figuring I'll sort it out later. I do some work, try to push up, and get a weird merge conflict (you are 50 ahead and 50 behind on commits). What? I try to pull and merge, and suddenly, every single commit in my git history is duplicated in name, and some of them are just blank. I check my GitHub network graph, and it looks like there is a second branch starting from my initial commit that exactly mirrors all of my commits and has been zippered in with my last commit (I have never branched, just been linearly chugging along).
I can't revert to a previous commit, because they are all chronologically duplicated. My credentials are still in there, with twice as many instances now, and my history is doubled and very confusing to try to understand. When I try to run BFG from the beginning now, cloning and mirroring the repo anew, it tells me that there are no credentials in it, despite the fact that I can see them in GitHub. I could really use some help in understanding what happened, and how, if at all, I can get back to a state of things again.
I am considering just deleting the entire repo and starting anew. I really don't want to do that.
tl;dr: Tried using BFG, somehow duplicated half-baked versions of all commits in my repo, can't untangle, and to add insult to injury, BFG did nothing and claims it's done its job.
I'm the author of the BFG. I'll try to describe what I think happened step-by-step based on your account:
The pre-BFG manual cleaning...
First you:
- threw all the credentials in .gitignore, and moved on to trying to scrub them out of the history
This description of your actions omits two essential steps:
- Manually deleting the credentials from your current file-tree, and committing that change to your repo. If you didn't do this, the BFG would have eradicated the content from your old commits, but protected the dirt in your current commits. This behaviour is covered in the BFG documentation under the section titled 'Your current files are sacred...', and if you forget to do it, the BFG prints a warning message when you run it ("WARNING: The dirty content above may be removed from other commits, but as the protected commits still use it, it will STILL exist in your repository..." etc, etc). Did you see that message when you ran the BFG?
- Pushing that commit up to your GitHub repository before you clone the full mirror of your repository. Did you forget that step?
If you didn't do those things, that would account for your credentials not being fully scrubbed from your repository.
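As a rough illustration of those two steps, here is a minimal sketch run against throwaway repositories. Everything in it is hypothetical: the temp directory, the local bare repo standing in for GitHub, the file names and the fake password. The point it shows is that adding a file to .gitignore does not stop Git tracking it; the file has to be removed from the index, committed, and pushed before the mirror clone is taken.

```shell
#!/bin/sh
# Sketch only: all paths, file names and the fake secret are made up.
set -eu
tmp=$(mktemp -d)

# A local bare repository stands in for the GitHub repository.
git init -q --bare "$tmp/remote.git"
git -C "$tmp/remote.git" symbolic-ref HEAD refs/heads/master

git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git remote add origin "$tmp/remote.git"

# History that contains a credentials file.
echo "password=hunter2" > credentials.txt
echo "some code" > app.txt
git add .
git commit -q -m "Initial commit (with credentials)"
git push -q origin master

# Step 1: ignore the file AND remove it from the current tree, then commit.
echo "credentials.txt" >> .gitignore
git rm -q --cached credentials.txt      # keeps the file on disk, untracks it
git add .gitignore
git commit -q -m "Stop tracking credentials.txt"

# Step 2: push that commit before making the mirror clone.
git push -q origin master

tracked=$(git ls-files credentials.txt)   # empty once the file is untracked
in_history=$(git log --oneline --all -- credentials.txt | wc -l | tr -d ' ')
echo "tracked now: '$tracked'; commits touching the file: $in_history"
```

The old commits still contain the file at this point, which is what the history rewrite is for; the latest commit, which the BFG protects, is now clean.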
Running BFG for the first time...
Moving on, then you:
- made a fresh mirror clone of your repo from GitHub
- ran the BFG, filtering using the --delete-files option (did you see a protected-content warning?)
- pushed the updated repository to GitHub
...at which point you found the credentials still there, file and all, in your history.
So, assuming you did correctly manually remove your bad content from your latest commits before running the BFG, what you saw is fairly weird. Some possible causes:
a) The repository wasn't cloned with the --mirror flag, so not all branches on GitHub were overwritten, leaving dirty history around in non-master branches. However, you've explicitly stated that you used the --mirror flag.
b) Even with a mirror push to GitHub, old commits are still available there when referenced by explicit commit-id (i.e. a GitHub URL that has the commit-id in it), up until the point GitHub runs its automatic garbage-collection on your repository. Pull-requests and forks can also preserve commits from the old history. That would be another possible explanation for the dirty commits you saw.
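To tell these cases apart locally, one option is to ask Git which commits reachable from any ref still touch the file or contain the string. The sketch below builds a throwaway repository (the branch name, file names and fake password are all invented) where the secret lives only on a non-master branch. Note the limitation: this only inspects commits reachable from refs in the clone you run it in, so it cannot see unreferenced commits that GitHub is still serving by commit-id.

```shell
#!/bin/sh
# Sketch only: repository layout, branch name and the fake secret are made up.
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "some code" > app.txt
git add app.txt
git commit -q -m "Add app"

# The secret exists only on a side branch.
git checkout -q -b experiment
echo "password=hunter2" > credentials.txt
git add credentials.txt
git commit -q -m "Add credentials"
git checkout -q master

# Commits reachable from ANY ref that touch the file.
dirty_commits=$(git log --all --format=%H -- credentials.txt)

# Refs that can still reach those commits.
refs=$(for c in $dirty_commits; do
  git for-each-ref --contains "$c" --format='%(refname:short)'
done)

# Search the content of every reachable commit for the string itself.
hits=$(git grep -l "hunter2" $(git rev-list --all) | wc -l | tr -d ' ')

echo "refs still reaching dirty commits: $refs"
echo "commit:file pairs containing the string: $hits"
```

If both checks come back empty in a fresh mirror clone while the content is still visible on GitHub through a commit URL, that points towards explanation (b) rather than (a).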
Running BFG for the second time...
In any case, at that point you were concerned, and:
- ran the BFG again, this time with --replace-text passwords.txt, which updates file contents rather than deleting the entire file.
It's a little curious that the BFG said that there was more content to clean away (possibly your credentials were in more places than you thought), but in any case, whatever the cause was for your seeing them still around after the first run is the same reason you saw them around after the second run.
Going back to work
So, at this point you've rewritten your Git repository history (twice!) and pushed it up to GitHub. But your account does not mention you deleting all your local old copies of the repo, as specified in the BFG instructions:
"At this point, you're ready for everyone to ditch their old copies of the repo and do fresh clones of the nice, new pristine data."
So, did you delete your old working copy of the Git repo on your work machine, and re-clone with the new Git repository history? The history in your old repo would have been different to the 'cleaned' history which would have been present in GitHub at that point (even if the 'cleaned' history was not as 'cleaned' as you would have liked it!).
If you were doing the work in an old local copy of your Git repo (rather than a fresh re-clone from GitHub), then this is what you would see. You are essentially pushing up 50 commits of old, dirty history to GitHub, and to Git you seem blissfully unaware that there are 50 completely-different (to Git, which cares only about commit-ids here) commits on that branch already. Git thinks what you're doing is a bit weird ('50 ahead and 50 behind') and is trying to tell you that.
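The 'ahead and behind' numbers can be reproduced on a small scale. In this sketch a second repository pushes rewritten commits with --force, standing in for the BFG run and mirror push (the BFG itself is not run here, and every path, file name and secret is made up). The old working copy then compares its branch with the remote one using git rev-list --left-right --count.

```shell
#!/bin/sh
# Sketch only: a force-push from a second repo stands in for the BFG rewrite.
set -eu
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"
git -C "$tmp/remote.git" symbolic-ref HEAD refs/heads/master

# Hypothetical helper: a repo on master with an identity and the shared remote.
new_repo() {
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/master
  git -C "$1" config user.name "Example User"
  git -C "$1" config user.email "user@example.com"
  git -C "$1" remote add origin "$tmp/remote.git"
}

# Old working copy: three commits of dirty history, pushed.
new_repo "$tmp/old"
cd "$tmp/old"
for i in 1 2 3; do
  echo "change $i, password=hunter2" >> file.txt
  git add file.txt
  git commit -q -m "Commit $i"
done
git push -q origin master

# "Cleaned" history: same messages, different content, therefore new commit ids.
new_repo "$tmp/cleaned"
cd "$tmp/cleaned"
for i in 1 2 3; do
  echo "change $i, password=***REMOVED***" >> file.txt
  git add file.txt
  git commit -q -m "Commit $i"
done
git push -q --force origin master

# Back in the old working copy, which was never re-cloned.
cd "$tmp/old"
git fetch -q origin
counts=$(git rev-list --left-right --count master...origin/master)
ahead=$(echo "$counts" | cut -f1)
behind=$(echo "$counts" | cut -f2)
if git merge-base master origin/master >/dev/null; then shared=yes; else shared=no; fi
echo "ahead $ahead, behind $behind, common ancestor: $shared"
```

Every old commit counts as 'ahead' and every rewritten commit as 'behind', because as far as commit-ids go the two histories have nothing in common.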
Making things worse...
So, by doing the pull and merge, you've joined together the cleaned history and the dirty history, unifying them with a merge commit. In terms of sorting your history out, this is a bad idea. A better idea would have been to rebase your new work on top of the cleaned history, push it, delete your old working repo, and do a fresh clone.
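For what the rebase route might look like, here is a sketch under the same kind of made-up setup: a local bare repo in place of GitHub, a force-push in place of the BFG rewrite, and hypothetical file names. The variable old_tip is an assumption of the sketch: it marks the last commit of the old history, so that git rebase --onto replays only the work done after it.

```shell
#!/bin/sh
# Sketch only: a force-push from a second repo stands in for the BFG rewrite.
set -eu
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"
git -C "$tmp/remote.git" symbolic-ref HEAD refs/heads/master

# Hypothetical helper: a repo on master with an identity and the shared remote.
new_repo() {
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/master
  git -C "$1" config user.name "Example User"
  git -C "$1" config user.email "user@example.com"
  git -C "$1" remote add origin "$tmp/remote.git"
}

# Old working copy with dirty history.
new_repo "$tmp/old"
cd "$tmp/old"
for i in 1 2; do
  echo "change $i, password=hunter2" >> file.txt
  git add file.txt
  git commit -q -m "Commit $i"
done
git push -q origin master
old_tip=$(git rev-parse master)     # last commit of the pre-rewrite history

# Rewritten ("cleaned") history replaces it on the remote.
new_repo "$tmp/cleaned"
cd "$tmp/cleaned"
for i in 1 2; do
  echo "change $i, password=***REMOVED***" >> file.txt
  git add file.txt
  git commit -q -m "Commit $i"
done
git push -q --force origin master

# New work done in the old copy after the rewrite.
cd "$tmp/old"
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "New work"

# Replay only the commits after old_tip on top of the cleaned history.
git fetch -q origin
git rebase -q --onto origin/master "$old_tip" master
counts=$(git rev-list --left-right --count master...origin/master)
echo "ahead/behind after rebase: $counts"

git push -q origin master           # a plain fast-forward, no force needed
```

After the rebase the branch is one commit ahead and none behind, so no merge commit is created and the dirty commits never return to the remote.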
The aftermath
You also report that a fresh BFG run now finds no credentials, even though you can still see them on GitHub. This is pretty weird, and beyond the 'GitHub gc' explanation already given above I don't really have any explanation for it other than operator error. You can share the repository with me (if you like) so I can perform a more detailed inspection, or just send me a zipped copy of the '.bfg-report' directory so I can see what diagnostics the BFG captured on its execution.
Recovery
I hope I've managed to explain some of what's happened.
In terms of sorting out your history (i.e. getting rid of these two duplicate strands), you need to reset your Git history back to the (cleaned) point before you added in that merge commit. Look at the merge commit, and identify which parent history you prefer. What's the last commit (xxxx) in that history before you did the merge? With master checked out, reset to it:
git reset --hard xxxx
This may well lose the last bit of work you did on your old, dirty history. Identify that commit (yyyy), and rebase it on top of your history, or just cherry-pick it:
git cherry-pick yyyy
Finally, push your recovered history up to GitHub with the 'force' flag:
git push origin master -f
...zip an archive of your old repo, and then delete all old local copies of your repo to save yourself further confusion. Do a fresh clone.
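If it helps to see the recovery end to end, the sketch below recreates the tangle in a throwaway repository and then undoes it. The histories, file names and fake password are invented; the shell variables merge and new_work play the roles of the merge commit and yyyy, and the merge's first parent plays the role of xxxx. That holds here only because the merge was made while on the cleaned branch, so check which parent is which in your own repo. The force-push is left out because the sketch has no remote.

```shell
#!/bin/sh
# Sketch only: recreates a merge of cleaned and dirty history, then recovers.
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Cleaned history on master: no credentials file.
echo "some code" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Dirty history with no shared root, plus one commit of new work.
git checkout -q --orphan dirty
echo "password=hunter2" > credentials.txt
git add .
git commit -q -m "Initial commit"
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "New work"
new_work=$(git rev-parse HEAD)      # plays the role of yyyy

# The mistake: merging the dirty history into the cleaned one.
git checkout -q master
git merge -q --allow-unrelated-histories -m "Merge dirty history" dirty
merge=$(git rev-parse HEAD)

# Inspect the merge commit's parents to decide which history to keep.
clean_parent=$(git rev-parse "$merge^1")   # history we were on when merging
dirty_parent=$(git rev-parse "$merge^2")   # history that was merged in

# Recovery: reset to the cleaned parent, then re-apply the new work.
git reset -q --hard "$clean_parent"
git cherry-pick "$new_work" >/dev/null

merges_left=$(git rev-list --count --merges HEAD)
echo "merge commits left on master: $merges_left"
```

git log --graph --oneline --all is a convenient way to eyeball the two strands before choosing the parent.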