Question

The background: I'm moving closer to open-sourcing some personal research code that I've been working on for more than two years. It started life as an SVN repository, but I moved to Git about a year ago, and I'd like to share the code on GitHub. However, it has accumulated a lot of cruft over the years, and I'd prefer that the public version begin its life in its current state. I'd still like to contribute to it and incorporate other people's potential contributions.

The question: is there a way to "fork" a Git repository so that no history is retained on the fork (which lives on GitHub), while my local repository still has the complete history and I can pull from and push to GitHub?

I don't have any experience on the administration side of large repositories, so detail is very much appreciated.

Solution

You can create a new, fresh history quite easily in Git. Let’s say you want your master branch to be the one that you will push to GitHub, and your full history to be stored in old-master. You can just move your master branch to old-master, and then start a fresh new branch with no history using git checkout --orphan:

git branch -m master old-master
git checkout --orphan master
git commit -m "Import clean version of my code"

Now you have a new master branch with no history, which you can push to GitHub. But, as you say, you would like to be able to see all of the old history in your local repository; and would probably like for it to not be disconnected.

You can do this using git replace. A replacement ref is a way of specifying an alternate commit any time Git looks at a given commit. So you can tell Git to look at the last commit of your old branch, instead of the first commit of your new branch, when looking at history. For this to work, the old history must be in the same repository as the new branch; after the commands above it already is, as old-master. Note that git replace resolves both names to commit IDs when it runs, so run it while master still points at the import commit.

git replace master old-master

Now you have your new branch, in which you can see all of your history, but the actual commit objects are disconnected from the old history, and so you can push the new commits to GitHub without the old commits coming along. Push your master branch to GitHub, and only the new commits will go to GitHub. But take a look at the history in gitk or git log, and you'll see the full history.

git push github master:master
gitk --all
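
To see the effect without touching a real project, the sequence above can be replayed in a throwaway sandbox. The following is a minimal sketch, not part of the original answer: the directory names, the three dummy commits, and the local bare repository standing in for GitHub are all assumptions made for illustration.

```shell
set -eu
# Throwaway sandbox; every path and name here is illustrative.
work=$(mktemp -d)
git init -q "$work/local"
git init -q --bare "$work/github.git"   # local stand-in for the GitHub remote
cd "$work/local"
git symbolic-ref HEAD refs/heads/master # make sure the first branch is "master"
git config user.name "Example"
git config user.email "example@example.com"

# Build three commits of "old" history.
for i in 1 2 3; do
  echo "rev $i" > code.txt
  git add code.txt
  git commit -q -m "old commit $i"
done

# The steps from the answer.
git branch -m master old-master
git checkout -q --orphan master
git commit -q -m "Import clean version of my code"
git replace master old-master
git remote add github "$work/github.git"
git push -q github master:master

# Commit counts: local view, local view with replacements off, and remote.
local_count=$(git rev-list --count master)
real_count=$(git --no-replace-objects rev-list --count master)
remote_count=$(git --git-dir="$work/github.git" rev-list --count master)
echo "local view: $local_count, real: $real_count, remote: $remote_count"
```

The three numbers are the commit count as git log would present it locally (replacement applied), the count with replacements disabled via --no-replace-objects, and the count in the stand-in remote. The last two should match, since only the real, parentless commit is transferred.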

Gotchas

If you ever base any new branches on the old commits, you will have to be careful to keep the history separate; otherwise, new commits on those branches will really have the old commits in their history, and so you'll pull the whole history along if you push it up to GitHub. As long as you keep all of your new commits based on your new master, though, you'll be fine.

If you ever run git push --tags github, that will push all of your tags, including old ones, which will cause all of your old history to be pulled along with it. You could deal with this by deleting all of your old tags (git tag -d $(git tag -l)), or by never using git push --tags but only ever pushing tags manually, or by using two repositories as described below.
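
Both gotchas come down to a ref whose real ancestry reaches into the old history. One way to check a ref before pushing it is to ask Git, with replacements switched off, whether it shares any ancestor with old-master. The helper name shares_old_history and the branch and tag names below are hypothetical; this is a sketch run in a scratch repository, not a vetted pre-push hook.

```shell
set -eu
# Scratch repository; all names are illustrative.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

for i in 1 2 3; do
  echo "rev $i" > code.txt
  git add code.txt
  git commit -q -m "old commit $i"
done
git tag v0.1                      # a tag sitting on the old history

git branch -m master old-master
git checkout -q --orphan master
git commit -q -m "Import clean version of my code"
git replace master old-master

git branch feature master         # based on the new history
git branch hotfix old-master~1    # based on the old history

# Succeeds if the ref's real (unreplaced) ancestry meets old-master.
shares_old_history() {
  git --no-replace-objects merge-base "$1" old-master >/dev/null 2>&1
}

report=""
for ref in master feature hotfix v0.1; do
  if shares_old_history "$ref"; then
    report="$report $ref:unsafe"
  else
    report="$report $ref:safe"
  fi
done
echo "$report"
```

A ref reported as unsafe would carry old commits along if pushed; one reported as safe has no real connection to old-master, even though git log shows the spliced history.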

The basic problem underlying both of these gotchas is that if you ever push any ref which connects to any of the old history (other than via the replaced commits), you will push up all of the old history. Probably the best way of avoiding this is by using two repositories, one which contains only the new commits, and one which contains both the old and new history, for the purpose of inspecting the full history. You do all of your work, your committing, your pushing and pulling from GitHub, in the repository with just the new commits; that way, you can't possibly accidentally push your old commits up.

You then pull all of your new commits into your repository that has the full history, whenever you need to look at the entire thing. You can either pull from GitHub or your other local repository, whichever is more convenient. It will be your archive, but to avoid accidentally publishing your old history, you don't ever push to GitHub from it. Here's how you can set it up:

~$ mkdir newrepo
~$ cd newrepo
newrepo$ git init
newrepo$ git pull ~/oldrepo master
# Now newrepo has just the new history; we can set up oldrepo to pull from it
newrepo$ cd ~/oldrepo
oldrepo$ git remote add newrepo ~/newrepo
oldrepo$ git remote update
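# Git 1.8.0 and later; older versions spell this: git branch --set-upstream master newrepo/master
oldrepo$ git branch --set-upstream-to=newrepo/master master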
# ... do work in newrepo, commit, push to GitHub, etc.
# Now if we want to look at the full history in oldrepo:
oldrepo$ git pull
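
The two-repository arrangement can be tried end to end in a sandbox before committing to it. This sketch assumes temporary directories in place of ~/oldrepo and ~/newrepo, three dummy old commits, and one dummy new commit; none of these come from the original answer.

```shell
set -eu
work=$(mktemp -d)

# oldrepo: full history plus the clean master, prepared as described earlier.
git init -q "$work/oldrepo"
cd "$work/oldrepo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
for i in 1 2 3; do
  echo "rev $i" > code.txt
  git add code.txt
  git commit -q -m "old commit $i"
done
git branch -m master old-master
git checkout -q --orphan master
git commit -q -m "Import clean version of my code"
git replace master old-master

# newrepo: pulls only the real master commit, then gains new work.
git init -q "$work/newrepo"
cd "$work/newrepo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
git pull -q "$work/oldrepo" master
echo "rev 4" > code.txt
git commit -q -am "first new commit"
new_count=$(git rev-list --count master)

# oldrepo: track newrepo and pull the new work into the archive.
cd "$work/oldrepo"
git remote add newrepo "$work/newrepo"
git remote update >/dev/null 2>&1
git branch --set-upstream-to=newrepo/master master >/dev/null
git pull -q
old_count=$(git rev-list --count master)
echo "newrepo: $new_count commits, oldrepo: $old_count commits"
```

If the setup behaves as described, newrepo reports only the import commit and the new one, while oldrepo reports the longer spliced history that also includes the old commits.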

If you're on Git older than 1.7.2

You don't have git checkout --orphan, so you'll have to do it manually by creating a fresh repository from the current revision of your existing repository, and then pulling in your old disconnected history. You can do this with, for example:

oldrepo$ mkdir ~/newrepo
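# git archive exports HEAD with its directory layout intact and copes with spaces in file names
oldrepo$ git archive HEAD | tar -xf - -C ~/newrepo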
oldrepo$ cd ~/newrepo
newrepo$ git init
newrepo$ git add .
newrepo$ git commit -m "Import clean version of my code"
newrepo$ git fetch ~/oldrepo master:old-master

If you're on Git older than 1.6.5

git replace and replace refs were added in 1.6.5, so you'll have to use an older, somewhat less flexible mechanism known as grafts, which allow you to specify alternate parents for a given commit. Instead of the git replace command, run:

echo $(git rev-parse master) $(git rev-parse old-master) >> .git/info/grafts

This will make it look, locally, as if the master commit has the old-master commit as its parent, so you will see one more commit than you would with git replace.
