This may already have been answered, but I didn't find a good answer.
I come from centralized repositories, such as SVN, where usually you only perform checkouts, updates, commits, reverts, merges and not much more.
Git is driving me crazy. There are tons of commands, but the most difficult to understand is why many things work as they do.
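According to "What is a bare git repository?", repositories created with git init --bare contain no working or checked out copy of your source files.
– Jon Saints, http://www.saintsjd.com/2011/01/what-is-a-bare-git-repository/
However, from the accepted answer to "what's the difference between github repository and git bare repository?":
"Git repos on GitHub are bare, like any remote repo to which you want to push [sic]."
– VonC, https://stackoverflow.com/a/20855207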
However, in GitHub there are source files. I can see them. If I create a bare repository, there are no source files, only the contents of the .git directory of a working repository.
How is this possible? What don't I understand?
Can you give an example about why I would need a bare repository and its motivation to work that way?
Edward Thomson's answer is, in part, what I wanted to know. Nevertheless, I will rephrase my question:
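The first link I posted ("What is a bare git repository?") states:
"They [bare repositories] contain no working or checked out copy of your source files."
VonC's answer:
"Git repos on GitHub are bare"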
Both statements imply that GitHub has no working copy.
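Edward Thomson says that when you browse the web page, GitHub renders it from the repository data, pulling that data straight out of the repository and sending it to your web browser rather than first writing it to disk on the file server.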
Somehow, a bare repository has to contain all the data and source code. If not, it would be impossible to render anything, yet I can see all the committed source code, all branches (with their respective sources), the whole log of a repo, etc.
Is the whole data of a repository always within the .git directory (or in a bare repo), in some kind of format that can render all files at any time? Is this the reason for a bare repository, while a working copy only has the files at a given time?
Yes, those files and their complete history are stored in .git/packed-refs, .git/refs and .git/objects.
When you clone a repo (bare or not), you always have the .git folder (or, for a bare repo, a folder with a .git extension, by naming convention) with its Git administrative and control files (see the glossary).
Git can unpack at any time what it needs with git unpack-objects.
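To make the "complete history lives under .git" point a bit more concrete, here is a minimal sketch. It commits two versions of a file, deletes the working-tree copy, and still reads both versions back from the object store. Note that it uses git cat-file rather than git unpack-objects, simply because cat-file is the easier way to read a single object. The file name notes.txt and the demo identity are made up for the example.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name here is illustrative.
repo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$repo"

# Commit two versions of the same file.
printf 'first\n' > "$repo/notes.txt"
git -C "$repo" add notes.txt
git -C "$repo" commit -q -m "first version"
printf 'second\n' > "$repo/notes.txt"
git -C "$repo" commit -q -a -m "second version"

# Remove the checked-out copy: only the .git directory still holds the data.
rm "$repo/notes.txt"

# Read objects straight from the object store.
head_type=$(git -C "$repo" cat-file -t HEAD)              # object type of HEAD
old_body=$(git -C "$repo" cat-file -p HEAD~1:notes.txt)   # file as of the first commit
new_body=$(git -C "$repo" cat-file -p HEAD:notes.txt)     # file as of the second commit

echo "HEAD is a $head_type"
echo "old: $old_body"
echo "new: $new_body"
```

Both versions remain readable even though no copy of notes.txt exists in the working tree any more.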
The trick is:
From a bare repo, you can query the logs (git log in a bare repo works just fine: no need for a working tree), list files in a bare repo, or show the content of a file from a bare repo.
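As a rough sketch of those three queries (log, file listing, file content), the following script builds a throwaway bare repo and reads from it without any working tree. The directory names (src, hub.git), the file README.txt and the demo identity are assumptions made for illustration. git log, git ls-tree and git show are standard Git commands, but nothing here is meant to describe what GitHub actually runs.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; all names below are illustrative.
demo=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# An ordinary (non-bare) repo with a single commit.
git init -q "$demo/src"
printf 'hello\n' > "$demo/src/README.txt"
git -C "$demo/src" add README.txt
git -C "$demo/src" commit -q -m "Add README"

# A bare copy: history only, no checked-out files.
git clone -q --bare "$demo/src" "$demo/hub.git"

# Everything below runs against the bare repo.
log_subject=$(git -C "$demo/hub.git" log -1 --format=%s)          # query the log
file_list=$(git -C "$demo/hub.git" ls-tree -r --name-only HEAD)   # list files
file_body=$(git -C "$demo/hub.git" show HEAD:README.txt)          # show one file

echo "last commit: $log_subject"
echo "files:       $file_list"
echo "README.txt:  $file_body"
```

The bare directory hub.git never contains README.txt as a plain file; its content is reconstructed from the stored objects on request.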
That is how GitHub can render a page with files without having to check out the full repo.
I don't know that GitHub does exactly that though, since the sheer number of repos forces the GitHub engineering team to do all kinds of optimization.
See for instance how they optimized cloning/fetching a repo.
With DGit, those bare repos are actually replicated across multiple servers.
For GitHub, maintaining a working tree would cost too much in disk space and in updates (when each user requests a different branch). It is best to extract from the single bare repo what you need to render a page.
In general (outside of GitHub's constraints), a bare repo is used for pushing, in order to avoid having a working tree out of sync with what has just been pushed. See "but why do I need a bare repo?" for a concrete example.
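The usual workflow around a bare repo can be sketched as follows: one bare repo acts as the exchange point, and each person pushes to it and clones from it. The names central.git, alice, bob and file.txt are hypothetical, and the script runs entirely on the local filesystem instead of a real server.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; all names below are illustrative.
root=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The shared exchange point: a bare repo, so there is no working tree to go stale.
git init -q --bare "$root/central.git"

# Alice clones the (still empty) bare repo, commits, and pushes her branch.
git clone -q "$root/central.git" "$root/alice"
printf 'shared\n' > "$root/alice/file.txt"
git -C "$root/alice" add file.txt
git -C "$root/alice" commit -q -m "Add file"
git -C "$root/alice" push -q origin HEAD

# Bob clones afterwards and gets a working tree with Alice's file.
git clone -q "$root/central.git" "$root/bob"

central_is_bare=$(git -C "$root/central.git" rev-parse --is-bare-repository)
bob_body=$(cat "$root/bob/file.txt")

echo "central.git bare? $central_is_bare"
echo "bob sees: $bob_body"
```

Because central.git has no checked-out files, a push can never leave it with a working tree that disagrees with its history.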
That being said:
- since Git 2.3, you can push to a non-bare repo (with receive.denyCurrentBranch set to updateInstead, the push updates the working tree accordingly)
- since Git 2.4, you can "push-to-deploy" (i.e., it works for an unborn branch as well)
But that would not be possible for GitHub, which cannot maintain one (or several) working tree(s) for each repo it has to store.
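For the non-GitHub case, pushing into a non-bare repo can be sketched like this. It relies on the receive.denyCurrentBranch setting with the value updateInstead, which exists since Git 2.3. The repo names server and dev and the file index.html are made up for the example. To stay on the safe side the server repo already has a commit, so the sketch does not depend on the unborn-branch behaviour.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; all names below are illustrative.
root=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A non-bare "server" repo with one commit and a clean working tree.
git init -q "$root/server"
printf 'v1\n' > "$root/server/index.html"
git -C "$root/server" add index.html
git -C "$root/server" commit -q -m "v1"

# Allow pushes to the checked-out branch and update the working tree on push.
git -C "$root/server" config receive.denyCurrentBranch updateInstead

# A developer clone changes the file and pushes back.
git clone -q "$root/server" "$root/dev"
printf 'version two\n' > "$root/dev/index.html"
git -C "$root/dev" commit -q -a -m "v2"
git -C "$root/dev" push -q origin HEAD

# The server's checked-out file now reflects the pushed commit.
served=$(cat "$root/server/index.html")
echo "server working tree has: $served"
```

Without that configuration, Git refuses by default to update the checked-out branch of a non-bare repo, which is the situation a bare repo avoids in the first place.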
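The article "Using a bare Git repo to get version control for my dotfiles" from Greg Owen, originally reported by aifusenno1, adds that a bare repository is a Git repository without a snapshot. You can even create a non-bare repo from a bare one: if you git clone a bare repo, Git automatically creates a snapshot for you in the new repo (if you want a bare repo, use git clone --bare).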
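And Greg adds, on why we would use a bare Git repository at all, a pointer to the Git repository layout:
a <project>.git directory that is a bare repository (i.e. without its own working tree), that is typically used for exchanging histories with others by pushing into it and fetching from it.
Basically, if you wanted to write your own GitHub/GitLab/BitBucket, your centralized service would store each repo as a bare repo. There is no need for a snapshot if the only service interacting with your repo is Git: the snapshot is a convenience for humans and non-Git tools, but Git only interacts with the history. Your centralized Git hosting service only interacts with repos through Git commands, so why materialize snapshots all the time? They would take up extra space for no benefit.
GitHub generates that snapshot on the fly when you access the page, rather than storing it permanently with the repo (so GitHub only has to generate a snapshot when you ask for it, instead of keeping one updated every time anyone pushes a change).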
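The distinction between a bare <project>.git directory and a clone with a working tree can be checked directly. In the sketch below, a plain git clone of a bare repo produces checked-out files, while git clone --bare produces another bare repo. The names seed, project.git, checkout, mirror.git and hello.txt are assumptions for the example.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; all names below are illustrative.
root=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Seed repo with one commit, then a bare copy following the <project>.git convention.
git init -q "$root/seed"
printf 'hello\n' > "$root/seed/hello.txt"
git -C "$root/seed" add hello.txt
git -C "$root/seed" commit -q -m "initial"
git clone -q --bare "$root/seed" "$root/project.git"

# Plain clone of the bare repo: Git checks out a working tree.
git clone -q "$root/project.git" "$root/checkout"

# Bare clone of the bare repo: history only again.
git clone -q --bare "$root/project.git" "$root/mirror.git"

project_bare=$(git -C "$root/project.git" rev-parse --is-bare-repository)
checkout_bare=$(git -C "$root/checkout" rev-parse --is-bare-repository)
mirror_bare=$(git -C "$root/mirror.git" rev-parse --is-bare-repository)

echo "project.git bare? $project_bare"
echo "checkout bare?    $checkout_bare"
echo "mirror.git bare?  $mirror_bare"
```

Only the plain clone ends up with hello.txt as a regular file on disk; the two bare directories hold the same history without it.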