Problem description
A week or two ago I took some files that I had been archiving with a simple find |sed|tar|xz|gpg bash script, unpacked them all, put the contents of the first archive in a git repo, committed, put the next archive's contents in the repo, committed (rinse and repeat), in order to have a nicer system.
All files were edited on one of my two computers, both running Arch Linux, in either TeXstudio or Vim.
I tried to check out an old version, but it's flipping out: it won't let me because of changes that are outstanding. I tried everything I knew how, and then went on Google to find out things I didn't know.
There are a number of other questions on this subject. Unfortunately, their answers have not helped me. For the sake of completeness I'll list the questions.
modified: Arcs/arc1.tex
modified: Arcs/arc2.tex
modified: Arcs/frontmatter.tex
no changes added to commit (use "git add" and/or "git commit -a")
Also, so people don't need to look below, I already did the obvious ones:
git reset --hard
git commit -a
git stash
git pull
as well as removing everything from the index and adding it back.
I'm not on Windows. Also, this shouldn't have anything to do with line endings, since I'm the only user. There is no reason for there to be weird line endings.
git reset --hard HEAD (among other possibilities)
git stash
git stash drop
git config core.autocrlf input
git rm --cached -r .
git reset --hard
git add .
git commit -m "Normalize line endings"
Not only did this not work, but it increased the number of files that are misbehaving and also wrote 700+ lines to a file for... reasons. It wasn't even the file that was misbehaving.
More line-ending stuff:
git clean -df
git checkout -- .
git checkout -- ./.
git checkout-index -a -f
git checkout --force master
Stuff I didn't see suggested, but tried anyway:
I tried committing the changes with git commit -am "WORK DAMN YOU!"
followed by git revert --hard HEAD^
I also tried pulling from my private remote, but was just told that the local repo was already up to date.
This is really frustrating.
Recommended answer
According to:
You can use them. You just have to be aware that Git doesn't understand them, and you may want to tweak them in some fashion. Unfortunately, there are few good alternatives for doing said tweaking.
The way that filters (called clean and smudge filters) work is directly related to the way that core.autocrlf and end-of-line mangling work. To understand them yourself, start with a few simple facts:
- The content of any Git object (commit, tree, blob, or annotated tag) literally cannot be changed. This is because the content is retrieved by its database key, which is a hash ID (currently SHA-1, in the future perhaps SHA-3 or some other very good hash) that must match the computed hash of the content.
- You retrieve a commit by its hash ID. A branch name like master or develop just contains the actual hash ID of the latest commit on that branch.
- Each commit stores the raw hash ID of its parent commit as part of its content, and stores the raw hash ID of the tree object that leads to the blob objects and thus produces the snapshot for that commit.
- To store a new object into the database, you feed the object into git hash-object -w (or Git does this internally on its own). Git now computes the hash of the content, including the header that gives the object's type and size, stores the value (the content) into the database, and emits the key. You may then use the key in the future to retrieve the content. At that time, Git re-checks the hash: it must match the key. If it does not match, the data have been corrupted, and Git stops.
Hence, the commit hash must match the commit contents, which give the tree hash for the tree contents, which give the blob hashes for the blob contents. If the commit is not itself the tip of a branch, it was found by walking back from a tip commit through some number of previous commits, all by their hash IDs. The resulting data structure is a Merkle tree, which provides Git's data-integrity guarantees.
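The key-must-match-content rule can be sketched in a few lines of Python. This is only an illustration, assuming the SHA-1 object format described above (a header of type and size, a NUL byte, then the content); the dictionary `object_db` and the helpers `store` and `retrieve` are hypothetical stand-ins, not Git's actual storage code.

```python
import hashlib

def git_object_id(obj_type, content):
    """Return the SHA-1 ID for an object of the given type and raw content."""
    # The hashed data is the header "<type> <size>\0" followed by the content.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# Toy key-value store standing in for the object database (hypothetical).
object_db = {}

def store(obj_type, content):
    """Store the content under its computed hash and return that key."""
    key = git_object_id(obj_type, content)
    object_db[key] = (obj_type, content)
    return key

def retrieve(key):
    """Fetch content by key, re-checking that the hash still matches."""
    obj_type, content = object_db[key]
    if git_object_id(obj_type, content) != key:
        raise ValueError("object is corrupt: content does not match its key")
    return content

key = store("blob", b"hello\n")
print(key)
```

For the blob `hello\n` this should print `ce013625030ba8dba906f756967f9e9ca394464a`, the same ID that `echo hello | git hash-object --stdin` reports. Because the key is derived from the content, changing even one byte of the stored content makes `retrieve` fail rather than return altered data.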
This means that any filtering cannot be done on already-committed content. And yet, it must be done on already-committed content, so that Windows users can have CRLF line endings, for instance. How is Git to resolve this paradox?
The answer lies in several more facts about Git:
- You cannot work directly with commit contents. They need to be extracted into a working area, called the work-tree. The work-tree (or working tree, or however you prefer to spell it) has the extracted files in de-compressed form, where they can be read and written.
- But Git adds an intermediate data structure as well, which Git originally just called the index. This was not a very good name, so the data structure wound up with three names: the index, the staging area, and the cache. The index keeps tabs on the work-tree, for instance by caching (hence the third name) stat system call data. Each file from the current commit is first extracted into the index, keeping it in its special compressed form (actually, just using the raw blob hash ID directly), so that the index has, or really has a reference to, the copy of the file in the commit.
- Running git add on a file copies the file into the index (really, adding it as a blob object into the main database and computing its hash ID, then updating the hash ID in the index). This means that the index is, at all times, the image that Git will use for the next commit you can make. This is where it gets the name staging area. Because you can overwrite index files with git add, they are writable here, where they are not writable in commits.
- Running git commit packages the current index into a tree object, freezing it for all time (the blob hashes are no longer changeable), and uses the tree object to make the new commit.
This index is how Git gets a lot of its speed, as compared to other version control systems. Since the index keeps track of the work-tree, Git can do a lot of things much faster than usual: git status, for instance, can call stat on a directory or file and compare the result to cached stat data, without having to read the file itself.
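The stat-comparison shortcut can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the helper names `stat_signature` and `looks_unchanged` are made up for this example, and only size and modification time are compared, whereas Git's real index caches more stat fields than these two.

```python
import os
import tempfile

def stat_signature(path):
    """Return the cheap-to-read metadata this sketch uses to detect changes."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)

def looks_unchanged(path, cached_signature):
    """Compare current metadata to the cached copy; file contents are never read."""
    return stat_signature(path) == cached_signature

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "arc1.tex")
    with open(path, "wb") as f:
        f.write(b"\\chapter{One}\n")
    cached = stat_signature(path)        # what the toy "index" remembers
    unchanged_before = looks_unchanged(path, cached)
    with open(path, "ab") as f:          # edit the file, so its size changes
        f.write(b"More text.\n")
    unchanged_after = looks_unchanged(path, cached)

print(unchanged_before, unchanged_after)
```

This prints `True False`. The point of the design is that the common case, an untouched file, is decided from metadata alone; the cost is that the answer is only as trustworthy as the cached metadata.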
(The index also takes on an expanded role during conflicted merges. This isn't relevant to clean and smudge filters and LF/CRLF wars, but is worth mentioning while we're talking about the index. Instead of just one entry per file-to-be-committed, the index can hold three not-to-be-committed entries: one from the merge base, and one from each of the two branch tips being merged.)
We are now ready to see how filtering really works. Let's summarize the key points about commits, the index, and the work-tree:
- git checkout copies a commit's tree to the index, after which the index exactly matches the commit's tree but in a form more suitable for keeping track of the work-tree.
- git checkout also copies each of the commit's files to the work-tree, while updating the index slot for that file.
- git add copies a file from the work-tree back into the index, so that a future git commit can just freeze the index.
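As a rough mental model, the three operations in this summary can be mimicked with plain dictionaries. This is a toy sketch, not how Git is implemented: `objects`, `write_blob`, `checkout`, `add`, and `commit` are hypothetical names, trees are flat path-to-hash mappings, and filters are left out entirely.

```python
import hashlib

objects = {}  # toy object database: blob ID -> content

def write_blob(content):
    """Store content as a blob and return its hash ID."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    oid = hashlib.sha1(header + content).hexdigest()
    objects[oid] = content
    return oid

def checkout(commit_tree, index, work_tree):
    """Copy the commit's tree to the index, then extract each file."""
    index.clear()
    index.update(commit_tree)
    work_tree.clear()
    for path, oid in commit_tree.items():
        work_tree[path] = objects[oid]

def add(path, index, work_tree):
    """Copy one work-tree file back into the index as a blob hash."""
    index[path] = write_blob(work_tree[path])

def commit(index):
    """Freeze the current index into a new tree (a copy of the mapping)."""
    return dict(index)

old_tree = {"Arcs/arc1.tex": write_blob(b"version 1\n")}
index, work_tree = {}, {}
checkout(old_tree, index, work_tree)
work_tree["Arcs/arc1.tex"] = b"version 2\n"   # edit in the work-tree
add("Arcs/arc1.tex", index, work_tree)
new_tree = commit(index)
print(old_tree != new_tree)
```

The old tree is never modified: the edit produces a new blob with a new hash, and only the index slot, and later the new tree, points at it.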
Now, remember, a smudge filter is applied to committed content as it's turned into a work-tree file. A clean filter is applied to work-tree content as it's turned into committed (or at least to-be-committed) content. Smudge-filter time is when LF-only line endings can become CRLF line endings for Windows users, and clean-filter time is when CRLF line endings can turn back into LF-only line endings.
The ideal time to apply a smudge filter is while the file is being expanded, i.e., copied from the index to the work-tree. The ideal time to apply a clean filter is while the file is being compressed, i.e., copied from the work-tree to the index. So this is when Git does it.
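To make the two directions concrete, here is a minimal Python sketch of a line-ending filter pair. The functions `smudge` and `clean` are illustrative stand-ins for the conversions described above, assuming the committed content uses LF-only endings; they are not Git's own conversion code.

```python
def smudge(blob):
    """Index -> work-tree: turn LF-only endings into CRLF."""
    return blob.replace(b"\n", b"\r\n")

def clean(work_tree_content):
    """Work-tree -> index: turn CRLF endings back into LF-only."""
    return work_tree_content.replace(b"\r\n", b"\n")

committed = b"line one\nline two\n"
checked_out = smudge(committed)   # what lands in the work-tree
re_added = clean(checked_out)     # what git add would store again

print(checked_out)
print(re_added == committed)
```

This prints `b'line one\r\nline two\r\n'` and then `True`: for LF-only input the pair is a perfect mirror, so a file that was checked out and immediately re-added produces the same blob it started from.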
At the same time, though, one of the key features of the index is speed. So Git assumes that applying the smudge filter doesn't "change" the file, in some sense. The content in the work-tree file may not match the decompressed blob any more, but (at least by intent and purpose) it still matches what you would get by cleaning and re-compressing the work-tree file.
The rub comes in when this isn't true. What if cleaning and re-compressing the file results in different content, with a different hash ID? The answer is that Git may notice, and yet Git may not notice, all depending on how effective the index is as a cache, and on how the stat data saved in the index compare with the stat data delivered by a later system call.
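One way the assumption can fail is sketched below. The setup is hypothetical and is not a diagnosis of the question's repository: a toy smudge filter that does nothing, a toy clean filter that strips carriage returns, and a blob that was committed with CRLF endings before any filtering applied. The helper names are made up for the example.

```python
import hashlib

def blob_id(content):
    """SHA-1 blob ID: header 'blob <size>', a NUL byte, then the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def smudge(blob):
    """Toy smudge filter: leave content untouched on checkout."""
    return blob

def clean(work_tree_content):
    """Toy clean filter: normalize CRLF to LF on the way into the index."""
    return work_tree_content.replace(b"\r\n", b"\n")

def appears_modified(committed_blob):
    """Would a fresh checkout, once cleaned, hash differently from the commit?"""
    work_tree_content = smudge(committed_blob)
    return blob_id(clean(work_tree_content)) != blob_id(committed_blob)

lf_blob = b"\\chapter{One}\n"
crlf_blob = b"\\chapter{One}\r\n"
print(appears_modified(lf_blob), appears_modified(crlf_blob))
```

This prints `False True`. In the second case no amount of re-extracting the committed file helps, because the freshly extracted file itself cleans to different content; that is the kind of situation in which a file can keep showing up as modified after a hard reset.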
If the smudge and clean filters are perfect mirrors, so that a smudged and re-cleaned file always matches the original, you can git add the file after extraction, and Git will update the saved stat data. As long as that does not change again, Git will now believe that the file is clean. If the underlying file system has unreliable stat data, you can use the index's assume-unchanged bit (set with git update-index --assume-unchanged) to force Git to think that the file is clean anyway. This is pretty crude and not a pleasing solution, but it will do the job.