Problem description
So is a decompiler really something that gives you the source of a compiled/interpreted piece of code? Because to me that sounds impossible. How would you get the names of the functions, variables, classes, etc. if it is compiled? Or am I misinterpreting the definition? How does it work? And what is the general principle behind making one?
Recommended answer
You're right about your definition of a decompiler: it takes a compiled application and produces source code to match. However, in most cases it does not know the original names and structure of variables/functions/classes, because the compiler usually throws them away unless symbol or debug information was left in; it just guesses. It analyzes the flow of the program and tries to find a way to represent that flow in a particular programming language, typically C. However, because the language of choice (C, in this example) is usually at a higher level than the underlying program (a binary executable), some parts of the program might be impossible to represent accurately; in that case the decompiler would fail and you would need to fall back on a disassembler. This is why many people like to obfuscate their code: it makes it much harder for decompilers to make sense of it.
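To make the "it just guesses" point concrete, here is a minimal sketch of why names cannot come back. Everything in it is hypothetical: `toy_compile` and `toy_decompile` are made-up helpers, and real compilers work on registers and memory addresses rather than on text. The only thing it illustrates is that once names are replaced by numbered slots, a decompiler can do no better than invent placeholders such as `var_0`.

```python
import re

def toy_compile(source_lines):
    """Hypothetical 'compiler': swap every variable name for a numbered slot."""
    slots = {}  # original name -> slot number (discarded after compilation)
    compiled = []
    for line in source_lines:
        def to_slot(match):
            name = match.group(0)
            slots.setdefault(name, len(slots))
            return f"r{slots[name]}"
        compiled.append(re.sub(r"[A-Za-z_]\w*", to_slot, line))
    return compiled

def toy_decompile(compiled_lines):
    """Hypothetical 'decompiler': the best it can do is invent names."""
    return [re.sub(r"r(\d+)", r"var_\1", line) for line in compiled_lines]

source = ["total = price * quantity", "total = total + shipping"]
compiled = toy_compile(source)
recovered = toy_decompile(compiled)

for line in recovered:
    print(line)
```

The structure of the computation survives the round trip, but `total`, `price`, `quantity` and `shipping` do not, which matches what you typically see in real decompiler output.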
Building a decompiler is not a simple task. Basically, you have to take the application that you are decompiling (be it an executable or some other form of compiled application) and parse it into some kind of tree you can work with in memory. You would then analyze the flow of the program and try to find patterns that suggest an if statement/variable/function/etc. was used at a certain location in the code. It's all really just a guessing game: you have to know the patterns that the compiler produces in compiled code, then search for those patterns and replace them with equivalent human-readable source code.
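The pattern search described above can be sketched roughly as follows. This is only an illustration under heavy assumptions: the instruction set (`MOV`, `CMP`, `JNE`, `CALL`, `LABEL`) is a made-up toy, the names `render` and `recover_if` are hypothetical, and a real decompiler would build a control-flow graph instead of scanning a flat list. It recognizes one shape, "compare, jump over a block if not equal", and rewrites it as a C-like `if`.

```python
from typing import List, Tuple

Instr = Tuple[str, ...]  # e.g. ("MOV", "var_0", "5")

def render(ins: Instr) -> str:
    """Translate a single toy instruction that is not part of a pattern."""
    if ins[0] == "MOV":
        return f"{ins[1]} = {ins[2]};"
    if ins[0] == "CALL":
        return f"{ins[1]}();"
    return "/* unknown: " + " ".join(ins) + " */"

def recover_if(instrs: List[Instr]) -> List[str]:
    """Rewrite CMP a b / JNE target / body / LABEL target as an if block."""
    out: List[str] = []
    i = 0
    while i < len(instrs):
        ins = instrs[i]
        if ins[0] == "CMP" and i + 1 < len(instrs) and instrs[i + 1][0] == "JNE":
            target = instrs[i + 1][1]
            # Find the label the jump skips to; the body lies in between.
            end = next(
                (j for j in range(i + 2, len(instrs))
                 if instrs[j] == ("LABEL", target)),
                None,
            )
            if end is not None:
                out.append(f"if ({ins[1]} == {ins[2]}) {{")
                out.extend("    " + line for line in recover_if(instrs[i + 2:end]))
                out.append("}")
                i = end + 1
                continue
        out.append(render(ins))
        i += 1
    return out

program = [
    ("MOV", "var_0", "5"),
    ("CMP", "var_0", "5"),
    ("JNE", "L1"),
    ("CALL", "sub_401000"),
    ("LABEL", "L1"),
    ("MOV", "var_1", "0"),
]
decompiled = recover_if(program)
print("\n".join(decompiled))
```

Anything that does not match a known pattern falls through to `render`, which emits an "unknown" comment. That is the toy equivalent of the case where a decompiler gives up and you are left reading the disassembly.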
This is all much simpler for higher-level platforms like Java or .NET, where you work with bytecode instead of assembly instructions, and things like variables are mostly taken care of for you. There, you don't have to guess so much as translate directly. You might not have the exact variable/method names, but you can at least deduce the program structure fairly easily.
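As a rough illustration of how much a bytecode format can retain, here is a small sketch using Python's standard `dis` module. Python is not one of the platforms named in this answer; it is used here only as a convenient stand-in for a bytecode-based runtime, and the function `area` is a made-up example. How much Java or .NET keeps depends on how the code was built.

```python
import dis

def area(width, height):
    result = width * height
    return result

code = area.__code__

# The compiled code object still carries the function name and local names.
print(code.co_name)
print(code.co_varnames)

# The instruction listing refers to those locals by name, so mapping the
# bytecode back to source-like statements is close to a direct translation.
dis.dis(area)
```

The exact instruction names printed by `dis.dis` differ between Python versions, but the local variable names are visible in every one of them, which is the situation the paragraph above describes.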
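Disclaimer: I have never written a decompiler, so I don't know every detail of what I'm talking about. If you are really interested in writing one, you should get a book on the subject.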