Which is the best way to parse a file containing Assembly language using Java?

Question

I have read about StringTokenizer, StreamTokenizer, Scanner, and Pattern and Matcher from the java.util.regex package. I have also read opinions on them, and I am really confused: which one is the best to use?

What I need to do is write an assembler, that is, parse a file containing assembly language and translate it into machine code.

For example, if I have the assembly code:

MOV R15,R12

This should translate to the hexadecimal numbers corresponding to each instruction and register.

Let's just say the translation is as follows:

MOV becomes 10 F3
R15 becomes 11 F2
R12 becomes 20 1E

Thus, my output file should be:

10 F3 11 F2 20 1E

Now I need to parse the assembly file to identify each instruction and what comes after it.

For those who know microcontrollers, there are many ways for an instruction to appear. My question is:

Using Java, which is the best way to turn each word in my file into a token (using any of the classes mentioned above), so that I can find the matching code and write it to the output file?

ldi R13,0x31

From this line I need ldi in one token, R13 in another, and 31 in another.

Recommended answer

Well, everything you mentioned is fine for simply tokenizing a string or a file. In recent JDKs, StringTokenizer is a legacy class whose use is discouraged in new code, and more flexible alternatives such as Scanner or even String.split() exist. However, I don't think that is what you want. You seem to need a lexer, or at least a lexer-parser, because you want to make sense of the tokens, not just split them on some separator. So either you write your own (if you're on drugs) or you use one of the very good existing tools, such as ANTLR (http://www.antlr.org/). It's free too, but may be a little hard to use. There's also JavaCC. Good luck!
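
To make the difference between splitting and lexing a bit more concrete, here is a minimal hand-written sketch using Pattern and Matcher. It is only an illustration under stated assumptions: the class name TinyAssembler, the Token and Kind types, and the regular expression are made up for this example, and the lookup table contains nothing but the three made-up values from the question. A real instruction set would need proper error reporting, operand encoding, and probably a generated lexer/parser as suggested above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TinyAssembler {

    // Token kinds this sketch distinguishes (hypothetical, for illustration only).
    public enum Kind { WORD, IMMEDIATE }

    public static final class Token {
        public final Kind kind;
        public final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + ":" + text; }
    }

    // Assumed lookup table: only the three example values given in the question.
    private static final Map<String, String> CODES = Map.of(
            "MOV", "10 F3",
            "R15", "11 F2",
            "R12", "20 1E");

    // Group 1: digits of a hex immediate such as 0x31. Group 2: a mnemonic or register name.
    private static final Pattern TOKEN =
            Pattern.compile("0[xX]([0-9A-Fa-f]+)|([A-Za-z_][A-Za-z0-9_]*)");

    /** Lexes one source line, e.g. "ldi R13,0x31" gives WORD:LDI, WORD:R13, IMMEDIATE:31. */
    public static List<Token> tokenize(String line) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        // find() silently skips commas and whitespace (and anything else it does not
        // recognise); a real lexer should report unexpected characters instead.
        while (m.find()) {
            if (m.group(1) != null) {
                tokens.add(new Token(Kind.IMMEDIATE, m.group(1).toUpperCase(Locale.ROOT)));
            } else {
                tokens.add(new Token(Kind.WORD, m.group(2).toUpperCase(Locale.ROOT)));
            }
        }
        return tokens;
    }

    /** Translates one line by looking each word up in the table; immediates pass through. */
    public static String assemble(String line) {
        List<String> parts = new ArrayList<>();
        for (Token t : tokenize(line)) {
            if (t.kind == Kind.IMMEDIATE) {
                parts.add(t.text);
            } else {
                String code = CODES.get(t.text);
                if (code == null) {
                    throw new IllegalArgumentException("Unknown symbol: " + t.text);
                }
                parts.add(code);
            }
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        System.out.println(tokenize("ldi R13,0x31"));
        System.out.println(assemble("MOV R15,R12")); // prints "10 F3 11 F2 20 1E"
    }
}
```

Because each token carries a kind, the assembler can tell a register name from an immediate value, which is exactly what plain String.split() cannot do. Note that assemble("ldi R13,0x31") throws here, since the question gives no codes for ldi or R13.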
