Question
I have read about StringTokenizer, StreamTokenizer, Scanner, and the Pattern and Matcher classes from the java.util.regex package. I have also read opinions about them, and I am really confused: which one is the best to use?
What I need to do is write an assembler, that is, parse a file containing assembly language and translate it into machine code.
For example, if I have the assembly code:
MOV R15,R12
This should translate to the hexadecimal numbers corresponding to each instruction and register.
Let's say the translations are as follows:
- MOV becomes 10 F3
- R15 becomes 11 F2
- R12 becomes 20 1E
Thus, my output file should be:
10 F3 11 F2 20 1E
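As a rough illustration of the mapping described above, the lookup can be sketched as a table from token to code. This is only a sketch: the class name AsmTranslator and the method translate are hypothetical, the encodings are the made-up ones from this example, and Map.of assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class AsmTranslator {
    // Made-up encodings taken from the example above, not a real instruction set.
    static final Map<String, String> CODES = Map.of(
            "MOV", "10 F3",
            "R15", "11 F2",
            "R12", "20 1E");

    // Splits the line on whitespace and commas, then looks up each token.
    public static String translate(String line) {
        List<String> out = new ArrayList<>();
        for (String token : line.trim().split("[\\s,]+")) {
            String code = CODES.get(token.toUpperCase());
            if (code == null) {
                throw new IllegalArgumentException("Unknown token: " + token);
            }
            out.add(code);
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(translate("MOV R15,R12")); // prints 10 F3 11 F2 20 1E
    }
}
```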
Now I need to parse the Assembly file to identify each instruction and what comes after it.
For those who know microcontrollers, there are many ways for an instruction to appear. My question is:
Using Java, which is the best way to split each word in my file into tokens (using any of the aforementioned classes), so that I can find the matching code and write it to a file?
ldi R13,0x31
I need to have ldi in one token, r13 in another and 31 in another.
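For plain splitting like this, String.split() with a small regular expression may already be enough. The following is a minimal sketch, not a full solution: the class name AsmTokenizer is hypothetical, and it assumes tokens are separated only by whitespace and commas and that numbers carry an optional 0x prefix. It keeps the register's case as written in the source line.

```java
import java.util.ArrayList;
import java.util.List;

public class AsmTokenizer {
    // Splits one assembly line into tokens on whitespace and commas.
    // A leading "0x" is stripped so that "0x31" becomes "31", as asked above.
    public static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        for (String part : line.trim().split("[\\s,]+")) {
            if (part.isEmpty()) {
                continue; // a blank line yields a single empty string
            }
            if (part.startsWith("0x") || part.startsWith("0X")) {
                part = part.substring(2);
            }
            tokens.add(part);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("ldi R13,0x31")); // prints [ldi, R13, 31]
    }
}
```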
Recommended answer
Well, everything you mentioned is fine for simply tokenizing a string or a file. StringTokenizer is a legacy class whose use is discouraged in new code, and alternatives such as Scanner or even String.split() exist. However, I don't think this is what you want. You seem to need a lexer, or rather a lexer plus a parser, because you want to make sense of the tokens, not just split them on some separator. So either you write your own (if you're feeling brave) or you use one of the very good existing tools, such as ANTLR (http://www.antlr.org/). It's free, but may be a little hard to use. There's also JavaCC. Good luck!
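To make the difference between splitting and lexing concrete, here is a small hand-written lexer built on java.util.regex that classifies each token instead of only cutting the line apart. It is a sketch under assumptions: the class AsmLexer, the token kinds REGISTER, NUMBER and MNEMONIC, and the syntax rules (registers are R followed by digits, numbers are decimal or 0x hex) are invented for illustration and do not describe any real instruction set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AsmLexer {
    // Hypothetical token kinds; each name doubles as a named group in the pattern.
    public enum Kind { REGISTER, NUMBER, MNEMONIC }

    public static final class Token {
        public final Kind kind;
        public final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + "(" + text + ")"; }
    }

    // REGISTER is tried before MNEMONIC so that "R13" is not read as a word.
    // The last alternative matches separators, which produce no token.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<REGISTER>[Rr]\\d+)"
            + "|(?<NUMBER>0[xX][0-9A-Fa-f]+|\\d+)"
            + "|(?<MNEMONIC>[A-Za-z]+)"
            + "|[\\s,]+");

    public static List<Token> lex(String line) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        int pos = 0;
        while (pos < line.length()) {
            m.region(pos, line.length());
            if (!m.lookingAt()) { // nothing valid starts at this position
                throw new IllegalArgumentException(
                        "Unexpected character '" + line.charAt(pos) + "' at column " + pos);
            }
            for (Kind kind : Kind.values()) {
                String text = m.group(kind.name());
                if (text != null) {
                    tokens.add(new Token(kind, text));
                    break;
                }
            }
            pos = m.end(); // every alternative consumes at least one character
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [MNEMONIC(ldi), REGISTER(R13), NUMBER(0x31)]
        System.out.println(lex("ldi R13,0x31"));
    }
}
```

Because each token carries its kind, a later stage can check that a mnemonic is followed by the operands it expects. Once the grammar grows beyond a few such rules, a generator such as ANTLR or JavaCC is likely to be the more maintainable choice.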