Problem description
I have a situation where my function continuously receives data of various lengths. The data can be anything. I want to find the best way to hunt for a particular string in this data. The solution will somehow require buffering previous data, but I cannot wrap my head around the problem.
Here is an example of the problem:
DATA IN -> [\x00\x00\x01\x23B][][LABLABLABLABLA\x01TO][KEN][BLA\x01]...
If every [...] represents a data chunk and [] represents a data chunk with no items, what is the best way to scan for the string TOKEN?
UPDATE: I realised the question is a bit more complex. The [] are not separators; I only use them to show the structure of the chunks in the example above. Also, TOKEN is not a static string per se; it is variable length. I think the best way is to read line by line, but then the question is how to read a streaming buffer of variable length into lines.
Recommended answer
Sorry, I voted to delete my previous answer because my understanding of the question was not correct. I didn't read carefully enough and thought that the [] were token delimiters.
For your problem I'd recommend building a small state machine based on a simple counter. For every character received, you do something like the following pseudocode:
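if (received_character != token[pos]) {
    pos = 0; // Mismatch: start over, then retest this character against token[0]
}
if (received_character == token[pos]) {
    ++pos;
    if (pos >= token_length) {
        token_received = 1;
        pos = 0; // Reset so the next occurrence can be detected
    }
}
// Note: a token whose prefix repeats inside it (e.g. "ABABC") needs a KMP-style fallback instead of a plain reset.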
This takes a minimum of processor cycles and also a minimum of memory, as you don't need to buffer anything except the chunk just received. Because pos is kept between calls, a token that is split across two chunks is still detected.
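As a rough illustration of the counter idea, here is a minimal C sketch of a matcher that keeps its state between chunks. It is not the original poster's code: the names stream_matcher, matcher_init, matcher_feed and the MAX_TOKEN limit are all hypothetical, and it swaps the plain "reset to 0" for a Knuth-Morris-Pratt failure table so that tokens with a repeated prefix (such as "ABABC") are not missed.

```c
#include <stddef.h>

#define MAX_TOKEN 64 /* assumed upper bound on token length (hypothetical) */

typedef struct {
    const unsigned char *token; /* token bytes, owned by the caller */
    size_t len;                 /* token length in bytes */
    size_t pos;                 /* token bytes matched so far; survives across chunks */
    size_t fail[MAX_TOKEN];     /* KMP failure table */
} stream_matcher;

/* Prepare the matcher. Returns 0 on success, -1 if the token is empty or too long. */
static int matcher_init(stream_matcher *m, const void *token, size_t len)
{
    if (len == 0 || len > MAX_TOKEN)
        return -1;
    m->token = token;
    m->len = len;
    m->pos = 0;

    /* fail[i] = length of the longest proper prefix of token[0..i]
       that is also a suffix of it. */
    m->fail[0] = 0;
    size_t k = 0;
    for (size_t i = 1; i < len; ++i) {
        while (k > 0 && m->token[i] != m->token[k])
            k = m->fail[k - 1];
        if (m->token[i] == m->token[k])
            ++k;
        m->fail[i] = k;
    }
    return 0;
}

/* Feed one chunk (it may be empty). Returns how many token occurrences
   were completed inside this chunk, including ones that began earlier. */
static size_t matcher_feed(stream_matcher *m, const void *data, size_t n)
{
    const unsigned char *p = data;
    size_t hits = 0;

    for (size_t i = 0; i < n; ++i) {
        /* On a mismatch, fall back to the longest prefix that still fits. */
        while (m->pos > 0 && p[i] != m->token[m->pos])
            m->pos = m->fail[m->pos - 1];
        if (p[i] == m->token[m->pos])
            ++m->pos;
        if (m->pos == m->len) {
            ++hits;
            m->pos = m->fail[m->len - 1]; /* keep going; allows overlapping matches */
        }
    }
    return hits;
}
```

With the chunks from the question, calling matcher_feed once per chunk after matcher_init(&m, "TOKEN", 5) would report the hit on the [KEN] chunk, since the "TO" matched at the end of the previous chunk is remembered in pos. Lengths are passed explicitly because the data may contain \x00 bytes. Only the token and a small table are stored, so nothing beyond the current chunk is buffered.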