Problem description
I was browsing the internet for chatbots, just for fun. But now I like the subject so much that I want to develop my own chatbot.
The first step is to find a good way to manage the "brain" of my chatbot. I think the best solution is to save everything in an XML file, isn't it?
So the file type is settled. That leaves the relationships between different nouns. Take a noun such as "tree": what is the best way to store that a tree has leaves, branches and roots, and that a tree needs water and sunlight to survive?
Should I save it like this, or some other way?
This would be my XML for the tree example:
<nouns>
  <noun id="noun_0">
    <name>tree</name>
    <relationship>
      <has>noun_1</has>
      <has>noun_2</has>
      <has>noun_3</has>
      <need>noun_4</need>
      <need>noun_5</need>
    </relationship>
  </noun>
  <noun id="noun_1">
    <name>root</name>
  </noun>
  <noun id="noun_2">
    <name>branch</name>
    <relationship>
      <has>noun_3</has>
    </relationship>
  </noun>
  <noun id="noun_3">
    <name>leaf</name>
  </noun>
  <noun id="noun_4">
    <name>water</name>
  </noun>
  <noun id="noun_5">
    <name>light</name>
  </noun>
  . . .
</nouns>
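For illustration, a file of this shape could be loaded with Python's standard xml.etree.ElementTree module. This is only a minimal sketch: the function name load_relations and the shortened SAMPLE data are assumptions made for the example, not part of the question.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the same shape as the XML above (assumed for this sketch).
SAMPLE = """
<nouns>
  <noun id="noun_0">
    <name>tree</name>
    <relationship>
      <has>noun_1</has>
      <need>noun_4</need>
    </relationship>
  </noun>
  <noun id="noun_1"><name>root</name></noun>
  <noun id="noun_4"><name>water</name></noun>
</nouns>
"""

def load_relations(xml_text):
    """Return {noun name: {relation tag: [related noun names]}}."""
    root = ET.fromstring(xml_text)
    # First pass: map every id to its readable name.
    names = {n.get("id"): n.findtext("name") for n in root.findall("noun")}
    relations = {}
    # Second pass: resolve the ids inside <relationship> to names.
    for noun in root.findall("noun"):
        entry = {}
        for rel in noun.findall("relationship/*"):
            # Fall back to the raw id if it does not match any noun.
            entry.setdefault(rel.tag, []).append(names.get(rel.text, rel.text))
        relations[names[noun.get("id")]] = entry
    return relations

brain = load_relations(SAMPLE)
print(brain["tree"])  # {'has': ['root'], 'need': ['water']}
```

Note that the whole document is parsed into memory at once, which is fine for a small file but is the cost that grows as the "brain" grows.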
Recommended answer
Data Storage Choices: It Depends
Simple Non-Learning Bots: XML Is Fine
It looks like you already have a basic XML structure worked out. For someone just starting out, I'd say that's fine, especially for AI support-chat kinds of bots (if userMsg.contains('lega') then print('TOS & Copyright...')).
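The keyword rule in parentheses above could look roughly like this in Python. It is a sketch under assumptions: the RULES table, the FALLBACK text and the function name reply are hypothetical, and the canned answer is left elided exactly as in the pseudocode.

```python
# Hypothetical keyword table: trigger substring -> canned reply.
RULES = {
    "lega": "TOS & Copyright...",
}
FALLBACK = "Sorry, I don't have an answer for that."  # assumed default reply

def reply(user_msg):
    """Return the first canned reply whose keyword occurs in the message."""
    text = user_msg.lower()  # match case-insensitively
    for keyword, answer in RULES.items():
        if keyword in text:
            return answer
    return FALLBACK

print(reply("Where can I find the legal terms?"))
```

A bot like this never learns anything, so its data stays as small as the rule table, which is why a plain XML file is enough for it.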
Of course, switching to any new format will take time and overhead.
Complex Learning Bots: Database!
If you're looking to do something much larger, especially if you have CleverBot in mind, I think you're going to need a database. This is because your file is, well, a file: once it becomes gigantic, trying to keep it all available in memory is resource intensive. For this kind of project, I'd recommend a database.
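As a rough sketch of what the database route might look like, here is the tree example stored with Python's built-in sqlite3 module. The table names noun and relation, and the helpers add_relation and related, are assumptions chosen for illustration; any relational database with a similar two-table layout would do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for persistent storage
conn.executescript("""
    CREATE TABLE noun (
        id   INTEGER PRIMARY KEY,
        name TEXT UNIQUE NOT NULL
    );
    CREATE TABLE relation (
        subject_id INTEGER NOT NULL REFERENCES noun(id),
        kind       TEXT    NOT NULL,   -- e.g. 'has' or 'need'
        object_id  INTEGER NOT NULL REFERENCES noun(id)
    );
""")

def add_relation(subject, kind, obj):
    """Store 'subject kind obj', creating the nouns if they are new."""
    for name in (subject, obj):
        conn.execute("INSERT OR IGNORE INTO noun (name) VALUES (?)", (name,))
    conn.execute(
        """INSERT INTO relation (subject_id, kind, object_id)
           SELECT s.id, ?, o.id FROM noun s, noun o
           WHERE s.name = ? AND o.name = ?""",
        (kind, subject, obj),
    )

def related(subject, kind):
    """Return the names related to subject by kind, sorted alphabetically."""
    rows = conn.execute(
        """SELECT o.name FROM relation r
           JOIN noun s ON s.id = r.subject_id
           JOIN noun o ON o.id = r.object_id
           WHERE s.name = ? AND r.kind = ?
           ORDER BY o.name""",
        (subject, kind),
    )
    return [name for (name,) in rows]

for obj in ("leaf", "branch", "root"):
    add_relation("tree", "has", obj)
for obj in ("water", "light"):
    add_relation("tree", "need", obj)

print(related("tree", "has"))   # ['branch', 'leaf', 'root']
print(related("tree", "need"))  # ['light', 'water']
```

The point of the design is that each lookup reads only the rows it needs, so the full data set never has to sit in memory at once.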
Why English Is Complicated
A while back I wrote a naive Bayes spam sorter. It took about 10,000 pieces of spam to "train" it to a 7% accuracy rate, which took about 6 hours and 1.5GB of RAM to hold the data in memory. That's a lot of data. English is very hard and can't really be broken into rules like if 'pony' then 'saddle', so for a bot to "learn" the best responses, your database is going to become massive, and very quickly.
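To make the memory point concrete, here is a minimal naive Bayes sketch in Python. It is not the spam sorter described above; the class name NaiveBayes, the "spam"/"ham" labels and the toy training sentences are all assumptions for illustration. What it shows is that the classifier keeps a count for every distinct word it has seen per label, so the stored data grows with the vocabulary.

```python
import math
from collections import Counter

class NaiveBayes:
    """Toy word-count naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def classify(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            # Log prior, smoothed so an unseen label never gets log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(counts.values())
            for word in text.lower().split():
                # Add-one smoothing; the extra +1 reserves a slot for unseen words.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(vocab) + 1)
                )
            scores[label] = score
        return max(scores, key=scores.get)

nb = NaiveBayes()
nb.train("cheap pills buy now", "spam")
nb.train("win money now", "spam")
nb.train("meeting notes for monday", "ham")
nb.train("lunch on monday", "ham")

print(nb.classify("buy cheap money"))
print(nb.classify("monday meeting"))
# Every distinct word is stored per label, so this table grows with the corpus.
print(len(nb.word_counts["spam"]) + len(nb.word_counts["ham"]))
```

With thousands of real messages instead of four toy sentences, these count tables are exactly the data that has to live somewhere, which is the argument for a database over a single file.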