

I am developing a Rails application that will access a lot of RSS feeds or crawl sites for data (mostly news). It will be something like Google News but with a different approach, so I'll store a lot of news (or news summaries), classify them into different categories, and use ranking and recommendation techniques.

  • Should I go with MySQL?

  • Is it worthwhile using IBM DB2 pureXML to store the documents? Also, Ruby search implementations (Ferret, Ultrasphinx and others) would not be needed if I choose DB2. Is that correct?


  • What are the advantages of PostgreSQL in this case?

  • Does it make sense to use CouchDB in this scenario?

I'd like to choose the best option without over-complicating the solution, so I discarded the idea of using two different storage solutions (one for the news documents and another for the rest of the data). I'm also considering only "free" options, so I didn't look at Oracle or MS SQL Server.




pureXML is heavier than plain SQL, so you pay more for each round trip between the web server and the DB. If you plan to have lots of users, I'd avoid it. You're better off letting your web server cache the responses, so you avoid generating the XML (RSS) on every request, if that is what you are thinking about.
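To make the caching idea concrete, here is a minimal sketch in plain Ruby of a time-based cache for rendered feed XML. `FeedCache` and `render_feed` are hypothetical names made up for this illustration, not part of Rails or any gem, and the in-memory hash only stands in for whatever cache store you would actually use.

```ruby
# Hypothetical in-memory cache with a time-to-live; illustration only.
class FeedCache
  # clock is injectable so expiry can be exercised without sleeping.
  def initialize(ttl_seconds:, clock: -> { Time.now.to_f })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Returns the cached body for key if it is still fresh;
  # otherwise runs the block, stores its result, and returns it.
  def fetch(key)
    now = @clock.call
    entry = @entries[key]
    return entry[:body] if entry && now - entry[:stored_at] < @ttl

    body = yield
    @entries[key] = { body: body, stored_at: now }
    body
  end
end

# Minimal XML escaping for text nodes.
def escape_xml(text)
  text.gsub(/[&<>]/, "&" => "&amp;", "<" => "&lt;", ">" => "&gt;")
end

# Stand-in for the expensive step (DB query plus XML rendering).
def render_feed(category, titles)
  items = titles.map { |t| "<item><title>#{escape_xml(t)}</title></item>" }.join
  "<rss version=\"2.0\"><channel><title>#{escape_xml(category)}</title>" \
    "#{items}</channel></rss>"
end

cache = FeedCache.new(ttl_seconds: 300)
# The block only runs when the entry is missing or older than five minutes.
xml = cache.fetch("feed/tech") { render_feed("Tech", ["First story", "Second story"]) }
puts xml
```

Inside a Rails app you would probably not write this yourself: `Rails.cache.fetch(key, expires_in: 5.minutes) { ... }` follows the same fetch-or-build pattern, and HTTP-level caching in front of the app can avoid hitting Rails at all.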


I'd go with MySQL because it's really good at serving and it's totally free. PostgreSQL is free too, but I haven't used it, so I can't say.


CouchDB could make sense, but not if you plan on doing OLAP (online analytical processing) on your data; a normal RDBMS will be better at that.

