Problem Description
We have a logging system called Xtrace. We use this system to dump logs, exceptions, traces, etc. into a SQL Azure database. The Ops team then uses this data for debugging and SCOM purposes. Considering the 150 GB limit that SQL Azure has, we are thinking of using the HDInsight (Big Data) Service.
If we dump the data into Azure Table Storage (ATS), will the HDInsight Service work against ATS?
Or will it work only against blob storage, which would mean the log records need to be created as files on blob storage?
Last question: considering the scenario I explained above, is this a good candidate for the HDInsight Service?
HDInsight is going to consume content from HDFS, or from blob storage mapped to HDFS via Azure Storage Vault (ASV), which effectively provides an HDFS layer on top of blob storage. The latter is the recommended approach, since you can have a significant amount of content written to blob storage, and this maps nicely into a file system that can be consumed by your HDInsight job later. This would work great for things like logs/traces. Imagine writing hourly logs to separate blobs within a particular container. You'd then have your HDInsight cluster created, attached to the same storage account. It then becomes very straightforward to specify your input directory, which is mapped to files inside your designated storage container, and off you go.
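For illustration, here is a minimal sketch of the hourly-blob pattern described above, assuming the azure-storage-blob Python package; the connection string, container name, and blob path layout are placeholders, and older SDK versions expose a different API:

```python
from datetime import datetime, timezone

from azure.storage.blob import BlobServiceClient  # pip install azure-storage-blob

# Placeholder connection string and container name -- replace with your own.
CONNECTION_STRING = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net"
CONTAINER = "xtrace-logs"


def write_hourly_log_blob(lines):
    """Write one blob per hour under a date-partitioned path, e.g. 2013/05/01/14.log."""
    now = datetime.now(timezone.utc)
    blob_name = now.strftime("%Y/%m/%d/%H") + ".log"

    service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
    blob = service.get_blob_client(container=CONTAINER, blob=blob_name)
    blob.upload_blob("\n".join(lines), overwrite=True)


# An HDInsight cluster attached to the same storage account can then point its
# input directory at that container/path via the ASV (later WASB) scheme; the
# exact URI form depends on the HDInsight version.
```

The date-partitioned blob names make it easy to scope an HDInsight job to a single day or hour of logs by choosing the matching input directory.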
You can also store data in Windows Azure SQL DB (legacy naming: "SQL Azure"), and use a tool called Sqoop to import data straight from SQL DB into HDFS for processing. However, you'll have the 150GB limit you mentioned in your question.
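As a rough idea of what the Sqoop route looks like, the snippet below shells out to a standard `sqoop import` from Python; the server, database, credentials, table name, and target directory are all placeholders, and it assumes Sqoop and the SQL Server JDBC driver are available on the node where it runs:

```python
import subprocess

# Placeholder JDBC URL -- substitute your own server, database, and credentials.
jdbc_url = (
    "jdbc:sqlserver://<server>.database.windows.net:1433;"
    "database=<database>;user=<user>@<server>;password=<password>;"
    "encrypt=true;loginTimeout=30;"
)

# Standard `sqoop import` options: pull the (hypothetical) Xtrace table into HDFS.
subprocess.run(
    [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", "XtraceLogs",         # hypothetical source table
        "--target-dir", "/data/xtrace",  # HDFS output directory
        "--num-mappers", "4",
    ],
    check=True,
)
```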
There's no built-in mapping from Table Storage to HDFS; you'd need to create some type of converter to read from Table Storage and write to text files for processing (although I think writing directly to text files would be more efficient, skipping the bulk read/write needed to prepare data for your HDInsight processing). Of course, if you're running non-HDInsight queries over your logging data, then it may indeed be worthwhile to store the data in Table Storage initially and extract the specific data you need whenever you launch your HDInsight jobs.
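A rough sketch of such a converter, assuming the azure-data-tables Python package, a hypothetical XtraceLogs table, and tab-separated output (the real columns and destination depend on your schema):

```python
from azure.data.tables import TableServiceClient  # pip install azure-data-tables

# Placeholder connection string and table name -- replace with your own.
CONNECTION_STRING = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net"
TABLE_NAME = "XtraceLogs"  # hypothetical table holding the Xtrace records


def export_table_to_text(output_path):
    """Read every entity from the table and write one tab-separated line per entity."""
    service = TableServiceClient.from_connection_string(CONNECTION_STRING)
    table = service.get_table_client(TABLE_NAME)

    with open(output_path, "w", encoding="utf-8") as out:
        for entity in table.list_entities():
            # Flatten the entity's properties into a single line; once the file is
            # uploaded to blob storage, an HDInsight job can consume it directly.
            values = [str(entity.get(key, "")) for key in sorted(entity.keys())]
            out.write("\t".join(values) + "\n")


export_table_to_text("xtrace-export.txt")
```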
There's some HDInsight documentation up on the Azure Portal that provides more detail around HDFS + Azure Storage Vault.