问题描述
我最近在java中编写了几个map / reduce程序。但我也知道像php这样的脚本语言也可以使用。但是,大多数人都推荐java或python。我目前在php工作。所以我想知道哪种语言更适合地图/减少程序开发?
map / reduce实现的一个主要缺点是,它不是多线程的。
此外,hadoop有专门用java编写的类,接口和方法的广泛框架,这些php程序无法使用。另外,php的设计并不是为了处理繁重的数据处理任务。
那么,谁能告诉我哪个选择地图作为武器的宽泛点/减少实施?
shanthanu,您的第一个问题是
语言对hadoop有好处?
A)大多数脚本语言,比如php,python,perl,ruby bash都不错。任何能够从标准输入读取,写入sdtout和解析选项卡以及换行字符的语言都可以工作:Hadoop Streaming只是将键值对的字符串表示与选项卡连接到一个任意程序,该程序必须在每个任务跟踪器节点上可执行。
在大多数用于设置hadoop集群的linux发行版中,python,bash,ruby,perl ...已经安装,但没有任何东西可以阻止自动执行环境对于你最喜欢的脚本或编译的编程语言。
)Q)PHP不是多线程的?
A)是,但是,我们可以通过多种方式让PHP成为多线程。例如,使用:pnctl_fork()(但是,这在窗口中不起作用)
在使用hadoop编写脚本语言之前,您应该始终牢记的问题是不是哪种脚本语言?因为任何事都可以。但是,java和脚本语言的区别在于,当我们使用脚本语言时,子节点的心跳不会被发送到父节点。
I recently wrote a couple of map/reduce programs in java. But I also know that scripting language like php can also be used. However, mostly everyone recommends java or python. I currently work in php. So I was wondering which language is better suited for map/reduce program development?
One major disadvantage of php for map/reduce implementation is that, it is not multi-threaded.Also that, hadoop has extensive framework of classes, interfaces and methods specially made in java, which php programs can't avail. And also that, php isn't designed to handle heavy data processing task.
So can anyone tell me in broad points which one to choose as a weapon of choice for map/reduce implementation?
shanthanu, your first question is
Q) which scripting language is good for hadoop?
A) Most of the scripting languages like php, python, perl, ruby bash is good. Any language able to read from stdin, write to sdtout and parse tab and new line characters will work: Hadoop Streaming just pipes the string representations of key value pairs as concatenated with a tab to an arbitrary program that must be executable on each task tracker node.
On most linux distros used to setup hadoop clusters, python, bash, ruby, perl... are already installed but nothing will prevent to roll up your own execution environment for your favorite scripting or compiled programming language.
Q) PHP is not multi threaded?
A) yes, but, there are ways through which we can make PHP multi-threaded. For example use: pnctl_fork() (but, this does not work in windows)
The question which you should keep always in mind before going for the scripting languages with hadoop is not "which scripting language?" because anything is ok.
But, the difference between java and scripting language, it is "Heart Beat of child nodes will not be sent to the parent nodes when we are using scripting languages".
这篇关于Hadoop Map / Reduce程序使用哪种语言? Java还是PHP?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!