Automatically set the maximum number of map tasks on a node to the number of cores?

I am setting up a Hadoop cluster whose nodes are heterogeneous, i.e. each of them has a different number of cores. Currently, I have to manually edit mapred-site.xml on each node to fill in {cores}:

<property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>{cores}</value>
</property>

Is there an easier way to do this when I add new nodes? Most of the other values are left at their defaults, and the maximum number of map tasks is the only thing that changes from node to node.

1 answer

On Linux, you can get the number of "cores" (strictly, the logical processors that the kernel reports) with:

grep -c '^processor' /proc/cpuinfo

You can then use sed to substitute that value into mapred-site.xml.

For example, with the {cores} placeholder left in the file:

# Count the logical processors, then fill in the {cores} placeholder in place.
CORES=$(grep -c '^processor' /proc/cpuinfo)
sed -i "s/{cores}/$CORES/g" mapred-site.xml
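
Because sed -i overwrites the placeholder, the command above only works the first time it is run on a given file. A minimal sketch of a re-runnable variant follows, assuming you keep an untouched template next to the real config. The function name render_mapred_site and the file name mapred-site.xml.template are hypothetical, not anything Hadoop provides, and the fallback to 1 core is an assumption for machines where /proc/cpuinfo cannot be read.

```shell
#!/bin/sh
# Hypothetical helper: render a config file from a template that still
# contains the literal {cores} placeholder, so it can be re-run safely.
render_mapred_site() {
    template=$1   # assumed path, e.g. mapred-site.xml.template
    output=$2     # assumed path, e.g. mapred-site.xml

    # Count "processor" entries; Linux-only, since it reads /proc/cpuinfo.
    cores=$(grep -c '^processor' /proc/cpuinfo 2>/dev/null)

    # Assumption: fall back to 1 if detection gave nothing usable.
    case $cores in
        ''|0|*[!0-9]*) cores=1 ;;
    esac

    # Write to the output file and leave the template untouched.
    sed "s/{cores}/$cores/g" "$template" > "$output"
}
```

On a new node this could be called as render_mapred_site mapred-site.xml.template mapred-site.xml before the TaskTracker is started.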

, , , , , , , node ..
