Java.lang.OutOfMemoryError: unable to create new native thread for large dataset

Question

I have a Hive query that works fine for a small dataset, but when I run it on 250 million records, I get the errors below in the logs:

 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError:   unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:640)
    at org.apache.hadoop.mapred.Task$TaskReporter.startCommunicationThread(Task.java:725)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)



 2013-03-18 14:12:58,907 WARN org.apache.hadoop.mapred.Child: Error running child
 java.io.IOException: Cannot run program "ln": java.io.IOException: error=11, Resource temporarily unavailable
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at java.lang.Runtime.exec(Runtime.java:593)
    at java.lang.Runtime.exec(Runtime.java:431)
    at java.lang.Runtime.exec(Runtime.java:369)
    at org.apache.hadoop.fs.FileUtil.symLink(FileUtil.java:567)
    at org.apache.hadoop.mapred.TaskRunner.symlink(TaskRunner.java:787)
    at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:752)
    at org.apache.hadoop.mapred.Child.main(Child.java:225)
 Caused by: java.io.IOException: java.io.IOException: error=11, Resource temporarily unavailable
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
    at java.lang.ProcessImpl.start(ProcessImpl.java:65)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
    ... 7 more
2013-03-18 14:12:58,911 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2013-03-18 14:12:58,911 INFO org.apache.hadoop.mapred.Child: Error cleaning up
  java.lang.NullPointerException
    at org.apache.hadoop.mapred.Task.taskCleanup(Task.java:1048)
    at org.apache.hadoop.mapred.Child.main(Child.java:281)

I need help with this.

+5

hadoop hive

hjamali52 Mar 19 '13 at 8:37


3 answers

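I have run into this with MapReduce in general. In my experience it is not actually an Out of Memory error: the system is running out of file descriptors to start threads, which is why it says "unable to create new native thread".

The fix for us (on Linux) was to raise the ulimit, which was set to 1024, to 2048 with: ulimit -n 2048. You need permission to do this: either sudo or root access, or a hard limit of 2048 or higher, so that you can set it as your own user. You can put this in your .profile or .bashrc.

You can check your current settings with ulimit -a. See this reference for more details: fooobar.com/questions/40824/...

I have also seen many others talk about changing /etc/security/limits.conf, but I have not had to do that yet. Here is a link that discusses it: fooobar.com/questions/40824/...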
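The check-and-raise steps described above can be sketched as a short shell snippet. This is a minimal sketch, assuming a bash-like shell on Linux; the target of 2048 is the value from this answer, and the variable names are only illustrative.

```shell
#!/bin/sh
# Target soft limit for open file descriptors (value taken from the answer).
target=2048

soft=$(ulimit -S -n)   # current soft limit
hard=$(ulimit -H -n)   # ceiling up to which a non-root user may raise the soft limit
echo "open files: soft=$soft hard=$hard"

# Raise the soft limit only when it is below the target and the hard limit allows it.
if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$target" ]; then
    if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$target" ]; then
        ulimit -S -n "$target"
    else
        echo "hard limit $hard is below $target; raising it needs root" >&2
    fi
fi

echo "open files now: soft=$(ulimit -S -n)"
```

The new limit applies only to the current shell and the processes started from it, which is why the answer suggests putting the ulimit line in .profile or .bashrc.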

+7

quux00 11 . '13 19:54

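If your job is failing because of an OutOfMemoryError on the nodes, you can tweak the maximum number of map and reduce tasks and the JVM options for each. mapred.child.java.opts (the default is -Xmx200m) usually has to be increased to suit the hardware of your data nodes.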
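As a hedged illustration, the task JVM heap can be raised for a single Hive run by passing mapred.child.java.opts through the Hive CLI's --hiveconf option. The 1024 MB heap and the script name query.hql below are placeholders, not values recommended in this thread.

```shell
#!/bin/sh
# Hypothetical values: tune the heap to your nodes' memory; query.hql is a placeholder.
CHILD_OPTS="-Xmx1024m"
QUERY_FILE="query.hql"

if command -v hive >/dev/null 2>&1 && [ -f "$QUERY_FILE" ]; then
    # --hiveconf sets a configuration property for this invocation only.
    hive --hiveconf mapred.child.java.opts="$CHILD_OPTS" -f "$QUERY_FILE"
else
    # Hive or the script is missing: show the command instead of running it.
    echo "would run: hive --hiveconf mapred.child.java.opts=$CHILD_OPTS -f $QUERY_FILE"
fi
```

The same property can also be set from inside a Hive session with SET mapred.child.java.opts=-Xmx1024m; before running the query.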

+1

Gargi Mar 21 '13 at 8:20


hjamali52 · Accepted Answer · 2013-10-12T04:00:02+0000

Thanks, everyone. You're right: this is caused by file descriptors, because my program was generating a lot of files in the target table due to the multi-level partition structure.

I increased the ulimit as well as the xceivers property. That helped, but in our situation those limits were still exceeded.

Then we decided to distribute the data by partition, so that we now get only one file per partition.
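Distributing the data by partition, as described above, might look like the following in HiveQL. This is a sketch rather than the author's actual query: the table names (source_table, target_table), the columns (id, payload) and the single partition column dt are all hypothetical. With a multi-level partition layout, every partition column would be listed in both PARTITION (...) and DISTRIBUTE BY. The script only writes the query to a file so that it can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical file, table and column names throughout.
HQL_FILE="distribute_by_partition.hql"

cat > "$HQL_FILE" <<'EOF'
-- Allow partition values to be taken from the data (dynamic partitioning).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- DISTRIBUTE BY sends all rows with the same dt to the same reducer,
-- so each partition is written by one reducer instead of by every reducer.
INSERT OVERWRITE TABLE target_table PARTITION (dt)
SELECT id, payload, dt
FROM source_table
DISTRIBUTE BY dt;
EOF

echo "wrote $HQL_FILE; run it with: hive -f $HQL_FILE"
```

Without DISTRIBUTE BY, each reducer can open one output file for every partition it receives rows for, so the number of open files grows roughly with reducers times partitions. Routing each partition to a single reducer keeps it at about one file per partition.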

. 50 + ,
