mettastar Posted February 13, 2018

Please suggest a Tez configuration for me. I'll list my cluster hardware and my current config here. I feel I'm underutilizing my cluster.

Hardware: 1 master node (64 vcores, 244 GB RAM, 640 GB SSD) and 4 data nodes with the same config as the master.

Below is the Hive/Tez config (all in hive-site):

tez.am.resource.memory.mb = 59205
hive.tez.java.opts = -Xmx47364m
hive.execution.engine = tez
hive.vectorized.execution.enabled = true
tez.am.grouping.max-size = 36700160000
hive.tez.container.size = 59205
hive.vectorized.execution.reduce.enabled = true
tez.task.resource.memory.mb = 10000
tez.am.launch.cmd-opts = -Xmx47364m

With the current settings I see only 15 mappers or reducers running at any given point. I have read that reducing the container size, so that more containers run in parallel, makes queries run faster, and I want to try that.

https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html

I changed my Tez settings according to this doc, but containers are getting killed with a physical memory issue. Can someone please help? Thanks.
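
A rough sketch of why only about 15 tasks run at once with the settings above. It assumes YARN may use roughly 240 GB of each data node's 244 GB; the real figure is whatever `yarn.nodemanager.resource.memory-mb` is set to, which the post does not give. The function name and the simple placement model (one application master on one node, memory as the only limit) are illustrative assumptions, not how the scheduler is actually implemented.

```python
def concurrent_task_containers(node_mem_mb, nodes, container_mb, am_mb):
    """Estimate task containers that fit on the cluster next to one Tez AM.

    Simplified model: memory is the only limit, and the AM lands on one
    node, reducing the room left for task containers on that node.
    """
    per_full_node = node_mem_mb // container_mb
    on_am_node = (node_mem_mb - am_mb) // container_mb
    return on_am_node + per_full_node * (nodes - 1)


# ASSUMPTION: 240 GB per data node is available to YARN.
NODE_MEM_MB = 240 * 1024

# Values from the post: 4 data nodes, 59205 MB task containers and AM.
print(concurrent_task_containers(NODE_MEM_MB, 4, 59205, 59205))

# Hypothetical smaller container size, same AM size, for comparison.
print(concurrent_task_containers(NODE_MEM_MB, 4, 5120, 59205))
```

Under these assumptions the first call gives 15, matching the number of mappers or reducers observed, and a smaller container size raises the count considerably. Vcore limits are ignored here and may cap the real number.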
mettastar Posted February 13, 2018

Bumping this for the Hadoop devs. @kasi uncle, any help?
kasi Posted February 13, 2018

I work on Spark, but I'll see if I can help you.

tez.grouping.max-size (default 1073741824, which is 1 GB)
tez.grouping.min-size (default 52428800, which is 50 MB)
tez.grouping.split-count (not set by default)

http://www.openkb.info/2017/05/hive-on-tez-how-to-control-number-of.html

The number of mappers and reducers depends on the container size, and optimizing your job depends on the size of the data you have. Your data might be small, but if you are shuffling it too much there will be a lot of overhead. For example, if you get an error message like "9.2 GB out of 9 GB", make sure you have enough overhead. I'm not sure how that translates to Tez; try googling it.
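
To make the grouping defaults above concrete, here is a minimal sketch of the task-count range they imply. It assumes each grouped split holds between `tez.grouping.min-size` and `tez.grouping.max-size` bytes and ignores everything else Tez considers (such as available cluster capacity), so treat the result as loose bounds only. The 15 GiB input size and the function name are illustrative assumptions.

```python
MAX_SIZE = 1073741824  # tez.grouping.max-size default quoted above
MIN_SIZE = 52428800    # tez.grouping.min-size default quoted above


def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def task_count_bounds(input_bytes, min_size=MIN_SIZE, max_size=MAX_SIZE):
    """Return (fewest, most) grouped splits under the simplified model."""
    fewest = ceil_div(input_bytes, max_size)  # every group as large as allowed
    most = ceil_div(input_bytes, min_size)    # every group as small as allowed
    return fewest, most


# ASSUMPTION: an illustrative input of 15 GiB.
print(task_count_bounds(15 * 1024**3))
```

Raising the maximum group size lowers the lower bound, which is one way a job can end up with fewer mappers than the cluster could run.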
mettastar Posted February 13, 2018

I'm getting a similar error, uncle. The data volume isn't very large, around 10-15 GB. Looking at the config, I feel it should process that easily. What do you mean by enough overhead, uncle? Heap size?
mettastar Posted February 13, 2018

Also, do you know of any active Hive/Hadoop forums where I can ask questions like this?
rapchik Posted February 13, 2018

Shoot these guys your questions; you might get replies:

http://www.hadoopwizard.com/top-10-helpful-hadoop-experts-on-stack-overflow/

If you click the names there, it takes you directly to their profiles on the StackOverflow site.
mettastar Posted February 13, 2018

Thanks man, will try that.
kasi Posted February 13, 2018

Yes, heap and GC; that's what I thought. Check for skewed data: if you are shuffling your data, it may be getting skewed after the shuffle. Google the main part of the error.
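
One simple way to check for skew, sketched under the assumption that you can pull a sample of the join or group-by keys into a list. The function name and the toy sample are hypothetical; on a real table the same counts would normally come from a GROUP BY query on the key column.

```python
from collections import Counter


def skew_ratio(keys):
    """Largest per-key count divided by the mean count per distinct key.

    A value near 1.0 means keys are evenly spread; larger values mean
    one key dominates and its task will receive far more rows.
    """
    counts = Counter(keys)
    if not counts:
        return 0.0
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean


# Toy sample: key "a" appears far more often than the others.
sample = ["a"] * 8 + ["b", "c"]
print(skew_ratio(sample))
```

A high ratio suggests that one reducer gets most of the shuffled rows, which can push a single container over its memory limit even when the total data volume is small.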
mettastar Posted February 14, 2018

set hive.execution.engine=tez;
set tez.am.resource.memory.mb=59205;
set tez.am.launch.cmd-opts=-Xmx47364m;
set hive.tez.container.size=5120;
set hive.tez.java.opts=-Xmx4096m;
set tez.task.resource.memory.mb=10000;
set hive.auto.convert.join.noconditionaltask.size=1000000000;
set tez.am.grouping.max-size=36700160000;
--set tez.grouping.max-size=1073741824;
--set tez.grouping.min-size=52428800;
set tez.runtime.io.sort.mb=2048;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;

I used these settings and was able to reduce the run time by 1/3rd.
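
A small sketch of how the memory values in these settings relate to each other. It only does arithmetic on the numbers posted above (container 5120 MB, heap 4096 MB, sort buffer 2048 MB); the function name and the returned field names are illustrative assumptions. The headroom is the part of the container left for non-heap memory, which matters because YARN enforces the limit on the whole process, not just the Java heap.

```python
def memory_breakdown(container_mb, heap_mb, sort_mb):
    """Express heap and sort buffer as a percentage of the container size."""
    return {
        "heap_pct": 100.0 * heap_mb / container_mb,
        "sort_pct": 100.0 * sort_mb / container_mb,
        "headroom_mb": container_mb - heap_mb,  # left for non-heap memory
    }


# hive.tez.container.size, -Xmx from hive.tez.java.opts, tez.runtime.io.sort.mb
print(memory_breakdown(5120, 4096, 2048))
```

If containers are still killed for exceeding physical memory, one option is to lower the heap relative to the container so the headroom grows.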