calling hadoop admins or developers

February 13, 2018

Could someone suggest a Tez configuration for me? I will list my cluster hardware and my current config here. I feel I'm underutilizing my cluster.

Hardware: 1 master node with 64 vCores, 244 GB RAM, and 640 GB SSD, plus 4 data nodes with the same config as the master.

Below is the Hive/Tez config:

hive-site   tez.am.resource.memory.mb   59205
hive-site   hive.tez.java.opts   -Xmx47364m
hive-site   hive.execution.engine   tez
hive-site   hive.vectorized.execution.enabled   true
hive-site   tez.am.grouping.max-size   36700160000
hive-site   hive.tez.container.size   59205
hive-site   hive.vectorized.execution.reduce.enabled   true
hive-site   tez.task.resource.memory.mb   10000
hive-site   tez.am.launch.cmd-opts   -Xmx47364m

With the current settings I see only 15 mappers or reducers running at any given time. I have read that reducing the container size so that more containers run in parallel makes queries run faster, and I want to try that.

https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html

I changed my Tez settings according to this doc, but containers are getting killed with a physical memory issue.

Could someone please help? Thanks.
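The figure of 15 concurrent tasks is roughly what memory arithmetic alone predicts. The sketch below is a back-of-the-envelope estimate, not how YARN actually schedules: it assumes each data node offers its full 244 GB to YARN (the thread does not give `yarn.nodemanager.resource.memory-mb`), that the Tez application master sits on one data node, and it ignores vcores, queue limits, and allocation rounding. The function name `max_concurrent_tasks` is made up for illustration.

```python
# Assumption: every data node gives all 244 GB to YARN; the real
# yarn.nodemanager.resource.memory-mb value is not stated in the thread.
NODE_MEMORY_MB = 244 * 1024
DATA_NODES = 4


def max_concurrent_tasks(container_mb, am_mb,
                         node_mb=NODE_MEMORY_MB, nodes=DATA_NODES):
    """Memory-only upper bound on task containers running at once.

    One node also hosts the application master (am_mb); the other
    nodes are filled with task containers only.
    """
    full_nodes = (nodes - 1) * (node_mb // container_mb)
    am_node = max(node_mb - am_mb, 0) // container_mb
    return full_nodes + am_node


# Settings from the post: 59205 MB task containers, 59205 MB AM.
print(max_concurrent_tasks(59205, 59205))
# Same AM, but 5120 MB task containers.
print(max_concurrent_tasks(5120, 59205))
```

Under these assumptions the large containers leave room for only a handful of tasks per node, which is consistent with the 15 observed, while a 5120 MB container would allow far more parallelism.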

February 13, 2018

ltt for hadoop devs

@kasi vuncle any help ?

February 13, 2018

Ltt

February 13, 2018

3 minutes ago, mettastar said:

Could someone suggest a Tez configuration for me? I will list my cluster hardware and my current config here. I feel I'm underutilizing my cluster.

Hardware: 1 master node with 64 vCores, 244 GB RAM, and 640 GB SSD, plus 4 data nodes with the same config as the master.

Below is the Hive/Tez config:

hive-site   tez.am.resource.memory.mb   59205
hive-site   hive.tez.java.opts   -Xmx47364m
hive-site   hive.execution.engine   tez
hive-site   hive.vectorized.execution.enabled   true
hive-site   tez.am.grouping.max-size   36700160000
hive-site   hive.tez.container.size   59205
hive-site   hive.vectorized.execution.reduce.enabled   true
hive-site   tez.task.resource.memory.mb   10000
hive-site   tez.am.launch.cmd-opts   -Xmx47364m

With the current settings I see only 15 mappers or reducers running at any given time. I have read that reducing the container size so that more containers run in parallel makes queries run faster, and I want to try that.

https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html

I changed my Tez settings according to this doc, but containers are getting killed with a physical memory issue.

Could someone please help? Thanks.

I work on Spark.

Will see if I can help you.

tez.grouping.max-size (default 1073741824, which is 1 GB)
tez.grouping.min-size (default 52428800, which is 50 MB)
tez.grouping.split-count (not set by default)

http://www.openkb.info/2017/05/hive-on-tez-how-to-control-number-of.html

And the number of mappers and reducers depends on the container size.

Optimizing your job also depends on the size of the data you have.

Your data might be small, but if you are shuffling it too much, there will be a lot of overhead.

For example, if you get an error message like "9.2 GB out of 9 GB", make sure you have enough overhead. I'm not sure how that translates to Tez, so try googling it.
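To see how the grouping sizes above bound the number of tasks, here is a rough sketch. It only assumes that each grouped split ends up between the minimum and maximum size; the actual Tez grouping logic also takes available cluster capacity into account, so treat the result as a range and not a prediction. The function name `grouped_task_bounds` is hypothetical.

```python
import math


def grouped_task_bounds(input_bytes,
                        min_size=52428800,      # 50 MB default quoted above
                        max_size=1073741824):   # 1 GB default quoted above
    """Return (fewest, most) tasks if every grouped split
    stays between min_size and max_size bytes."""
    fewest = math.ceil(input_bytes / max_size)
    most = max(input_bytes // min_size, 1)
    return fewest, most


ten_gib = 10 * 1024**3

# With the defaults, 10 GiB of input needs at least 10 tasks.
print(grouped_task_bounds(ten_gib))

# With a maximum of 36700160000 bytes (the value posted earlier in the
# thread), the same input could in principle be grouped into one task.
print(grouped_task_bounds(ten_gib, max_size=36700160000))
```

A very large maximum group size therefore allows the input to collapse into very few tasks, no matter how many containers the cluster could run.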

February 13, 2018

3 minutes ago, kasi said:

I work on Spark.

Will see if I can help you.

tez.grouping.max-size (default 1073741824, which is 1 GB)

tez.grouping.min-size (default 52428800, which is 50 MB)

tez.grouping.split-count (not set by default)

http://www.openkb.info/2017/05/hive-on-tez-how-to-control-number-of.html

And the number of mappers and reducers depends on the container size.

Optimizing your job also depends on the size of the data you have.

Your data might be small, but if you are shuffling it too much, there will be a lot of overhead.

For example, if you get an error message like "9.2 GB out of 9 GB", make sure you have enough overhead. I'm not sure how that translates to Tez, so try googling it.

I'm getting a similar error, uncle. The data volume isn't very large, around 10-15 GB. Looking at the config, I feel it should process that easily.

What does "enough overhead" mean, uncle? Heap size?
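One way to read "overhead" here: YARN kills a container when the whole process uses more physical memory than the container was granted, and a JVM needs memory beyond its `-Xmx` heap (thread stacks, metaspace, direct buffers). The gap between container size and heap is that headroom. A minimal sketch, with hypothetical helper names `xmx_to_mb` and `headroom_mb`:

```python
import re


def xmx_to_mb(java_opts):
    """Extract the -Xmx value in MB from a JVM options string (m or g suffix)."""
    match = re.search(r"-Xmx(\d+)([mMgG])", java_opts)
    if match is None:
        raise ValueError("no -Xmx setting found")
    value, unit = int(match.group(1)), match.group(2).lower()
    return value * 1024 if unit == "g" else value


def headroom_mb(container_mb, java_opts):
    """Memory left in the container for everything outside the Java heap."""
    return container_mb - xmx_to_mb(java_opts)


# Original settings from the first post.
print(headroom_mb(59205, "-Xmx47364m"))

# Hypothetical mistake: shrinking the container but leaving the old heap.
# A negative value means the heap alone can outgrow the container.
print(headroom_mb(5120, "-Xmx47364m"))
```

Whether this is what caused the kills in this thread is not known; it is only one common way to end up with a physical memory error after lowering the container size.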

February 13, 2018

and do you know of any active hive/hadoop forums where I can ask questions like this ?

February 13, 2018

4 minutes ago, mettastar said:

and do you know of any active hive/hadoop forums where I can ask questions like this ?

shoot these guys questions... might get replies

http://www.hadoopwizard.com/top-10-helpful-hadoop-experts-on-stack-overflow/

If you click their names there, it takes you directly to their profiles on the Stack Overflow site.

February 13, 2018

35 minutes ago, rapchik said:

shoot these guys questions... might get replies

http://www.hadoopwizard.com/top-10-helpful-hadoop-experts-on-stack-overflow/

If you click their names there, it takes you directly to their profiles on the Stack Overflow site.

Thanks man, will try that.

February 13, 2018

1 hour ago, mettastar said:

I'm getting a similar error, uncle. The data volume isn't very large, around 10-15 GB. Looking at the config, I feel it should process that easily.

What does "enough overhead" mean, uncle? Heap size?

Yes, heap... GC.

That's what I thought. Check for skewed data.

If you are shuffling your data, it may be getting skewed after the shuffle.

Google the main part of the error.

February 14, 2018

set hive.execution.engine=tez;
set tez.am.resource.memory.mb=59205;
set tez.am.launch.cmd-opts=-Xmx47364m;
set hive.tez.container.size=5120;
set hive.tez.java.opts=-Xmx4096m;
set tez.task.resource.memory.mb=10000;
set hive.auto.convert.join.noconditionaltask.size=1000000000;
set tez.am.grouping.max-size=36700160000;
--set tez.grouping.max-size=1073741824;
--set tez.grouping.min-size=52428800;
set tez.runtime.io.sort.mb=2048;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;

I used these settings and was able to reduce the run time by one third.
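For anyone comparing their own values, the proportions in the settings above can be computed as follows. The fractions often quoted as rules of thumb for Hive on Tez (heap around 80% of the container, sort buffer around 40%, map-join threshold up to about a third) are assumptions here, not something stated in this thread. The function name `setting_ratios` is hypothetical.

```python
def setting_ratios(container_mb, xmx_mb, sort_mb, mapjoin_bytes):
    """Express heap, sort buffer, and map-join threshold as fractions
    of the container size (and the sort buffer as a fraction of heap)."""
    mapjoin_mb = mapjoin_bytes / (1024 * 1024)
    return {
        "heap_of_container": xmx_mb / container_mb,
        "sort_buffer_of_container": sort_mb / container_mb,
        "sort_buffer_of_heap": sort_mb / xmx_mb,
        "mapjoin_of_container": mapjoin_mb / container_mb,
    }


# Values taken from the set statements above:
# hive.tez.container.size=5120, -Xmx4096m,
# tez.runtime.io.sort.mb=2048,
# hive.auto.convert.join.noconditionaltask.size=1000000000
for name, ratio in setting_ratios(5120, 4096, 2048, 1000000000).items():
    print(f"{name}: {ratio:.2f}")
```

Note that the sort buffer is allocated inside the Java heap, so the `sort_buffer_of_heap` figure is the one to watch if tasks start failing with out-of-memory errors.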
