
Spark - Scala - help



Posted
7 minutes ago, siritptpras said:

Totally different question: how are the job opportunities for Spark and Scala? OP, sorry for posting this here, but I am planning to learn them.

No idea, bro.

Posted
1 minute ago, mettastar said:

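import org.apache.spark.sql.{Row, SaveMode}

// Assumes sc (SparkContext), r (RDD[String] of tab-separated lines) and schema (StructType) are already defined.
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
// A split limit of -1 keeps trailing empty fields instead of dropping them
val rowRDD = r.map(row => Row.fromSeq(row.split("\t", -1)))
val df = hiveContext.createDataFrame(rowRDD, schema)

// Write ORC files, one sub-directory per distinct processed_day value
df.write.mode(SaveMode.Overwrite).format("orc").partitionBy("processed_day").save("/user/hive/warehouse/df_d_distributor_return_items/")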

 

I tried it this way and it ran forever. I also tried Hive dynamic partitioning, and that ran forever too.

When I ran Hive dynamic partitioning on just one month of data, it finished in about 20 minutes.

Now I want to get the max and min of the processed_day column and then use a for loop to step through that range in one-month increments, so the dynamic partitioning runs one month at a time.

So could you tell me how to find that max and min and loop in one-month increments, bro? I'm googling it too. Thanks.

ltt
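
For the min/max question quoted above, here is a minimal sketch of the looping part only, in plain Scala with java.time. It assumes processed_day holds ISO dates (yyyy-MM-dd) and that the min and max have already been fetched, for example with an aggregate over the column. The object and method names (MonthWindows, monthWindows) are made up for illustration and are not from the original job.

```scala
import java.time.LocalDate

object MonthWindows {
  // Returns contiguous half-open [start, end) windows of at most one month each,
  // covering every day from minDay through maxDay inclusive.
  // Returns an empty list when minDay is after maxDay.
  def monthWindows(minDay: LocalDate, maxDay: LocalDate): Seq[(LocalDate, LocalDate)] = {
    val endExclusive = maxDay.plusDays(1)
    Iterator
      .iterate(minDay)(_.plusMonths(1))      // window starts: minDay, +1 month, +2 months, ...
      .takeWhile(_.isBefore(endExclusive))   // stop once a start falls past the last day
      .map { start =>
        val next = start.plusMonths(1)
        // Clip the final window so it does not run past maxDay
        (start, if (next.isAfter(endExclusive)) endExclusive else next)
      }
      .toList
  }
}
```

Each (start, end) pair could then drive one pass of the job, for instance by filtering the DataFrame with a condition such as processed_day >= start AND processed_day < end before writing. Whether that is faster than a single partitioned write depends on the data and cluster, so treat it as something to measure rather than a guaranteed fix.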

Posted
4 hours ago, mettastar said:

Gurus,

I need some help. I have a large dataset that I read in Spark, and I want to dynamically partition the data into multiple folders based on a date field.

I was able to do that using hiveContext, but it takes a lot of time.

So instead, I want to read the distinct dates from that date field, store them in a variable, and use a for loop to manually create the folders and load the data into them. That way I don't have to use HiveQL.

Any other suggestions?

You already know the solution, no? Why ask how to do it again?

Posted
51 minutes ago, k2s said:

You already know the solution, no? Why ask how to do it again?

I'm no good at programming, uncle, and there aren't many Scala examples to be found either. That's why I'm asking.

Posted
10 hours ago, mettastar said:

Gurus,

I need some help. I have a large dataset that I read in Spark, and I want to dynamically partition the data into multiple folders based on a date field.

I was able to do that using hiveContext, but it takes a lot of time.

So instead, I want to read the distinct dates from that date field, store them in a variable, and use a for loop to manually create the folders and load the data into them. That way I don't have to use HiveQL.

Any other suggestions?

Ask @kasi.

Posted
13 hours ago, mettastar said:

Gurus,

I need some help. I have a large dataset that I read in Spark, and I want to dynamically partition the data into multiple folders based on a date field.

I was able to do that using hiveContext, but it takes a lot of time.

So instead, I want to read the distinct dates from that date field, store them in a variable, and use a for loop to manually create the folders and load the data into them. That way I don't have to use HiveQL.

Any other suggestions?

Boss, you're speaking some kind of Hadoop language. I can't understand any of it.

Posted
16 hours ago, mettastar said:

I'm no good at programming, uncle, and there aren't many Scala examples to be found either. That's why I'm asking.

You're beating a dead horse, uncle.

Posted

Done. One year of partitions takes almost 1.5 hours, which is not bad.

I couldn't iterate over the DataFrame directly, so I had to go through an RDD to iterate.

So I store the distinct dates in an RDD, then use a for-each loop over it to read one month of data at a time and dynamically partition it using hiveContext.

If anyone wants the code, let me know and I can paste it here. I don't know whether it can be optimized any further.
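
The poster's code was not shared in the thread, so the following is only a rough sketch of the month-at-a-time idea described above, written in plain Scala. It assumes the distinct processed_day values have already been collected to the driver as ISO date strings (yyyy-MM-dd); the names MonthlyBatches, groupByMonth and monthPredicate are hypothetical helpers, not part of Spark.

```scala
import java.time.{LocalDate, YearMonth}

object MonthlyBatches {
  // Groups distinct ISO dates (yyyy-MM-dd) by calendar month, oldest month first.
  // Only months that actually occur in the input are returned.
  def groupByMonth(days: Seq[String]): Seq[(YearMonth, Seq[LocalDate])] =
    days.distinct
      .map(s => LocalDate.parse(s))
      .groupBy(d => YearMonth.from(d))
      .toSeq
      .sortBy { case (ym, _) => (ym.getYear, ym.getMonthValue) }
      .map { case (ym, ds) => (ym, ds.sortBy(_.toEpochDay)) }

  // Builds a SQL-style predicate string that selects one calendar month of rows.
  def monthPredicate(column: String, month: YearMonth): String = {
    val start = month.atDay(1)                // first day of the month
    val end = month.plusMonths(1).atDay(1)    // first day of the following month (exclusive)
    s"$column >= '$start' AND $column < '$end'"
  }
}
```

A driver-side loop over groupByMonth could pass each monthPredicate string to a DataFrame filter and then run the partitioned write for that month only. Grouping on the driver keeps the loop in ordinary Scala, which is one way to avoid calling hiveContext from inside an RDD operation.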

Posted
Just now, mettastar said:

Done. One year of partitions takes almost 1.5 hours, which is not bad.

I couldn't iterate over the DataFrame directly, so I had to go through an RDD to iterate.

So I store the distinct dates in an RDD, then use a for-each loop over it to read one month of data at a time and dynamically partition it using hiveContext.

If anyone wants the code, let me know and I can paste it here. I don't know whether it can be optimized any further.

Working even on a holiday? :o

Posted
51 minutes ago, perugu_vada said:

Working even on a holiday? :o

No plans, so I'm working.
