Spark - Scala - help


mettastar

1 minute ago, mettastar said:

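import org.apache.spark.sql.{Row, SaveMode}

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
// r is assumed to be an RDD[String] of tab-separated lines; -1 keeps trailing empty fields
val rowRDD = r.map(row => Row.fromSeq(row.split("\t", -1)))
val df = hiveContext.createDataFrame(rowRDD, schema)

// Write ORC files, one folder per distinct processed_day value
df.write.mode(SaveMode.Overwrite).format("orc").partitionBy("processed_day").save("/user/hive/warehouse/df_d_distributor_return_items/")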

 

I tried it this way and it ran forever. I also tried the same thing with Hive dynamic partitioning, and that ran forever too.

When I ran Hive dynamic partitioning on just one month of data, it finished in about 20 minutes.

Now I want to get the max and min of the processed_day column and then use a for loop to step through the range in one-month increments, doing the dynamic partitioning one month at a time.

So can you tell me how to find that max and min and loop in one-month increments, bro? I'm googling it too. Thanks.

ltt
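
A minimal sketch of the one-month stepping asked about above, in plain Scala with java.time. The object and function names (MonthWindows, monthWindows) are made up for illustration, and it assumes the min and max of processed_day have already been parsed into LocalDate values. In Spark the two bounds could come from something like df.agg(min("processed_day"), max("processed_day")), though that is not shown here.

```scala
import java.time.LocalDate

object MonthWindows {
  /** Half-open [start, end) windows of at most one month, covering min..max inclusive. */
  def monthWindows(min: LocalDate, max: LocalDate): List[(LocalDate, LocalDate)] = {
    val stop = max.plusDays(1) // exclusive upper bound, so `max` itself is included
    Iterator
      .iterate(min)(_.plusMonths(1))   // window starts: min, min + 1 month, ...
      .takeWhile(_.isBefore(stop))
      .map { start =>
        val next = start.plusMonths(1)
        (start, if (next.isBefore(stop)) next else stop) // clip the last window
      }
      .toList
  }
}
```

Each pair can then drive one filtered write, with the start date inclusive and the end date exclusive so that no day is written twice.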

4 hours ago, mettastar said:

Gurus,

I need some help. I have a large dataset that I read in Spark, and I want to dynamically partition the data into multiple folders based on a date field.

I was able to do that using hiveContext, but it is taking a lot of time.

So instead, I want to read the distinct dates from that date field, store them in a variable, and use a for loop to manually create the folders and load the data into them. That way I don't have to use HiveQL.

Any other suggestions?

You already know the solution, don't you? Why ask how to do it again?
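
For anyone following along, the distinct-dates loop described in the quoted post might look roughly like the sketch below. It is not the poster's code: writePerDate, dateCol and basePath are hypothetical names, the date column is assumed to hold non-null strings, and ORC support is assumed to be available in the Spark build (older versions need Hive support for it).

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.functions.col

object WritePerDate {
  // Writes one folder per distinct value of `dateCol` under `basePath`.
  def writePerDate(df: DataFrame, dateCol: String, basePath: String): Unit = {
    // Only the distinct dates come back to the driver, not the rows themselves.
    val dates: Array[String] =
      df.select(dateCol).distinct().collect().map(_.getString(0))

    for (d <- dates) {
      df.filter(col(dateCol) === d)
        .drop(dateCol)                  // the value is already in the folder name
        .write
        .mode(SaveMode.Overwrite)       // replaces only this date's folder
        .format("orc")
        .save(s"$basePath/$dateCol=$d") // e.g. .../processed_day=2016-01-01
    }
  }
}
```

Note that every iteration rescans the full input, so with many distinct dates this can end up slower than a single partitioned write.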


51 minutes ago, k2s said:

You already know the solution, don't you? Why ask how to do it again?

I'm no good at this programming stuff, uncle, and there aren't many Scala examples to be found either. That's why I'm asking.


Ask @kasi.


Boss, you're talking in some kind of Hadoop language. I can't understand a thing.


Done. One year of partitions takes almost 1.5 hours. Not bad.

I couldn't iterate over the DataFrame directly, so I had to go through an RDD to iterate.

So I store the distinct dates in an RDD, then use a for-each loop over that RDD to read one month of data at a time and dynamically partition it using HiveContext.

If anyone wants the code, let me know and I can paste it here. There may be more optimization possible; I don't know.
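
Since the actual code was never pasted, here is only a rough sketch of what a month-at-a-time partitioned write could look like with the DataFrame writer from the earlier post. The names writeByMonth, windows and path are hypothetical, and it assumes processed_day is a string in yyyy-MM-dd form so that plain string comparison matches date order.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.functions.col

object WriteByMonth {
  // `windows` holds (start, end) date strings; start is inclusive, end is exclusive.
  def writeByMonth(df: DataFrame, windows: Seq[(String, String)], path: String): Unit =
    for ((start, end) <- windows) {
      df.filter(col("processed_day") >= start && col("processed_day") < end)
        .write
        .mode(SaveMode.Append)        // Overwrite would wipe the months already written
        .format("orc")
        .partitionBy("processed_day") // dynamic partitioning within this month only
        .save(path)
    }
}
```

Because the writes use Append, rerunning the loop over the same windows would duplicate rows, so the target folder would need to be cleared before a rerun.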


Working even on a holiday? :o
