Help: Hive query


Posted

I have two tables in Hive, each with billions of records and 200 columns.

I want to compare each column against the corresponding column in the other table, matched on the primary key, and trigger an email containing the mismatched records.

Spark code is also fine.

 

Posted
3 minutes ago, vendettaa said:

I have two tables in Hive, each with billions of records and 200 columns.

I want to compare each column against the corresponding column in the other table, matched on the primary key, and trigger an email containing the mismatched records.

Spark code is also fine.

 

Code not the fine ?

Posted
Just now, chary69 said:

Why to come to other thread post rica why?

because everything causes the cause inbuilt around the inner senses of the spoon

Posted
Just now, alooparata said:

because everything causes the cause inbuilt around the inner senses of the spoon

Why to get the spoon between us ?

Posted
Just now, chary69 said:

Why to get the spoon between us ?

its bhagamathis but bahubali stolen from lokayya

Posted
Just now, alooparata said:

its bhagamathis but bahubali stolen from lokayya

Whose to lokayya for Minky?

Posted
Just now, chary69 said:

Whose to lokayya for Minky?

minkys donkey was stolen by pinkys ponky

Posted
6 minutes ago, chary69 said:

Code not the fine ?

Either one, man.

Once I figure out this piece, I have to create YAMLs and automate it.

Spark or Hive is fine.

Posted
1 minute ago, vendettaa said:

Either one, man.

Once I figure out this piece, I have to create YAMLs and automate it.

Spark or Hive is fine.

I'm not an expert, but I think you can use the DataFrame except API for this.

Put the table 1 data into dataframe 1,

and the table 2 data into dataframe 2.

dataframe1.select(keyColumn).except(dataframe2.select(keyColumn))

You will get the key values from dataframe 1 that are not present in dataframe 2. It may not be a perfect answer, but you can adapt it to your use case.
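To make the idea concrete, here is a small plain-Python sketch of the two checks involved: the set difference on keys (what except gives you) and a column-by-column comparison for keys present in both tables. It only models the logic on tiny in-memory data; the sample rows, column names, and function names are made up for illustration, and it is not something you would run as-is on billions of rows.

```python
from typing import Dict, Hashable, List, Tuple

Row = Tuple[Hashable, ...]


def except_distinct(left: List[Row], right: List[Row]) -> List[Row]:
    """Distinct rows of `left` that do not appear in `right`."""
    right_set = set(right)
    seen = set()
    result = []
    for row in left:
        if row not in right_set and row not in seen:
            seen.add(row)
            result.append(row)
    return result


def column_mismatches(
    left: Dict[Hashable, Dict[str, object]],
    right: Dict[Hashable, Dict[str, object]],
) -> List[Tuple[Hashable, str, object, object]]:
    """For keys in both tables, list (key, column, left_value, right_value) where values differ."""
    out = []
    for key in sorted(left.keys() & right.keys()):
        for col, left_value in left[key].items():
            right_value = right[key].get(col)
            if left_value != right_value:
                out.append((key, col, left_value, right_value))
    return out


# Hypothetical sample data: primary key -> column values.
table1 = {
    1: {"name": "a", "amount": 10},
    2: {"name": "b", "amount": 20},
    3: {"name": "c", "amount": 30},
}
table2 = {
    1: {"name": "a", "amount": 10},
    2: {"name": "b", "amount": 25},
}

# Keys present in table1 but missing from table2.
missing_keys = except_distinct([(k,) for k in table1], [(k,) for k in table2])
# Column-level differences for keys present in both tables.
mismatches = column_mismatches(table1, table2)

print(missing_keys)
print(mismatches)
```

Note that except on the key column alone only finds missing keys; to find rows whose other columns differ, you still need a join on the key and a per-column comparison, which is what the second function models.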

Posted
6 minutes ago, NPReddy said:

I'm not an expert, but I think you can use the DataFrame except API for this.

Put the table 1 data into dataframe 1,

and the table 2 data into dataframe 2.

dataframe1.select(keyColumn).except(dataframe2.select(keyColumn))

You will get the key values from dataframe 1 that are not present in dataframe 2. It may not be a perfect answer, but you can adapt it to your use case.

OK, but how do I do this in Hive?

I'm not sure whether we have commands to invoke Spark from the YAML, but thanks.

Posted
1 minute ago, vendettaa said:

OK, but how do I do this in Hive?

I'm not sure whether we have commands to invoke Spark from the YAML, but thanks.

This isn't Hive. Read and process the Hive table data using Spark: write a Spark application that performs this, or do it directly in the Spark shell. I'm not sure about the email part.
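For the email part, one possible approach is to build the message with Python's standard library once the mismatches have been collected. This is only a sketch under assumptions: the addresses and the SMTP host are placeholders, the function name is made up, and the mismatch tuples are assumed to be (key, column, table 1 value, table 2 value). With billions of rows you would send a capped sample or a summary, not every record.

```python
from email.message import EmailMessage
from typing import List, Tuple

Mismatch = Tuple[object, str, object, object]


def build_mismatch_email(
    mismatches: List[Mismatch],
    sender: str,
    recipient: str,
    max_rows: int = 50,
) -> EmailMessage:
    """Build (but do not send) an email listing at most `max_rows` mismatches."""
    msg = EmailMessage()
    msg["Subject"] = f"Table comparison: {len(mismatches)} mismatched values"
    msg["From"] = sender
    msg["To"] = recipient

    lines = [
        f"key={key} column={col} table1={left!r} table2={right!r}"
        for key, col, left, right in mismatches[:max_rows]
    ]
    if len(mismatches) > max_rows:
        lines.append(f"... and {len(mismatches) - max_rows} more")
    msg.set_content("\n".join(lines) or "No mismatches found.")
    return msg


# Hypothetical addresses and data, for illustration only.
message = build_mismatch_email(
    [(2, "amount", 20, 25)], "etl@example.com", "team@example.com"
)

# Sending is left out on purpose. Assuming a reachable SMTP server, it would be:
#   import smtplib
#   with smtplib.SMTP("smtp.example.com") as server:  # placeholder host
#       server.send_message(message)
```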

Posted
1 minute ago, NPReddy said:

This isn't Hive. Read and process the Hive table data using Spark: write a Spark application that performs this, or do it directly in the Spark shell. I'm not sure about the email part.

OK, I'm asking how to do it without Spark.

Posted
Just now, vendettaa said:

OK, I'm asking how to do it without Spark.

Not sure, buddy. Since you asked for Spark, I just tried to help. How do you write Hive queries? In the shell? I think you could create some functions to perform this, but since you are dealing with billions of records, I don't think that is recommended. Wait for the experts.
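If it has to stay in Hive, one option (a sketch, not a tested solution at this scale) is to generate the comparison query from the column list instead of typing 200 comparisons by hand. The Python helper below builds a HiveQL string that full-outer-joins the two tables on the key and keeps rows where the key is missing on one side or any column differs. It relies on Hive's null-safe equality operator `<=>`; the table names, key name, and column names are placeholders.

```python
from typing import List


def build_compare_query(table1: str, table2: str, key: str, columns: List[str]) -> str:
    """Build a HiveQL query returning rows that are missing on one side or differ in any column."""
    if not columns:
        raise ValueError("at least one column to compare is required")

    # NOT (a.c <=> b.c) is true when the values differ; NULL <=> NULL counts as equal.
    diffs = " OR ".join(f"NOT (a.{c} <=> b.{c})" for c in columns)
    # Show both sides of every compared column so the mismatch is visible.
    select_cols = ",\n  ".join(f"a.{c} AS {c}_t1, b.{c} AS {c}_t2" for c in columns)

    return (
        f"SELECT COALESCE(a.{key}, b.{key}) AS {key},\n"
        f"  {select_cols}\n"
        f"FROM {table1} a\n"
        f"FULL OUTER JOIN {table2} b ON a.{key} = b.{key}\n"
        f"WHERE a.{key} IS NULL OR b.{key} IS NULL OR ({diffs})"
    )


# Hypothetical table and column names.
query = build_compare_query("db.table1", "db.table2", "id", ["name", "amount"])
print(query)
```

The generated text could be saved to a file and run with whatever already executes your Hive queries. A full outer join over billions of rows is expensive, so it is probably worth trying on a single partition or a sample first.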
