
Big data guys, come over here....

14 minutes ago, SharkTank said:

Real-time work needs a minimum of 500 lines of code. For simple practice, around 30 lines is enough. There are plenty of datasets. Go to Kaggle and download a dataset. Install Scala, Spark, and also IntelliJ. Start practicing.

500? Then I'm done for...



Most Popular Posts

  • Sarvapindi

    If you know Java, why all this hassle...

  • Sarvapindi

    Knowing SQL doesn't mean PySpark comes for free... PySpark means Python on Spark... you can use SQL inside PySpark... knowing either Python or Scala, one of the two, is enough...

  • SharkTank

    These days they expect you to set up the DevOps infrastructure too...

3 minutes ago, Sarvapindi said:

500? Then I'm done for...

It's easy, but you have to understand it.

Bhaiya, take big data lightly and look for something else. If coding doesn't come to you, surviving in big data is hard 😐

5 minutes ago, Killer66 said:

Bhaiya, take big data lightly and look for something else. If coding doesn't come to you, surviving in big data is hard 😐

😂


1) Create the folder on HDFS:

hadoop fs -mkdir //////*****

2) Start the Spark shell:

spark-shell --master yarn

3) Import packages:

import org.apache.spark.sql.functions.{expr, col, column, desc}
import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)

4) Filter empty and zero placeholder values out of the data set:

// Note: Spark 2 uses =!= for column inequality (the old !== operator was removed).
val af_temp = af.where((col("maker") =!= "") && (col("model") =!= "") && (col("mileage") =!= 0) && (col("manufacture_year") =!= 0) && (col("engine_displacement") =!= 0) && (col("engine_power") =!= 0) && (col("body_type") =!= "") && (col("color_slug") =!= "") && (col("stk_year") =!= 0) && (col("transmission") =!= "") && (col("door_count") =!= "") && (col("seat_count") =!= "") && (col("fuel_type") =!= "") && (col("date_created") =!= "") && (col("date_last_seen") =!= "") && (col("price_eur") =!= 0))

// Variant without the color_slug filter:
val af_temp = af.where((col("maker") =!= "") && (col("model") =!= "") && (col("mileage") =!= 0) && (col("manufacture_year") =!= 0) && (col("engine_displacement") =!= 0) && (col("engine_power") =!= 0) && (col("body_type") =!= "") && (col("stk_year") =!= 0) && (col("transmission") =!= "") && (col("door_count") =!= "") && (col("seat_count") =!= "") && (col("fuel_type") =!= "") && (col("date_created") =!= "") && (col("date_last_seen") =!= "") && (col("price_eur") =!= 0))

5) Remove null values from the data set:

val cars_nullaf = af_temp.where(col("maker").isNotNull && col("model").isNotNull && col("mileage").isNotNull && col("manufacture_year").isNotNull && col("engine_displacement").isNotNull && col("engine_power").isNotNull && col("stk_year").isNotNull && col("transmission").isNotNull && col("door_count").isNotNull && col("seat_count").isNotNull && col("fuel_type").isNotNull && col("date_created").isNotNull && col("date_last_seen").isNotNull && col("price_eur").isNotNull)

cars_nullaf.select("maker", "model", "engine_power", "transmission", "fuel_type").orderBy(desc("engine_power")).show()

cars_nullaf.createOrReplaceTempView("cars")

val sqlDF = spark.sql("SELECT * FROM cars")

// Counts rows, not columns, despite the name.
val Total_number_columns = spark.sql("SELECT COUNT(*) FROM cars")
Total_number_columns.show()

val Number_of_models_by_manufactuer = spark.sql("SELECT maker, model, COUNT(model) AS top_car_models_sold FROM cars GROUP BY maker, model")
Number_of_models_by_manufactuer.show()

val Type_of_Tranmissions_sold = spark.sql("SELECT transmission, COUNT(transmission) FROM cars GROUP BY transmission")
Type_of_Tranmissions_sold.show()

val Type_of_Car_sold = spark.sql("SELECT maker, transmission, COUNT(*) FROM cars GROUP BY transmission, maker")
Type_of_Car_sold.show()

val AVG_Price_car_by_model = spark.sql("SELECT maker, model, AVG(price_eur) FROM cars GROUP BY maker, model")
AVG_Price_car_by_model.show()

The above is the code I have used.
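The long && chains above can also be collapsed. A minimal sketch, assuming the same af DataFrame and Spark 2.x; the split into string vs. numeric columns below is inferred from the filters in the post, so adjust it to the real schema:

```scala
import org.apache.spark.sql.functions.col

// Columns filtered in the post, grouped by the placeholder value they use.
val stringCols  = Seq("maker", "model", "body_type", "transmission",
                      "door_count", "seat_count", "fuel_type",
                      "date_created", "date_last_seen")
val numericCols = Seq("mileage", "manufacture_year", "engine_displacement",
                      "engine_power", "stk_year", "price_eur")

// Build one predicate instead of a giant hand-written && chain.
val notPlaceholder = (stringCols.map(c => col(c) =!= "") ++
                      numericCols.map(c => col(c) =!= 0)).reduce(_ && _)

// na.drop removes rows with nulls in the listed columns in one call.
val cleaned = af.na.drop(stringCols ++ numericCols).where(notPlaceholder)
```

This replaces steps 4 and 5 in one pass, and adding or removing a column means editing a list instead of a long expression.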

22 minutes ago, Sarvapindi said:

500? Then I'm done for...

If you have interest, anything becomes easy, bro... the start always feels hard; when you get into it and put in some hard work, you'll start enjoying it. Hard work is a must. Practice, practice coding. First practice Spark, bro, it's simple, then slowly move to application building; OOP concepts are a must... and you keep learning more and more from there.

Edited by SharkTank

11 minutes ago, kevinUsa said:


[quotes the full code post above]
Try to implement everything using DataFrames in a single shot. The more you limit Spark SQL usage, the better, because views occupy space. Use Spark 2. What's the problem here now? Post the error. As I said, run

spark.conf.set("spark.sql.shuffle.partitions", 2000)

in your Spark Scala console. That should fix it.


 
spark.conf.set("spark.sql.shuffle.partitions",100)
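A quick note on what that setting does, as a hedged sketch for Spark 2.x: spark.sql.shuffle.partitions sets how many partitions Spark SQL uses after a shuffle (joins, GROUP BY); the default is 200. Raising it spreads a huge shuffle over more, smaller tasks; lowering it avoids thousands of near-empty tasks on a small dataset.

```scala
// Sketch (Spark 2.x spark-shell): inspect, then tune, shuffle parallelism.
// 2000 and 100 are the values suggested in this thread, not universal answers.
spark.conf.get("spark.sql.shuffle.partitions")        // default is "200"
spark.conf.set("spark.sql.shuffle.partitions", 2000)  // big cluster-wide shuffles
spark.conf.set("spark.sql.shuffle.partitions", 100)   // small or local experiments
```

The setting only affects shuffles triggered after it is set, so run it before the aggregation, not after.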
19 minutes ago, Sarvapindi said:

500? Then I'm done for...

Why are you so scared, man? Learn Python basics and try for an automation-testing job with Python.

@Sarvapindi bro 

https://sparkbyexamples.com/category/spark/

This site would help you; the Spark examples are good. Knowing only Spark won't be enough... there is more; some projects don't use Spark at all, we use only the Hadoop stack and Oozie workflows for scheduling jobs.

Good luck. Bye 
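For anyone curious what the "Oozie workflows for scheduling jobs" part looks like, here is a hedged sketch of a minimal Oozie workflow running a Spark action; the app name, class, and jar path are hypothetical placeholders, not anything from this thread:

```xml
<!-- Minimal Oozie workflow sketch; daily-etl, com.example.Etl, and the
     jar path below are made-up placeholders. -->
<workflow-app name="daily-etl" xmlns="uri:oozie:workflow:0.5">
  <start to="spark-etl"/>
  <action name="spark-etl">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <master>yarn-cluster</master>
      <name>daily-etl</name>
      <class>com.example.Etl</class>
      <jar>${nameNode}/apps/etl/etl.jar</jar>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <fail name="fail">
    <message>Spark ETL failed</message>
  </fail>
  <end name="end"/>
</workflow-app>
```

A coordinator definition on top of this is what actually gives the daily or hourly schedule.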

2 minutes ago, SharkTank said:

Try to implement everything using DataFrames in a single shot. The more you limit Spark SQL usage, the better, because views occupy space. Use Spark 2. What's the problem here now? Post the error. As I said, run

spark.conf.set("spark.sql.shuffle.partitions", 2000)

in your Spark Scala console. That should fix it.



 

spark.conf.set("spark.sql.shuffle.partitions",100)

I'm still learning that, bro.

If I have any doubts, I'll PM you; if you have any good material, PM me.

Or post it here.

BTW, is my approach correct?

I used GCP for this.

 

2 hours ago, Sarvapindi said:

What are big data real-time scenarios actually like? Tell us a bit.

They're... real.

3 minutes ago, soodhilodaaram said:

They're... real.

What an answer!!!!

27 minutes ago, Killer66 said:

Bhaiya, take big data lightly and look for something else. If coding doesn't come to you, surviving in big data is hard 😐

In the end, that's how it'll turn out.

38 minutes ago, Killer66 said:

Bhaiya, take big data lightly and look for something else. If coding doesn't come to you, surviving in big data is hard 😐

The pay isn't great either... the way things are now, they're offering about the same as for reporting tools, damn it... if that's the case, it's a waste... if we're really strong at it we can demand more; otherwise it's just a waste.
