SharkTank · Posted July 18, 2020
> Sarvapindi (28 minutes earlier): "Googling it won't tell you what real-time work is actually like, man... is real-time work fully advanced, or can you get by with just the basics? That's my doubt."
For real-time scenarios, do you have any exposure to Informatica or DataStage?
Sarvapindi (Author) · Posted July 18, 2020
> trent (just now): "Solid planning, uncle."
If it were up to you, people would get jobs just by memorizing definitions, bro... these guys keep churning out new technology. Ten years with no new tech would be nice. Instead there's a new one every year.
Sarvapindi (Author) · Posted July 18, 2020
> SharkTank (2 minutes earlier): "For real-time scenarios, do you have any exposure to Informatica or DataStage?"
Yes, I do.
SharkTank · Posted July 18, 2020
> Sarvapindi (13 minutes earlier): "Yes, I do."
You develop ETL applications like Informatica or DataStage, either in Python or Scala, and deploy them on a Spark cluster for processing. The data will be live streams, unstructured, or semi-structured; you build ETL data pipelines that process huge volumes of any type of data from several sources. Usually the source is an RDBMS, a live stream, or flat files; the processing runs on Spark; and the target is some NoSQL database. Add it all up: programming (one person says Scala, another says Python) + the Hadoop stack + Spark + a reporting tool + shell scripting + test automation, and everything else besides. Not discouraging you, but I'm fed up.
SharkTank · Posted July 18, 2020
These days they expect you to set up the DevOps infrastructure too.
kevinUsa · Posted July 18, 2020
> Sarvapindi (30 minutes earlier): "Knowing SQL doesn't mean you know PySpark... PySpark is Python on Spark... you can use SQL within PySpark... knowing either Python or Scala is enough."
You can use SQL in Scala too, by the way: you create a SQL temp view, run the query, and print the result.
kevinUsa · Posted July 18, 2020
> SharkTank (9 minutes earlier), in the ETL rundown above.
Bro, do you even do cleansing in Scala? If so, how? Can you let me know, please? Last week I tried it on a dataset with 4 million rows; it came down to 1,650 rows.
SharkTank · Posted July 18, 2020
> kevinUsa (8 minutes earlier), asking how cleansing is done in Scala.
We use Spark + Scala and do transformation, cleansing, and analytics, all of it. If a 4-million-row dataset came down to 1,650 rows, your records are being dropped. It depends on the shuffle partitions you set; we usually use 2000. If that isn't set, try setting it and rerun your query.
SharkTank · Posted July 18, 2020
> kevinUsa (20 minutes earlier): "You can use SQL in Scala too... you create a SQL temp view, run the query, and print it."
Spark SQL isn't what I'd recommend leaning on, since temp tables occupy space. SQL is maybe 10% of the work; knowing SQL alone won't be enough.
SharkTank · Posted July 18, 2020
> Sarvapindi (39 minutes earlier): "Yes, I do."
Bro, posting here won't help much; you'll get suggestions from people with quarter-baked knowledge. You need to build an application yourself, and then you'll get the picture. It won't come from someone telling you.
Sarvapindi (Author) · Posted July 18, 2020
> SharkTank (3 minutes earlier), on building an application yourself.
Then tell me how to practice. We can't get hold of real-time data, right?
Sarvapindi (Author) · Posted July 18, 2020
> SharkTank (34 minutes earlier), in the ETL rundown above.
That's exactly my question, bro... do I have to write really long code in Scala or Python, or are a few lines enough? Is it like the examples in the online documentation, or a lot more than that?
SharkTank · Posted July 18, 2020
> Sarvapindi (10 minutes earlier): "...do I have to write really long code, or are a few lines enough?"
In real time it varies; figure a minimum of 500 lines of code. For simple practice, 30 lines is plenty. There are lots of datasets out there: go to Kaggle and download one, install Scala, Spark, and IntelliJ, and start practicing.
kevinUsa · Posted July 18, 2020
> SharkTank (31 minutes earlier), on shuffle partitions and dropped records.
I will post the code for what I have done.
kevinUsa · Posted July 18, 2020
@dasari4kntr garu, please add a bit about Scala and these topics too.