
What Is Big Data and What Is Hadoop?



Posted

It is a good question, and a lot of people are confused about it.

 

Big data refers to the problem companies face when they cannot handle the data pouring in from many sources (web logs, sensor data, etc.), and it affects small and large companies alike. Over the next 3 to 5 years, data volumes are expected to grow 50 to 70% more than over the previous 5 years, so companies that cannot analyze this data will miss important information. Hadoop is used to handle this kind of data.

 

Hadoop is not a single tool; it is an open-source ecosystem used to handle this big data and process it in a distributed fashion.

 

If you want to know why Oracle, Teradata, etc. cannot handle this kind of data, search for "RDBMS vs. Hadoop" on Google.

 

Batch processing: MapReduce, Pig, Apache Spark

Real-time processing: Storm, Spark Streaming

SQL interface: Hive, Impala, Spark SQL, Apache Drill

Sqoop: moves data from an RDBMS to HDFS and vice versa

Oozie: schedules jobs

NoSQL: HBase, MongoDB, Cassandra, etc.

Graph database: Neo4j

Dashboards to visualize the data: Tableau
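The batch-processing model at the top of that list (MapReduce) can be sketched in a few lines of plain Python. This is a hypothetical in-memory stand-in for illustration only; a real job would run distributed across a Hadoop or Spark cluster, not in one process.

```python
# Minimal sketch of the MapReduce programming model in plain Python.
from collections import defaultdict
from itertools import chain

def map_phase(line):
    # Mapper: emit a (word, 1) pair for every word in the line.
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs):
    # Reducer: group pairs by key and sum the counts.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["the quick brown fox", "the lazy dog"]
pairs = chain.from_iterable(map_phase(line) for line in lines)
word_counts = reduce_phase(pairs)
print(word_counts["the"])  # 2
```

On a real cluster the map outputs are shuffled across the network so that all pairs with the same key land on the same reducer; here the grouping simply happens in one dictionary.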

 

Important technologies to learn: Core Java, Python, Scala, R.

Hadoop is completely different from Teradata:

Hadoop = unstructured data; TD = structured data.

There is no direct comparison between TD and Hadoop, as they serve different purposes. Teradata can handle large amounts of data; for example, eBay/PayPal hold around 1.2 petabytes in Teradata, which is proof that TD can handle data at scale.


Posted



Then why are organizations preferring Hadoop over Teradata, man?
Posted

Then why are organizations preferring Hadoop over Teradata, man?

If you understand the difference between structured and unstructured data, then you have the answer to your question.
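That structured-vs-unstructured distinction can be made concrete with a small, hypothetical Python sketch: a delimited row from an RDBMS export maps straight onto a fixed schema, while a raw web-log line has to be parsed before it can be queried. The log format and field names here are made up for illustration.

```python
# Structured row: fits a fixed schema, splits cleanly into columns.
# Web-log line: semi/unstructured, needs custom parsing first.
import re

structured_row = "101|2014-05-01|499.99"  # hypothetical order record
log_line = '66.249.73.135 - - [01/May/2014] "GET /index.html HTTP/1.1" 200'

order_id, order_date, amount = structured_row.split("|")

# Extract fields from the log with a regex before any SQL-style query.
m = re.search(r'^(\S+) .*"(\w+) (\S+).*" (\d{3})', log_line)
ip, method, path, status = m.groups()
print(ip, method, path, status)
```

An RDBMS is built around the first case; Hadoop-side tools let you run the second kind of parsing at scale and only impose a schema at read time.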

Posted

Batch processing: MapReduce, Pig, Apache Spark
Real-time processing: Storm, Spark Streaming
SQL interface: Hive, Impala, Spark SQL, Apache Drill
Sqoop: moves data from an RDBMS to HDFS and vice versa
Oozie: schedules jobs
NoSQL: HBase, MongoDB, Cassandra, etc.
Graph database: Neo4j
Dashboards to visualize the data: Tableau


Among these, which one is related to ETL?
Posted

Among these, which one is related to ETL?

Any database will require loading data, bro, so ETL is a completely different topic.

Teradata has its own built-in load utilities like FastLoad, MultiLoad, and BTEQ, or you can use regular ETL tools like Informatica as well, but with Teradata we mostly use ELT rather than ETL.

So when it comes to ETL, it is a completely different topic.
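As a rough illustration of the ELT pattern mentioned above (load the raw data first, then transform it inside the database with SQL), here is a sketch using Python's built-in sqlite3 as a stand-in for Teradata; the table and column names are invented for the example.

```python
# ELT sketch: load raw rows as-is, then transform with SQL in the database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_sales (region TEXT, amount TEXT)")  # raw staging

# L: load the data untouched, no cleansing on the way in.
conn.executemany("INSERT INTO stg_sales VALUES (?, ?)",
                 [("east", "10.5"), ("east", "4.5"), ("west", "7.0")])

# T: transform inside the database (cast text to numbers, aggregate).
rows = conn.execute(
    "SELECT region, SUM(CAST(amount AS REAL)) FROM stg_sales "
    "GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 15.0), ('west', 7.0)]
```

The contrast with classic ETL is where the transform runs: an ETL tool reshapes the data before loading, while ELT pushes that work down into the database engine itself.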

Posted

If you understand the difference between structured and unstructured data, then you have the answer to your question.

 

So you're saying Hadoop/big data won't replace Teradata and other big databases in the market after a few years? How many more years of life do you think Teradata has?

Posted


 

In Hadoop too, you can import structured data from various RDBMS systems using Sqoop, but that data will be stored in a semi-structured format (separated by delimiters). Of course it is easier to handle structured data in an RDBMS than in Hadoop, but Hadoop has these advantages:

Elastic: if a batch job has more data than usual for a particular month, we can easily add a few nodes to speed up the processing.

Fault tolerant: even if a couple of nodes fail, the entire job won't fail; RDBMSs rarely offer this.

Cheap: Teradata, Netezza, etc. nodes need expensive hardware, whereas commodity hardware is enough for Hadoop nodes.

TD and Netezza can handle large amounts of data, but Hadoop can process it faster for certain use cases.
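A back-of-envelope sketch of why replication gives that fault tolerance: HDFS keeps multiple copies of each block (three by default), so a block is lost only if every node holding a replica fails. Assuming independent failures and a made-up per-node failure probability, the loss probability falls off quickly with each extra replica.

```python
# Rough sketch: a block is lost only if all replicas' nodes fail.
# Assumes independent node failures, which is only an approximation.
def block_loss_probability(node_failure_prob, replication=3):
    return node_failure_prob ** replication

p = 0.01  # hypothetical chance a given node is down
print(block_loss_probability(p))                  # ~1e-06 with 3 replicas
print(block_loss_probability(p, replication=2))   # ~1e-04 with only 2
```

Real failures are not fully independent (a rack switch can take out several nodes at once), which is why HDFS also spreads replicas across racks.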

Posted

So you're saying Hadoop/big data won't replace Teradata and other big databases in the market after a few years? How many more years of life do you think Teradata has?

 

RDBMSs are not going to be replaced anytime soon; they are good at what they do.

RDBMS: good for OLTP, and for crunching the latest and aggregated data.

Hadoop: good as an active archive, i.e. old/all data at any granular level.

Posted


"Fault tolerant: even if a couple of nodes fail, the entire job won't fail; RDBMSs rarely offer this."

Teradata does support fault tolerance: there is no way all the nodes fail at the same time, and even if some do, TD always has backup nodes that become active once the others fail. I am not sure about other RDBMSs.

"Cheap: Teradata, Netezza, etc." — I agree on this, as Hadoop is open source and very cheap compared to Teradata. That said, clients won't go with Hadoop all the time; some clients don't care about money, they just need the performance.

Posted

If that guy comes back, he can just go into BE or BPM anyway.

BPM doesn't need Java at all.

Posted


 

 

Agreed on the replacement factor: RDBMSs are not going to be replaced anytime soon, as they are good at what they do. But for some otherwise unsolvable loads, Hadoop is a good answer.
