namesake Posted March 9, 2015

Quote:
It is a good question, and a lot of people get confused about it. Big data is the problem companies face when they are not able to handle the data coming from many sources (web logs, sensor data, etc.), and it affects small and big companies alike. Within the next 3 to 5 years, data will increase by something like 50 to 70% compared to the previous 5 years, so companies will miss important information if they are not able to analyze this data. Hadoop is used to handle this type of data.
Hadoop is not a single tool; it is an open source ecosystem that is used to handle big data and process it in a distributed way. If you want to know why Oracle, Teradata, etc. are not able to handle this type of data, search Google for "RDBMS vs Hadoop".
Batch processing: MapReduce, Pig, Apache Spark
Real-time processing: Storm or Spark Streaming
SQL interface: Hive, Impala, Spark SQL, Apache Drill
Sqoop: used to move data from an RDBMS to HDFS or vice versa
Oozie: scheduling jobs
NoSQL: HBase, MongoDB, Cassandra, etc.
Graph database: Neo4j
Dashboards to visualize this data: Tableau
Technologies that are important to learn: Core Java, Python, Scala, R

Hadoop is completely different from Teradata: Hadoop = unstructured data and TD = structured data. There is no comparison between TD and Hadoop, as they are completely different. Teradata handles large amounts of data; for example, the eBay/PayPal data in Teradata is around 1.2 petabytes, and I guess this is proof that TD can handle large amounts of data.
vasu123 (Author) Posted March 9, 2015

Quote:
Hadoop is completely different from Teradata: Hadoop = unstructured data and TD = structured data. There is no comparison between TD and Hadoop, as they are completely different. Teradata handles large amounts of data; for example, the eBay/PayPal data in Teradata is around 1.2 petabytes, and I guess this is proof that TD can handle large amounts of data.

Then why are organizations preferring Hadoop over Teradata, man?
namesake Posted March 9, 2015

Quote:
Then why are organizations preferring Hadoop over Teradata, man?

If you understand the difference between structured and unstructured data, then you have the answer to your question.
kumar654 Posted March 9, 2015

Quote:
It is a good question, and a lot of people get confused about it. Big data is the problem companies face when they are not able to handle the data coming from many sources (web logs, sensor data, etc.), and it affects small and big companies alike. Within the next 3 to 5 years, data will increase by something like 50 to 70% compared to the previous 5 years, so companies will miss important information if they are not able to analyze this data. Hadoop is used to handle this type of data.
Hadoop is not a single tool; it is an open source ecosystem that is used to handle big data and process it in a distributed way. If you want to know why Oracle, Teradata, etc. are not able to handle this type of data, search Google for "RDBMS vs Hadoop".
Batch processing: MapReduce, Pig, Apache Spark
Real-time processing: Storm or Spark Streaming
SQL interface: Hive, Impala, Spark SQL, Apache Drill
Sqoop: used to move data from an RDBMS to HDFS or vice versa
Oozie: scheduling jobs
NoSQL: HBase, MongoDB, Cassandra, etc.
Graph database: Neo4j
Dashboards to visualize this data: Tableau
Technologies that are important to learn: Core Java, Python, Scala, R

Which of these is related to ETL?
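For readers new to the "Batch processing: MapReduce" entry in the list quoted above, here is a minimal single-process sketch of the MapReduce idea in plain Python. It is not Hadoop's API; the sample log lines and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are made up for illustration, and a real cluster would run the map and reduce steps on many nodes in parallel.

```python
from collections import defaultdict

# Hypothetical web-log lines in the form "<client-ip> <HTTP status>".
LOG_LINES = [
    "10.0.0.1 200",
    "10.0.0.2 404",
    "10.0.0.1 200",
    "10.0.0.3 500",
    "10.0.0.2 200",
]


def map_phase(line):
    """Emit one (key, value) pair per record: (status, 1)."""
    _ip, status = line.split()
    yield status, 1


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_job(LOG_LINES))
```

The point of the split is that each `map_phase` call only looks at one record and each `reduce_phase` call only looks at one key, which is what lets a framework spread the work across nodes.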
namesake Posted March 9, 2015

Quote:
Which of these is related to ETL?

Any database requires loading data, bro, so ETL is a completely different topic. Teradata has built-in load utilities like FastLoad (FLoad), MultiLoad (MLoad), BTEQ, etc., or you can use regular ETL tools like Informatica as well. But with Teradata we mostly use ELT rather than ETL, so ETL really is a separate topic.
vasu123 (Author) Posted March 9, 2015

Quote:
If you understand the difference between structured and unstructured data, then you have the answer to your question.

So are you saying that Hadoop or big data will not replace Teradata and the other big databases in the market after a few years? How many more years of life do you think Teradata has?
compose Posted March 9, 2015

Quote:
Hadoop is completely different from Teradata: Hadoop = unstructured data and TD = structured data. There is no comparison between TD and Hadoop, as they are completely different. Teradata handles large amounts of data; for example, the eBay/PayPal data in Teradata is around 1.2 petabytes, and I guess this is proof that TD can handle large amounts of data.

In Hadoop too, you can import structured data from various RDBMS systems using Sqoop, but that data will be stored in a semi-structured format (fields separated by delimiters). Of course it is easier to handle structured data in an RDBMS than in Hadoop, but the advantages of Hadoop are:
Elastic: Whenever batch processing has more data for a particular month, we can easily add a few nodes to speed up the process.
Fault Tolerant: Even if a couple of nodes fail, the entire job won't fail. An RDBMS does not have much of this facility.
Cheap: Teradata, Netezza, etc. nodes need expensive hardware, but for Hadoop nodes commodity hardware is enough.
TD and Netezza can handle large amounts of data, but Hadoop can process it faster for certain use cases.
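To make the "stored in a semi-structured format (separated by delimiters)" point above concrete, here is a small Python sketch of reading such a file back with a schema supplied by the reader rather than by the storage layer. The sample rows, the column names, the comma delimiter, and the helper `read_rows` are all assumptions for illustration; this is not Sqoop or Hive code.

```python
import csv
import io

# Hypothetical rows as they might sit in a delimited text file after an import.
RAW = "1,alice,2015-03-09\n2,bob,2015-03-10\n"

# The schema lives with the reader, not in the file itself.
SCHEMA = [("id", int), ("name", str), ("signup_date", str)]


def read_rows(text, schema, delimiter=","):
    """Split each line on the delimiter and apply the column names and types."""
    for fields in csv.reader(io.StringIO(text), delimiter=delimiter):
        yield {name: cast(value) for (name, cast), value in zip(schema, fields)}


rows = list(read_rows(RAW, SCHEMA))

if __name__ == "__main__":
    for row in rows:
        print(row)
```

This is why structured data is easier to handle in an RDBMS: there the table definition enforces names and types at load time, while here every reader has to know and apply them itself.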
compose Posted March 9, 2015

Quote:
So are you saying that Hadoop or big data will not replace Teradata and the other big databases in the market after a few years? How many more years of life do you think Teradata has?

There is no chance of RDBMS being replaced anytime soon. They are good at what they do.
RDBMS: Good for OLTP. Good for crunching the latest and aggregated data.
Hadoop: Good for an active archive, basically old data or all data at any level of granularity.
namesake Posted March 9, 2015

Quote:
In Hadoop too, you can import structured data from various RDBMS systems using Sqoop, but that data will be stored in a semi-structured format (fields separated by delimiters). Of course it is easier to handle structured data in an RDBMS than in Hadoop, but the advantages of Hadoop are:
Elastic: Whenever batch processing has more data for a particular month, we can easily add a few nodes to speed up the process.
Fault Tolerant: Even if a couple of nodes fail, the entire job won't fail. An RDBMS does not have much of this facility.
Cheap: Teradata, Netezza, etc. nodes need expensive hardware, but for Hadoop nodes commodity hardware is enough.
TD and Netezza can handle large amounts of data, but Hadoop can process it faster for certain use cases.

On "Fault Tolerant: Even if a couple of nodes fail, the entire job won't fail. An RDBMS does not have much of this facility": Teradata can support fault tolerance. There is no way that all the nodes fail at the same time, and even if they do, TD always has backup nodes that become active once the other nodes fail. I am not sure about other RDBMSs.

On "Cheap: Teradata, Netezza, etc.": I agree on this, as Hadoop is open source and very cheap compared to Teradata. But that does not mean clients will go with Hadoop all the time. Some clients don't care about money; they just need the performance.
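The fault-tolerance point being argued above ("even if a couple of nodes fail, the entire job won't fail") comes down to re-running a failed piece of work somewhere else. Below is a toy Python sketch of that idea only. It is not how Hadoop's scheduler or Teradata's backup nodes are actually implemented; the node names, `FAILED_NODES`, `count_records`, and `run_with_retries` are hypothetical.

```python
# Hypothetical: pretend this node is down.
FAILED_NODES = {"node-1"}


def count_records(node):
    """Stand-in for one unit of work that runs on a given node."""
    if node in FAILED_NODES:
        raise RuntimeError(f"{node} is unavailable")
    return 42  # placeholder result, not real data


def run_with_retries(task, nodes, max_attempts=3):
    """Try the task on successive nodes until one attempt succeeds."""
    errors = []
    for attempt, node in enumerate(nodes[:max_attempts], start=1):
        try:
            return task(node), attempt
        except RuntimeError as exc:
            errors.append(str(exc))
    raise RuntimeError(f"task failed on all attempts: {errors}")


result, attempts = run_with_retries(count_records, ["node-1", "node-2", "node-3"])

if __name__ == "__main__":
    print(f"result={result} after {attempts} attempt(s)")
```

Only the failed task is repeated, so the rest of the job keeps its progress. Whether a given product does this per task, per query, or by failing over to standby hardware is a design choice that differs between systems.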
kedharinath Posted March 9, 2015

Quote:
If I knew that, I would have happily gone to BE or BPM, right?

BPM does not need Java. It is not needed at all.
namesake Posted March 9, 2015

Quote:
So are you saying that Hadoop or big data will not replace Teradata and the other big databases in the market after a few years? How many more years of life do you think Teradata has?

No, you have completely misunderstood Hadoop. Hadoop is not a data warehouse, whereas TD is a DW. Please see this article, hope it helps: http://assets.teradata.com/resourceCenter/downloads/WhitePapers/EB-6448.pdf?processed=1
compose Posted March 9, 2015

Quote:
On "Fault Tolerant: Even if a couple of nodes fail, the entire job won't fail. An RDBMS does not have much of this facility": Teradata can support fault tolerance. There is no way that all the nodes fail at the same time, and even if they do, TD always has backup nodes that become active once the other nodes fail. I am not sure about other RDBMSs.
On "Cheap: Teradata, Netezza, etc.": I agree on this, as Hadoop is open source and very cheap compared to Teradata. But that does not mean clients will go with Hadoop all the time. Some clients don't care about money; they just need the performance.

Agreed on the replacement factor. There is no chance of RDBMS being replaced anytime soon; they are good at what they do. But for some loads that cannot be solved otherwise, Hadoop is a good answer.
namesake Posted March 9, 2015

Quote:
BPM does not need Java. It is not needed at all.