Hello,
Data means meaningful information, and it is generated from many sources. It can be structured or unstructured. Structured data, such as bank records, is stored in rows and columns in a defined (table) format. Unstructured data includes things like HTTP logs and log4j logs; when such data arrives continuously it is also called streaming data.
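To make the difference concrete, here is a minimal Python sketch that turns one raw HTTP log line into a row with named columns. The sample line, the regular expression and the function name parse_log_line are my own assumptions for illustration (a made-up line in the common access-log style), not something taken from a particular server.

```python
import re

# Hypothetical access-log line, made up for illustration.
LOG_LINE = '192.0.2.10 - - [10/Oct/2019:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'

# Assumed layout: ip, two ignored fields, [timestamp], "request", status, size.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)

def parse_log_line(line):
    """Turn one raw log line into a dict (a row with named columns), or None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # the line does not follow the assumed layout
    row = match.groupdict()
    row["status"] = int(row["status"])
    row["size"] = int(row["size"])
    return row

if __name__ == "__main__":
    print(parse_log_line(LOG_LINE))
```

Once every line is parsed like this, the result can be loaded into a table; until then it is just text that keeps arriving.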
Structured data is mainly managed the traditional way, in an RDBMS such as Oracle, Microsoft SQL Server, Sybase or DB2. It can also be managed in NoSQL databases such as MongoDB, Cassandra and HBase (the so-called next-generation databases).
For managing streaming or unstructured data we have Apache Spark, which is very useful.
Data accumulates every day, not just from banking, IT companies and social networking sites but also from manufacturing companies. With AI and IoT (Internet of Things), it is generated by tiny sensors as well. So data keeps growing: the expectation is that by 2020 there will be 44 ZB of data (44 trillion gigabytes), with the total doubling in size every 2 years.
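The arithmetic behind those figures can be checked with a small Python sketch. It assumes decimal (SI) units, so 1 GB = 10^9 bytes and 1 ZB = 10^21 bytes, and it treats "doubling every 2 years" as a simple exponential; the function names are hypothetical and the projection is only an illustration, not a forecast.

```python
# Assumption: decimal (SI) units.
BYTES_PER_GB = 10**9
BYTES_PER_ZB = 10**21

def zettabytes_to_gigabytes(zb):
    """Convert a whole number of zettabytes to gigabytes."""
    return zb * BYTES_PER_ZB // BYTES_PER_GB

def projected_volume(start_zb, years, doubling_period_years=2):
    """Volume in ZB after `years`, if it doubles every `doubling_period_years`."""
    return start_zb * 2 ** (years / doubling_period_years)

if __name__ == "__main__":
    print(zettabytes_to_gigabytes(44))  # 44 followed by 12 zeros, i.e. 44 trillion GB
    print(projected_volume(44, 2))      # one doubling period later
    print(projected_volume(44, 4))      # two doubling periods later
```

So 44 ZB really is 44 trillion GB, and at that growth rate the figure would be 88 ZB two years later and 176 ZB after four.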
So we need a lot of storage, along with technologies to process this mammoth amount of data. For that we have open source Big Data technologies such as Apache Hadoop.
So Hadoop, or any other Big Data technology, is the next boom in the market.
The Hadoop ecosystem has various solutions for different requirements:
HBase - The Hadoop database: a distributed, scalable big data store. Use it when you need random, real-time read/write access.
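Hive - The Hadoop data warehouse, for processing large data sets with a SQL-like query language (HiveQL). Developed by Facebook. Hive itself is not a NoSQL or columnar database like Cassandra; Derby is only the default embedded database that Hive uses for its metastore (table metadata).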
Spark - Processing of streaming data and real-time analysis, with built-in modules for streaming, SQL, machine learning and graph processing.
Kafka - Distributed messaging solution.
Pig - A procedural (data flow) language for processing semi-structured data sets using Hadoop MapReduce. Developed by Yahoo.
Sqoop - Data loading tool for moving data from an RDBMS into Hadoop (SQL to Hadoop).
Flume - Collects streaming data, e.g. HTTP logs and click data (unstructured data).
Oozie - Workflow and job scheduler.
ZooKeeper - Coordination service.
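Several of the tools above (Pig, for example) run their work as Hadoop MapReduce jobs, so it helps to see the idea in miniature. The following is only a single-process Python sketch of the map, shuffle and reduce steps for a word count; it does not use any Hadoop API, and all function names are hypothetical. On a real cluster the same steps run in parallel across many machines.

```python
from collections import defaultdict

def map_phase(lines):
    """Mapper: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reducer: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))

if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

Tools like Pig and Hive let you describe this kind of job at a higher level instead of writing the map and reduce steps by hand.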
There are many Hadoop flavors (distributions) on the market that you can explore further.