Once our big data is loaded into the Hadoop environment, we need to think about how to filter or shape it before processing it with MapReduce. Then the big question comes to mind: what do Pig and Hive do? Writing MapReduce code in plain Java can require many lines of code, which costs productive time.
Instead of writing plain Java code for MapReduce, we can use either Pig Latin or Hive's SQL-like language (HiveQL) to construct MapReduce programs, which eases our data processing.
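To give a feel for what the hand-written approach involves, here is a minimal in-memory sketch of the three MapReduce phases for a word count. It is plain Python rather than Hadoop Java code, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration; a real job would also need driver, input/output, and configuration code on top of this.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["pig and hive", "hive on hadoop"]))
```

Even this toy version needs three separate functions for one simple aggregation, which is the kind of overhead Pig and Hive are meant to hide.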
Hive is commonly used at Facebook for processing its data, while Yahoo used Pig Latin scripting.
Hive is a SQL interface, which lets SQL-savvy users, as well as tools such as Tableau, MicroStrategy, or any other tool or language that has a SQL interface, query the data.
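The declarative style can be sketched as follows. This is only a stand-in: it uses Python's built-in sqlite3 module instead of Hive, and the clicks table, its columns, and the clicks_per_country function are assumptions made up for the example. The point is that the whole job is one SQL statement describing the result, not the steps to compute it.

```python
import sqlite3


def clicks_per_country(rows):
    """Load rows into an in-memory table and aggregate them with one SQL query."""
    conn = sqlite3.connect(":memory:")  # sqlite3 stands in for Hive here
    conn.execute("CREATE TABLE clicks (user_id TEXT, country TEXT, clicks INTEGER)")
    conn.executemany("INSERT INTO clicks VALUES (?, ?, ?)", rows)
    # One declarative statement: filter, group, and sum in a single query.
    result = conn.execute(
        "SELECT country, SUM(clicks) AS total "
        "FROM clicks WHERE clicks > 0 "
        "GROUP BY country ORDER BY country"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [("u1", "IN", 3), ("u2", "US", 5), ("u3", "IN", 0), ("u4", "IN", 2)]
    print(clicks_per_country(sample))
```

A query of this general shape is what a BI tool would send over its SQL connection, which is why such tools can sit on top of Hive without knowing anything about MapReduce.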
Pig is more like an ETL pipeline, with step-by-step commands such as declaring variables, looping, iterating, conditional statements, etc.
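The step-by-step style can be sketched like this. It is plain Python, not Pig Latin, and the data layout (user_id, country, clicks) and function name are assumptions for illustration. Each intermediate result gets its own name, in the way a Pig script names a relation after each LOAD, FILTER, GROUP, and FOREACH step.

```python
from itertools import groupby
from operator import itemgetter


def clicks_per_country_dataflow(rows):
    """Compute total clicks per country as a chain of named steps."""
    # Step 1 (similar in spirit to LOAD): materialise the input records.
    records = list(rows)
    # Step 2 (similar to FILTER): keep only rows with a positive click count.
    active = [record for record in records if record[2] > 0]
    # Step 3 (similar to GROUP): sort by country, then group on that column.
    by_country = groupby(sorted(active, key=itemgetter(1)), key=itemgetter(1))
    # Step 4 (similar to FOREACH ... GENERATE): emit one total per group.
    totals = [(country, sum(r[2] for r in group)) for country, group in by_country]
    return totals


if __name__ == "__main__":
    sample = [("u1", "IN", 3), ("u2", "US", 5), ("u3", "IN", 0), ("u4", "IN", 2)]
    print(clicks_per_country_dataflow(sample))
```

Because every step has a name, it is easy to inspect or reuse an intermediate result, which is what makes this style a natural fit for ETL work.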
Table partitioning can be done natively in Hive, but Pig has no built-in notion of partitioned tables.
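As a rough sketch of what partitioning buys you: Hive stores each partition in its own directory named key=value, so a query that filters on the partition column only has to read the matching directory. The Python below imitates that idea with an in-memory dictionary in place of a file system; partition_rows and read_partition are hypothetical helper names, not Hive APIs.

```python
from collections import defaultdict


def partition_rows(rows, key_index, key_name):
    """Split rows into buckets named like Hive partition directories (key=value)."""
    partitions = defaultdict(list)
    for row in rows:
        path = f"{key_name}={row[key_index]}"
        # The partition value lives in the directory name, so drop it from the row.
        partitions[path].append(row[:key_index] + row[key_index + 1:])
    return dict(partitions)


def read_partition(partitions, key_name, value):
    """Partition pruning: look at only the one bucket that matches the filter."""
    return partitions.get(f"{key_name}={value}", [])


if __name__ == "__main__":
    sample = [("u1", "IN", 3), ("u2", "US", 5), ("u4", "IN", 2)]
    parts = partition_rows(sample, key_index=1, key_name="country")
    print(read_partition(parts, "country", "IN"))
```

With data laid out this way, a filter on country never touches rows from other countries, which is where the speed-up on large tables comes from.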
Hive defines tables beforehand (a schema) and stores the schema information in a database (the metastore), whereas Pig has no dedicated metadata database.