Definition – What does SQL on Hadoop mean?
SQL on Hadoop is a type of analytical application tool — the SQL implementation on the Hadoop platform, which combines standard SQL-style querying of structured data with the Hadoop data framework. Hadoop is a relatively new platform, as is big data itself, and not many professionals are experts in it, but SQL on Hadoop simplifies access to the Hadoop framework and makes it easier to implement on current enterprise systems.
SQL-on-Hadoop is a class of analytical application tools that combine established SQL-style querying with newer Hadoop data framework elements.
By supporting familiar SQL queries, SQL-on-Hadoop lets a wider group of enterprise developers and business analysts work with Hadoop on commodity computing clusters. Because SQL was originally developed for relational databases, it has to be modified for the Hadoop 1 model, which uses the Hadoop Distributed File System and Map-Reduce or the Hadoop 2 model, which can work without either HDFS or Map-Reduce.
The different means for executing SQL in Hadoop environments can be divided into (1) connectors that translate SQL into a MapReduce format; (2) “push down” systems that forgo batch-oriented MapReduce and execute SQL within Hadoop clusters; and (3) systems that apportion SQL work between MapReduce-HDFS clusters or raw HDFS clusters, depending on the workload.