%md
# Introduction to Spark SQL

* This notebook explains the motivation behind Spark SQL.
* It introduces interactive Spark SQL queries and visualizations.
* This notebook uses content from the Databricks Spark SQL notebook and the [Spark SQL programming guide](http://spark.apache.org/docs/latest/sql-programming-guide.html).
%md
### Some resources on SQL

* [https://en.wikipedia.org/wiki/SQL](https://en.wikipedia.org/wiki/SQL)
* [https://en.wikipedia.org/wiki/Apache_Hive](https://en.wikipedia.org/wiki/Apache_Hive)
* [http://www.infoq.com/articles/apache-spark-sql](http://www.infoq.com/articles/apache-spark-sql)
* [https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html](https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html)
* [https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html](https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html)
* **READ**: [https://people.csail.mit.edu/matei/papers/2015/sigmod_spark_sql.pdf](https://people.csail.mit.edu/matei/papers/2015/sigmod_spark_sql.pdf)

Some of them are embedded below in-place for your convenience.
// This allows easy embedding of publicly available information into any other notebook.
// When viewing in git-book just ignore this block - you may have to manually chase the URL in frameIt("URL").
// Example usage:
// displayHTML(frameIt("https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation#Topics_in_LDA",250))
def frameIt( u:String, h:Int ) : String = {
  """<iframe
 src=""""+ u+""""
 width="95%" height="""" + h + """"
 sandbox>
  <p>
    <a href="http://spark.apache.org/docs/latest/index.html">
      Fallback link for browsers that, unlikely, don't support frames
    </a>
  </p>
</iframe>"""
}

displayHTML(frameIt("https://en.wikipedia.org/wiki/SQL",500))
displayHTML(frameIt("https://en.wikipedia.org/wiki/Apache_Hive#HiveQL",175))
displayHTML(frameIt("https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html",600))
SDS-2.x, Scalable Data Engineering Science