Posts

Showing posts with the label sql

Google Sheets with SQL

The Google App Suite ( or G-Suite) of products is a powerful and free alternative to the traditional Office Suite and includes among others Docs, Sheets and Slides. Google Sheets offers full spreadsheet functionality and in this post and sample, we show how data stored in Google Sheets spreadsheets can be accessed with SQL. This post assumes that the reader is familar with standard spreadsheet terms like sheet, range and functions. Google Script The primary programming language used here is Google App Script, a javascript look alike that can be used to enhance and extend G-Suite products. Google script can also be interleaved with traditional HTML / CSS / JavaScript to create an elegant user interface as is shown in this example. A Google Script project sits in the Google Drive just like any other Sheet, Doc or Slides and consists of code in a code.gs file along with html files for the interface. Google Script functionality can be built into other Google products but in this exam...

OLAP Data Cube with SQL

Image
As an erstwhile DBA, a long time user and a great admirer of the SQL language -- that has stood the test of time for the last 30 years -- I have always sought to use SQL in many useful ways. In an earlier post , I had shown how SQL can be used to solve a classic data science problem, namely Clustering, using the K-Means algorithm and today, I demonstrate how SQL can be used to process OLAP data cubes and generate the popular cross-tabs table. Data cubes, or OLAP cubes, are a way to store historic data using the dimensional model, as opposed to the relational or 3rd normal form model. These data cubes can be "sliced" and "diced" to reveal data relevant to particular dimensions. Because of the immense popularity and ubiquity of relational databases, like Oracle and MySQL, data in the dimensional model is routinely stored in relational tables and retrieved -- by slicing and dicing the cube -- using standard SQL constructs like the WHERE clause. This is called Relat...

K-means Clustering with SQL

Ever since I started my career in software and IT as a database administrator, I have had a great fascination, if not love, for the SQL language. I also know that there are many in the software fraternity (and sorority) who share my comfort with using SQL. Even after moving into the current enthusiasm, or should I call it a fad, with data analytics, I see that there exists many who prefer the ease and convenience of SQL even in big data. For example, Hadoop had to hide the complexity of its Map-Reduce magic behind HIVE. Spark offers SparkSQL. In an earlier post, I had shown how SQL can be used for scoring Data Mining models developed with R through the PMML route. In this post, we show how K-means clustering, a very common and widely used data mining activity can be done with SQL, using a technique adopted from Joni Salonen's blog post . The following codes were executed in MySQL. First we create the tables : drop table if exists km_data; -- contains initial data, on...