Magic of SQL in scoring Data Mining Models

As a former DBA, I love the fact that SQL has taken on a life of its own and is still in use decades after it was first conceived in the 1970s as the natural way to query Relational Database Management Systems. A vast amount of the world's data still sits in SQL-compliant RDBMS tables, and today, when business analytics and data science seem to be overshadowing SQL, I was delighted to find that SQL can still play a very important role in the implementation of complex data mining applications. This post explains how this evergreen tool remains very relevant in data mining.

Many data mining tools, such as Rattle and RapidMiner, are used to create "models" for Classification / Decision Trees and Hierarchical Clustering, but the models then have to be put into production by using them to score large datasets. This is where SQL can play a very powerful role. The models created in the data mining tools need to be exported as PMML documents and then converted to SQL using any of...
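
To make the idea of scoring with SQL concrete, here is a minimal sketch of what a converted decision tree might look like. It does not parse PMML; the tree, its thresholds, the `flowers` table and its column names are all hypothetical, written by hand purely for illustration. The point is only that a decision tree maps naturally onto a nested SQL `CASE` expression, which the database can then apply to every row of a table. SQLite is used here through Python's standard `sqlite3` module simply so the example is runnable anywhere.

```python
import sqlite3

# Hypothetical decision tree (NOT exported from any real tool), as nested tuples:
# (column, threshold, subtree_if_value_<=_threshold, subtree_if_value_>_threshold).
# A plain string is a leaf holding the predicted class label.
TREE = (
    "petal_length", 2.5,
    "setosa",
    ("petal_width", 1.7, "versicolor", "virginica"),
)


def tree_to_sql(node):
    """Translate the tree into a nested SQL CASE expression."""
    if isinstance(node, str):
        # Leaf: emit the class label as a SQL string literal.
        return f"'{node}'"
    column, threshold, low, high = node
    return (
        f"CASE WHEN {column} <= {threshold} "
        f"THEN {tree_to_sql(low)} ELSE {tree_to_sql(high)} END"
    )


def score(rows):
    """Load rows into an in-memory table and score them with the generated SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flowers ("
        "id INTEGER PRIMARY KEY, petal_length REAL, petal_width REAL)"
    )
    conn.executemany("INSERT INTO flowers VALUES (?, ?, ?)", rows)
    query = (
        f"SELECT id, {tree_to_sql(TREE)} AS predicted "
        "FROM flowers ORDER BY id"
    )
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Made-up sample rows: (id, petal_length, petal_width).
    sample = [(1, 1.4, 0.2), (2, 4.5, 1.5), (3, 6.0, 2.5)]
    print(tree_to_sql(TREE))
    for row in score(sample):
        print(row)
```

Because the scoring logic ends up as a single `SELECT`, it runs inside the database next to the data, with no need to pull large tables out into the mining tool. A real conversion would generate the `CASE` expression from the exported model rather than from a hand-written structure like `TREE`.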