September 21, 2016

From Logic to Magic -- In Search of the Real

The better standards of living that many parts of the world enjoy are often traced back to the Renaissance in Europe, which freed man from blind bondage to belief and allowed him to fly free on the wings of rational inquiry. These standards are, of course, defined in terms of material comfort -- food, clothing, shelter, safety and finally the leisure to explore the arts and the sciences. This leads to technical and administrative competence and the emergence of good governance that, in turn, loops back to create even higher standards of living. While every society desires this virtuous cycle, those that aggressively adopted a scientific approach were the ones that succeeded in overcoming others or converting them to their point of view. Spiralling out of Europe and reaching into the depths of America, Africa, Asia and Australia, it has been the triumph of the rational way -- based on facts, axioms, logic and reason -- that delivers material comfort to the population.

But is there an alternative? A narrative that seeks to look past the last 500 years of rational science and instead, perceives the universe through intuition and imagination? The Sanskrit word for philosophy is darshan, the sight, the perception of the truth as seen through the mind’s eye of the Vedic seer! But any attempt to subscribe to such a Vedantic vision of the world is immediately criticised as being an anti-scientific, irrational regression into saffron stupidity. Where is the proof? So first, let us get that out of the way  …

Kurt Gödel was a mathematician, a colleague, confidant and contemporary of Albert Einstein at the Institute for Advanced Study at Princeton. He is remembered for his famous Incompleteness Theorems, which showed that any consistent system of axioms rich enough to express arithmetic will contain at least one statement that is true but not provable within that system. Thus provability is a weaker notion than truth. Gödel’s Theorem knocks out the philosophical foundations of the edifice of modern mathematics, carefully crafted by Euclid with his axioms and proofs. Gödel showed how his theorem was applicable to mathematics in general and arithmetic in particular. This implies that if there are statements that are true but not provable even in a science as well structured as arithmetic, then there should be no difficulty in accepting the same in more complex, subjective philosophical systems. Lack of proof is no longer an excuse to deny the truth of what is otherwise intuitively obvious!

Now that the need for a proof is out of the way, let us explore some interesting ideas ..

Sankaracharya, who churned out the philosophy of Advaita Vedanta from the ocean of the Vedas and the Upanishads and created the most popular philosophical basis of Hinduism, states that the physical world is an illusion -- Brahma satyam jagat mithya, jivo brahmaiva naparah. Prima facie, this sounds absurd. How can the world that I see -- and touch, feel, experience -- around me not be real? Even if we ignore our senses, we have equipment in our laboratories that can record enough evidence from deep inside atoms to the outer edges of the galaxy and the universe.

But consider virtual worlds of the kind described in the movie The Matrix, experienced in online games like World of Warcraft and now being rendered through virtual reality devices like Oculus Rift, Samsung Gear and Microsoft HoloLens. Technology can blur the boundary between the real and the virtual but we may still claim the satisfaction of knowing that, in principle and in theory, it is possible to distinguish one from the other. But this satisfaction is short lived. Nick Bostrom, of Oxford University, in a paper published in 2003, puts forward the simulation hypothesis that argues quite convincingly that it is not impossible that the world that we inhabit is indeed a simulation (or “Sim”) that is being run on a digital computer in the multiverse. So the physical universe as we know it today becomes one of the many “Sims” in the multiverse operating at a higher dimension or plane of existence. Recently, Elon Musk has echoed a similar thought.

What exactly are these higher planes and dimensions? The normal reality that we are familiar with admits of three dimensions in space to which Einstein added a fourth dimension in time. String Theory, a descendant from the hoary lineage of Relativity, Quantum Mechanics and the Standard Model of particle physics, has made it quite respectable to consider the universe to have 10 or even 26 dimensions that are curled up, or crushed into the four that we know. A 3D structure can be reduced to a 2D photograph (for the engineer, the plan and elevation!) that in turn can be rolled up into a thin, 1D tube. Information is lost when dimensions are reduced and may be recovered when the original dimensions are unrolled. Edwin Abbott’s 1884 novella, Flatland, a satire on 19th century social issues that spanned 1D, 2D & 3D worlds, was one of the first modern texts that explored the mathematical novelty that crops up in multi-dimensional String Theory today.

Unlike Relativity and Quantum Mechanics, String Theory is not yet proven and is unlikely to be in the near future, but that does not make it untouchable in academic circles. So is the case with the simulation hypothesis that opens the doors to the multiverse. The illusory world of Maya that Sankara posits is indeed no different either.

Sankara talks about the primacy of the Brahman, the primordial, conscious sentience that is the only reality of the universe. Sentience and its close cousin intelligence are functions of information exchange, and information science can play an interesting role in exploring this area. For example, life as we know it is a manifestation of the information stored in the genetic code. The medium of storage, the DNA molecule, is physical and degradable but the personality, the spirit, the Atman, that is encoded in the gene is transcendental, immortal and transferable. You can destroy a paper book but not forget the classic story that was written on it! The story is independent of the physical book. From this perspective, the idea of an immortal Atman that evolves across multiple physical incarnations until it achieves identity with the Brahman certainly sounds feasible, much to the chagrin of the dyed-in-the-wool rationalist.

That information is the key to a fundamental understanding of the real world is a hot new topic in current physics. According to the MIT Technology Review, “Some physicists are convinced that the properties of information do not come from the behaviour of information carriers such as photons and electrons but the other way round. They think that information itself is the ghostly bedrock on which our universe is built”. Based on the work of Erik Verlinde, of the University of Amsterdam, who showed that the Laws of Gravity can be derived from the Laws of Thermodynamics, Jae-Weon Lee of Jungwon University, South Korea has shown how gravity can be related to quantum information. Of course, the information that they talk about is not the kind stored in books and computer disks but is defined in terms of symbols, sequences, probabilities and eventually entropy.

Information plays a key role in the description of both the cognizant intelligence that lies at the heart of reality as well as the physical depiction of this reality. Information is both the spirit as well as the body that is temporarily attached to it. Access to this information is the key to experience sat-chit-anand, the real and conscious bliss, that pervades and defines existence. Ekam Satya, vipra bahuda vadanti - Truth is one but many are the paths to it. Traditional science with its emphasis on experimental rigour and rationality is certainly a useful tool but the direct experience, born of meditation and leading to enlightenment, is an equally viable way to reach the same goal.

The Nasadiya Sukta of the Rig Veda 10:129 asks
But, after all, who knows, and who can say
Whence it all came, and how creation happened?

To which the classic textbook on Physics by Resnick and Halliday answers with a quote from the English poet W.B.Yeats saying that “the world is full of magical things patiently waiting for our wits to grow sharper”. Implicit in this statement is the fundamental premise of science that the world is understandable. The alternate premise is that the world is experienceable. Before the Mahabharata war, Krishna is seen trying to convert Arjun to his point of view. But after the first ten chapters of the Bhagavad Gita, Krishna realises that his logic has failed. Then he invokes the magic of a direct experience of the Divine, the Vishwaroop Darshan, in chapter eleven to show Arjun the reality and convince him to pick up his weapons again.

Logic and magic, reason and intuition, are two sides of the same coin that buys a ticket for the train that runs from darkness to light, from the illusion to the real.
This post was originally published in Swarajya, the magazine that reads India Right

August 22, 2016

Two Cheers, Not Three for Economic Liberalisation

1989 was a watershed year for both the world in general and me in particular.

I had just finished my PhD from the University of Texas at Dallas and had decided to break the jinx of the X+1 syndrome and return to India. Those who have been a part of the desi community in the US in the last century would recollect this strange yearning of those who had finally arrived in the US, not just physically, but metaphorically as well, to give it all up and return to India. Nostalgia for home, sprinkled with a sense of guilt for having abandoned it, competed with la dolce vita, the good life, that America held out to the F-1 visa community of graduate students, and it was always the good life that won out. Most of the F-1 crowd would eventually get the Green Card, permanent immigrant status, and then become US citizens but they would always keep alive the delusion that next year, X+1, they would wind it all up and move back to India. It was a delusion because India was still stuck in socialist quicksand, where the cost of a new car was twenty-five times the monthly salary of a fresh IIT B.Tech, while the corresponding factor in the US was three or four! Did I feel a turn in the wind? Did I suspect that things in India could change for the better? Perhaps I did or perhaps I was just foolish, but armed with a large-hearted offer from Tata Steel I decided to pack up and return.

On the way back, my wife and I decided to use the $2200 windfall that I had just got by selling my Mazda 626 car to buy two tickets for a 15 day tour through Europe with the travel company Globus Gateway. Europe of course meant western Europe because the Iron Curtain of communism ensured that eastern borders could not be crossed very easily. Even within western Europe, we had to obtain seven separate visas for the countries that we would pass through. Nevertheless we eventually arrived in Paris on Bastille Day to realise that the world was celebrating 200 years of the French Revolution. But little did we know that three months later, while we would safely be in Jamshedpur by then, the world would see the spectacular fall of something that is closer to us in history -- the Berlin Wall.

The aftershocks of the fall of the Berlin Wall reverberated throughout the world and in a way led to the fall of India’s Soviet-era socialist economic model in 1991. Indians finally had the chance to participate in the global economy and today, the 2015 IIT graduate with his Rs 15 lakh placement package can finally think of a new car with only four months of his salary -- just as it was in the US in 1989! Some may of course wonder whether a new car is all that important for a fresh graduate but that is another question that can be debated elsewhere.

This summer, my wife and I were back in Europe, with our son, and with no Iron Curtain in the way, we decided to go through the great cities of Eastern Europe. Did I see anything different? Not really. As a tourist you visit palaces and churches, ride trams, take cruises and eat and drink in pubs and bars that have not really changed over the years. But the real change that I felt was in me -- and by extension, in other Indians. This was a direct outcome of the economic reforms that were kicked off in 1991 by the beleaguered Government of India in a desperate attempt to stave off socialism-inspired bankruptcy.

So what were these changes that I felt?

First was economic freedom. I had grown up in an upper middle class family in Calcutta, studied in a renowned school, and we were financially well off, but my father could never dream of a family vacation in Europe! That was for “big businessmen like Tata, Birla”. This has changed. The emergent middle class in India can now think big as well, not just in terms of vacations but in most of the good things in life. No longer do we wait for our relatives to come back from foreign lands and hand out shampoo and soap!

What is more important is that our currency is recognised internationally. Before 1991 the rupee was worthless outside India. Getting “foreign exchange” for even the most mundane and legitimate purposes, like paying the application fees for a US university, was a titanic struggle with forms to be filled in triplicate. Any foreign exchange in cash or cash-equivalent travellers cheques had to be entered in the passport for subsequent scrutiny by vulturesque customs officers. Given the restrictions on getting foreign exchange and the meagre amounts that could be obtained -- unless of course you had the right connections -- travelling abroad was difficult. You had to think thrice before eating out at anything more expensive than a McDonald’s restaurant. But now our own Indian credit cards, issued by our own Indian banks, are readily accepted anywhere around the world and this was a very pleasant surprise for me. Conditioned as I have been to moving around with limited amounts of dollars, and keeping track of every cent that I was spending, the fact that I could access an ATM and withdraw euros, zlotys, forints and korunas directly from my rupee savings bank account in India was something that took me quite some time to get accustomed to.

The next big change is in telecommunications. I had grown up in an India where a phone was a luxury and one had to wait for years to get a connection. STD was unknown and trunk calls -- with variations like lightning calls and person-to-person calls -- were hideously expensive. Long distance calls within the US were quite reasonable but calling India from the US was frightfully expensive: one had to pay nearly US$ 5/min, and that too when it was night in India. The first time I saw a fax -- it used to be called ZapMail -- was when the Swiss embassy suddenly wanted a copy of our air ticket before issuing a visa. Back in India, calling up my mother in Calcutta from Jamshedpur involved visiting a post office, standing in a queue and paying Rs 100 in advance before the call could even be attempted. Mind you, it was attempted, not guaranteed to connect! Outside cities, telephones were impossible. I remember the wedding of a friend of mine where we accompanied the groom from Jamshedpur to Bokaro via Purulia and, when faced with a sudden emergency, we could not make any kind of call until we had actually reached our destination.

This of course has changed beyond recognition. First, thanks to private players and wireless technology, getting a phone in India is just one KYC compliance away. Then you have VoIP technology like WhatsApp and Google Voice and when this is coupled to free WiFi services available at each and every hotel and restaurant in Europe, we were in constant touch with friends and family at virtually no cost.

Consumer goods, currency controls and communications -- since the heady days of 1991, all this has changed for the better in India, but what has not? Many things, including our attitude towards corruption and criminals in public life, but perhaps what is most obvious is India’s travel and transport infrastructure. While private airlines and app-based cabs cater to the requirements of the well-heeled traveller, the common man is still at the mercy of inadequate and overcrowded public transport systems. In my current visit to Europe nothing showed this up more than the usage of trams in inner city transport.

Calcutta has a history of trams going back to 1902 and has the oldest running electric trams in Asia. But thanks to a combination of unfortunate incidents, including but not limited to the destruction of a large amount of rolling stock by the communists in the 1960s, the tram system is gasping for breath. Unimaginative planning, incompetent operations, venal politics and inevitable corruption have come together to destroy an elegant, inexpensive and non-polluting form of transportation. As a big fan of trams in Calcutta, I have often been told that trams are obsolete and are an anachronism in a modern city. But this year in city after city, in Berlin, Warsaw, Krakow, Brno, Budapest, Vienna and Prague, we saw how modern and sophisticated trams have been integrated with buses and even river boats to create an affordable and efficient public transport system. Why can we not build the roads and railways that this country so desperately needs?

What is lacking in India is neither technology nor capital but the ability, or perhaps the willingness, to put things together and craft an elegant solution that addresses basic infrastructural requirements. The economic reforms of 1991 may have uncorked the bottle of stifling socialism and released the genie but the genie is yet to master the magic that will create the right management structures not only for transportation but also for schools, hospitals, municipalities, courts of law, law enforcement, tax collection and in fact for the entire infrastructure of governance and public services.

The reforms of 1991 might have vindicated my 1989 decision to return to India because in purely economic terms, India today offers opportunities to achieve and maintain a standard of living that is comparable to what was possible in the United States. But what the reforms have left incomplete are the corresponding changes in governance procedures. With people and their mindset remaining the same, the only way to upgrade this infrastructure of governance is perhaps to reduce the discretionary role of humans and move over to a more systems-driven approach to governance. As argued by this author in the May 2015 issue of this magazine, we need to leverage technology and modern management techniques to the hilt and use them to overcome deficiencies caused by people. Unless this happens, and happens very quickly, future generations of Indians will once again think of India as not a good place to go to, or even return to, but perhaps just a great place to have come from.

And till then, it is only two cheers for economic liberalisation!
This article was originally published in Swarajyamag - the Magazine that reads India right

July 21, 2016

The Second Book on the Third Wave

Steve Case is such a big fan of Alvin Toffler’s 1980 classic, The Third Wave, that when he pens his own memoirs he gives it the same title. In his seminal work, Toffler had identified three distinct waves in the evolution of human society as the world moved from agriculture, through industry to become a post-industrial information driven society. Steve divides Toffler’s third wave -- the information phase -- into three sub-waves and then examines the third of this third in greater detail.

In addition to being his memoirs, which chronicle the rise and fall of America Online, the company that really got Americans hooked to the internet, there are two other distinct themes that Steve has woven into this easy to read book. First, he wants to be mentor and cheerleader for the entrepreneur who has an idea to change the world and does not know how to go about it. The second, and this is his pet theme, is the distinction between the first, second and third waves, or sub-waves, of the internet driven economy that dominates the world today.

For Steve, the first wave, in which he and his company AOL played a very significant role, was all about the setting up of the infrastructure of the internet and world wide web. This wave collapsed in the dot com bust and was followed by the second wave of applications -- Google for search, Facebook for social media, WhatsApp for communications, Amazon for commerce. The key difference between the first and the second wave was that the second was driven by individuals, or small groups, using cutting edge technology while the first wave was not so much about innovative technology as about clever collaborations and partnerships. Steve admits as much when he says that “AOL was not alone in believing in the idea of the Internet but we outhustled and outexecuted our competitors. The big companies like IBM and GE, should have prevailed, but they didn’t. Their lack of agility and entrepreneurial passion and culture hobbled them.”

The third wave will finally see the internet delivering on the promise of universal connectivity that its evangelists have been talking about since its early days. IoT -- the Internet of Things -- will connect every device from the car to the toaster, the smartphone to the refrigerator, the power plant to the electric switch through the internet and deliver innovative, useful services seamlessly. Steve believes that this connectivity will be so ubiquitous that the phrase internet enabled device will be as irrelevant as, say, an electricity enabled washing machine. Being connected to the internet will be the default and not a novelty or a USP. This will also ensure that third wave companies, and applications, will not create new or unusual business opportunities but will streamline, and make more efficient, existing mainstream businesses like healthcare, education and agriculture that form the backbone of the global economy today.

Steve believes that the crucial differentiator for the third wave companies will be, like the first wave once again, partnerships. Unlike Elon Musk or Jeff Bezos, Steve is no champion for new or groundbreaking technology. Instead he believes that the success of the third wave entrepreneur will lie in stitching together a network of alliances and partnerships across three kinds of entities, namely, technology creators, mainstream businesses and government agencies. Know-who will take precedence over know-how. Unlike most technophiles, Steve believes that government can and must be trusted and, however difficult it may be, the entrepreneur must walk that extra mile to take government along if he wants to succeed. Entrepreneurs in India would, I am sure, wholeheartedly go along with this sentiment since they know very well that in India managing the government is more important than technology or management systems.

As an extension of his recurring belief in the value of partnerships --  “If you want to go quickly, go alone, if you want to go far, go together” -- Steve has a team of professionals from West Wing Writers to sieve through his speeches, distill out the wisdom and package it into this nice book. But as in anything created by a committee, the result, while being faultless to a point, lacks the brilliance of original ideas or the elegance of literary craftsmanship! Entrepreneurs however will see in Steve an image of their days of struggle, learn about the importance of networking and partnering with government and be motivated to jump into the third wave of the digital society that is already cresting around us today.

Originally published in the August 2016 issue of Swarajya, the magazine that reads India right!

June 30, 2016

Build your website at the lowest cost

This blog post will show you how to create a fairly decent website at a guaranteed lowest cost and that too without writing any code. Take a look. This post was originally written for iot-hub and the approach is currently being used at Yantrajaal as well. However, when Yantrajaal was created in 1999, none of these technologies existed and I had to take a more expensive route, that you do not need today.

The first step to creating your, or your company's, digital identity is to build a website. Most people begin by purchasing web hosting services either from a web hosting company or from a value added reseller and having it build the website for them. While this may be fine, a do-it-yourself approach will get you going at the minimum possible cost. This post will tell you how you can do this.

1. Purchase a domain name from a domain registrar like TierraNet or any other similar company. This will cost you around US$ 14 / year. You can get an absolutely free domain name from Freenom but these domains will end with .tk, .ml, .ga, .cf, .gq and not with the usual .com, .net, .org etc. Irrespective of where you purchase your domain from, make sure that you have complete access to configure or modify the DNS records corresponding to your domain, preferably through a GUI.

2. For hosting your website you have two options. (a) Get a traditional web server from a hosting company like x10hosting, which could be free or have a monthly charge. Make sure that you have access to the cPanel application to manage your website. (b) The other option is to use Google's Blogger platform. Unless you want to build a transactional website with PHP-MySQL (or equivalent) support, the Blogger platform is far easier to work with and is an excellent starting point. The Blogger option is strongly recommended.

This post assumes that  you have chosen the blogger option.

3. Create a blog by following instructions given in this tutorial. For the name of the blog, choose the same character string as you have for the domain. If your domain is then your blog should be This is not essential but is a nice to have feature.

4. A blog looks different from a traditional website because unlike the latter it does not have a fixed home page nor does it have a set of navigational tabs across the top. To get over this problem, follow instructions given in this post.

5. Now you need to link your domain to your blog. For this you need to log in to your domain registrar account (created in step 1) and then navigate to the screen that allows you to manage the DNS. There will be DNS records that need to be added or modified. To do so, go to this Google Support Site and follow the instructions there. Remember, you are modifying the top level domain, that is and not a subdomain like and choose the appropriate instructions. The DNS records that you add may conflict with existing DNS records that the registrar would have provided by default (these point visitors to a default, under construction website). If in doubt, keep a screenshot of the earlier records and delete all of them. The new records should do the work. Anything to do with DNS servers takes some time to take effect. So after completing this step, go away to do something else for three or four hours (though Google claims that it will take 24 hours) and then see if you can load your website at If everything is OK, you should see your blog.
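While waiting for the DNS change to take effect, you could check from your own machine whether the domain resolves yet instead of reloading the browser. The following is only a minimal sketch: the function names and the domain "example.com" are placeholders that do not come from this post, and a successful lookup only tells you that the name resolves, not that it already points at your blog.

```python
import socket


def resolve(domain, lookup=socket.gethostbyname):
    """Return the IP address the domain currently resolves to, or None."""
    try:
        return lookup(domain)
    except OSError:
        # socket.gaierror (name not found) is a subclass of OSError
        return None


def describe(domain, lookup=socket.gethostbyname):
    """Build a one-line, human-readable status message for the domain."""
    address = resolve(domain, lookup)
    if address is None:
        return domain + " does not resolve yet; wait and try again"
    return domain + " resolves to " + address


# Usage ("example.com" is a placeholder, substitute your own domain):
#     print(describe("example.com"))
```

The lookup function is passed in as a parameter only so that the logic can be exercised without a network connection; in normal use the default is enough.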

June 25, 2016

Spark, Python & Data Science -- Tutorial

Hadoop is history and Spark is the new kid on the block who is the darling of the Big Data community. Hadoop was unique. It was a pioneer that showed how "easy" it was to replace large, expensive server hardware with a collection, or cluster, of cheap, low end machines and crunch through gigabytes of data using a new programming style called Map-Reduce that I have explained elsewhere. But "easy" is a relative term. Installing Hadoop or writing the Java code for even simple Map-Reduce tasks was not for the faint hearted. So we had Hive and Pig to simplify matters. Then came tools like H2O and distributions like Hortonworks to make life even simpler for non-Geeks who wanted to focus purely on the data science piece without having to bother about technology. But as I said, with the arrival of Spark, all that is now history!
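For readers who have not met the Map-Reduce style before, here is a rough sketch of the idea in plain Python, with no Hadoop or Spark involved. The three function names are my own labels for the usual phases and are not part of any library; a real cluster would run the map and reduce phases in parallel on many machines, which this toy version does not attempt.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)


def shuffle_phase(pairs):
    """Group the emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases one after another on an iterable of text lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))
```

Calling word_count on the two lines "to be or" and "not to be" gives a count of 2 for "to" and "be" and 1 for "or" and "not".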

Spark was developed at the University of California at Berkeley and appeared on the horizon for data scientists in 2013 at an O'Reilly conference. You can see the presentations made there, but the following one will give you a quick overview of what this technology is all about.

But the three real reasons why Spark has become my current heart-throb are
  1. It is ridiculously simple to install. Compared to the weeks that it took me to understand, figure out and install Hadoop, this was over in a few minutes. Download a zip, unzip, define some paths and you are up and running
  2. Spark is super smart with memory management and so unlike Hadoop, starting Spark on your humble laptop will not kill it. You can keep working with other applications even when Spark is running -- unless of course you are actually crunching through 5 million rows of data. Which is what I actually did, on my laptop.
  3. And this is the killer. Coding is so simple. The 50 lines of Java code -- all that public static void main() crap -- needed in Hadoop reduce to two or three lines of Scala or Python code. Seriously, not joking.
And unlike the Mahout machine learning library of Hadoop that everyone talked about but no one could really use, the Spark machine learning library, MLlib, is something that you can be running at the end of this tutorial itself. So enough of chit-chat, let us get down to action and see how easy it is to get going with Spark and Python.
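To give a feel for how short the code gets, here is a sketch of what a word count can look like with the Spark Python API. It assumes you are inside the pyspark shell, where a SparkContext named sc is already available; the helper names tokenize and word_count and the file name "input.txt" are placeholders of mine and not part of Spark.

```python
from operator import add


def tokenize(line):
    """Split one line of text into lowercase words."""
    return line.lower().split()


def word_count(sc, path):
    """Count words in a text file, given a SparkContext-like object `sc`."""
    return (
        sc.textFile(path)                 # one record per line of the file
        .flatMap(tokenize)                # one record per word
        .map(lambda word: (word, 1))      # pair each word with a count of 1
        .reduceByKey(add)                 # sum the counts for each word
        .collect()                        # bring the (word, count) pairs back to the driver
    )


# Usage inside the pyspark shell, where `sc` already exists
# ("input.txt" is a placeholder file name):
#     for word, count in word_count(sc, "input.txt"):
#         print("%s %d" % (word, count))
```

The chained calls textFile, flatMap, map, reduceByKey and collect are the whole program; everything else here is wrapping so that the steps have names.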

In this post, we will show how to install and configure Spark, run the famous WordCount program so beloved of the Hadoop community, run a few machine learning programs and finally work our way through a complete data science exercise involving descriptive statistics, logistic regression, decision trees and even SQL -- the whole works.

Though, in principle, Spark should work on Windows, the reality is that it is not worth the trouble. Don't even try it. Spark relies on Hadoop libraries and Hadoop is never very comfortable with Windows. If you have access to a Linux machine, either as a full machine to yourself or one that has a dual boot with Windows and Linux, then you may skip section [A] on creating virtual machines and go directly to section [B] on installing Spark.

Also please understand that you need a basic familiarity with the Linux platform. If you have no clue at all about what "sudo apt-get ..." is or have never used "vi" or an equivalent text editor then it may be a good idea to have someone with you who knows these things during the install phase. Please do understand that this is not like downloading an .exe file in Windows and double-clicking on it to install a piece of software. But even if you have only a rudimentary understanding of Linux and can follow instructions, you should be up and running.

A] Creating a Virtual Machine running Ubuntu on Windows

If your machine has only Windows -- as is the case with most Windows 8 and even Windows 10 users -- then you will have to create a Linux Virtual Machine and carry out the rest of the exercise on the VM. This exercise was comfortably carried out on an 8GB RAM laptop but even 6GB should suffice.

  1. Download Oracle VirtualBox [ including Extension pack ] software for Windows and install it on your Windows machine.
  2. Download an Ubuntu image for the Virtual Box. Make sure that you get the image for the VirtualBox and not the VMware version! This is a big download, nearly 1GB and may take some time. What you get is a zip file that you can unzip to obtain a .vdi file, a virtual disk image. Note the userid, password of the admin user that will be present in the VM [ usually userid is osboxes and password is, but this may be different ]
  3. Start the VirtualBox software and create a new virtual machine using the vdi file that you have just downloaded and unzipped. You can give the machine any name but it must be defined as type Linux, version Ubuntu.
    1. If you are not sure how to create a virtual machine, follow these instructions. Remember to allocate at least 6GB RAM to the virtual machine
    2. If your machine is 64 bit but VirtualBox is only showing 32 bit options then it means that virtualization has been disabled on your machine. Do not panic, simply follow instructions given here. If you don't know how to boot your machine into bios then see
    3. Once your Ubuntu virtual machine starts, you will find that it runs in a small window and is quite inconvenient to use. To make the VM occupy the full screen you need to install Guest Additions to VirtualBox by following the instructions given here [ sudo apt install virtualbox-guest-additions-iso ] followed by loading the CD image as explained here
    4. In the setup options of the VM you can define shared folders between the Windows host OS and the Ubuntu guest OS. However the shared folder will be visible but not accessible to the Ubuntu userid until you do this
    5. Steps 3 and 4 are not really necessary for Spark but if you skip them you may find it difficult or uncomfortable to work inside a very cramped window
  4. Strangely enough, the VM image does not come with Java, which is essential for Spark. So please install Java by following these instructions.
Ubuntu is so cool! Who wants Windows?

B] Install Spark

Once we have an Ubuntu machine, whether real or virtual, we can now focus on getting Python and Spark.
  1. Python - The Ubuntu 16.04 virtual machine comes with Python 2.7 already installed, which is adequate if you want to use Spark at the command line. However if you want to use iPython notebooks [ and our subsequent tutorial needs notebooks ] you will have to install them separately.
    1. There are many ways to install iPython notebooks but the easiest way would be to download and install Anaconda
      1. Note that this needs to be downloaded inside the Ubuntu guest OS and not the Windows host OS if  you are using a VM.
      2. When the install script asks if Anaconda should be placed in the system path, please say YES
    2. Start python and ipython from the Ubuntu prompt and you should see that Anaconda's version of python is being loaded.
  2. Spark - the instructions given here have been derived from this page but there are some significant deviations to accommodate the current version of the ipython notebook.
    1. Download the latest version of Spark from here.
      1. In the package type DO NOT CHOOSE source code as otherwise you will have to compile it. Choose instead the package with the latest pre-built Hadoop. 
      2. Choose direct download, not a mirror.
    2. Unzip the tgz file, move the resultant directory to a convenient location and give it a simple name. In our case it was /home/osboxes/spark16
    3. Add the following lines to the end of file .profile
      1. export SPARK_HOME=/home/osboxes/spark16
      2. export PATH=$SPARK_HOME/bin:$PATH
      3. export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
      4. export PYTHONPATH=$SPARK_HOME/python/lib/py4j-<version>-src.zip:$PYTHONPATH
        1. to get the correct version of the file go to $SPARK_HOME/python/lib and see the actual value
        2. the last two paths are required because in many cases the py4j library is not found
    4. To start spark in the command line mode enter the "pyspark" command and you should see the familiar Spark screen. To quit enter exit()
    5. To start spark in the ipython notebook format enter the command IPYTHON_OPTS="notebook" pyspark at the shell prompt. The older strategy of using profiles to start the ipython notebook may not work, because the current version of jupyter no longer supports profiles, and hence this strategy was used. This will start the server and make it available at port 8888 on localhost. To quit press ctrl-c twice in quick succession.

    6. An alternative way of starting the notebook, not involving the IPYTHON_OPTS command, is shown here. This is easier.
      1. Start the notebook with the command ipython notebook (or alternatively, jupyter notebook)
      2. Execute these two lines from the first cell of the notebook
        1. from pyspark import  SparkContext
        2. sc = SparkContext( 'local', 'pyspark')
  3. Now that we have Spark running on our Ubuntu machine, check out the status at http://localhost:4040
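
If the notebook still complains that py4j cannot be found, the paths can also be set from inside Python instead of .profile. What follows is only a sketch under assumptions: the function names are made up for this post, and it assumes the py4j library sits in $SPARK_HOME/python/lib as a file named py4j-<version>-src.zip, whatever the version happens to be in your download.

```python
import glob
import os
import sys


def find_py4j_zip(spark_home):
    """Return the path of the py4j source zip under spark_home, or None."""
    # Assumption: the file is named py4j-<version>-src.zip
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    matches = sorted(glob.glob(pattern))
    return matches[-1] if matches else None


def add_spark_to_path(spark_home):
    """Put Spark's python directory and the py4j zip on sys.path."""
    py4j_zip = find_py4j_zip(spark_home)
    if py4j_zip is None:
        raise IOError("no py4j zip found under %s" % spark_home)
    for entry in (os.path.join(spark_home, "python"), py4j_zip):
        if entry not in sys.path:
            sys.path.insert(0, entry)
    return py4j_zip
```

In the first cell of a notebook one would call add_spark_to_path(os.environ["SPARK_HOME"]) and only then run from pyspark import SparkContext.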

C] Running Simple programs

If you are not familiar with Python do go through the first few exercises of Learn Python the Hard Way, and if the concept of a notebook is alien to you then go through this tutorial.

Go to this page and scroll down to the section "Interacting with Spark" and follow the instructions there to run the WordCount application. This will need a txt file as input and any text file will do. If you cannot find a file, create one with vi or gedit and write a few sentences there and use it. Enter each of these lines as a command at the pyspark prompt

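# Load the input file as an RDD of lines
text = sc.textFile("datafile.txt")
print(text)
from operator import add
# Split one line into a list of words
def tokenize(text):
    return text.split()

words = text.flatMap(tokenize)
print(words)
# Pair every word with a count of 1
wc = words.map(lambda x: (x,1))
print(wc.toDebugString())
# Add up the counts for each word and write them out
counts = wc.reduceByKey(add)
counts.saveAsTextFile("output-dir")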

The final output in Hadoop style will be stored in a directory called "output-dir". Remember Hadoop, and hence Spark, does not allow the same output directory to be reused.
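
Since the output directory cannot be reused, a second run of the program will fail unless the old directory is removed first. One way around this, when the output sits on the local file system of the VM, is a small helper like the one sketched below. The name fresh_output_dir is invented for this post, and the helper does nothing for output stored on HDFS.

```python
import os
import shutil


def fresh_output_dir(path):
    """Delete a local output directory left over from an earlier run.

    Returns the same path so that it can be handed straight to Spark.
    """
    if os.path.isdir(path):
        shutil.rmtree(path)
    return path
```

The save step would then read counts.saveAsTextFile(fresh_output_dir("output-dir")). Be careful with it, because whatever was in that directory is gone for good.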

The same commands can also be entered one by one in the ipython notebook and you would get the same result
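
If you are curious about what flatMap, map and reduceByKey are doing to the data, the same three steps can be imitated in plain Python on a small list of lines. This is only an illustration of the idea and not how Spark works internally, since Spark spreads the work over partitions and evaluates lazily. The function word_count is a made-up name.

```python
from collections import defaultdict
from operator import add


def tokenize(line):
    return line.split()


def word_count(lines):
    # flatMap: every line becomes several words, all in one flat list
    words = [w for line in lines for w in tokenize(line)]
    # map: pair every word with a count of 1
    pairs = [(w, 1) for w in words]
    # reduceByKey(add): add up the 1s that share the same word
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] = add(counts[word], n)
    return dict(counts)
```

Calling word_count on a couple of short sentences and printing the result at each step is a quick way to get a feel for the pipeline before trying it on an RDD.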

This establishes that you have Spark and Python working smoothly on your machine. Now for some real data science

D] Data Science with Spark

[New 24Jul16] Unlike Hadoop / Mahout, the machine learning library of Spark is quite easy to use. There are tons and tons of samples, including machine learning samples, available. These samples along with the sample data are also available in the Spark Home directory that gets created during the installation of Spark as described above. You can run these programs using the spark-submit command as explained in this page after making small changes to bring them into the format described on that page. The basic template for converting these samples to run with spark-submit and two sample programs for clustering and logistic regression are available for download here.

To understand the nuances of the MLLIB library read the documentation, then, for example, follow the one on k-means. For more details of the API and the k-means models follow the links.
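
Before calling the MLLIB version, it may help to see what k-means actually computes. The sketch below is a plain-Python rendering of the standard Lloyd's algorithm and is NOT the MLLIB implementation, which runs on RDDs and uses smarter initialisation. Here the first k points are simply taken as the starting centers, an assumption made to keep the example short and repeatable.

```python
def closest(point, centers):
    """Index of the center nearest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center))
             for center in centers]
    return dists.index(min(dists))


def kmeans(points, k, iterations=10):
    """Plain-Python Lloyd's algorithm; the first k points seed the centers."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[closest(p, centers)].append(p)
        # Update step: move each center to the mean of its members
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its old center
                centers[i] = [sum(col) / float(len(members))
                              for col in zip(*members)]
    return centers
```

In Spark the equivalent call is KMeans.train from pyspark.mllib.clustering, which takes an RDD of points and the number of clusters and returns a model holding the cluster centers.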

Jose A Dianes, a mentor at codementor, has created a very comprehensive tutorial on data science and his ipython notebooks are available for download at github. This uses actual data from a KDD cup competition and will lead the user through

  • Basics of RDD datasets
  • Exploratory Data Analysis with Descriptive Statistics
  • Logistic Regression
  • Classification with Decision Trees
  • Usage of SQL
After going through this tutorial, one will have a good idea of how Spark and Python can be used to address a full cycle data science problem, right from data gathering to building models.

Spark is a part of the curriculum in the Business Analytics program at Praxis Business School, Calcutta. At the request of our students I have created an Oracle Virtual Appliance that you can download [ 4GB though ], import into your VirtualBox and then go directly to section [D]! No need for any installation and configuration of Ubuntu, Java, Anaconda, Spark or even creating the demo MLLIB code. This VM has been configured with 4GB RAM which just about suffices. Increase this to 6GB if feasible. -- Updates : [28Aug16] - New Virtual Box (password ="")

June 23, 2016

Tales from IIT Kharagpur

The IIT KGP Pre 1993 group on Facebook was created as a reaction and rebuff to the official IIT KGP Alumnus group that had become a hotbed of political and religious bigotry. The Pre 1993 crowd is believed to be more of the "kool kgp type" and even if some of them do have strong views on political and religious matters, such things are kept aside in this group. Posts in this group are more in the form of happy reminiscences that the old men like to tell, listen to and enjoy.

Since it is difficult to search through Facebook posts, I have created these pointers to the stories that I had contributed to this group. But to respect the privacy of the closed group where these stories were posted, I have made sure that only members can read the whole story -- and the incredible comments that other members have made.

6th July 2015
Sometime in the recent past, Surath Chatterjee, like many of us, reached his 50th year and some of his friends had thrown a party for him at Bali. Unfortunately I could not join him at Bali but my thoughts went back to another birthday party that we in Azad C-Top-West had thrown for him. This might not have been at Bali, but I suppose it was no less spectacular! [ continued here! ]

21 June 2016
This was in our third or fourth year, 1982 or 1983, and even the names of the co-conspirators are fading out. It was the night of the Vishwakarma Puja, that heralds the start of the festive season in Bengal. Some of us from Azad, C-top West had visited the Salua / Prembazar end of the campus for a shot of Mahua and on the way back we decided to stop by the various Vishwakarma puja pandals that had sprouted around the campus. The farthest one was the one at the Workshop and we decided to go there .. and on the way of course was the dark and ghostly structure of the Old Building. We had heard of the ghosts of the freedom fighters who were martyred there but about that we did not really care. What we were really worried about was the security staff because we had decided -- on the spur of the moment and under the influence of Madam M -- to climb to the top of Old Tower! [ continued here!]

June 15, 2016

Society and the Second Law of Thermodynamics

When I think back about it, the New Delhi that I had visited as a child in the late 1960s was so much cleaner, nicer and better than the city I occasionally go to today. As a tourist in Europe in the late 1980s I did not have to think about the terror, and counter-terror, that is now being unleashed in Brussels and Paris. The Kolkata that I live in today is an urban disaster compared to the Calcutta that I went to school in. It is difficult to deny that, net-net, there has been a decay and degradation in the quality of urban life over the past 50 years.

Is it only in urban life? And is it only for the past 50 years? The dry statistics captured by economists in the Human Development Index -- a vector consisting of life expectancy, education and per-capita income -- and talked about by governments in power would claim that the world is becoming a better place. But the common man in the bazaars of India, and perhaps the world as well, would beg to differ. With his native intelligence, or rather his intuition, he would tend to believe that the past was better, cleaner and more peaceful than the murderous mayhem that he finds himself in at the moment. Is this intuition correct? Or should we believe the economic indicators that claim that we are better off?

Globally, and socially, are things changing for the better or for the worse?

To seek an apolitical answer to this question, let us travel back in time, to the early years of the nineteenth century when steam power was driving the industrial revolution and propelling Europe towards its pinnacle of economic, political and social glory. Everyone was trying to understand the mechanics of the steam engine and improve its efficiency and it was the Frenchman, Sadi Carnot (1796 - 1832) who came up with a body of knowledge that is known as thermodynamics today. His ideas were far from perfect but thanks to subsequent interpretations by Lord Kelvin, Rudolf Clausius and Ludwig Boltzmann, we have today the Second Law of Thermodynamics that states that in any physical process the entropy of the universe always increases. This technical statement is generally explained as saying that the physical world can only move from a state of order to disorder or from a state of lesser chaos to higher chaos! Though there is a counter perspective that refuses to view entropy as disorder but only as a measure of energy dispersal, the dominant, popular narrative continues with the analogy of chaos and disorder.

The Second Law of Thermodynamics is one of the key pillars of modern science, on par with Newton’s Laws of Mechanics, Maxwell’s Laws of Electromagnetism, Einstein’s Theory of Relativity and Quantum Mechanics of Schrodinger and Heisenberg. Its veracity is beyond challenge. However one must note that the Second Law refers to the entropy of the universe as a whole. The universe consists of a system and its environment. The system could be something as simple as an empty box or as complex as a spaceship, the planet Earth or even a galaxy. The environment is everything outside the box, or the spaceship or Earth or the galaxy. Hence it is quite possible that the entropy, or disorder, in a system can decrease but the corresponding increase in the entropy of the environment would be such that the sum total of the decrease and the increase will result in an overall increase of the entropy of the universe. There is no known deviation from, or violation of, this Second Law in the domain of physical sciences. The arrow of time itself is defined by the change towards greater disorder.

Is it possible for a law that is valid in the physical sciences to be applicable in the social sciences?

Can we say that a society at rest continues to be at rest until acted upon by an external idea? Can we say that the rate of change in a society is proportional to the strength of an external idea and is inversely proportional to the size of the society? Can we say that most, if not all, social actions are accompanied by equal and opposite reactions? Even if we answer these questions with a qualified yes, then we are transferring Newton’s Laws from the physical world to the world of social sciences. Extending this logic to the Second Law means that society will inevitably go from order to disorder, from lesser chaos to greater chaos!

That the world is getting progressively “worse” was a concept that sent a shock through European society in the nineteenth and early twentieth century when the consequences of the Second Law seeped into public consciousness. People argued that the flowering of the European civilisation with its beautiful cities and stable social structures did not quite agree with the notion of increasing chaos. But then people did not realise that this decrease in “chaos” in Europe was accompanied by a far greater increase in “chaos” that was being unleashed in colonies across Asia, Africa and South America! The net chaos of the universe was indeed increasing. Similarly in post-independence India, we may claim that the order enforced by the Constitution has reduced the anarchy prevalent in the era of Mughals and the Marathas but this claim is undermined when we see the growth of Maoist, Islamist and separatist insurgency and the continuing politics of caste that has been unleashed.

The collective wisdom of the Hindu psyche represents this inevitable decay in the context of a Mahayuga that lasts for 4.32 million years and consists of four Yugas. Chaos, disorder or “evil” increases monotonically as society traverses through Satya, Treta, Dwapar and eventually reaches Kali Yuga where it ends in a blaze of violence. Though “modern” humans appeared 200,000 years ago and “civilisation” 6000 years ago, the figure of 4.32 million is not incompatible with the first evidence of humanoid behaviour that emerged 6 million years ago, when our ancestors started walking erect leaving their hands free to wield the tools and weapons that change the world.

A modern perspective would be to look at social entropy, defined as a measure of dissatisfaction within a social, political and economic system -- and we can see at once that there is no doubt that this is increasing all around us at an alarming rate. Whether it is the killing fields of the Islamic State in Iraq and Syria, the Arab Spring in North Africa, the rising intolerance in North America, the siege mentality in Europe or closer home the “syndicate”-raj in Rajarhat or the anarchy in JNU, discontent is huge and growing rapidly. Neither does any amount of art, literature or other finer elements of culture, nor any spectacular technology, like SpaceX or solar cells, seem to have any calming effect. This is where we are tempted to take one of the grandest ideas of science and apply it to complex social phenomena! Whether we should do so could be the subject of research, but if we could, what are the consequences?

Accepting the inevitability of increasing disorder would mean that the world would become an increasingly uncomfortable place. Elections may come and go but the average misery for the common man can never reduce. Occasionally we may encounter some superior technology that, for example, may wipe out a particular disease and reduce misery in a limited context, but in the long term it will lead to situations where things would be even worse. Occasionally, there can emerge a charismatic leader, a messiah, who will lead his followers towards something positive and triumphant but then again, with the passage of time, the world will regress  to what would be worse than what it was before. The end of colonial expansionism led into the rise of fascism and the World Wars. The end of fascism led to the rise of communism. The end of communism, and the purported End of History, led to the rise of Islamic terror. What is the next abomination?

The Second Law could be a powerful discouragement for trying to do anything constructive. Since you really cannot change the world for the better, why bother? But nevertheless,  while global entropy, or disorder, will always increase, it is possible to reduce it in a small, local environment -- Singapore is an example and Elon Musk’s proposed colony on Mars could be another.  Nearer home, acting locally to impose order and governance could be the only viable, but selfish, strategy. So while America, Europe, Iraq, Kashmir, GST and secularism are undoubtedly important, what really matters is whether the garbage is being cleared and the neighbourhood lumpens are behind bars. Panchayat is more important than Parliament!

For the rest of the world? And in the long term? We may have to live with the dismal consequences of the Second Law -- all the king’s horses and all the king’s men [cannot] put Humpty Dumpty together again.

This article originally appeared in Swarajya, the magazine that reads India right.
