June 30, 2017

Quantum Computers

Quantum mechanics is a subject that has the strange property of being simultaneously logically rigorous and yet completely counterintuitive. So much so that even a towering intellect like Einstein could never bring himself to accept its principles, even though products based on them exist all around us. The earliest oddity, identified by Schrodinger, one of the founders of quantum mechanics, is about a hypothetical cat that is neither dead nor alive until someone actually observes it. A similar oddity is that of quantum entanglement, where the behaviour of one particle is instantly affected by the behaviour of another particle, however distant it may be -- an example of “spooky” action-at-a-distance. Explaining these phenomena is beyond the scope and temerity of this article, so the reader would have to accept them here in good, almost religious, faith and carry on with the belief that such phenomena have been observed and explained by scientists under the most rigorous experimental circumstances.

Image borrowed from Quanta Magazine
Any programmable digital computer that we use, the desktop, the smartphone or the ones at Google, is based on a finite state machine (FSM). It can, at any instant of time, be in one of a large, but finite, number of well defined states. The state of an FSM is defined by the value stored in each of its memory locations, and we know that these can either be 0 or 1. So an FSM with, say, 16 bits of memory could in principle be in any one of 2^16 states. Any instruction to the FSM changes the value of one or more bits and the FSM moves to a different state. An FSM, along with the ability to read binary input from an infinite tape and write back on the same tape, is the Turing machine that is the theoretical basis of any modern computer.
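As a toy illustration (not from the original article), the 16-bit machine described above can be sketched in a few lines of Python: the state is just the values of the 16 bits, so there are 2^16 possible states, and every instruction that changes a bit moves the machine to another state.

```python
class FSM:
    """A toy finite state machine whose state is its memory bits."""

    def __init__(self, bits=16):
        self.bits = bits
        self.state = 0  # all bits start at 0

    def num_states(self):
        # each bit is 0 or 1, so there are 2**bits possible states
        return 2 ** self.bits

    def flip(self, i):
        """An 'instruction': toggle bit i, moving the FSM to a new state."""
        self.state ^= 1 << i

m = FSM()
print(m.num_states())  # 65536, i.e. 2^16 possible states
m.flip(0)
m.flip(3)
print(m.state)         # 9, i.e. bits 0 and 3 are now set
```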

The fundamental principle of computer science is that the world is computable, meaning that any logically decidable problem can be represented and solved on a Turing machine and hence, by extension, on some, possibly very powerful, digital computer. This is the basis of our immense belief in computer technology that powers everything from smartphones to artificial intelligence. But even as far back as 1982, Richard Feynman had questioned this principle because he realised that Turing / FSM based computers could not solve the problem of simulating the movement of multiple particles whereas nature was doing it all the time! Did the quantum mechanical behaviour of nature mean that nature had a computing device that was inherently superior to the Turing machines built by classical computer technology? This is where the concept of a quantum computer was born.

A computer is a state machine whose state is defined by the collective states of each of its memory locations. In a classical computer, each memory location, or bit, can either be 0 or 1, certainly not both, but in a quantum computer it can be both 0 and 1 simultaneously -- very much like Schrodinger’s cat that was dead and alive at the same time! This is where the going gets really rough for anyone who has spent a lifetime in classical computer science, because it is completely counter-intuitive. A memory location, a bit, is a transistor, or switch, made of silicon that is either ON or OFF. How can it be both? It turns out that if you keep aside computer science and open your books on quantum mechanics, it is indeed possible for a body to be in two states at the same time, based on the well established principle of quantum superposition. Now if we go back to our 16 bit classical computer with its 2^16 states and replace it with a quantum computer with 16 quantum bits, or qubits, of memory, we have a machine that can be in 2^16 states simultaneously. If that is not mind-bending enough, all these 2^16 states will collapse into any one of them as soon as we try to observe it. It is almost as if nature is playing a game with us, pretending to be classical whereas it is actually quantum.
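To make the idea concrete, here is a hypothetical toy simulation of a single qubit in Python. The amplitude representation and the function names are my own illustrative choices, and this is emphatically not how real quantum hardware works -- it only mimics the arithmetic of superposition and collapse.

```python
import math
import random

# A qubit is modelled as a pair of amplitudes (a, b) with a^2 + b^2 = 1.
# Measurement yields 0 with probability a^2 and 1 with probability b^2,
# after which the superposition "collapses" to the observed value.

def hadamard(q):
    """Put a basis-state qubit into an equal superposition of 0 and 1."""
    a, b = q
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))

def measure(q):
    """Observe the qubit: the superposition collapses to 0 or 1."""
    a, _ = q
    return 0 if random.random() < a ** 2 else 1

qubit = (1.0, 0.0)            # definitely 0, like a classical bit
qubit = hadamard(qubit)       # now 0 AND 1 simultaneously
counts = [measure(qubit) for _ in range(10000)]
print(sum(counts) / len(counts))  # roughly 0.5: half the observations collapse to 1
```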

But why are we obsessed with this counter-intuitive phenomenon? Will it have a drastic improvement on existing digital computer technology? Not really. Your spreadsheet, email, YouTube, eCommerce and smartphone will hardly change, but two things could. First, current cybersecurity systems, which are based on our inability to decompose integers into their prime factors in a reasonable amount of time, could be ripped apart by quantum computers, leaving all passwords vulnerable to hackers. Second, artificial intelligence could be taken to altogether unbelievable levels of sophistication. So quantum computers will soon have a very important role to play -- but how far away are we from real, practical systems?

The biggest challenge is the construction of the physical memory locations, and the complexity of the engineering problem is evident from the following: a modern IBM classical computer chip has anything between 2 and 7 billion transistors, each of which can be ON or OFF. The corresponding IBM quantum computer chip, that powers the IBM Quantum Experience machine, has only 5, yes just 5, qubits of memory that can be in a quantum superposition of ON and OFF. Why so? First, the memory locations have to be cooled to near zero Kelvin to exhibit their quantum superposition behaviour, and if the cryogenic challenge was not enough, the second challenge is even bigger. Unlike the memory locations of classical computers, whose state can be determined by sensing the presence or absence of an electrical voltage, the multiple superimposed quantum states collapse as soon as any effort is made to observe them. It is as if a room has a house of cards that collapses as soon as the door is opened by the observer, who then has to figure out what the house looked like by observing the disposition of the cards on the floor! Since the qubits can never be accessed directly, as in a classical computer with read and write statements, they can only be “influenced” indirectly.

To put things in perspective, ENIAC, one of the world’s first, 1st generation, vacuum tube based classical computers, had 20 memory units, or accumulators, in 1945, and a 2nd generation, transistor-based computer from the University of Manchester had only 200 transistors in 1955. Since then we have moved through 3rd generation integrated chips, and the current 4th generation of microprocessors has scaled up to billions of transistors thanks to the inexorable pressure of Moore’s Law. If we remember that even with its 20 memory units, ENIAC was used to solve problems in weather forecasting, atomic energy calculations and wind tunnel design, the current 5 qubit IBM machine does not look as hopeless, or helpless, as it seems to be.

But actually things are a little better off. D-Wave, a Canadian company that has been building quantum computers since 1999, came out with a 128 qubit machine in 2010, a 512 qubit machine in 2012 and a 1000 qubit machine in 2015. Initially there were some doubts about whether these were quantum machines at all, but after these machines were actually installed and used, first by Lockheed Martin at the University of Southern California and later at the Quantum AI Lab of NASA Ames Research Centre by a team from Google, these doubts have receded to a large extent. But even though some doubts persist, there is enough evidence of quantum behaviour, or at least great promise that these doubts will be removed soon. In early 2017, D-Wave announced the sale of their first commercially available $15 million 2000-qubit machine to cyber-security firm Temporal Defence Systems.

IBM’s 5-qubit Quantum Experience is positioned as a general purpose computer. It could be used for any computational task but would be efficient only if the program was designed to use quantum properties -- a colour TV is useful only if the broadcast is in colour. Very few programs can do this today, but Shor’s algorithm, which can be used to crack passwords, is definitely one such. D-Wave systems, on the other hand, are designed to solve one class of problems: minimising the weighted sum of a large number of interrelated, or entangled, variables. This may sound restrictive, but the reason why everyone from Google to Temporal is interested is that this class of problems is similar to the ones that occur in the artificial neural networks that lie at the heart of systems based on machine learning.

Spectacular progress in machine learning with artificial neural networks, even on classical computers, is rapidly closing the gap between biological and nonbiological intelligence, or even between carbon and silicon “life-forms”. With the advent of quantum computers one more crucial barrier between the natural world and its man-made, artificial model could break down -- as could the increasingly thin line that delineates man from machine. Will this drag man down to the level of machines? Or will these machines push man up towards his eventual union, or Yoga, with the transcendent omniscience that some refer to as God or Brahman?

This article originally appeared in Swarajya -- The magazine that reads India right!

June 03, 2017

Order, Stability or Chaos?

Global, national and local societies face many threats. We are threatened by enemies -- internal and external -- who want to destroy our way of life. We are plagued with environmental degradation as we quickly try to ramp up the economy and improve our living standards. Finally, our own social systems are in tatters because efforts to mitigate the effects of the first two threats are stymied by venal corruption and a cynical disregard for the rule of law. In fact the last reason is perhaps the most overarching one, because it leads to the other two.

image from 5rhythms
We have solutions to most of our problems. Technology solutions are available to grow more food, generate more energy, combat disease and check crime. There are public structures like hospitals, schools, municipal, state and central governments, and the legislature, each having its own set of rules and procedures, to guide and govern matters. There are commercial structures, like corporates, cooperatives and professional networks, that transform natural and human resources into disposable surplus that can be used for material pleasure. Then there are clubs, non-profits and political parties that lubricate the gears and facilitate the work of the public and private structures. Finally, we have a whole set of checks and balances, like the police, the courts of law, and institutions that recursively keep checks on the checks and balances, like the Vigilance Department, the CBI and the Lokpal, to ensure that everyone does what they should. So in principle, if everything were to work like clockwork, there should not be any unresolved problems on the planet.

But obviously this is absurd. Unlike the precise determinism of classical mechanics, the social mechanism that governs society is based on the non-deterministic behaviour of human beings. No two persons are alike and so no two will respond to a situation in an identical manner. One may be afraid to break the law even if there is a benefit, but another may be willing to do so. So there is an element of randomness that permeates society, and it is this randomness that is a key determinant of social outcomes.

Randomness leads the environment from order to disorder. Physics equates disorder with entropy, and the Second Law of Thermodynamics states that the entropy of a closed system can only increase over time. In fact, the direction of the “arrow of time” is often determined by the difference in entropy between two states of the system. Information theory also associates entropy with randomness. Uncertain, random events are associated with high information content and hence high entropy. Certain events, like the daily sunrise, that have a probability of 1, are associated with zero entropy, as are impossible events, like a horse giving birth to a dog, that have a probability of 0. But entropy is high when there is uncertainty and unpredictability, as in the outcome of a toss of a fair coin, the results of an election or a war.
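The coin-toss arithmetic above follows from Shannon's formula, H = -Σ p·log2(p). The short Python sketch below is just an illustrative calculation of the examples in the paragraph:

```python
import math

def entropy(probs):
    """Shannon entropy in bits; events with probability 0 contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy([1.0]))        # 0.0   -> the daily sunrise: no surprise at all
print(entropy([0.5, 0.5]))   # 1.0   -> a fair coin toss: maximum uncertainty
print(entropy([0.9, 0.1]))   # ~0.47 -> a biased coin: somewhere in between
```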

Increase in entropy, in randomness, in unpredictability, leads to chaos that can be analysed in terms of Chaos Theory. Chaos is the inevitable outcome of any adaptive, dynamic and complex system, which is exactly what human society is. Chaos is unpredictability in the face of apparent determinism -- and as Edward Lorenz puts it so elegantly, chaos is when the present determines the future, but the approximate present does not approximately determine the future. What this means is that a slight change in initial conditions -- a crow flapping its wings in Calcutta -- can cause a major upheaval far away -- a tornado in Texas. Mapped to human society, it means that social uncertainty caused by the erratic, unpredictable behaviour of even a small group of people can cause ripples and upheavals across the world.
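Lorenz's point can be illustrated with the logistic map, a textbook chaotic system (my choice of example, not one from the article): two trajectories whose starting points differ by one part in a billion stay together for a while, then diverge until the outcomes bear no resemblance to each other.

```python
def logistic(x, r=4.0, steps=50):
    """Iterate the logistic map x -> r*x*(1-x), a standard chaotic system."""
    for _ in range(steps):
        x = r * x * (1 - x)
    return x

a = logistic(0.200000000)
b = logistic(0.200000001)  # the "crow flapping its wings": a one-in-a-billion nudge
print(a, b)                # after 50 steps the two trajectories have fully diverged
```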

Chaos theory allows for strange attractors, or periodic repetitions of somewhat predictable outcomes, which is why human society settles into equilibria that give us a sense of stability. But given its colossal complexity, even one incident, like 9/11, can tip it into a new, possibly more uncomfortable and anarchic equilibrium. Complexity is in fact impossible to manage in large organisations, which is why we have the eventual collapse of centrally governed empires -- the Kaurava, the Pharaonic, the Roman, the Mauryan, the Holy Roman, the Ottoman, the Mughal, the British, the Soviet and finally the European Union. We can only hope that India will not join this list. Well governed human societies are based on the rule of law and order, and it is this order that is under threat from the Second Law of Thermodynamics and Chaos Theory. While we all crave order, the reason why we rarely attain it is that the laws of the universe inexorably push us towards disorder and anarchy.

But will entropy always increase? Not really. In a small closed system -- as in a school, a company, a factory, a state like Singapore, or perhaps a human colony on Mars -- it is possible to reduce the local entropy within the system and impose perfect order, but this needs one of two prerequisites. Either we need an external agency imposing order from outside -- a non-popular dictatorship -- or there has to exist a mechanism of self-organisation that resolves contradictions and guides the system towards greater order. A small school or factory is an example of the first, while well governed US cities that are cleaner and more habitable than anarchic municipalities in India are an example of the second.

But even in a small society, that is somehow isolated from the random anarchy of the global environment, the ability to self regulate is not guaranteed. Self regulation is actually an outcome of enlightened self-interest that seeks to create the proverbial win-win situation that benefits all at the cost of none. But this is not easy. To understand why, consider the Prisoner’s Dilemma, a special case of a mathematical oddity called Nash Equilibrium that is a part of Game Theory.

Consider two persons who have been arrested for a murder, but the police do not have any clinching evidence with which they can ensure a conviction. So both prisoners are offered a plea bargain. If any one turns approver and betrays the other, then the betrayer will be let off but the other will serve twenty years in jail. If both turn approver, then both serve ten years in jail. But if both cooperate and neither betrays the other, then the police will imprison them for a year on a lesser crime. Unfortunately, neither prisoner has any knowledge of what the other will do, nor do they trust each other. Ideally neither should betray the other, because this will ensure light punishment for both, which is the best solution. But in reality, given the uncertainty, neither will trust the other, both will betray each other and so ensure ten years of hardship for both. A classic lose-lose scenario.
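The plea-bargain arithmetic above can be tabulated in a short Python sketch (sentences as described in the paragraph; lower is better). Whatever the other prisoner does, betraying always shaves years off one's own sentence, so both betray, and both land on the ten-year outcome instead of the one-year one -- the Nash equilibrium.

```python
COOPERATE, BETRAY = "cooperate", "betray"

years = {  # (my move, other's move) -> my sentence in years
    (COOPERATE, COOPERATE): 1,
    (COOPERATE, BETRAY): 20,
    (BETRAY, COOPERATE): 0,
    (BETRAY, BETRAY): 10,
}

def best_response(others_move):
    """The move that minimises my own sentence, given the other's move."""
    return min((COOPERATE, BETRAY), key=lambda my: years[(my, others_move)])

# Betraying is the best response to EITHER move by the other prisoner...
print(best_response(COOPERATE))   # betray
print(best_response(BETRAY))      # betray
# ...so both betray, and both serve ten years instead of one
print(years[(BETRAY, BETRAY)])    # 10
```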

This scenario is reflected in many real life situations, like women wearing makeup to look more elegant, athletes using steroids to enhance performance, over-exploitation of resources like fishes or minerals, countries spending money on arms and ammunition, countries refusing restrictions on environmental pollutants that hamper economic growth, advertisers spending money to push competing products or bidders at an auction being afflicted with the winner’s curse. In India, aggressive drivers break traffic rules to squeeze past others and in the process create a massive traffic jam, whereas everyone could reach home earlier by waiting and obeying traffic rules.

If only people would cooperate with each other, the world would be a better place, but the inexorable laws of Game Theory say that this will never happen. If all political parties were to cooperate on matters of national interest, like implementing labour reforms or fighting Islamic terror, many of the social and economic problems that bedevil India could be quickly eliminated, but as in the case of the Prisoner’s Dilemma, each political party thinks that cooperating with the other means sealing its own electoral fate and facilitating a landslide victory for the other.

Human society is in a bind. The Second Law and Chaos Theory push us towards anarchy while Game Theory prevents us from self-organising. So we are forced to reconcile ourselves to a chaotic future. Given the inevitability of chaos in complex systems, our only hope for stability and order would be to have smaller, simpler systems that are easier to manage. Small states, municipalities, panchayats and even gated communities, where the number of players, or variables, is small and where complexity is manageable, have a far better chance of avoiding anarchy. Going forward, as complex social and security challenges -- both international and now more often intra-national -- overwhelm the world, a loosely-coupled federation of small, self-sustainable, technology enabled, well-managed, elitist communities or “smart-cities”, spread across the Earth and nearby planets, may be the only way towards a reasonably stable future.

The Prisoner’s Dilemma and the inability of people to collaborate for the common good may be a persistent roadblock on the path to global peace with prosperity.

This article first appeared in Swarajya - the magazine that reads India right

May 05, 2017

Biohackers & Biohackspaces

The ubiquity of computers and smartphones and the pervasive presence of digital technology means that everybody who is reading this article is familiar with hackers. Hackers, as we all believe, are evil people who either create viruses that ruin our machines or access our computers to steal confidential information with the intention to cause harm. We also have ethical, or white-hat, hackers, the guards and policemen, who with the same level of skill try to beat the evil black-hat hackers at their game and keep digital assets secure. But the original meaning of hacker was someone who is so intensely immersed in computer technology that he knows much more than what a normal, non-hacker user would ever know about what can be done with computers. The hacker was the uber-geek, in whose hands a computer could be stretched to perform tasks that it was never meant for and deliver unexpected results. The hacker was a genius, not necessarily the evil genius that he -- and it is generally a he -- is portrayed to be. He was someone who could, in a sense, disassemble and reassemble the hardware and software in ways that no one else can even think about, to create new functionality. This same kind of behaviour, when seen in the world of biosciences, is called biohacking.

Given the very wide range of possibilities within biosciences, biohacking means different things to different people, but there is one common thread. Just like his better known computer cousin, the biohacker generally works alone or in small groups and usually outside the regulated confines of a university or corporate laboratory. So his -- or her, biosciences is more gender diverse -- activities are usually unsupervised, unregulated and more often than not border on the unsafe, if not the almost illegal. But if we leave aside the legal and ethical issues, then biohacking falls into two broad and sometimes overlapping categories, namely grinding, body hacks or body modification on one hand and DIYbio or synthetic biology on the other.

Grinders are people who modify, or upgrade their biological bodies with non-biological components. A very simple body-hack is to have a bio-safe RFID or magnetic chip -- similar to what we have in our credit cards --  implanted under the skin of the wrist. This chip contains digital information that can be used to “magically” open doors secured with access control devices or unlock smartphone or computers without using passwords. Such implants are not much different from pacemakers but obviously serve a different purpose.

Human beings have five sensory organs, but this number can be increased or their capabilities enhanced. People have embedded rice-grain sized neodymium magnets, coated with bio-safe materials like titanium nitride or teflon, commonly used in orthopaedic equipment, inside their arms. In the presence of a magnetic field, say near an electric motor, these vibrate and alert the user to the existence of the field. Bottlenose, an off-the-shelf customisable product from Grindhouse Wetware, extends this basic capability to ultraviolet, WiFi, sonar or thermal signals, so that, for example, people can estimate distances in the dark using sound waves -- like dolphins or bats.

Pushing into even more dangerous territory, grinders have laced their eyes with a chlorophyll derivative found in the eyes of the deep sea dragonfish that lives in the mile-deep darkness of the ocean. This causes a dramatic improvement in night vision, allowing them to recognise people in near darkness. It is also possible to implant a magnet in the tragus -- the small protuberance in front of the ear, used to carry ear-piercing jewellery -- that allows one to listen for vibrations generated by a phone and works like an inexpensive ear-piece. An even more amazing example is that of colour-blind musician Neil Harbisson, who persuaded an anonymous surgeon to perform an illegal operation to implant a camera on his skull and connect it to a vibrating chip placed near his inner ear. Now the colour blind artist can distinguish colours -- say the red or green of a traffic signal -- by noting the frequency or pitch of the sound that he hears when his face, and hence the camera, turns towards coloured objects.

The most shocking body-hack, pun intended, is to improve the performance of the brain with a 2.5 mA, 15 V electric shock -- transcranial direct current stimulation (tDCS). Available off the shelf as ThinkingCap, this device alters the electrical potential of stimulated neurons, making them fire differently and leading to better, or at least perceptibly different, abilities of the brain. Anecdotal evidence suggests that the US Armed Forces are experimenting with this technology to keep soldiers calm under stress and improve their marksmanship under simulated battlefield situations!

Obviously none of these body-hacks are approved by any medical regulator and experimenters do so at their peril, but who knows, out of such intrepid experiments may one day emerge a new kind of human being who can breathe underwater or live happily in the methane atmosphere of Titan.

Less dramatic but perhaps more profound is the kind of work that is done in synthetic biology or Do-It-Yourself biology. Most of the projects in this area are focussed on altering the genetic sequence of existing lifeforms to create modified organisms -- for example, microbes that generate copious quantities of the insulin needed by diabetics. “Editing” the genetic code is not easy -- it requires big laboratories, lots of equipment and highly trained staff. But thanks to an amazing new technology called CRISPR/Cas9, developed in 2012, gene editing has now become faster and inexpensive. Key CRISPR tools, the plasmids -- a genetic structure, typically a small circular DNA strand, that is widely used in the laboratory manipulation of genes -- can be ordered online from companies and non-profit repositories like AddGene, much like books from Flipkart, at prices as low as US$60.

While CRISPR promises garage level DIYbio, in reality, there is some basic level of equipment that is necessary to use these tools. This is where gene clubs and collectives have started to appear. From garages and kitchens, we now have biohacker spaces that offer shared services for fairly sophisticated equipments that members can use either against a monthly fee or on a pay-per-use basis, very similar to the way large computers are available on a shared basis from cloud hosting services like Amazon or Google.  BioCurious, located in Sunnyvale, in the heart of the silicon valley in California is one such biohackspace that offers much of the same equipment found in professional labs. Similarly, the London Biohackspace is located within the London Hackspace that is wildly popular with computer hackers working with the latest in digital technology.

Just as software programmers work collaboratively on open-source software like Linux, volunteers are collaborating at BioCurious to create vegetarian cheese without using any animals by modifying the DNA of baker’s yeast. Similarly, a community driven project at the London Hackspace is trying to create plants that glow in the dark when exposed to mechanical movement or in the presence of toxic chemicals. Community projects at both these labs are also directed towards building new kinds of equipment, like a bioprinter, a 3D printer that can be used to actually “print-out” body parts like skin or even kidneys! Not all projects are community driven. Some individual hackers experiment with their own DNA, looking for genes that may cause diseases, or even to find out what percentage of their own genes comes from Neanderthals!

These biohackspaces are driven by a chaotic combination of ideas and motivation -- that is very reminiscent of the original computer hackers who laid the foundations of the digital revolution that we see today. In fact Wired Magazine has quoted Bill Gates who says that if he were a teenager today, he would be hacking biology -- “Creating artificial life with DNA synthesis. That’s sort of the equivalent of machine-language programming .. If you want to change the world in some big way, that’s where you should start — biological molecules.”

But while biohacking might change the world, there are risks involved and this risk gets magnified when we have unsupervised people playing with dangerous tools. Hence most of these biohackspaces have basic bio-safety protocols in place to prevent any kinds of dangerous experiments and are under informal surveillance of many security agencies including the FBI’s Biological Countermeasures Units. Every technology has the potential to both help and as well hurt society and the biosciences is no different in this regard from either computers or atomic energy.

India never quite had the hacker culture that created the computer revolution. Perhaps that is why Indian IT was born, not in the crucible of innovation but in the peat bogs of modifying COBOL programs to address Y2K bugs -- and continues with the unfortunate legacy of being a maintenance service industry. Given that a new and far more potent revolution in the biosciences is breaking out all around us, it is important that we quickly create an ecosystem of these biohackspaces, so that our biohackers can lead, not just follow, the herd into the future.

This article originally appeared in Swarajya, the magazine that reads India right!

April 23, 2017

Beautiful and unusual gift from PMI West Bengal

Yesterday, I had the good fortune to have been invited to speak at the PMI Regional conference where instead of the regular, and pointless, bouquet of flowers that is traditionally given to the keynote speaker, I was presented with the following certificate

What this means is that PMI has paid Sankalptaru.org some money to plant 10 trees on my behalf and "my tree" is visible at the URL indicated by the QR code.

Thank you, PMI, for this unusual gift.

April 22, 2017

DB2 to Lotus : Accessing Mainframe Data from PC in the pre-Windows age

April 16, 2017

Spark with Python in Jupyter Notebook on Amazon EMR Cluster

In the previous post, we saw how to run a Spark - Python program in a Jupyter Notebook on a standalone EC2 instance on Amazon AWS, but the really interesting part would be to run the same program on a genuine Spark cluster consisting of one master and multiple slave machines.

The process is explained pretty well in Tom Zeng's blog post and we follow the same strategy here.

1. Install AWS Command Line services by following these instructions.
2. Configure the AWS CLI with your AWS credentials using these instructions.

In particular, the following is necessary (the Access Key ID shown is the standard AWS documentation example):
$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-east-1
Default output format [None]: (press ENTER to accept the default)

You will have to use your own AWS Access Key ID and AWS Secret Access Key, of course!

3. Execute the following command :

aws emr create-cluster --release-label emr-5.2.0 \
  --name 'Praxis - emr-5.2.0 sparklyr + jupyter cli example' \
  --applications Name=Hadoop Name=Spark Name=Tez Name=Ganglia Name=Presto \
  --ec2-attributes KeyName=pmapril2017,InstanceProfile=EMR_EC2_DefaultRole \
  --service-role EMR_DefaultRole \
  --instance-groups \
    InstanceGroupType=MASTER,InstanceCount=1,InstanceType=c3.4xlarge \
    InstanceGroupType=CORE,InstanceCount=2,InstanceType=c3.4xlarge \
  --region us-east-1 \
  --log-uri s3://yj01/emr-logs/ \
  --bootstrap-actions \
    Name='Install Jupyter notebook',Path="s3://aws-bigdata-blog/artifacts/aws-blog-emr-jupyter/install-jupyter-emr5.sh",Args=[--r,--julia,--toree,--torch,--ruby,--ds-packages,--ml-packages,--python-packages,'ggplot nilearn',--port,8880,--password,praxis,--jupyterhub,--jupyterhub-port,8001,--cached-install,--copy-samples]

Note that the options have been modified a little:
a) the number of machines is 1+2
b) the S3 bucket used is yj01 in s3://yj01/emr-logs/
c) the password is set as "praxis"
d) the directive to store notebooks on S3 has been removed as it was causing problems. The notebooks will now be stored in the home directory of the user=hadoop on the master node

This command returns something similar to:
    "ClusterId": "j-2LW0S8SAX5OC4"

4. Log in to the AWS console and go to the EMR section.

The cluster will show up as starting

and will then move into Bootstrapping mode

and after about 22 minutes will move into Waiting mode. If that happens much earlier, there could have been an error in the bootstrap process. Otherwise you will see this

5. Login to Jupyter hub
Note the URL of the Master Public DNS : ec2-54-82-207-124.compute-1.amazonaws.com
and point your browser to : http://ec2-54-82-207-124.compute-1.amazonaws.com:8001

Login with user = hadoop and password = praxis (supplied in the command) and you will get the familiar Notebook interface

There will be a samples directory containing sample programs covering a wide range of technologies and data science applications -- extremely useful to cut-and-paste from!

Create a work directory and upload the Wordcount notebook and the Hobbit.txt file used in the original Spark+Python blog post

Notice the changes necessary for cluster operations

Cells 1 -3 reflect the fact that we are now using a cluster, not a local machine
Cells 4, 12 show that the program is NOT accessing the local file storage on the Master Node but the HDFS file system on the cluster
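For reference, here is a hedged sketch of what the cluster version of the wordcount cells might look like (the actual notebook is not reproduced in this post, so the exact cell contents are my reconstruction; the file names follow the hobbit.txt / hobbit-out examples used below):

```python
# Sketch of a Spark wordcount adapted for the EMR cluster (illustrative only).
from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("WordCount")  # no setMaster("local"): on EMR,
sc = SparkContext(conf=conf)                # YARN supplies the cluster master

# Paths without a scheme resolve to HDFS on the cluster, not local disk
text = sc.textFile("/user/hadoop/hobbit.txt")
counts = (text.flatMap(lambda line: line.split())
              .map(lambda word: (word, 1))
              .reduceByKey(lambda a, b: a + b))
counts.saveAsTextFile("/user/hadoop/hobbit-out")
```

This only runs on the cluster itself, since it needs pyspark and the HDFS paths set up by the bootstrap action.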

To explore the HDFS file system, go back to the cluster console,

press "View All" and then click on the HDFS link to open the HDFS browser.

From there you can browse to the hadoop user's home HDFS directory, where the "hobbit.txt" file was stored and where the "hobbit-out" directory has been created by the Spark program. In fact, all HDFS operations can be carried out from Notebook cells like this:

!hdfs dfs -put hobbit.txt /user/hadoop/
!hdfs dfs -get /user/hadoop/hobbit-out/part* .
!hdfs dfs -ls hobbit-out/
!hdfs dfs -rm hobbit-out/*
!hdfs dfs -rm -r hobbit-out
!hdfs dfs -rm hobbit.txt

You can also see the various Hadoop resources, including the two active nodes, through this interface.
Once JupyterHub is started, the notebooks can also be accessed by going directly to port 8880 and using the password "praxis".

Finally it is time to
6. Terminate the cluster!

Go to the cluster console, choose the active cluster and press the terminate button. If termination protection is in place, you would need to turn it off.

Notes :
1. The same task can be done through the EMR console, without having to use the AWS CLI, because most of the parameters used in this command can be passed through the console GUI. For example, look at this page.
2. Because of the S3 error, we are storing our programs and data on the master node, where they get deleted when the cluster is terminated. Ideally they should be placed in an S3 bucket using the --s3fs option.
3. The default security group created by the create-cluster command does not allow SSH into port 22. However, if this rule is added, then standard SSH commands can be used to access and transfer files into the master node.
4. Tom Zeng's post says that SSH tunnelling is required. However, I did not need to use a tunnel nor follow any of the complex FoxyProxy business. Not sure why, but simple access to ports 8001 and 8880 worked fine -- a mystery?

Spark with Python in Jupyter Notebook on a single Amazon EC2 instance

In an earlier post I explained how to run a Python+Spark program with Jupyter on a local machine, and in a subsequent post I will explain how the same can be done on an AWS EMR cluster of multiple machines.
In this post, I explain how this can be done on a single EC2 machine instance running Ubuntu on Amazon AWS.

The strategy described in this blog post is based on strategies described in posts written by Jose Marcial Portilla and Chris Albon. We assume that you have a basic familiarity with AWS services like EC2 machines, S3 data storage and the concept of keypairs, and that you have an account with Amazon AWS. You may use your Amazon eCommerce account, or you may create one on the AWS login page. This tutorial is based on Ubuntu and assumes a basic familiarity with the SSH command and other general Linux file operation commands.

1. Login to AWS

Go to the AWS console, log in with your user ID and password, then go to the page with EC2 services. Unless you have used AWS before, you should have 0 instances, 0 keypairs and 0 security groups.

2. Create (or Launch) an EC2 instance, using the default options except for:
a. Choose Ubuntu Server 16.04 LTS
b. Instance type: t2.small
c. Configure a security group. Unless you already have a security group, create a new one; call it pyspju00. Make sure that it has at least these three rules.
d. Review and Launch the instance. At this point you will be asked to use an existing keypair or create a new one. If you create a new one, you will have to download a .pem file to your local machine and use it for all subsequent operations.

Go back to the EC2 instance console and you should see your instance running :

Press the button marked Connect and you will get the instructions on how to connect to the instance using SSH.

3. Connect to your instance

Open a terminal on Ubuntu, move to the directory where the pem file is stored and connect with

ssh -i "xxxxxxx.pem" ubuntu@ec2-54-89-196-90.compute-1.amazonaws.com
(you will have a different URL for your instance)

From now on you will be issuing commands to the remote EC2 machine

4. Install Python / Anaconda software on remote machine

sudo apt-get update
sudo apt-get install default-jre

wget https://repo.continuum.io/archive/Anaconda3-4.3.1-Linux-x86_64.sh

get the exact URL of the Anaconda installer by visiting the download site and copying the download link

bash Anaconda3-4.3.1-Linux-x86_64.sh
Accept all the default options except this one; say yes here:
Do you wish the installer to prepend the Anaconda3 install location
to PATH in your /home/ubuntu/.bashrc ? [yes|no]
[no] >>> yes

logout of the remote machine and login back again with
ssh -i "xxxxxxx.pem" ubuntu@ec2-54-89-196-90.compute-1.amazonaws.com

5. Install Jupyter Notebook on remote machine

a. Create certificates in directory called certs

mkdir certs
cd certs
sudo openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout pmcert.pem -out pmcert.pem

this creates a certificate file pmcert.pem (not to be confused with the keypair .pem file downloaded to your local machine) and stores it on the remote machine

b. Jupyter configuration file

go back to home directory and execute
jupyter notebook --generate-config

now move to the .jupyter directory and edit the config file

vi jupyter_notebook_config.py
if you are not familiar with vi, either learn how to use it or use any other editor that you are familiar with

notice that everything is commented out; rather than un-commenting specific lines, just add the following lines at the top of the file
c = get_config()

# Notebook config this is where you saved your pem cert
c.NotebookApp.certfile = u'/home/ubuntu/certs/pmcert.pem' 
# Run on all IP addresses of your instance
c.NotebookApp.ip = '*'
# Don't open browser by default
c.NotebookApp.open_browser = False  
# Fix port to 8892
c.NotebookApp.port = 8892

c. Start Jupyter without browser and on port 8892

move to new working directory
mkdir myWork
cd myWork
jupyter notebook

you will get a message like:
    Copy/paste this URL into your browser when you connect for the first time, to login with a token:

but instead of going to localhost, we will go to the EC2 machine URL, on port 8892, in a separate browser window.
The browser will throw security warnings because the certificate is self-signed; ignore them and keep going until you reach the login screen.

in the password area, enter the value of the token that you got in the previous step and you will see your familiar notebook screen

6. Installation of Spark

Go back to the home directory and get the download URL of the latest version of Spark from this page.

wget http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0-bin-hadoop2.7.tgz
tar -xvf spark-2.1.0-bin-hadoop2.7.tgz 
mv spark-2.1.0-bin-hadoop2.7 spark210

edit the file .profile and add the following lines at the bottom
export SPARK_HOME=/home/ubuntu/spark210
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH
make sure that you have the correct version of the py4j-*-src.zip file by looking into the $SPARK_HOME/python/lib directory where it is stored

logout from the remote machine and then login back again

7. Running Spark 2.1 with Python

[The following step may not be necessary if your versions of Spark and Python are compatible. Please see the April 13 update on this blog for an explanation of this]

cd myWork
conda create -n py35 python=3.5 anaconda
logout / login (SSH) back
cd myWork
source activate py35

now run pyspark and note that it is working with Python 3.5.2, so we are all set to start Jupyter again

jupyter notebook
note the new token=12e55cacf8cdcad2f8c77f7959047034b698f4b8f67b679a that you get

The Jupyter Notebook should now be visible again at the same URL as before.

Now we upload the notebook containing the WordCount program and the hobbit.txt input file from the previous blog post, and execute it.
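For reference, the heart of that WordCount program is a split / count-per-word pipeline. The sketch below mimics the same logic in plain Python, without a SparkContext, so it can be checked anywhere; the actual notebook performs the equivalent flatMap, map and reduceByKey operations on a Spark RDD.

```python
from collections import Counter

def word_count(lines):
    """Mimic the Spark WordCount pipeline on a plain list of lines."""
    # flatMap: split every line into individual words
    words = [w for line in lines for w in line.split()]
    # map to (word, 1) and reduceByKey by summing: Counter does both at once
    return dict(Counter(words))

sample = ["in a hole in the ground there lived a hobbit"]
counts = word_count(sample)
print(counts["in"], counts["hobbit"])  # 2 1
```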

This completes the exercise, but before you go, do remember to shut down the notebook, log out of the remote machine and, most importantly, terminate the instance.

8. Terminate the instance

Go to the EC2 Instance console and terminate the instance. If you do not do this, you will continue to be billed!

April 10, 2017

Raja Shashanka and the Calendar in Bengal

The origin of the Bengali calendar

On Saturday, 15th April 2017 Common Era, Bengalis in India, especially in West Bengal, Assam and Tripura, will celebrate the 1st Baisakh, or “Poila Baisakh”, 1424 Bengal Era (BE) -- the start of the Bengali New Year. Most of us are aware that the globally used Common Era, or Christian Era, starts with the birth of Jesus Christ in 1 CE, but what exactly is commemorated by the start of the Bengal Era? What happened in 1 BE?

There are two points of view.

Raja Shashanka is the first universally accepted ruler of a major part of the land mass that is associated with Bengal -- West Bengal and East Bengal / Bangladesh -- today. His capital was at Gaud (current Murshidabad) and he was a contemporary of Raja Harshavardhana of Kannauj (near Lucknow) in the West and of Raja Bhaskar Varman of Kamarupa (Assam) in the East. These three were the principal rulers of North India. While exact dates are not available, it is strongly believed that Raja Shashanka ruled in Bengal between 590 CE and 625 CE. If we assume that Raja Shashanka ascended the throne in 594 CE and that Bengal celebrates this event as the start of the Bengali Era, then in 2017 CE the Bengali Era year should be 1 + (2017 - 594) = 1424 BE, which is exactly what it is on 2017 “Poila Baisakh”. Hence the Bengal Era begins with the ascendance of Raja Shashanka to the throne of Gauda-Bengal.

Long after Raja Shashanka and the Hindu rulers of Bengal were dead and gone, Bengal came under Islamic rule when Bakhtiar Khilji evicted Lakshman Sen in 1206 CE. Subsequently Bengal became a province under the Mughal (Mongol) empire, which followed the Islamic Hijri calendar; because this calendar was based on lunar months, it caused major administrative problems.

Agricultural revenue is tied to the harvest and it is most easily collected at the end of the harvest season when the farmer has money in his purse. Seasons in turn are tied to the position of the sun as defined by solar months that commence with the entry of the sun into the signs of the zodiac. So there is a one-to-one fixed connection between a solar month, the position of the sun and the seasons. For example, the spring or vernal equinox happens on 21 March of the Gregorian Calendar or on 1st Chaitra of the Saka Calendar, the official Government of India calendar, because both of these are solar calendars.

The Islamic Hijri calendar is based on lunar months where the start of each year varies widely across seasons -- in some years, the year starts in summer, in other years during the monsoon or in winter. So tax collection based on the Islamic year was a nightmare because the tax collector might arrive when the seeds had just been sown and the farmer would not have the money to pay his taxes. This would lead to endless arguments.

Akbar realised that the Hindu calendars, which were based on solar months, were more useful for tax-collection purposes, because the year started on a fixed seasonal date. So he adopted the solar calendar, according to which the year of his coronation, 1556 CE, was 1 + (1556 - 594) = 963 BE in the Bengali Era. Coincidentally -- and this was a huge coincidence -- 1556 CE was also 963 in the Islamic calendar. So, in order not to lose face by having to replace the unstable Islamic lunar calendar with the stable Hindu solar calendar, he adopted the Bengali solar calendar in 1556 CE, the year of his coronation, but instead of defining it as Bengal Era year 1 BE, he declared it Bengal year 963 BE, so as to maintain the illusion that he was continuing with the Islamic calendar. But going forward, the administrative year was aligned to the traditional Bengali solar year so that the seasons would begin on fixed dates.

So, the first, simple, explanation for the Bengali Era is that it starts with the ascendancy of Raja Shashanka to the throne of Gaud in 594 CE with the first year being defined as 1 BE. The alternate explanation is that it starts with the coronation of Akbar in Delhi in 1556 CE but with the first year being numbered 963 BE to maintain an artificial equivalence with 963 Islamic Era year that was prevailing at that time.

Now it is up to the reader to decide whether he or she wants to start Bengali Era with the coronation of Raja Shashanka at Gaud, in 594 CE or Akbar at Delhi in 1556 CE.

Actually Akbar would have got away with this sleight of hand of passing off the Bengali Era as being an extension of the Islamic Era but for the start date. Akbar had his coronation on 14th Feb 1556 CE and if the Bengali Era was based on this event then the first day would have been 14th Feb. But all Bengalis celebrate the new year, 1st Baisakh, on 14th April, when the Sun enters the constellation of Aries or Mesha. This clearly shows that the Bengali Era is actually rooted in the Hindu tradition of Solar years dating back to Raja Shashanka and antiquity.

But how do we know which point in the sky is the start of Aries? Where does the zodiac start?

The two zodiacs : Tropical and Sidereal

In Bengal, we have 1st Baisakh usually coinciding with 15th April, when the Sun enters the constellation of Mesha (Aries). The Government of India approved Indian National Calendar, which is based on the Saka Era, defines the start of the year as 21st March, which is 1st Chaitra, when the Sun enters the constellation of Meena (Pisces). Now this leads to a strange inconsistency. If the Sun enters Mesha (Aries) on 15th April as per the Bengali calendar, then it must enter Meena (Pisces) on 15/16th March; but as per the Saka calendar, it enters Meena (Pisces) on 21st March. Why this gap?

To explain this anomaly, we need to know that there are TWO zodiacs, the tropical (sayana) zodiac and the sidereal (nirayana) zodiac, and the implications of this are explored in the rest of this post. [Warning: the rest of the post has a little mathematics, which you may like to read only if you are not scared of the devil in the detail.]

Consider a spherical coordinate system that is embedded in the Earth and rotates along with it every day. In this coordinate system, every heavenly body is defined by three numbers: the azimuthal angle, which gives the position along the equatorial circle or on a longitude; the declination angle, which gives the position above or below the equatorial plane; and the distance from the centre of the Earth. In our assumption, all heavenly bodies are at the same uniform distance, fixed on “the sphere of the heavens”, and so the distance from the centre is immaterial. The only real variables are the azimuthal and declination angles, and together they specify the position of every heavenly body.

There are two classes of heavenly bodies: the “fixed” stars and the “wanderers” or “planets”. The “fixed” stars do not change their position in our spherical coordinate system, but the “planets”, which here also include the Sun and the Moon, move around among the “fixed” stars as their azimuth and declination angles change with the passage of time.

For the purpose of the solar calendar, we will only consider the movement of the Sun as it travels around the Earth. Do note that there is nothing mathematically wrong in considering the Sun to be travelling round the Earth, as frames of reference can be changed without affecting the description of physical reality. As the Sun moves round the Earth, its azimuth angle, or longitude, changes from 0 through 359 and back to 0 in one year, and in the same time its declination angle changes from -23 to +23 degrees as the seasons change from winter through spring, summer, autumn and back to winter. The declination is 0 at the two equinoxes, when day and night are of equal length. So the Sun moves in a band around the Earth, and this band is divided into twelve sectors of 30 degrees each. Each of these twelve sectors is occupied by, or related to, one of the 12 constellations consisting of “fixed” stars arranged in certain imaginary patterns -- Aries, Taurus, Gemini and so on.
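This division of the band into signs is easy to express in code. The helper below is purely illustrative (the function is mine, not from any astronomy library); it uses the conventional order of the signs, starting from Aries/Mesha at longitude 0:

```python
# The 12 signs in order, one per 30-degree sector, starting from longitude 0
SIGNS = ["Mesha (Aries)", "Vrishabha (Taurus)", "Mithuna (Gemini)",
         "Karkata (Cancer)", "Simha (Leo)", "Kanya (Virgo)",
         "Tula (Libra)", "Vrischika (Scorpio)", "Dhanu (Sagittarius)",
         "Makara (Capricorn)", "Kumbha (Aquarius)", "Meena (Pisces)"]

def sign_of(longitude):
    # Each sign occupies one 30-degree sector of the 360-degree band
    return SIGNS[int(longitude % 360) // 30]

print(sign_of(15))   # Mesha (Aries)
print(sign_of(345))  # Meena (Pisces)
```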

A circle has neither a beginning nor an end, and so while the Sun takes a year to complete this circle, there is no unambiguous way to define where exactly the circle -- and hence, by extension, the year -- starts. However, this starting point can be defined in two ways, leading to the existence of two zodiacs: the tropical and the sidereal.

In the tropical zodiac, the point on the circle when the Sun is at the vernal equinox, and its declination is 0, is defined as the starting point of the year and the azimuth angle is defined as 0. This means that the tropical year starts at the vernal equinox and this is traditionally associated with the entry of the Sun into the sign of tropical Aries -- that is Aries as shown in the tropical zodiac.

In the sidereal zodiac, the point on the circle that is diametrically opposite the “fixed” star Spica -- also known as Chitra in India -- is considered to be the starting point of the year, where the azimuthal angle is defined as 0. This means that the sidereal year starts when the Sun is opposite Spica, and this is traditionally associated with the entry of the Sun into the sign of the sidereal Aries -- that is, Aries as shown in the sidereal zodiac (we will refer to this as sidereal Mesha, to avoid confusion with tropical Aries, even though Aries and Mesha refer to the same physical constellation).

So now we have two circles, or two zodiacs, with two starting points, and these two starting points are separated from each other by approximately 23 degrees! This gap is known as the ayanamsa and it keeps increasing with each passing year.

The sidereal year, the time taken by the Sun to start from the “fixed” star Spica and return to it, is 365.25636 days, or rotations of the Earth. The tropical year, the time taken by the Sun to move from its position at one vernal equinox to its position at the subsequent vernal equinox, is 365.242189 days. The difference arises because the axis of the Earth is not invariant and “wobbles” slowly, which means that the tropical year is shorter than the sidereal year by about 20 minutes 24 seconds.

At some point in the past, in 285 AD, the position of the Sun at the vernal equinox was directly opposite the “fixed” star Spica. This means that the entry point of tropical Aries coincided with the entry point of sidereal Mesha; in that year, the tropical and sidereal zodiacs were identical. But since the tropical year was shorter than the sidereal year, the next tropical year started 20 min 24 sec earlier than the next sidereal year. With each passing year, the tropical year commenced an additional 20 min 24 sec earlier, until the cumulative gap between the respective starts of the tropical and sidereal years stands at almost 24 days in 2017.
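The arithmetic behind these figures can be checked directly:

```latex
\begin{align*}
\text{drift per year} &= 365.25636 - 365.242189 \approx 0.014171 \text{ days} \approx 20 \text{ min } 24 \text{ s} \\
\text{accumulated drift} &= (2017 - 285) \times 0.014171 \approx 24.5 \text{ days} \\
\text{as an angle} &\approx 24.5 \times \tfrac{360^\circ}{365.25} \approx 24^\circ
\end{align*}
```

which matches both the roughly 24-day gap between 21st March and 15th April and the quoted ayanamsa of about 23-24 degrees.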

But since all official solar calendars, including the Gregorian calendar used in the West and the Saka calendar officially used by the Government of India, are tied to the tropical zodiac, the vernal equinox is fixed compulsorily on 21st March / 1st (tropical) Chaitra, when the Sun enters tropical Aries. But the Bengali calendar, which starts on 1st (sidereal) Baisakh, when the Sun enters sidereal Mesha, begins on 15th April of the Gregorian calendar, 24 days later. The existence of two zodiacs, the tropical and the sidereal, is the reason for the gap of 5 days that was the starting point of this discussion.

In 285 AD, when the tropical and sidereal zodiacs were coincident, the vernal equinox, the entry of the Sun into tropical Aries and its entry into sidereal Mesha -- all three events -- would have happened on 21st March, which would also have coincided with 1st (sidereal) Baisakh.

If we keep the date of the vernal equinox compulsorily fixed at 21st March, then with the passage of time, the start of the sidereal year will occur at a later date every year. Conversely, if the start of the sidereal year is considered to be fixed by the arrival of the Sun opposite Spica, and its entry into sidereal Mesha, then the vernal equinox will be 20 mins 24 secs “earlier” each year, when the Sun has not yet reached sidereal Mesha, but is still in sidereal Meena. From this sidereal perspective, the vernal equinox that signals the start of the tropical year with the entry of the Sun into tropical Aries, has now pushed “back” from sidereal Mesha and into sidereal Meena ( or Pisces). Hence, as per western astrological practices, this is the Age of Pisces and after some more time we will move even further backward into the Age of Aquarius.

In Hindu astrology, the analysis of the horoscope is based on the positions of the planets in the sidereal zodiac. However, all the astronomical calculations used to generate the ephemeris -- the azimuthal, or longitudinal, positions of the planets -- are based on the tropical zodiac. Since the sidereal zodiac is at the moment about 23 degrees ahead of the tropical zodiac, all planetary longitudes need to be reduced by this amount -- known as the ayanamsa -- before being shown on the horoscope. Western astrologers, on the other hand, work with the tropical zodiac and do not need this correction.
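As a sketch of that correction (the function name is mine; the default ayanamsa of 23 degrees is the approximate present-day value quoted above, not a precise ephemeris figure):

```python
def tropical_to_sidereal(tropical_longitude, ayanamsa=23.0):
    """Convert a tropical (sayana) longitude in degrees to a sidereal
    (nirayana) longitude by subtracting the ayanamsa, wrapping at 360."""
    return (tropical_longitude - ayanamsa) % 360.0

# The Sun at the vernal equinox has tropical longitude 0 (start of tropical
# Aries); sidereally it sits at 337 degrees, inside Meena (330-360 degrees)
print(tropical_to_sidereal(0.0))  # 337.0
```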

Finally, the identification of 285 AD as the year when the vernal equinox coincided with Spica, and the tropical and sidereal zodiacs were identical, has been challenged. While this date and the ayanamsa of 23 degrees were defined by N C Lahiri, other astrologers claim that, according to the Surya Siddhanta, the definitive classical text on astronomy, the year of coincidence should be 499 AD and the ayanamsa should be reduced accordingly. This is a big debate with no clear resolution in sight.

April 06, 2017

The Entangled Future of Man & Machine

First we had computers, then we had the World Wide Web, and now we are talking about IoT, the Internet of Things. Computers are now pervasive and have occupied every nook and corner of life. Artificial intelligence seems to be growing bigger, faster and smarter every day. Science fiction writers have often spoken of machines taking over the world, but is that pure fantasy, or is it only a matter of time before this fiction becomes synonymous with fact? In this article, we will explore the genesis of artificial intelligence, examine how fast these systems are mutating and morphing into more advanced forms, and finally speculate on what this means for employment and other key indices of human society. But first, let us see ..

How machines learn

When I was a member of the faculty at IIT Kharagpur, one of my senior colleagues told me that we should not teach students but instead help them learn. Can this idea be extended to machines?

Any spreadsheet user knows how to have a computer add two numbers. Why just add? You could do many other tasks, like finding a percentage or a net present value, as well. Obviously someone has written a computer program that does all this! Those who know computer programming would also know that writing the code to add two numbers is quite easy. Depending on the level at which you would like to describe the process, the program can be written in a high-level language like Python or Java with just one line of code, but if one uses a low-level language like Assembler, or binary code, one needs a large number of rather arcane instructions written as a series of 1s and 0s [http://bit.ly/add2numbers]. Irrespective of the programming language used, a program that performs any task on a computer, from adding numbers to showing a YouTube movie, is a series of explicit instructions given by a human programmer. Irrespective of the complexity of the task, it is always a human who teaches a computer how to perform it. This is one of the fundamental tenets of computer programming … or was, until the emergence of machine learning. Instead of teaching a computer, can we make it learn on its own?

There are many facts that a child is taught in school, like how to add two numbers or when Raja Harshavardhana ruled in Kannauj. But there is much more that he learns on his own, like recognising his mother or realising that toffees are good to eat but stones are not. How people learn, for example, to recognise a friend in a crowd or to choose the best move in a game of chess, is something that has baffled computer scientists for a long time. That is why, even though we have had computer programs doing incredibly difficult things, like landing a spacecraft on Mars, they have had immense difficulty performing apparently simple tasks like crossing a busy city road. This has now triggered a new line of thought: instead of teaching computers how to perform, we need to equip computers with the ability to learn how to perform!

While a lot of human learning involves memorising facts, the really complex or intelligent skills that people acquire are based on a series of trials and errors made from the moment they become aware of their environment. Whether for babies or laboratory rats, intelligence is acquired by performing a task and determining whether the outcome was good or bad: was there a reward or a punishment? This conscious feedback loop, spread over days or even years, gives man the ability to do intelligent tasks like recognising a friend in a crowd even if he has grown a beard or lost his hair. After despairing for years about what kind of instructions must be given to a computer to do similar tasks, scientists have looked into the human brain -- the biological marvel that sits inside the cranium -- for ideas and have finally hit upon a way to address these challenges.

The animal brain is an electrochemical device consisting of billions of cells called neurons, linked by literally trillions of connections, that can sense, generate and transmit electrical signals. Each neuron can be visualised as a little blob with a number of wires sticking out of it. Most of these wires, called dendrites, can sense an electrical signal, while one, called the axon, can generate an electrical signal depending on the signals that it has received on its dendrites. The signal generated on the axon of one neuron feeds into the dendrites of other neighbouring or distant neurons, creating a complex electrical circuit that is constantly sending electrical charges rushing around the brain. Neurologists have a rough idea of what kind of electrical activity is associated with what kind of human behaviour, and it is believed that memory is created or defined by the way these billions of neurons are connected to and influence each other.

Computer scientists have merged these two mechanisms -- the reward-and-punishment mechanism from behavioural science and the input-output electrical mechanism from neurology -- to create a software program that mimics the behaviour of biological neurons. Such a software program is called an artificial neural network (ANN), and is the basis for almost all intelligent software including AlphaGo -- that can beat humans in board games, Watson -- that can diagnose clinical diseases, or the Google Self Driving car -- that has travelled thousands of miles with an accident rate that is far less than that of humans.

While we already know that a computer can be taught to add two numbers, let us now see how it can be taught to learn to add on its own! To build a general purpose ANN, one starts by simulating nodes, or artificial neurons, each with its own set of inputs and an output. Each input is associated with a parameter, or number, called a weight, and each output depends on the inputs and another parameter called a bias. The output from each node is sent to other nodes to mimic the electrical signals on the biological neural network. The number of nodes, along with the numerical values of the respective parameters, defines the ANN. When the ANN receives an input, it generates an output that depends on the specific values of the weight and bias parameters. If the values are chosen at random, the output is wrong, but with the right values the network will generate the correct answer. Learning consists of determining the right values. The beauty of this approach is that almost all problems can be addressed with this structure -- only the number of nodes, weights and biases need to change.

To teach a computer to learn how to add, we provide it with training data consisting of thousands of addition problems along with their correct answers! Initially the ANN has random parameter values; as it checks its output against the given correct answer and determines the error, it keeps changing the parameter values until, after thousands of attempts, the error becomes very small. Thus the machine discovers, or arrives at, the correct parameters and can be said to have learnt addition. Now it can solve new addition problems correctly even though no human programmer gave it explicit instructions to add, or even the right parameter values. The programmer’s skill lies in specifying how the parameters should be changed so that the error is minimised with the least number of trials. What is amazing is that a similar way of changing parameters can help the machine solve many other very different problems.
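As a toy illustration, and emphatically not how production neural networks are built, here is a single simulated neuron with two weights and a bias that learns addition purely from examples, by nudging its parameters against the error on each one:

```python
import random

random.seed(0)
# One artificial "neuron": output = w1*a + w2*b + bias.
# Start with random parameters; training should drive w1 and w2 towards 1
# and bias towards 0, even though the rule "add the inputs" is never coded.
w1, w2, bias = random.random(), random.random(), random.random()
rate = 0.01  # how strongly each error nudges the parameters

for _ in range(5000):
    a, b = random.uniform(0, 10), random.uniform(0, 10)
    predicted = w1 * a + w2 * b + bias
    error = predicted - (a + b)      # compare with the known correct answer
    # Nudge each parameter in proportion to its contribution to the error
    w1 -= rate * error * a
    w2 -= rate * error * b
    bias -= rate * error

print(round(w1 * 3 + w2 * 4 + bias, 2))  # close to 7.0
```

After a few thousand examples the parameters settle near w1 = w2 = 1 and bias = 0, and the neuron adds pairs of numbers it has never seen. An ANN is this same idea scaled up to many interconnected nodes, with a more sophisticated rule for updating the parameters.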

This is meta learning. The machine has learnt to learn and can now address many other hard problems like handwriting recognition, face recognition, detecting criminal behaviour like financial fraud and computer hacking, playing chess and other board games, medical diagnosis, weather forecasting, road navigation and even reading human thoughts! In all such cases, the program has to cycle through thousands of pre-solved problems until it discovers the parameter values that make it generate correct answers. Sometimes this training process is evident, as when you give it a large number of problems and their correct answers, but sometimes the training is implicit.

When you “tag” a picture of your friend on Facebook, you are inadvertently giving Facebook a correct answer to a face recognition problem, which helps train its software. Similarly, when you click on a link in a Google search, you are identifying the most relevant answer, and so Google is learning about your personal preferences so as to answer your next query better! In general, any action that you take on Facebook -- a like, an emoticon, a phrase that you type as a post or a comment -- is being used by the ANN to learn more about you so that it can predict your next behaviour, which could be anything from accepting friend requests to clicking on ads!

Computers have long been used to perform mathematical, process automation and other well structured tasks. But despite great increases in speed and memory, tasks that are ambiguous or unstructured were never addressed adequately. Many people claimed that intelligence -- something evident in children or even animals -- is something that mere machines can never acquire. ANNs, which can learn and be trained, have now helped us solve unstructured problems that were once considered intractable.

But unlike humans, machines can learn very fast -- working out 100,000 addition problems is almost impossible for a man but is trivial for a machine. So is the case with reading through all the articles in Wikipedia. If learning is the key to intelligence, as is commonly understood here, and if machines can learn much faster than humans, then does it mean that machines will become more intelligent than humans?

Tipping into the Singularity

In his 2000 bestseller, Malcolm Gladwell defined the Tipping Point as "the moment of critical mass, the threshold, the boiling point" at which a small change, a sudden trigger, can usher in a huge change in the larger society. The phrase originates in the study of epidemics, and refers to the moment when a virus reaches a certain critical mass and suddenly begins to spread at an accelerated rate. While Gladwell restricts himself to the analysis of social trends like crime and fashion, Ray Kurzweil, in a series of books on the technological singularity, predicts that human society as a whole is just about ready to transform itself in a way that is perhaps inconceivable today.

The trigger in this case is artificial intelligence or its alter ego, machine learning. The hypothesis is that society will pass through this technological singularity when non-biological intelligence transcends biological intelligence and changes not just how we live and locomote, but how we think and what we believe. But is this hypothesis sustainable? Or is it another big hype, of the kind that has been with us since the dawn of the science fiction era?

Twenty years ago, in 1997, IBM’s chess-playing computer Deep Blue created history by beating the world champion, Garry Kasparov. Last year, AlphaGo, a computer based on an artificial neural network (ANN) built by DeepMind, a Google company, which had “learnt” to play Go -- a game considered far more difficult for computers -- beat Lee Sedol, one of the highest ranked professional players in the world. In the first week of January 2017, AlphaGo, playing anonymously under the handle “Master”, ran through China’s online Go-playing websites and beat almost every top-ranked player with contemptuous ease. Playing with inhuman speed, it capped an unbroken 60-0 winning streak by beating Ke Jie, the reigning world champion. More than the defeat itself, its style -- the strategies it evolved on its own, so different from those used by humans -- left the champions awestruck. After losing to Master, Ke Jie admitted on social media: “After humanity spent thousands of years improving our tactics, computers tell us that humans are completely wrong, (and) I would go as far as to say that not a single human has touched the edge of truth of Go.” Playing in a manner completely different from humans, it bewildered opponents with apparently foolish moves that placed pieces at outrageously unconventional positions, so that one player, Gu Li, noted: “I can’t help ask, one day many years later, when you find your previous awareness, cognition and choices were all wrong, will you keep going along the wrong path or reject yourself.”

The key elements of the computer program were not written by humans but were discovered by the program itself as it learnt to play Go by observing millions of online games, as explained earlier.

Very similar is the case of Interlingua, a new language that has evolved, or emerged, out of Google’s automatic language-translation software. After being trained to translate, say, French to English and then English to Hindi, the software learned how to translate from French to Hindi even though it was never “trained” to do so. This new language, or new way to represent ideas, has emerged not from the minds of programmers but from the neural network architecture that drives the software. It is an emergent phenomenon, similar to the appearance of language in a primordial human society.

Since its uncertain beginnings in the 1960s, artificial intelligence technology took nearly 40 years to reach the level of maturity needed to beat a human chess champion, but the next big leap, when it learnt to play Go, took only 20 years. This is the law of accelerating returns -- where key jumps in human or biological ability happen at increasingly shorter intervals, or where the gap between innovations shortens exponentially, as shown by Ray Kurzweil:

[ Graph #1 source Wikimedia Commons Ray Kurzweil]

This graph plots the time gap between key events in the history of human social evolution against the time frame of history, and it shows a neat linear slope because it is a log-log graph: the logarithm has been taken of both the X and Y values. Similar graphs drawn with key events identified by others show a very similar downward trend.
However, if we convert the historical time frame to a linear scale, then the drop in the time between two successive key events is clearly precipitous, and this is what leads us to the concept of the technological singularity.
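Why do exponentially shrinking gaps look like a straight line on log-log axes? A minimal sketch below illustrates this with a toy model -- the milestone dates and the ratio are invented for illustration, not Kurzweil's actual data. If each gap to the next key event is a fixed fraction of the time remaining before the present, then log(gap) differs from log(time-before-present) by a constant, which is exactly a straight line of slope 1 on a log-log plot.

```python
import math

# Toy model: each key event happens after a constant fraction of the
# remaining time has elapsed (hypothetical numbers, not Kurzweil's data).
r = 0.6                                   # fraction of remaining time left after each event
t = [50_000 * r**k for k in range(12)]    # years before present of each key event
gaps = [t[k] - t[k + 1] for k in range(len(t) - 1)]   # time to the next key event

# Since gap_k = t_k * (1 - r), we get log(gap) = log(t) + log(1 - r):
# on log-log axes the points lie on an exact straight line of slope 1.
slopes = [
    (math.log10(gaps[k + 1]) - math.log10(gaps[k]))
    / (math.log10(t[k + 1]) - math.log10(t[k]))
    for k in range(len(gaps) - 1)
]
print("log-log slope:", round(slopes[0], 6))   # 1.0
```

On a linear time axis the same gaps collapse precipitously toward zero, which is the visual intuition behind the singularity argument in the text.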

Those who are familiar with Moore’s Law know that the density of transistors on computer chips has been doubling every two years -- with a corresponding fall in prices and increase in computational power. This exponential pace of technology has been the driver behind the inexorable and astonishing growth of computers. While Moore’s Law operates on computer chips and has held for the past 50 years, these charts show that human innovation has been accelerating exponentially across an even bigger time scale. Each human invention -- the ability to walk upright, the use of tools, language, agriculture, writing, printing, industrial machinery, electricity, computers, the internet -- has actually reduced the time needed to reach the next key invention. For example, years of calculations done by Kepler to track the movements of the planets can now be done by a school student on a desktop, and with Google search it is far easier for someone to tap into the knowledge of others and create a new piece of technology than it would have been 100 years ago.
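To see what "doubling every two years" compounds to, a one-line calculation suffices: 50 years of biennial doubling is 25 doublings, i.e. a factor of 2^25 -- roughly a 33-million-fold increase in transistor density.

```python
# Moore's law as stated above: transistor density doubles every two years.
# Over 50 years that is 50/2 = 25 doublings, a factor of 2**25.
years = 50
doubling_period = 2
growth = 2 ** (years / doubling_period)
print(f"{growth:,.0f}x")   # 33,554,432x
```

This is the kind of exponential the charts extend from silicon back across the whole arc of human invention.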

In another view of this same chart, we see an exponential -- almost magical -- growth in the human ability to address problems that were once thought impossible. After millions of years of being bound to the ground, man learnt to fly, was on the moon within 60 years, and is today planning to go to Mars. So is the case with industrial machinery, telecommunications, healthcare and a host of allied fields. We are at, or very close to, an inflexion point -- a tipping point -- that will put us on a trajectory leading to what seems to be a world of magic, one that may include immortality, omniscience or anything that does not violate the fundamental laws of physics!

A key characteristic of the tipping point would be the blurring of the borders between the biological and the non-biological world. Just as the industrial revolution merged the abilities of man and machinery to create the superhuman capability to manage enormous quantities of materials and energy -- with, for example, cranes, excavators and rocket engines -- so does the digital revolution merge the abilities of humans and computers to create a similar superhuman capability that manages equally enormous quantities of information. Continuing with this analogy, just as industrial machines today are far more muscular than humans in their ability to lift loads or travel faster and farther, so would the next generation of intelligent machines, powered by artificial superintelligence, far outstrip the mental ability of humans -- as demonstrated by AlphaGo.

This, then, is the singularity: that point in time that human society will soon pass through, after which the intelligence of machines will surpass that of humans. This superior intelligence, coupled with the ability to handle gigantic quantities of information at superhuman speed, will have a cascading impact on the emergence of new ideas that “will abruptly trigger runaway technological growth, resulting in unfathomable changes to human civilization.”

This explosion of intelligence -- a chain reaction in which each level of intelligence gives rise to an even higher level, exhibited either by machines or by a hybrid cyborg of man and machine, at faster and faster speeds and shorter and shorter intervals -- was first postulated by John von Neumann, a pioneer of, among other things, the digital computer. Von Neumann coined the term singularity but never lived to see technology reach anywhere near the level that matched his postulates. Subsequently, I. J. Good, Vernor Vinge and eventually Ray Kurzweil have elaborated the concept. However, equally well known people like Paul Allen, who co-founded Microsoft, and Gordon Moore, whose Moore’s Law is a classic example of this exponential progress, have raised doubts about the plausibility of the concept. But with the rapid growth of artificial intelligence evident in the spectacular success of AlphaGo, it seems that the singularity is indeed near, and despite a wide range of predicted dates, the median estimate of its arrival is the year 2040.

Once we pass through the singularity, how will the world look after 2040? Would this non-biological intelligence help create new biological, non-biological or hybrid “life-forms”? Would these life-forms spread out across the solar system and the galaxy? But before we look at such big questions, let us explore something of more immediate concern -- the job market.

The Future of Employment

When computers were first introduced in India, there was a lot of concern that this would lead to widespread unemployment. In West Bengal, and elsewhere, Communist labour “leaders” led huge agitations against the new technology with the slogan that automation must be stopped with rivers of blood, and employees of banks and commercial establishments went on strike to prevent the installation of computers. But in reality the introduction of computers did not lead to any major disaster. The number of jobs lost to computers and automation was more than offset by the number of new jobs created, not only in the IT industry but also in many new-age businesses based on computer technology. The number of people needed to use computers or write programs for them was greater than the number of people who became redundant. So while certain individuals were retired or laid off, the overall number of people who were gainfully employed increased and the economy responded with buoyant optimism.

That was then; what will it be now? Will the new technology, based on powerful instances of artificial intelligence, create more jobs than it destroys? Can our experience with an earlier generation of technology be extrapolated in time, through the technological singularity where machines become more intelligent than men? Unfortunately, the answer to both questions seems to be “No”.

While robots are not yet in widespread use in India, we can look at what is happening in other countries that are ahead of us on this path. Between 2000 and 2010, 5.6 million jobs were lost in the US and Canada. Of these only 15% were lost to overseas competitors -- mainly China -- while the other 85% were due to what is euphemistically referred to as “productivity growth”. This means that humans were replaced by machines or robots. However, this decline in employment did not result in lower production. On the contrary, in the last 20 years the value added by US factories has, after adjusting for inflation, grown by nearly 40% to reach a record US$ 2.4 trillion. On the street, McDonald’s introduction of self-service kiosks in response to popular pressure for a $15/hour wage is a vivid example. This is job-less growth, where the economy expands without a corresponding increase in employment, and it is perhaps one reason why many middle-class Americans who have been rendered unemployed and unemployable in the rust-belt states have aggressively turned against the establishment and voted for Donald Trump.

Many Trump supporters believe that foreign countries are stealing their jobs, but the situation is not greatly different there either. The BBC has reported that Foxconn, the Taiwanese company that manufactures products on contract for Apple and Samsung, has recently replaced 60,000 workers with robots, and many other companies in the Kunshan region of China, where Foxconn factories are located, are likely to follow suit. Since 2013, companies in China have purchased more industrial robots than those in any other country, as thousands of companies turn to automation in a drive backed by the Chinese government in a desperate bid to remain competitive in manufacturing. Can the “Make-in-India” movement avoid this? Unlikely, because no army can stop an idea whose time has come!

The services industry, which is of more immediate concern to India, is no better off. Thanks to better voice and natural-language recognition techniques, call-centre operators -- the backbone of India’s BPO / ITES success story -- can be replaced by artificial-intelligence-driven bots that can do a far better job of patiently listening to customer problems and offering solutions. In fact, a whole range of service jobs are now at risk, including but not limited to cooks and chefs, medical doctors, surgeons, pharmacists and pathology laboratory staff, security guards, retail salespersons, receptionists, bar-tenders, farmers, truck, bus and taxi drivers, and even those who perform unstructured tasks like journalists, accountants and insurance claims adjusters. This may sound too futuristic and science-fictionesque, but as we have seen earlier, the fierce acceleration of change makes it impossible to wish away these dire, Cassandra-like predictions on the future of employment. Amazon Go, a new kind of store that has no human employees, is an example of what a typical retail store could be like in the future. Its Just-Walk-Out technology allows a customer to walk in, pick up products from the shelves, look at them, return them if necessary, then simply walk out without any formal checkout procedure and yet be billed automatically on his credit card. This is not magic, and this is not the future. This is here and now. Such a store is actually operating in Seattle on a trial basis for Amazon employees, and it is simply a matter of time before it becomes mainstream technology, just as Google’s technology for driverless cars is proliferating across the US and Europe and is about to be tested even by Tata Elxsi in Bengaluru.

When bank workers were replaced by banking software, there was a big need for computer programmers to build and maintain that software. When retail stores were replaced by e-commerce sites, those sites needed people to build and maintain e-commerce software, and also an army of delivery boys to cater to an expanding clientele in previously unreachable mofussil areas. But when jobs are lost to robots and artificial intelligence systems, the number of jobs created is only a fraction of those lost. Unlike computer software, industrial robots are not built by humans but by other robots, which are designed and programmed by a few very smart humans. So a large number of low-end jobs are replaced by a few high-end specialist jobs. This makes eminent economic sense for both the users and the manufacturers of robots, but leaves the newly unemployed worker distressed and angry.

While robots can always build robots, it was thought that at least the programming of such automated systems would be done by a few highly skilled humans -- but even that relief may be short lived. In January 2017, the MIT Technology Review reported that Google is building systems that not only demonstrate AI in, say, driving cars, but -- like a snake recursively swallowing its tail -- actually build the next-generation systems that demonstrate more AI. This is the eureka moment, comparable to the DNA molecule becoming able to replicate itself and thus marking the emergence of physical life. As AI systems build more advanced AI systems, they can take on a life, a non-biological life, of their own. So the need for human beings becomes even more insignificant and inconsequential in the economic systems of tomorrow’s world.

But then what do we do with humans who are no longer necessary for the economy to generate goods and services? For whom would the “economy” generate goods and services? And who will pay for these goods and services? These are difficult questions that will crop up as humanity moves into the uncharted territory of the post-singularity era. An obvious way to contain the rising tide of resentment against unemployment would be to have strong social security systems, like a Universal Basic Income (UBI) or even NREGA, that push money into people’s pockets without their having to do any work -- because there really is no useful work left that they could do.

Does this mean that people will simply stay at home and play cards? Or will they get drunk and create mischief? An idle mind could be the devil’s workshop. What kind of sociological and psychological problems will this lead us towards? Once again we have questions but hardly any definitive answers. In the utopian scenario, we envisage that man will increasingly be involved in cerebral, cultural and entertainment activities -- art, music, literature, physical sports, cinema, virtual and augmented reality games, sex -- while machines, robots and AI systems do the “dirty” job of keeping the economy running so that it generates a distributable surplus. The other, dystopian scenario would be a descent into anarchy as the gap between the unemployable poor and the talented rich becomes bigger and more bitter. We are looking at gated communities, or huge walled towns, well endowed with sustainable sources of food and energy, managed efficiently with high technology and defended with highly effective robotic systems against an angry, anarchic and violent outer world ruled by rogues and brigands.

These two extremes that we can atavistically think of are reminiscent of the middle ages -- the first, benevolent scenario resembles a society of serfs and noblemen, while the second, malevolent scenario reminds us of medieval cities that were islands of civilisation and governance in an otherwise anarchic and lawless countryside. But perhaps, with the emergence and eventual predominance of artificial intelligence and non-biological “life” forms, our existing models of social behaviour will cease to be reliable predictors of the hybrid, human-machine civilisation that we cannot envisage at the moment but are about to bequeath to ourselves anyway.

The accelerating pace of technology cries “Havoc” and has let slip the dogs of rapid and irreversible change. The genie is out of the bottle, and new “life” forms, with a different type of intelligence, are about to be released first into the economy, then into society and finally into the planet as a whole. Would this wipe out humans? Or would man evolve and adapt to co-exist with this new “species”? Would he retain his position as the master at the top of the global ecosystem? Or would this position be taken over by a machine, or by a hybrid cyborg that combines the best -- or worst -- of both man and machine? The answer lies in the womb of futurity.

This article originally appeared in Swarajya, the magazine that reads India right
