November 02, 2017

Managing GST efficiently with Blockchain and GSTCoin (⇞)

Bitcoin is the rage! With an unbelievable appreciation in prices, many people want to invest in this red-hot cryptocurrency but whether the current valuation is a bubble that will burst is an open question. What is indisputable however is the immense versatility of the blockchain technology that is used in bitcoin -- and all cryptocurrencies. GST too is the rage, or rather the cause of rage because of the inefficiency of the software implementation. This article explains how a decentralised blockchain application could solve GST problems. But what is bitcoin?


image borrowed from financeminutes
A bitcoin is a unit of value, like an equity share of a company, that can be owned and transferred. It resides in an account in a ledger, like a dematerialised share in a demat account with NSDL. The account number, the public key of the account, is known to all and so anyone can send or deposit demat shares into this account. However to sell or transfer shares out of this account, the anonymous account holder must use a password, a private key that only he knows, to create and publish an outbound transaction pointing to another account identified by its public key. In cryptocurrency jargon, an account is called a wallet and the ledger is called the blockchain. A wallet is defined by a {public-key : private-key} pair consisting of two very large numbers that have special cryptographic properties. But what is really novel is that the blockchain ledger, the record of all transactions, is not maintained by or at any one institution, like the NSDL for equity shares, but jointly by all participants in the network. Everyone has a copy of the blockchain-ledger that has a record of all coin transfers and so everyone can both verify and confirm each transaction before they accept it in their own copy. An invalid transaction can pass into the blockchain-ledger if and only if, it is accepted by more than 50% of the network and this has never happened since 2009.


Verification means that a payment transaction is valid -- the total value of all inbound or credit transactions to a wallet less the value of all outbound or debit transactions is more than or equal to the current outbound debit transaction. Confirmation means that there is no double spend and the same set of unspent, inbound, credit transactions (“UTXO” or unspent transaction outputs) is not being used to create more than one outbound payment transaction. Since everyone has access to all transactions, anyone can perform the verification and confirmation. But because this needs a lot of computational power, this task is voluntarily done by a group of miners. As a reward for doing this public service, miners, who run full node blockchain software on a powerful computer, are rewarded with newly created coins that are added to their wallet when they have verified and confirmed a new block of transactions -- that is then added to the blockchain. However the reward is not given to every miner who performs the verification and confirmation but to the one specific miner who, in addition to the verification and confirmation, also solves a difficult mathematical puzzle first.
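The two checks described above can be sketched in a few lines of Python. This is only an illustration and not how the bitcoin software is actually written: the names Output, verify, confirm and apply_payment are hypothetical, cryptographic signatures are left out, and the set of unspent outputs is modelled as a plain dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    tx_id: str   # transaction that created this output
    index: int   # position of the output within that transaction
    owner: str   # public key of the receiving wallet
    value: int   # amount, in the smallest coin unit


def verify(inputs, amount, sender):
    """Verification: the inputs belong to the sender and cover the payment."""
    owned = all(o.owner == sender for o in inputs)
    return owned and sum(o.value for o in inputs) >= amount


def confirm(inputs, utxo_set):
    """Confirmation: every input is still unspent and none is used twice."""
    keys = [(o.tx_id, o.index) for o in inputs]
    no_repeats = len(set(keys)) == len(keys)
    return no_repeats and all(utxo_set.get((o.tx_id, o.index)) == o for o in inputs)


def apply_payment(tx_id, inputs, amount, sender, recipient, utxo_set):
    """Accept the payment only if it is both verified and confirmed."""
    if not (verify(inputs, amount, sender) and confirm(inputs, utxo_set)):
        return False
    for o in inputs:
        del utxo_set[(o.tx_id, o.index)]          # spent outputs leave the unspent set
    utxo_set[(tx_id, 0)] = Output(tx_id, 0, recipient, amount)
    change = sum(o.value for o in inputs) - amount
    if change > 0:                                # the balance returns to the sender
        utxo_set[(tx_id, 1)] = Output(tx_id, 1, sender, change)
    return True
```

If the same unspent output is offered a second time, confirm fails because the output is no longer in the unspent set, which is exactly the double spend that the network refuses.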


The payment, or output transaction, that deposits a newly created coin into a successful miner’s wallet is called a coinbase transaction. It is different from all other transactions because it is not backed by any previous input transaction. Hence the analogy of mining, as if a coin was dug out of the ground and not received from anyone else, whereas all other coins would have to be received from someone before they can be sent to someone else. However a better analogy would be to view bitcoin as sweat equity that is given, in lieu of salary, to the accountants in a bank for checking and approving all transactions. The brilliance of “Satoshi Nakamoto”, who designed bitcoin, was in equating the sweat equity of the bank to the assets that are managed by the bank and initiating a self-sustaining network that has been working flawlessly since 2009. The magic mathematics of cryptography ensures that this decentralised autonomous organisation (DAO) runs without any formal management and yet has achieved a market capitalisation of over US$ 70 billion.


But why should these “sweat equity” shares of a non-existent bank, the actual bitcoins, be each worth thousands of dollars today? Many people, who are not miners, buy these coins from the miners for investment or payment purposes and this demand is pushing up the market price.  Bitcoin can be purchased at many cryptocurrency exchanges with fiat currency like US$ or INR ₹. After KYC compliance, these exchanges will convert fiat currency into cryptocurrency and vice versa at market driven prices. Bitcoin is both a currency that is extremely useful as a payment mechanism because transfers are simple, fast and anonymous and is also a commodity that has appreciated in value and hence worth investing in.


How can GST use this technology?


Bitcoin is the most popular and valuable coin but there are many others. The software to create similar coins is freely available and anyone can do so. But these alternate coins do not have any value in the bitcoin network because transactions with these coins lack the cryptographic signatures of past owners that are necessary for verification and confirmation. However these coins can have value in other networks where they are recognised and traded.


Let us consider a scenario where we have 4 companies identified as A, B, C and D. C buys ₹ 1000 worth of goods from A and ₹ 2000 worth of goods from B. Assuming GST @ 18%, C pays ₹ 1180 to A and ₹ 2360 to B. D buys ₹ 5000 worth of goods from C and pays ₹ 5900 to C. Under GST rules, A, B and C are required to deposit ₹ 180, ₹ 360 and ₹ 900 respectively with the GST Authority (GSTA). However C can claim ₹ 540 as GST Input Tax Credit and deposit only ₹ 360, provided GSTA has evidence that A and B have made the corresponding deposits on behalf of C.
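The arithmetic of this scenario can be checked with a short Python sketch. The function and variable names are made up for this illustration, and a flat 18% rate is assumed as in the example above.

```python
GST_RATE_PERCENT = 18  # flat rate assumed in the example


def gst_on(value):
    """GST payable on a sale of the given value, in whole rupees."""
    return value * GST_RATE_PERCENT // 100


purchases_by_c = {"A": 1000, "B": 2000}   # what C buys from A and from B
sale_c_to_d = 5000                        # what C sells to D

input_tax_credit = sum(gst_on(v) for v in purchases_by_c.values())
output_tax = gst_on(sale_c_to_d)
net_deposit_by_c = output_tax - input_tax_credit

print(input_tax_credit)   # 540, that is 180 paid by A plus 360 paid by B
print(output_tax)         # 900
print(net_deposit_by_c)   # 360
```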


The idea is simple but the process is very complex and laborious. There are millions and millions of GST transactions and all this data accumulates in the central GSTA database. GSTA needs to run a matching exercise within this huge and ever expanding pool of data and identify the two deposits from A and B that benefit C. This is a big bottleneck, a possible single point of failure and so a cause of delay before C gets credit for ₹ 540.


Instead, let the GSTA define a new cryptocurrency called GSTCoin identified by the symbol ⇞ and through an Initial Coin Offering (ICO), release, say,  ⇞100 crore into its own GSTCoin wallet. To deposit GST, one has to purchase GSTCoins of equivalent value.  GSTCoins (⇞) can be purchased at one of many GST Exchanges (GSTX) operated by, say, banks. Every GST registered company will need a GSTCoin wallet whose public key is, for example, the GST registration number. This GSTCoin wallet will be hosted by the same GSTX where the GST registered company becomes a member and this will hold balances in two currencies namely, Rupee (₹ ) and GSTCoin (⇞). As in any currency exchange, the fiat currency Rupee (₹ ) may be deposited from and withdrawn into a traditional bank account using NEFT or credit card. GSTA too will have a wallet in each GSTX.


A GSTX would differ from a normal cryptocurrency exchange in two ways. First, members can only trade with GSTA, not with other members. When a member deposits GST, he actually buys GSTCoins (⇞) from GSTA, using either his Rupee (₹) balance or his GSTCoin (⇞) balance. The second key difference is that the purchased GSTCoins (⇞) are not delivered to the purchaser’s wallet but to a public key associated with the wallet of the beneficiary, which may be on a different GSTX. This is quite easily done in any cryptocurrency network. If the identity of the beneficiary for whom GST is being paid is not known, or if the beneficiary does not have GST registration, then the GSTCoins are delivered to a special terminus wallet of the GSTA.


All transactions are recorded in the GSTCoin blockchain ledger, a full copy of which is maintained by all those who run a full node on the GSTCoin network. Any organisation with an interest in the GST network, including large GST payers, may run these full nodes.


In our case, A and B will first deposit Rupees into their respective GSTX wallets and then buy GSTCoins worth ⇞180 and ⇞360, respectively from GSTA using their Rupee balances.  These will be delivered to the wallet of C which will now show a credit balance of ⇞540. Since GSTCoins could not have been transferred from the GSTA wallet to C’s wallet without a corresponding GST payment in either Rupee or GSTCoin, there is no further need for any matching. The GSTCoins in a beneficiary’s wallet must have been received against some GST deposit somewhere. Hence credit is automatic and instantaneous. For the GST deposit that C needs to make on his sales to D, he will purchase a total of ⇞ 900 GSTCoins through two transactions :  a ⇞540 transaction from his GSTCoin balance and a ₹360 transaction from his Rupee balance. This fulfills his obligation to deposit ₹ 900 as GST while actually spending only ₹360. In case D is registered for GST, the ⇞900 GSTCoins that have been purchased by C will go to D’s wallet. If D is not registered, ⇞900 GSTCoins will be delivered to GSTA’s own terminus wallet where it will be accounted for as nett GST received.
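To make the flow concrete, here is a toy Python model of the A, B, C, D example. It is a sketch under simplifying assumptions and not a blockchain: the class GSTNetwork and its methods are hypothetical, balances are plain integers, one GSTCoin is priced at one Rupee, and miners, blocks and multiple exchanges are ignored.

```python
class GSTNetwork:
    def __init__(self, issued):
        self.gsta_coins = issued   # coins GSTA still holds for sale
        self.gsta_rupees = 0       # Rupees actually received by GSTA
        self.terminus = 0          # coins that ended up in the terminus wallet
        self.rupees = {}           # Rupee balance of each registered member
        self.coins = {}            # GSTCoin balance of each registered member

    def register(self, name, rupees=0):
        self.rupees[name] = rupees
        self.coins[name] = 0

    def deposit_gst(self, payer, beneficiary, amount, pay_with):
        """The payer buys coins from GSTA; they are delivered to the beneficiary."""
        if pay_with not in ("rupee", "coin"):
            raise ValueError("pay_with must be 'rupee' or 'coin'")
        balance = self.rupees if pay_with == "rupee" else self.coins
        if balance[payer] < amount:
            raise ValueError("insufficient balance")
        if pay_with == "rupee":
            if self.gsta_coins < amount:
                raise ValueError("GSTA has no coins left to sell")
            self.gsta_rupees += amount
            self.gsta_coins -= amount
        # When paying with coins, the coins surrendered to GSTA are the ones
        # re-issued, so GSTA's stock of coins does not change.
        balance[payer] -= amount
        if beneficiary in self.coins:
            self.coins[beneficiary] += amount
        else:
            self.terminus += amount   # unknown or unregistered beneficiary


net = GSTNetwork(issued=10_000)
net.register("A", rupees=180)
net.register("B", rupees=360)
net.register("C", rupees=360)          # D is deliberately left unregistered

net.deposit_gst("A", "C", 180, "rupee")
net.deposit_gst("B", "C", 360, "rupee")
credit_for_c = net.coins["C"]          # 540, available without any matching

net.deposit_gst("C", "D", 540, "coin")
net.deposit_gst("C", "D", 360, "rupee")
```

At the end of this run the terminus wallet holds 900 coins and GSTA has received exactly 900 Rupees, so in this toy model the coins that reach the terminus equal the net GST collected on the whole chain.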


As in any cryptocurrency network, all transactions will be verified and confirmed by miners who operate full nodes and who, on finding a block of valid transactions, will be rewarded with a certain number of GSTCoins through a coinbase transaction. Like any other GSTCoin, these too can be used for GST deposits.


How is this blockchain based architecture any different from the current architecture of the GST software application that runs on a centralised database? Can we not modify this centralised software to make automatic NEFT or UPI payments to beneficiaries when GST is received? Functionally the two architectures are no different. But by physically distributing the processing requirements across multiple GSTXs we eliminate bottlenecks caused by the limitations in the computing power of a single central machine. Such a distributed system is inherently scalable and will not get choked by high transaction volumes. So delays associated with uploading data and transferring credit will be significantly reduced. Moreover, as a distributed system it will be almost immune to any form of hacking or disruption at any single point of failure. Finally, there is total transparency because the blockchain database is immutable and visible to the network of full node operators.


One challenge is privacy. Since all transactions are visible to everyone, all GST deposits and hence corresponding purchase prices are in the public domain. Fortunately all cryptocurrency wallets allow the creation and management of any number of public keys and it would not be difficult for the depositor’s wallet to pick up a new public key from the beneficiary’s wallet before initiating a GSTCoin purchase transaction, and use that to deliver the GSTCoins. This will make all blockchain transactions anonymous while leaving a clear audit trail within the wallet of the depositor that can be used in case of any subsequent disputes. In fact the wallet itself could automatically generate the GST return on a daily basis.


Another challenge is that GST consists of Central, State and Interstate GST and dues in one category cannot be offset with credits in the other. This problem can also be easily addressed by creating three different GSTCoins, say of three “colours”. The wallets will become a little more complex, but the same underlying principles can be used to meet the requirements of three different, but similar, taxes.
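A minimal Python sketch of the three “colours” follows, taking the rule stated above at face value: a credit in one colour can only reduce dues of the same colour. The colour names and the function settle are hypothetical and the figures in the usage line are invented for illustration.

```python
COLOURS = ("central", "state", "interstate")


def settle(dues, credits):
    """Rupees still payable per colour; credits never cross colours."""
    return {c: max(dues.get(c, 0) - credits.get(c, 0), 0) for c in COLOURS}


payable = settle(
    dues={"central": 450, "state": 450},
    credits={"central": 270, "interstate": 300},
)
print(payable)   # {'central': 180, 'state': 450, 'interstate': 0}
```

The unused interstate credit simply stays in the wallet; it is not applied against the state dues.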


With the RBI apparently exploring the possibility of its own cryptocurrency, the GSTCoin could be a good way to test Blockchain technology as well as to improve the efficiency of the current GST system.

This article has been written with help from Bhartendu Tilak, a Chartered Accountant and a colleague of the author. Any inaccuracy or inconsistency is of course the responsibility of the author.

This article originally appeared in Swarajya, the magazine that reads India right

October 28, 2017

Google Sheets with SQL

The Google App Suite (or G-Suite) of products is a powerful and free alternative to the traditional Office Suite and includes, among others, Docs, Sheets and Slides. Google Sheets offers full spreadsheet functionality and in this post and sample, we show how data stored in Google Sheets spreadsheets can be accessed with SQL. This post assumes that the reader is familiar with standard spreadsheet terms like sheet, range and functions.

Google Script

The primary programming language used here is Google App Script, a javascript look alike that can be used to enhance and extend G-Suite products. Google script can also be interleaved with traditional HTML / CSS / JavaScript to create an elegant user interface as is shown in this example.
A Google Script project sits in the Google Drive just like any other Sheet, Doc or Slides and consists of code in a code.gs file along with html files for the interface. Google Script functionality can be built into other Google products but in this example, we build a web-app that can be called independently with its own URL or can be embedded into an iframe.



So how does one start with this new technology?

The developers guide to web apps is a good place to start and build a traditional Hello World application that creates and serves an HTML file. Then things turned a little difficult because it is not directly possible to make pure SQL calls from Google Script to Google Sheets.

A quick search through, and a post on, Stack Overflow threw up two possible ways of addressing this problem and both approaches have been demonstrated in this application. However there were more challenges to overcome! How to navigate from one HTML page to another, how to store global variables that will carry data across functions and pages and how to embed the app inside an iframe. After sorting out all this the web app was up and running but it was looking rather clumsy and so there was a need for some cute CSS for the input form and the output report. But all this CSS had made the code almost unreadable and so there was a need to move the CSS and JavaScript code into external files and have them included in the HTML.

Before the app can be used it is nice to keep the following points in mind:
  1. For the googleViz route, the underlying spreadsheet must have public access, though this does not seem to be necessary for the query formula route
  2. For gViz, the name of the worksheet is adequate but for the query route the worksheet name followed by the cell range is necessary
  3. The spreadsheet document needs to be identified by the document ID that is a part of the URL that leads to the sheet. However the URL to the sheet can also be used by calling a different function that was tried but then commented out.
The source code for this web app -- with all script code, html, javascript and CSS -- is available in G-drive. You would need to login with your Gmail id to inspect it and use it if you like this approach.
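For readers who want a feel for the two routes before opening the source, here is a small Python sketch that only builds the strings involved; it is not the code of the web app, which is written in Google Script. The function names are hypothetical, DOC_ID is a placeholder, the gViz route is assumed to go through the public gviz/tq endpoint of a shared spreadsheet and the query route through the QUERY spreadsheet function. Columns are referred to by their letters, as the query language expects.

```python
from urllib.parse import urlencode


def gviz_url(doc_id, sheet_name, sql):
    """URL that asks a publicly shared spreadsheet to run a query, as CSV."""
    params = urlencode({"tqx": "out:csv", "sheet": sheet_name, "tq": sql}, safe=":")
    return f"https://docs.google.com/spreadsheets/d/{doc_id}/gviz/tq?{params}"


def query_formula(sheet_and_range, sql):
    """Spreadsheet formula for the query route; needs the sheet name AND the range."""
    escaped = sql.replace('"', '""')   # double quotes are doubled inside a formula string
    return f'=QUERY({sheet_and_range}, "{escaped}")'


print(gviz_url("DOC_ID", "Sales", "select A, B where C > 100"))
print(query_formula("Sales!A1:C50", "select A, B where C > 100"))
```

Note how the first function needs only the worksheet name while the second needs the worksheet name followed by the cell range, which mirrors point 2 in the list above.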

Some known Issues :
  1. The worksheet names should not have spaces
  2. Sometimes, the last displayed report shows up instead of the new one. Reloading the page fixes the problem
  3. Multi-table joins are of course not supported! That is where you need a real RDBMS

October 04, 2017

Quantum Computers

My presentation on Quantum Computers at the Cypher 2017 conference in Bangalore on 21 September 2017



More information about Quantum Computers is available in an earlier blog post.

October 02, 2017

A Central Cyber Defence Authority for Digital India

The weekend of 8th - 10th July 2017 was a little different from most other weekends. First, late on Friday night, the Airtel network in the NCR region went down because of a data corruption in one of their critical computers. Then, and the exact time is not known, someone hacked, or illegally accessed, the Jio customer database, retrieved confidential identity data about customers, including phone and Aadhaar numbers, and published the same on a public website. Finally, and again the exact time is not known, something unpleasant happened to the National Stock Exchange computers so that when the market opened on Monday morning, nobody could trade for almost the entire day.


It is possible that these three events were independent, random events but as Goldfinger says, in Ian Fleming’s eponymous novel, “Once is happenstance. Twice is coincidence. The third time it's enemy action.” So let us not have any delusions about incompetence or equipment malfunction. This was a cyber attack.


Who could the enemy be?


Obviously we do not know, as yet, but consider three more facts. First, this was when India was in a tense stand-off in the Sikkim sector of the Indo-Tibetan border where the Indian army had a significant situational advantage. Second, China has publicly warned India that 2017 will not be the same as 1962. Finally, in 2014, five officers of the Chinese People’s Liberation Army Unit 61398, operating out of a 12 storey building in the outskirts of Shanghai, were indicted by a US Federal grand jury “on charges of theft of confidential business information and intellectual property from U.S. commercial firms and of planting malware on their computers” leading to a tense stand-off between the United States and China over state-backed cyber espionage.


While it is true that no bullets, shells or rockets have been fired across the Himalayan border, at least not till this article was written, could it be that something else is being fired?


Most of us civilians, living far away from the LoC or International borders have not seen guns being fired in anger but thanks to television and social media coverage we have a fair idea of what happens there. But what does a cyber attack look like?   

Consider these two screenshots from two well known cyber security companies that show live cyber attacks in real time :


Watch live cyber attacks at : http://map.norsecorp.com/#/


Watch live cyber attacks at : https://cybermap.kaspersky.com/

This is only a small subset of actual cyber criminal activities that security companies can track and have chosen to make public -- like an excerpt from the register of FIRs that is maintained in every police thana in India. As in real life, most crimes are neither recorded nor publicized unless they reach epidemic or pandemic proportions like the WannaCry virus that disabled thousands of computers by encrypting critical data.

Now that we know how pervasive and ubiquitous cyber attacks are, what should we be doing to counter them? Some of us use anti-virus and anti-malware software on our personal machines and many technology savvy companies use firewalls to protect their internal networks -- that connect both users’ personal machines and company servers containing operational databases -- from hostile external access. But is this adequate?

While technology exists to stop almost every kind of cyber attack, not all end users have the knowledge, the ability and most importantly the determination to use it effectively. Consider a small or medium size company that uses a billing or financial accounting software. In the past, this would be on stand alone machines and hence inherently safe because they were “air gapped” -- or physically disconnected -- from the big, bad external world. But with more and more bills, invoices and money receipts being exchanged over mail this is no longer possible. So is the case with electronic filing of various tax returns and GST in particular. It is now impossible for any useful computer to be isolated from the internet and hence be safe from hostile attacks from anyone, anywhere in the world. Are the computers that form the backbone of our central and state governments safe? Unfortunately, the answer is NO. So what if “non-state” hackers shut down the computers that control Power Grid’s electricity distribution network in India, as was the case with the National Stock Exchange? The damage would be worse than a bomb exploding in Howrah Station!

The challenge is less about ability and more about the attitude towards security. We know that our homes, offices and factories face threats from thieves and robbers but do we all learn martial arts and purchase guns? No, we hire security guards or outsource the security to specialised security agencies who have the expertise to handle thugs and thieves. Can our software programmers and IT staff not protect our computer systems? In principle they can, and in many companies they do keep hackers at bay but most software programmers have expertise in a completely different area -- meeting customer and business requirements in an efficient and economical manner. Security for them is more often than not an afterthought, not the core competence. On the other hand, the durwan at the gate does not care two hoots about how and what is being produced in the factory but only knows that neither should anything go out nor should anyone enter the premises without an approval from an authorised person. That security mindset is lacking in most of our IT installations.

Which is why we have the police in towns, the CISF in factories and airports, the RPF at railway stations, the BSF and the ITBP on the borders and of course the Army as specialist agencies of the state whose only job is to ensure the security of our citizens, our factories, our infrastructure and hence of the country itself. Where is the equivalent agency that guards our cyber assets? Critical machines in the GST network, the bank ATM network, the telephone network, computers that control the generation and distribution of power, computers that store Aadhaar and voter information are at the moment being guarded, if at all, by people who know little about cyber security and certainly do not have the “police” mindset that anticipates crime and thwarts threats. CERT-IN, the Indian Computer Emergency Response Team, under the Ministry of Electronics and Information Technology, is merely a technical body, not a security agency, whose responsibility is limited to collecting and disseminating information on threats and offering advice to anyone who chooses to listen. They do have the mandate to intervene during or, as is usually the case, after an attack but do not have the executive or operational responsibility to actually prevent attacks, as is the case of the CISF or the BSF. The so called “cyber cells” of the metropolitan police are hardly any better -- all that they can do is track down mischief makers who put up politically inconvenient Facebook posts.

Going forward what we need is to separate the operational roles from security roles. Just as the security of an industrial plant is not the responsibility of the production manager but instead, is handled by a separate security department, so should be the case of security for our government installations. Those who operate IT systems should not have the additional responsibility of ensuring their security. This is not because local IT staff may not be competent enough, but because we need a consistent and comprehensive security stance at all possible threat points. It is not enough for some installations to be secure. Since all systems are interconnected, a breach anywhere is a threat everywhere and that is why we need consistent security everywhere. Hence the cyber security team should not be a part of the  local IT management but should be a part of a central organisation, the Central Cyber Defence Authority, CCDA -- analogous to CISF or BSF --  reporting directly into the security establishment in the Home Ministry.

In fact, CCDA should be an organisation on par with any other central security agency like CISF, CRPF, BSF, ITBP and like them should be headed by a person from a police, or crime prevention, background with a rank equivalent to that of the head of existing central forces. While CCDA should be responsible for government and public assets, private companies, unless they create their own separate cyber-security organisations, could outsource their cyber-security requirements to professional security companies, for whom this will be an additional line of business above and beyond their normal fire and crime prevention services.

But while our security establishments, the Army, police, CISF, CRPF etc, may have the psychological mindset, the security stance, to anticipate criminal behaviour and prevent crime, they would not have the technical skills to do so. Cybersecurity is not a part of the curriculum either at the Indian Military Academy or the National Police Academy and it is unlikely that it will ever be so. Even if some basic training is imparted, it will never have the technical depth required to defeat the sophisticated hacker. However the Manhattan Project, to build the atom bomb, was run by the US Army Corps of Engineers under General Leslie Groves but he had the best nuclear scientists like Robert Oppenheimer and Nobel Laureates like Richard Feynman working for him. So should be the case of the CCDA -- led by people from a police background, with an aptitude in computers and an interest in cyber security, but staffed with people who have the deep technical knowledge, recruited laterally, or on lien, from the IT industry.

Just as the CISF reports to the Home Ministry but is deployed in airports that report to the Aviation Ministry, the CCDA should report to the Home Ministry but should be deployed across all computer installations in all government departments, power generation and distribution companies and other critical utilities like roads, railways, telecom, ATMs. In these deployments, CCDA should be THE executive body, not an advisory one, and should have both the responsibility and the authority to ensure security. For example, it should be CCDA technicians who have the passwords for the firewall servers -- that protect government computers on, for example, the GST network or the power transmission network -- and who are responsible for configuring the security settings on the same. This will be analogous to the CISF -- not DGCA, AAI or airline staff -- being the custodian of the door keys, frisking passengers and operating the X-ray scanners at the airport.

In fact, CCDA, like the Army, should also acquire offensive, or “Strike”, capabilities in addition to its professed defensive, or “Holding”, capabilities. Building offensive capabilities is a good way to test its own defences and, often, offence is the best form of defence!

But unlike other central forces, the CCDA need not physically relocate its expert staff to distant locations even when it is deployed to protect dispersed digital assets. Just as the attacker can attack from anywhere in the world, so too can the defender protect from one or two central locations because all activity -- both offensive and defensive -- can and will be carried out over the same networks.

The HBO network was recently hacked by people who demand a multi-million dollar ransom in untraceable bitcoins to refrain from leaking episodes of the billion dollar Game of Thrones serial. What would happen if someone were to hold the Government of India to ransom with a similar hack? Just as we need to have the BSF jawan with his INSAS rifle at the LoC or the CISF jawan with his X-ray scanner at airports, we also need the CCDA jawan -- or in this case, the CCDA technician -- with his “hardened” firewall to stand guard on the digital assets that are connected to the web. The arrival of nuclear technology in the battlefield, led India to set up the Nuclear Command Authority. With the emergence of Digital India, we need the CCDA to protect the core digital assets that are critical for safety and security of the country.

Originally published in Swarajya, the magazine that reads India right!

September 03, 2017

The IT Professional & the Zone of Comfort

Rapid advances in machine learning and the spectacular success of artificial intelligence software in, say, self-driving cars, voice recognition and chatbots for customer service, are sending shivers of anxiety through IT employees. The havoc that robots and automation technology have played with the jobs of blue collared workers on the shop floor is now travelling upward into white collar offices and not a day passes without a new report about automation eliminating jobs. In India, the IT sector -- that includes actual software developers, application maintenance staff, tech-support personnel and BPO call centre operators -- seems to be particularly vulnerable and it is no secret that a sense of doom and gloom hangs over the cubicles and around coffee machines in large and small IT companies. To make matters worse, some companies have started to shed mid-level people managers, who have not written code for years, and even senior managers who give poor returns of billability on their bloated salaries. The last straw on the back of the vanishing optimism is the reduction in campus hiring of bus-loads of low quality engineers from the hundreds of engineering colleges that have mushroomed on the promise of the Y2K inspired IT revolution. How much of this gloomy scenario is true and what can be done to bring the sunshine back? Obviously there is no quick fix but let us explore the terrain to seek a way out of these difficult times.

image borrowed from Financial Express
TCS, the biggest IT company in India, was founded in 1968 and its 3.75 lakh employees generate a revenue of $18 billion while Microsoft, founded in 1975, pulls in $86 billion with 1.2 lakh employees. These, and similar, statistics have been used, ad nauseam, to pontificate that India must move up the value chain from TCS style services to Microsoft style products. But why has this not happened despite being talked about for years? One reason of course is the distance from the customer. Before the advent of the world wide web, a software builder in India would be so far removed from both the technology and the customer for the technology that it would have been impossible for him to create anything relevant. Hence the great divide between the smaller, H1B fuelled, onsite gang and their poor cousins in the offshore team. But even this results in only more services but no products. However with the internet bridging the gap this should not have been an issue -- except that it still is!

It is said that the eclectic ecosystem of Silicon Valley, with its simmering cauldron of technology evangelists, dreamers, brilliant programmers, venture capitalists along with legal and infrastructural support, is a fertile bed where innovative products sprout like weeds. Then how come Skype, which defines web based video conferencing, was developed in Estonia, a country of 1.3 million people that most of us may not be able to locate on a map? Similarly, AVG, one of the most popular anti-virus products, was developed in Czechoslovakia, a country with only 16 million people. But India, with over a billion people of whom 3 million are software professionals, is yet to come out with any such software product that has global acceptance and recognition. Do Indians not know how to write programs? That is highly unlikely, given the size of the IT sector in India, but what is surely missing is the ability to complete the full cycle of identifying requirements, architecting the design, securing funding, coding, building the product and eventually managing and monetising intellectual property rights. Instead, what our professionals know and do best is to receive instructions from an overseas client and code to their specifications.

This inability to go beyond meticulously following instructions, and that too at a price point so beloved of our overseas clients, is the root cause of the insecurity created by the arrival of AI. This technology is best geared to target tasks that are reasonably well defined and need to be done repetitively. This means that in the spectrum of IT services, call centre operations and tech-support jobs are the most vulnerable. Neither is application maintenance any safer because fault diagnostics and repair is something that AI can do pretty well. What is most safe is new application, or product, development -- though even here, there are rapid application development tools that reduce the effort, and people required -- and that is where Indian IT is on its weakest wicket.

One reason why we in India are unable to come up with new products is that as a people, we are perhaps very comfortable in our respective Zones of Comfort. Our reverence for what is old, established and running is phenomenal and we are very reluctant to try out anything new. Consider the transition from MS Office, with which all of us are comfortable, to cloud based free products like Google Docs, Sheets and Slides. Despite the fact that web connectivity is as ubiquitous as electricity for our IT folks, and that the Google products meet all the requirements for 95% of IT professionals, they will almost inevitably begin with MS Office whenever they want to create a new document. Why? Zone of Comfort! This inability to try out something new inhibits our mid-level managers from “dirtying their hands” with any new technology. In fact, for many of our managers, trying out technology is considered infra-dig! Most of them prefer “management” tasks like allocation of people, attending client conference calls, preparing schedules, recording and tracking issues in minutes of meeting and so on because all that this needs is comfort with email and MS Office. In fact many managers overtly claim that it is beneath their dignity to touch code -- something left for the new hires -- when the covert reality is that it is beyond their ability to do so.

In fact, this reluctance to actually “do something new” is a part of a larger tendency of being involved with consumption and avoiding creation. We would rather read a webpage than actually write a blog. It is an even greater effort for us to write a book when print-on-demand services are available for anyone who wants to publish on his own. The genesis of this mindset of consumption can perhaps be traced back to the path that our kids take from class rooms in schools to the desk at the IT company. Given the historical scarcity of jobs and the lure of campus placements, there is a mad rush for engineering entrance examinations because only those who can crack exams get selected for engineering and then placed in IT and even non-IT companies. The creative types, who are misfits in the rigid constraints of coaching classes, are automatically excluded not just from our engineering colleges but subsequently from the corporate sector. But the exam-crackers, most of whom have been successfully hammered by coaching institutes to abandon their originality and conform to patterns required by entrance examinations, enter the sector, rise through the ranks and in a pernicious cycle, recruit more and more conformists like them thus perpetuating the scarcity of creativity and innovation in our IT companies.

But all that is history. It is easy to say that we must change the system but that is neither something that will happen very soon nor will it benefit anyone in the IT industry today. What should one do to stay employable and relevant?

First, stop blaming the system, the nation or your company and take charge of your life. Light a fire under your seat and move out of your Zone of Comfort. Install an RSS reader in your browser and instead of reading client mail and following company gossip, keep an eye on RSS feeds from Slashdot, TechCrunch and Wired for latest technology trends -- say, machine learning or cybersecurity. Create blogs, contribute on discussion forums. Google and locate technology tutorials. Invest time and money -- lots of time and a little money, because not everything is free -- to acquire new skills. Skip that latest smartphone and instead, buy a personal laptop to install new, experimental software that your employer’s security policy bars on company machines. Write code, build proofs-of-concept, purchase hosting services to make these applications public and highlight these in your LinkedIn profile. Go beyond the laptop, get a Raspberry Pi or an Arduino, connect it to a smartphone or even a drone -- available online in India -- to create something that people can touch and play with. Obviously things will not work as easily as they do in office projects but Stack Overflow is always there to help one go past the bleeding edge. Finally, get your kids out of coaching classes and encourage them to join you in exploring new technology! Move from the cool comfort of consumption to the caustic crucible of creation.

AI  is certainly a threat to all those who stay within their Zone of Comfort.  But technology offers infinite possibilities for those who choose to stay relevant through this fourth revolution -- agricultural, industrial, digital and now cognitive -- in human society.

This article originally appeared in Swarajya, the magazine that reads India right

August 10, 2017

Facebook : How it meddles with your mind

Facebook is the mythical 800-lb gorilla in the media world that, as the original joke goes, “sits down wherever it wants to”. With 1.2 billion pairs of eyeballs eyeing it every day, it has an audience greater than any American, European or Asian TV news network, newspaper or online news portal. This immense reach also makes it the most effective medium of entertainment. In societies where it has crossed a critical threshold of penetration, it has become the most potent mobilising force in politics and all this eventually translates into Facebook being one of the  most valuable companies in the world.
image borrowed from https://mymuddledmind.blog/

We know that information is power. We also know that power corrupts and absolute power corrupts absolutely. Should we be wary of Facebook? Consider the following ...

In the Foundation series of iconic science fiction novels by Isaac Asimov, we have  the villain, a mutant psychopath called the Mule, using popular musical concerts as a mechanism, a medium, to transmit subliminal messages to an unsuspecting audience, that demoralizes the population and breaks its resistance to the Mule’s political hegemony.  On December 17, 1997, in a chilling realisation of this fictional scenario, many news channels, including the New York Times and CNN, reported from Tokyo, that “The bright flashing lights of a popular TV cartoon became a serious matter Tuesday evening, when they triggered seizures in hundreds of Japanese children. In a national survey, the Tokyo fire department found that at least 618 children had suffered convulsions, vomiting, irritated eyes, and other symptoms after watching "Pokemon."”

Can a mass media platform be used to meddle with or influence, human minds, en masse?

As an early adopter and ardent evangelist of social media, I had always thought that platforms like Facebook and Twitter were an excellent replacement for television and newspapers as channels for current news and diverse views. But after getting drawn into a series of unintentional and inconclusive spats and flame wars with strangers with whom I have little in common and which left both sides as unconvinced about the other’s point of view as ever, I am sceptical. Was the price I was paying for using these “free” channels far too high in terms of the collaterals of irritation and anger generated in an otherwise placid and cheerful person like me? Was this my fault? Was I not savvy enough to handle this new media, just as an earlier generation is psychologically uncomfortable with shopping at Flipkart or using an Android smartphone? How did the evangelist in me morph into a social media luddite, ranting against a technology? Was it just me? Or is this feeling universal?

In a peer reviewed paper published in the Harvard Business Review in April 2017, Holly Shakya and Nicholas Christakis have established what I had recently come to believe, namely, that “The More You Use Facebook, the Worse You Feel”! This is paradoxical because social interaction is a necessary and healthy part of human existence and many studies have shown that people thrive when they have strong, positive relationships with others. But when real world, physical relationships are replaced by digital and virtual relationships, the situation changes. The authors measured well-being -- through self reported life satisfaction, mental and physical health and body-mass index -- and Facebook usage -- through the number of likes, posts and clicks on links -- from three waves of data of 5208 users over two years, and came to the conclusion that overall well-being was negatively associated with Facebook usage, with the results being particularly strong for mental health. Moreover, the study also showed that the decline in well-being is strongly tied to the quantity of Facebook usage and not just the quality of interactions as it was believed to be in the past.

While the authors offer no explanation for this negative association of well-being with Facebook usage, it is not difficult to see why this is so if we consider what shows up on your newsfeed. Depending on the number of posts that your friends, and pages that you have liked, have shared there would be approximately 2000+ items that Facebook could show you but since this leads to an uncomfortable information overload, the actual number shown is possibly as low as 200. This selection or curation is not performed by any human editor but by an artificial intelligence (AI) program that is designed to maximise benefits for Facebook. Since it is in Facebook’s interest to stimulate conversations, its AI will obviously select items that would provoke a user to react -- just as in a zoo, visitors throw stones at the animals instead of allowing them to rest in peace. Hence, while placid and informative items will not be totally ignored, there will always be a slight bias towards items that will provoke a reaction. For example, a Hindutva follower -- and Facebook knows our preferences to the last detail -- will be shown more items on minority appeasement, knowing full well that this is more likely to trigger a torrid response, and a subsequent equally torrid counter response, than pictures of flowers and birds. Of course this bias is neither obvious nor in-your-face. You will still see the usual quota of bland, feel-good quotes and pictures of friends holidaying in Goa or Singapore. Which is fine, except that you just might feel a tad disappointed that you are stuck in messy Mumbai instead of being in Goa, which is another reason for feeling a bit sore with yourself! Since nobody posts about their problems, this too leads to the depressing belief that everyone except you is happy.

In fact, playing and tampering with Facebook users’ emotions and deliberately trying to modify them is the subject of a very controversial paper - “Experimental evidence of massive-scale emotional contagion through social networks”, published in the June 2014 issue of the Proceedings of the National Academy of Sciences USA, by members of the data science team of Facebook. For the purpose of this paper, the Facebook team deliberately introduced a certain bias in the nature of items included in the Facebook user’s newsfeed and observed the impact on their subsequent behaviour. To quote the authors, “In an experiment with people who use Facebook, we test whether emotional contagion occurs outside of in-person interaction between individuals by reducing the amount of emotional content in the News Feed. When positive expressions were reduced, people produced fewer positive posts and more negative posts; when negative expressions were reduced, the opposite pattern occurred. These results indicate that emotions expressed by others on Facebook influence our own emotions, constituting experimental evidence for massive-scale contagion via social networks.”

This paper was criticised for violating basic ethical principles of psychology research because no consent was sought from the subjects whose emotions were being tampered with. That does not detract from the fundamental premise that Facebook has the ability to modify the emotions of its users and has done so in the past. In fact, what is even more disturbing is that Facebook now has the technology to use webcams and smartphone cameras to track emotions in real-time by detecting and decoding facial expressions as we read posts! While there is no evidence of any deliberate evil intent as yet, the fact that its AI based news selection service can detect and tamper with the emotions of users is a big red flag because, as noted earlier, Facebook touches more people than any newspaper, television channel or news portal and so has the ability to mould the emotions of a significant part of the global population.

While Facebook has been targeted for being a channel or firehose for fake and unsubstantiated news, the real danger lies in its ability to tamper with our emotions and, as reported in the HBR paper, make all of us feel angry, frustrated, jealous and upset with the world around us. Can we do anything to mitigate this unfortunate state of affairs? At a personal level, one could reduce the amount of time spent on the platform but since Facebook is an addiction like tobacco or alcohol with similar withdrawal symptoms, this may not be a feasible solution for everyone.

What users could ask for instead, is greater transparency in the algorithm, the procedure, used to determine what they see or don’t. If I want to see posts about birds and flowers, I must not be shown pictures of stone-pelters in Kashmir. In fact, such a process does exist, because you can indicate the kinds of posts that you want to see less of, but a more direct method should go a long way to restore the sense of choice that we have in newspapers and TV to read or ignore specific items of news and views.

Social media is here to stay and Facebook, with its unassailable reach and immense clout, is something that -- like the monsoon rain -- we have to learn to live with. However knowing the danger that it poses and working on ways to reduce its impact is something that needs urgent action.


This article originally appeared in Swarajya, the magazine that reads India right.

July 27, 2017

OLAP Data Cube with SQL

As an erstwhile DBA, a long time user and a great admirer of the SQL language -- that has stood the test of time for the last 30 years -- I have always sought to use SQL in many useful ways. In an earlier post, I had shown how SQL can be used to solve a classic data science problem, namely Clustering, using the K-Means algorithm and today, I demonstrate how SQL can be used to process OLAP data cubes and generate the popular cross-tabs table.

Data cubes, or OLAP cubes, are a way to store historic data using the dimensional model, as opposed to the relational or 3rd normal form model. These data cubes can be "sliced" and "diced" to reveal data relevant to particular dimensions. Because of the immense popularity and ubiquity of relational databases, like Oracle and MySQL, data in the dimensional model is routinely stored in relational tables and retrieved -- by slicing and dicing the cube -- using standard SQL constructs like the WHERE clause. This is called Relational OLAP or ROLAP.
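
To make the idea of slicing and dicing with a WHERE clause concrete, here is a minimal sketch. It is not taken from the slide deck: the `sales` table, its three dimensions (region, product, quarter) and the numbers are hypothetical, and Python's built-in SQLite is used as a stand-in for MySQL simply because it needs no server. The SELECT statements themselves are plain SQL that should carry over to MySQL unchanged.

```python
import sqlite3

# Hypothetical fact table: each row is one cell of a 3-dimensional cube
# (region x product x quarter) holding a single measure, amount.
ROWS = [
    ("East", "Tea", "Q1", 10), ("East", "Tea", "Q2", 12),
    ("East", "Coffee", "Q1", 7), ("East", "Coffee", "Q2", 9),
    ("West", "Tea", "Q1", 5), ("West", "Tea", "Q2", 6),
    ("West", "Coffee", "Q1", 8), ("West", "Coffee", "Q2", 11),
]


def build_cube():
    """Load the hypothetical cube into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (region TEXT, product TEXT, quarter TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)
    return conn


def slice_cube(conn, quarter):
    """Slice: fix one dimension (quarter) and keep the other two."""
    sql = (
        "SELECT region, product, amount FROM sales "
        "WHERE quarter = ? ORDER BY region, product"
    )
    return conn.execute(sql, (quarter,)).fetchall()


def dice_cube(conn, region, quarter):
    """Dice: restrict two dimensions at once, leaving a smaller sub-cube."""
    sql = (
        "SELECT product, amount FROM sales "
        "WHERE region = ? AND quarter = ? ORDER BY product"
    )
    return conn.execute(sql, (region, quarter)).fetchall()
```

Calling `slice_cube(build_cube(), "Q1")` returns the region-by-product plane for the first quarter, which is all that a slice amounts to in the relational representation.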

Data cubes are very popular because they allow multidimensional data to be collapsed to any two dimensions and shown as a "CrossTab" -- and human beings can comfortably visualise only two dimensions on a page or a screen. Unfortunately, creating CrossTabs is not very easy with normal SQL and that is why there exists a genre of specialist products -- Multidimensional OLAP or MOLAP -- that allow users to create CrossTabs by "rotating" the data cube as necessary.

Microsoft SQL Server, an RDBMS product, supports a GROUP BY construct called CUBE that allows this feature but this is not available in every RDBMS product and certainly not in MySQL, the free and open-source product that is the most widely used RDBMS on the planet.

The following slide deck shows how MySQL can be used to "rotate" an OLAP data cube and generate CrossTabs for any cube of dimension 3 or higher


(please view the slide deck in full screen mode)
We also show how a "pivot" table, so beloved of Excel users can also be generated using MySQL and hence by extension in any RDBMS.
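
Since the slide deck may not be visible here, the following is a rough sketch of one common way to build such a CrossTab: conditional aggregation, with one `SUM(CASE WHEN ...)` column per value of the column dimension. It is my own illustration under stated assumptions, not a copy of the deck: the `sales` table and its data are hypothetical, and SQLite (bundled with Python) stands in for MySQL. The generated SQL uses only standard constructs, so it should run on MySQL as well.

```python
import sqlite3

# Hypothetical 3-dimensional cube: region x product x quarter, measure = amount.
ROWS = [
    ("East", "Tea", "Q1", 10), ("East", "Tea", "Q2", 12),
    ("East", "Coffee", "Q1", 7), ("East", "Coffee", "Q2", 9),
    ("West", "Tea", "Q1", 5), ("West", "Tea", "Q2", 6),
    ("West", "Coffee", "Q1", 8), ("West", "Coffee", "Q2", 11),
]
DIMENSIONS = ("region", "product", "quarter")


def build_cube():
    """Load the hypothetical cube into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (region TEXT, product TEXT, quarter TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", ROWS)
    return conn


def crosstab(conn, row_dim, col_dim):
    """Collapse the cube onto two dimensions, summing over all the others."""
    # Column names cannot be bound as parameters, so check them against a whitelist.
    if row_dim not in DIMENSIONS or col_dim not in DIMENSIONS or row_dim == col_dim:
        raise ValueError("row_dim and col_dim must be two different dimensions")
    # Find the distinct values that will become the output columns.
    col_values = [
        r[0]
        for r in conn.execute(f"SELECT DISTINCT {col_dim} FROM sales ORDER BY {col_dim}")
    ]
    # One conditional SUM per column value; the values themselves are bound safely.
    cases = ", ".join(
        f"SUM(CASE WHEN {col_dim} = ? THEN amount ELSE 0 END)" for _ in col_values
    )
    sql = f"SELECT {row_dim}, {cases} FROM sales GROUP BY {row_dim} ORDER BY {row_dim}"
    return col_values, conn.execute(sql, col_values).fetchall()
```

"Rotating" the cube is then just a matter of calling `crosstab` with a different pair of dimensions, for example `("region", "quarter")` and then `("product", "quarter")`.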

But why would anyone wish to use SQL or MySQL to build and work with data cubes when MOLAP tools are available?
  1. First, SQL is easily understood and widely used by a vast majority of IT professionals
  2. Second, MySQL is a free and open-source product that is used in almost every web application
  3. Third, SQL is supported in multi-machine, clustered environments like Hadoop/Hive and Spark and so this technique can be used -- at least in principle -- to support data cubes built with ultra large data sets.
Unless one wants the bells and whistles that come along with most MOLAP products, MySQL is good enough for almost any OLAP activity and can be scaled up with Hive / Spark for very large data.

Acknowledgement : The technique demonstrated in this post has been adopted from information provided at http://www.artfulsoftware.com/infotree/qrytip.php?id=78

June 30, 2017

Quantum Computers

Quantum mechanics is a subject that has the strange property of simultaneously being logically rigorous and yet completely counterintuitive. So much so, that even a towering intellect like Einstein could never bring himself to accept its principles even though products based on the same exist all around us. The earliest oddity, identified by Schrodinger, one of the founders of quantum mechanics, is about a hypothetical cat that is neither dead nor alive until someone actually observes it. A similar oddity is that of quantum entanglement, where the behaviour of one particle is instantly affected by the behaviour of another particle, however distant it may be -- an example of “spooky” action-at-a-distance. Explaining these phenomena is beyond the scope and temerity of this article and so the reader would have to accept them here in good, almost religious, faith and carry on with the belief that such phenomena have been observed and explained by scientists under the most rigorous experimental circumstances.

Image borrowed from Quanta Magazine
Any programmable digital computer that we use, the desktop, the smartphone or the ones at Google, is based on a finite state machine (FSM). It can, at any instant of time, be in one of a large, but finite, number of well defined states. The state of an FSM is defined by the value stored in each of its memory locations and we know that these can either be 0 or 1. So an FSM with, say, 16 bits of memory could in principle be in any one of 2^16 states. Any instruction to the FSM changes the value of one or more bits and the FSM moves to a different state. An FSM along with the ability to read binary input, from an infinite tape, and write back on the same tape, is the Turing machine that is the theoretical basis of any modern computer.
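
The arithmetic here is easy to check. The sketch below, with hypothetical function names of my own, treats the entire memory of a 16-bit machine as a single integer: counting the states is just 2^16, and an "instruction" that flips one bit moves the machine from one state to another.

```python
N_BITS = 16  # size of the toy memory discussed in the text


def count_states(n_bits):
    """Each bit is 0 or 1, so n bits give 2**n distinct memory states."""
    return 2 ** n_bits


def flip_bit(state, position):
    """A toy instruction: flip one bit, moving the machine to a new state."""
    if not 0 <= position < N_BITS:
        raise ValueError("bit position out of range")
    return state ^ (1 << position)
```

With 16 bits, `count_states(N_BITS)` gives 65,536 possible states, and the machine is in exactly one of them at any instant.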

The fundamental principle of computer science is that the world is computable, meaning that any logically decidable problem can be represented and solved on a Turing machine and hence by extension on some, possibly very powerful, digital computer. This is the basis of our immense belief in computer technology that powers everything from smartphones to artificial intelligence. But even as long back as 1982, Richard Feynman had questioned this principle because he realised that Turing / FSM based computers could not solve the problem of simulating the movement of multiple particles whereas nature was doing it all the time! Did the quantum mechanical behaviour of nature mean that nature had a computing device that was inherently superior to the Turing machines built by classical computer technology? This is where the concept of a quantum computer was born.

A computer is a state machine whose state is defined by the collective states of each of its memory locations. In a classical computer, each memory location, or bit, can either be 0 or 1, certainly not both, but in a quantum computer it can be both 0 and 1 simultaneously -- very much like Schrodinger’s cat that was dead and alive at the same time! This is where the going gets really rough for anyone who has spent a lifetime in classical computer science because this is something that is completely counter-intuitive. A memory location, a bit, is a transistor, or switch, made of silicon that is either ON or OFF. How can it be both? Turns out that if you keep aside computer science and open your books on quantum mechanics, it is indeed possible that a body can be in two states at the same time based on the well established principle of quantum superposition. Now if we go back to our 16 bit classical computer with its 2^16 states and replace it with a quantum computer with 16 quantum bits, or qubits, of memory we have a machine that can be in 2^16 states simultaneously. If that is not mind-bending enough, all these 2^16 states will collapse into any one of the states as soon as we try to observe it. It is almost as if nature is playing a game with us, pretending to be classical whereas it is actually quantum.
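
For readers who think better in code, superposition and collapse can be mimicked on a classical machine. What follows is only a toy simulation under my own assumptions, not how a real quantum computer is programmed: the register is a list of 2^n amplitudes, an equal superposition gives every basis state the same amplitude, and "observing" it picks one state at random with probability equal to the square of its amplitude, after which only that state survives.

```python
import math
import random


def uniform_superposition(n_qubits):
    """Equal superposition: all 2**n basis states share the same amplitude."""
    dim = 2 ** n_qubits
    amplitude = 1 / math.sqrt(dim)
    return [amplitude] * dim


def measure(amplitudes, rng):
    """Observe the register: pick one basis state, then collapse onto it."""
    # The probability of each outcome is the squared magnitude of its amplitude.
    probabilities = [abs(a) ** 2 for a in amplitudes]
    outcome = rng.choices(range(len(amplitudes)), weights=probabilities, k=1)[0]
    # After measurement the register is entirely in the observed state.
    collapsed = [0.0] * len(amplitudes)
    collapsed[outcome] = 1.0
    return outcome, collapsed
```

Note that the simulation needs a list of 65,536 numbers just to describe 16 qubits, and the list doubles with every extra qubit, which hints at why classical machines struggle to imitate quantum ones.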

But why are we obsessed with this counter-intuitive phenomenon? Will it have a drastic improvement on existing digital computer technology? Not really. Your spreadsheet, email, YouTube, eCommerce, smartphone will hardly change but two things could. First, current cybersecurity systems, that are based on our inability to decompose integers into their prime factors in a reasonable amount of time, could be ripped apart by quantum computers, leaving all passwords vulnerable to hackers. Second, artificial intelligence could be taken to altogether new and unbelievable levels of sophistication. So quantum computers will soon have a very important role to play -- but how far away are we from real, practical systems?
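
To see why factoring is considered hard on classical machines, consider the most naive method, trial division, sketched below. This is an illustration of the classical difficulty only and has nothing to do with how a quantum computer would attack the problem: the loop may need roughly as many steps as the square root of the number, so every few extra digits multiply the work many times over. Practical classical methods are far cleverer than this but still scale badly for the very large numbers used in encryption.

```python
def prime_factors(n):
    """Factor n by trial division, trying every candidate divisor in turn."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = []
    divisor = 2
    # Only divisors up to the square root of what remains need to be tried.
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever is left over is itself prime.
        factors.append(n)
    return factors
```

A small product of two primes such as 3233 = 53 x 61 falls apart instantly, but the same loop applied to a number hundreds of digits long would not finish in any reasonable time.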

The biggest challenge is the construction of the physical memory locations and the complexity of the engineering problem is evident from the following : A modern IBM classical computer chip has anything between 2 and 7 billion transistors each of which can be ON or OFF. The corresponding IBM quantum computer chip, that powers the IBM Quantum Experience machine, has only 5, yes just 5, qubits of memory that can be in quantum superposition of ON and OFF. Why so? First, the memory locations have to be cooled to near zero Kelvin to exhibit their quantum superposition behaviour and if the cryogenic challenge was not enough, the second challenge is even bigger. Unlike the memory locations of classical computers whose state can be determined by sensing the presence or absence of an electrical voltage, the multiple, superimposed quantum states collapse as soon as any effort is made to observe them. This is as if a room has a house of cards that collapses as soon as the door is opened by the observer and the observer has to figure out what the house looked like by observing the disposition of the cards on the floor! Since the qubits can never be accessed directly, as in a classical computer with read and write statements, they can only be “influenced” indirectly.

To put things in perspective, ENIAC, one of the world’s first, 1st generation, vacuum tube based classical computers had 20 memory units, or accumulators, in 1945, and a 2nd generation, transistor-based computer from the University of Manchester had only 200 transistors in 1955. Since then we have moved through 3rd generation integrated chips and the current 4th generation of microprocessors have scaled up to billions of transistors thanks to the inexorable pressure of Moore’s Law. If we remember that even with its 20 memory units, ENIAC was used to solve problems in weather forecasting, atomic energy calculations and wind tunnel design, the current 5 qubit IBM machine does not look as hopeless, or helpless, as it seems to be.

But actually things are a little better off. D-Wave, a Canadian company that has been building quantum computers since 1999, came out with a 128 qubit machine in 2010, a 512 qubit machine in 2012 and a 1000 qubit machine in 2015. Initially there were some doubts about whether these were quantum machines at all but after these machines were actually installed and used first by Lockheed Martin at the University of Southern California and later at the Quantum AI Lab of NASA Ames Research Centre by a team from Google, these doubts have receded to a large extent. But even though some doubts persist, there is enough evidence of quantum behaviour or at least great promise that these doubts will be removed soon. In early 2017, D-Wave announced the sale of their first commercially available, $15 million, 2000-qubit machine to the cyber-security firm Temporal Defence Systems.

IBM’s 5-qubit Quantum Experience is positioned as a general purpose computer. It could be used for any computational task but would be efficient only if the program was designed to use quantum properties -- a colour TV is useful only if the broadcast is in colour. Very few programs can do this today but Shor’s algorithm, which factors large integers and could therefore be used to break widely used encryption, is definitely one such. D-Wave systems on the other hand are designed to solve one class of problems that minimise the weighted sum of a large number of interrelated, or entangled, variables. This may sound restrictive but the reason why everyone from Google to Temporal is interested is because this class of problems is similar to the ones that occur in artificial neural networks that lie at the heart of systems based on machine learning.
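
The kind of problem described here, minimising a weighted sum of interrelated variables, can be written down in a few lines. The sketch below is my own simplified illustration, not D-Wave's software or interface: each variable is 0 or 1, `linear` holds a weight per variable, `quadratic` holds a weight for each pair of variables that interact, and the classical solution shown simply tries every combination. All names and weights are hypothetical.

```python
from itertools import product


def energy(bits, linear, quadratic):
    """Weighted sum to be minimised: per-variable terms plus pairwise terms."""
    total = sum(weight * bit for weight, bit in zip(linear, bits))
    total += sum(w * bits[i] * bits[j] for (i, j), w in quadratic.items())
    return total


def brute_force_minimum(linear, quadratic):
    """Classical exhaustive search: try all 2**n assignments, keep the best."""
    candidates = product((0, 1), repeat=len(linear))
    return min(candidates, key=lambda bits: energy(bits, linear, quadratic))
```

With three variables there are only 8 combinations to try, but the count doubles with each additional variable, so exhaustive search quickly becomes hopeless. That growth is what makes a machine built specifically for this class of problem interesting.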

Spectacular progress in machine learning with artificial neural networks, using classical computers itself, is rapidly closing the gap between biological and nonbiological intelligence or even between carbon and silicon “life-forms”. With the advent of quantum computers one more crucial barrier between the natural world and its man-made, artificial model could break down -- as could the increasingly thin line that delineates man from machine. Will this drag man down to the level of machines? Or will these machines push man up towards his eventual union, or Yoga, with the transcendent omniscience that some refer to as God or Brahman?


This article originally appeared in Swarajya -- The magazine that reads India right!
