The Genie in the Bottle: How to Tame AI? Part 2

It has been two whole years since I posted ‘The Genie in the Bottle: How to Tame AI?’. Interestingly, it seems that quite a few experts in the field have lately used the analogy of AI as a genie in a bottle. I swear I really made it up myself two years ago! With start-ups, either some competitor is already working on the exact same idea as yours, or your idea is worthless. I suppose the same holds for analogies!

In that first post, I explained why AI should not be programmed with explicit objective functions, but should rather evolve alongside humans to avoid an AI-pocalypse. However, since the post was already becoming quite a long read, I decided not to go into more detail about how this could actually work. Since it has now been two years since my original post, I think it’s about time for a part 2!

AI learning alongside humans: easier said than done

Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s? – Alan Turing, 1950.

This quote, from the dawn of the field of AI, is often referred to as the first step towards machine learning. The recent breakthroughs in AI (including, of course, deep learning) are almost all in the field of machine learning, which sets them apart from the series of breakthroughs that preceded the first and second AI winters (to give some sense of timing: the first AI winter was in the early 70s, the second in the late 80s and early 90s). They differ from the first series of breakthroughs, where AI was explicitly programmed, and from the later machine learning, where models were built with very specific model structures (for example Bayesian networks). However, although the current breakthroughs are unexpected and amazing, we should not let ourselves be fooled into thinking this means we’re anywhere close to ‘AI that simulates the child’s mind’ as stated by Turing.

The current breakthroughs, especially in deep learning, mainly involve models with extremely little structure imposed by humans (more technically speaking: very many parameters that are estimated by the algorithm). As a result, these algorithms are both extremely data hungry and energy hungry. To put it into the right perspective: the reason these breakthroughs in AI are happening now is both the decreasing cost of computing (energy efficiency) and the explosion of data we’re currently experiencing. Neural networks (the type of model used in deep learning) are not new (1951) and it even seems backpropagation was already invented in 1970. It’s only because of the current explosion of computing power and data that these algorithms have become powerful. As an example, because of the enormous number of labeled pictures of cats and dogs on the internet, these deep neural networks (many layers, many parameters to be optimized by the algorithm and limited structure imposed by humans) suddenly perform very well.

why we shouldn’t overvalue reinforcement learning

What about AlphaZero, which can beat humans in Go, Chess and Shogi without any input data, just by playing against itself? This is an example of the breakthroughs in reinforcement learning, which of course benefit a lot from the decreasing cost of computing, but the point about the explosion of data being a driver doesn’t seem to hold in the case of reinforcement learning. Again, while these breakthroughs are amazing, we shouldn’t overestimate how much practical value they offer. Pure reinforcement learning can only be applied in cases where an algorithm can run simulations, or simply try things out, at very limited cost per run. The major examples of reinforcement learning are board games (like Go, Chess and Shogi) and video games (like the Atari games). Even in those cases, the cost of computing plays a large role. But in almost every practical case, you cannot put an algorithm into the world that just tries random things. Imagine doing this in the field of self-driving cars, healthcare, finance, marketing and business in general. There are some interesting applications in robotics, where the algorithm is trained in a simulation before it’s uploaded to the actual robot. In those cases, the simulation actually functions more as a kickstart for your algorithm, since you ideally also want the algorithm to continue learning once it is running on the robot. Both in the case of robotics and in all the other industries I mentioned, I believe the biggest value of the reinforcement learning breakthroughs lies in transferring these models and algorithms to settings that do learn from the current data explosion (not pure reinforcement learning). UPDATE: after I wrote this, DeepMind and OpenAI announced a research collaboration on Inverse Reinforcement Learning (or Imitation Learning), which aims to take models from pure reinforcement learning and make them learn from human feedback.

why the current level of AI is nowhere near child level

There are many interesting resources explaining why the current AI is nowhere near child level AI, or put differently, why Artificial Narrow Intelligence is not even close to Artificial General Intelligence. There are two points I want to highlight here. First, the fact that these algorithms are so data hungry is an important point. We humans need very little ‘training data’ obtained from the real world to get an understanding of the world around us and to be able to alter it (i.e. take actions to achieve something, perform certain tasks). Fortunately, when a manager hires an employee, he or she doesn’t need to demonstrate the task thousands of times before the employee can start to work. Children and even some animals are extremely efficient in learning from data if we compare them to our current algorithms. Wired has an interesting read about this topic here. Second, as Judea Pearl brilliantly explains, for example in this interview and in his book ‘The Book of Why’, current AI completely lacks any causal understanding. I can really recommend reading the book, because it clearly explains the problem: the direction of causality simply cannot be mined from data, since the exact same data can yield completely different conclusions depending on the causal model. We humans can easily understand that we cannot make good weather by forcing the barometer up, but this cannot be concluded purely from the data. To get any form of causal understanding, assumptions need to be made, or in other words, structure needs to be built into the models, which is fundamentally different from the current deep learning breakthroughs where as little structure as possible is imposed by humans.
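To make the barometer point concrete, here is a minimal sketch in Python. It is my own illustration, not taken from Pearl's book, and the probabilities are made-up assumptions. Two opposite causal models produce exactly the same observational data, yet they disagree about what happens when you force the barometer up.

```python
import math
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_GOOD_WEATHER = 0.5   # model A: how often the weather is good
P_HIGH_READING = 0.5   # model B: how often the barometer reads high
P_AGREE = 0.9          # how often weather and reading agree (good <-> high)

WEATHER = ("good", "bad")
READINGS = ("high", "low")


def joint_weather_causes_barometer():
    """Model A: weather -> barometer. Returns P(weather, reading)."""
    joint = {}
    for weather, reading in product(WEATHER, READINGS):
        p_weather = P_GOOD_WEATHER if weather == "good" else 1 - P_GOOD_WEATHER
        agrees = (weather == "good") == (reading == "high")
        p_reading = P_AGREE if agrees else 1 - P_AGREE
        joint[(weather, reading)] = p_weather * p_reading
    return joint


def joint_barometer_causes_weather():
    """Model B: barometer -> weather. Returns P(weather, reading)."""
    joint = {}
    for weather, reading in product(WEATHER, READINGS):
        p_reading = P_HIGH_READING if reading == "high" else 1 - P_HIGH_READING
        agrees = (weather == "good") == (reading == "high")
        p_weather = P_AGREE if agrees else 1 - P_AGREE
        joint[(weather, reading)] = p_reading * p_weather
    return joint


def observationally_equivalent(joint_a, joint_b):
    """True if both models assign the same probability to every observation."""
    return all(math.isclose(joint_a[key], joint_b[key]) for key in joint_a)


def p_good_weather_if_barometer_forced_high(model):
    """Probability of good weather after we push the needle to 'high' by hand."""
    if model == "weather_causes_barometer":
        # Forcing the needle cuts its link to the weather: nothing changes.
        return P_GOOD_WEATHER
    if model == "barometer_causes_weather":
        # If the barometer drove the weather, forcing it up would work.
        return P_AGREE
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    a = joint_weather_causes_barometer()
    b = joint_barometer_causes_weather()
    print("same observational data:", observationally_equivalent(a, b))
    for model in ("weather_causes_barometer", "barometer_causes_weather"):
        print(model, p_good_weather_if_barometer_forced_high(model))
```

No amount of passively collected data can tell the two models apart. Only the assumption about which way the arrow points decides what forcing the barometer will do.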

the danger of the assumption free illusion

A last side note on this topic: I sometimes hear experts saying things like ‘our AI is completely assumption free’. Not only do I think any remark of the sort is quite unintelligent, I actually think it’s extremely dangerous to create the illusion, intended or not, that something like assumption free AI could ever exist. It’s as much as saying your AI tries every possible model, which is in fact impossible because there are infinitely many possibilities, and far too inefficient to even attempt in practice. More importantly, structure (and therefore assumptions) needs to be put into models for them to be able to generalize. Otherwise, the world only has to undergo a small but fundamental change to become a version the algorithm has never seen before in the data, which will make the algorithm completely useless. An interesting example is slightly changing the Atari video game ‘Breakout’ for a deep reinforcement learning agent that was trained on it, as can be seen in this video. A less harmless example would be stock exchange crashes and flash crashes (like those seen on cryptocurrency exchanges). Prior to the 2008 financial crisis, housing prices had consistently been going up. Real estate was considered by both humans and algorithms to be an investment that couldn’t go wrong. They couldn’t have been more wrong, but there was no way to tell from the data alone. In flash crashes, an avalanche effect is triggered by human behavior and accelerated by algorithmic trading. There is simply no data about such market behavior until it happens. This cannot be solved by ‘assumption free AI’; in fact, assumptions are needed to prevent these things from happening. I’d rather say that the overconfidence in ‘assumption free models’ is the reason these things happen. As Pedro Domingos noted in his book ‘The Master Algorithm’, ‘People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world’.

Interestingly, the 1968 movie 2001: A Space Odyssey already gave us an example of this kind of AI behavior. When HAL 9000, a state-of-the-art spaceship computer, wrongly claims a communications component is broken while in fact it is working perfectly, it still insists it cannot be mistaken because none of the 9000 series has ever been wrong about anything. In the sequel, the original source of the problems is identified: before the start of the mission the government tampered with the explicitly programmed objectives, illustrating the danger of explicitly programmed objective functions, which I explained in part 1.

The Paradox

Coming back to the topic: to prevent an AI-pocalypse, AI should learn alongside humans and not be programmed with specific objectives. This is what I refer to as democratizing AI. Today we see clear examples of why it is dangerous if companies, governments and politicians have too much influence on how algorithms work and can bend these algorithms to serve their own agendas. We’ve seen doubtful practices in political campaigning on social media, government hackers continuously working on altering algorithms to change the world population’s world view, and companies like Facebook holding extremely powerful tools to keep their reputation clean with the world population when facing crises. Forget nuclear weapons: effectively influencing global public opinion is power in its purest form. Imagine Zuckerberg facing the Cambridge Analytica scandal, knowing that if users leave Facebook en masse the company will be doomed. Even adjusting the revenue forecast to be a little less optimistic resulted in 120 billion dollars of market value evaporating in a single day. How tempting is it to rub that bottle and ask the genie to please make sure consumers will stay with Facebook? Even though the genie is still very weak, already today it’s impossible to fully understand all the side-effects of asking such a question. If you cannot even control the spread of hate speech in your own network to start with, how can you ever fully grasp the impact of manipulating the objectives? Research by the New York Times shows that Facebook management already seems unable to grasp crucial aspects of how people are working on fulfilling management’s wishes, so how could they ever do so if AI were fulfilling their wishes?

so what should it look like?

Let’s say you want to buy a house. There is this AI that can find the perfect house for you, bid the exact right price and arrange everything concerning the mortgage and contracts. Would you want to let this AI automagically buy a house for you? Probably not. Instead of letting the AI do the decision making, you would only want it to help you, leaving the decision making up to you. An important part is that you would want to understand any important consequences of this decision. You could compare this to how we nowadays already use the internet, search and mobile devices to empower our decision making process, but with much stronger empowerment. It is like a real estate agent who explains everything you need to know but leaves the decision making up to you. In the positive scenario towards superhuman intelligence, AI will increasingly empower individuals in a democratic way in their professional and personal lives.

the current problem

In the current situation, the problem is that companies and organizations are the ones that directly influence consumer behavior, opinion and, in the end, decision making. Whether it’s companies or organizations that pay Google and Facebook for advertising, or companies like Amazon that try to sell products or services, the main business model is to influence the decision making of consumers very directly. China is already regulating freemium games to protect consumers from these games having too much influence on their decision making (spending loads of money on content in addictive games). But why don’t we just give every person a private AI? This is where the problem of current AI being extremely data hungry kicks in. If we had AI at the child level like Turing imagined all those years ago, it would be no problem to let this private AI learn alongside you during your life. However, the current breakthroughs rely completely on massive amounts of data generated by many individuals, which is exactly why companies like Facebook and Google are in such a strong position. To them it’s no problem to open-source the technologies they invested so heavily in, like TensorFlow and Torch, because they are in a unique position to use them to their full power. To democratize AI, making AI less data hungry plays an important role. However, there might be another interesting direction.

directions for a technical solution

One of the more recent areas of development that could be of use here is transfer learning. Transfer learning focuses on reusing components of trained models to train models with a different objective. A nice example you can try yourself is retraining the top layer of an image recognition neural network to classify new categories (flowers) using TensorFlow.
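To show the idea without any library, here is a minimal sketch in plain Python. It is not the TensorFlow flowers example itself. The 'pretrained' weights, the toy task and all function names are assumptions for illustration. The lower layer is frozen and only a new top layer is trained for the new objective.

```python
import math

# Stand-in for the lower layers of a pretrained network (hypothetical weights).
# They are reused as-is and never updated while retraining.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]


def frozen_features(x):
    """Lower layers: fixed ReLU features computed from the raw input."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_top_layer(examples, epochs=200, lr=0.5):
    """Fit only the new top layer (logistic regression) on the frozen features."""
    weights = [0.0] * len(FROZEN_WEIGHTS)
    bias = 0.0
    for _ in range(epochs):
        for x, label in examples:
            feats = frozen_features(x)
            pred = sigmoid(bias + sum(w * f for w, f in zip(weights, feats)))
            error = pred - label  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


def predict(top_layer, x):
    """Probability of the new class, using frozen features plus the new top layer."""
    weights, bias = top_layer
    feats = frozen_features(x)
    return sigmoid(bias + sum(w * f for w, f in zip(weights, feats)))


if __name__ == "__main__":
    # Toy 'new task': label 1 when the first input is larger than the second.
    examples = [([2, 0], 1), ([1, 0], 1), ([3, 1], 1),
                ([0, 2], 0), ([0, 1], 0), ([1, 3], 0)]
    top = train_top_layer(examples)
    print(predict(top, [4, 1]), predict(top, [1, 4]))
```

Because only the handful of top-layer weights are learned, a few labeled examples are enough. That is what makes this approach attractive when you do not have Google-scale data.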

A possible solution for democratizing AI could be to use transfer learning, by turning a few top level layers of the neural network into private layers. These private layers are owned by the individual and make sure the individual is in control of the actual decisions being made, giving protection against other actors that might try to influence those decisions directly. The lower levels of the neural network can be trained across the many individuals who tap into these algorithms to fuel their personal top layers. These lower levels of the neural network could be public or owned by companies that actively develop these components of the model.

This might be a bit abstract, so let me give a simplified example to illustrate how this would work. Let’s take the example of funny cat videos. Some people enjoy them a lot while others might not care that much about them. So if we show cat videos and other videos, and consumers can explicitly like or dislike these videos, they can train their own private algorithm to recommend the cat videos. It does so by making use of a model that was trained by all the users who tap into this model with their personal top layers. What these lower layers of the model do could be considered object detection across all the videos, while the private top layers determine whether this specific user is interested in these topics or not. The interesting thing is that the user keeps full control over what he or she wants to see by explicitly training the model, and the lower layers of the model would be rewarded for providing relevant information to the private top layers to perform this task. An individual would not have to personally identify to use these lower layers; he or she just needs to train these layers a bit while using them and then pass the result back to the owner(s) of the lower layers. The beautiful thing would be that the lower layers can build superior video classification algorithms, trained most strongly in those classes that are most relevant for consumers, while keeping the actual preferences, objectives and data of the consumer private.
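The cat video example could be sketched roughly as follows, again in plain Python. Everything here is hypothetical: the topic list, the function and class names, and the fact that a 'video' is just a dictionary of precomputed topic scores. The step where users help train the shared lower layers is left out, so the sketch only shows the split between a shared model and private top layers.

```python
import math

# Topics the shared lower layers are assumed to detect (hypothetical).
SHARED_TOPICS = ("cat", "cooking", "football")


def shared_lower_layers(video):
    """Stand-in for the shared model: returns one score per topic.

    A real system would run object detection on the video. Here a video is
    simply a dict of precomputed topic scores, which is an assumption.
    """
    return [video.get(topic, 0.0) for topic in SHARED_TOPICS]


class PrivateTopLayer:
    """Per-user weights that stay with the user and are never shared."""

    def __init__(self):
        self.weights = [0.0] * len(SHARED_TOPICS)
        self.bias = 0.0

    def score(self, video):
        """Estimated probability that this user wants to see the video."""
        feats = shared_lower_layers(video)
        z = self.bias + sum(w * f for w, f in zip(self.weights, feats))
        return 1.0 / (1.0 + math.exp(-z))

    def feedback(self, video, liked, lr=0.5):
        """Explicit like/dislike: one logistic-regression update step."""
        feats = shared_lower_layers(video)
        error = self.score(video) - (1.0 if liked else 0.0)
        self.weights = [w - lr * error * f for w, f in zip(self.weights, feats)]
        self.bias -= lr * error


if __name__ == "__main__":
    cat_video = {"cat": 1.0}
    cooking_video = {"cooking": 1.0}

    cat_lover, cook = PrivateTopLayer(), PrivateTopLayer()
    for _ in range(50):
        cat_lover.feedback(cat_video, liked=True)
        cat_lover.feedback(cooking_video, liked=False)
        cook.feedback(cat_video, liked=False)
        cook.feedback(cooking_video, liked=True)

    print("cat lover:", cat_lover.score(cat_video))
    print("cook:", cook.score(cat_video))
```

Both users call the exact same shared function, but the preferences live only in their own `PrivateTopLayer` objects, so the provider of the lower layers never sees who likes what.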

This would be quite a simple example, but many other functionalities could possibly also fit into this framework, like searching the web and doing text analyses of all your emails, messages and real life conversations. Queries could be much more advanced, like ‘please give me a summary of the problems discussed in the last meeting and the proposed solutions’. But it could just as well be used for generating results and suggestions like pre-typed emails, messages and voice, or more advanced things like explaining and advising on anything you need to know to buy a house. The private top layers would be able to combine different models to form the ideal personal assistant for the individual. Moreover, they could select the best options out of the lower layers offered by different providers of these algorithms. This could create a competitive system for providers of these algorithms, who are already rewarded for being used because their models get trained in the process. Of course, consumers could be in control of excluding providers that they don’t want to use, and providers could earn money with premium versions of the lower layer models. This could be especially interesting for versions developed for specific professions. As an example, if you’re a developer you could pay for an algorithm you can plug into that does most of the developing for you, while still letting you make every important decision, without sharing any data on what exactly you’re doing with the providers of the algorithm.

What’s important is that we should give much more explicit feedback to our private AI about how it should behave and what exactly we do and do not find useful in its behavior. This is absolutely needed to build strong private top layers and make such a system useful. As an example, Google search is nowhere near specific enough in receiving feedback on its usefulness, since you cannot tell it whether you found the information or succeeded in what you wanted to achieve in a way you liked. Instead it’s driven by its cost per click model and by what advertisers want to achieve from a consumer (for example placing an order on the website), rather than by the consumer explicitly training the model on its usefulness. Similarly, Facebook offers users only limited ways to continuously train their personal algorithm: the training is implicit in liking and following some topics and pages, rather than based on continuous feedback on what users truly want when using Facebook. Moreover, Facebook is also listening to what advertisers want to achieve instead of giving consumers full control. At the very core of democratizing AI is rewarding AI for truly helping the individual’s decision making; the current challenge is that rewards are given by third parties (companies and organizations).

It won’t be easy. It might even take one or more additional AI winters before we get there. But I do believe this to be a possible scenario towards a bright AI future. Let’s hope either some of the tech giants change directions towards democratizing AI or some start-up will successfully disrupt the current industry by doing so.