We are not close to general AI, but there does seem to be a clear trend towards increasing generality in machine learning methods over the past 10 years. Generality means that the same algorithms can be applied to many different problems. This short blog post (with many citations omitted – maybe I can add some later) describes my position: that we are making real progress towards general AI, i.e. agents with a high level of intelligence (perhaps superhuman, perhaps sub-, but still high) across many tasks and situations.
(I published a version of this and have updated it a couple of times – previous versions are on GitHub.)
Machine learning was always about creating algorithms, such as decision trees, into which we could plug “any” data set – they were generic in that sense. But there were always several respects in which they weren’t generic:
Importantly, narrow approaches to ML made amazing progress on “hard” problems like chess and non-linear regression (where typical adult human performance is pretty weak, unaided), but were weak on “easy” problems like recognising a face or speaking a language, things accomplished with little difficulty by three-year-old children. This is Moravec’s paradox. A related effect is that every AI task successfully completed by a machine, so far, turns out to have been accomplished by narrow methods rather than by general AI methods. And so the common reaction “oh, that’s not AI, it’s just brute-force computation” is correct – so far. “AI is whatever hasn’t been done yet.” (This is sometimes called the AI effect.)
What has changed?
It is now more common to use the same algorithm for multiple tasks, e.g. gradient boosted trees are near the state of the art for both regression and classification. This unifies these tasks. They’re also fairly agnostic to independent variable types (binary, categorical, ordinal and continuous).
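As a concrete (if toy) illustration – using scikit-learn as a stand-in for any gradient boosting implementation, with synthetic data – the same algorithm family handles both tasks:

```python
# A minimal sketch: the same boosting algorithm, applied to regression and to classification.
from sklearn.datasets import make_regression, make_classification
from sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier

X_r, y_r = make_regression(n_samples=200, n_features=10, random_state=0)
reg = GradientBoostingRegressor().fit(X_r, y_r)      # continuous target

X_c, y_c = make_classification(n_samples=200, n_features=10, random_state=0)
clf = GradientBoostingClassifier().fit(X_c, y_c)     # categorical target

print(reg.predict(X_r[:3]), clf.predict(X_c[:3]))
```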
Image, audio, and other sensor-type data can now be processed using convolutional networks which implicitly learn representations suitable as input to the same types of regression and classification algorithms already mentioned. These supervised learning algorithms can in fact be placed “on top of” the convolutional network and trained end-to-end, that is, using a single architecture and a single training algorithm. This unifies image, audio, and other sensor data with structured data. Performance is at or beyond human-level for some constrained tasks.
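A minimal end-to-end sketch, with an architecture invented purely for illustration, might look like this in PyTorch:

```python
# An illustrative end-to-end stack: a convolutional feature extractor with a classifier
# on top, trained with one loss and one optimiser.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                     # the classifier "on top of" the conv layers
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 1, 28, 28)                      # a dummy batch of 28x28 single-channel images
y = torch.randint(0, 10, (8,))                     # dummy class labels
opt.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()                                    # gradients flow through the whole stack
opt.step()
```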
Natural language can be treated using recurrent networks and embedding methods, which again make the data suitable for input to a standard algorithm. These representations do not retain all aspects of the original language, but examples such as modern neural network-based translation show that they capture enough of language structure and semantics to be useful. Again, such models can now be trained end-to-end. This unifies language data with the other types.
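Again as a toy sketch (the vocabulary, sizes and two-class task are invented for the example), an embedding layer and a recurrent network can feed a standard classifier head, trained as one model:

```python
# Token embeddings feed a recurrent network, whose final state goes into a standard
# classifier; the whole stack is trained end-to-end.
import torch
import torch.nn as nn

class TextClassifier(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=64, hidden_dim=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_classes)

    def forward(self, token_ids):                    # token_ids: (batch, sequence_length)
        _, (h, _) = self.rnn(self.embed(token_ids))
        return self.head(h[-1])                      # last hidden state -> class scores

model = TextClassifier()
scores = model(torch.randint(0, 5000, (4, 20)))      # 4 dummy sentences of 20 tokens each
```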
Although unsupervised learning algorithms have existed for a long time, modern algorithms can use unlabelled data to learn representations suitable for supervised tasks. This unifies supervised and unsupervised learning. The combination even allows one-shot learning of concepts, something humans seem to be able to do after they have done a lot of repetitive learning in early childhood. With one-class classification, we can learn individual concepts well enough for binary classification using only unlabelled data. The unsupervised plus supervised idea has also given us models capable of being retargeted from one domain to another with little retraining, such as convolutional nets trained on generic datasets which can easily be re-used for specialised image recognition tasks, and the game-playing models which can learn to play many different Atari games. These represent progress, but not the final step, towards models which can address multiple tasks without re-training.
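Re-using a pretrained convolutional net for a new task is now a few lines of code; a sketch using the recent torchvision API (the five-class specialised task is hypothetical):

```python
# A minimal transfer-learning sketch: a conv net trained on a generic dataset (ImageNet)
# is re-used by swapping its final layer.
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights="IMAGENET1K_V1")    # pretrained on a generic dataset
for p in backbone.parameters():
    p.requires_grad = False                            # freeze the learned representations
backbone.fc = nn.Linear(backbone.fc.in_features, 5)    # new head for a specialised 5-class task
# ...then train only the new head (or fine-tune the whole net) on the specialised data.
```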
Another angle on unlabelled data is active learning, that is, a type of semi-supervised learning where there are few or no labels initially, but the algorithm can choose points for which to query labels from an oracle (e.g. from a human operator). It then updates with the new information, carries on predicting, and considers what point to query next, e.g. to maximise its information gain at each step. Kalyan’s company are doing this. This is a halfway step between supervised learning and true reinforcement learning, because the act of querying for a label is an action, in RL terms, but it’s of a very limited type.
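A minimal active-learning loop, using simple least-confidence sampling (scikit-learn; the oracle function and the assumption that the seed set contains both classes are mine, for illustration):

```python
# The model repeatedly picks the unlabelled point it is least sure about and queries
# the oracle for its label.
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning(X_pool, oracle, n_queries=20):
    # start from a tiny seed set (assumed, for simplicity, to contain both classes)
    labelled = list(np.random.choice(len(X_pool), size=4, replace=False))
    y = {i: oracle(X_pool[i]) for i in labelled}
    model = LogisticRegression()
    for _ in range(n_queries):
        model.fit(X_pool[labelled], [y[i] for i in labelled])
        confidence = model.predict_proba(X_pool).max(axis=1)
        confidence[labelled] = np.inf        # never re-query an already-labelled point
        i = int(confidence.argmin())         # the point the model is least sure about
        y[i] = oracle(X_pool[i])             # the "action": ask the oracle for a label
        labelled.append(i)
    return model
```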
Generative tasks such as producing realistic speech audio from a required text are not far off human-level in constrained settings. Image generation is also possible. Some generative tasks, such as transferring artistic style from one image to another, can be done at a level I think is probably attainable by trained artists, but far beyond the skill of most humans. For generation of raw text, we are seeing some very good applications like automated email answering, but in other areas, like story generation, the results are perpetually disappointing.
Optimisation methods use the concept of the objective, a measure of how good a candidate solution is. Most ML methods use some kind of optimisation, with an objective function, in their training. Reinforcement learning (RL) is different, because there isn’t a direct measure of how good things are – just a very opaque reward signal, which can be noisy, lack a gradient, or even have an indefinite delay between good actions and positive rewards. Perhaps we can frame an objective as a well-behaved special case of a reward signal, unifying these two ideas. To retain generality between the two we now need to abandon ideas like gradient descent which rely on white-box access to the objective function. But we can train our agents using, e.g., evolutionary strategies, which already balance exploration and exploitation. They seem more likely to scale to training neural networks than typical RL training methods like epsilon-greedy, which I guess (but I don’t know) don’t scale. Another angle on this unification: researchers “pretend” that classification datasets are bandit datasets, since bandit datasets aren’t easily available.
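To make the black-box point concrete, here is a sketch of a single evolution-strategy update over a flattened weight vector (numpy; the hyperparameters are arbitrary, and reward_fn is whatever black-box reward the environment provides):

```python
# No gradients of the reward are needed – only the ability to evaluate it.
import numpy as np

def es_step(theta, reward_fn, pop_size=50, sigma=0.1, lr=0.01):
    noise = np.random.randn(pop_size, theta.size)            # one perturbation per population member
    rewards = np.array([reward_fn(theta + sigma * n) for n in noise])
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    grad_estimate = noise.T @ advantages / (pop_size * sigma)
    return theta + lr * grad_estimate                        # move towards better-rewarded perturbations
```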
RL methods have accomplished some impressive feats, and not just in a narrow, “cheaty” way. Yudkowsky writes “Looking over the paper describing AlphaGo’s architecture, it seemed to me that we were mostly learning that available AI techniques were likely to go further towards generality than expected, rather than about Go being surprisingly easy to achieve with fairly narrow and ad-hoc approaches. Not that the method scales to AGI, obviously; but AlphaGo did look like a product of relatively general insights and techniques being turned on the special case of Go, in a way that Deep Blue wasn’t.” AlphaGo Zero has added another key generalisable skill, training by self-play. It’s closely related to adversarial training, as in generative adversarial networks, and to coevolution, dating back to Samuel’s checkers research in the 1950s and 60s. This is important because it seems likely that one of the biggest driving factors in the evolution of human intelligence beyond that of other mammals was not the need to survive in an ice age and outwit bears or tigers, but to survive and outwit each other – not only in fighting and hunting, but in joke-telling, coalition-building, and sexual politics. Social competition is the reason why predicting the stock market is an endlessly difficult problem which will never be fully solved, whereas image recognition is a problem of bounded difficulty which will be fully solved in a decade or two. AGI doesn’t require us to solve the stock market, but it can take advantage of social-competitive forms of training.
Finally, RL research now takes advantage of the representations mentioned above, including convolutional networks, recurrent networks, and embeddings, to represent environment data.
In principle, it is possible to create an RL agent with a single architecture including these components, which learns and competes against itself and others in a problem-solving environment, using unlabelled data, with queries for labels as one of its possible actions, and solves regression and classification problems implicitly as part of its overall sense-act-reward loop.
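Purely as an illustration of that loop – every name here (the environment, the oracle, the agent’s methods) is a hypothetical stand-in, not an existing API – querying for a label becomes just one more action the agent can choose:

```python
# The point is only that "ask for a label" sits alongside ordinary actions in the
# sense-act-reward loop; the supervised sub-problem is handled implicitly by the agent.
QUERY_LABEL = "query_label"

def run_agent(agent, env, oracle, n_steps=1000):
    obs = env.reset()
    for _ in range(n_steps):
        action = agent.act(obs)                       # a policy over ordinary actions plus label queries
        if action == QUERY_LABEL:
            label = oracle.label(obs)                 # querying the oracle is itself an action...
            agent.observe_label(obs, label)           # ...feeding an implicit supervised sub-problem
            reward, next_obs = -0.01, obs             # a small cost for bothering the human
        else:
            next_obs, reward, done = env.step(action)
            if done:
                next_obs = env.reset()
        agent.update(obs, action, reward, next_obs)   # the ordinary sense-act-reward loop
        obs = next_obs
```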
The RL paradigm is defined by an agent in an unknown environment, with sensors, actuators, and a reward signal – a setting sufficiently general for artificial general intelligence to live in. In other words, a sufficiently good RL agent would be an AGI. Therefore, this “grand unification” of reinforcement learning with supervised and unsupervised paradigms, and of regression and classification tasks, and of many different types of data representation and methods of training, will represent an important step towards true AGI, and I think this step is possible in principle today.
All of the ingredients mentioned above were made to work only in the last 10 years or so, and their implications are still being worked out, so I regard this as work in progress – that is, there is work left to do, but there has been real progress.
There are other paradigms of AI not yet unified in the above. The paradigm of optimisation includes areas like goal programming, linear and integer programming, gradient-based optimisation, metaheuristics (evolutionary search and friends) and constructive heuristics. Several of these are used or could be used in ML – SVMs are trained using quadratic programming, for example. But they wouldn’t be suitable, I think, for training our type of agents. I’ve already mentioned the idea of using metaheuristic search to train NNs. But this is not really a unification. Such an agent would use an ES during training, but it wouldn’t have access to an ES module for everyday use, the way we do (just as an NN which carries out multiplication internally can’t be used as a calculator without training). How could we accomplish that? An open question. However, when we see ideas like learning to learn by gradient descent by gradient descent and learning to optimize with reinforcement learning, we see the first glimmers of architectures which can themselves invent and incorporate new machine learning methods.
The NFL theorems are no obstacle to the proposal of a single model performing well on many different tasks. The first NFL theorem (“for supervised learning”) says that regression and classification models can only do well on a test set if it is drawn from the same distribution as the training set. Our agent isn’t exactly a supervised learning model, but insofar as it might carry out implicit internal supervised learning, the assumption of a stationary or slow-changing environment is the same assumption humans make with every action in our lives. The second NFL theorem (“for search and optimisation”) says that in black-box search across all possible objective functions on a given search space, no search algorithm out-performs random search. Well, what is our search space? It’s the set of all possible sets of neuron weights in whatever NN architecture we choose. What is our objective function? It’s something like to maximise the expected reward over time – a black-box function as already stated. (By the way, typical supervised learning models don’t use a black-box objective – they use an objective like a sum of squared errors, which is quite the opposite of black-box – so the second NFL doesn’t even come near to applying to them.) But of course the vast majority of possible objective functions on our search space are of no interest, since they (for example) award totally different objective values to highly similar network weight configurations which end up acting in exactly the same way. So an algorithm is free to excel on all functions of interest.
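For reference, the second theorem can be stated roughly as follows (this is Wolpert and Macready’s formulation): for any pair of search algorithms $a_1$ and $a_2$,

$$\sum_f P(d^y_m \mid f, m, a_1) = \sum_f P(d^y_m \mid f, m, a_2),$$

where the sum runs over all possible objective functions $f$ on the search space and $d^y_m$ is the sequence of objective values seen after $m$ distinct evaluations – i.e. averaged over all objectives, no algorithm beats any other.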
Some commenters argue that increasing computing power will inevitably lead to AI, for example Kurzweil takes advantage of exponential trends in computing (Moore’s law, and similar) as a cornerstone of his argument. Many AI researchers dismiss arguments based purely on computing power, arguing instead that progress in “software” – in models, representations, learning algorithms, datasets, task specifications, and so on – is what is required. (Others argue that exponential trends cannot continue, either for a priori reasons or because of physical limits.) I think the focus on software improvements is correct, and I have argued that such improvements are happening, though certainly more are required. However, a separate line of argument supports the claim that improvements in raw compute power are relevant. It relates to AIXI, a general RL agent. In principle, AIXI is already a super-human AI – it’s just not computable. There are approximations to AIXI, including AIXItl and Monte Carlo Tree Search AIXI, but they’re not good enough yet. However, computing power will make these approximations better, and no new improvements in software are required for this. This alone suggests that increasing computing power is relevant to whether and when AGI is possible.
But perhaps there is a missing ingredient for true intelligence – such as consciousness. Some argue that consciousness is not required for intelligent behaviour, or that consciousness emerges in any sufficiently complex system, or any sufficiently complex system acting with goals in the world. There seems no way to answer this question for now. However, some research relevant to a functional implementation of consciousness has been done, as follows.
As a first step, consciousness requires the ability to introspect. Conscious agents would need to be able to sense their own internal state. This is not the same as saying that the state is readable as part of the agent’s computation, which is already trivially true. (Similarly, a neural network, which carries out a million internal additions and multiplications at every iteration, would need to be trained to map a pair of inputs to their sum, and will do so imperfectly.) Instead, the point is that the agent’s state can be sensed and processed using the same machinery which is already used to sense and process external state, hence taking advantage of pattern recognition and other learned functionality there. In particular, the ability to create a theory of mind of others could lead to a theory of mind of the self: a brain looks down, sees a body doing stuff, tries to explain it, ends up conscious. Another precursor is the ability to detect agency in others and in this context I like the hypersensitive agency detection device because of the algorithmic/ML way of thinking: “It would have been far less costly for our ancestors to have detected too much agency in their environment than too little.” Some work related to theory of mind has been done, e.g. on agents who read each other’s source code, so perhaps a next step would be to turn this ability in on itself. Whether this type of approach would eventually lead to “true” consciousness, with qualia and all – if such a thing exists – may be moot.
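As a purely speculative sketch of what “sensing the self with the same machinery” might mean architecturally (all names and sizes are illustrative, and this is not a claim about how consciousness works):

```python
# The agent's own internal state is appended to its external observation, so the same
# learned machinery that senses the world also senses the self.
import torch
import torch.nn as nn

class IntrospectiveAgent(nn.Module):
    def __init__(self, obs_dim=32, hidden_dim=64, n_actions=4):
        super().__init__()
        self.hidden = torch.zeros(hidden_dim)                # internal state, carried across steps
        self.core = nn.GRUCell(obs_dim + hidden_dim, hidden_dim)
        self.policy = nn.Linear(hidden_dim, n_actions)

    def act(self, obs):
        # the previous internal state is fed through the same input channel as the observation,
        # so it is "sensed" by the same weights, not merely present in the computation
        sensed = torch.cat([obs, self.hidden.detach()])
        self.hidden = self.core(sensed.unsqueeze(0), self.hidden.unsqueeze(0)).squeeze(0)
        return self.policy(self.hidden)

agent = IntrospectiveAgent()
action_scores = agent.act(torch.randn(32))
```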
The ability to “simulate” scenarios in the mind and “watch them play out”, rather than undertaking tests in the real world, seems crucial to an agent trying to live in a dangerous world. “Let my ideas die in my stead.” Further, a “self-adversarial” type of simulation in which an agent proposes an idea to itself, and then simulates itself attacking that idea, would take advantage of the key insights of self-play (AlphaGo) and the social competition aspect described earlier.
Another important aspect of human conscious thought is attention. Our multiple senses, including our introspective sensing of our own mind, are not always given equal prominence in every decision we make and action we take. Attention is a mechanism for focussing on the right data. Again, recent progress has been made in taking advantage of attention mechanisms in machine learning. Separately, one theory of the evolution of consciousness is all about attention, and contra what I wrote above, suggests that self-models came before models of others: “the attention schema first evolved as a model of one’s own covert attention. But once the basic mechanism was in place, according to the theory, it was further adapted to model the attentional states of others, to allow for social prediction.” But thereafter, “300 million years of reptilian, avian, and mammalian evolution have allowed the self-model and the social model to evolve in tandem, each influencing the other”.
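The core of such an attention mechanism is only a few lines (scaled dot-product form, in plain numpy, purely for illustration):

```python
# Each query spreads a limited budget of "focus" over the available inputs and
# returns a weighted summary of them.
import numpy as np

def attention(queries, keys, values):
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])       # relevance of each input to each query
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)            # softmax: the focus sums to 1
    return weights @ values

q = np.random.randn(3, 8)      # 3 queries
k = np.random.randn(10, 8)     # 10 inputs to attend over
v = np.random.randn(10, 16)
print(attention(q, k, v).shape)   # (3, 16)
```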
Introspection is probably a prerequisite for self-modification. This is where it gets a bit scarier. What would self-modification mean? In the ideal case, it would mean the agent gradually improving its own architecture and re-training for improved speed and performance. Focussing just on performance, this means that the agent improves its own architecture in order to achieve larger cumulative rewards in future. How would self-modification work? We don’t really understand how goals change or stay stable under self-modification. (One example: if an agent has a goal encoded inside itself, and open to self-modification, then of course the agent is in danger of wireheading – giving itself indefinite quantities of reward at every time-step, regardless of its actions or the state of the environment. We would want to prevent this somehow – this “somehow” might seem like a cop-out, but I think that if wireheading is one of the significant remaining obstacles, then we’re pretty close to AGI!)
In a way, I think self-modification is about the transition from stimulus-response and System 1 thinking to more System 2 thinking. If an agent introspects on recent sense-act-reward loops, and pattern-matches in a way that generalises and so allows improved action choices in future, that amounts to a move from stimulus-response (maybe System 1-type) cognition to a meta level – reflective, “thinking” about thinking, and perhaps going towards System 2-type cognition.
As already argued, I think self-improving AI agents are possible in principle, and if agents are at human level in many other respects then self-improvement seems plausible. It’s not worthwhile to think about the singularity – it turns a lot of people off, and in a way it’s irrelevant. Instead, just think about human-level AGI agents capable of self-improvement. IJ Good suggests that the result would be an “intelligence explosion” – self-improvement happening in cycles, quickly outpacing humans in speed and intelligence. No exponential curves have to go vertical for this to be a scary idea. I agree with Bostrom and others that AI motivations will not be inherently positive for humanity, and I agree with MIRI and others that kill switches, AI boxes, and other obvious AI safety measures are fallible. We don’t really know how to specify goals aligned with human values, and we certainly don’t know how to keep a set of goals consistent through cycles of self-modification. I suppose that the risk of catastrophic outcomes of self-improving AI with goals uncontrolled by humans is small, but not negligible. Therefore, I think that research into AI safety is needed.