In a previous post I argued for scalability as the guiding star for AI research. But how do we optimise for scalability? One way is to bias our research according to soft constraints (or, if you will, scalability principles) that maximise the chance that our algorithms will scale beyond the limits of what we can experiment with today, ensuring they will make effective use of the greater computational resources we will have in the future.
The first such principle is to focus on experiential learning rather than in-built domain knowledge. In an increasingly digital world, we can expect both the amount of data and our ability to process it to continue to grow. We need to make sure our algorithms' performance scales as well, and that they can leverage increasing amounts of data as they become available. For instance, we should ensure that the cost of processing the data scales at most linearly with the amount of data, as anything super-linear is going to become a bottleneck in the future. Gaussian Processes, for instance, have O(N³) time complexity and O(N²) memory complexity, so their scalability with experience is limited. The cost of training neural networks, in contrast, scales better with dataset size. Furthermore, we need to ensure that any prior knowledge or inductive bias we inject into our system can be overridden as more data becomes available. Prior domain knowledge is often an alluring source of short-term progress, but can slowly constrain progress by encoding our preconceptions about how we think a problem should be solved. We must trust experience to lead us beyond our preconceptions towards better solutions and faster progress (cf. Rich Sutton's "Bitter Lesson").
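To make the complexity contrast concrete, here is a minimal sketch (illustrative code, not from the post) comparing exact Gaussian Process regression, where the kernel solve costs O(N³) time and the kernel matrix O(N²) memory, with stochastic gradient training of a simple linear model, where each pass over the data costs O(N):

```python
import numpy as np

def gp_posterior_mean(X_train, y_train, X_test, lengthscale=1.0, noise=0.1):
    """Exact GP regression: solving against the N x N kernel matrix costs
    O(N^3) time and O(N^2) memory, so doubling the data ~8x the compute."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))  # O(N^2) memory
    alpha = np.linalg.solve(K, y_train)                        # O(N^3) time
    return rbf(X_test, X_train) @ alpha

def sgd_linear_fit(X, y, lr=0.1, epochs=100):
    """Stochastic gradient descent on a linear model: the per-example
    update cost is constant, so training cost grows linearly with N."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            w += lr * (y_i - x_i @ w) * x_i  # O(1) per example
    return w
```

The point of the sketch is the asymptotics, not the models themselves: the GP must materialise and solve a system that grows quadratically and cubically with experience, while the gradient-based learner touches each example at constant cost.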
A second useful constraint is referred to as the aperture principle (credit: Joseph Modayil). It is composed of two parts: (1) agents should be assumed to be tiny compared to the environment they are embedded in. This is trivially true for physically embodied agents, where the environment can be thought of as the entire physical universe. (2) These tiny agents should be expected to interact with their environment through an even tinier interface, perhaps just a few sensors to perceive their surroundings. Part (1) means that we must not assume agents are able to ingest and process a complete description of the true state of the whole environment; this is important for scaling to large, complex problems. Part (2) goes a step further and notes that in most problems of interest the full state of the environment will be inherently out of reach, regardless of the agent's inner complexity or the amount of data it can process.
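The interface shape that part (2) describes can be sketched in code. Below is a hypothetical gridworld (all names and sizes are illustrative, not from the post): the environment holds a state of thousands of cells, but the agent's only window onto it is a four-sensor local observation, and that aperture stays fixed no matter how large the environment grows:

```python
import random

class GridEnvironment:
    """A toy environment whose full state is never exposed to the agent."""
    def __init__(self, size=100, seed=0):
        rng = random.Random(seed)
        self.size = size
        # Full state: size * size cells (10,000 by default).
        self.grid = [[rng.random() for _ in range(size)] for _ in range(size)]
        self.pos = (size // 2, size // 2)

    def observe(self):
        """The 'aperture': a 4-sensor reading of the neighbouring cells,
        not the full grid."""
        r, c = self.pos
        return tuple(self.grid[(r + dr) % self.size][(c + dc) % self.size]
                     for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)])

    def step(self, action):
        """Move up/down/left/right; the agent only ever gets back
        another tiny observation, never the state itself."""
        dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][action]
        r, c = self.pos
        self.pos = ((r + dr) % self.size, (c + dc) % self.size)
        return self.observe()
```

An agent built against this interface must learn from a stream of four-number observations; nothing in its design can rely on seeing, or even representing, the environment's true state.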
A third important principle is to be inspired, but never constrained, by human intelligence. Human intelligence is an obvious source of inspiration for AI researchers, as it is the most impressive demonstration that flexible goal-oriented behaviour is possible. However, we must remember that the specifics of human intelligence are the by-product of a rich evolutionary history, in which we had to contend with incentives and limitations that might not be relevant to an artificial intelligence. We already mentioned the incredible energy efficiency of the human brain and how this might not necessarily be a requirement for AI, but there are other, subtler constraints that biology imposed on human intelligence that we may want to leave behind. One example is the extremely low-bandwidth communication channels that exist between humans: originally just words, movements and facial expressions. To appreciate the importance of relaxing this constraint, consider the impact of just one improvement in communication among humans: printing. Printing massively increased the bandwidth of communication and the speed at which experience and knowledge could be shared between humans, by making it possible to broadcast from a single individual to countless others in parallel; the resulting impact on human progress is hard to overstate. Even with printing, however, each receiver remains tragically limited in the speed at which they can process what is shared. Machines need not be constrained by these human limits: they can easily share experience, predictions and knowledge of all kinds; we should exploit, not fear, our ability to interconnect artificial intelligences.