The Art of the Dataset
In an oriental fable, the sons of King Serendippo travel a distant road. They come across a man who has lost a camel. Is it, they ask him, blind on one side, carrying two skins on its back, one of wine and one of oil, and ridden by a pregnant woman? Yes, the man replies: so they have seen it? No, they have not. The man arrests the brothers, accusing them of theft.
The brothers explain how they came to describe the missing camel. Along the road, the grass is eaten only on one side of the track, and it is the more barren side: the animal is blind on one side. There are traces of oil on one side of the track and of wine on the other: they know the animal carries two skins, one on each flank, and what each holds. Just off the track are the footprints of someone who has dismounted and been to the toilet, and a handprint in the dust where she has pushed herself back up: the sign of a woman bearing such weight that she needed her hand to stand – so she is pregnant.
This is the discovery of causes from observations and clues: detective work. The story led Horace Walpole to coin ‘serendipity’. It poses an important question for those searching for a true artificial intelligence: is this reasoning unique to the brothers, or could an algorithm be programmed to draw the same conclusions from the same dataset? For an AI to truly learn, it should be able to generate new knowledge or insight from data.
The answer depends on how data becomes conclusions: what process carries us from one to the other. How do the brothers turn the clues into knowledge, and how do we know that knowledge is true?
Probably the simplest way of drawing conclusions from data is induction: learning general facts from experience. It suits artificial intelligence well, because the more data there is, the more conclusions or rules we can generate. A well-trained AI should be able to process ever more experience and turn it into rules – knowledge.
But induction has limitations: because it learns only from experience, it can miss whatever has not yet been experienced. It is constrained by its own evidence – the empirical constraint. Learning from data or experience alone can produce dangerous errors, as Hume said: “induction requires us to believe that instances of which we have had no experience resemble those of which we have had experience”. A famous example, often attributed to Bertrand Russell: if you are a turkey on a farm, every day at 9am you are fed. Example piles upon example, and by December all available data suggests that every day at 9am the farmer will arrive with food. Induction would create a rule: feeding always happens at this time. But on 24th December the farmer comes and, instead of feeding you, cuts your throat. The inductive rule was false: it was only a regularity that happened to hold in the sampled data. The data is misleading, and if an AI created a rule from this dataset, the outcome would be dangerous.
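As a minimal sketch of the turkey's predicament (the data, the rule and the outcome below are invented purely for illustration), a naive inductive learner generalises confidently from its sample and fails on the one day that matters:

```python
# Toy illustration of the turkey problem: induce a rule from observed data,
# then apply it to a day outside the sample. All data here is invented.

observations = [(day, "fed at 9am") for day in range(1, 358)]  # roughly 1 Jan to 23 Dec

def induce_rule(data):
    """Naive induction: if every observed day had the same outcome, adopt it as a rule."""
    outcomes = {outcome for _, outcome in data}
    return outcomes.pop() if len(outcomes) == 1 else None

rule = induce_rule(observations)        # "fed at 9am"
prediction_for_24_december = rule
actual_on_24_december = "slaughtered"   # nothing in the sample hinted at this

print(prediction_for_24_december)                           # fed at 9am
print(prediction_for_24_december == actual_on_24_december)  # False
```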
So induction alone cannot be enough to generate true knowledge or learning from data; you must be able to look outside the dataset. A more robust way of generating conclusions is deduction: a simple, truth-preserving mechanism that logically draws an inference from a proposition or rule. For example:
All the apples from this bag are red
These apples are from this bag
Therefore these apples are red
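That syllogism can be sketched in a few lines of code (the representation is invented for illustration): the conclusion is simply the rule applied to the fact, and nothing outside the premises is needed.

```python
# Truth-preserving deduction: apply a general rule to a particular fact.
# The premises mirror the syllogism above; the encoding is invented.

def colour_of_apples_from(bag):
    """Premise 1 (rule): all apples from this bag are red."""
    return "red" if bag == "this bag" else "unknown"

these_apples_bag = "this bag"  # Premise 2 (fact): these apples are from this bag

# Conclusion: follows mechanically; if the premises are true, it cannot be false.
print(colour_of_apples_from(these_apples_bag))  # red
```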
Deduction is logical and cannot contradict itself. But the real world rarely delivers propositions this comprehensive. In a rules-based system such as a game, all possible rules can be learnt, and perfect machine deduction can beat human intelligence, as DeepMind proved with AlphaGo, which beat the strongest human players at Go.
But a problem arises when knowledge outside the system is required, or when information is incomplete. Sometimes what is needed is, as with the sons of King Serendippo, to establish causation from data: this is a primary function of AI. The problem is that induction is flawed, and deduction cannot generate knowledge without certainty and clear rules.
So how do the brothers understand causation from the data they have?
This knowledge is not derived from a rule, as a deduction is: the situation is too particular for any general rule. The brothers see a series of clues and, applying their understanding of broader context (outside of the data), use the clues to generate a narrative. They abstract a possible narrative or hypothesis, and then test that hypothesis against further information.
This form of knowledge generation is called abduction. An abductive inference is generated not by the application of a rule to a fact, but by the conjecture or imagining of a (possibly incorrect) explanation of an outcome.
This is hard to fit into an artificial intelligence system. A conjecture is not a random guess, but an uncertain though plausible hypothesis, generated by reason, knowledge of context, and the fitting of clues to possibility. It is an educated guess. Abduction, by its nature, might be wrong – so it cannot be deduction, because it is not truth-preserving.
Abduction works because it abstracts. It involves a thinking-through of possible explanations and a crude fitting of those possibilities to the data that is provided. The explanation might be rejected, refined, or replaced, and through this refining process, it comes to be a conclusion that is knowledge. It might be that the core of human knowledge comes from this process of pulling together different clues to generate an explaining narrative: it is our hunter ancestors squatting over tracks and droppings, and, from a series of clues, abstracting the narrative of prey. It starts with conjecture.
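A very rough sketch of that refinement loop, assuming we could enumerate a handful of candidate explanations and score them against the clues (the clues, hypotheses and scoring below are invented for illustration, and real abduction is nothing like this tidy):

```python
# Toy abduction: conjecture candidate explanations, keep the one that best
# fits the observed clues, discard or refine the rest. Everything is invented.

clues = {"grass eaten on the barren side", "oil traces", "wine traces",
         "handprint beside footprints"}

hypotheses = {
    "an unladen camel passed": {"grass eaten on the barren side"},
    "a merchant spilled his goods": {"oil traces", "wine traces"},
    "a half-blind camel, laden with oil and wine, ridden by a pregnant woman, passed": {
        "grass eaten on the barren side", "oil traces", "wine traces",
        "handprint beside footprints"},
}

def fit(explained, observed):
    """Score a hypothesis by how many observed clues it accounts for."""
    return len(explained & observed)

best = max(hypotheses, key=lambda h: fit(hypotheses[h], clues))
print(best)  # the conjecture explaining the most clues: plausible, not proven
```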
The brothers could not have deduced the narrative they built from the clues they picked up: the possible explanations are too numerous, and no rules exist to govern a situation so specific. But they can generate an explanation of the outcome through conjecture.
Would an algorithm be able to produce this explanation? An algorithm could use induction to generate rules from the data, but it could not offer explanations that come from outside the data. An algorithm could use deduction to apply rules to datasets, but no rule could lead to ‘a camel passed this way’, because there are any number of possible explanations for the data. Deduction can only work in limited circumstances. Abductive inference, by contrast, often involves choosing from a vast pool of possible options, and does so by applying all lived experience to consider the likely scenarios: it returns us to conjecture as the basis on which causes are considered (though not necessarily concluded).
However much data an algorithm can hold and process, there is still the leap from general rules to individual cases. For humans this is difficult, and it often takes a lifetime of experience to come up with explanations for specific things. If abductive inference involves making a guess, however educated, then it is hard to see how an ML algorithm is better placed to make it than a human – at least at this stage.
An AI can generate possibilities but not narratives: the TarGuess algorithm creates candidate passwords from likely scenarios and then tests each one against security software – but that produces an answer to a binary question, not a narrative. Abductive inference allows us to do both, and given that it is not a purely logical process, it seems the need for interaction between humans and artificial intelligences will remain for some time yet.
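The generate-and-test pattern described here can be sketched generically (this is not TarGuess itself; the hint list, generator and verifier below are invented stand-ins): each conjecture gets only a yes or no back, never an explanation.

```python
# Generic generate-and-test loop: propose candidates from contextual hints,
# then ask a verifier for a yes/no answer. Illustrative only, not TarGuess;
# the generator patterns and the target below are invented for this sketch.

def generate_candidates(hints):
    """Invented generator: combine personal hints into a few likely patterns."""
    for hint in hints:
        yield hint
        yield hint + "123"
        yield hint.capitalize() + "!"

def verifier(candidate):
    """Invented stand-in for the system under test; returns only True or False."""
    return candidate == "Rover!"

hints = ["rover", "1984", "arsenal"]
match = next((c for c in generate_candidates(hints) if verifier(c)), None)
print(match)  # one binary outcome per guess: no explanation, no narrative
```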