A New Look at Hume’s Theory of Probabilistic Inference
Mark Collier
Hume Studies Volume 31, Number 1, April 2005, pp. 21–36

1. Hume’s Theory of Probabilistic Inference

Historians of philosophy do not usually take Hume’s theory of probabilistic inference seriously. For some scholars, Hume’s account is dismissed because of its misguided reliance upon psychological rather than logical methods.1 Others are more sympathetic to Hume’s naturalistic approach, but regard the specific proposals of his positive account as hopelessly naïve.

If his contributions are to be judged as part of the empirical science of man . . . then his ‘results’ will appear ludicrously inadequate, and there will be no reason to take him seriously.2

Still others are willing to defend many of Hume’s positive proposals, but single out his account of probabilistic inference as “unsatisfactory”3 and “dubious.”4 In this paper, I challenge these disparaging assessments. I argue that Hume’s theory of probabilistic inference is neither misguided nor inadequate; quite the contrary, it stands at the leading edge of our contemporary science of the mind.
Mark Collier is Assistant Professor of Philosophy, University of Minnesota, Morris, MN 56267 USA. e-mail: mcollier@morris.umn.edu

Hume agrees with Leibniz that previous philosophers have been “too concise when they treat of probabilities” (T Abs.4; SBN 647; cf. EHU 6.4; SBN 59). In section 1.3.12 of the Treatise, he attempts to remedy this situation by laying out a theory of “conjectural or probable reasonings” (T 1.3.12.20; SBN 139). It is important to be clear at the outset that Hume is interested primarily in a psychological rather than a metaphysical approach to probability. Hume does in fact subscribe to a particular metaphysical interpretation of probability, according to which it is nothing but a reflection of our ignorance concerning hidden causes, but this position is clearly of secondary interest and is not one that he defends at any length. His main concern is to explain how we manage to make predictive inferences under conditions of uncertainty, and for this issue, questions about the metaphysical nature of probability are idle; our philosophical interpretations of probability, he maintains, have no influence on how we carry out probabilistic inferences in our everyday lives.

When causes are not followed by their usual effects, the vulgar take this as an indication of “contingency” in the cause, by virtue of which the same cause can sometimes produce different effects. In contrast, philosophers retain their commitment to the causal principle, and explain away putative counter-examples in terms of the “secret operation of contrary causes.”

The vulgar, who take things according to their first appearance, attribute the uncertainty of events to such an uncertainty in the causes, as makes them often fail of their usual influence, tho’ they meet with no obstacle nor impediment in their operation.
But philosophers observing, that almost in every part of nature there is contain’d a vast variety of springs and principles, which are hid, by reason of their minuteness and remoteness, find that ’tis at least possible the contrariety of events may not proceed from any contingency in the cause, but from the secret operation of contrary causes. (T 1.3.12.5; SBN 132)

Nevertheless, philosophers who reject causal indeterminacy have no choice but to rely upon probabilities for guidance in their everyday lives.

But however philosophers and the vulgar may differ in their explication of the contrariety of events, their inferences from it are always of the same kind, and founded on the same principles. (T 1.3.12.6; SBN 132; cf. EHU 6.4; SBN 58)

When philosophers make decisions or predictions under conditions of uncertainty, they must make probabilistic calculations just like the vulgar.

Hume’s primary concern in T 1.3.12 involves the nature of our commonplace probabilistic inferences. What types of sensory perceptions lead us to make them? What degrees of belief do they generate? Which faculties of the mind enable us to draw such inferences? Hume regards these as empirical questions, and in order to make progress on them, he turns to the resources of his science of human nature. His strategy is to show that probabilistic inferences are a species of inductive inferences, and therefore can be explained in terms of the “same principles” (T 1.3.11.1; SBN 124). In order to properly understand Hume’s theory of probabilistic inference, then, we must briefly review his psychological explanation of induction in T 1.3.6.

Hume begins his psychological explanation of induction by describing the behavior of his fellow men: we have a tendency to make inductive inferences whenever we observe a conjunction between two types of events (T 1.3.6.2; SBN 87). Following Stroud, we can reconstruct Hume’s description in the following terms.
Inference from Experience
Past Experience (PE): Previously observed A-type events have been followed by B-type events.
Present Impression (PI): X is an A-type event.
Future Expectation (FE): X will be followed by a B-type event.

In other words, Hume discovers the following psychological fact about human beings: each time we witness sensory perceptions such as (PE) and (PI) in the above schema, we come to have the type of expectation described in (FE).

In the next step of his investigation, Hume attempts to explain this fact by drawing on his theory of the imagination. His hypothesis is that our capacity to make inferences from experience depends upon the interaction of the sensory information registered in (PE) and (PI) with associative principles of the imagination.

When the mind, therefore, passes from the idea or impression of one object to the idea or belief of another, it is . . . determin’d . . . by certain principles, which associate together the ideas of objects, and unite them in the imagination. (T 1.3.6.12; SBN 92; cf. EHU 5.2–5; SBN 41–3)

The faculty of imagination is governed by three laws of association: contiguity, resemblance, and causation (T 1.1.4.1–4; SBN 11–12). The task of Hume’s psychological explanation is to show that these minimal resources are all that is needed in order to explain why we make the inductive inferences that we do.

It is the principle of resemblance, according to Hume’s hypothesis, that accounts for why we assimilate the present impression (PI) to the previously observed event types (PE).

In reality, all arguments from experience are founded on the similarity, which we discover among natural objects, and by which we are induced to expect effects similar to those which we have found to follow from such objects. (EHU 4.20; SBN 36; cf. T 1.3.6.14; SBN 93)

Hume borrows here from the theory of general ideas that he developed in T 1.1.7.
On that account, the imagination automatically categorizes objects and events according to the class of instances towards which they have the highest degrees of resemblance (T 1.1.7.15; SBN 23). We classify the event token (PI) as an A-type event, in other words, because it is more similar to A-type events than any other event types in memory.5

The principle of resemblance accounts for why we categorize the event token as an A-type event, but it does not explain why we anticipate that it will be followed by a B-type event. In order to deal with this further fact, Hume appeals to another law of association, the principle of causation. The principle of causation states that whenever we repeatedly observe the relation of contiguity between event types, they will become connected in the imagination. Thus, since A-type events have always been followed by B-type events, the principle of causation entails that we will infer from the A-type event (PI) to the B-type event (FE).

Hume’s official psychological explanation of induction, then, is that the principle of resemblance explains why the event token (PI) is assimilated to the A-type events in memory (PE), and the principle of causation explains the expectation of a B-type event (FE). Hume’s project in T 1.3.12 is to demonstrate that these same associative principles can be put to work in order to explain how we make probabilistic inferences.

The probabilities of causes are of several kinds; but are all deriv’d from the same origin, viz. the association of ideas to a present impression. (T 1.3.12.2; SBN 130)

Hume begins his examination of probabilistic inferences, as he did with inductive inferences, with an observation concerning the common behavior of human beings. He notices that we typically make probabilistic inferences whenever we perceive inconstant conjunctions between events.
Moreover, he observes that when we perceive inconstant conjunctions, our future expectations are accompanied by partial degrees of belief; we say that it is “likely” or “probable” that they will co-occur in the future.6

In the next step of his investigation, Hume attempts to explain why inconstant conjunctions give rise to partial degrees of belief. His strategy once again is to appeal to his theory of the imagination; his goal is to show that the various species of probabilistic inference can be explained, without remainder, in terms of the elementary principles of association.

The first species of probabilistic inference, according to Hume, occurs whenever the conjunction between events in (PE) involves a small sample (T 1.3.12.2; SBN 130–1). We can characterize this type of probabilistic inference in the following terms.

Inference from Small Sample
PE: A small sample of previously observed A-type events have been followed by B-type events.
PI: X is an A-type event.
FE: X has a minimal likelihood of being followed by a B-type event.

When we observe one event follow another repeatedly, but not extensively, we are willing to infer one from the other, but we do so with hesitation; as Hume puts it in the Enquiries, “it is only after a long course of uniform experiments in any kind, that we attain a firm reliance and security, with regard to a particular event” (EHU 4.20; SBN 36). Hume maintains that the associative principle of causation explains the fact that our inferences from small samples are characterized by relatively low degrees of belief.

As the habit, which produces the association, arises from the frequent conjunction of objects, it must arrive at its perfection by degrees, and must acquire new force from each instance, that falls under our observation.
The first instance has little or no force: The second makes some addition to it: The third becomes still more sensible; and ’tis by these slow steps, that our judgment arrives at a full assurance. (T 1.3.12.2; SBN 130)

According to the principle of causation, the strength of the association between events is a function of the frequency of their co-occurrence. Hume’s associationist hypothesis, therefore, predicts that our assurance will gradually increase in proportion to the size of the sample and that small samples would generate relatively low levels of confidence.

The second species of probabilistic inference involves cases where the sample in (PE) is large, but where the present impression in (PI) has a partial resemblance to the events in memory.7 Hume refers to such cases as “probability deriv’d from analogy” (T 1.3.12.25; SBN 142). He has in mind the following type of inference from experience.

Inference from Analogy
PE: Previously observed A-type events have been followed by B-type events.
PI: X partially resembles an A-type event.
FE: X will likely be followed by a B-type event.

Hume maintains that inferences from analogy can be explained in terms of the principles of association. The principle that does the explanatory work in this case is the associative law of resemblance.

[I]n the probability deriv’d from analogy, ’tis the resemblance only, which is affected. Without some degree of resemblance, as well as union, ’tis impossible there can be any reasoning: but as this resemblance admits of many different degrees, the reasoning becomes proportionally more or less firm and certain. (T 1.3.12.25; SBN 142)

The principle of resemblance entails that inferences from analogy will be attended with varying levels of uncertainty.
The crucial point is that the principle of resemblance “admits of many different degrees.” The stronger the similarity between past and present events, then, the more inductive confidence we will have in our future expectations. Since the degrees of belief in (FE) are proportional to the degree of resemblance between (PI) and (PE), it follows that whenever we observe partial resemblances, our degrees of belief will be partial as well.

Hume has shown that his associationist psychology can account for how we ordinarily make inferences from small samples and partial resemblances. He manages to do so, we have seen, because of the flexibility of the principles of association. There are two ways in which the conjunction between events can be inconstant: the quantity of the sample can be small or there can be qualitative variation among its instances. The principles of the imagination explain how we make probabilistic inferences in each case, Hume maintains, because the strength of the association between events will vary in proportion to the constancy of the conjunction. As Hume puts it, “[i]f you weaken either the union or resemblance, you weaken the principle of transition, and of consequence that belief, which arises from it.”8 Hume’s associationist hypothesis not only explains how we make inferences from experience in such cases, then, but it also explains why we do so with varying levels of confidence.

Hume’s associationist hypothesis faces a more difficult challenge, however, with the third species of probabilistic inference. The third species of probabilistic inference involves cases where the conjunction between events in (PE) is composed of mixed frequencies, or what Hume calls “contrariety” (T 1.3.12.4–19; SBN 131–8).
’Twou’d be very happy for men in the conduct of their lives and actions, were the same objects always conjoin’d together, and we had nothing to fear but the mistakes of our own judgment, without having any reason to apprehend the uncertainty of nature. But as ’tis frequently found, that one observation is contrary to another, and that causes and effects follow not in the same order, of which we have had experience, we are oblig’d to vary our reasoning on account of this uncertainty, and take into consideration the contrariety of events. (T 1.3.12.4; SBN 131)

We can represent inferences from mixed frequencies in terms of the following schema.

Inference from Mixed Frequency
PE: Some previously observed A-type events have been followed by B-type events, and some previously observed A-type events have been followed by C-type events.
PI: X is an A-type event.
FE: X will likely be followed by a B-type event or a C-type event.

Hume’s examples of such “irregular” conjunctions typically involve medical cases. Sometimes rhubarb proves a purge and sometimes it does not; sometimes opium puts one to sleep and other times it does not (EHU 6.4; SBN 57–8). Once again, philosophers do not regard such irregularities as violations of the causal principle, but merely as a reflection of our ignorance concerning the real causes at work; nevertheless, “[o]ur reasonings . . . and conclusions concerning the event are the same as if this principle had no place” (T 1.3.12.25; SBN 142). That is, when philosophers must decide whether or not to ingest rhubarb or opium, they have no choice but to rely upon mixed frequencies in order to calculate the probability that these medicines will prove to be effective cures.

In order to explain how we make inferences from mixed frequencies, Hume once again turns to the resources of his science of human nature.
Let us suppose, for simplicity’s sake, that A-type events have been followed by B-type events four times, and A-type events have been followed by C-type events three times. According to Hume’s exemplar-based theory of general ideas, this frequency information will be represented in memory in terms of separately stored instances.

AB = {a1 b1, a2 b2, a3 b3, a4 b4}
AC = {a1 c1, a2 c2, a3 c3}

What will happen, then, the next time we observe an A-type event? Hume claims that there are “two hypotheses” concerning the manner in which we “transfer” these event sequences from memory to our future expectations.

First, That the view of the object, occasion’d by the transference of each past experiment, preserves itself entire, and only multiplies the number of views. Or, secondly, That it runs into the other similar and correspondent views, and gives them a superior degree of force and vivacity. (T 1.3.12.19; SBN 138)

The first hypothesis, in other words, is that we transfer all the particular events that have been associated with A-type events in the past. If this were the case, then the content of our future expectation would consist of a disjunctive list of event tokens.

(FE) = {b1 v b2 v b3 v b4 v c1 v c2 v c3}

Hume maintains that we need only introspect, however, in order to recognize the implausibility of this hypothesis. Experience informs us that our future expectations consist “in one conclusion, not in a multitude of similar ones” (T 1.3.13.25; SBN 142). Moreover, it is implausible on theoretical grounds to maintain that the mind can represent, at one time, a long list of events; as Hume puts it, the disjunctive list of events would usually be “too numerous to be comprehended distinctly by any finite capacity” (T 1.3.13.25; SBN 142). When we make inferences from mixed frequencies, then, it must be the case that the “similar views run into each other, and unite their forces” (T 1.3.13.25; SBN 142).
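The difference between the two hypotheses can be made concrete with a short sketch. It is an illustration only: the list `MEMORY`, the function names, and above all the reading of “unite their forces” as simple relative frequency are assumptions introduced here, since Hume himself gives no formula.

```python
from collections import Counter

# Hypothetical encoding of the stored instances: four A-B sequences
# and three A-C sequences, as in the example above.
MEMORY = [("A", "B")] * 4 + [("A", "C")] * 3


def multiply_views(memory, cue):
    """First hypothesis: each past successor of the cue is carried over separately."""
    return [effect for cause, effect in memory if cause == cue]


def unite_views(memory, cue):
    """Second hypothesis (assumed reading): concurring instances are pooled
    into one graded expectation, here taken to be the relative frequency."""
    counts = Counter(multiply_views(memory, cue))
    total = sum(counts.values())
    return {effect: n / total for effect, n in counts.items()}


separate_views = multiply_views(MEMORY, "A")   # seven distinct tokens
general_view = unite_views(MEMORY, "A")        # one summary over B and C
```

On this encoding, the first hypothesis yields a list of seven separate successors for an A-type event, while the second yields a single graded expectation of 4/7 for a B-type event and 3/7 for a C-type event.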
The only plausible hypothesis, in other words, is that we perform a summary computation when we transfer event sequences from memory. The separately stored instances, as Hume puts it, are united by the principles of the imagination into a “general view” (T 1.3.12.17; SBN 137).

When we transfer contrary experiments to the future, we can only repeat the contrary experiments with their particular proportions; which cou’d not produce assurance in any single event, upon which we reason, unless the fancy melted together all those images that concur, and extracted from them one single idea or image, which is intense and lively in proportion to the number of experiments from which it is deriv’d, and their superiority above their antagonists. (T 1.3.12.22; SBN 140; cf. EHU 6.3; SBN 57)

Through this process of amalgamation, similar events combine their strengths and contrary events cancel each other out. As a result, a composite representation of the frequency information will be formed in the imagination. Moreover, when we observe another A-type event, our future expectations will be proportional to the relative frequencies with which B-type and C-type events have followed A-type events in the past.

Hume does not provide any precise account of the processes whereby the separately stored event sequences are brought together into a single representation. The problem is that the principles of the imagination do not supply him with the resources to do so. The principle of resemblance accounts for the classification of the event token in (PI), but it does not explain why we develop the future expectations that we do. The principle of causation explains why the A-type event (PI) gives rise to the future expectation of B-type or C-type events, but it does not explain how we manage to form a single probability estimate that is proportional to the mixed frequencies in memory.
Indeed, the principle of causation would appear to support the first hypothesis concerning the transference of particular event sequences from memory; after all, each of the particular event tokens has been associated with A-type events in the past.

Barry Gower is correct to point out, then, that “it is hard to see how to account for any probabilities arising from insufficiency of evidence in terms of an associationist psychology.”9 Hume recognizes this difficulty, and it leads him to lapse into metaphorical talk of particular event tokens “melting together” into composite representations. Hume glimpses, somewhat darkly, that the solution to this problem must involve summary computations whereby the mixed frequencies stored in memory are combined into a unified representation. But Hume does not offer a sufficient explanation of the combinatorial process through which “we extract a single judgment from a contrariety of past events” (T 1.3.12.8; SBN 134).

This lacuna in Hume’s theory of probabilistic inference has led Hume scholars to turn to non-associationist resources in order to explain how inferences from mixed frequencies are performed. The dominant tendency is to reconstruct these inferences in terms of the Carnap-Reichenbach Straight Rule or the mathematical rules of Bayesianism.10 Others interpret the combination of mixed frequencies in terms of a theory of mental oscillations; on this account, probabilities are measured by the amount of time it takes to survey the various event sequences.11 But these interpretations unnecessarily leave behind the spirit and letter of Hume’s account.
Hume makes it quite clear, after all, that the combination of mixed frequencies involves “an operation of the fancy” (T 1.3.12.22; SBN 140).12 What is needed in order to defend Hume, in a manner consistent with his general approach to human nature, is a precise associationist account of how particular event sequences in memory can be combined into a single representation. And as we shall see in the next section, such an account is now available to us. Contemporary associative theories provide the resources that are needed in order to explain how single probability estimates can be performed on the basis of mixed frequencies.

2. Recent Evidence for Hume’s Theory of Probabilistic Inference

In the Introduction to the Treatise, Hume promises to ground his science of human nature on “careful and exact experiments” (T Intro.8; SBN xvii). In practice, however, Hume’s experimental methods appear substandard when compared to those of his contemporaries, such as the physicists in the Royal Society.13 Hume was aware of the laboratory experiments being performed in the physical sciences; he merely thought that they could not be applied to the human sciences.

Moral philosophy has, indeed, this peculiar disadvantage, which is not found in natural, that in collecting its experiments, it cannot make them purposely, with premeditation, and after such a manner as to satisfy itself concerning every particular difficulty which may arise. When I am at a loss to know the effects of one body upon another in any situation, I need only put them in that situation, and observe what results from it. But should I endeavor to clear up after the same manner any doubt in moral philosophy, by placing myself in the same case with that which I consider, ’tis evident this reflection and premeditation would so disturb the operation of my natural principles, as must render it impossible to form any just conclusion from the phenomenon.
(T Intro.10; SBN xix)

It simply never occurred to Hume that he need not perform these experiments on himself, and that he could make use of experimental subjects (such as undergraduates) who would carry out the tasks without “premeditation.” Of course, one can easily excuse Hume for this oversight, since as Daniel Robinson points out in Toward a Science of Human Nature, it was not until the nineteenth and twentieth centuries that psychologists developed the rigorous methods with which we are familiar today.14

One must concede the point, then, that “Hume had no way of empirically testing his hypothesis.”15 We need not speculate about how well his hypothesis would have held up under examination, however, since contemporary psychologists have devised an experimental paradigm with which to test it. In these experiments, known as probability learning tasks, subjects are presented with frequency information about the co-occurrence of events, and are asked to estimate the degree to which these events are related.16 These subjective ratings are then compared with a normative standard, called “contingency,” which measures the actual co-variation between the events.17

There are three important results of the probability learning task experiments. First, the experimental findings demonstrate that subjects are extremely sensitive to the degrees of contingency in the data.18 In experiment after experiment, the subjective ratings of the relation between events correspond quite closely to their actual co-variation.
Second, the experimental results show that contingency learning proceeds in gradual fashion; the ratings typically start close to zero, and increase in small steps until they correspond to the objective degree of contingency in the data.19 Finally, the probability learning tasks reveal that the contingency ratings depend upon the degree of resemblance between the events presented during training; the more similar the events, the more strongly they become associated.20

These experimental studies provide confirmation, then, for Hume’s claim that we adjust our degrees of belief according to the resemblance, contrariety, and sample size of the events that are perceived. But do they also support Hume’s contention that our capacity to make probabilistic inferences can be exhaustively explained in terms of the principles of association? The probability learning experiments demonstrate that subjects proportion their degrees of belief to the evidence, but they do not tell us how they manage to do so. The psychological experiments provide us with precise measures of the sensory input (frequency information and stimulus similarity) and behavioral output (confidence ratings), but they remain silent about the psychological processes that underlie performance in the tasks.

The most influential explanation of contingency learning in contemporary psychology is the Rescorla-Wagner Model.21 According to this model, probabilistic learning can be analyzed in terms of a competitive learning rule that modifies associative weights on a trial-by-trial basis.22 The Rescorla-Wagner Model has proven extremely effective in accounting for the results of the probabilistic learning task experiments.23 First, a learning rate parameter in the model predicts that the learning curves observed in the experiments will be gradual in nature.
Second, the model can explain, through stimulus-generalization, why the degree of resemblance between events plays an important role in the subjective ratings.24 Third, the competitive nature of the learning rule entails that subjects will adjust their probability estimates according to the contrariety between events. Indeed, it has been demonstrated that, when there are two variables, the Rescorla-Wagner learning rule is mathematically equivalent to the measure of contingency.25 When the Rescorla-Wagner learning rule modifies associative strengths on a trial-by-trial basis, therefore, it implicitly calculates the co-variation between events.

The probability learning experiments can also be explained, at the mechanistic level, in terms of adaptive neural networks. Gluck and Bower demonstrate that two-layer connectionist networks can simulate the behavior in the experiments.26 In addition, David Shanks shows that these networks exhibit excellent fit to the learning curves described in the experimental results; the networks, like the subjects, begin with no sensitivity to the contingency between events, but improve on a trial-by-trial basis, until eventually they converge on the actual degree of contingency in the training sample.27

The ability of adaptive neural networks to explain how we make contingency judgments comes as no surprise. These networks rely upon a learning rule, known as the Delta Rule, which is formally equivalent to the Rescorla-Wagner rule.28 As a result, when a connectionist network modifies its associative weights according to the Delta Rule it is, in effect, computing the contingency between events.29

The networks in these simulations incorporate feed-forward architectures, however, and thus are unable to learn about statistical dependencies that span across event sequences.30 In order to model temporal contingency learning, Axel Cleeremans and his colleagues at Carnegie Mellon University have turned to Simple Recurrent Networks.
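Before turning to recurrent networks, the equivalence claimed above for the two-variable case can be illustrated with a minimal sketch. It assumes that “contingency” is measured by the ΔP statistic, P(outcome | cue) minus P(outcome | no cue), that the second variable is an always-present background context, and that a single learning rate applies to every trial. The function names and trial data are hypothetical and are not taken from the cited studies.

```python
def rescorla_wagner(trials, learning_rate=0.01, epochs=2000):
    """Trial-by-trial error-driven updating (the same form as the Delta Rule).

    Each trial is a pair (cue_present, outcome_occurred). An always-present
    context competes with the cue for associative strength (an assumption).
    Returns (context_strength, cue_strength).
    """
    v_context, v_cue = 0.0, 0.0
    for _ in range(epochs):
        for cue_present, outcome in trials:
            # The prediction sums the strengths of all cues present on the trial.
            prediction = v_context + (v_cue if cue_present else 0.0)
            error = (1.0 if outcome else 0.0) - prediction
            # Only cues that are present on the trial are updated.
            v_context += learning_rate * error
            if cue_present:
                v_cue += learning_rate * error
    return v_context, v_cue


def delta_p(trials):
    """Contingency as P(outcome | cue) - P(outcome | no cue)."""
    with_cue = [outcome for cue, outcome in trials if cue]
    without_cue = [outcome for cue, outcome in trials if not cue]
    return sum(with_cue) / len(with_cue) - sum(without_cue) / len(without_cue)


# Hypothetical data: outcome follows the cue 3 times in 4,
# and occurs without the cue 1 time in 4.
TRIALS = ([(True, True)] * 3 + [(True, False)]
          + [(False, True)] + [(False, False)] * 3)

v_context, v_cue = rescorla_wagner(TRIALS)
contingency = delta_p(TRIALS)
```

With these hypothetical trials the contingency is 0.5, and the cue's associative strength settles close to that value, while the context settles near the base rate of 0.25. The small residual wobble comes from updating after every single trial rather than after each block.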
The network is recurrent because information not only flows from the sensory input layer to the hidden layer, but also back down from the hidden layer to the input layer. This recurrent connection provides the network with short-term memory, which is necessary in order for the network to learn about event sequences that unfold over time, such as the following example.

ABCDEABCEDABCDEABCEDABCDEACBEDABCDEABCED . . .

Notice that this sequence is composed of recurring event types with different relative frequencies; for example, although A-type events reliably predict B-type events, C-type events only sometimes predict D-type events. From a computational point of view, the sequence of events constitutes a probabilistic function, and the task the network faces is to learn the mapping between event types. The Simple Recurrent Network learns to approximate this probabilistic function by changing its weights in such a way as to drive down the errors in its predictions from state to state, and it eventually settles on a single set of weights that associates each of the cues with their respective outcomes. Suppose, for example, that the data set on which the network is trained consists of a non-deterministic sequence in which C-type events have been followed by D-type events forty percent of the time and E-type events sixty percent of the time. With sufficient training, the network’s hidden units will be “shaded” in such a way that the next time it observes a C-type event it will estimate the conditional probabilities of the possible successors, D and E, in a manner that is proportional to their past frequencies.31

These computational simulations show, then, that our capacity to make complex probabilistic inferences can be accounted for in terms of the operations of a simple associative mechanism.
After all, when the networks classify a novel event token as an instance of a type, they do so according to its degree of resemblance to event classes in memory. Moreover, the future expectations of the networks depend upon the degree of constancy in the conjunction between event types; the networks simply expect the event type that has followed the novel event token with the highest frequency. The SRN model of contingency learning also provides an elegant solution to the problem, recognized by Hume, concerning the processes whereby mixed frequencies are combined into a single probability estimate. In order to solve the probability learning task, the SRN updates the configuration of its hidden unit weights on each trial, which entails that the network will automatically summarize the event frequencies through a process known by connectionist researchers as superposition.32

These psychological experiments and computational models from cognitive science provide convergent evidence, therefore, for Hume’s hypothesis that the various species of probabilistic inferences can be explained in terms of elementary associationist principles. In the Rescorla-Wagner Model and neural networks, probabilistic inferences are understood solely in terms of the automatic, implicit, trial-by-trial adjustment of associative connections. Contemporary researchers on associative learning agree with Hume that our commonplace probabilistic inferences can be understood in this manner; they merely disagree with Hume over whether or not he provided a complete account of the associative learning principles. Hume thought that our probabilistic inferences could be exhaustively explained in terms of the principles of resemblance and causation.
The recent evidence from cognitive science, however, suggests that his account must be supplemented with “constraining principles” such as superposition and cue competition.33 In any case, this addition would be welcomed by Hume, who admits that his enumeration of the principles of association is revisable and open-ended (EHU 1.3.3; SBN 24).

While historians of philosophy tend to treat Hume’s theory of probabilistic inference as an embarrassment, his account receives much warmer praise from contemporary researchers in cognitive science who have turned the question of how we make probabilistic inferences into an empirical research program.

It is Hume’s model, or refinements of it, which have come to be adopted by many contemporary psychologists, and which seem indeed to be best confirmed by the experimental data on animals and humans.34

It is no overstatement to say that over the last half century “the Humean view” has become the dominant position in psychological research on how we make inferences under conditions of uncertainty.35 We must resist, therefore, the tendency of Hume scholars to treat the particular hypotheses of his science of human nature as a source of disrepute. In the case of Hume’s theory of probabilistic inference, at least, his account has much more going for it than interpreters have previously recognized. It is simply unfair to claim that Hume’s theory of probabilistic inference is “speculative psychology which is not . . . of lasting interest.”36 On the contrary, it stands at the leading edge of our contemporary science of the mind.

NOTES

An earlier version of this paper was presented at the 30th Annual Hume Conference on “Probability, Chance, and Judgment” at the University of Nevada, Las Vegas.
References to Hume’s works will be inserted into the text, using a letter or acronym for the titles followed by the page number, as follows:
EHU = David Hume, An Enquiry Concerning Human Understanding, ed. T. L. Beauchamp (Oxford: Oxford University Press, 1999).
T = David Hume, A Treatise of Human Nature, ed. D. F. Norton and M. J. Norton (Oxford: Oxford University Press, 2000).
SBN = David Hume, A Treatise of Human Nature, ed. L. A. Selby-Bigge, 2nd edition, revised by P. H. Nidditch (Oxford: Clarendon Press, 1978) and David Hume, Enquiries concerning Human Understanding and concerning the Principles of Morals, ed. L. A. Selby-Bigge, 3rd edition, revised by P. H. Nidditch (Oxford: Clarendon Press, 1975).

1 D. C. Stove, Probability and Hume’s Inductive Skepticism (Oxford: Clarendon Press, 1973), 120; cf. A. Flew, David Hume: Philosopher of Moral Science (Oxford: Blackwell Publishers, 1986), 124.
2 B. Stroud, Hume (London: Routledge & Kegan Paul, 1977), 223.
3 N. Kemp Smith, The Philosophy of David Hume (London: Macmillan and Co., 1941), 430.
4 D. G. C. MacNabb, David Hume: His Theory of Knowledge and Morality (Archon Books, 1966), 84.
5 For a more detailed account of Hume’s associationist theory of general ideas, see my “Hume and Cognitive Science: The Current Status of the Controversy over Abstract Ideas,” in Phenomenology and the Cognitive Sciences 4 (2005).
6 This is not the case with inferences from constant conjunctions. When we have observed a large and uniform sample, our confidence levels are characterized by maximal levels of assurance; as Hume puts it, they “exceed probability” and are “entirely free from doubt and uncertainty” (T 1.3.11.2; SBN 124).
7 The notion of resemblance is a notoriously difficult notion, and Hume says little in the Treatise to clarify his use of the term. His famous note in the Appendix to the Treatise, however, sheds a good deal of light on how he conceives of this notion.
If X and A are complex ideas, then we can say that X partially resembles A in so far as the two ideas share “common circumstances” (T 1.1.7n; SBN 637). That is, if X contains the simple ideas d, e, and f, and A contains the simple ideas d and e, then they partially resemble one another in respect of properties d and e. Of course, the resemblance of properties d and e, in so far as they are simple ideas, cannot be further explained in this manner. As Quine points out, empiricists must commit themselves to the doctrine that human beings are born with innate quality spaces. See W. V. O. Quine, Ontological Relativity and Other Essays (New York: Columbia University Press, 1969), 123. Indeed, such a doctrine would seem to be presupposed by Hume’s assertion that the simple ideas BLUE and GREEN are intrinsically more similar than the simple ideas BLUE and SCARLET (T 1.1.7n; SBN 637).
8 Ibid.
9 B. Gower, “Hume on Probability,” British Journal for the Philosophy of Science 42 (1991): 1–19.
10 A. Mura, “Hume’s Inductive Logic,” Synthese 115 (1998): 307; cf. Gower, 15.
11 P. Maher, “Probability in Hume’s Science of Man,” Hume Studies 7 (1981): 149; cf. L. Loeb, Stability and Justification in Hume’s Treatise (Oxford University Press, 2002), 234.
12 Admittedly, Hume states that many of our inferences from mixed frequencies are reflective and explicit in character. As he puts it, “we commonly take knowingly into consideration the contrariety of past events” and “carefully weigh the experiments, which we have on each side” (T 1.3.12.7; SBN 133). But Hume immediately clarifies these statements by adding the proviso that reflection arises from habit in an “oblique manner” (ibid.). The rest of T 1.3.12 is dedicated to a psychological explanation of the indirect manner by which the imagination extracts a single probability estimate from mixed frequencies.
Hume’s official position, then, is that many of our probabilistic inferences involve conscious awareness, but the operations of the mind that underlie them involve nothing but associative propensities of the mind. As Loeb puts it, “Hume’s objective is to show that reflection and deliberation on the probability of causes is itself an associationist process” (Loeb, 230).
13 J. Noxon, Hume’s Philosophical Development: A Study of His Methods (Oxford: Clarendon Press, 1973), 120.
14 D. Robinson, Toward a Science of Human Nature (New York: Columbia University Press, 1982).
15 E. Fales and E. A. Wasserman, “Causal Knowledge: What Can Psychology Teach Philosophers?” Journal of Mind and Behavior 13 (1992): 1.
16 G. Chapman and S. Robbins, “Cue Interaction in Human Contingency Judgment,” Memory & Cognition 18 (1990): 537.
17 D. Shanks, “Hume on the Perception of Causality,” Hume Studies 11 (1985): 105.
18 A. Dickinson and D. Shanks, “Animal Conditioning and Human Causality Judgment,” in Perspectives on Learning and Memory, ed. L. Nilsson and T. Archer (Hillsdale, N.J.: Erlbaum, 1985); cf. L. G. Allan, “Human Contingency Judgments: Rule Based or Associative?” Psychological Bulletin 114 (1993): 440.
19 D. Shanks, The Psychology of Associative Learning (Cambridge: Cambridge University Press, 1995), 31–3.
20 D. S. Blough, “Steady State Data and a Quantitative Model of Operant Generalization and Discrimination,” Journal of Experimental Psychology: Animal Behavior Processes 1 (1975): 3–21; R. A. Rescorla and D. R. Furrow, “Stimulus Similarity as a Determinant of Pavlovian Conditioning,” Journal of Experimental Psychology: Animal Behavior Processes 3 (1977): 212. Technical note: Blough employs a “common element” notion of similarity in his experiments. He shows that the degree of association between events is a function of their common elements.
Rescorla and Furrow demonstrate that when paired events are qualitatively similar the association between them is “substantially superior.” In their first two experiments, Rescorla and Furrow show that this phenomenon holds when the stimulus similarity involves a common modality; when auditory cues are paired with auditory cues, for example, the association is stronger than when auditory cues are paired with visual cues. In their third experiment, they demonstrate that stimulus similarity within a modality (e.g., the dimensions COLOR and SHAPE) also increases the strength of the association.
21 R. A. Rescorla and A. R. Wagner, “A Theory of Pavlovian Conditioning: Variations in the Effectiveness of Reinforcement and Non-Reinforcement,” in Classical Conditioning II: Current Research and Theory, ed. A. H. Black and W. F. Prokasy (New York: Appleton-Century-Crofts, 1972).
22 Allan, 439.
23 R. Miller et al., “Assessment of the Rescorla-Wagner Model,” Psychological Bulletin 117 (1995): 381.
24 Ibid., 365.
25 G. Chapman and S. Robbins, “Cue Interaction in Human Contingency Judgment,” Memory & Cognition 18 (1990): 545.
26 M. Gluck and G. Bower, “From Conditioning to Category Learning: An Adaptive Network Model,” Journal of Experimental Psychology: General 117 (1988): 241.
27 Shanks, 114–5.
28 R. S. Sutton and A. G. Barto, “Toward a Modern Theory of Adaptive Networks: Expectation and Prediction,” Psychological Review 88 (1981): 155–6.
29 Shanks, 110.
30 J. L. Elman, “Finding Structure in Time,” Cognitive Science 14 (1990): 189.
31 D. Servan-Schreiber, A. Cleeremans, and J. L. McClelland, “Graded State Machines: The Representation of Temporal Contingencies in Simple Recurrent Networks,” Machine Learning 7 (1991): 181.
32 For a discussion of superposition in connectionist networks, see my “Filling the Gaps: Hume and Connectionism on the Continued Existence of Unperceived Objects,” Hume Studies 25 (1999): 161.
33 I. Gormezano and E. Kehoe, “Classical Conditioning and the Law of Contiguity,” in Predictability, Correlation, and Contiguity, ed. P. Harzem and M. D. Zeiler (New York: John Wiley & Sons, 1981), 38.
34 Fales and Wasserman, 8.
35 P. W. Cheng et al., “A Causal-Power Theory of Focal Sets,” in Causal Learning: Advances in Research and Theory, ed. D. R. Shanks, D. L. Medin, and K. J. Holyoak (San Diego: Academic Press, 1996), 314.
36 I. Hacking, “Hume’s Species of Probability,” Philosophical Studies 33 (1978): 23.