Debiasing the crowd: selectively exchanging social information improves collective decision-making

Short title: Selectively exchanging social information improves collective decisions

Collective decision-making is ubiquitous across biological systems. However, biases at the individual level can impair the quality of collective decisions. One such prime bias is the human tendency to underestimate quantities. Former research on social influence in human estimation tasks has generally focused on the exchange of single estimates, showing that randomly exchanging single estimates does not reduce the underestimation bias. Here we performed estimation experiments to test whether leveraging prior knowledge about this bias when designing the structure of information exchange can attenuate its effects. Participants had to estimate a series of quantities twice. After providing a personal estimate, they received estimates from one or several group members, and could revise their personal estimate. Our purpose was threefold: (i) to investigate whether restructuring the information exchange can reduce the underestimation bias, (ii) to study how the number of estimates exchanged affects accuracy, and (iii) to shed light on the mechanisms underlying the integration of multiple pieces of social information. Our results show that leveraging prior knowledge about the underestimation bias allows us to select and exchange the estimates that are most likely to attenuate its effects. Crucially, this exchange method operates without any reference to the truth. Moreover, we find that exchanging more than one estimate also reduces the underestimation bias. Underlying these results are a human tendency to herd, to trust large numbers more than small numbers, and to follow disparate social information less. Using a computational modeling approach, we demonstrate that these effects are indeed key to explaining our experimental results. We then use the model to explore the conditions under which estimation accuracy can be improved further.

∗Corresponding author – jayles@mpib-berlin.mpg.de

arXiv:2003.06863v1 [physics.soc-ph] 15 Mar 2020
Overall, our results show that existing knowledge on biases can be used to boost collective decision-making, paving the way for combating other cognitive biases threatening collective systems.


Introduction
Human and animal decision-making is characterized by a plethora of biases [1,2], such as pessimism, optimism and overconfidence [3]. Such biases, while often rational at the individual level, can have negative consequences at the collective level. For instance, Mahmoodi et al. showed that the human tendency to give equal weight to the opinions of individuals with different competences (equality bias) leads to sub-optimal collective decision-making [4].
Understanding the role of biases in collective systems is becoming increasingly important in modern digital societies. The recent advent and rapid rise of information technology has substantially altered human interactions, in particular how social information is exchanged and processed: people share content and opinions with thousands of contacts on social networks such as Facebook and Twitter [5,6,7], and rate and comment on sellers and products on websites like Amazon, TripAdvisor and AirBnB [8,9,10]. While this new age of social information exchange carries the potential for enhanced collaborative work [11] and collective intelligence [12,13,14,15], it also bears the risk of amplifying existing biases. For instance, the tendency to favor interactions with like-minded people (ingroup bias) is reinforced by recommender systems, enhancing the emergence of echo chambers [22] and filter bubbles [23], thereby further increasing the risk of opinion polarization. Given the central role of biases in social systems, it is important to develop strategies that can reduce their detrimental impact on collective decisions.
One promising, yet hitherto untested, strategy to reduce the detrimental impact of biases at the collective level is to directly leverage prior knowledge about specific biases when designing the structure of social interactions. Here, we test whether such a strategy can indeed be employed to reduce the negative effect of a bias at the collective level. We use the framework of estimation tasks, which are well-suited to quantitative studies of social interactions [24,25,26,27], and focus on the underestimation bias. The underestimation bias is a well-documented, robust human tendency to underestimate quantities, observed across many domains, including perception, pricing and risk judgment [27,44,28,29,30]. The seminal study by Lorenz et al. (2011) suggested that the effects of the underestimation bias can indeed be amplified by social interactions in human groups, deteriorating collective decision-making.
We here investigate the effect of different exchange structures, aimed at counteracting the underestimation bias (see below for details), on individual and collective accuracy. Moreover, we investigate how these exchange structures interact with the number of estimates exchanged in shaping accuracy. Previous research on estimation tasks has largely overlooked both of these factors. Thus far, research on estimation tasks mostly discussed the beneficial or detrimental effects of social influence on group performance [26,31,32,33,34,35,36].
However, most previous studies focused on the impact of a single piece of information (one estimate or the average of several estimates), or did not systematically vary their number; and, in most studies, subjects received social information from randomly selected individuals (either group members, or participants from former experiments) [24,25,26,27,32,35,36,37,38]. One exception is King et al., who showed that providing individuals with the most accurate estimate in a sequential estimation task resulted in higher accuracy than providing a random previous estimate [35]. However, this design requires a priori knowledge about the true value of the quantity to estimate, which contrasts with most realistic situations. Moreover, in most daily choices one generally considers not only one, but several sources of information, and these sources are rarely chosen randomly [39]. Even when not actively selecting information sources, one routinely experiences recommended content (e.g. books on Amazon, movies on Netflix or videos on Youtube) generated by algorithms which incorporate our "tastes" (i.e. previous choices) and those of (similar) others [40].
Following these observations, we confronted groups with a series of estimation tasks, in which individuals re-evaluated their estimates after having received a varying number of estimates τ (τ = 1, 3, 5, 7, 9 and 11) from other group members. Crucially, the exchanged estimates were selected in three different manners:

• Random exchange: subjects received personal estimates from random other group members. Former research showed that when a single estimate is randomly exchanged, individual accuracy improves because estimates converge, but collective accuracy does not [26,27]. Since several random estimates do not, on average, carry higher information quality than a single random estimate, we did not expect collective accuracy to improve when exchanging multiple random estimates. However, we predicted that increasing the number of estimates exchanged would lead to a higher imitation rate and thus to an increase in individual accuracy.
• Median exchange: in estimation tasks, median estimates are often closer to the true value than randomly selected estimates (Wisdom of Crowds) [41,42,43]. In this exchange treatment, each participant received, as social information, the estimates whose logarithms were closest to the median log estimate m of the group (excluding their own). The selected estimates being on average closer to the truth than in the Random exchange, we expected higher collective and individual improvements.
• Shifted-Median exchange: as detailed above, humans have a tendency to underestimate quantities. Recent works have suggested aggregation measures taking this bias into account [44], or the possibility to counteract it using artificially generated social information [27]. Building on this, we here design a method that exploits prior knowledge on the underestimation bias to select estimates that are likely to reduce its effects. We define, for each group and each question, a shifted (overestimated) value m′ of the median log estimate m that approximates the log of the true value T (thus compensating the underestimation bias), using a relationship between m and log(T) identified from prior studies (see details in Materials and Methods). Individuals received the estimates whose logarithms were closest to m′ (excluding their own); the use of logarithms is preferable because of the human logarithmic perception of numbers (see Materials and Methods). We expected here the highest collective and individual improvements. Crucially, only personal estimates are used to compute m′, without any reference to the true value. That is, the accuracy of the selected estimates is a priori unknown, and only statistically expected to be closer to the truth.
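The three selection rules above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function and mode names are ours, and the selection is done on log estimates as described in the text.

```python
import numpy as np

def select_estimates(estimates, own_index, tau, mode, gamma=0.9, rng=None):
    """Pick tau estimates (excluding one's own) under one of the three
    exchange rules: at random, closest to the median log estimate, or
    closest to the shifted median m' = m / gamma (all on a log scale)."""
    rng = rng or np.random.default_rng()
    logs = np.log10(estimates)
    others = [i for i in range(len(estimates)) if i != own_index]
    if mode == "random":
        chosen = rng.choice(others, size=tau, replace=False)
    else:
        m = np.median(logs)                            # median log estimate
        target = m if mode == "median" else m / gamma  # shifted median for "shifted"
        chosen = sorted(others, key=lambda i: abs(logs[i] - target))[:tau]
    return [estimates[i] for i in chosen]
```

Note that the "shifted" rule only uses the group's own estimates and the pre-measured slope γ; no true value enters the selection.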
We find that, unexpectedly, both collective and individual accuracy improve when more estimates are randomly exchanged, and that collective accuracy is not significantly better in the Median exchange than in the Random exchange. However, in accordance with our prediction, both collective and individual accuracy are boosted in the Shifted-Median exchange compared to the Random exchange, thus successfully counteracting the underestimation bias.
We unveil three key mechanisms underlying these results, and develop a model to analyse the conditions under which collective and individual improvements can be optimised.

Experimental Design
Participants were 216 students, distributed over 18 groups of 12 individuals. Each individual was confronted with 36 estimation questions (see the list in SI Appendix) on a tactile tablet.
Each question was asked twice: first, subjects were asked to provide their personal estimate E_p. Next, they received as social information the estimate(s) of one or several group member(s), and were asked to provide a second estimate E_s (see illustration in Supplementary Fig. S1). When providing social information, we varied (i) the number of estimates shown (τ = 1, 3, 5, 7, 9 or 11) and (ii) how they were selected (Random, Median or Shifted-Median exchange). Each group of 12 individuals experienced each of the 18 unique treatments (i.e. combinations of number of estimates exchanged and exchange structure) twice. Across all 18 groups, each of the 36 unique questions was asked once at every unique treatment combination. Students received course credits for participation and were, additionally, incentivised based on their performance. Full experimental details can be found in the Supplementary Information.

Compensating the Underestimation Bias
When considering large values, humans tend to think logarithmically rather than linearly [45]; the logarithms of estimates are therefore the natural quantity to consider in estimation tasks [27]. The mean or median of log estimates is often used to measure the quality of collective decisions in such tasks (Wisdom of Crowds). Since distributions of log estimates for most quantities are closer to Laplace distributions than to Gaussian distributions [46], the median is more reliable than the mean in estimating the Wisdom of Crowds [47]. Fig. 1a shows that there exists a linear relationship (data were taken from a previous study [27]) between the median log estimate m and the log of the true value: m ∼ γ log(T), where γ is the slope of the relationship (the "shifted-median parameter"). Note that γ < 1 denotes the underestimation bias. We used this relationship to construct, for each group and each question, a value m′ (the "shifted-median value") aimed at compensating the underestimation bias, i.e. at approximating the (log of the) truth: m′ = m/γ ∼ log(T), with γ = 0.9. m′ then served as a reference to select the estimates provided to the subjects in the Shifted-Median exchange.
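As a toy illustration of this construction (the data below are synthetic, not the study's; the relationship m ∼ γ log(T) and the correction m′ = m/γ are from the text):

```python
import numpy as np

# Hypothetical prior calibration data: log10 of true values and the
# corresponding median log estimates of groups (toy data, exact slope 0.9).
log_truth = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
median_logs = 0.9 * log_truth

# Least-squares slope through the origin for m ~ gamma * log10(T).
gamma = np.sum(median_logs * log_truth) / np.sum(log_truth ** 2)

# For a new question, shift the group's median log estimate to
# approximate log10(T), without ever using the true value itself.
m = 4.5                 # median log estimate of a new group
m_shifted = m / gamma   # shifted-median value m', approximating log10(T)
```

With these toy numbers the fit recovers γ = 0.9 exactly, and the shifted value m′ = 5.0 corrects the underestimated median of 4.5.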
Visual inspection confirms that the identified linear relationship not only holds for the same questions as in the previous study (half of our questions; Fig. 1b), but also carries over to new questions (the other half; Fig. 1c), underlining its consistency. Crucially, our method does not require a priori knowledge of the truth. Data sets including all questions and participants' answers, for each exchange condition and number of estimates exchanged, are included as Supplementary Material.

Results
Following [27], we define (i) collective accuracy as the absolute value of the median of the log-normalized estimates X = log(E/T), i.e. |Median_{i,q}(X)|, and (ii) individual accuracy as the median of their absolute values, Median_{i,q}(|X|), where i and q index individuals and questions. Collective and individual improvement are the respective gains in accuracy from the first to the second estimates. Collective improvement corresponds to a shift of the median log estimates toward the truth, which is perforce accompanied by an improvement in individual accuracy (Fig. 2b), as estimates get on average closer to the truth as well. However, there can be individual improvement without collective improvement (i.e. without a shift), if estimates converge after social information exchange, as shown in [27].
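These accuracy measures can be computed as follows (a sketch under our reading of the definitions in [27]; the function name is ours, and logs are base 10):

```python
import numpy as np

def accuracies(E, T):
    """E: estimates, shape (n_individuals, n_questions); T: true values
    per question. Returns (collective, individual) accuracy computed from
    the log-normalized estimates X = log10(E / T)."""
    X = np.log10(E / T)
    collective = abs(np.median(X))      # |Median(X)|: deviation of the crowd median
    individual = np.median(np.abs(X))   # Median(|X|): typical individual deviation
    return collective, individual
```

For instance, three individuals who are respectively one order of magnitude low, exact, and one order of magnitude high on every question yield perfect collective accuracy (0) but an individual accuracy of 1.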
In the Random and Median exchanges, collective improvement increases with the number of estimates exchanged τ (Fig. 2a). In the Shifted-Median exchange, collective improvement is substantially higher than in the other two treatments at low values of τ, and decreases with τ. At τ = 11, the three treatments are equivalent by construction, since subjects receive the same information, i.e. all pieces of information (group size was 12). We next describe three central mechanisms of social information use underlying these patterns.

Mechanisms Underlying the Integration of Several Estimates
Coherently with heuristic strategies under time and computational limitations [48,49,50], we assume that subjects intuitively (and rapidly) perceive the central tendency and dispersion of the estimates they receive as social information. Consistent with the logarithmic perception of numbers [45], we assume that this perception takes the form of their geometric mean and geometric standard deviation, respectively.
This allows us to define a measure of the value subjects assign to the social information, as the weight S they give to the geometric mean G of the social information. We define a subject's second estimate E_s as the weighted geometric average of their personal estimate E_p and the geometric mean G of the social information: E_s = E_p^(1−S) · G^S. S can thus be expressed as S = [log(E_s) − log(E_p)] / [log(G) − log(E_p)]. S = 0 thus implies that a subject keeps their personal estimate (E_s = E_p), i.e. that they discard the social information, and S = 1 implies that their second estimate equals the geometric mean (E_s = G), i.e. that they follow the central tendency of the social information.
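The two relations above can be sketched directly (function names are ours; logs are base 10, and G is computed as the geometric mean of the received estimates):

```python
import numpy as np

def social_weight(Ep, Es, social):
    """Recover the weight S from E_s = E_p^(1-S) * G^S, where G is the
    geometric mean of the received estimates (computed on a log scale)."""
    G = 10 ** np.mean(np.log10(social))
    return (np.log10(Es) - np.log10(Ep)) / (np.log10(G) - np.log10(Ep))

def second_estimate(Ep, social, S):
    """Second estimate as the weighted geometric average of E_p and G."""
    G = 10 ** np.mean(np.log10(social))
    return Ep ** (1 - S) * G ** S
```

For example, a subject with E_p = 10 who receives estimates with geometric mean 1000 and moves to E_s = 100 has S = 0.5: they went halfway toward the social information on the log scale.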
Herding effect: tendency to partially copy social information

Fig. 3 shows that the weight given to the social information is, on average, strictly between 0 and 1 at all combinations of exchange structure and number of estimates exchanged. This individual tendency to partially follow social information leads to a convergence of estimates, which translates into individual improvement (Fig. 2b) [27]. We call this the herding effect.
The large collective improvement observed in the Shifted-Median (but not Median) exchange at low values of τ, as compared to the Random exchange, is a consequence of this imitation tendency. In the Shifted-Median exchange, the estimates received were on average higher than in the Random exchange, due to their selection process (the shifted-median value compensates the underestimation bias). Since subjects weighted social information, for 1 ≤ τ ≤ 7, at least as much in the Shifted-Median exchange (dots in Fig. 3c) as in the Random exchange (dots in Fig. 3a), their second estimates shifted toward higher values than in the Random exchange, resulting in the higher collective improvement observed.

Asymmetry effect: differential weighting of social information

Fig. 3 also shows that subjects weigh social information more when it is higher than their personal estimate (squares) than when it is lower (triangles). This is the asymmetry effect.

Similarity effect: sensitivity to the dispersion of social information

Moreover, Fig. 4b shows that subjects weigh social information more at lower levels of dispersion, i.e. higher levels of similarity: we call this the similarity effect. This effect explains the higher weight given to the social information in the Median and Shifted-Median exchanges when 3 to 5 (and to a lesser extent 7) estimates are exchanged (Fig. 3), which entails a higher convergence of estimates and thereby a higher individual improvement (Fig. 2b). Using an incremental modelling approach, we next emphasise the importance of these mechanisms in explaining the data.

Models of Social Information Integration
The basic model ("model 0") is based on the model developed in [27]: the average weight S agents give to the social information increases linearly with the distance between their personal estimate and the geometric mean of the social information. This is the distance effect, evidenced in [27] (where subjects received the geometric mean of other group members' estimates as social information) and observed in our data as well (see Supplementary Fig. S6).
However, model 0 is unable to capture several of the empirical relationships observed in our data (see Supplementary Fig. S7, left column).
Including the asymmetry effect ("model 1"), as a linear dependence of the average weight S on whether the social information is higher or lower than the personal estimate, improves the agreement with the data; additionally including the similarity effect ("model 2"), as a linear dependence of S on the dispersion of the social information, allows the model to reproduce the empirical patterns (see Supplementary Fig. S7). Note that all effects act independently, and the herding effect arises from the average weight S given to the social information being strictly between 0 and 1, which depends only on the parametrisation and thus need not be explicitly put into the model.
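A minimal sketch of such a weight model is given below. This is our illustrative reconstruction, not the authors' fitted model: the coefficients a, b, c, d are hypothetical, and the clipping to [0, 1] is a simplification we add to keep the weight interpretable.

```python
import numpy as np

def mean_weight(Ep, social, a=0.2, b=0.3, c=0.1, d=-0.15):
    """Illustrative linear weight model combining the distance effect (b),
    the asymmetry effect (c, via the sign of the log-distance) and the
    similarity effect (d, via the dispersion). All coefficients are
    hypothetical; the result is clipped to [0, 1]."""
    logs = np.log10(social)
    D = np.mean(logs) - np.log10(Ep)   # signed log-distance to the geometric mean
    disp = np.std(logs)                # dispersion of the social information
    S = a + b * abs(D) + c * np.sign(D) + d * disp
    return float(np.clip(S, 0.0, 1.0))
```

With these toy coefficients, social information one order of magnitude above the personal estimate is weighted more than information one order of magnitude below it, reproducing the asymmetry effect qualitatively.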
Full details are presented in SI Appendix.

Optimising Collective and Individual Improvements
We next use the model to explore the impact of varying group sizes and shifted-median parameter values γ on individual and collective improvement in the Shifted-Median exchange. Fig. 5a shows that the highest collective improvement is expected when the shifted-median slightly overestimates the truth (γ ≈ 0.76 for groups of 12 individuals; orange dots) instead of approximating it (γ ≈ 0.9, as we aimed for; red dots). Note that γ = 1 corresponds to the Median exchange (i.e. the median is not shifted; blue dots). The highest individual improvement occurs when γ ≈ 0.9, but the difference with γ ≈ 0.76 is so small that the latter should be preferred in order to maximise both collective and individual improvements. Surprisingly, our simulations predict that both improvements remain relatively high when the shifted-median overestimates the truth (γ < 0.9), but decay fast if the shifted-median underestimates it (γ > 0.9).
The optimum value of γ and corresponding maximum achievable improvement (Supplementary Fig. S9) are largely independent of group size for individual improvement. However, for collective improvement, both values increase with group size up to groups of about 30 individuals, at which point they stabilise. Interestingly, the saturation value (γ ≈ 0.86) corresponds to a shifted-median that would only very slightly overestimate the truth, suggesting that the larger the group size, the less social information needs to overestimate the truth (i.e. compensate the underestimation bias) for maximising collective accuracy.
Finally, Fig. 5b shows that the optimal number of estimates to exchange (for achieving maximum individual or collective improvement) scales linearly with group size.

Discussion
We studied the impact of the number of estimates exchanged within a group, and their exchange structure, on collective and individual accuracy in estimation tasks, and identified three central mechanisms underlying social information integration: (i) subjects tended to partially copy each other (herding effect), leading to a convergence of estimates after social information exchange, and therefore to an improvement in individual accuracy. Note that, contrary to popular opinion, convergence of estimates need not yield negative outcomes (like impairing the Wisdom of Crowds [26,32,36]): even if the average opinion is biased, exchanging opinions may temper extreme ones and improve the overall quality of decisions. This tendency to follow social information has another important consequence: it is possible to influence the outcome of collective estimation processes in a desired direction. In the Shifted-Median exchange, we showed that subjects' second estimates could be "pulled" towards the truth, thus improving collective accuracy. But subjects' estimates could be "pushed" away from the truth as well (see Supplementary Fig. S2). This is an example of nudging, also demonstrated in other contexts [51].
(ii) subjects were more influenced by higher estimates (than their own) than by lower estimates (asymmetry effect). This resulted in a shifting of second estimates toward higher values, thereby partly compensating the underestimation bias, even in the Random exchange.
Collective accuracy was thus found to improve after several estimates were exchanged, in contrast with former findings [26]. Although the asymmetry effect explains this improvement remarkably well, it is likely itself a consequence of more fundamental cognitive processes. A possible (at least partial) explanation could be that it results from "people's difficulty to reason about magnitudes outside of human perception" [52]. Indeed, people generally deal with small quantities, typically below one thousand (e.g. numbers of persons or objects, monetary transactions, sports statistics), and on much fewer occasions face larger-scale numbers (e.g. populations, state-level budgets, very high incomes). They also have no direct experience with astronomical or geological events. It is thus possible that people find it easier to assess the reliability of relatively low numbers than of very high numbers, making them more likely to distrust lower estimates (than their own) than higher ones. Moreover, even though people apprehend large numbers poorly, they usually know that such quantities are supposed to be large. It is therefore conceivable that people are more likely to assume that they have underestimated such quantities than to have overestimated them, as a consequence of which they are more likely to follow higher estimates (than their own) than lower ones. Whatever its origin, the asymmetry effect suggests that people are able to selectively use social information to counterbalance the underestimation bias, even without external intervention (as in the Random exchange).
(iii) subjects are sensitive to the dispersion of the estimates received, and follow the social information more when the estimates are more similar to each other (similarity effect), thus increasing the convergence of estimates after social information exchange. Former work has shown that similarity in individuals' decisions correlates with decision accuracy [53], suggesting that following pieces of social information more when they are more similar is a reliable strategy to increase the quality of one's decisions. Our selection method in the Median and Shifted-Median exchanges thus counterbalances a human tendency to underuse social information [27,54,55], and entails higher individual improvement than in the Random exchange.
Next, we developed an agent-based model aimed to emphasise the importance of these three effects-plus the distance effect-in explaining the patterns observed. The model assumes that subjects have a fast and intuitive perception of the central tendency and dispersion of the estimates they receive as social information, coherent with heuristic strategies under time and computational constraints [48,49,50]. The model further assumes that the effects are independent and linearly related to the average weight given to the social information.
It is conceivable that the strategies used by people when integrating up to 11 pieces of social information in their decision making process are very diverse and complex. Yet, despite its relative simplicity, our model is in fair agreement with the data, underlining the core role of these effects in integrating several estimates.
We then used our model to explore how the Shifted-Median exchange procedure could optimise collective and individual improvements, by varying the group size and shifted-median parameter γ. The model predicts that groups can reach highest collective improvement if the shifted-median value overestimates the truth (γ ≈ 0.76 for a group size of 12) instead of approximating it (γ ≈ 0.9), and that collective and individual improvements rapidly decline if social information underestimates the truth, but not if social information overestimates it.
These results are in line with earlier work on social influence [47], and suggest that favouring high estimates over low estimates at the individual level is a robust strategy to optimise the collective benefits. Our simulations further show that as group size increases, the amount of overestimation (in the shifted-median value) needed to maximise collective accuracy decreases and saturates close to the truth (γ ≈ 0.86) for groups exceeding 30 individuals. Moreover, we found that the optimal number of estimates to be exchanged, in order to optimise collective and individual accuracy, increases linearly with the group size. For groups of 30 individuals, our model predicts that 8 estimates should be exchanged, which is reasonable in terms of cognitive capacities. Finally, our simulations suggest that by tuning the number of estimates to be exchanged and the shifted-median parameter according to the group size, it is possible to achieve almost perfect collective accuracy (i.e. relative improvement of 1).
To conclude, our findings show that (i) it is possible to leverage prior knowledge on cognitive biases to lessen their effects, by organising the exchange of social information in groups in a way that counterbalances them, (ii) people's decisions can be nudged in a desired direction and (iii) there exists an optimal amount of information to share among group members in order to maximise the quality of their decisions, and this amount is predictable. Our results were derived within the paradigm of estimation tasks. Yet, we believe that the mechanisms underlying social information use in estimation tasks share important commonalities with related domains (e.g. opinion dynamics [56]), such that the conclusions presented here may be adapted to other domains. Hence, future work could adapt our findings to online recommendation systems (e.g. in Facebook, Youtube or Netflix) or page ranking algorithms (e.g. in Google, Yahoo or DuckDuckGo), by selecting the amount and type of content presented to the users. This could potentially work against filter bubbles and echo chambers, and reduce the effects of well-known biases such as the confirmation [57] or overconfidence bias [58].

Acknowledgments
We are grateful to Felix Lappe for programming the experiment, and thank Alan Tump.

Materials and Methods

Each of the 12 subjects, in each of the 18 groups, was confronted with 36 estimation questions (see the list in section 3) on a tactile tablet (Lenovo TAB 2 A10-30). Each question was asked twice: first, subjects were asked to provide their personal estimate E_p. Next, they received as social information the estimate(s) of one or more group members (i.e. other subjects in the same room at the same time), and were asked to provide a second estimate E_s. As a reminder, their personal estimate was also shown during the second answering of a question. Supplementary Fig. S1 illustrates how social information was displayed on the tablets: on the right side of the screen was a blue panel showing all pieces of social information, sorted in increasing order. All tablets were controlled by a central server, and participants could only proceed to the next question once all individuals had provided their estimate. A 30-second countdown timer was shown on the screen to motivate subjects to answer within this time window, although they were allowed to take more time.

Students received course credits for participation. Additionally, we incentivised them based on their performance P, defined as:

P_i = Median_q [ ( |log(E_p/T)| + |log(E_s/T)| ) / 2 ],

where i and q are respectively indexes for individuals and questions, E_p and E_s are respectively estimates before (personal) and after (second) social information exchange, and T is the correct answer to the question. This performance criterion measures the median distance to the correct answer, in terms of orders of magnitude, over all questions, averaged over the two estimates (before and after social information exchange). The payments were defined according to the distribution of performances measured in [27]:

• P_i < 0.4: 5€ (∼ 20% of subjects)
• 0.4 ≤ P_i < 0.5: 4€ (∼ 30% of subjects)
• P_i ≥ 0.5: 3€ (∼ 50% of subjects)
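The performance criterion and payment tiers described above can be sketched as follows (function names are ours; logs are base 10, i.e. distances in orders of magnitude):

```python
import numpy as np

def performance(Ep, Es, T):
    """Median over questions of the distance to the truth in orders of
    magnitude, averaged over the personal and second estimates."""
    dist = (np.abs(np.log10(Ep / T)) + np.abs(np.log10(Es / T))) / 2
    return np.median(dist)

def payment(P):
    """Payment in euros from the performance tiers described above."""
    return 5 if P < 0.4 else 4 if P < 0.5 else 3
```

For instance, a subject who is one order of magnitude off on their first estimate of one question, and exact otherwise, has a performance of 0.25 and earns the top payment.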

Pilot Experiment
Prior to the main experiment, we ran a pilot experiment (approved by the Institutional Review Board of the Max Planck Institute for Human Development; A 2019/07), very similar in design, but with two crucial differences: (i) the shifted-median was computed using a different method (described below), which was more complicated than the one used in the main experiment; (ii) the selection procedure of estimates in the Median and Shifted-Median exchanges was based on a linear scale of numbers, i.e. we presented the closest estimates to the median/shifted-median of the estimates, which was less consistent with the logarithmic perception of numbers.
This method yielded several unexpected results. In particular, the Shifted-Median exchange treatment worked less well than expected (Supplementary Fig. S2). This motivated us to refine our experiment, the results of which we present in the main text. Most crucially, we used the logarithmic scale in the main experiment. Nevertheless, the results from this pilot experiment are interesting, because they emphasize the importance of using the log scale when dealing with large numbers. We therefore present them here. Notice that nothing changed for the Random exchange of estimates, such that data from both experiments were combined to produce the Random exchange part of all graphs.

Shifted-Median Value in the Pilot Experiment
Let us introduce log-normalized estimates X = log(E/T). We identified a linear relationship (Supplementary Fig. S3a) between the median m_X of the log-normalized personal estimates and their diversity η_X, defined as the average absolute deviation from their median (respectively the maximum likelihood estimators of the center and width of Laplace distributions):

m_X = γ_X η_X,    (S1)

with γ_X < 0 the slope of the linear regression line.

Technically, in the pilot experiment we provided the τ pieces of social information that were closest to 10^Median(log(E_p)), where E_p is the actual personal estimate provided by a subject, because for even sequences of numbers (in this case 12 estimates) the log of the median is not identical to the median of the log. Selecting estimates on a linear scale in this way inadequately favors low estimates, further emphasizing the importance of using the log scale when dealing with large numbers. It does, however, show that it is possible to influence subjects' estimates in the wrong direction, by driving second estimates away from the truth.
Let us note m and η the median and diversity of the log estimates (i.e. not normalized by the true value). They are related to m_X and η_X by:

m_X = m − log(T),    (S2)
η_X = η.    (S3)

Equation S1 thus becomes:

m − log(T) = γ_X η.    (S4)

We then defined, for each group and each question (12 estimates), an expected approximation T̃ (shifted-median) of the true value T, such that log(T̃/T) = m_X − γ_X η_X ∼ 0 (using equation (S1)), or equivalently log(T̃) = m − γ_X η (using equation (S4)). Finally, we have an expression of the shifted-median value T̃, free from any reference to the true value:

T̃ = 10^(m − γ_X η).    (S5)

To compare the performance of this method with that used in the main experiment, we investigated how well both approximated the true values of the 98 questions asked in [27].
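The pilot shifted-median computation can be sketched as follows (a minimal Python sketch; `shifted_median` is our name, base-10 logarithms are assumed, and γ_X must be estimated beforehand from the regression in Supplementary Fig. S3a):

```python
import numpy as np

def shifted_median(estimates, gamma_x):
    """Shifted-median 10**(m - gamma_x * eta): m is the median of the log10
    estimates, eta their average absolute deviation from the median.
    No reference to the true value is needed."""
    logs = np.log10(np.asarray(estimates, dtype=float))
    m = np.median(logs)
    eta = np.mean(np.abs(logs - m))
    return 10 ** (m - gamma_x * eta)
```

Since γ_X < 0, the shifted-median lies above the plain median whenever the estimates are dispersed, which is what counters the underestimation bias.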
To make the questions comparable, we defined the quantities (m_X − γ_X η_X) and m/γ − log(T) (with γ = 0.9, as measured in Main Text, Fig. 1a), following equation S1 and the relation m = γ log(T) (see Main Text, Materials and Methods), respectively. These two quantities are comparable: both represent the deviation of the shifted-median value from the true value, and equal 0 when the former equals the latter (i.e. no deviation).
Distributions of both quantities are plotted in Supplementary Fig. S4, and show that the first method (brown line, as the dots in Supplementary Fig. S3) approximates the truth slightly more closely than the second one (green line, as the dots in Main Text, Fig. 1), as the corresponding distribution is more centred and peaked on 0. However, both methods work very well, and given that the second method is more straightforward and easier to implement, we decided to use it for the main experiment.

Supplementary Fig. S4: Deviation of the shifted-median from the true values of the questions asked in [27], following two different methods: (i) using a linear relationship between the median m_X and diversity η_X of log-normalized personal estimates X_p (in brown; method used in the pilot experiment) and (ii) using a linear relationship between the median personal log estimate m and the log of the true value T (in green; method used in the final experiment presented in the main text).

Description of the Model

We consider log-normalized estimates X = log(E/T). For each question, log-normalized personal estimates X_p are drawn from Laplace distributions [47], whose center and width are the median and average absolute deviation from the median of the experimental log-normalized personal estimates. In [27], the authors showed that the distribution of the weight S given to the social information has two peaks, at S = 0 (agents keep their personal estimate, with probability P_0) and S = 1 (agents adopt the social information, with probability P_1), and a central part that could roughly be assimilated to a Gaussian of mean M_g and standard deviation σ_g (agents contradict, compromise with or overreact to the social information with probability P_g). Moreover, they showed that the average weight given to the mean M of the social information increases linearly with the distance D = M − X_p between M and the personal estimate X_p. Individuals thus weight social information more when it is further away from their personal estimate (distance effect). In agreement, in the current study, we also found two peaks at S = 0 and S = 1, and a similar distance effect when exchanging a single estimate (see Supplementary Fig. S6).
For a given value of D, the average weight S̄ given to the social information is S̄ = P_0 × 0 + P_1 × 1 + P_g × M_g = α + β|D|, where α and β are the coefficients of the linear cusp relationship between S̄ and D. P_g is hence given by P_g = (α + β|D| − P_1)/M_g. P_1 was found to be independent of D, so that P_0 = 1 − P_1 − P_g. S was then drawn, for each agent and each question, according to these three probabilities, and the updated estimate X_s was computed as the weighted average of the personal estimate and the social information: X_s = (1 − S) X_p + S M. See [27] for more details.
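The model-0 update step can be sketched as follows (a minimal Python sketch; the parameter values used in the comments and tests are illustrative, not the fitted values of Supplementary Table S1):

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_weight(D, alpha, beta, P1, Mg, sigma_g):
    """Draw the weight S given to the social information (model 0):
    S = 0 with probability P0 (keep personal estimate), S = 1 with
    probability P1 (adopt social information), otherwise S is drawn
    from a Gaussian of mean Mg and width sigma_g.
    P_g follows from the mean weight alpha + beta*|D|."""
    Pg = (alpha + beta * abs(D) - P1) / Mg
    P0 = 1.0 - P1 - Pg
    u = rng.random()
    if u < P0:
        return 0.0
    if u < P0 + P1:
        return 1.0
    return rng.normal(Mg, sigma_g)

def update(Xp, M, S):
    """Second estimate: weighted average of the personal estimate Xp
    and the (mean of the) social information M."""
    return (1.0 - S) * Xp + S * M
```

By construction, the mean of the drawn weights equals α + β|D|, so the distance effect is reproduced on average.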

Computation of the Error Bars
When applying this model (model 0) to our data (Supplementary Fig. S7, left column), we observe that it fails to predict the increase in collective improvement with the number τ of estimates exchanged in the Random and Median exchanges. To explain this increase, the asymmetry effect needs to be added (model 1). The simplest way to do so is to assume no coupling between the asymmetry effect and the distance effect, and to add a linear dependence of S on the number τ of estimates received as social information, with a positive slope β_τ+ when D > 0 and a negative slope β_τ− when D < 0, as suggested in Main Text, Fig. 3a. The average weight given to social information, for a given τ and at a given distance D, is then:

S̄(D, τ) = α + β|D| + β_τ± (τ − 1),

where β_τ± = β_τ+ when D > 0 and β_τ± = β_τ− when D < 0. When τ = 1, the asymmetry term disappears, such that this model (model 1) and model 0 are equivalent for a single piece of social information (model 1 generalizes model 0).
This model is able to reproduce the increase in collective improvement with τ in the Random and Median exchanges (Supplementary Fig. S7, middle column). Yet, it substantially underestimates the weight given to social information when D < 0 in these exchanges (see Supplementary Fig. S8, top panels).

Supplementary Fig. S8: Comparative fits of the asymmetry effect in models 1 and 2: average weight given to social information, against the number of estimates exchanged, in the Random (black), Median (blue) and Shifted-Median (red) exchange treatments. Shown are the values when (i) all data are combined (dots), (ii) the social information is higher than the personal estimate (squares) and (iii) the social information is lower than the personal estimate (triangles). Filled and empty symbols (with lines to facilitate visualization) indicate the values obtained from the data and from the model simulations, respectively. Model 2 (bottom panels), which includes the similarity effect, reproduces the empirical data better than model 1 (top panels), especially when the social information is lower than the personal estimate (triangles).
This problem is remedied by introducing the similarity effect in the model (model 2).
Akin to the asymmetry effect, we assume no coupling between effects, and add a linear dependence of S on the dispersion σ of the estimates received. The average weight given to social information is then:

S̄(D, τ, σ) = α + β|D| + β_τ± (τ − 1) + β_σ± (σ − σ_0),

where β_σ± = β_σ+ when D > 0 and β_σ± = β_σ− when D < 0 (the strength of the similarity effect need not be the same when D < 0 and when D > 0). When τ = 1, σ = 0 (a single estimate has no dispersion) and σ_0 is set to 0, such that models 0, 1 and 2 are equivalent (model 2 generalizes model 1). Model 2 fits the collective and individual improvement better (Supplementary Fig. S7, right column), as well as the relationship between the average weight given to social information and τ (Main Text, Fig. 3 and Supplementary Fig. S8, bottom panels), confirming that the asymmetry and similarity effects are important for describing the integration of several estimates.
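The generalized mean weight of model 2, which reduces to model 1 when σ = σ_0 = 0 and to model 0 when additionally τ = 1, can be sketched as follows (a minimal Python sketch; the coefficient values used in the tests are illustrative, the fitted values being reported in Supplementary Table S1):

```python
def mean_weight(D, tau, sigma, alpha, beta,
                beta_tau_plus, beta_tau_minus,
                beta_sigma_plus, beta_sigma_minus, sigma0=0.0):
    """Mean weight given to the social information in model 2:
    alpha + beta*|D| + beta_tau(+/-)*(tau - 1) + beta_sigma(+/-)*(sigma - sigma0),
    where the +/- coefficients are selected by the sign of the distance D
    (asymmetry effect in tau, similarity effect in sigma)."""
    b_tau = beta_tau_plus if D > 0 else beta_tau_minus
    b_sigma = beta_sigma_plus if D > 0 else beta_sigma_minus
    return alpha + beta * abs(D) + b_tau * (tau - 1) + b_sigma * (sigma - sigma0)
```

With τ = 1 and σ = σ_0 = 0 the last two terms vanish and the model-0 cusp α + β|D| is recovered.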
Note that the herding effect occurs whenever 0 < S < 1 and thus depends only on the parametrization. It therefore does not need to be explicitly included in the model.
All parameter values are reported in Supplementary Table S1.

Supplementary figure caption: Maximum improvement, against the group size N. For large enough groups (N > 20), the maximum improvement is close to its highest possible value (i.e. 1), at which the group is perfectly accurate after social information exchange (i.e. collective accuracy equals 0).