FROM BEHAVIOURAL FINANCE TO KNOWLEDGE MANAGEMENT TO COGNITIVE COMPUTATIONAL MODELS
ANNA MARIA BRUNO
Management and Administration Department, University of Torino, C.so U. Sovietica 218bis
Torino, 10134, Italy
E-mail: bruno@econ.unito.it
NICOLA MIGLIETTA
Management and Administration Department, University of Torino, C.so U. Sovietica 218bis
Torino, 10134, Italy
E-mail: miglietta@econ.unito.it
MARCO REMONDINO
e-Business L@B, University of Torino, C.so Svizzera 185
Torino, 10149, Italy
E-mail: remond@di.unito.it
This paper describes in detail the cognitive distortions studied by Behavioural Finance (BF) and transposes these concepts into the Knowledge Management (KM) context, with the proposal of formally embedding them into cognitive agents for computational modeling. Orthodox economic theory fails to represent the decision process of individuals in a realistic way, especially regarding the non-rational component of their behavior. By moving beyond those approaches, which assume completely rational behavior, BF explores the cognitive distortions that could lead to sub-optimal decisions and behaviors. The study of systematic cognitive errors also leads to an improvement of KM processes, where irrational human behavior affects the efficiency of collaborative groups through possible misinterpretations of data and of implicit knowledge in general. The authors propose a formal modeling technique to represent the cognitive distortions through computational agents, with the aim of creating realistic simulations of organizations and social systems in general.
1. Introduction
Empirical evidence shows that the economic agent exhibits systematic distortions with respect to the prescriptions of the traditional theories of market efficiency. For example, individuals typically suffer excessively from the fear of having to regret their past actions. Risk aversion (i.e., the reluctance of a person to accept a bargain with an uncertain payoff rather than another bargain with a more certain, but possibly lower, expected payoff) could prevent the economic agent from taking full advantage of earning opportunities, just to avoid losses she could later reproach herself for. Individuals’ behavior is often based on cognitive dissonance, giving rise to an attitude defined as “refusal”, i.e., the defense of the choices already made even when the empirical evidence would suggest a change of strategy. Last but not least, when an individual has to choose, she is usually biased by a system of behavioural influences that is far from the rationality criteria of economic convenience. The consequences of these partially distorted behaviors are not explained by Orthodox Theory and are usually defined as “anomalies”. BF, on the contrary, aims to study these phenomena; developed at the end of the Seventies, it now represents a discipline in its own right, towards which the scientific community shows an increasing interest, in particular for its implications for market efficiency and investment processes. Learning and knowledge diffusion processes are also often characterized by a series of cognitive errors. When dealing with data and information, individuals tend to look for confirmation of their own beliefs and convictions, often avoiding the items that would refute their positions.
Some of the most common errors found in the KM literature are: not defining the concept of knowledge, considering knowledge as a stock variable rather than a flow variable, not creating a shared context, undervaluing tacit knowledge, and totally replacing human contact with technology. This is why modeling systemic errors contributes to an improvement of KM processes. In order to diffuse knowledge, the creation of a KM system is not enough, and neither is the implementation of a network that leverages the potential of Information Technology (IT) to make collaboration grow faster in an organization.
On the contrary, it is very important to focus first on how individuals perceive knowledge, and only afterwards on how it will be processed and managed. The centrality of the individual in the KM process recalls the cognitive distortions studied by BF and suggests that the same topics can be analyzed in the decision-making process. The following sections present the main theoretical principles of BF and analyze in detail the most frequent distortions on which the approach is based. Biases, Heuristics and Framing Effects are systematized in order to characterize the potential effects of the behavioural distortions.
After that, the main errors that can be made when dealing with KM are compared to these distortions and analyzed in detail. The last part of the paper proposes an action selection mechanism, to be used in computational models, based on these kinds of biases; this would make models more realistic and would allow a better understanding, by means of simulation, of the aggregate behavior of the agents involved in social systems. Action selection paradigms are usually based on reinforcement learning (RL) techniques, which assume a completely rational and perfect perception of the payoff coming from the environment. The proposal contained in this paper is to add an emotional component to the action selection algorithms, so as to better represent real situations and not just idealized ones, in which the behavior is optimal and rational. A note for readers: in the present article, when describing a behavior as “real” or “usual”, the authors mean the behavior that is most frequently met in the real world, as derived from socio-economic concepts, studies and analyses.
2. Behavioural Finance, Cognitive Distortions and Prospect Theory
Many classical economists investigated the relationships between Cognitive and Economic Sciences; Smith, Marshall, Pigou, Fisher and Keynes studied the psychological foundations of preferences and beliefs. From the Forties onwards, Samuelson and Hicks demonstrated that technical virtuosity was naturally complementary to rationality hypotheses. The definitive elimination of psychological elements from economic theory dates back to the Seventies. In a scientific context focused on modeling the complete rationality of economic agents, two scientific contributions appeared that are still considered the main theoretical premises of BF. These are the works by Kahneman and Tversky on “heuristics and preferences” and on the so-called “prospect theory”, respectively. These works, which bring Psychology and Economics closer together, were soon followed by those of Richard Thaler. The orthodox theoretical discipline of Finance was created in order to fix prescriptive rules and to describe the choices made in the real world. BF does not intend to dispute these rules, but aims to describe reality where traditional models fail. The two research approaches are thus complementary: traditional Finance gives proper indications about how to take decisions, while BF effectively describes the behavior of agents within markets and organizations. The model of traditional Finance is founded on the figure of a rational agent who systematically makes the correct choices, on the assumption that decision makers follow an optimizing behavior. In order to justify the validity of this approach, the possible irrationality of the agents is treated as a behavior that, since arbitrage exists, does not produce relevant financial effects. Thus, the theoretical approach should be judged not on the hypotheses on which it is based, but on the validity of the estimates it can produce (Friedman, 1953).
The purpose of BF is nevertheless to propose an alternative view of the choices made by the agents, and thus to verify whether traditional financial theories can actually produce satisfactory estimates. Many economic disciplines, like Labor Economics, Financial Science and Macroeconomics, evaluate the behavior of the agents and the consequences of their actions. Finance is probably the discipline that leaves the least room for behavioural aspects. Its central hypothesis, during the last thirty years, has been that of market efficiency. Fama defines as “efficient” a market in which the asset price systematically reflects all the available information. Under this hypothesis an investor will never be able to beat the market consistently, and the most appropriate solution is to manage investments passively. In addition to the evaluative rationality of the agents, the theory admits the possibility of non-rational choices which, nevertheless, would not influence prices, thanks to the role of the so-called arbitrageurs who, by acting rationally, would nullify the effects caused by irrational agents. These hypotheses presuppose the capability to evaluate the asset value which, following the principles of discounted cash flows, should embed all the available information (past and present). In the real world, the empirical evidence shows that this value is hardly predictable and that it follows a random walk. Nevertheless, a wide literature exists on the validity of the market efficiency theory which, based on the “event-study” paradigm, confirms the fundamental hypothesis of weak-form and semi-strong efficiency for the asset value (Fama). Some empirical studies of financial markets from the second half of the ’70s show a volatility considerably higher than could be justified by low informational asymmetries and by regular oscillations of values, a volatility that could be exploited to achieve higher profits.
Even in recent times, traditional Finance still considers the achievement of a significant extra profit an anomaly, since the available information, being historical or public, would not allow it. The Capital Asset Pricing Model (CAPM) and the subsequent multi-factor models are used ex post to measure empirical results which ex ante are considered anomalous. BF offers alternative interpretations of the phenomena that traditional Finance classifies as a “puzzle”, i.e., an anomaly not explained by theory. Several kinds of analysis can describe the main regularities of financial markets, e.g., cross-section, seasonal pattern and event-based analyses, short-, medium- and long-term momentum, and the volatility of prices in the absence of information. However, the big financial crises, the speculative bubbles, or simply the variability of futures prices in the absence of information severely question the hypotheses of market efficiency. BF concentrates on the study of agents’ fallibility in competitive markets and aims to interpret their behavior and its impact on asset prices. The following section analyzes in detail the main cognitive distortions and models identified by BF.
2.1. Cognitive Distortions
The many “anomalies” of the markets suggest that agents’ behaviors are sometimes quite far from completely rational conduct. According to the heuristics approach, financial choices are based on experience as well as on rational grounds. The cognitive distortions at play in human behavior are divided into three categories: the heuristics, the biases, and the framing effects, which are a direct consequence of Prospect Theory. Heuristics are rules proposed to explain how individuals solve problems, form judgments and take decisions when facing complex situations or incomplete information. Their existence is justified by the claim that the human cognitive system has limited resources and, not being able to solve problems through purely algorithmic processes, uses heuristics as efficient strategies for simplifying decisions and problems. Even if they succeed in most cases, these strategies can lead to systematic errors. Biases are distortions caused by prejudices in favor of a point of view or an ideology. A decision system, such as an algorithm, can introduce biases that cannot be removed, but that can be taken into account ex post by correcting the perception in order to reduce their effects. The term “framing” refers to a process of selective influence on the perception of the meaning of words and sentences; these distortions are derived from Prospect Theory, whose aim is to explain how and why choices systematically differ from those predicted by standard decision theory.
2.1.1. Heuristics
As far as the decision processes of economic agents are concerned, the availability of information and the speed with which it is supplied and spread have increased significantly over time, forcing ordinary investors, but also specialized operators, to make a greater effort to process the data correctly. At a psychological level, when the amount and frequency of information increase, the brain tries to find “shortcuts” that reduce the processing time, so that a decision can be taken anyway. These shortcuts are called heuristics (or rules of thumb). On the one hand, they allow information to be managed quickly and selectively; on the other hand, they can lead to wrong or excessively simplified conclusions. Financial markets are characterized by herding phenomena that could lead to errors. Herding denotes the imitative attitude of individuals towards “winning” people, e.g., the so-called “financial gurus”, but also simple “herd following” conduct, i.e., trying to stay with the majority of investors. The most significant heuristics are representativeness, availability and anchoring. The first shows how economic agents tend to base their choices on stereotypes (e.g., the “winner-loser” effect), which could lead to errors caused by wrong estimates. Even though analysts are professional operators, representativeness could lead them to underestimate, in their decision process, the tendency of financial markets towards “mean reversion”. This expression indicates that, in general terms, both the high and the low prices of a stock are temporary, and that a stock’s price tends to return to an average level over time. As for availability, individuals tend to assign a probability to an event based on the number of past occurrences they remember and on the ease with which they recall them. Once again, the heuristic error is the consequence of a simplified cognitive model.
Anchoring is the third heuristic behavior that could generate errors in the decision process; it is the tendency of individuals to stay anchored to a reference value without updating their estimates. It is at the basis of the conservative attitudes often adopted by economic agents. Last but not least, the “affect heuristic” could also impact decisions: by following emotions and instincts, sometimes more than logical reasoning, some individuals could decide to act in a risky situation while refraining from acting in other, apparently safer, ones.
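The anchoring heuristic described above can be illustrated with a minimal sketch. It assumes a linear partial-adjustment rule, which is only one possible formalization and is not taken from this paper: the agent moves from its reference value towards the new evidence, but only by a fraction. The function name `anchored_estimate` and the parameter `adjustment` are hypothetical.

```python
def anchored_estimate(anchor, evidence, adjustment=0.3):
    """Move from the anchor towards the evidence, but only part of the way.

    adjustment = 1.0 models a fully updating agent; adjustment = 0.0 models
    an agent that never leaves its anchor. The linear rule and the default
    value are assumptions made for illustration only.
    """
    if not 0.0 <= adjustment <= 1.0:
        raise ValueError("adjustment must lie in [0, 1]")
    return anchor + adjustment * (evidence - anchor)


if __name__ == "__main__":
    # An agent anchored at 100 repeatedly observes evidence pointing to 60.
    estimate = 100.0
    for step in range(5):
        estimate = anchored_estimate(estimate, 60.0)
        print(step, round(estimate, 2))
```

With a small `adjustment`, the estimate approaches the evidence only slowly, which is one way of representing the conservative attitude mentioned in the text.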
2.1.2. Biases
A bias can be considered a systematic error. The most common biases are over-optimism, confirmation bias, the illusion of control and excessive self-confidence. Overconfidence is one of the most important and most studied effects in BF. Many individuals have excessive confidence in their own means, thus overestimating their capabilities, their knowledge and the precision of their information. The tendency towards overconfidence seems to be a natural feature of human behavior, and could be the main cause of major phenomena such as speculative bubbles on the financial markets.
Many recent studies have been devoted to two other phenomena that regularly take place on the financial markets: underreaction and overreaction. The former is the phenomenon by which asset prices under-react to new information in the short term, i.e., prices move slowly and only slightly in reaction to the announcement of related news. Sometimes prices under-react in the short term and subsequently correct this error, offering higher returns after one year or so. Overreaction is the phenomenon by which asset prices move too much in response to related news. Confirmation bias is a mental process that consists in giving the most importance, among the information received, to the items that reflect and confirm personal beliefs and, vice versa, in ignoring or belittling those that contradict inner convictions. This process, if exploited, could even be a powerful tool of control, since it could lead an individual or a community to deny the obvious. Hindsight bias, instead, is an error of retrospective judgment, i.e., the tendency of people to erroneously believe, after an event has taken place, that they would have been able to predict it correctly a priori. In the final phase of the judgment process, individuals usually tend to form an expectation of success that is often higher than the objective one. An illusory control model is the one based on wishful thinking: according to this phenomenon, an event is considered more probable than another one simply because it is seen as more desirable. Similarly, the fear of regret is also linked to the illusion of control. This is the tendency to feel distressed about having made a wrong choice rather than about the effect it produced. The consequence is that of postponing or even avoiding a choice, justifying this with the need to gather more information, even if this information will not change the decision maker’s mind.
Another basic behavioural principle is the so-called “aversion to ambiguity”, often referred to as “uncertainty aversion”. It can be summarized in the sentence “People prefer the familiar to the unfamiliar” and describes a preference for known risks over unknown risks, which can lead people to run a higher, though known, risk rather than a potentially lower, but unknown, one. This is not the same as “risk aversion”, which is the reluctance of a person to accept a bargain with an uncertain payoff rather than another bargain with a more certain, but possibly lower, expected payoff. Many people tend to be dissatisfied when they are convinced that they did not perform the best possible action; in order to avoid this, some people modify their behavior in an apparently irrational way. This is regret theory: people know that, when they make a decision, they will feel regret if the decision turns out to be wrong. They take this anticipated regret into account when they decide, and modify their decision accordingly. Between two alternatives, a person will choose the one for which she anticipates rejoicing, i.e., the one with the most desirable consequence, over the one for which she anticipates regret, i.e., the one with the least desirable consequence. The problem arises when a personal conviction is wrong and the person realizes it (cognitive dissonance). This is a form of a priori regret, which influences the final decision. Some people could even find fanciful and twisted arguments to support their original idea and reduce the cognitive dissonance, in order to avoid the inner conflict caused by the evidence of being wrong.
2.1.3. Framing Effects
Prospect Theory is an alternative to Expected Utility theory for understanding human behavior under uncertainty, and adopts an inductive and descriptive approach. This theoretical foundation can be interpreted as a synthetic representation of the most significant anomalies found in decision processes under uncertainty. The analysis underlying Prospect Theory highlights some behaviors seen as violations of Expected Utility: the certainty effect, the reflection effect and the isolation effect. The certainty effect refers to the fact that, when facing a set of positive outcomes, people tend to prefer those considered certain or almost certain over others with a higher expected value that are not certain. Many other important framing effects are derived from the certainty effect, e.g., the aversion to a certain loss, which leads people to secure choices even when these are less economically worthwhile. The reflection effect occurs when the previous situation is turned upside down, i.e., when the probability of a negative outcome is considered instead of that of a positive one. While individuals are risk-averse when considering positive situations, they tend to become risk-seeking when all the alternatives seem to be negative (they often choose the least certain ones, even when these are apparently worse, possibly hoping that they will turn out less negative). The isolation effect is the tendency to disregard the elements common to several possible choices and to focus only on the elements that differ. This can lead to errors, since apparently identical aspects of different situations can in fact differ when coupled with others (there can be several ways to decompose a real problem, and many situations are complex, which emphasizes the interaction among the parts).
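The certainty and reflection effects can be reproduced numerically. The following is a minimal sketch, not the formulation used in this paper: it assumes the power value function and the inverse-S probability weighting function with the median parameter estimates reported by Tversky and Kahneman (1992) (0.88 for curvature, 2.25 for loss aversion, 0.61 and 0.69 for the weighting of gains and losses). The gambles (4000 with probability 0.8 against a sure 3000, and their mirror images) are the classic illustration from Kahneman and Tversky (1979). All function names are hypothetical.

```python
def value(x, alpha=0.88, lam=2.25):
    """Subjective value: concave for gains, convex and steeper for losses."""
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** alpha)


def weight(p, gamma):
    """Inverse-S weighting: overweights small and underweights large probabilities."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def prospect_value(x, p, gamma_gain=0.61, gamma_loss=0.69):
    """Value of a simple prospect paying x with probability p and 0 otherwise."""
    gamma = gamma_gain if x >= 0 else gamma_loss
    return weight(p, gamma) * value(x)


if __name__ == "__main__":
    # Gains: the risky prospect has the higher expected value (3200 vs 3000).
    print("gain, risky :", prospect_value(4000, 0.8))
    print("gain, sure  :", prospect_value(3000, 1.0))
    # Losses: the same amounts with the sign reversed.
    print("loss, risky :", prospect_value(-4000, 0.8))
    print("loss, sure  :", prospect_value(-3000, 1.0))
```

Under these assumed parameters the sure gain is valued above the risky gain despite its lower expected value (certainty effect), while the risky loss is valued above the sure loss (reflection effect).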
3. KM and Perception: what we do Know and what we should Know
There is no conclusive and unique definition of KM; in general, the concept refers to the preservation and sharing of knowledge. In recent times, with the IT revolution, KM has come to be understood as the theoretical and applied research field that develops the knowledge cycle within a community by using technological tools. In 1986 Karl Wiig enunciated the principles of KM, and many enterprises, especially multinationals, have shown great interest in this theory. The aim of KM is pragmatic: to improve the efficiency of collaborative groups by making knowledge explicit and by sharing the knowledge possessed and acquired by each member during her professional career. The initial investments were concentrated on the development of tools to make data storage, description and communication easy and quick. This first generation focused mainly on the instrumental component of KM, i.e., IT, which is crucial but not fully comprehensive. The knowledge cycle goes beyond the transmission of data and information, since the whole hierarchy should be considered. At the base we find data, a “raw” and copious material from which information is built. At the next level we find information itself, i.e., selected and organized data that can be used for communicating. Then we find knowledge, which is information that has been revised and applied in practice. At the top level we find wisdom, i.e., knowledge distilled by intuition and experience.
The second phase of KM focuses on how to make the specific knowledge of each member available to the whole organization. This logic turns KM into a philosophy of collaboration and sharing within the working environment. It could meet resistance, especially from those who, after many years of experience, are convinced that their role is crucial. This view reduces knowledge to a sort of personal “luggage” that the possessor can take away when she leaves the organization, thus causing economic damage. Knowledge, however, is a cycle that leads to the production of new ideas only through the sharing and processing of information. While data analysis, indexing and diffusion are now easily attainable operations thanks to IT tools, knowledge creation and sharing are the key factors for an effective KM policy. A lack of collaborative culture and knowledge sharing makes even the best tools inefficient. What emerges from these considerations is that the individual is central; Choo Chun Wei (1995) identified three groups of individuals in an intelligent organization and represented them in a cognitive pyramid. The domain experts are those who create and use knowledge. Their main activity is to ensure the effectiveness of the whole organization through innovation, adaptation and learning. The information experts are those who organize knowledge into schemes and structures to facilitate accessibility, improving its value and ease of use. The IT experts are those who create and manage technological infrastructures in order to make processing and sharing more effective. Their role is to ensure the efficiency of the whole process. The user-client is located in the middle of the pyramid, in the triangle left apparently empty. Her main goal is to separate information management from technological management, so that the main systemic goals are kept clear, in order to receive information and turn it into knowledge.
Collaboration among these figures is very important.
Figure 2. Knowledge Creating and Diffusing Activities
The human factor, i.e., the workers, is the focal point around which all the activities revolve. Workers possess information and implicit knowledge, whose formalization and sharing are the main goals of KM. The role and specific qualities of people in their working environments are therefore the main factors to consider. Lack of time, turnover and distance among collaborators slow down the growth of KM processes. The tendency to replace working teams with virtual ones, linked through technological infrastructures, creates intellectual distance among people, who concentrate more and more on their own individual tasks rather than sharing what they know and learn. The same negative effect is present when the individual workload is excessive; under these circumstances knowledge exchange is not perceived as a priority, since it is not part of the job tasks. The main cause of workers’ resignations is the conviction that the enterprise is not valuing their talent; when a person leaves the organization, part of its knowledge is lost with her. According to many authors, the KM process is affected by a series of errors that, in some cases, could impair or even stop knowledge development. It is interesting to notice that, more than a decade after the pioneering contribution by Fahey and Prusak, many of those errors are still relevant and are not considered in many systemic KM processes. The basic principle stating that each organizational learning project should be founded on error detection and correction, when not followed, causes an ongoing deterioration of knowledge and a higher risk of compromising the decision-making process. Managers’ wrong choices would be, to this extent, a direct consequence of systemic cognitive errors. For this reason Fahey and Prusak identified the eleven deadliest sins of knowledge management, listed in fig. 2.
1. Not developing a working definition of knowledge
2. Emphasizing knowledge stock to the detriment of knowledge flow
3. Viewing knowledge as existing predominantly outside of the heads of individuals
4. Not understanding that a fundamental intermediate purpose of managing knowledge is to create shared context
5. Paying little heed to the role and importance of tacit knowledge
6. Disentangling knowledge from its uses
7. Downplaying thinking and reasoning
8. Focusing on the past and the present and not the future
9. Failing to recognize the importance of experimentation
10. Substituting technological contact for human interface
11. Seeking to develop direct measures of knowledge
Figure 2. The eleven deadliest sins of knowledge management (Fahey and Prusak, 1998)
In a recent interview, more than 10 years after that work, Prusak states:
Error 1: This is still one big error. Everywhere I speak people conflate information and knowledge - and this situation is greatly abetted by IT vendors and consultants for obviously commercial reasons. I would estimate that tens of billions of dollars have been wasted by organizations trying to work with knowledge by buying IT tools. Since none of this is taught in Business schools or perhaps ANY schools it isn't too surprising that most people can't define knowledge as distinct from information.
Error 2: This is also still an issue, though we have made much progress in it. There can't be too many organizations these days who still feel that large collections of documents is the best way to work with knowledge, at least not in the US or Europe.
Error 3: I would write this one a bit differently today. While knowledge is still produced and absorbed by people the distinctions between where the knowledge actually resides isn't always worth fighting over.
Error 4: This is as true as ever, even more so with virtuality and all its discontents gaining adherents. Context is a good synonym for knowledge itself, and is best (perhaps only) created through live give and take, etc. It can't be done well, if at all, through email and other e-exchanges.
Error 5: I think too much has been made about the distinctions between tacit and explicit knowledge, all those models for moving one to another, etc. All knowledge is always both tacit AND explicit.
Error 6: This one is also still true. KM in general follows pragmatism as a philosophy in not believing in distinctions between knowing and action. Isolating knowledge as a thing apart is mostly pointless in business, as contrasted with academics.
Error 7: Well, anyone who thinks that anti-intellectualism isn't a very strong force in American and UK culture is just out to lunch. If anything it's gotten stronger with the continuous use of varied media like IM, Google, etc. to replace real reflection and serious reading. I travel all the time and in contrast to years ago, I almost never see people reading anything substantial while flying. I'm told by friends who teach MBAs at the "top" schools that they can't get their students to read anything not online.
Error 8: This is also part of a bigger discussion that many management theorists and practitioners are having about how to escape the iron cage of short-termism. Many of us think that every executive needs to be more mindful of all those Black Swans out there waiting to strike. I haven't any idea how to change this but change it must!
Error 9: Rewarding failure is never easy; it is never going to be too popular. But we must do it to have a culture of knowledge growth. How else can any organization learn if it is afraid to do and think things? So this sin is still valid.
Error 10: This one has waned in commission. While technophiles still abound, they have less salience in KM discussions where they once dominated. No one thinks anymore that technology doesn’t have a real role in any KM work, but no one I know still thinks that KM is mainly a problem needing a technological fix to cure (well, maybe a few deluded souls at some technology companies).
Error 11: Once again, I think this battle is won. There is some great research being done on what actually can be measured in regards to knowledge activities, and more will be done in the future. But no one anymore tries to measure knowledge, per se. This is one we managed to kill.
4. Agents and Reinforcement Learning
When dealing with the problem of action selection in a computational model, either reactive or cognitive agents can be employed. Reactive agents feature a hard-wired behavior, deriving from embedded conditional rules that cannot be changed by circumstances and must be foreseen and wired into the agents by the model designer. This behavior can be deterministic or stochastic, but it will not change with experience. Reactive agents are good for simulations, since the results obtained by employing them are usually easy to read and compare (especially in ceteris paribus analyses). Besides, when the agent’s behavior is not the primary focus, reactive agents whose rules are properly chosen give very interesting aggregate results, often letting system properties emerge at a macro level. However, in situations in which, for example, learning coordination is important, or the focus is on exploring different behaviors in order to dynamically choose the best one for a given state, or the agent’s behavior is simply the principal topic of the research, cognitive agents embedding some learning technique can be employed. Moreover, if the rules of a reactive agent are not chosen properly, they could bias the results; these rules are in fact chosen by the designer and thus reflect her own opinions about the modeled system. Since many computational models of social systems are formulated as stage games with simultaneous moves made by the agents, some learning techniques derived from this field can be embedded into them, in order to create a more realistic response to external stimuli by endowing the agents with a self-adapting ability.
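The difference between the two kinds of agents can be made concrete with a short sketch. It is an illustration under stated assumptions, not the model proposed in this paper: the class names, the epsilon-greedy exploration and the running-average payoff estimate are all choices made here for brevity.

```python
import random


class ReactiveAgent:
    """Hard-wired behavior: a fixed state -> action table set by the designer."""

    def __init__(self, rules):
        self.rules = dict(rules)

    def act(self, state):
        return self.rules[state]

    def observe(self, state, action, payoff):
        pass  # experience never changes the rules


class CognitiveAgent:
    """Adaptive behavior: keeps a running-average payoff per (state, action)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon          # assumed exploration rate
        self.rng = random.Random(seed)  # private generator, reproducible runs
        self.estimates = {}
        self.counts = {}

    def act(self, state):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions,
                   key=lambda a: self.estimates.get((state, a), 0.0))

    def observe(self, state, action, payoff):
        # Incremental mean of the payoffs seen for this (state, action) pair.
        key = (state, action)
        n = self.counts.get(key, 0) + 1
        old = self.estimates.get(key, 0.0)
        self.counts[key] = n
        self.estimates[key] = old + (payoff - old) / n


if __name__ == "__main__":
    reactive = ReactiveAgent({"s": "a"})
    cognitive = CognitiveAgent(["a", "b"], epsilon=0.0)
    for agent in (reactive, cognitive):
        agent.observe("s", "a", -1.0)  # action "a" turned out badly
        print(type(agent).__name__, "now chooses", agent.act("s"))
```

After the same negative experience, the reactive agent keeps applying its rule, whereas the cognitive agent switches to the alternative action.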
Though, multi-agent learning is more challenging than single-agent, because of two complementary reasons. Treating the multiple agents as a single agent increases the state and action spaces exponentially and is thus unusable in multi agent simulation, where so many entities act at the same time. On the other hand, treating the other agents as part of the environment makes the environment non-stationary and non-Markovian. In particular, models are non-Markovian systems if seen from the point of view of the agents (since the new state is not only function of the individual agent’s action, but of the aggregate actions of all the agents) and thus traditional Q-learning algorithms cannot be used effectively: the actors involved in real Social Systems have a local vision and usually can only see their own actions or neighbors’ ones (bounded rationality) and, above all, the resulting state is function of the aggregate behaviors, and not of the individual ones. While, as discussed in, in iterated games learning is derived from facing the same opponent (or another one, with the same goals), in social systems the subjects can be different and the payoff could not be a deterministic or stochastic value coming from a payoff matrix. More realistically, in social systems the payoff could be a value coming from the dynamics of interaction among many entities and the environment, and could have different values, not necessarily within a pre-defined scale. Besides, social models are not all and only about coordination, like iterated games, and agents could have a bias towards a particular behavior, preferring it even if that’s not the best of the possible ones, as expressed in the previous paragraphs.
Learning from reinforcements has received substantial attention as a mechanism for robots and other computer systems to learn tasks without external supervision. The agent typically receives a positive payoff from the environment after it achieves a particular goal or, even simpler, when a performed action gives good results. In the same way, it receives a negative (or null) payoff when the action (or set of actions) performed leads to a failure. By performing many actions over time (trial and error technique), the agents can compute the expected value (EV) of each action. According to Sutton, this paradigm turns values into behavioural patterns; in fact, each time an action needs to be performed, its EV is considered and compared with the EVs of the other possible actions, thus determining the agent’s behavior, which is not wired into the agent itself, but self-adapting to the system in which it operates. Most RL algorithms are about coordination in multi-agent systems, defined as the ability of two or more agents to jointly reach a consensus over which actions to perform in an environment. In these cases, an algorithm derived from the classic Q-Learning technique can be used. The EV for an action, EV(a), is simply updated every time the action is performed, according to Eq. (1), reported by Kapetanakis and Kudenko (2004):
$$EV(a) \leftarrow EV(a) + \lambda\,\bigl(p - EV(a)\bigr) \tag{1}$$
Where λ is the learning rate and p is the payoff received every time action a is performed.
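The trial and error procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes that the update takes the common incremental form EV(a) ← EV(a) + λ(p − EV(a)), and the function names, the random exploration policy and the constant payoffs of the toy environment are hypothetical choices, not part of the original model.

```python
import random


def update_ev(ev, payoff, learning_rate):
    """One-step update: move the current estimate towards the received payoff."""
    return ev + learning_rate * (payoff - ev)


def run_trials(payoff_fns, learning_rate=0.1, trials=1000, seed=0):
    """Perform random actions (trial and error) and keep one EV per action."""
    rng = random.Random(seed)
    actions = list(payoff_fns)
    ev = {a: 0.0 for a in actions}
    for _ in range(trials):
        a = rng.choice(actions)
        ev[a] = update_ev(ev[a], payoff_fns[a](rng), learning_rate)
    return ev


def best_action(ev):
    """Purely rational choice: the action with the highest EV found so far."""
    return max(ev, key=ev.get)


# Hypothetical environment: two actions with constant payoffs.
payoffs = {"a1": lambda rng: 1.0, "a2": lambda rng: 2.0}
ev = run_trials(payoffs)
print(best_action(ev))  # a2
```

An agent that always calls `best_action` is the perfectly rational agent that the following section argues against.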
5. Biasing the Learning Algorithms
Simulation applied to social systems is not necessarily about coordination among agents and convergence to the optimal behaviour, especially when focusing on the aggregate level; it’s often more important to have a realistic behaviour for the agents, in the sense that it should replicate, as much as possible, that of real individuals. The aforementioned RL algorithm analytically evaluates the best action based on historical data, i.e.: the EV of the action itself, over time. This makes the agent perfectly rational, since every time it has to act it will select the best possible action found until then. While this is very useful for computational problems where convergence to an optimal behavior is important, it’s not realistic when applied to a simulation of a social system. In this kind of system, learning should take into account the human factor, in the shape of perception biases, preferences, prejudice, external influences and so on. When a human (or an organization driven by humans) faces an alternative, the past results, though important for evaluation, are just one of the many components behind the action selection process, and the distortions analyzed in previous paragraphs should be taken into account for the individual agents, in order to create more realistic models of financial markets.
Traditional learning models either can’t represent individualities in a social system, or represent all of them in the same way, i.e.: as focused and rational agents; since they ignore many other aspects of behavior that influence how humans make decisions in real life, these models do not accurately represent real users in social contexts.
This is the main reason for proposing the behavioural approach for simulations, derived from the described theoretical frameworks.
Starting from Eq. (1), the next section introduces some equations showing how RL algorithms can be changed so as to reflect the cognitive distortions of human beings.
6. Distortions
Even if preferences can be modified according to the outcome of past actions (and this is well represented by the RL algorithms described before), humans keep an emotional part driving them to prefer a certain action over another one, as described in previous paragraphs. That’s the point behind learning: humans aren’t machines, able to analytically evaluate all the aspects of a problem and, above all, the payoff deriving from an action is filtered by their own perception bias. There’s more than just a self-updating function for evaluating actions; in the following, a formal reinforcement learning method is presented which takes into consideration a possible bias towards a particular action, which, to some extent, makes it preferable to another one that has analytically proven better through the trial and error period. As a very first step in that direction, Ego Biased Learning, introduced by Marco Remondino, allows the personal factor to be taken into consideration when applying a RL paradigm, by modeling two perception errors described in paragraph 2: Anchoring and Affect Heuristics.
In the first formulation, a dualistic action selection is considered, i.e.: $A = \{a_1, a_2\}$. By applying the formal reinforcement learning technique described in equation (1), an agent is able to obtain the expected value of the action it performed. We imagine two different categories of agents ($C_1$, $C_2$): one biased towards action $a_1$ and the other one biased towards action $a_2$. For each category, a constant is introduced ($0 \le K_1, K_2 \le 1$), defining the propensity for the given action, used to evaluate $\overline{EV}(a_1)$ and $\overline{EV}(a_2)$, which are the expected values of the actions, corrected by the bias. For the category of agents biased towards action $a_1$ we have:
$$\begin{cases} \overline{EV}_{C_1}(a_1) = EV(a_1) + |EV(a_1)|\,K_1 \\ \overline{EV}_{C_1}(a_2) = EV(a_2) - |EV(a_2)|\,K_1 \end{cases} \tag{2}$$
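In this way, $K_1$ represents the propensity of the first category of agents towards action $a_1$ and acts as a percentage increasing the analytically computed $EV(a_1)$ and decreasing $EV(a_2)$. In the same way, $K_2$ represents the propensity of the second category of agents towards action $a_2$ and acts on the expected values of the two possible actions as before:
$$\begin{cases} \overline{EV}_{C_2}(a_1) = EV(a_1) - |EV(a_1)|\,K_2 \\ \overline{EV}_{C_2}(a_2) = EV(a_2) + |EV(a_2)|\,K_2 \end{cases} \tag{3}$$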
The constant $K$ acts like a “friction” for the EV function; after calculating the objective $EV(a)$ it increments it by a percentage, if $a$ is the action for which the agent has a positive bias, or decrements it, if $a$ is the action for which the agent has a negative bias. In this way, the agent $C_1$ will perform action $a_1$ (instead of $a_2$) even if $EV(a_1) < EV(a_2)$, as long as $\overline{EV}_{C_1}(a_1)$ is not less than $\overline{EV}_{C_1}(a_2)$. In particular, by analytically solving the following:
$$EV(a_1) + |EV(a_1)|\,K_1 \ge EV(a_2) - |EV(a_2)|\,K_1 \tag{4}$$
We have that agent $C_1$ (biased towards action $a_1$) will perform $a_1$ as long as:
$$EV(a_1) \ge \frac{1 - K_1}{1 + K_1}\,EV(a_2) \tag{5}$$
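The step from equation 4 to equation 5 can be made explicit. The sketch below assumes that the biased values have the form $EV(a) \pm |EV(a)|\,K_1$ (an assumed reading of equation 2, consistent with the remarks on negative values made later in this section) and that both objective expected values are positive, so that the absolute values can be dropped:

```latex
\begin{align*}
EV(a_1) + |EV(a_1)|\,K_1 &\ge EV(a_2) - |EV(a_2)|\,K_1 \\
EV(a_1)\,(1 + K_1) &\ge EV(a_2)\,(1 - K_1) \\
EV(a_1) &\ge \frac{1 - K_1}{1 + K_1}\,EV(a_2)
\end{align*}
```

Dividing by $1 + K_1$ does not change the direction of the inequality, since $K_1 \ge 0$.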
Equation number 5 applies when both $EV(a_1)$ and $EV(a_2)$ are positive values. If $EV(a_1)$ is positive and $EV(a_2)$ is negative, then $a_1$ will obviously be performed (this being a sub-case of equation 5), while if $EV(a_2)$ is positive and $EV(a_1)$ is negative, then $a_2$ will be performed since, even if biased, it wouldn’t make any sense for an agent to perform an action that proved even harmful (that’s why it went down to a negative value). If $\overline{EV}_{C_1}(a_1) = \overline{EV}_{C_1}(a_2)$, by definition, the performed action will be the favorite one, i.e.: the one towards which the agent has a positive bias.
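In numeric terms, for given values of $K_1$ and $EV(a_2)$, action $a_1$ will be performed by agent $C_1$ as long as $EV(a_1) \ge \frac{1 - K_1}{1 + K_1}\,EV(a_2)$. This friction gets even stronger for higher $K$ values: with a higher $K_1$, $a_1$ will be performed down to an even lower $EV(a_1)$, and so on.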
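The two-action selection rule can be illustrated with a short Python sketch. It assumes that the biased expected value is obtained by adding (favoured action) or subtracting (other action) the quantity |EV| · K; the function names and the numeric values are purely illustrative and are not taken from the original example.

```python
def biased_ev(ev, k, favoured):
    """Raise the EV of the favoured action by |EV|*k, lower any other by |EV|*k."""
    sign = 1.0 if favoured else -1.0
    return ev + sign * abs(ev) * k


def choose(ev_a1, ev_a2, k1):
    """Action performed by an agent biased towards a1; ties go to a1."""
    b1 = biased_ev(ev_a1, k1, favoured=True)
    b2 = biased_ev(ev_a2, k1, favoured=False)
    return "a1" if b1 >= b2 else "a2"


def threshold(ev_a2, k1):
    """Lowest EV(a1) for which a1 is still chosen, when both EVs are positive."""
    return ev_a2 * (1.0 - k1) / (1.0 + k1)


# Illustrative values only (assumptions, not the figures of the paper).
print(choose(9.0, 10.0, 0.1))            # a1
print(choose(8.0, 10.0, 0.1))            # a2
print(round(threshold(10.0, 0.1), 2))    # 8.18
print(round(threshold(10.0, 0.5), 2))    # 3.33
```

With these illustrative numbers, raising K from 0.1 to 0.5 lowers the threshold from about 8.18 to about 3.33, which is the stronger friction mentioned above.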
By increasing the value of $K_1$, the positive values of $EV(a_1)$ turn into higher and higher values of $\overline{EV}_{C_1}(a_1)$. At the same time, a negative value of $EV(a_1)$ gets less and less negative by increasing $K_1$, while never turning into a positive value (at most, when $K_1 = 1$, $\overline{EV}_{C_1}(a_1)$ gets equal to 0 for every negative $EV(a_1)$). For example, with $K_1 = 0.1$, $\overline{EV}_{C_1}(a_1)$ is 10% higher than $EV(a_1)$. Since $a_2$ is the action towards which the agent has a negative bias, it’s possible to notice that the resulting $\overline{EV}_{C_1}(a_2)$ is always lower (or equal, in case they are both 0) than the original $EV(a_2)$ calculated according to equation 1. In particular, a higher $K_1$ corresponds to more bias (a larger distance from the objective expected value), exactly the opposite of what happened before for action $a_1$. Note that for $K_1 = 1$ (i.e.: maximum bias) $\overline{EV}_{C_1}(a_2)$ never gets past zero, so that $a_2$ is performed if and only if $\overline{EV}_{C_1}(a_1)$ - and hence $EV(a_1)$ - is less than zero.
The first general case (more than two possible actions and more than two categories of agents) is actually a strict super-case of the one formalized in 4.1. Each agent is endowed with a biased evaluation function derived from equations (2) and (3). Let $C = \{c_1, \dots, c_n\}$ be the set of agents and $A = \{a_1, \dots, a_m\}$ the set of possible actions to be performed; then the specific agent $c_i$, with a positive bias for action $a_j$, will feature the following biased evaluation function:
$$\begin{cases} \overline{EV}_{c_i}(a_j) = EV(a_j) + |EV(a_j)|\,K_i \\ \overline{EV}_{c_i}(a_k) = EV(a_k) - |EV(a_k)|\,K_i, \quad \forall k \ne j \end{cases} \tag{6}$$
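This applies to each agent, of course by changing the specific equation corresponding to her specific positive bias. Even more generally, an agent could have a positive bias towards more than one action; for example, if agent $c_i$ has a positive bias for actions $a_j$ and $a_l$ and a negative bias for all the others, the resulting formalism is equation (7) and, in the most general case, for each $c_i \in C$ we have equation (8). In case two or more $\overline{EV}$ have the same value, the agent will perform the action towards which it has a positive bias; in the case explored by equation (7), in which the agent has the same positive bias towards more than one action, the choice of which action to perform, under the same $\overline{EV}$, can be managed in various ways (e.g.: randomly).
$$\begin{cases} \overline{EV}_{c_i}(a_j) = EV(a_j) + |EV(a_j)|\,K_i \\ \overline{EV}_{c_i}(a_l) = EV(a_l) + |EV(a_l)|\,K_i \\ \overline{EV}_{c_i}(a_k) = EV(a_k) - |EV(a_k)|\,K_i, \quad \forall k \notin \{j, l\} \end{cases} \tag{7}$$
$$\overline{EV}_{c_i}(a_k) = \begin{cases} EV(a_k) + |EV(a_k)|\,K_i & \text{if } c_i \text{ has a positive bias for } a_k \\ EV(a_k) - |EV(a_k)|\,K_i & \text{otherwise} \end{cases} \quad \forall a_k \in A \tag{8}$$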
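As a last general case, the agents could have a different positive/negative propensity towards different actions. In this case, the variable $K$ to be used won’t be the same for all the equations regarding an individual agent. For example, given a set of agents $C = \{c_1, \dots, c_n\}$ and a set of actions $A = \{a_1, \dots, a_m\}$, for each agent ($c_i \in C$) we have:
$$\overline{EV}_{c_i}(a_j) = EV(a_j) \pm |EV(a_j)|\,K_{i,j}, \quad \forall a_j \in A \tag{9}$$
where the sign is positive if $c_i$ has a positive bias for $a_j$ and negative otherwise.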
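The most general case, in which each agent has its own propensity for each action, can be sketched in Python as follows. As an assumption made only for this illustration, the sign of the bias is encoded in the value itself (a positive number favours the action, a negative one penalises it, zero means no bias); ties are broken in favour of a positively biased action and, among equally favoured ones, by taking the first, although other policies (e.g. random choice) are equally possible. All names are hypothetical.

```python
def biased_evs(ev, bias):
    """Apply a signed per-action bias to the objective expected values.

    ev:   dict mapping each action to its objective EV
    bias: dict mapping each action to a signed propensity in [-1, 1]
          (hypothetical encoding: the sign selects + or - in the formula)
    """
    return {a: v + abs(v) * bias.get(a, 0.0) for a, v in ev.items()}


def select(ev, bias):
    """Pick the action with the highest biased EV for one agent."""
    biased = biased_evs(ev, bias)
    top = max(biased.values())
    tied = [a for a, v in biased.items() if v == top]
    # On a tie, prefer an action towards which the agent is positively biased.
    favoured = [a for a in tied if bias.get(a, 0.0) > 0]
    return (favoured or tied)[0]


# Illustrative agent: likes a1, is neutral on a2, dislikes a3.
ev = {"a1": 4.0, "a2": 5.0, "a3": 6.0}
bias = {"a1": 0.5, "a2": 0.0, "a3": -0.2}
print(select(ev, bias))  # a1
```

In this illustration the objectively best action is a3, but the biased values are 6.0, 5.0 and about 4.8, so the agent performs a1.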
Instead of being a fixed parameter, K could also be a stochastic value, e.g.: drawn from a distribution with a given mean and variance.
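A stochastic K could be obtained, for instance, as in the following sketch. The choice of a normal distribution and the clamping of the drawn value to the interval [0, 1] are assumptions made here for illustration; the text only requires a mean and a variance.

```python
import math
import random


def sample_k(mean, variance, rng):
    """Draw K from a normal distribution and clamp it to [0, 1]."""
    k = rng.gauss(mean, math.sqrt(variance))
    return min(1.0, max(0.0, k))


# Illustrative parameters: mean 0.3, variance 0.01, one K per agent.
rng = random.Random(42)
ks = [sample_k(0.3, 0.01, rng) for _ in range(1000)]
```

Each agent would then use its own drawn K in the biased evaluation function, so that agents of the same category share the direction of the bias but not its exact strength.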
7. Conclusion and Future Directions
Evidence coming from the real world proves that individuals are not completely rational; their perceptions are biased and distorted by emotions, preferences and so on. BF is the discipline that studies and formalizes these biased behaviours. The authors described the main human cognitive distortions studied by BF; afterwards, they showed that very similar problems can be found in the KM domain, due to perception biases and social issues. In order to study these problems through innovative computational models of social systems, this work introduces a formal method for action selection, called Ego Biased Learning. It’s based on a one-step QL algorithm (eq. 1), but it takes into account the individual preference for one or more actions, thus being a very first step in formalizing human distortions in a RL algorithm. This method is designed to be used in simulations of social systems employing MAS, where many entities interact in the same environment and must take some actions at each time-step. In particular, traditional methods do not take into account the human factor, in the form of personal inclination towards different strategies, and consider the agents as totally rational and able to modify their behaviour based on an analytical payoff function derived from the performed actions.
Ego Biased Learning is first presented in the simplest case, in which only two categories of agents are involved and only two actions are possible. That’s useful to show the basic equations defining the paradigm and to explore the results when varying the parameters. After that, some general cases are addressed, i.e.: where an arbitrary number of agents’ categories is allowed, along with an equally arbitrary number of actions. There can be many sub-cases for these situations, e.g.: just one action is preferred and the others are disadvantaged, or an agent has the same bias towards several actions, or, in the most general situation, each action can have a positive or negative bias for an agent. This technique represents two of the most common perception errors studied by BF: Anchoring and Affect Heuristics. In future works, other biases will be introduced in the learning mechanism and formally described.
References
Alavi, M. & D. E. Leidner, (2001), Review: Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues. MIS Quarterly, 25(1): 107-136.
Austin, R. D., (1996), Measuring and Managing Performance in Organizations. New York: Dorset House.
Barberis N. & Thaler R., (2002), A survey of behavioural finance, Handbook of the Economics of Finance, New York,
Charles, S. K., (2002), Knowledge management lessons from the document trenches. Online (Weston, Conn.), 26 n. 1: p. 22-28.
Choi, H. L. B., (2003), Knowledge Management Enablers, Processes, and Organizational Performance. An Integrative View and Empirical Examination. 20(1): 179-228.
Choo, C. W., (1995), Information management for the intelligent organization: roles and implications for the information professions. - online
Cummings, J. N., (2004), Work Groups, Structural Diversity, and Knowledge Sharing in a Global Organization. 50(3): 352-364.
Davenport, T. H. & J. Glaser, (2002), Just-in-time delivery comes to knowledge management. 80(7): 107-111.
De Martino, B., Kumaran, D., Seymour, B. & Dolan, R.J., (2006), Frames, Biases, and Rational Decision Making in the Human Brain, in Science 313; pg. 684
Fahey L. & Prusak L., (1998), The eleven deadliest sins of knowledge management, California Management Review, 40(3)
Fama E., (1970), Efficient capital markets: a review of theory and empirical work, Journal of Finance, vol.25, pp.383-417
Festinger L., (1957), A theory of cognitive dissonance, Stanford University Press, Stanford
Friedman M. (1953), Essays in positive economics, University of Chicago Press, Chicago
Gardner H. (1983), Frames of Mind: The theory of multiple intelligences. New York: Basic Books.
Gigerenzer, G. & Todd, P.M., (1999), Simple heuristics that make us smart, Oxford University Press, New York; pp.189
Kahneman D., Slovic P. & A. Tversky, (1974), Judgement under uncertainty: heuristics and biases , Science, vol.185, pp.1124-1131
Kahneman D. & Tversky A., (1979), Prospect theory: an analysis of decision under risk , Econometrica, vol. 47, pp.263-291
Levin, D. Z. & Cross R. (2004), The Strength of Weak Ties You Can Trust: The Mediating Role of Trust in Effective Knowledge Transfer. 50(11): 1477-1490.
Lintner G., (1998), Behavioral finance: Why investors make bad decisions, ThePlanner, Cambridge; pp. 7-8.
Liebowitz, J. (ed.), (1999). Knowledge management handbook. CRC Press LLC, Boca Raton (FL)
Lucas R. (1975), An equilibrium model of the business cycle , Journal of political economy , vol.83, pp.1113-1144
Mandelbrot B. (1966), Forecast of future prices, unbiased markets, and martingale models, Journal of business, vol.39, pp.242-255
Markus, M. L., Majchrzak A. & Gasser L., (2002), A design theory for systems that support emergent knowledge processes. 26(3): 179-212.
Matlin M., Stang D., (1989), The Pollyanna principle: Selectivity in language, memory and thought, Schenkman, Cambridge, pp.45.
McNeil B.J., Pauker S.G., Sox H.C. & Tversky A., (1982), On the elicitation of preferences for alternative therapies, New England Journal of Medicine 306
Orlikowski, W. J. (2002), Knowing in Practice: Enacting a Collective Capability in Distributed Organizing. 13(3): 249-273.
Rabin M. (1997), Psychology and economics, Journal of economic literature, vol.36, pp.11-46
Shefrin H., (2000), Beyond Greed and Fear: Understanding Behavioural Finance and the Psychology of Investing, Harvard Business School Press, Boston
Shleifer A. (2000), Inefficient markets, Oxford University Press, Oxford
Singh, J. (2005), Collaborative Networks as Determinants of Knowledge Diffusion Patterns. 51(5): 756- 770.
Tanriverdi, H. (2005), Information technology relatedness, knowledge management capability, and performance of multibusiness firms. 29(2): 311–334.
Thaler R., (1997), Irving Fisher. Modern behavioural economist, American Economic review, n.2
Thaler R., (1999), Mental accounting matters, Journal of Behavioral Decision Making 12; pp. 183-206
Wiig, K. M. (1993), Knowledge Management Foundations: Thinking about Thinking–How People and Organizations Create, Represent, and Use Knowledge. Arlington, TX: Schema Press.
Wiig, K. M. (1995), Knowledge Management Methods: Practical Approaches to Managing Knowledge, Arlington, TX: Schema Press.
Wiig, K. M. (1997). Knowledge Management: Where did it come from and where will it go?, Expert Systems with Applications, 13, 1, 1-14.
Wednesday, December 9, 2009