Investigating Learning Strategies In A Social Interaction Task: A Simulation Study.

The present work is a simulation study. We theoretically tested a newly developed probabilistic reward-based learning task with social feedback. The task was designed to investigate how individuals respond to social inclusion and exclusion, and how these two opposing experiences affect learning and decision making in a dynamic social interaction. Before testing the paradigm in participants, we aimed to determine the size of our hypothesized effect in order to establish how many participants, blocks and trials would be needed to detect it. We expect social inclusion and exclusion to affect learning, and therefore hypothesize different choice patterns in the social compared to the non-social condition. Hypotheses are formulated with respect to accuracy, the outcome variable reflecting the number of "correct" choices. In the non-social learning condition, we expect behavior to be driven by the goal of maximizing reward, as has been observed in numerous studies applying probabilistic reinforcement learning tasks. We therefore predict high accuracy, since the "correct" option (the option associated with the higher reward probability) should be chosen more frequently. In the social learning condition, we expect learning from social feedback to be biased by preferences to respond pro- or antisocially to an exclusion experience. These preferences are reflected in the number of passes played to the excluder, which is analogous to accuracy in the non-social task. For the purpose of this study, we define prosocial behavior only with regard to the source of exclusion, i.e. we are interested in how behavior towards the excluder changes once the agent has learned that the probability of receiving positive social feedback from the excluder is low compared to the other available option. In a reinforcement learning framework, this means that prosocial agents should display lower accuracy rates than in the non-social condition, since their aim runs contrary to reward maximization.
If agents exhibit antisocial behavior towards the excluder, their choice pattern and accuracy levels should resemble those observed in the non-social condition. On a group level, accuracy rates in the social condition should be at chance level, because prosocial and antisocial choice patterns offset each other. In summary, in the non-social condition we expect behavior to be similar for all agents, since the common goal is reward maximization, which results in high accuracy. In the social condition we expect greater variability in behavior, since some individuals tend to respond prosocially to the source of exclusion (more passes to the excluder, low accuracy), while others respond antisocially when excluded (more passes to the includer, high accuracy). In terms of the computational models, this implies that our suggested models should differentially account for behavior in the two conditions. If agents in the social condition show high accuracy levels, this could reflect two things: (1) they chose the option that is more likely to result in positive feedback, or (2) they intentionally "punished" the source of exclusion. By inspecting the behavioral outcome (accuracy) alone, we cannot tell whether this behavior is driven by the goal of reward maximization or reflects antisocial behavior towards the excluder. By formalizing behavior with mathematical models, we aim to determine which motive is more likely to be the driving force behind the hypothesized choice patterns. One of our suggested models incorporates an additional parameter to account for individual preferences to respond pro- or antisocially to exclusion. Following the guidelines proposed by Palminteri et al. (2016), we performed a simulation study to examine the predictive and generative performance of our two suggested computational models. To this end, we simulated two data sets that reflected the behavior we expected to observe in the social and the non-social condition of the task.
We then fitted two models to both data sets and examined model fits: a standard reinforcement learning model, and a reinforcement learning model with an additional parameter that accounts for prosocial behavior in the social task. We further performed model simulations with the individual best-fitting parameter estimates to investigate whether the winning model for the respective data set was capable of reproducing the effect of interest.
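
The simulate-then-fit logic described above can be illustrated with a minimal sketch. The abstract does not specify the model equations, so everything below is an assumption made for illustration: a Rescorla-Wagner value update with a softmax (logistic) choice rule, reward probabilities of 0.8 and 0.2, and the hypothetical parameter names `alpha` (learning rate), `beta` (inverse temperature) and `phi` (a bias toward the excluder, standing in for the additional pro-/antisocial parameter). Option 0 denotes the "correct" option (the includer in the social task) and option 1 the excluder.

```python
import math
import random


def p_correct(q, beta, phi=0.0):
    """Probability of choosing option 0 (the "correct" option / includer).

    phi is a hypothetical bias toward option 1 (the excluder):
    phi > 0 mimics prosocial choices, phi < 0 antisocial choices,
    and phi = 0 reduces to the standard model. (Assumed form.)
    """
    return 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1] - phi)))


def simulate_agent(alpha, beta, phi=0.0, n_trials=200,
                   p_reward=(0.8, 0.2), rng=None):
    """Simulate one agent; returns the lists of choices and rewards."""
    rng = rng or random.Random()
    q = [0.0, 0.0]  # learned values of the two options
    choices, rewards = [], []
    for _ in range(n_trials):
        choice = 0 if rng.random() < p_correct(q, beta, phi) else 1
        reward = 1 if rng.random() < p_reward[choice] else 0
        q[choice] += alpha * (reward - q[choice])  # prediction-error update
        choices.append(choice)
        rewards.append(reward)
    return choices, rewards


def neg_log_likelihood(choices, rewards, alpha, beta, phi=0.0):
    """Negative log-likelihood of observed choices under given parameters."""
    q = [0.0, 0.0]
    nll = 0.0
    for choice, reward in zip(choices, rewards):
        p0 = p_correct(q, beta, phi)
        p_choice = p0 if choice == 0 else 1.0 - p0
        nll -= math.log(max(p_choice, 1e-12))  # guard against log(0)
        q[choice] += alpha * (reward - q[choice])
    return nll


def accuracy(choices):
    """Proportion of choices of the "correct" option (option 0)."""
    return choices.count(0) / len(choices)


# Illustrative parameter values only; they are not taken from the study.
rng = random.Random(1)
standard = [simulate_agent(0.3, 5.0, 0.0, rng=rng) for _ in range(20)]
prosocial = [simulate_agent(0.3, 5.0, 1.0, rng=rng) for _ in range(20)]
mean_acc_standard = sum(accuracy(c) for c, _ in standard) / len(standard)
mean_acc_prosocial = sum(accuracy(c) for c, _ in prosocial) / len(prosocial)
print(mean_acc_standard, mean_acc_prosocial)
```

Under these assumptions, fitting the standard model corresponds to minimizing `neg_log_likelihood` with `phi` fixed at 0, and fitting the extended model to minimizing it with `phi` free; the two fits can then be compared with a model comparison criterion of choice, and the best-fitting parameters passed back to `simulate_agent` to check whether the effect of interest is reproduced.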

Related Organizations

Freie Universität Berlin
Germany

Keywords

computational modeling, simulation, social decision making, social reinforcement learning
