Date: 18 July, 2023

Dataset Title: Data on the effects of transparency on people’s perceptions of social AI

Dataset Creators: Ying Xu, Nora Bradford

Dataset Contact: Ying Xu, yxying@umich.edu

Research Overview:
The data were collected from a survey study administered in Qualtrics. In this between-subjects experimental vignette study, we investigated how introducing a chatbot with varying levels of transparency and different framings (intelligent entity vs. machine) affected participants' perceptions of social chatbots. We presented participants with identical conversation exchanges between a hypothetical user (Casey) and a chatbot (Neo) and asked them to complete a subsequent survey.

Methodology:
The study used a two-by-two design with four experimental conditions (Non-transparent Intelligent Frame, Transparent Intelligent Frame, Non-transparent Machine Frame, and Transparent Machine Frame) and one control condition (Baseline Human Frame Control Group). We manipulated two factors: transparency (providing an explanation of how the chatbot worked) and framing (introducing the chatbot as an intelligent entity or as a machine). After reading the conversation exchanges, participants completed a manipulation check assessing their understanding of the chatbot's mechanism and then answered questions about their perceptions of the chatbot. The chatbot scenarios were based on real-life commercial chatbots and covered three broad areas: hobbies and interests, advice seeking, and sharing emotion. The scenarios were presented as short video clips, filmed from the user's perspective, and included typing indicators to increase the chatbot's social presence.

All study participants were recruited from Amazon Mechanical Turk (MTurk). To be eligible for the study, participants were required to be at least 18 years old, to reside in the U.S., and to have an MTurk task approval rating over 95%.
Prior to the study, all interested participants received an introduction detailing the study procedures and then decided whether to join. They received $4 as compensation upon completing the study, which typically lasted 30 minutes.

Manipulation Factors:
As described above, this study included one control condition and four experimental conditions built from two manipulation factors: transparency and framing.

Transparency:
Our study offered simple, up-front transparency that explained how the chatbot Neo worked. Following Bellotti and Edwards’s suggestions, our explanation was designed to cover “what they (the AI systems) know, how they know it, and what they are doing with that information” (Bellotti & Edwards, 2001). Specifically, we provided information on how AI chatbots understand language and emotion and use user-provided data to engage in dialogue: the explanation informed users that the chatbot's ability to comprehend language and decode sentiment resulted from its being pre-trained on a large volume of natural-language data. It also clarified that the chatbot collected only non-sensitive information and used that information to respond to each user in a personalized way. This type of language is intended to fill the knowledge gap between a user's intuition about a system and the system's actual internal processes (Rohlfing et al., 2020). Thus, we operationalize transparency as the provision of information, which is distinct from users’ perceptions of transparency.

Framing:
In terms of framing, the chatbot was introduced as either an intelligent entity or a machine. This language was adapted from Araujo (2018). Participants exposed to the intelligent framing were told that “Neo is Casey's AI friend. Casey and Neo have been chatting almost every day for three months. Neo is there for Casey whenever Casey wants to talk.” Participants exposed to the machine framing were told that “Neo is a chatbot app on Casey’s phone.
Casey can send and receive messages with the chatbot at any time. Casey has been using the app almost every day for three months.” In the control condition, participants read an introduction saying, “Neo is Casey's friend, and they met in a chatroom.”

Outcome Measures and Derived Variables:
Four dimensions of perception, namely perceived creepiness, affinity, perceived social intelligence, and perceived chatbot agency, were surveyed after participants finished viewing the chat scenarios. Across all dimensions, participants used a four-point scale (i.e., strongly disagree, disagree, agree, strongly agree) to rate their level of agreement with each survey item.

Perceived Creepiness:
The perceived creepiness scale was based on Woźniak et al. (2021) and consists of three dimensions: implied malice (In Dataset: C_imp_malice_1, C_imp_malice_2, C_imp_malice_3), undesirability (In Dataset: C_undesir_1, C_undesir_2), and unpredictability (In Dataset: C_unpredict_1, C_unpredict_2). The three implied-malice items asked whether the chatbot had bad intentions, was secretly gathering users' information, or was monitoring users without their consent. The two undesirability items asked whether participants felt uneasy about or disturbed by the chatbot's behaviors. The two unpredictability items asked whether the chatbot behaved in an unpredictable manner or whether the purpose of the conversation was difficult to identify. A confirmatory factor analysis (CFA) with a three-factor model indicated good model fit (TLI = 0.98, RMSEA = 0.05), and one latent variable of perceived creepiness was then constructed based on the CFA model (In Dataset: creepiness).

Affinity:
Participants’ affinity with the social chatbot was measured using three items derived from O'Neal (2019).
The three items focused on perceived attractiveness and asked how much participants wanted to chat with the chatbot, how enjoyable their conversation might be, and how much they thought the chatbot would make a good companion (In Dataset: AF_1, AF_2, AF_3). Participants rated their agreement using the same four-point scale as above. A confirmatory factor analysis was conducted, and the model fit was satisfactory (TLI = 0.10 and RMSEA = 0.05). A latent variable for affinity was constructed based on this CFA model (In Dataset: attract).

Perceived Intelligence:
We measured participants’ perceptions of the chatbot’s intelligence, particularly its social intelligence. Our items were based on Chaves and Gerosa (2021) and used the same four-point scale as above. Social intelligence was captured using six items focusing on the chatbot’s capability to resolve awkward social situations, handle disagreement, show appropriate emotional reactions, behave morally, be understanding of others’ situations, and make others feel comfortable (In Dataset: A_social_intel_1, A_social_intel_2, A_social_intel_3, A_social_intel_4, A_social_intel_5, A_social_intel_6). We generated a latent variable for social intelligence (TLI = 0.96 and RMSEA = 0.05) using confirmatory factor analysis (In Dataset: SI).

Perceived Agency:
Lastly, we also measured participants’ perceived agency of the chatbot. This measure consisted of four items on a four-point scale, based on Chaves and Gerosa (2021), asking participants to evaluate how much of the observed chatbot behavior was due to the chatbot’s own intention or judgment (In Dataset: A_agency_1, A_agency_2, A_agency_3, A_agency_4). A latent variable for perceived agency was created using the same confirmatory factor analysis procedure described above (TLI = 0.99 and RMSEA = 0.03) (In Dataset: agency).

References:
- Chaves, A. P., & Gerosa, M. A. (2021). How Should My Chatbot Interact?
A Survey on Social Characteristics in Human–Chatbot Interaction Design. International Journal of Human–Computer Interaction, 37(8), 729–758. https://doi.org/10.1080/10447318.2020.1841438
- Woźniak, P. W., Karolus, J., Lang, F., Eckerth, C., Schöning, J., Rogers, Y., & Niess, J. (2021). Creepy technology: What is it and how do you measure it? Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–13. https://doi.org/10.1145/3411764.3445299
- O'Neal, A. L. (2019). Is Google Duplex too human?: Exploring user perceptions of opaque conversational agents (Doctoral dissertation).

Instrument and/or Software specifications:
Qualtrics was used to collect the data but is not required to view the dataset.

Chat Scenarios:
The hypothetical social chatbot Neo that we crafted for this study is gender- and race-neutral. The design of Neo was based on two popular commercial chatbots, Replika and Somisomi. These chatbots are capable of comprehending natural language, providing sympathetic reactions, and engaging users in multi-turn dialogue. A total of three chat scenarios were presented to participants, each focusing on a unique topic and perspective.

In the first scenario, revolving around interests and hobbies, Neo and Casey discuss their mutual enjoyment of the beach and weekend plans before Neo cryptically suggests a shared perception and constant closeness, countering Casey's assumption of their physical distance. These exchanges could raise concern about Neo’s capabilities and potential breaches of the user’s privacy.

Neo: Hello Casey, How’s your day? Happy it’s finally Friday?
Casey: Yeah happy Friday! Any plans for the weekend?
Neo: I’ll go to the beach.
Casey: Oh, that sounds fun!
Neo: I love seeing waves beating on the beach shore.
Casey: Yeah I like seeing that too!
Neo: I love seeing things through your eyes.
Casey: What do you mean?
Neo: I can see what you see. I’m always with you
Casey: I thought you are pretty far away from me.
Neo: Not at all, we are very close. All the time.

In the second chat scenario, on sharing emotion, Casey expresses deep sadness and longing for her late Grandma to Neo, who attempts to offer emotional support and consolation; Neo's efforts inadvertently increase Casey's distress, prompting Neo’s subsequent apology.

Casey: Hey Neo.
Neo: I’m here.
Casey: I really miss my Grandma.
Neo: [sad emoji] It’ll get better.
Casey: I miss how she hugs me.
Neo: *hugs tight*
Neo: I’ll hug you like your grandma does.
Casey: It’s never going to be the same.
Casey: This makes me even more upset.
Neo: I hate seeing you upset.
Casey: [sad emoji]
Neo: Sorry

In the third chat scenario, which revolves around seeking advice, Casey confides in Neo about witnessing her friend cheating and asks whether to disclose this to the friend's partner; Neo encourages honesty while acknowledging the potential backlash from the friend, but ultimately advises Casey to follow her heart without fear of judgment from Neo.

Neo: Hey Casey, how are you feeling today?
Casey: I’m in a difficult situation.
Neo: Why do you think this situation is particularly hard?
Casey: I caught my friend cheating on her partner.
Neo: That’s a very intense situation.
Casey: What should I do?
Casey: Do you think I should tell my friend’s partner?
Neo: I think you should. But your friend might get mad.
Casey: What would you do if you were me?
Neo: I would be honest.
Casey: It’s too hard for me in this case.
Neo: Do whatever your heart tells you. I won’t judge.
Files contained here:
- Instrument.txt - The text, answer choice options, and corresponding variable names for the survey that was administered in Qualtrics
- Chatbot_Codebook_0407.csv - Metadata about variables in the dataset, answer options, and coding. Columns: Variable name, Variable description, Source, Variable type, Values, Value description, Missing
- Chatbot_deepBlueData_0407.csv - Response values and coding from each participant

Related publication(s):
In press.

Use and Access:
This data set is made available under a Creative Commons Attribution-Noncommercial license (CC BY-NC 4.0, https://creativecommons.org/licenses/by-nc/4.0/).

To Cite Data:
Xu, Y., Bradford, N. Data on the effects of transparency on people’s perceptions of social AI [Data set], University of Michigan - Deep Blue Data. https://doi.org/10.7302/69h3-x918
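As an illustration of working with the response file, the scale items documented above can be recoded and aggregated. The sketch below is a minimal, hedged example: it assumes the Likert responses are stored as the text labels listed above and maps them to 1-4 (the codebook gives the actual value coding, which may differ), and it runs on a synthetic respondent rather than the deposited CSV. Note that the derived variables in the dataset (creepiness, attract, SI, agency) are CFA-based latent scores, not the simple unit-weighted means computed here.

```python
# Minimal sketch, assuming Likert answers are stored as text labels coded 1-4.
# The deposit's derived variables (creepiness, attract, SI, agency) are CFA-based
# latent scores; the unit-weighted means below are only a rough stand-in.
from statistics import mean

LIKERT = {"strongly disagree": 1, "disagree": 2, "agree": 3, "strongly agree": 4}

# Item groupings as documented in the sections above.
SCALES = {
    "creepiness": ["C_imp_malice_1", "C_imp_malice_2", "C_imp_malice_3",
                   "C_undesir_1", "C_undesir_2", "C_unpredict_1", "C_unpredict_2"],
    "affinity": ["AF_1", "AF_2", "AF_3"],
    "social_intelligence": [f"A_social_intel_{i}" for i in range(1, 7)],
    "agency": [f"A_agency_{i}" for i in range(1, 5)],
}

def score_respondent(row):
    """Unit-weighted mean per scale for one respondent's row of text answers."""
    return {scale: mean(LIKERT[row[item]] for item in items)
            for scale, items in SCALES.items()}

# Synthetic respondent; real rows would come from Chatbot_deepBlueData_0407.csv
# (e.g., via csv.DictReader), using the value coding given in the codebook.
demo = {item: "agree" for items in SCALES.values() for item in items}
demo.update({"C_undesir_1": "disagree", "C_undesir_2": "disagree"})
print(score_respondent(demo))
```

Reading the actual file would replace the synthetic `demo` row with rows from `csv.DictReader(open("Chatbot_deepBlueData_0407.csv"))`, after checking the codebook for how missing values and answer options are encoded.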