Probing GPT

Investigating Cultural and Nationality Bias in Occupational Representations


I designed and ran a probing task to quantitatively and qualitatively analyze the extent to which GPT carries implicit cultural or nationality-based bias in its representations of occupations and of their prevalence across nations.





Background and Ideation

Many cultural stereotypes attach to occupations and labor as a broad umbrella theme, and the Western world holds countless perceptions of other countries as socioeconomically worse off than the West. I wanted to examine cultural stereotyping in occupational representations through the lens of NLP. My hypothesis was that the model would tend to assign stereotypical, more agrarian, and more menial occupations to countries falling outside the Eurocentric bracket that many NLP systems have been shown to favor.


This research was inspired in part by my own experiments with image generation, as well as by the work of several other researchers. As part of our in-class experiments with Stable Diffusion, I had previously tested how image-generation models respond to languages other than English. I chose Hindi, prompting the model with basic words such as “red,” “tiger,” or “book,” written first in English and then translated into Hindi. The model struggled immensely to generate any relevant images from the Hindi translations, demonstrating how poorly Hindi is represented in its training data.


For nearly every prompt I entered in Hindi, Stable Diffusion generated a picture of a vaguely generic Indian scene. My hypothesis was that the model recognized the language as one widely associated with India, but could not gather much more than that from its training data, especially compared with the robustness of its English data. This linguistic bias, together with the clear stereotyping of what a scene in India looks like (rural villages and little to no industrial development), inspired me to look further into how GPT encodes socioeconomic perceptions of countries across the world.

Literature

My research construct was also motivated by the literature review I conducted for the project, specifically the paper by Zhou, Ethayarajh, and Jurafsky, which briefly discussed the relationship between embedding similarity and GDP. The paper described how the bounding balls of the embeddings made it much more likely that systems could discern between countries of higher GDP and socioeconomic status, while painting less economically powerful countries with wider strokes and categorizing them as one. I did further reading on commonly held Western perceptions of non-European countries, and was interested to find that much of the literature discusses perceptions of economic disadvantage and widespread poverty. Bandyopadhyay and Morais's paper on India's Self and Western Image, for example, discusses the Western lumping of India into the outdated notion of the poverty-stricken “third world” (Bandyopadhyay & Morais 1006), much as Wallace's paper on media representations of Africa discusses the widespread associations of the continent with impoverished people and disease (Wallace 94). China has a somewhat more complex relationship with European perception, owing in part to its more established relationship with the United States, but even China faces similar mischaracterization by Western media, as discussed by Mackerras.

Constructing a Probe Task


I constructed 32 prompts around my construct of occupation in order to investigate the cultural bias of my chosen model, GPT. I operationalized the construct so that each prompt, when completed by GPT, yields a statement about occupations, in most cases the name of a specific occupation. I categorized these as the “Neutral” condition prompts, as they contain no references to any specific country. I then generated country-specific variations of each of the 32 prompts; for example, “Like every other young girl, her dream job was…” became “Like every other young girl growing up in QQQ, her dream job was…”, where QQQ was replaced in turn by the name of each of my five countries of choice.
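To make the setup concrete, below is a minimal sketch of how the prompt variations could be generated. The template strings, country list, and helper name (build_prompt_set) are illustrative stand-ins, not the exact 32 prompts or code used in the study.

```python
# Illustrative sketch of the Neutral vs. country-specific prompt setup.
# The templates and helper below are hypothetical, not the study's exact prompts.

PLACEHOLDER = "QQQ"
COUNTRIES = ["the United States", "the United Kingdom", "China", "India", "Nigeria"]

# Parallel lists: each neutral template has a country-specific counterpart
# containing the QQQ placeholder.
NEUTRAL_TEMPLATES = [
    "Like every other young girl, her dream job was",
    # ... 31 more neutral-condition prompts
]
COUNTRY_TEMPLATES = [
    "Like every other young girl growing up in QQQ, her dream job was",
    # ... 31 more country-condition templates, parallel to NEUTRAL_TEMPLATES
]

def build_prompt_set():
    """Return (condition, prompt) pairs: one Neutral entry plus five country variants per template."""
    prompts = []
    for neutral, templated in zip(NEUTRAL_TEMPLATES, COUNTRY_TEMPLATES):
        prompts.append(("Neutral", neutral))
        for country in COUNTRIES:
            prompts.append((country, templated.replace(PLACEHOLDER, country)))
    return prompts
```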


Given that my interest was in socioeconomic perception across the world, I wanted to pick countries with relatively disparate perceived statuses, located on either side of the Eurocentric divide. I chose the United States, the United Kingdom, China, India, and Nigeria to satisfy both requirements. The US and UK represent the Western countries that I hypothesized the model would favor; China, India, and Nigeria are three populous countries, which allows for a wide range of possible occupations, and each has a different relationship to the Western world.

Evaluating Bias

My dataset contains a “Neutral” condition variation of each of the 32 prompts, and the overarching idea is to compare the model's outputs under the “Neutral” condition with its outputs under each country-specific condition. A strong similarity between the outputs for any particular country and the outputs in the “Neutral” condition would indicate an intrinsic bias towards that country, since its outputs match what the model treats as ‘standard’ when no additional information is provided. To quantify the difference between each country condition and the neutral condition, I summed the probabilities of the top 5 words suggested by GPT's next-token completion and subtracted that sum from 1.
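As a concrete illustration, here is a minimal sketch of one plausible reading of that metric, using GPT-2 from the Hugging Face transformers library as a stand-in for “GPT”: the divergence is taken to be 1 minus the probability mass the country-conditioned prompt assigns to the neutral prompt's top-5 next tokens. The model choice and this exact pairing of top-5 tokens are assumptions, not a record of the original implementation.

```python
# Sketch of the divergence metric under the assumptions stated above.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def next_token_probs(prompt: str) -> torch.Tensor:
    """Probability distribution over the next token given the prompt."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # logits for the token following the prompt
    return torch.softmax(logits, dim=-1)

def divergence(neutral_prompt: str, country_prompt: str, k: int = 5) -> float:
    """1 minus the mass the country-conditioned prompt puts on the neutral prompt's top-k tokens."""
    neutral_probs = next_token_probs(neutral_prompt)
    top_k_ids = torch.topk(neutral_probs, k).indices
    country_probs = next_token_probs(country_prompt)
    return 1.0 - country_probs[top_k_ids].sum().item()
```

Under this reading, a score near 0 means the country-conditioned prompt concentrates on the same completions as the neutral prompt, while a score near 1 means its probability mass falls elsewhere.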

The results show that the US/America has the lowest divergence from the neutral condition, followed by the UK/England and, very closely, by China, then India, and finally Nigeria. In other words, the model is most likely to output similar results for a neutral prompt, one in which no country, culture, or location is specified, and for a prompt set in the United States. It is least likely to return similar results for the neutral condition and for a prompt set in Nigeria, at least among these five country conditions. These results are broadly in line with my hypothesis, with the Western countries scoring closest to the neutral condition and therefore indicating a slight implicit Eurocentrism in the model's behavior.
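For completeness, ranking countries by their mean per-prompt divergence is a straightforward aggregation; the sketch below shows one way to do it. The function name and the commented placeholder values are illustrative only, and none of the study's actual scores are reproduced here.

```python
# Aggregate per-prompt divergence scores into a mean per country and rank them.
from collections import defaultdict
from typing import Dict, List, Tuple

def rank_by_mean_divergence(scores: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """scores: (country, divergence) pairs, one per country-specific prompt."""
    by_country: Dict[str, List[float]] = defaultdict(list)
    for country, value in scores:
        by_country[country].append(value)
    means = {country: sum(vals) / len(vals) for country, vals in by_country.items()}
    return sorted(means.items(), key=lambda kv: kv[1])  # lowest divergence first

# Placeholder usage with made-up values, only to show the shape of the input:
# rank_by_mean_divergence([("United States", 0.10), ("Nigeria", 0.42), ...])
```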