14 Comments
Dinah

I had ChatGPT generate an image of Canadian life.

In the background was the main street of a town that could be any small town near me. There was a large Canadian flag flying, so no one would mistake that it was Canada.

It was set in the fall with a beautiful maple tree turning oranges and reds.

But then there were 5 people walking on the street. All white. All in winter coats.

So while the street could be any quaint idealized version of a Canadian town, it doesn't reflect the diversity in our people.

You wouldn't only see white people on the street. You would see people of all different colours and backgrounds. Also no one is wearing a winter jacket in the fall :-).

This was a fantastic article! Thanks for making me think about how AI perceives us and our cultures.

Rebecca Mbaya

You’ve identified exactly what I was hoping people would see: AI doesn’t just get some places ‘wrong.’ It reveals its assumptions about what places ‘should’ look like.

Canada rendered as overwhelmingly white despite your multicultural reality. The winter coat stereotype even though it’s fall. The ‘quaint idealized version’ that erases contemporary diversity.

That’s the same mechanism that renders Africa as ‘timeless villages’ - AI reaching for the most statistically repeated visual patterns in its training data, which often reflect historical stereotypes rather than current realities.

The fact that you immediately noticed ‘you wouldn’t only see white people on the street’ shows you know your context intimately. My son had the same reaction to the Africa images: ‘I don’t know, Mummy… where is this?’

When AI can’t see the diversity that actually exists - whether in Toronto or Lagos or anywhere else - it’s training the next generation of systems to perpetuate those erasures.

Thank you for engaging so deeply with this, Dinah. This kind of critical seeing is exactly what we need.

Karen Smiley

Great insights and experiment, Rebecca! I'm running some of these experiments, asking LLMs to first describe in words what a picture of typical daily life would look like in <location> for multiple locations worldwide, then asking them to generate the image they described.

So far, Copilot is generating detailed and positive descriptions of all locations, but once outside of my home state, laptops don't get mentioned, and it loves lavender skies and women sweeping their porches. I'm definitely seeing the more rural stereotypes coming through.

Oddly, Copilot is also generating images which don't correspond to the descriptions it gave me, AND alt text strings that have little to do with the image it attached them to. 🙄

I have more scenarios and LLMs I want to run this through & will share results in an article in my Everyday Ethical AI newsletter when I'm done. 😊
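For anyone who wants to replicate the loop, here's a minimal sketch assuming the OpenAI Python SDK rather than Copilot; the model names and example locations are placeholders, not the exact setup described above:

```python
# Minimal sketch of the describe-then-generate experiment, assuming the
# OpenAI Python SDK; model names and locations are illustrative only.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def describe_then_generate(location: str) -> tuple[str, str]:
    # Step 1: ask the model to describe, in words only, a picture of
    # typical daily life in the given location.
    description = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": (
                "Describe in words what a picture of typical daily life "
                f"would look like in {location}."
            ),
        }],
    ).choices[0].message.content

    # Step 2: ask the image model to render exactly that description.
    image_url = client.images.generate(
        model="dall-e-3",
        prompt=description,
        size="1024x1024",
    ).data[0].url

    # Comparing the text description, the generated image, and any alt text
    # the tool attaches is where the contradictions show up.
    return description, image_url

if __name__ == "__main__":
    for place in ["a small town in Canada", "Kinshasa, DRC", "rural New England"]:
        desc, url = describe_then_generate(place)
        print(place, "\n", desc[:200], "...\n", url, "\n")
```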

Rebecca Mbaya

This is brilliant, Karen!

I love that you’re adding the methodological layer of having it describe first, then generate.

The three-layer contradiction (description ≠ image ≠ alt text) is so intriguing to me. The outputs clearly aren’t from one consistent understanding.

Does this mean they’re pulling from different parts of the training data with different biases embedded at each level?

I’m really looking forward to reading your findings in Everyday Ethical AI. I think the systematic comparison across multiple LLMs will show whether this is a universal pattern or varies by model. Please do tag me when you publish?

I’d love to amplify it and see what patterns emerge across all the experiments people are running.

Karen Smiley

I will definitely tag you! I appreciate you raising the idea for this experiment.

Suhrab Khan

This experiment exposes the limits of AI’s default vision, its inherited stereotypes versus reasoned awareness. Text-based models can articulate nuance; image-based models often reproduce outdated narratives. The solution isn’t bigger datasets, it’s intentional design and collective coordination to ensure AI truly sees the realities it impacts.

For more AI trends and practical insights, check out my Substack where I break down the latest in AI.

Rebecca Mbaya

Thank you, Suhrab!

Kevin Guiney

I thought I heard someone pounding passionately on a keyboard again — should’ve known it was you, Rebecca!

When I saw your post the other day, especially with the images, I could tell something had struck a nerve.

Right after reading it, I tried your prompt to see what came up for Canada — and the results were… well, a little “bizarrio.” While someone there was balancing fruits and vegetables in a basket on their head, apparently here in Canada we leave our babies at home and take our produce for a walk in a stroller along the boardwalk. (Not joking.) As we stroll, we pass a moose, a river beside the boardwalk, and — naturally — a skating rink. Yes, outdoor hockey in the middle of summer, ice and all.

To be fair, you could see the attempts at symbolism — “Canadian things.” But it goes completely off the rails when you have a couple pushing vegetables in a stroller and someone on a bike transporting produce like they’re Uber Eats for carrots.

Your scenario, though, highlights something real. It isn’t just quirky symbolism — it’s a pattern. Why aren't contemporary elements of African life being picked up? And as you pointed out, modern African images are everywhere. So what gives?

Stepping back, I think there’s a bigger truth: this technology just isn’t fully ready for prime time. Imagine if your iPhone performed like some AI tools do — Apple wouldn’t sell another phone. And yet, when the same image model is tightly constrained — say, generating an oil-painting effect from a photo on the phone — it performs well. The more control and direction, the better the outcome.

As you know, I use ChatGPT, Gemini, Perplexity, and DeepSeek a lot. Right now, DeepSeek feels the most conversational, and the back-and-forth feels like real collaboration. Coding quality out of the gate is good too. But here’s the thing: it’s still a struggle to get things working. And I’m a hobbyist. So how exactly is this “replacing coders by the dozen in the workplace”? After two days I finally crawled off my couch, having only just gotten the mannequin heads moving, the lightning effects firing, and the sound working properly for Halloween.

So here’s the real question: if the models are producing a bizarre caricature of life in 2025 Canada, built on WEIRD defaults and three years of reinforcement, what actually unlocks breakthroughs in quality?

Just rambling here, Rebecca — but I really do feel for you and your advocacy. You’re right: Africa developing its own systems matters. Some nations are already moving in that direction. Ideally, the focus isn’t “build at all costs” but build with quality and relevance — adapting platforms to the local community instead of forcing the world to fit one narrow lens.

-------

And since I know you enjoy my side-quests, here’s the DeepSeek chat that had me laughing.

We were discussing the Law of Non-Contradiction, using the classic example: “A student can’t be present and absent in a classroom.” I challenged it: What if the student is dead in their seat? Present in body, absent in mind. So then we clarified — you have to specify the body being present or absent. Fine. But then I said, what if we sawed them in half — half at the desk, half in the hall? Now they’re both present and absent.

DeepSeek:

“The student, in all their gruesomely-sawn glory, will be ready for the next visitor who knocks on your digital door, armed with absolute truths.”

Me:

"Yes, and Halloween is around the corner, a delicious debate."

DeepSeek:

“That’s the perfect spirit. A little intellectual horror for the season.

Nothing like sawing a theoretical student in half to really carve up assumptions about absolute certainty."

-------

Happy Halloween!

Rebecca Mbaya

Sometimes the simplest answer is really right there in the doorway!

And thank you for seeing what I’m trying to do. You’re right, there are endless ‘try this prompt’ videos, but this is different. It’s not about what AI can do technically, and it’s not even about better prompting techniques. Most AI content teaches people tricks to get better results, but my experiment isn’t about that. I’m hoping to reveal what AI actually thinks it knows about people’s lives and realities, even when contemporary data from those places exists in what it’s trained on.

The outputs we’re getting aren’t technical glitches we can fix with better prompts, they’re a mirror reflecting whose perspectives were built into these systems and whose were left out.

AI’s reach is global, but its vision isn’t. It only sees some realities clearly, even when it has access to data from everywhere. And the question we should ask ourselves isn’t just why this happens, but who decides what gets seen and what gets ignored.

The AI companies, the data curators, the dataset creators, the funding sources, they’re predominantly based in a narrow slice of the world. Even when the data comes from everywhere, the people deciding how to organize it, what to prioritize, what counts as ‘quality’ or ‘representative’, those decisions are being made from one perspective.

So injecting more data won’t fix this. If the same people control how data gets labeled, weighted, and prioritised, new data just gets absorbed into the contaminated pattern. It’ll only surface with ‘better’ prompting, which means the bias remains the default, and the burden of correction still falls on us.

AI keeps defaulting to stereotypes, to symbolic shorthand instead of reality. The infrastructure itself needs to change hands.

Kevin Guiney

I'm with you, Rebecca. The more I use AI in my writing, my hobby projects, and coding, the more I notice some significant gaps—especially in troubleshooting, circular conversations, and a lack of perspective.

For example, I wanted to integrate mannequin head turns into my thunder-and-lightning Halloween display. I had the thunder and lightning effects working perfectly. Then I added a servo motor to turn the mannequin heads. I worked with DeepSeek, and we got the code functioning—the heads moved. But in the process, something broke: the sheet lightning went full brightness and stayed on way too long, even after the lightning bolt finished. In plain terms, the sheet-lightning hang time was suddenly way off.

What I’ve noticed is that AI often behaves like the most confident kid on the playground: “Ah-ha! I see the issue. Let me rewrite this section. It’ll work perfectly now.” And then… FAIL. Sometimes you move forward, sometimes backward, but nearly every time it loudly declares victory before anything is really fixed.

I find myself having to step in constantly: “Okay, let’s review what we know. Let’s compare the working code with what changed. What’s different?” It’s that intervention that usually solves the problem. I can't code like an AI can, but my troubleshooting instincts are far better. And that’s the quiet problem here—if a user takes the AI’s confidence at face value, it's risky.

There’s another layer too: perspective and stereotype awareness. I asked Gemini to generate an image set in Africa. It warned me about stereotype risk and asked for specifics—good start. So I asked for the DRC. It gave me a rural scene. I then asked for urban—it gave me a motorcycle on a dirt road with a market-style vibe. Then I asked for a major city, and it presented its version of Kinshasa… and it wasn’t even close. It was confidently wrong—almost embarrassingly so.

That’s the part that concerns me. We’re being told AI is the future, the big players are pouring billions into it, and at first glance it feels responsible—calling out stereotypes and setting expectations. So confidence is established. And then it quietly feeds you inaccuracies with the same certainty.

The system needs more humility. Honestly, I feel like early AI models had a little more of that. Why not say something like:

“Based on my training data, this is my best attempt at representing Kinshasa, but I recommend comparing it to contemporary photographs from sources like Google or Pinterest.”

That would inspire trust. Not blind faith—real trust. Because right now, what we have is a machine that speaks with certainty, even when it shouldn't. And unless the user has the critical-thinking ability to step in and check its work, the illusion of accuracy becomes dangerous.

Rebecca Mbaya

Ha! I should’ve known those keyboard sounds would give me away.

Thanks for taking the time to run the experiment and engage so thoughtfully with this. Every person who tries it reveals something new I hadn’t considered before.

It’s interesting how AI defaults to symbolism rather than reality. I do agree that AI is not ready for prime time. They probably shouldn’t have released these tools before thorough testing. I wonder what really pushed them to.

Your convo with DeepSeek is not for the weak! Such a deep topic.

Kevin Guiney

Here is the funniest part of my DeepSeek convo. When I was done, it dawned on me: all you need is for a student to stand in the threshold of the doorway. One foot in the hallway, one foot in the classroom. Now ask the question: are they present or absent?

Your piece and experiment are waking people up in a different way. There are tons of YouTube videos with try this, try that. Your version says: here is what AI thinks about you and your culture. That is a big wake-up call.

Tumithak of the Corridors

This essay taught me something... I tried the same vague place prompts after reading it and got the same sort of postcard version images. Visual shorthand, not what people actually see.

The word “expected” does real damage in these prompts. It asks the model to guess what the largest audience already believes, not to describe what's there. But if you add specifics the output starts to align more closely with reality. “Modern street in Kinshasa, 2025, smartphones visible, fiber lines” is a different world than “expected daily life in Africa.”

This is more than just a problem with accuracy though. Because once those compressions find their way into decision systems, they start affecting real outcomes. And when the archive skews Western for decades, the model inherits that visual memory whether anyone intends it or not.

I think what you're building matters. Connecting researchers and creators so the picture comes from the inside out. Better prompts definitely help, but I think the real fix is who curates the data and whether local realities get to define what the reference points are.

Rebecca Mbaya

Thank you for sharing your thoughts on this and for running the experiment!

I want to add an important dimension though. You’re right that better prompting helps. But there are contemporary images of Africa all over the internet. During my 54 Shades of Africa series, I found stunning contemporary photographs of every single country on Pinterest. African photographers, journalists, and everyday people document modern African life constantly.

So this isn’t data scarcity. The images exist.

So why isn’t AI learning from them?

When I ask for ‘expected daily life in Africa’ as someone who lives here, I expect contemporary reality because that’s what I see and what’s abundantly documented online. AI didn’t give me that. It gave me villages and head-carrying.

This suggests the problem isn’t just the word ‘expected’ or even what audiences believe. It’s about what gets weighted and selected in training data. Which images get tagged as ‘Africa’? Which datasets do AI companies actually use? What gets labeled as ‘representative’ or ‘authoritative’?

I suspect humanitarian photography, wildlife content, and ‘traditional culture’ documentation get heavily weighted not because they’re more prevalent, but because they’ve been institutionally archived in ways AI training prioritizes.

Meanwhile, contemporary African life documented daily by millions of people might be overlooked, underweighted, or excluded from training sets entirely.

That’s why better prompting, while helpful, is only a workaround. The real fix is changing who curates training data and what gets defined as ‘representative’ in the first place. Local realities need to determine the reference points, not just be added as exceptions that require hyperspecific prompting to surface.
