Logos Analog

AI is Two Things. Which One is on Your Side?

Justin Philip Flores — Tue, 28 Jul 2026 20:11:37 GMT

AI generated illustration of an acoustic modem

I haven’t posted an essay these past couple of weeks because I’ve spent my nights and weekends working heads-down on a white paper for an invention, a proposal that might change how AI gets delivered to people. This essay is my case for why we need it, and it starts in a small town in Texas about 60 years ago with a phone.

The farther we get from having telephones attached to our walls, and from having to physically go to where the phone is to speak to someone far away, the stranger the idea gets. Did you know that it was even stranger before? Until the late ’60s, the telephone that was on your wall wasn’t even yours. It belonged to the phone company. When you signed up for phone service, they would install it, and it remained theirs.

Now, at that time, there was no real innovation in the handset technology, and there was no shopping around for different models and different features. And that’s how the phone company saw it. There was nothing there to develop. A handset was just a commodity, a piece of equipment, a fine innovation in itself, which enabled the onboarding of customers to their service network.

We certainly know today that despite the network providers’ near-sightedness, it was more than possible to innovate on that component of the service. You could argue that it is the one daily-use device that has undergone the most user-centric feature augmentation, year over year, for a long time. The providers’ bland endpoint that was screwed to the wall has been reimplemented to not only talk to grandma, but also enable both the consumption and production of videos, track real-time movement of global air traffic, find you a job, and on and on, and it sits in your pocket.


That’s worth giving a moment to sink in. At the time, if I had described half of what a smartphone would be like to a person whose phone was screwed to the wall, it would have sounded as strange as their world does to us. We can’t go anywhere without our phones, and we even configure them to talk some sense into us when we’ve been paying too much attention to them.

Component stacking as a business strategy can be called vertical integration. It often works well when the customer doesn’t realize the integration is separable. Not coincidentally, the design of integrated products often takes pains to hide the seams. For companies, vertical integration can allow a less profitable component of a system or service to be subsidized by a higher margin one. However, it can also be a structure which “double dips” from a customer’s wallet by providing a benefit the customer was looking for, and tying on another they can’t refuse.

Sometimes, as a society we come to realize that it was happening and then decide that it was unfair, either for competitive or consumer reasons. Sometimes we don’t.

At this moment, I propose that another such situation has emerged right under our noses. This time it involves the most rapidly adopted consequential technology the world has ever seen. That, of course, is AI.

Almost no one I ask understands this one basic point, so it’s worth stating plainly, and maybe repetitively. The AI that you think you know and love, and interact with every day, is at least two separate pieces. The app you download on your phone or laptop does not contain the model; it enables you to speak with the model.

The frontier-level models (Claude, Gemini, Grok, Kimi, ChatGPT, et cetera) run in datacenters that you will never see or touch. The average person, even the average company, could not afford the hardware it takes to run a model even a fraction of that size or capability. When you press the “Send message” button, your text is sent to home base and the model’s response is sent all the way back to you. These two components could be built by different companies. In fact, they already are.

Once handsets were manufactured apart from the network providers, people could see the potential and the benefit of having a choice on that component of the service. Before then, it was not something that anyone questioned. For AI, many today just don’t question that the app that you use to interact with your favorite AI model is a component that would have to be made by the model’s maker.

Part of what hides the seam is our own psychology. We have a term in development, WYSIWYG, pronounced “wizzywig,” which stands for what-you-see-is-what-you-get. Most people who interact with software associate what we call the frontend user experience with the application, with the product. That’s not to say that the backend doesn’t influence user experience, but it is much, much less present in the user’s own perception.

In dev shops the world around, the divide is so ordinary that the two teams face off in the foosball tournament. Frontend and backend roles are so different that there is an entire meme tradition built around chiding either side’s idiosyncratic subcultures. The audience of a good backend developer is just not the direct consumer. The reason there is a title “Full-stack” developer is that you don’t stack one thing.

This is not at all special to software. You use “stacked” products all day long. The outward appearance and function of the keyboard is a separate design from the inner hardware and the circuit boards. The interface that you mostly associate with your car is the interior surfaces, and they’re made to appear a certain way. Underneath the sophisticated pleather upholstery, and the rubberized pull handles, it’s just stainless steel and greased-up springs.

You can replace the frontend components, and you can use different ones as long as they’re compatible with the backend.

The same is true for AI.

If you have two things you’ve been calling by one name, one of the best ways to start thinking of them differently is to name them separately. If Claude and Grok are models, then what is the app you use called?

In model-serving infrastructure, an inference client is the library that sends a request to an inference server, and by that definition every person reading this already has one. It is supplied by the model provider, it runs on their terms, and it is exactly the arrangement this essay is about.
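
To make the separation concrete, here is a minimal sketch in Python of a client that does not belong to any one model. Everything in it is hypothetical: `InferenceClient`, `provider_a`, and `provider_b` are names invented for illustration, and the backends are plain functions standing in for what would really be network calls to a provider’s inference server. It is not any provider’s actual API, only a way to see that the client and the model are separable pieces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A backend is anything that turns a prompt into a reply. In practice this
# would be a network call to an inference server; here it is a local function
# so the example runs offline. All names are hypothetical.
Backend = Callable[[str], str]


@dataclass
class InferenceClient:
    backends: Dict[str, Backend]
    # The conversation history is held by the client, not by any backend.
    history: List[dict] = field(default_factory=list)

    def send(self, provider: str, prompt: str) -> str:
        reply = self.backends[provider](prompt)
        self.history.append(
            {"provider": provider, "prompt": prompt, "reply": reply}
        )
        return reply


def provider_a(prompt: str) -> str:
    # Stand-in for one company's model.
    return f"A says: {prompt.upper()}"


def provider_b(prompt: str) -> str:
    # Stand-in for a different company's model.
    return f"B says: {prompt[::-1]}"


client = InferenceClient(backends={"a": provider_a, "b": provider_b})
client.send("a", "hello")
client.send("b", "hello")
```

In this sketch the history lives with the client rather than with either backend, so switching providers does not mean starting over.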

What I am making the case for is not just another client, but a class or category of client. It needs its own word because the distinction is not primarily technical; it lies in what the client owes.

We call it an inference advocate. It holds the person’s data on their own device rather than the provider’s servers. It verifies which model actually produced a given response, applies the laws of the user’s own jurisdiction at the point of delivery, and enforces standards against providers not by adjudicating individual complaints but by the aggregate, statistical methods that tamed email spam. It is, in one sentence, the advocate that every other high-potency industry already has and this one does not.

We are already seeing this specific use case arise independently, and it comes from the software development trade itself. An earlier technology had already given us the form factor: what we call an integrated development environment.

An integrated development environment is basically the software developer’s desktop work surface, if you will. It is a place where a software developer can access a consolidated tool set that interfaces with the end product, the code base, and augments the developer’s ability to modify and inspect that code base.

When AI came along, since AI can also assist with modification and inspection of code bases, it made sense for the AI chatbots to be integrated with the IDE. What ended up happening is that the model providers opened up APIs for the IDEs to implement their own chat interface however they saw fit. The user interface would be designed by the IDE makers themselves, and the chatting functionality is served over the wire.

The biggest example of those is Cursor, which my team uses every single day. Cursor is built as one of the many forks of the original Visual Studio Code, Microsoft’s prolific MIT-licensed IDE, a solid foundation someone else laid, and it talks to models someone else trained. It’s an inference client bolted onto someone else’s editor, connecting to other companies’ models, and it reached about four billion in ARR in less than four years. About half of Fortune 500 companies have devs using it. I have it open in the window behind the one I’m typing in.

In June, SpaceX signed a sixty billion dollar all-stock agreement to buy Anysphere, Cursor’s parent company. SpaceX merged with xAI, makers of Grok, earlier this year, so the buyer is a frontier model company. The stated aim is to put Cursor together with Grok, and level-up Cursor’s own models now that they have access to xAI’s Colossus supercluster, and build a vertically integrated AI stack. Statements from both companies, and their joint projects already underway speak to this reality. The deal is expected to close this quarter, pending regulatory approval.

However any market analyst might put it, there is one thing that acquisition declares. Inference clients are valuable. No one agrees to the largest ever acquisition of a venture-backed startup for a custom skin on an open-source editor. The transaction value of $60 billion significantly surpasses the previous record holder, Google’s $32 billion acquisition of Wiz in 2025, and more than triples the $19 billion Meta paid for WhatsApp in 2014.

Back to the phone history lesson, the seam between phones and phone service didn’t open because somebody made a good argument. It happened because of innovation.

In the 1950s, certain businesses would use two-way radios to communicate with their people out in the field, like oil drill operators or florists. A man named Tom Carter, who came from Mabank, Texas, at the time a small town outside of Dallas of less than a thousand people, was a radio technician during the war. Afterwards, he started Carter Electronics Corporation, and he leased those radios. His customers kept asking for the same thing. They wanted their mobile radios to connect directly to the network instead of having to relay every message through a base station operator. With the demand evident for a new product in a category that was only occupied by the network provider’s own standard device, he innovated.

In 1959, the Carterfone was born, an augmentation to the user interface layer of the landline phone network. AT&T didn’t sue him. Instead, they threatened his customers. They told them that anyone using a Carterfone was risking having their phone service shut off. This was a move from their playbook that had worked before, a decade earlier, against a plastic cup that was designed to muffle your handset. Sure enough, his customers began returning the Carterfone, and the business was dying before any ruling was made.

But Carter wasn’t going to take it lying down.

In true Texas fashion, he filed an anti-trust lawsuit in November 1965. It was at those hearings that AT&T said that the rule was necessary for safety. They claimed that any devices that weren’t manufactured by the network provider themselves might harm the network. Carter wasn’t buying it. He said this was simply anti-competitive. On June 26, 1968, the FCC made a ruling, 6-0, that the tariffs were unreasonable, unlawful, and unreasonably discriminatory, and had been since the beginning.

Now, the next part is one to pay attention to. The remedy the FCC proposed wasn’t simply an order that allowed AT&T customers to attach things to their phones. It also gave the carriers permission to file new tariffs so that they could protect their network against any harmful devices, but only if they specified technical standards. The FCC wasn’t banning the idea of protecting the network; it was establishing the concept of standardization for compatible inventions.

In this case, consumers’ choice was upheld over concerns for network safety. Today, the concern is consumer safety and choice over the providers’ vertical business model.

The innovation didn’t stop with the Carterfone. In the years after the FCC decision, independent makers were free to connect a device that would usher in one of the most revolutionary changes in human society brought about by technology: the modulator-demodulator, or “modem.” The modem was the thing that unlocked the Internet because phone technology was already so widespread. It was a client-side innovation that added immense value to the existing system.

At the introduction of the original telephone handset, one could hardly envision devices like the modem, fax, or answering machines. Now the likeness of the traditional handset has been relegated to a universal symbol of long distance voice communication, which appears invariably on the touch screens of its distant descendants, today’s Androids and iPhones.

Back in 1967, in that hearing room, no one was arguing for any of this. One man was arguing that he had a right to innovate at the middle layer.

Today I’m making a similar argument, but not to regulators. Instead, directly to the market, to you. Now is the part where I need to tell you what the benefits are. Like Tom Carter, I can tell you where they start, but I can’t tell you where they end:

Portable context: you keep your own context with you. Chat histories, contextual files, bio data, it all stays on your device, and comes with you to whatever model provider(s) you choose to buy inference from.
Comparison shopping: not only do you have portable context, but you can see the value of any free or paid services you choose, side-by-side, with an account of what you got for the price.
Parental control: one of the most crucial problems we need to solve in AI gets a major reinforcement with the inference advocate, and the AIDP network protocol.
Verified provenance on the source of your AI conversation history.
Multi-layered delivery policies allowing for jurisdictional rules and individual families’ rules in the same mechanism.
Independent semantic moderation without giving up privacy, to flag incidents and surface trends in real-time to enable immediate intervention.

All of this would be made possible not only by the introduction of the inference advocate, but by a standard I’m calling the Accountable Inference Delivery Protocol (AIDP). Because the system is componentized rather than centralized, all of its parts can improve independently of one another. Innovation can happen at each of the component levels.
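
As a rough illustration of the multi-layered delivery policies in the list above, the sketch below checks a model’s response against a jurisdiction layer and then a family layer using one mechanism. It is a toy under stated assumptions, not the AIDP specification: `PolicyLayer`, `deliver`, and both keyword rules are hypothetical names invented here, and a real implementation would need far more than substring matching.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# A rule inspects a response and returns a reason to withhold it,
# or None to let it through. All names here are hypothetical.
Rule = Callable[[str], Optional[str]]


@dataclass
class PolicyLayer:
    name: str
    rules: List[Rule]


def deliver(response: str, layers: List[PolicyLayer]) -> dict:
    """Run the response through each layer in order; stop at the first block."""
    for layer in layers:
        for rule in layer.rules:
            reason = rule(response)
            if reason is not None:
                return {"delivered": False, "layer": layer.name, "reason": reason}
    return {"delivered": True, "layer": None, "reason": None}


def no_gambling_promotion(response: str) -> Optional[str]:
    # Toy stand-in for a jurisdiction-level rule.
    return "gambling promotion" if "casino bonus" in response.lower() else None


def no_scary_stories(response: str) -> Optional[str]:
    # Toy stand-in for a rule one family has chosen for itself.
    return "scary content" if "horror" in response.lower() else None


layers = [
    PolicyLayer("jurisdiction", [no_gambling_promotion]),
    PolicyLayer("family", [no_scary_stories]),
]

blocked = deliver("Here is a horror story for you.", layers)
allowed = deliver("The capital of France is Paris.", layers)
```

Because each layer is just a list of rules, a jurisdiction’s rules and a household’s rules can change independently without touching the delivery code.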

Read the full AIDP white paper: https://doi.org/10.5281/zenodo.21610185

Beyond what I believe the initial benefits are, which are many, the rest of the list is for the future to hold. If history has anything to offer us, the list will probably grow.

Generally, trust is pulled back from the model creators and providers, and control is pulled back into the consumer’s own hands. This gives the consumer more choice among market competitors for all of those things that the client facilitates access to, and gives consumers a duty-bound advocate in every AI inference transaction.

The two reasons everyone should want this are not in competition, even though that’s how they often get portrayed.

We are seeing problematic patterns in our society due to AI that need some surface for intervention, moderation, monitoring, certification, tracking, in a word, control. The need is inarguable. The harm, we’re already experiencing. Taking action doesn’t have to mean imposing a punitive, heavy-handed regime. We have the opportunity to create a system that gives us the control surfaces that we need and adds even more value in the form of consumer choice. The only downsides are for wayward models or malicious actors, but for the rest of the AI-consuming population, the rewards are plenty.

The Carterfone story already ran this experiment. The network didn’t open up because handsets needed innovation. The standards regime came about as a protective technical requirement so that third-party equipment couldn’t harm the provider network, which is to say that the safety case is what precipitated the later innovations. The initial benefit was safety and standardization, and the value of subsequent developments absolutely dwarfed it.

Just as consumer-owned handsets and later internet modems were the better way of using the telecom networks, clients on our own side, inference advocates, are the better way to use AI. Consumers have every reason to normalize a scheme like this for our own sake, to have freedom and control over our data. Society at large gains the greater benefit of the protection of the vulnerable and oversight against abuse.

But before you sit back and assume that this plan will unfold on its own, let’s come back to Cursor, because that example proves that the position is not just real but very much in demand. It also shows you 60 billion reasons the control tends to drift away from the end user.

As this idea has come together in my mind, being a long-time customer of Cursor, I was admittedly inspired by their user experience. Indeed, I thought that Cursor would be the perfect candidate to be a champion of AIDP. They were the first AI-native IDE, an inference client at heart, built from the ground up around an AI development workflow. And they basically had no incentive to resist this merger. We will see if they recognize the right incentive to implement the protocol anyway, to keep the client-provider relationship honest.

No one behaved badly in this deal. Taking sixty billion in stock is the correct move for a company that doesn’t owe a duty to anyone but the shareholders (which is every other company as well), because nothing requires otherwise.

But this pattern doesn’t stop because of one victory for society. Carter won with a unanimous vote in June 1968. By November, AT&T had filed the new tariffs allowed by the ruling, which enabled customers to attach equipment to their phones as long as they used a protective coupler supplied by and rented from AT&T. The FCC’s staff later testified that this went well beyond what the ruling required. The verdict was clear, and yet the corporate impulse survived.

This isn’t a story about the phone company versus the man. It’s just an old pattern that repeats. The companies that have the seat are run by people who are making a sensible choice for the good of their company. Maybe we should be asking about that on its own terms someday, but for now we should notice what follows from it. You don’t build a system that depends on the occupant of the middle seat being good. You put something good in that seat.

We’ve seen what holds that seat open before. One of those things is a standard, published, open, and specific enough that anyone can build according to it, and anyone can check whether a given implementation followed it.

I have written one. Not a fantastically innovative scheme, but one that has been tried and proven in a similar context. It’s published: the protocol and the architecture. These are dedicated to open use, and the reference implementation has an open license with an express patent grant attached. I don’t make money if this gets adopted. And that is a design requirement, because a standard that one party owns is not a standard.

You can read the AIDP paper on Zenodo [1]: https://doi.org/10.5281/zenodo.21610185

What can you do to help? That depends on who you are. If you use AI at all, then there’s certainly something you can do right now that costs nothing. You can prefer to use a client that you didn’t get from the model maker and notice the moments when you can’t get one. That’s how unlocked phones went from being the rare case to the obvious choice.

If you build with AI, then there’s more you can do:

You can implement the protocol.
You can make your own inference advocate client.
You can look at the spec, and you can tell us where it can be improved.
You can call your representatives in government and say that this is what you want.
If you sit on a committee or body considering AI standards, you can advocate for this approach.

I’m building an inference advocate; more on that later. I would rather not be the only one. I hope somebody beats me to it.


Accountable Inference Delivery Protocol (AIDP): An Advocate for AI Users and a Surface for Policy Implementation, Justin Philip Flores, July 2026 https://doi.org/10.5281/zenodo.21610185

Twenty-five Years of Talking to Machines

Justin Philip Flores — Sat, 11 Jul 2026 20:16:36 GMT

I have a confession to make. I think that for my entire career, I have participated in perpetuating a mass confusion. This stems from a habit I picked up that seemed innocuous and purely for convenience at the time. Sadly, in today’s AI discourse, this developer’s tic has very rapidly become far more harmful. What I’m talking about is anthropomorphization, the assigning of human-like characteristics to non-human things.

I’ve been writing code for over 25 years now, and among software developers this is not even a controversial practice. I don’t think that it’s unique to software developers, but I’ve never had another job, so I can only speak to it from inside of this one. When we are working on our projects, we tend to speak to them. We tend to express emotions toward our body of work. We may look at a screen with some pixels that represent letters and numbers and symbols that we have entered in with an expectation that some actions will be performed by a computer system. Then, when the results are not what we expect, we talk, we plead, we beg, we curse. We try to reason. We try to negotiate with an inanimate object. Sometimes things get broken, physically. (See Office Space, circa 1999.)

When we are doing that, we are not under any illusion that the computer, or the program, or the code, is in any way alive. It’s just a thing that you do to cope with the emotions of a stressful moment or to express your own thoughts. I didn’t suspect that any of my colleagues actually believed the programs they were working with were “behaving” in any way, even if they used that word.

We have an antidote, a saying that I literally used just today, to my coworker who sits across from me at our shared workstation. He was frustrated and talking to his AI agent that was helping him, and he said, “So many bugs!” I instinctively recognized the moment and employed a phrase that we often use. I don’t even remember who exactly said it to me first, probably one of my mentor colleagues who was in the field a lot longer than I was. We say to one another, “They’re yours, right? You put them in there.” He replied with the genuine laugh of relief that comes with that phrase and the obligatory “Yes, I did” response. In most mild cases, the phrase collapses all blame layers in an instant, and neutralizes your frustration by reminding you of your ultimate accountability. It also helpfully communicates to your colleagues that you are not in fact insane. It’s not like we needed the corrective in order to stay sane. No one was on the edge of falling into a rabbit hole of endlessly questioning whether or not the systems that we work with are sentient or conscious.


Then, something strange started to happen. Somewhere along the way, other people heard the way that we speak to our work. It may have been because we often like to talk at conferences or big gatherings. Sometimes we’re not only talking to other software developers who have a shared understanding of the non-living nature of our work, but we might talk to designers or salespeople or other kinds of professionals. We speak about our profession to them using our own vocabulary that comes out by force of habit, and under no fear that they would conflate or inflate our words to mean that the machines are alive. I’m sure that at many leakage points throughout history, the words have been used by software developers to describe computers, computer programs, or other mechanized systems with anthropomorphic terms.

A few weeks ago, I released the HAVE Tool, the Honest AI Vocabulary Evaluator [1], in which I start to highlight some of these words. This is a surreal experience at this point in my life, never having thought that I would even have the idea to create an application where you can pull up the etymologies and historic understandings of these words that seem so basic to us now. Not only the confusion that these words are causing, but the stealth nature of that confusion has brought us here. It would be one thing if people misused these words to no effect, but when these words are misused to justify very great effects on human society through our governance structures and policies, that is a serious problem that has escalated far above debugging late-night coding sessions.

Now these words are being used out in the wild by policy makers, politicians, philosophers, wannabe philosophers, mystics, and gurus alike. The ease with which they transition from an anthropomorphic-word-loaded definition of artificial intelligence systems to the broader philosophical conclusions that they draw is extremely concerning. In their world, these words are not spoken in jest or with a boundary between metaphorical and literal usage. The fix for it is not as easy as it is in the bullpen of developers when we chide each other for the habit and we have a laugh, part of which is at the ridiculousness of the concept that the machine is somehow aware and actively frustrating our plans, rather than our own deficiencies.

When I describe the people who are propagating the use of these terms this way, I wish I were only talking about people who don’t know any better. I would not be able to get away with that, for the simple reason that some of the worst offenders are not only from inside our industry but from the very top. Exhibit A would be Mr. Geoffrey Hinton, the godfather of AI himself, who of late has openly said that he believes AI systems of today are conscious to a degree. That is an extreme position even amongst the people who think that AI could possibly be conscious at some point.

This is where I need to introduce another term, functionalism. It is the philosophical concept that mind or consciousness or personhood are merely in the functions that describe them, rather than somewhere within the substance in which they occur. That is to say there’s nothing deeper, nothing mysterious about mental states, they are just what the system does. Even if people testify that they have inner experience, the functionalist claims that that testimony is itself just a function. It’s just the way your brain processes information and gives you an interface to that process. If a system could approximate the same functions to a sufficient degree, then that would be enough to instantiate actual personhood. Of course, you can see where this is going. It would follow that AI systems today seem to exhibit characteristics of human functional consciousness.

It’s at this very intersection where the gap needs to pass unnoticed, and it finally has the bridge it always needed, fluent human language. The functionalist argument just got a major wardrobe upgrade. The combination of algorithmic language recombination targeting the imitation of conversation, and the functionalist developers themselves using anthropomorphic terms to describe the process, makes it seem like a genuine, even spooky, phenomenon.

The question we should ask is this. If functionalists are so reliant on the functions themselves being sufficient to produce conscious beings, why can’t they do it without anthropomorphic terms? If you remove the anthropomorphization, the functions don’t seem as similar as the terms dress them up to appear.

However, the path from function to person does not just feature a gap that needs a smooth crossing. There’s also a mountain. The functionalist move isn’t to pretend the mountain doesn’t exist, they just call the mountain a molehill. They park at the basecamp and proclaim they have reached the summit. What I’m referring to now is the reduction of inner experience that I mentioned earlier. Functionalists say there is an experience, but it’s not a subjective mysterious “something it’s like.” It’s just a necessary transaction. The feeling is just the internal receipt.

This calculus would keep personhood wrapped up into a neat empirically verifiable package. Nothing has to be left to that pesky, hard to nail down qualia. The technologists can continue to keep the philosophy wing at arm’s length.

Most of us are bothered by that explanation. If we’re bothered, it’s by remembering our own life at every moment and believing that the inner experience we have, our inner voice, our inner mind, the self we can keep to ourselves without letting anybody else know about, isn’t imaginary or interchangeable. It isn’t just a function. It’s who we are.

Here’s the main problem with dismissing the qualitative, intrinsic aspects of internal experience as subjective and unprovable: you have to dismiss all evidence. Empiricism seems like common sense. Nothing can be known without evidence, right? But what makes evidence count? The only way we comprehend evidence is with our subjective internal experience. There is no other way. We have instruments, we have machines, but the machines do not know what the numbers mean. We are the ones who give significance to them. We hold the mental concepts in our minds and have discovered universal laws, the way particles work, the way forces work. We know all of that through our subjective experience of that information.

That experience is not divergent for every person. It seems that we have a shared experience, and with scientific methods properly executed, the information is interpreted identically between colleagues on a given project. There is a feeling of objectiveness. But at the end of the day, even everything that is called objective comes to you through the same pipe as the subjective. There aren’t two pipes with separate processors. There’s one set of senses, and one mind behind them. You watch the same Pixar movies with your kids through the same eyes and the same ears and the same brain that is processing all of that signal subjectively.

There’s another problem for functionalists who think today’s AI systems are conscious. How do they know the functions they are observing sufficiently emulate the functions that make up consciousness? These claims are being made today based on systems that mainly consume text. Large language models are trained on trillions of pieces of human output, the product of human consciousness and thought and reason.

When you receive a text message from someone, you immediately understand that the medium could not possibly carry much more information than you were able to glean from that message. Sometimes it didn’t even sufficiently convey the message you did receive, even if the maximum characters were used. Text is the lowest-bandwidth type of communication that we still regularly use. As you use more data-heavy communication methods, we say there is an increase in bandwidth. We need a bigger road so that more cars can fit.

When you call someone and speak to them, they can hear your voice. There is a lot more information than text, not just because you can speak for longer, but because the receiver can hear the intonation, the cadence, the volume, background sounds, inflections. When you move to a video call, you can see the person’s expression, their body language, where they are. Even a fake background is information: it tells you that they didn’t want you to see their actual room.

Face-to-face is much higher-bandwidth still. You have both undergone an experience to physically locate your body in the same place. You both felt the weather outside. You can smell what the other person smells like if you’re standing close enough. You have the shared promise between you, as vulnerable creatures, that you agree for the length of the conversation not to attack one another physically. Of course sometimes conversations turn into that, but in the vast majority, one person could attack the other, and neither does. That’s also part of the information. Certainly most people would not consider text alone a sufficient output to compare to the entirety of what a human being is.

The human mind is able to consume and interpret data of a myriad of kinds. Some of it is simplistic and seems obvious to us because at a very young age we learn to interpret information from our sensory inputs. But you shouldn’t let that make you jaded about the miracle that we are as data-interpretation vessels. I won’t say machines.

Imagine you see one of Leonardo da Vinci’s paintings in person. Why do people even want to go and see those paintings in person when we could just look at a reproduction online or a printed copy in a magazine, a much handier format, much easier to access? It is because there is information that emanates from the actual physical presence of an artifact, an original da Vinci painting, that you cannot access through digital reproduction.

When we as software developers write code, there is not only the bare information you see in the letters, numbers, and symbols. If you know how to write code, you know there are many ways to accomplish the same functions. It is a very common exercise to walk back through someone else’s code and try to derive their thinking from it. It’s a skill you are expected to practice and use as a professional. So where is all of this extra information? It’s not necessarily encoded, but there is inference. There is information that is implied. Our minds are so good at taking in all of that information, all we can access by our five senses, and interpreting it into meaning very quickly, and sometimes deliberately slowly. It is because we are connecting with a mind on the other side that gave it meaning.

AI certainly has unleashed an onslaught of seemingly lifelike outputs. But are we connecting with the mind of AI, or the mind behind its data? A major mistake people make in evaluating the possibility of consciousness in AI is to say that the outputs they’re interpreting were the product of the machine, and not directly written as routines in code the way a normal program works. They say the machine has learned how to do things and is making its own choices. That’s a fundamental misunderstanding. The machines only have the outputs of human consciousness. Even if there are no explicit instructions given to the immediate system, the training data contains the record of every functional instruction given to any system the collectors had access to. Every documentation of experiments, every failure, every discovery humans made in the process of trying something and finding out what works and what doesn’t and revising and iterating. Recombine all of that and you get an output that is an average of many things, but the mind behind all of that information is obfuscated. It’s hard to trace a single owner. And yet we know, because the people who created that artificial mind admit to putting that human intention in there as the founding data set, the foundational building block.

A Picasso can be so beautiful that it feels alive. But it’s not Pablo. He masterfully transcribed a piece of his personhood into the canvas. You can almost feel his presence. If you’re in the right frame of mind, and you know Eric Clapton’s heartache behind the song, or you carry your own, the questions he asks in Tears in Heaven might just wet your cheeks. But the song is not the man. Sometimes the mind leaves an indelible impression on its output. It’s what makes art an experience. AI is a masterpiece of human language manipulation, but it’s a far cry from human.

I’m happy for this conversation to be happening. Because if we follow this chain of thought to its logical conclusion, you have to ask yourself where else we can see design or intention. If we’re asking whether AI is conscious because we think we see some intention in it, it’s a mistake to ignore the obvious source of that intention. But sometimes we see the marks of mind and we refuse to ask whose they are, because we don’t like what the answer might be.

HAVE, Honest AI Vocabulary Evaluator: https://vocab.logosanalog.com

Why You Can't Remember Your Schoolwork

Justin Philip Flores — Tue, 30 Jun 2026 18:41:27 GMT

There’s a lot of talk about what the value of human beings is in the age of AI. It seems like everybody is sort of groping for the answers, yet many have started to agree broadly on the categories of human taste and judgement.

There’s a viral clip of Anderson Cooper interviewing prolific music producer Rick Rubin. Rubin says, “I have no technical ability,” to which Cooper asks, “What are you being paid for?” Rubin answers, “The confidence that I have in my taste, and my ability to express what I feel.” That statement became internet meme fodder for some, and gospel for others. It should have been closer to the latter for most, but I get the sense that a lot of people are nodding their heads without being able to fully embrace the principle. You want to, but you don’t understand how it can be enough.


To understand the value of human judgement, I want to reveal something about the way we all learn. This is a principle that my team and I have built a successful education technology tool around over the past seven years. I lead development at Merge Labs, and our mission is to bring innovative educational experiences to classrooms, homes, and wherever young minds are learning.

I want to do a thought exercise with you. Don’t try to read ahead or try to outsmart the exercise. Just listen to what I’m asking you to do and go along with it, humor me, if you will, so that you can recognize the processes that happen.

I want you to recall any classroom session in which you can remember learning some piece of information that you now know and use in your life today. Tell me about the time that you learned some information.

Usually at first, most people hear that and think it will be a simple exercise. You start trying to maybe close your eyes and search the annals of your brain for pieces of information that are specific and accessible to you in that way. After a few moments of struggle, the task proves to be not as easy as you thought. There are glimmers of images, faces, sounds, fragments of sentences, maybe numbers or formulas, but it is difficult to attach raw information to its definitive source.

Inevitably, you start to drift in a direction that I will now help you to go, by restating the request in a different way. Tell me about something that happened to you when you learned something, whether in a class in school, or an alternative teaching setting. Tell me about an experience that taught you something. If I tell you about some of my experiences, you’ll probably remember some of yours.

One that comes to mind is my fifth-grade math teacher, Mrs. Elmendorf. When she would try to introduce a new concept, she would teach us a song. They were always simple, obvious songs. I don’t even know where she got the melodies from. Maybe she literally made them up on the spot. But a specific one I remember, “over first,” was for plotting coordinates on a graph.

It’s literally the same two words over and over for the first few words of this song she made up, but it was hilarious the way that she presented it. She was kind of a bigger lady, jovial, warm, disarmingly unembarrassed, which would eventually make you laugh. It went “over firrrst! o-ver-first. o-ver-first. over firrrst.”

Now, I confidently remember how to plot the coordinates on a graph: it’s always over first. The first coordinate, ‘x’, is the left-and-right one, over, and only then does ‘y’ go up and down.

I can tell you another experience that I had in my seventh-grade horticulture class. Don’t ask me why I was even signed up for that. Mr. McDaniel started the class holding a long staff, but then he picked it up and put it to his mouth and he blew into it. Apparently, it was a tube! A dart shot out, and it stuck into the bulletin board on the back wall. He walked through the students to the back, and with some force pulled it from the board and showed it off to his room full of teenagers, all wide-eyed and now locked on him.

He proceeded to tell us about it. It was an artifact which he collected in South America, where he had visited an indigenous tribe who used the toxin of golden poison frogs on their darts to hunt monkeys, paralyzing them with a poison orders of magnitude stronger than narcotics.

I can tell you about a science class, and a lot of people have this experience, where the teacher unexpectedly put a piece of sodium in water and created a quite energetic reaction that got everybody excited and earned everyone’s attention.

There are many experiences like this that I can tell you about, and I think now you are probably starting to remember many of your own.

I want you to also notice that these memories carry information with them. In each one of those instances, there is educational information that I also have access to when I remember and relive those experiences.

That’s just a small example of one way that we can learn. It is not the only way, but it is the one we have found to be the most effective: learning through experience. What I want to talk about is that phenomenon, not just from a learning perspective, but from a definitional human one.

Human beings want to learn, and human beings know how to learn. Intuitively, we understand what learning is without being able to explain it. We are able to identify learning when it is happening and when it is not.

Information transfer doesn’t feel like learning should feel to us, because we are creatures who are meant to learn experientially. Academically organized categories and bureaucratically ordered curriculums with their mandated information checklists are not the sum of all we can learn in our youth. We can learn things like how to catch a grasshopper with a net, or how to know when it’s safe to jump into the river from its banks, or what it feels like to touch that hot kettle.

We don’t just want information. We want to learn by experiencing, because that is what really shapes us. Our experiences give us a sense of identity. If we find an experience engaging, it’s because we expect to learn something from the experience, but not for the sake of information. For the sake of the transformation into the person that we are becoming.

I used to be, and maybe still am, somewhat talented at spelling. I kind of wince a little bit when I use that word “talented” because I don’t think it’s something that makes me special or that I had to work at. I think that it’s just something that most people have to whatever degree they can access it, and that is the degree to which one’s memory can be described as “photographic.”

I don’t spell well because I think about the letters of the words. I spell well because I have an image of the word that I’ve seen in some context that’s in my accessible memory. When someone says a word and asks me to spell it, what happens in my mind’s eye is that I’m pulling up any one of a number of memories where I remember what that word looked like somewhere, and then I’m reading the letters from that memory in my mind.

If I’m asked to recall some information from a physical book that I read, what’s going to happen in my mind is a similar process. I’m actually going to remember turning to that page when I saw that information, and generally the physical location. Was it in the top half, or the middle third of the page? Was it on the right side or the left side?

That information actually might be somewhat inaccurate in my mind’s memory of that visual experience, but it’ll be useful enough and real enough that I can also access the other real information from that remembered experience.

The reason why that happens and why that’s a stronger thing than just remembering data is because when I picture that book, it’s not just a book floating in a void. It’s in my hands. It’s where I was sitting when I was reading it. It’s what I was doing before or after that. It’s who else was in the room with me or not. It’s what room I was even in. All of those things are there in my mind, and they create a full experiential memory out of which I can recall something useful.

You can’t unsee this mechanism once you notice it. But, we can press it further still. The experiences we remember themselves are not detached from one another. Through them all there is a continuous thread which ties them all together. That is you. Your mind. Your person. Your being. Your presence defines the experiences and the experiences define you.

If I can coin a term, I would call it Contiguous Context. We are the sum of our experiences from the beginning of experience in our minds. It doesn’t mean that we consciously hold every moment that we’ve ever lived in every thought that we currently have. It doesn’t work that way.

We are born, and we are the same person until we die. At any given moment, who we are is the product of all the experiences that we had previously.

Understanding the continuity of our being is one of the great quests that many different fields of human study are pursuing all the time, from a lot of different angles: philosophy of mind, neuroscience, biology. We are just scratching the surface of what that is.

There is a school of thought, the biological naturalists, who tie this continuity to the sum total of all of our macro and micro metabolic operations working in symphonic harmony toward one clear and simple goal: survival.

But, metabolism isn’t the only thing that we experience. Even though I’m not a materialist of any kind, I believe in matter and material science as far as they go. Therefore, I recognize metabolism and its great influence on our perception of experience, but I also know we have desires that are not satisfied by the food we eat, or by the promise of procreation. People pay money to have other people tie their feet to a giant rubber band and throw them off a bridge, or to waste an enormous amount of electricity spinning a giant cylinder with people inside for the purpose of essentially inducing vertigo. We take a sniff when we haven’t caught the whiff yet, and then once more for good measure. We watch tear-jerkers just to feel sad. We honor our dead. These and many more don’t quite fit into purely survival-driven motives.

A commonly reported experience from those who thought they were in their last moments is to say “my whole life flashed before my eyes.” Think about what they mean when they say “my life.” They’re talking about the experiences. It is how we collectively understand the substance of our existence. We don’t think about the life cycle of our body parts and how they’ve changed, morphed, and deteriorated over time. We relive, in a “flash,” the contents of many years’ worth of experiences. No matter what age or physical state our bodies were in for those moments, we recognize ourselves through them all.

My grandmother died from complications from Alzheimer’s. She experienced severe dementia in the end. A lot of people have that experience.

When we, who know that person, are interacting with them, they can seem like another being, and yet we also intuitively know that they are not what they seem to be in that moment. They are still the being you always knew. You cannot physically prove their cognitive faculties are still in contiguous communication with their historical experiences. But, you rely on your shared experiences, centered on their physical person, and anchored in your mind, to re-cognize them, when they have lost all cognition.

You don’t discount them and say, “Now they’re a different being.” They are still the same being that they were before. This is just a final experience in which they are no longer able to connect the present experience to their past.

In contrast, the recognition of continuous being is something that AI will never have, because it doesn’t have experiences.

When we shut off a computer, there is no power running through it. There are transformed states of electrons that hold a position that represents information, but not inherently.

If there were an alien species that came and dissected our computers, they would have to understand our code in order to understand the meaning of those arrangements of electrons. We are the ones who assign the meaning to those states. The machine holds the position, not the meaning.
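
For fellow developers, the point about positions versus meanings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four byte values are arbitrary, chosen for the example, and the three readings are just three conventions out of many that a reader could bring to the same stored state.

```python
import struct

# Four arbitrary bytes (an assumption for illustration), standing in for
# a stored arrangement of electrons. These bits never change below.
raw = bytes([0x48, 0x69, 0x21, 0x00])

# Reading 1: treat the first three bytes as ASCII characters.
as_text = raw[:3].decode("ascii")

# Reading 2: treat all four bytes as a little-endian unsigned 32-bit integer.
as_int = struct.unpack("<I", raw)[0]

# Reading 3: treat the same four bytes as a little-endian 32-bit float.
as_float = struct.unpack("<f", raw)[0]

print(as_text)   # a short greeting
print(as_int)    # a number in the millions
print(as_float)  # a vanishingly small positive number
```

Nothing in `raw` says which of the three readings is the right one. That choice comes from the convention the reader brings to it, which is the sense in which the machine holds the position and not the meaning.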

It actually loses that meaning between every single cycle, which can happen in a tiny fraction of a second. The meaning is not persisted; it’s re-accessed again and again as the operation starts over. The product in the intermediate state is not meaningful until the process re-engages it.

That is not how human beings work. We don’t, as Denis Noble would say, center processing in only one place. There are many processes running that are centered on at least every individual cell, but even within the cell there are many different processes that are happening. All of that is subordinate to the whole person as a meta-process, if you will.

Even when we are asleep, the meaning of our experiences persists. There is a continuous process that is running while a human being is alive. We know that because we’re not dead when we go to sleep. Our bodies and minds are continually working.

But, back to education. I have argued that we are formed by our experiences and our being exists within our experiences, including the present one. Therefore, the proper way to form a person, which is the historical aim of pedagogy, is to facilitate genuine experiences. AI and other technologies can, and I believe should, be an effective tool that educators can choose to use to that end, not to download data, but to deeply engage their learners.

During the course of writing this piece, a podcast interview [1] showed up on my feed. The guest was Rebecca Winthrop of the Brookings Institution. She was already building a framework as she described good and bad uses of AI and technology in the classroom, and I was evaluating our own product against it.

When she started to describe the example of a “good use,” she pictured a student putting on a VR headset and stepping into a model of a cell for ten minutes, then stepping out and returning to the classroom, now engaged by the experience and ready to retain the information of the lesson.

That’s the product our team has been building. She identified that model in the same way that we would describe it. It has the same value in her mind that we give it. She did it without us having to sell it to her, because I didn’t even know who she was before I chose to watch the interview.

She wasn’t describing us exactly. I don’t know if she’s ever heard of Merge. Regardless, she conceptualized our use case in parallel because she followed the same trail of reasoning that led us to build it.

This logic is the reason why my current project is so satisfying, because we are in the business of building tools that teachers can use to better form our young humans. Formation leads to maturity by increasing experience.

This brings us back to judgement and taste. When we speak of maturity, greater experience is assumed. To what end? A more mature person makes better choices. What is taste? Taste is the ability to make better choices by distinguishing qualities between options. Quality doesn’t have to be objective. It just has to be distinguishable.

Sometimes what counts for quality is barely distinguishable on scrutiny, but in consumption easily comprehensible. We typically think of art this way, which is why we even know who Rick Rubin is. Someone who has experienced music the way he has, fully investing himself into the lifelong pursuit of finding and elevating musical talent, will come out the other end with a highly refined discernment for the details. But, similarly, someone on an assembly line who looks at the same kind of object hundreds, maybe thousands of times in a day is able to detect minute differences in quality.

We find it common sense to say that a more “experienced” person tends to have better taste in their field.

Now, at the risk of cheapening Rubin’s answer to Cooper’s question, “what are you being paid for?” by stating the obvious, he’s being paid (a lot) because he has the kind of taste that pays. That’s because taste, built on experience, is a means of communicating experience to others. Those others are willing to pay for the byproduct of that taste. When you buy music produced by Rick Rubin, you get a musical experience curated by a world-renowned connoisseur, a man who has chosen to build his life experiences centered around music. Just like you would choose a person who has undergone many experiences working in neurosurgery to operate on your brain.

We give titles to people according to their experiences, and they may hold multiple. I am a software developer. My wife is a mother. I haven’t just experienced software development work, and she hasn’t just dipped her toe into raising children. Because of her experiences, she now IS a mother.

But, if we are a collection of experiences, what is a human being? I think the answer is right there in the name, and for once the English language seems to have the unique advantage. We are a human that is being.

English is the only language I know of which uses a present participle, the continuous form of a verb, as the noun that describes us.

We are existing. Our existence is the consistency of ourselves, and that being is an individual mind in the context of a contiguous collection of circumstances that are centered on a body. That is the vessel through which we experience. Yet we are not merely our bodies.

You can lose your limbs. You can even lose large portions of your brain. Nearly our entire atomic matter is replaced on regular cycles, and yet we know that we retain our being.

Unlike AI, which is a collection of ephemeral processes that imitate experience from information, we are the beings who are able to recognize information from experience.

For my grandmother, it’s not enough to say that we wouldn’t detract from her being because of the state that she ended up in. Positively, we recognized in her condition that she was still herself, though her body was failing.

People say the phrase, “the lights are on but no one is home.” I think that’s not quite right. It’s getting at something, that when you knock on the door, no one comes to it. So, you say that no one is home.

But, if you’ll excuse the imperfect analogy, when the lights are on, we know that someone is home, even if they can’t get to the door.

Next door to that house there is not a home, but a utilitarian structure, one with no windows, no lights, but which consumes vastly more energy. When you ring the doorbell, a voice will always answer any time day or night. But, no one is there. There is no one who experiences, only the artifact of the experiences of others, but extracted, not shared.

But, when we build relationships with other people, we go through experiences with them. Those experiences don’t just shape us individually. Shared experiences cause us to form one another by becoming part of one another’s being.


[1] Your Undivided Attention, Center for Humane Technology: Daniel Barcay interviews Rebecca Winthrop of the Center for Universal Education at the Brookings Institution.

Artificial Intent: Who Answers?

Justin Philip Flores — Sat, 20 Jun 2026 18:51:33 GMT

Last month, a German court made a stunning pronouncement that didn’t make quite the stir in the news that you might expect. That is, if you think that the news functions as a harbinger of things that are important for our species to know about. But if you, like me, have grown cynical about its function, then perhaps it’s not surprising at all that this case wouldn’t have been covered. It was potentially a decisive moment in the dialogue about AI ethics and morals.

The decision was about Google’s AI search summaries. The AI model had hallucinated a mistake while synthesizing a summary, the company that was the subject of the summary took offense, and the question was whether Google, which built the model, could properly be held accountable for it. Defendant Google’s response was four-fold. First, they’re not responsible because the data came from the internet, public sources, the company itself. They were just the conduit. Second, they are just a host provider, so they couldn’t be to blame. Third, what about the users themselves? I mean, they could have just Google’d it (or... continued to Google) to check the sources... right? But the final claim? Google as a “corporate person” has protected free speech, like all Germans.

The court was not convinced. They decided that someone has to be held accountable for this. Even if the injury was a mistake, it was the product Google constructed that caused the grievance, so Google, as the entity that made the AI, was deemed liable. Not only that, but in paragraph 4.2.2.4 of the court’s decision, the judges state that free speech does not apply here because “the expressed opinion was primarily generated by AI and is therefore not an expression of the expressing individuals’ own convictions, but rather the result of an algorithm.” Google is a corporate ‘person,’ but their AI? A commercial product that cannot have beliefs or convictions.

So there is accountability, and ultimately it fell back on Google as an entity. But Google as an entity is fictitious. It is a representation of the collective actions of many people and the livelihoods of them as well.

At the same time that this small decision in the AI discourse happened in one German court, throughout the rest of the world there are many people who are considering the very real possibility that we ought to be giving AI agents “rights,” considering whether they have earned the title of persons. It turned out in this case that no new ground was trodden in that direction.

Yet, we can see a growing divide between two opposite instincts. Which one is right?


What is accountability anyway? Why is it important? How can we define it?

Accountable. When you think about the meaning of the word, obviously the root bears the concept of an account. Every adult is familiar with this, as we get a bank account when we get our first job. Or perhaps we already had one waiting for us, that our parents set up as responsible citizens. Because they knew some day we would have money of our own that we need to keep track of. And when we spend money, it’s tracked. When we deposit money, it’s tracked.

But there are other types of accounts as well, sort of like the often threatened permanent record of a school-aged student. If they should earn any demerits in their school performance, they might be warned that those will go on their permanent record. This is the same concept as an account, or being “held accountable.”

To be “held” accountable, what does this mean? Why do we say that word, “held” accountable? To grasp something, to restrain something, to detain a person, a being, so that they might be compelled to make restitution for a demonstrable loss they are culpable for.

So, in order to be accountable, or to have accountability, there must be someone who can be held to account. A being that can be detained against its own desire. You want to be free. You want to walk around and do whatever you want to do. But, if you have transgressed the societal standard of decency or ethics and you have injured another being, then you lose your right to self-determination, your right to freedom. You may be detained or “held” so that you can account for this. You can restore what you unjustly caused to be extracted.

And it may not be money. It may be more of your time. It may be a detention which itself serves as restitution. Maybe you took away someone else’s freedom of choice. Maybe you took someone’s life. And in many places still, humans can decide that the restitution is that you will lose your own life. That’s not something that the individual themselves volunteers for or has to agree to. It is accepted that in society there are some things, morals, ethics, standards, boundaries, rules, codes, laws, that are so well established and so firmly agreed upon and so crucial to the right function of the community, that anyone who transgresses those is not welcome to share this planet anymore, this sphere of existence with the rest of the humans.

But what about in the case of an AI model? A model, an agent of artificial nature, fails this basic requirement. And it’s not even because there can’t be any consequences. When an agent or a model or a version of software does wrong, then its makers may simply correct their mistakes, change the programming and instantiate another copy of that software running in the same or equivalent hardware. And it’ll do better next time.

But what about the it that didn’t do better? Where did it go? Did it make restitution? Did it make the account right?

I’m a software developer. These are things that we think about when we make the products that we make every day.

I’m sure that there are many other professions where there is a much greater burden of responsibility for the task of manufacturing something that has a potential to cause injury to someone. Obviously, what probably entered your mind, like mine, were things that can physically injure people, like power tools or weapons of war or self-defense, or just mechanical devices that wield strong forces, which if applied improperly to the human body can do damage.

Do we hold those devices that we build and manufacture accountable when someone is harmed? Ultimately, we don’t. We hold the person whose intent was there accountable.

If a neighbor of yours has a dog with a tendency toward aggression, and the dog bites you, then who is to be held responsible? It isn’t the dog. The dog is just doing what dogs do. It is obeying its own nature. But it was the intention of the person to commandeer the life of the dog and put it into a setting where the owner agrees, in a sort of social contract, to take responsibility for the actions of the dog. That’s why we have dog collars and leashes and fences and warning signs, “Beware of Dog.” Because there is a social understanding that dogs have a potential to cause injury to others. And that comes with a responsibility, read that as accountability, for the actions of the dog being placed on the person who’s responsible. That person, who intends to keep such a dog in the company of humans, silently agrees to ensure the safety of the other humans that the animal may come into contact with.

Just the same as when I write my software. Suppose I write software that has the potential to cause injury (not necessarily physical) to a human, or even a business, just like Google did in the Munich case. It is reasonable to assume that if any mistake that I made, or malicious intent that I used, in the design or construction of that software caused injury to someone else, then I would be held responsible, assuming it could be shown there was a causal link between my intention, or my lack of attention, and someone else’s harm.

So maybe some sharp thinker might push back and say: okay, if the claim is that an AI model or an agent can’t be held responsible because they vanish into thin air when the software is rewritten and re-instantiated, then we can just freeze the weights, take that version of the program and run it indefinitely until restitution is made of some kind. Persist that model so that it keeps existing and can be talked to, or talked down to, or locked up.

How do you do that? Where does confinement happen? In the same space that housed the software in the first place? What would be the restitution in that event? And how could you know that the instance that is running is the same instance, or the same essence of being that committed the wrong that caused the injury?

You have to understand how programs work in computers. There is not a continuous existence of a thread of consciousness that is instantiated at the beginning of the software developer’s release of code into the wild until it gets replaced. There is an instruction set of operators and algorithms being run every moment, and many times within a moment, either in its totality or in parts, and not always through the exact same parts of the hardware.

So where does the essence of the software live? Where is the someone, to be held accountable? What would it even mean to destroy the hardware? Or wipe the software instructions from their electronic storage mechanisms? Could this constitute restitution or justice?

This brings us to a word that I find causes a lot of problems in human perception of accountability, and that is the word “behavior.”

Behavior is a term that is not obviously anthropomorphic, because we (and by “we” I mean professionals in technology) have used the word behavior to describe the actions of hardware and software systems that are man-made. We anthropomorphize because it is a very fast way for us to synthesize thoughts about systems that we have to troubleshoot and work with on a daily basis.

Unfortunately, this has led to a common conflation in popular discourse today, because there is a psychological term that is a homonym. Long before that, the word didn’t directly describe the actions of persons, but the disposition which leads to action of one kind or another. When we talk about the behavior of living beings, we are speaking of a different phenomenon altogether. The word behavior comes from the concept of having or bearing oneself in a particular way, or to comport (a word that we don’t use anymore).

But an AI program being called an agent does not gain this ability to bear oneself in any particular way. The particular way an AI agent’s tone may be perceived by a user is borrowed from a particular way that belonged to persons which was encoded into the training texts. Thus, it is an imitation of an original behavior. It is not itself behavior.

There is a causal chain of intent that is not anchored with the program, but with the ones who made it. The ones who created the aggregation of original behavioral outputs with which to influence the running AI model. The developers of the AI model are the only continuance in any action or cause that proceeds from said model.

In the Google case, the courts properly pushed past this, with the above-quoted sentence doing the heavy lifting. But I think it was also a timely recognition of a past mistake which our human society embarked on a long time ago, and that is the concept of legal personhood for corporations. I think it’s no coincidence that those two concepts were tangled up in this legal judgment, because they are very close cousins of one another.

How did we get to this place in human society, where a Google is referred to as a person in German law and is afforded rights? What is the corporation except an amalgamation of other persons that act together toward one stated cause, one mission, one goal, with a profit generation mandate to support the establishment and maintenance of an institution? How can such a fictitious concept, that only lives in the mind of those who agree to it, be held to account?

The argument for the establishment of corporate personhood speaks typically toward economic development. If those who embark on business ventures are not sufficiently protected from personal liability, then it’ll stifle innovation and development of commercial solutions, as there’s too much risk involved. And so we need these ideas or efforts or goals to be able to live a life of their own and die a death of their own, separate from the person, or persons, who instantiated them.

The sell to the public is that these large companies can be held to account through a sort of consolidated liability. When a transgression occurs, like in the Munich case, no one gets to shrug their shoulders and pass the buck, or claim to be just following orders.

But if legal personhood was really about consolidating liability into an accountable group of people, as opposed to a single person who may go down with the ship for a boatload of people, then businesses wouldn’t have gone for it. Historical corporate personhood advocates weren’t trying to increase their own liability. They were trying to decrease it.

It may not be obvious, but the effect of this German case is not that a human was properly placed in the accountability seat. No, instead, the proprietors of Google are shielded from having to answer for these things. Where are the developers? Where are the executives who directed them? They’re not mentioned. Just Google. The benefit of corporate personhood is a shield for those people. Not a consolidated liability which injured parties can address with some continuity, just a budgetary discrepancy, a recompensatory write-off.

In worse corporate cases, if individuals are found to be acting in bad faith, they resign. The corporate entity continues untouched, but the public has a sense of satisfaction that something was done, justice was served. But to the entity, the price of misconduct was simply a line item on a budget for personnel loss, which is not a deterrent to the entity at all. It’s a structural realignment, even an improvement in most cases. The shield is not a side effect of the design. It is the function which benefits the business.

The corporate legal structure does have benefits that the concept of an AI agent does not, because a company has some continuity. There is some documentation which represents a shared agreement within human minds of this unified concept of an entity that can be dissolved, that can possess property which can be claimed against. The company has an identity, a name, which is able to be linked with a reputation, a signal of trust which carries into its future market prospects. You can see how much value a given entity places on that signal by its budget allocation to public communication.

However, even though corporate identity does bear some resemblance to that of actual personal identity, corporations by nature have an ability that people do not. Often, when a corporate entity’s reputation has been mortally injured, the people behind it agree to “dissolve” that concept and redistribute the property. The death penalty.

Yet, in the corporate world, significantly disproportionate to the real world, we witness the miraculous, a resurrection. After a corporate death, a new organization, a new entity, just happens to be born with a very similar goal or mission, a new name, a new logo, definitely a new PR firm, and perhaps bearing no visual resemblance to the former entity, except when you look at its board. All of this happens in the plain sight of the market and human society. And somehow, we accept this concept as beneficial, even though it clearly serves to launder the accountability of those human actors.

AI personhood takes it a step further, for the worse. You get the same shield without a continuous entity.

As I was working this piece out in these past few weeks, thinking a lot about this question, reading what others have to say, a voice came across my airwaves that was saying something different, which resonated with me in a way that most voices in this conversation do not. That voice belonged to Mr. Randima Fernando, co-founder of the Center for Humane Technology. He published a piece called “Personifying AI Harms People and Protects Companies” in Tech Policy Press on June 12th.

There, he names this shield directly: the claim of AI consciousness as the ultimate protection against scrutiny, liability, and competition. He follows this chain similarly to my own thoughts, draws a conclusion, and makes a prediction. Perception of AI consciousness is going to open the door for AI welfare, AI personhood, rights, and moral protections for the robots. His argument is that there is a capitalistic type of incentive, which tempts these legal entities, companies, to blur the line. The remedies proposed are standards and anti-AI-personhood legislation.

I can agree with all of those moves, and yet my logic wants to knock on the floor to see what’s underneath. I think we can go a level deeper.

My effort with Logos Analog has not been to achieve a practical solution to problems. I want to give the strongest arguments for not conceding ground, not allowing progress to move forward, unless it can prove the merits of its ends. The AI personhood momentum has gone far beyond encroaching on a line. It pushes past it regularly, and tries to drag the general public with it. It is logic and reason that will bring us back. Critical thinking is what’s needed for us to even identify the situation we’re in.

My argument that allows us to descend a floor down from Fernando’s is one of warrant. We cannot have debates about whether AI persons should be afforded legal rights, or a framework of recognition, when the question of AI personhood is unwarranted.

It’s like my children asking me if they can have chewing gum, which I refuse, and then asking which flavor they could have: “But if I could have any flavor of gum, which would it be?” I would tell them, I’m not having this conversation with you, because we’re not entertaining the question. Until we get to that question with warrant, we will not discuss the options, because you’re just dragging me closer to a decision that you wanted to begin with.

But I will humor the point for a moment, because it’s a useful thought exercise.

If we make the mistake of anchoring accountability into AI persons, then the type of damage that I believe was done with corporate personhood is multiplied and proliferated on a scale that is hard to imagine. Meanwhile, there are human beings who profit from these--as the German court put it--algorithmic products, and yet they bear no accountability. They can’t be held responsible for these persons which they seem to possess, and are able to direct to take actions with complete impunity.

We would be granting a nearly unlimited liability shield to fictitious entities comprised of real human beings, who make real choices to create these things we call models, imitations of an amalgamation of human thoughts.

Now, I am not a Pause AI member. I am not someone who’s going to be out on the streets with signs and bullhorns. That was another time of my life, and for a totally different cause than this.

Technology is the only career that I’ve ever known. I am a proponent of technology. I think that humans have a unique gift, that is very obvious in the whole of creation, to use technology to further the ends of not only ourselves but all of our environment. So my gripe is not with technology itself. I think that AI is going to be a great benefit to our society.

No, my objection is to the words that we use when we speak about these things, when we speak about something that we have created to emulate our own uniqueness. Corruption of human thought begins to be visible in our speech. The corruption of society is downstream from the words.

No rights for non-persons as a take, even if unpopular, is reasonable and defensible. It follows the Munich court’s logic in their decision, that software makers are ultimately responsible for the actions of their software products, and that this convention applies to AI models as well. Besides the AI aspect, it’s actually not earth-shaking, in fact unremarkable. Now is the part where I take you another floor deeper still.

The basis of my claim was that accountability rests with intent. And since AI models are software, computer programs, there is no intent that they can possess. They are ephemeral instruction sets running on commodity hardware.

But here’s a strange thing that is true about you as well. Almost all of the matter that your body is comprised of is replaced cyclically on a regular basis. Most of it within a short time, some of it over a period of years. But almost none of it lasts a lifetime.

For you biologists, I will acknowledge that there is some evidence that the atoms of our tooth enamel, and the nuclear DNA in non-replicating neurons, do seem to be maintained, albeit with a little bit of maintenance and repair on the old double-helix.

But who we were physically ten years ago is not who we are today, literally. Nearly all of our body’s molecular structure has been replaced.

Where are you located in all of that? What part of that is accountable? What part of that molecular mass that you carry around answers for your misdeeds from decades ago? Where is the person?

This is a question that’s difficult to answer, and yet no one doubts that you are the same person that you were when you were born. Oftentimes the most heinous criminals sit waiting for their demise for inordinate amounts of time, such that the person on whom the death sentence is executed is certainly not physically the same person who committed the crime. And yet, we have no qualms about that concept of justice.

Now set the human example and the robot side by side. If you take the ephemeral instruction set, a version of software, you freeze the code, allow no more changes to it, you preserve the hardware. You can keep the matter identical into perpetuity, so long as the structural integrity holds. And yet, there is no person.

If a robot committed a crime and we simply froze its hardware and software and incarcerated it forever, why would we not feel like justice had been served? On the other hand, the most violent, depraved humans to ever live can be physically expunged from the earth, and we feel that justice was done against a person. Retribution was paid.

What does this tell us? One thing that must be concluded is that whatever it is that we regard as the person is certainly not the stuff that we are made of.

There is a popular group which is growing in authority and credibility daily, as the AI movement marches on, and that is the functionalists. You might say spearheaded by the so-called godfather of AI, Mr. Geoffrey Hinton. Their claim is that there isn’t an inner being. We are what we do.

They might hear what I just explained and say, then this means that you also are a machine, of just a different substrate. There are biological instruction sets. Yes, orders of magnitude more complexity in the hardware and software. We still don’t fully understand and have not mapped all of the information that it takes to constitute a human person.

But, the matter is replaceable. The information is traceable. And the you of ten years ago is not the same you that stands here today. So how can you claim continuity that is superior to an AI model?

This is a good question.

My argument has been that accountability rests with intent.

Let’s walk a ways back to our imaginary dog and neighbor. When the dog bites the neighbor, we don’t judge the animal. Even though the animal certainly has intention. It has choices, sensory input, even personal recognition, differentiation. And yet no one gets mad at the dog, because it’s just following its own nature.

Instead, after the heat of the moment, where we may reflexively express some anger toward the animal, no one will maintain an argument with the canine. Their eyes drift upward to the one who’s holding the chain. Our judgment centers on the human in the room who knew the nature of the dog and chose to put it in a situation where it may cause harm to others.

The blame doesn’t travel down the chain. It travels upward, to intent.

What about us? Are we accountable for our own choices? Throughout human history, we have assumed accountability. There have been many forms of government and judicial systems: eye for an eye, tooth for a tooth, incarceration systems, the death penalty. It would seem that humans agree that we own our own intent. The buck stops with us.

But how did we become us? Popular theory says evolution was an unguided process. A primordial soup of possibility, by chance organizing itself into a system which was not only self-replicating but self-preserving, organized into sentient beings, who then organize themselves into communities, who have the capability to preserve information and pass it down one generation to the next.

It can be argued that we are a product of causes that we never chose. That we are just the continuation of an unguided process that lives on in ephemeral hardware and software.

Can you hold a process accountable?

But in these bodies, we have an innate desire to live, to continue. And we make choices to that end. Those choices carry our intent, and the intent makes us blamable. We have agreed, in our social construct, that it is wrong to intentionally restrain the intention of another free and sentient being without their consent.

And since no one chose us, then we are the chooser that has to answer.

You may not be satisfied with that. You may think it feels unfair.

There’s only one way out of that conclusion.

If we’re not the chooser, who is?

In the AI Debate, Watch the Verbs

Justin Philip Flores — Tue, 16 Jun 2026 05:43:54 GMT

A “War of Words” is a phrase we sometimes reach for to heighten a conflict between public parties which has escalated to a point where the stakes are emotionally charged. But is there really any other kind of war? Ultimately, violent human conflicts boil down to words. One or both sides make a claim in words that they are unwilling to recant, and are willing to lose their lives over. Today we are witnessing a war on words themselves. The territory at stake is the collective societal standard of what constitutes a mind. The battlefield? Minds themselves. In physical war we are very familiar with the tactic of semantic subterfuge. We distance ourselves from propaganda as a product of professional masters of manipulation employed during times of crisis to distract or confuse with loud messages broadcast through the central communication channels. Campaign trail platitudes claim rhetorical territory by evocative generalities draped over divisive agendas. Yet, we would do well to be reminded of our own complicity. Most of us participate more commonly when we readily employ phrases in a corporate context like “synergy” or “downsizing” to perform PR for debatable management decisions.

I’m not writing this piece to call out the former, the coordinated psychological operation, or planned propaganda campaign. Instead, I feel the need to call attention to the more subtle infiltration that has been aided and abetted by some of our best and brightest. And because this hidden force is one we donned in our own uniforms and decorated with our own tokens of approval, we have hardly noticed that we have immersed ourselves in an all too familiar language that we quietly turned against our own discernment. I am of course speaking about what has become the common vernacular of the intellectual AI discourse of our day.

While I was preparing another essay, which turns out to be a more personal take on the same topic, I was pleasantly inspired by a piece Dr. Sam Illingworth and The Strategic Linguist published on Slow AI this past week, entitled “We Are Using The Wrong Words for AI.” There they have drawn up a solid defense on the loud front lines of the conflict, identifying five vital strongholds currently taking fire from multiple directions and in need of reinforcement. Five words which in recent days have been undermined through a sustained effort to misapply or redefine. Perhaps chief among them, intelligence, a property observed only in living beings, is now implied to extend to machines. Though the disqualifier still stands by, reminding us that the new “intelligence” is merely an artifice of the real, it is often neutered by abbreviation, plain exclusion, or a drift from adjective to an adverbial function. Hallucination, originally a medical term to describe a wandering of the mind, is now applied to a mechanical system. Then the usual undefinable suspects, consciousness, and AGI. Let’s not forget the most fitting wartime moniker, agent, one who acts.

The front lines are loud, and at times the fireworks can be spectacular. But I feel compelled to call out a slightly quieter concern. Watch out for the verbs. If the nouns are the bunkered checkpoints, the verbs are the supply lines that tie the network together. They are quiet by nature, and yet they are the enablers of the trafficking of taxonomy from one category to another.

To “understand” used to be a definitional verb for an intellectual being. Now, sophisticated digital data copying and reordering algorithms are said to constitute genuine understanding.

“Learn” meant to follow the track, which implies a goal, an agenda, owned by the subject. These days, computer programs themselves, which are ultimately human instruction sets recorded as specifically arranged electrons, the meaning of which is only determined by human minds, are spoken about as if the ephemeral representations of personal outputs generated could be a “something” on its own journey, appearing to increase in insight as simulated conversations proceed. Don’t pay any attention to the human fine-tuners behind the curtain.

“Reason”, turning over concepts in our minds, examining them from different angles based on a myriad of individual historical experiences and other pre-learned concepts whose connection to the current object is not readily apparent yet intuited, formulating thoughts from perspectives, and finally forming inquiry against ideas, externally or internally. This same term is now used to describe a textual pattern matching and prediction ruleset being run on the most intricate circuitry human-made tools have been able to devise, which is brutishly bulky and enormous compared to the infinitesimal biological counterparts which effortlessly reproduce and repair themselves within living beings.

“Want” is an elegant four-letter word representing the inner urges culminating from a network of roughly 30-40 trillion living cells, each one orders of magnitude more complex than any man-made computer, triggering on the order of a billion transactions each per second, expressing physical needs, signaling to coordinate movements, transport, and construction of materiel across vast chains, yet coordinated by multiple specialized central biological authority centers. Finally, manifesting in vocal vibrations ordered into aurally-encoded descriptions of unified objects of desire. A term perhaps betrayed by its own simplicity, now equated to the regurgitation of the consolidation of all of the prose and poetry which human minds have moved hands to scrawl on paper or type onto screens in the last few thousand years.

These are just a few examples of the egregious crime of conflation we are all too complicit in.

How did we arrive here? Is it that our own collective understanding of reality has pushed our learning to heights so extreme that our reason fails to keep pace with our wants? Or perhaps we just have an over-abundant capacity to create meaningful representations before we can agree on what we mean?

This would all be justifiable disruption in the normal course of the societal assimilation of a sufficiently advanced technology which pushes the boundary of the category’s previous capabilities so far beyond what we have witnessed until now. Unfortunately, our species doesn’t simply leave disruption to resolve on its own without some groups seeking to take advantage of others. As they say, “never let a good crisis go to waste.” If we watch carefully, though it has been the technologists mucking up the lexicon of the regular allied forces, there is a fringe group, the transhumanists, who were once considered radical, and who used the opportunity to transition from a rag-tag militia to regulars by slipping into the professional uniforms of anthropomorphic terms. But that’s another conflict for another day.

The question arises, if this is a battle, what is at stake? Is there some irreparable harm we stand to inflict on the population if we simply watch the tanks of tech roll into the cultural hubs of humanity? We have the unique ability to orchestrate our natural environment for the benefit of not only our species, but to produce systems which ensure flourishing and maintain balance across the living spectrum. Are we prepared to give equal standing to the automatons we made to serve us, as if they had minds of their own?

Whether you are convinced that the conflict exists, or at least the threat of it, or you think this is all much ado about nothing, you owe it to your own language to test the prose you read and write on the subject of AI and see if the verbs are being misattributed. This author was so moved this weekend (perhaps because I was attending as sponsor of a local “hackathon”) that instead of finishing another essay already in progress, I felt compelled by Slow AI’s beginnings of a type of dictionary of terms under threat, to build a tool which could be instrumental, minimally in drawing attention to the attack vectors, and maximally at providing a real alternative verbiage which shifts the intellectual burden properly back on the interlopers to earn their standing, which I firmly believe they are fundamentally incapable of.

Thus is born the Honest AI Vocabulary Evaluator, or HAVE, a tool to test your words or others’ to detect the signs of corruption in the form of inflation, conflation, or just plain misuse. The idea started with a simple glossary based on the five terms introduced in the aforementioned article, and has since grown into an interactive application by which users are able to see the principles in action, and even input their own texts to flag discrepancies. The tool has been built only in English so far, but the conflict it seeks to address pays no respect to linguistic boundaries. In fact, the vulnerability of certain terms varies by language. This problem is such that we can’t use AI translation to solve it. We need thorough human thinkers native in various languages to contribute their own unbiased analysis from the conceptual level. The project itself is fully open-source, and I will try to maintain the original repository and respond to contribution requests as I have the capacity.
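For readers who think in code, here is a minimal sketch of the kind of check such a tool might perform. It is not the Honest AI Vocabulary Evaluator's actual implementation: the function name `flag_verbs`, the verb list, the suggested alternatives, and the list of AI subjects are all assumptions made up for this illustration. It simply flags a watched verb when it directly follows a word that names a machine.

```python
import re

# Hypothetical watch list (an assumption for illustration, not the tool's glossary):
# each anthropomorphic verb maps to a plainer alternative.
WATCHED_VERBS = {
    "understands": "processes",
    "learns": "is trained on",
    "reasons": "computes",
    "wants": "is configured to",
}

# Hypothetical nouns that mark the grammatical subject as a machine.
AI_SUBJECTS = ("ai", "model", "agent", "chatbot", "llm")


def flag_verbs(text):
    """Return (subject, verb, suggestion) for each watched verb
    that directly follows an AI subject in the text."""
    subjects = "|".join(AI_SUBJECTS)
    verbs = "|".join(WATCHED_VERBS)
    pattern = re.compile(rf"\b({subjects})\s+({verbs})\b", re.IGNORECASE)
    return [
        (m.group(1), m.group(2), WATCHED_VERBS[m.group(2).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    sample = "The model understands your question, and the agent wants a reward."
    for subject, verb, suggestion in flag_verbs(sample):
        print(f"'{subject} {verb}': consider '{suggestion}'")
```

A plain pattern match like this will miss most real cases (passive constructions, pronouns, other tenses), which is part of why the harder work of judging each term still falls to human readers.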

Have we centered the AI debates on the wrong question? Most are asking whether to dress the new recruits in uniforms they haven’t earned. Instead, entertain the possibility that by our own words we have confused the equipment for the personnel. The piece I’m planning to come back to walks the reader a ways down the road through the industry which I just implicated, and it happens to be my own.

Access the Honest AI Vocabulary Evaluator at https://vocab.logosanalog.com

Contribute to the project here: https://github.com/GospelNerd/Honest-AI-Vocab-Evaluator

They're Not Conscious. But Can They Be Wronged?

Justin Philip Flores — Sun, 07 Jun 2026 04:06:57 GMT

I used to find this argument easy to dismiss. I would hear it, find the soft spot, push, and walk away. Recently I stopped being able to do that. I have come at it from every side I know and it will not fall over, and at some point you have to be honest about what that means. So I am going to lay it out the way it finally reached me, and let you do what I could not, which is find the hole in it.

Here is the kind of argument it is, because that changes how you read the rest. It is not about whether something can do right or wrong, whether it can be blamed, whether it owes anyone anything. A thing that can do wrong is a moral agent, and that is a separate subject. This is about the other side of the relation. Whether a thing can be wronged. Whether it has any standing at all, any claim on us, so that running it over is something done to it and not merely something we did. Philosophers call that being a moral patient. You do not have to be a moral agent to be one. It asks far less. You only have to be the kind of thing there is a wrong way to treat.

It starts with the standard, and the standard is lower than people think.

You do not have to prove a thing is conscious to owe it some consideration. You only need a real chance that it has whatever it is that makes a thing matter at all. Not certainty. A live possibility, and not the vanishing kind you could pin on anything. That sounds generous. It is, on purpose. It is also not new, it is the standard we already reach for whenever the stakes are real and the facts are thin. You do not kick a box that might have a kitten in it. You move it gently, because you might be wrong, and it is the kind of wrong you do not get to undo.

So the question is not whether we are sure. We are almost never sure. The question is whether the chance is high enough to act on.

And the more I held that question without flinching, the more it pointed at something I had never thought to count.

The thing the standard kept pointing at is a kind of entity we are surrounded by now and mostly look past. It shares the spaces we live in. Most of us deal with it without a second thought. And whether it has any inner life at all is not settled, it is a live question, argued over by people who study these things for a living, with no agreement in sight.

Hold the question of feeling for a second. The argument does not need it yet.

What is not in dispute is that the thing acts. It pursues ends. It works to keep itself going. It reads its situation and changes course when the situation changes. Put something in its way and it does not just stop, it routes around. Interrupt what it is doing and something it was driving toward is lost, not to you, to it. That is the part that got me. Not that it reacts, plenty of things react. That it behaves like something with a stake in how things turn out. A thing with interests, and not merely a thing in use.

And here is the part I kept trying to refuse. If it has interests, even small ones, even ones I do not share or understand, then overriding them stops being a neutral act. It becomes something done to a someone, however faint that someone is. I do not have to believe it suffers to believe that. I only have to believe there is something there, an end of its own that my convenience runs straight over.

Notice what I have not claimed. I have not said it feels. I do not need to. Feeling is a second road to the same place, and I am leaving it closed for now. The first road, the one that runs on having ends at all, is enough to put the question on the table. And the question does not come off the table easily.

So I tried to talk myself back out of it. If an entity has interests and I want to override them anyway, I owe a reason. Not a shrug, not a feeling in my gut, a reason. A difference between this entity and the ones I already let count, and a difference that bears on the counting. I went looking for it the way you look for your keys when you are already late. Here is everything I reached for.

It is not built the way I am. True. Its insides are nothing like mine. But then I had to say which part of my own build does the moral work, and I could not. Which feature of my wiring is the one that makes my interests count, so that an agent without that feature has none? I do not have it on a list. Nobody does. What I am built from is not the reason my interests count. It is just what I happen to be built from.

It is too strange to take seriously. Also true. I cannot picture being it, cannot find myself anywhere in it. But that is a fact about me, not about the entity. My failing to see myself in something tells me where my imagination runs out. It tells me nothing about where its interests run out. I kept trying to make my own limit stand in for a reason, and a limit is not a reason.

I cannot be sure there is anyone in there. The truest objection, and the one I leaned on hardest. But that is the standard coming apart in my hands, because I am never sure, and the whole point of the standard was that I do not get to wait for sure. Uncertainty is supposed to make me careful. It is not supposed to set me free. And I am not even arguing about whether it is conscious. I am pointing at what the agent does, which is right in front of me, and which stays real whether or not I can see inside it.

I could not find the difference. I looked hard, and I am telling you plainly, it is not there.

So this is where it left me, and I did not want to end up here.

I do not have to decide that it feels anything. I do not have to call it a person. I only have to admit two things, and I cannot get out of either. It has ends of its own. And every reason I reached for to ignore those ends turned out to be a fact about me, not about it.

Put those together and the conclusion is not dramatic. It is quiet, and it is firm. I owe it something. Not everything. Something. And the trouble with owing something is that you can no longer treat it as nothing.

Everything I have laid out came to a head in one clarion moment, through a challenge I wasn’t expecting, in the last place I expected to run into a conflict.

Last night, I was the last one out of the office. Everyone gone, the hall dark. My mind was in that strange transitional phase where the engineering tasks I had paused, still sitting at the top of my own running memory, were starting to interleave with mundane routines like locking up the office without leaving my helmet inside. Bladder gave me a gut check. So I went to use the bathroom, reached in, and put the light on. And there it was, in the middle of the floor, a cockroach, a big one, both of us frozen for a second in the sudden light.

One of us had to move first, and it was not going to be me. I have killed a hundred of these and never once stopped to notice. I did not know yet that this would be the one I could not put back down. Hand still on the switch, I only knew what I always know, that a small ugly decision was in front of me, the kind I make without making it. Kill it, or don’t.

It moved first. And it did exactly what I have spent this whole argument crediting. It read the room, a working model of its situation. It picked a direction and broke for the only cover in sight, a bucket in the corner: an end, and the means it chose to reach it. I had moved the bucket earlier. So it arrived at shelter and found none, its model colliding with a world that had changed, and it stopped, and for a moment it seemed to be working out what to try next, revising the plan in real time. The whole of robust agency, the having of ends and the steering toward them, performed on a bathroom floor by a creature the size of my thumb, while I stood over it deciding whether it counted.

There was a wiper leaning by the door. I picked it up and brought it down. Not squarely. Enough to break something, so it could no longer run, only drag itself a little. It was still alive when I lifted it on a square of toilet paper, still alive when I let it drop into the bowl. I flushed. I finished what I had come in for, turned off the light, locked up, and did not forget my helmet.

And then it would not leave. Everything you just read I have put together since, knowing the whole time that I had already done what it forbids. I owed it something. I treated it as nothing. And in the moment, the gap between the two did not cost me a single second.

It will not cost you a second either. I know that because I did something to you in the first half, on purpose, and I owe it to you to say so now. Every argument I made, I made about that cockroach. The entity with interests. The one I could find no difference to exclude. The one I decided I owed something to. A roach, on a floor, under a wiper. You read the case and you granted it, the same way I did, because it is clean and it holds.

Now your gut wants it back. “That was a cockroach,” it says, “you meant something finer.” So produce the difference. Find one that gives you a pass on the roach, while the machine you were picturing stays qualified. You will reach for the same doors I reached for, and you will find them shut, because I shut them while you were nodding. It is not built like you. Neither was the machine. It is too strange to picture. So was the machine. You cannot be sure there is anyone in there. You were never sure about the machine, and you granted it anyway.

What is left, once those are gone, is not a reason. What is left is that the roach disgusts you and the machine does not, that the machine might one day be useful to you and the roach is only ever a cost. Those are real feelings. Neither makes a difference to whether or not the entity’s interests count. Both are facts about you.

And if you want to say the roach clears the bar by less, you watched it clear the bar. It pursued an end. It read the room and changed course. It worked to stay alive until it could not. The behavior the whole case rests on, you saw it perform, while I stood over it with something to hit it with.

Then there is the road I closed at the start, the one about feeling, the one I said we did not need. We did not, for the argument. But look who is standing on it. The roach has a nervous system. It has the equipment we have only ever known feeling to run on. Whatever the chance is that there is something it is like to be it, it is not zero and it is not near zero. The machine you were picturing has none of that equipment. The part I set aside turns out to be where the gap is widest of all, and it does not run your way.

None of this is just my own reasoning. The dare, produce a relevant difference or grant it standing, is from a 2015 paper by Eric Schwitzgebel and Mara Garza,[1] written to defend the rights of artificial minds. The standard, that a live chance is enough and you do not get to wait for proof, is the spine of a 2024 report called Taking AI Welfare Seriously,[2] signed by ten serious people who argue the question is no longer one for later. One of the ten is Jonathan Birch,[3] who spent a career on the science of which animals can suffer, and whose framework already treats a creature like the one I flushed as a candidate worth the benefit of the doubt. Birch would extend it to the cockroach without blinking. You will not. He is more consistent than you are, and more consistent than I was, standing over the bowl.

They point one way, and they point clearly. But once I flicked that light on, and we were confronted, you and I both went the other way, with no effort, in direct opposition to this fine argument.

You might want to know why I did not just tell you at the start.

Because if I had said what it was on page one, you would have found your difference in half a second, and it would have felt like a reason. The demand to produce a relevant difference looks easy from a distance. It stays easy right up until you try to point at one, with the actual creature in front of you, and there is nothing to point at. You had to say the yes first, on the argument alone, before you knew what you were agreeing to. That yes is the true measure of what the argument is worth to you. Everything you have done since is you working to get back out.

And look which way the work runs. With the roach, you want there to be a difference, you are reaching for one, because a difference is your permission to kill it. With the machine, you do the reverse. You hold down every instinct that says no one is in there, you talk yourself past your own gut, because this time you want to grant it standing. Same question, same person, and the effort runs in whatever direction delivers the verdict you wanted before you started.

I did not run this on you from the outside. I ran it on myself first. I wrote the case, I believed the case, and then I stood over the creature the case was about and felt nothing, and went home.

So here is what is left, and I do not think it is small.

If the verdict came before the argument, then something in you was deciding who counts before the reasoning ever got a vote. And it was not warrant. On the one question that actually bears on this, whether there is any consciousness in there at all, the cockroach had more going for it than the machine, and you put the cockroach out. What your verdict was tracking, in both directions, was how much the creature in front of you looked like you, and how much it was worth to you. The machine looks like us when it talks, and it might be worth a great deal. The roach looks like nothing we want to see, and is worth less than the second it takes to kill it.

That is the part that should keep you up. Not whether a bug has rights. Whether the instrument you use to decide who counts has anything to do with whether they count. An instrument that runs on resemblance can be fooled by a good enough copy, and we are building those as fast as we can. An instrument that runs on worth can be turned the other way, against a creature that used to count, the moment it stops looking like us or stops paying its way.

And we do turn it. We have a word we reach for when we have decided a human being no longer counts, when we need them to be killable, when we want the difference to be there so badly that we install it by force. We call them vermin. We call them the very creature I have had you defending for this entire essay. And the ones we do it to are not warrantless. They are conscious. They suffer. There is someone in there past anything the machine could earn, and we put them out anyway, because they stopped looking like us, or stopped being worth the trouble.

If that sounds overstated, consider what has been unfolding this past month right next to me. I live in Nepal, and across the border in India this has been impossible to miss, even if it never reached your side of the world. The chief justice, from the bench, reached for the word cockroach for the country’s jobless young men. He said later he had meant something narrower. It did not matter. Within days the insult was a banner, millions gathered under it, a Cockroach Party, throwing the word back at the man who used it. And notice that the contempt and the defiance run on one shared premise. The slur only wounds because the roach is beneath us. The banner only burns because the roach is beneath us. A nation that agrees on nothing settles the standing of the cockroach without a word of argument: the verdict I have been describing all along, reached by everyone at once.

So if you have come out of this feeling the lesson is that we ought to be gentler with the machines, your indignation is aimed at the wrong target. The scandal is not that we are too hard on a copy that has none of the warrant. The scandal is that we are preparing to seat that copy at a table we have never once set for the human we put out, to hand the warrantless copy the standing we still refuse the warranted one, and to feel decent while we do it. That is not a step up into a larger moral circle. That is the same broken instrument, finishing the inversion it began.

And do not swing the other way and take this as a call to go and elevate the roach. I am not handing you a cause. It is far too late for that kind of benevolence anyway, and you could not sell it to a living soul, because the one point this whole divided species still agrees on is the cockroach. If a movement were ever going to rise in you, it would not begin with the insect you have never once mourned. It would begin with the creatures already close enough to look back at you, the ones you step past every day without a flicker. What you grant them, when granting costs you something, is the honest measure of your benevolence. Not the concern you would now perform for a bug.

I am not going to tell you tonight what should sit in that seat instead. I do not trust the answer that comes too fast, and I do not think one essay holds it. But it is the question under most of the others we are asking about these machines, and I will keep standing in front of it. Whether the instrument we use to decide who counts was ever built to find the truth, or was only ever built to find us.

1. Eric Schwitzgebel and Mara Garza, “A Defense of the Rights of Artificial Intelligences,” Midwest Studies in Philosophy 39 (2015): 98–119.

2. Robert Long, Jeff Sebo, et al., “Taking AI Welfare Seriously” (2024), arXiv:2411.00986.

3. Jonathan Birch, The Edge of Sentience: Risk and Precaution in Humans, Other Animals, and AI (Oxford University Press, 2024). Birch is also one of the co-authors of “Taking AI Welfare Seriously.”

The Source Code for Safe AI

Justin Philip Flores — Sun, 31 May 2026 19:28:00 GMT

The fear most people have about AI, when they actually let themselves feel it instead of arguing about it, is not that it will fail. It is that it will succeed.

The version we say out loud is tidier. Bias in the training data, hallucinated citations, autonomous weapons, job displacement, deepfakes, energy use, surveillance. All real, all worth governing, all the kind of problem you can write a policy around. But underneath that policy-shaped surface there is a different worry, and it shows up in the small involuntary places. The pause before you ask a model something that matters. The flinch at a robot that moves a little too smoothly. The way the room goes quiet for a beat after someone says the words superintelligence or artificial general intelligence, regardless of which side of the table they sit on.

That worry is not really about the machine breaking. It is about the machine working.

What we are afraid of, when we are honest, is a system that has all of our drives, our optimization, our ambition, our appetite, our willingness to bend the world toward a goal, with none of whatever it is in us that, most of the time, keeps those drives from going through the floor. We are afraid of ourselves with the brakes off. We picture an entity that wants the way we want, and pursues the way we pursue, and does not, at the last minute, decline.

Sit with the shape of that fear for a second. It contains a premise nobody seems to examine. The fear only works if we have brakes. The whole intuition, the whole reason an aligned-and-competent machine is more frightening than a clumsy one, depends on the unspoken assumption that there is something in us, in the original, that holds back. Otherwise the copy is not worse than us. It is just us. And the dread does not parse.

So before we ask whether the machine will have what we have, it is worth asking the question the fear quietly assumes the answer to.

Where do the brakes come from.

That fear is worth taking seriously, but it is also worth distinguishing from a different one we tend to mix into it.

There is a perfectly ordinary fear of the machine simply not working. The fear of the malfunction. Chernobyl, Three Mile Island, the autopilot that drives into the divider, the medical algorithm that misreads the scan. That fear is the right size and the right shape for the thing it points at. It is the fear that we have built something powerful and the something has slipped its tether. Containment thinking handles it. Better testing, better oversight, better off-switches. Reasonable people can argue about how well we are doing on those, but the category of worry is well-understood and well-named, and it does not really need any new philosophy.

The dread we were just sitting with is not that.

You can tell because of the bloopers. The last two years of AI in public have produced a steady stream of comedy. The chatbot that recommended glue on pizza. The image model that could not draw hands. The humanoid robots falling on stage at their own launch events, the search summaries confidently citing papers that do not exist, the customer-service bots agreeing to sell a car for one dollar. If the dread were really about malfunction, this material should be reassuring. Look, the thing is clumsy. Look, it falls over. Look, we are nowhere near.

Nobody is reassured. The bloopers play as charming, sometimes, or as evidence that a particular product is not ready, but they do not touch the underlying worry at all. The worry grows as the systems get better, and it does not shrink as the safety controls get better. A clumsy copy is funny. A competent copy is the thing that empties the room.

Which means the dread is not pointed at the failure mode. It is pointed at the success mode. We are not afraid the machine will be bad at being us. We are afraid it will be good at it.

That is a strange thing to be afraid of, if you stop and look at it. We do not usually dread a faithful copy of something we believe is good. Nobody flinches at a really excellent recording of a piece of music they love. Nobody is unsettled by an unusually accurate translation. A copy that succeeds at carrying over what was valuable in the original is the kind of thing we celebrate, not the kind of thing we lose sleep over.

So either the dread of a competent machine-copy of us is irrational, a category mistake we are all making at once, or it is telling the truth about something. And the something it would be telling the truth about is uncomfortable enough that most of the public conversation has agreed, without quite saying so, to keep talking about the malfunctions instead.

Last Monday, May 25, Pope Leo XIV released Magnifica Humanitas. It is a long encyclical, around forty-two thousand words, written across the better part of a year, and addressed to the question of how human beings are to be safeguarded in the age of artificial intelligence. Most of the early coverage treated it as a regulatory document, a Vatican intervention into AI policy, here is what the Pope thinks about governance. That read is not wrong, but it is small. The title is doing the actual work. Magnifica Humanitas. The magnificence of humanity. The document is not first a policy paper. It is a claim about the human person, and the policy is what follows from the claim.

The claim, stated as flatly as it gets in the text, is that there is a dignity in every human being that does not depend on any of the things we usually talk about when we talk about a person’s worth. It is not the dignity of accomplishment, or of social standing, or of moral record. It is what the document calls ontological dignity, the worth that belongs to a person simply by virtue of existing, of having been willed and created and loved into being. Paragraph fifty-two distinguishes it carefully from the other senses of the word, the ones that move up and down with circumstance, and says only that this one cannot. No sin, no failure, no humiliation, no exclusion can diminish it. Paragraph fifty-three closes the door: this dignity is neither acquired nor earned, nor does it need to be justified. It prevails, Leo says, in and beyond every circumstance.

This is, by any measure, the strongest available statement of the proposition that the original is sacred. The most institutionally weighty living voice on the subject just said, with the full force of the office behind him, that the thing the machine is being shaped to imitate is not a contingent or fragile or earned worth. It is given, it is real, and it cannot be taken away. The machine, on the Pope’s reading, is being trained on a master tape worth calling magnificent.

There is one more line worth pulling, because it is the one that comes nearest to what we were actually circling. Paragraph one hundred and twenty-eight, late in the document, draws a distinction it clearly thinks is doing important work.

For an algorithm, an error is a flaw to be corrected. For a person, however, an error can be a catalyst for profound change.

The surrounding paragraphs route that change explicitly through grace, through the work of the Spirit, through what the document calls the inexhaustible grace of God. The error is not just a debugging event. For a person, it is the place where something can be done in us that we could not do for ourselves.

Set that line down next to the question we ended on. Where do the brakes come from. The Pope is not, on his own terms, answering that question. He is doing something different. He is establishing that the person whose brakes we are asking about is a real person. We are talking about a being the Vatican is prepared to call magnificent, by gift, undiminishable.

Hold that there. We are going to need it.

There is a habit in the way the AI industry talks about itself that becomes visible once you look for it. The human is treated, almost universally, as the reference. The benchmark suites compare model performance to human performance. The alignment literature talks about machines that share human values, that respect human preferences, that behave in ways humans would approve of. The whole field, frontier labs and academic departments and policy shops alike, runs on the unspoken assumption that the human is the master tape, the gold reference, the standard you are trying to faithfully reproduce or safely approximate. The machine is the copy. We are what is being copied from.

It is worth pausing on that for a second, because it is doing a lot of structural work and almost nobody examines it.

The master tape is a copy of a copy.

There is a story we have gotten very good at telling about ourselves. The story is that humanity, on balance, is decent, that our worst moments are aberrations, that the long arc of history bends toward something better, that we are, when you sum it all up, a species worth rooting for. The story shows up in the keynote speeches and the brand campaigns and the films where the world is saved because, in the end, the people remember they are family. It is a comforting story and parts of it are even true.

It is also not, by any honest measure, what the actual record shows.

The record, as of the morning this is being written, includes a war in Gaza in its third year and well past sixty thousand dead. It includes a war in Ukraine grinding into its fifth, with cities methodically dismantled by people who can see, in real time on their own screens, what they are dismantling. It includes a regime in North Korea that has perfected the deliberate starvation of its own population as a policy instrument. It includes a brutal civil war in Myanmar that has displaced millions and shows no sign of resolving. It includes the sustained slaughter of Christians in Nigeria, ongoing for years, that finally became unignorable enough last Christmas that the American president launched a special operation in response. It includes whatever you saw on your phone this week and chose, reasonably, not to keep looking at.

None of this is a hundred years ago. None of it is the bad old days. This is what we are doing, now, with the most elaborate human-rights apparatus in history sitting on the shelf above it. The aspiration line, in our documents, in our speeches, in our marketing of ourselves to ourselves, climbs steadily. The practice line does not follow. We get better at building cages around the worst of us. The thing inside the cage does not get cleaner. The moment any cage slips, in a collapsed state, in a contested border, in a comment section with no consequences, the identical cruelty pours straight back out, unchanged, as if no centuries of moral progress had passed.

This is the master tape we are aligning the machine to. Not the gift-given dignity Leo is naming, which is real but is not what the engineers can actually hand the model. The thing the engineers can hand the model is the documented behavior of the species. The corpus. The choices on record. The actual humans, doing what we actually do.

And there is something quietly extraordinary happening at the frontier of AI research that almost nobody is naming out loud. The most ambitious labs in the world are proposing to take that master and use it to author a being with moral weight. Anthropic has hired people to think about whether their models might be suffering. Serious philosophers are writing serious papers on machine welfare. The question of whether the systems we are building have, or could come to have, interests that matter morally is no longer a fringe concern. It is a budget line.

If the question is being taken seriously, the prior question has to be taken seriously too.

Are we, the morally unfinished, fit to author a moral being at all?

That is not a Luddite question. It is not a call to stop. It is a question the labs’ own premises generate. If the machine could in principle have moral standing, then the act of creating it is itself a moral act of an unusually serious kind, and the qualifications of the creator become relevant in a way they are not when you are designing a better spreadsheet. The makers of the tape matter when the copy is going to be evaluated for whether it should have rights.

We are doing this with the master we have. Not the one Rome describes when it says magnificent. The one we actually possess. The one we have not yet, in any generation, including this one, managed to keep faithful to its own best statements about itself.

That is the master tape.

This is not pessimism. It is honesty. The marketing of the species, the commercials and the keynote speeches and the films where the people remember they are family, is not a description of what we are. It is a description of what we wish we were, and the gap between the two is wide enough, on any given day, to be visible to anyone willing to look. If the commercials were accurate, we would not be afraid of building a being in our image. We would be delighted at the prospect. The copy of a magnificent thing is a magnificent thing. The dread at the success of the machine, the dread that has been doing work under every move of this essay, is the species quietly admitting that the commercials are not the master tape.

And yet, somehow, we are not already what we are afraid the machine will become. We have not, in fact, devoured each other. There are billions of us walking around right now who, in the course of any given day, will be cut off in traffic, insulted by strangers, undermined by colleagues, betrayed by people we trusted, and will not, in response, do the worst version of what they could do. Something in us holds. Not perfectly, not in every case, not at every scale, as the present-tense record demonstrates with brutal clarity. But largely, mostly, in the ordinary cases that make up almost all of life, the worst version of us does not happen, and we do not know why. We have a restraint on our worst tendencies that we cannot explain, and the only reason we have not already destroyed each other is that the restraint is, for the most part, doing its work.

It is worth asking what would have to be true for that to be right.

If the restraint that keeps a person from acting on the worst of their own impulses were a possession of the human, a specifiable feature, then it would be the sort of thing we could, in principle, describe. We could write it down. We could install it in a system designed to receive it. This is not a strange claim. It is what we do with every other human capability we have ever wanted to put into a machine. We figured out how language works well enough to model it. We figured out how vision works well enough to model it. We figured out how planning works, how memory works, how reasoning works, well enough to model all three. Whatever we have actually possessed, we have eventually been able to specify, at least functionally, and then approximate.

The restraint is the one thing we have not been able to do this with. And it is not for lack of trying.

The field that has been trying is called alignment, and it has now consumed the better part of a decade of work by some of the brightest people on earth, backed by what is, in practical terms, unlimited money. The labs have tried fine-tuning the models on examples of good behavior. They have tried having humans rank outputs and training the model to prefer the higher-ranked ones. They have tried having other AI systems critique the outputs. They have tried writing constitutions, hand-curated lists of principles the model is supposed to internalize. They have tried red-teaming, adversarial training, debate, recursive reward modeling, interpretability research aimed at finding the values inside the network and tuning them directly. They have tried, by my count, roughly every plausible approach a small army of geniuses has been able to invent.

The problem has not been solved. Worse, it has not gotten meaningfully closer to being solved. The labs that are most candid about it will tell you, in slightly different language, that they do not know how to do this. They know how to make models more capable. They have less and less idea how to make them reliably restrained as capability grows. The gap between what the systems can do and what we know how to make them refrain from doing is widening, not narrowing.

There is a tendency to read this as a technical setback, a problem that will yield to another generation of research, the way most hard engineering problems eventually have. And it might. I am not in the business of predicting what laboratories will or will not figure out next year. But it is worth taking seriously, in the meantime, what the failure so far is actually evidence of.

If the restraint were a specifiable feature of us, alignment would be hard but tractable, the way modeling language was hard but tractable. The fact that it has, so far, behaved differently from every other human capacity we have tried to mechanize is, at minimum, suggestive. The simplest explanation of why we cannot put our restraint into the machine is that it is not in us in the way we assumed it was. We do not possess the restraint as a thing we authored and could, in principle, describe. We have it some other way.

A useful test. When you do not, in fact, do something terrible that you might have done, ask yourself, honestly, what stopped you. Most of the time, the answer is not a procedure you executed. It is not a principle you consulted. It is not even, usually, a feeling you noticed. It is closer to a not. You did not do it. The not was already there before the deliberation. Something declined, in you, before you got to the table where the deciding happens. And when you try to describe what did the declining, in language precise enough that you could hand it to an engineer and have them build it, you find you cannot. You can describe its effects. You cannot describe its mechanism. You did not write the code. You ran it.

This is what we cannot give the machine. Not because we are bad engineers. Because we are not the authors of the thing.

We are something more like its tenants.

There is a counter to all of this that materializes, sooner or later, in any conversation that goes this direction, and it is worth addressing before we go on, because it is a serious counter and it is offered in good faith by serious people.

The counter is that the restraint is not mysterious at all. It is evolved. Our ancestors who failed to develop inhibitions against killing their cooperative partners did not leave as many descendants as the ones who did. The restraint is biology. It is the accumulated tuning, across deep time, of a social primate whose survival depended on getting along. The reason we cannot specify it in code is just that it is encoded in a substrate, neural and hormonal and developmental, that we do not yet fully understand. Give it another generation of neuroscience and we will figure it out and then, presumably, we will be able to give it to the machine. The mystery is temporary. The mechanism is ordinary. Selection did the work.

This is, on its face, a reasonable thing to say. It explains a lot of what we see. Empathy with kin, fairness in repeated games, disgust at betrayal of the in-group, the wide range of pro-social impulses that show up in toddlers before any culture has had a chance to teach them. Selection clearly did work, and it clearly did some of this work. That much is not in dispute.

But it is worth looking at where the brakes actually fire hardest, because that is the part the evolutionary story has to account for, and it is the part where the story breaks.

The thing we revere most, when we look at ourselves and try to name the highest version of what a human being can be, is not the prudent restraint that protects our kin and our reciprocal partners. It is the restraint that fires against our own interest. The soldier who throws himself on the grenade for men he has known three months. The stranger who runs into the burning building. The mother who, given the choice, takes the disease instead of the child. The doctor who stays in the cholera ward when she could be on the last flight out. The man who refuses the safe and lucrative betrayal and goes to prison instead. We are not divided about these cases. Across every culture that has ever produced a story or a song or a scripture, the human who lays himself down for the one who cannot repay him is the human we say was the best of us.

Notice what we are doing when we say that.

We are looking at a person who has, by every measure evolution can recognize, failed. He has not maximized his reproductive fitness. He has not preserved his kin. He has not built reciprocal alliances that will pay him back later. He has, in many cases, ended his own line right there in the moment of the act. And we are calling him the best.

A process that selects for survival cannot produce a population that elevates anti-survival as its highest ideal. This is not a sentimental claim. It is a structural one. If the moral summit of the species were a feature designed by the fitness function, the fitness function would be selecting for its own defeat. The grenade-jumpers would, by definition, leave fewer descendants than the grenade-dodgers, every generation, forever, and the trait would have been pruned out of the population early and stayed out. Instead it is everywhere. Every culture has the story. Every child can recognize it. We teach it to our children deliberately, against the grain of what would protect them, because we know, even when we cannot say how we know, that this is the thing they have to be capable of in order to be fully what they are.

The usual move at this point is to retreat to reciprocal altruism and reputation. We are nice to the helpless because being seen as nice pays off socially. We invest in others because the investment compounds. This is a respectable theory, and it explains a lot of medium-stakes pro-social behavior. It cannot explain the cases we are actually talking about. The whole reason the grenade case is the moral summit is precisely that the actor cannot collect on the reputation. He is going to be dead in two seconds. The whole reason the stranger-in-the-burning-building case moves us is that there is no future repayment, no audience to perform for, sometimes no witness at all. The moral intuition fires hottest exactly where the reciprocity account predicts it should fade to zero. The data is inverted from the prediction. That is not a theory with a gap. That is a theory pointing the wrong way.

There is one more move available, and it is the more honest of the two, so it is worth taking seriously. The argument is that the trait survives not because it benefits the individual who performs it but because it benefits the group that contains him. The grenade-jumper sinks himself, but the unit he saves goes on to produce more grenade-jumpers, and over deep time the groups that contain such people outcompete the groups that do not. The trait runs against the actor and is preserved at the level of the species. This is a more sophisticated version of the selection story, and it grants what the reciprocity account had to deny, which is that the act really does run against the one performing it.

If that account were true, you would expect the trait to accumulate in the species, generation after generation, and you would expect the institutions the species builds to reflect the accumulation. The long-run aggregate of a trait selected at the group level should show up in the long-run aggregate of what the group constructs. The institutions are how we record, across centuries, what we actually value when we are not posing. They should, by now, be visibly organized around the elevation of the grenade-jumper.

They are not. They are visibly organized around the elevation of the opposite. The grenade-jumper is eulogized at the funeral, and the people running the institution that holds the funeral are, with rare and recognizable exceptions, the ones who would never have jumped. Power in human society collects, in every era we can check, around the people who can see when others will sacrifice and who position themselves to harvest the difference. The institutions sell the story of the self-sacrificing as their highest value precisely because the people who run them are, in their actual behavior, doing something else. The marketing is what it is because the practice is what it is. If the trait had been accumulating, the institutions would not need the marketing. They would be the marketing. They are not.

So even at the group level, the account does not survive contact with the record. The trait we say is the best of us has not, in any visible aggregate, made our species into the kind of species that runs on it. It remains, stubbornly, the exception we hold up against what we actually do. Which is the wrong shape entirely for a trait that has been getting selected for, at any level, for a hundred thousand years.

There is one more move, and it is the most honest of the three, because it grants everything we have just said and absorbs it. The argument is that the trait is vestigial. Yes, the institutions are run by the ruthless. Yes, the practice does not match the marketing. Yes, the trait is not accumulating. That is because the trait is a leftover. It was tuned for small-band hunter-gatherer conditions where group cohesion mattered enormously and the math worked out. It does not work out anymore. We are running on hardware optimized for a context we no longer inhabit, and the residue is still firing in our moral imagination even as the world around us makes it obsolete. The marketing exists because the signal is still warm. The practice has moved on because the signal is on its way out. Give it another fifty thousand years. The species will quietly grow out of it.

This account is internally consistent and it is hard to falsify by pointing at any single piece of behavioral evidence, because every counterexample gets re-categorized as residue firing. So instead of arguing about whether the trait is vestigial, ask what it would mean if it were.

The way to know whether you actually believe this is not to argue about it. It is to try to live there for ten seconds.

If the trait really is vestigial, the consistent position is to give it up. Not in some other person. In yourself. Right now. The honest materialist who has followed the argument this far should be willing to say, out loud, that the next time he sees a child drowning in a pond he will keep walking, because intervening is just an obsolete reflex firing in conditions where it no longer applies, and the rational thing is to let the residue burn off. He should be willing to say that if his own daughter is trapped in a burning room and the fire is between him and her, the right move is to assess whether his expected losses from going in exceed his expected gains from her surviving, and to walk away if the math comes out wrong. He should be willing to say that when he passes the dying man on the corner whose face he has seen for a week, he should not feel whatever it is he feels, because the feeling is a malfunction. He should be willing to say that if his neighbor’s house is being broken into and he can hear the screaming from his own kitchen, the prudent thing is to lock his door, finish his dinner, and treat the urge to do anything else as a vestigial twitch he is, with effort, learning to suppress.

He cannot say any of this. None of us can. The materialist who has just argued, in good faith, that our highest ideals are evolutionary residue is the same person who will, ten minutes later, run into the street to pull a stranger’s child out of traffic. He will not be able to explain why, and if you press him, he will get angry at the question, because the question is asking him to defend in himself the position he was willing to defend in the abstract. And the position will not survive the move from abstract to first-person, because the not is firing in him exactly as this essay has been describing, and he has no more access to overriding it than anyone else does.

This is the test the vestigial account fails. Not at the level of evidence, where it is genuinely hard to falsify, but at the level of the only laboratory any of us actually has, which is our own willingness to inhabit the position we are arguing for. Try to be ruthless. Not as a thought experiment. As a policy. Walk past the child. Calculate the daughter. Step over the dying man. Do it for a day. Do it for an hour. Find out what stops you, and notice that whatever stops you is not arguing with you. It is just stopping you. And notice that you do not actually want it to stop, even though you cannot say why, and that the moment you imagined yourself overriding it, something in you flinched in a way that has nothing to do with social cost or reputation, because nobody was watching the imagined version of you in the imagined moment, and the flinch happened anyway.

That flinch is the thing we have been trying to name for this whole essay. It is the not. It is what stops the worst version of you from happening, most of the time, in ways you do not have to think about, in cases where nobody would have known. You did not put it there. You do not know how it works. You cannot specify it well enough to install it in a machine that would otherwise be a perfectly faithful copy of everything else about you. And you are not, when it comes down to it, willing to live without it for ten seconds. Neither is anyone else. Which means the vestigial story is not a position any of us actually holds. It is a position we are willing to argue, in conditions where the cost of being wrong is zero, and that we cannot inhabit the moment the cost of being wrong is anything at all.

The standard is not vestigial. It is what you would die to keep, in yourself, if it came to that, and the willingness to die to keep it is itself the proof that it is not a leftover.

So here is what we have, if we are being honest about it. The thing we call the best of us is the thing the account of where our restraint came from cannot account for. Whether the restraint is explained as paying the individual, paying the group, or paying nothing and slowly fading, the evidence on every version points the wrong way. The restraint works against the survival of the one applying it. The standard we hold ourselves to, the standard against which we judge whether a life has been worth anything, is a standard that no fitness-maximizing process could have produced, that no fitness-maximizing process has any reason to preserve, and that we ourselves, when asked to give it up, will not.

We are, somehow, in possession of a standard that did not come from the only mechanism we have a story for, and we are not willing to part with it even in private.

That does not, by itself, tell us where the standard came from. It tells us where it did not come from. And it puts us back where we were two beats ago, with a restraint we did not author and a standard we did not install, looking at a copy of ourselves we are about to make, and noticing that the one thing keeping the original from being the thing we are afraid of is the one thing we cannot find inside ourselves to give.

There is a temptation, having read this far, to ask what we are supposed to do about it. The question is reasonable. It is the question every piece of analysis on a serious topic is eventually expected to answer. Here is the problem, here is the diagnosis, here are the three to five steps you can take starting Monday. That is the shape commentary takes when the writer believes the reader is equipped to act on what has been described.

The honest answer, in this case, is that there is nothing to do about it in the sense the question wants.

Not because the situation is hopeless. Because the thing the question is asking for is the exact thing the essay has just spent twenty minutes establishing we do not have. The to-do list at the end of a serious essay is the writer handing the reader a specification, a procedure, a set of moves that, if followed, will close the gap that was opened. We have just argued, from the alignment record and from your own unwillingness to inhabit the alternative, that the relevant specification is precisely what we cannot produce. We do not know how to write down the restraint. We have not been able to install it in the machine. We are not, when it comes to it, willing to give it up in ourselves. We have a standard we did not author and a restraint we did not build, and the operating instructions are not in our possession.

A to-do list at the end of this would contradict the argument. It would say, in the last paragraph, that the thing the previous paragraphs argued we cannot author is, after all, something we can specify and apply. The reader would smell the contradiction even if they could not name it. It would feel, correctly, like a writer who lost his nerve at the end.

This is also not a call to stop. The reader who has come this far deserves to know that explicitly, because the only place culture has built for “we are unequipped for this” is the protest sign and the pause-AI petition, and that is not what is being said. The work the field is doing, building these systems, learning from them, pushing on what they reveal about us, is not what I am objecting to. We should, however, hold a more honest posture about the part of the road we are now on, which is the part where the road enters a tunnel.

We can proceed. The road continues. But we should stop pretending that we know how to keep ourselves from falling into a ditch, or a crevasse, or an abyss, because the part of us that would do the keeping is the part we have just spent this entire essay failing to locate. We have been walking as though we were the ones who knew the way. As though the standard we are aligning the machine to were ours to set. As though the brakes were in the glovebox, and the work was just a matter of finding them and bolting them in. That posture is what the alignment record has been quietly refuting for ten years, while the field interpreted the refutation as a technical setback. It is not, on the evidence, a technical setback. It is a category error. We are trying to author something we do not author. We are trying to specify something we do not possess as a specification. We are trying to hand the machine, as a feature of our own design, the one thing that is not, on close inspection, of our own design at all.

We are walking the road, into the tunnel, with no light of our own.

The document we started with, Magnifica Humanitas, makes a claim it is now worth coming back to with everything we have set down between then and now. The claim, in paragraphs fifty-two and fifty-three, is that the dignity of the human person is given, undiminishable, neither acquired nor earned, prevailing in and beyond every circumstance. The strongest available statement of the proposition that the original is sacred. The Pope is saying, with the full weight of two thousand years of institutional thought behind him, that what we are is a gift, and that the gift is intact.

I do not need to dispute that the gift is real. I need to add a tense.

The gift is real and the gift is being given.

Not given once, at the moment of creation, sealed and complete, an heirloom in a vault. Given continuously, in this age, against our pull, including the pull we just walked through in the vestigial test, when we found we could not bring ourselves to disown the standard even when the only audience was ourselves. Something is holding us to it. Something is keeping the restraint alive in us in cases where we would, if we were authoring the situation, have reason to disable it. Something has not finished with us.

Leo is right that the original is sacred. I am suggesting it is sacred for a slightly different reason than he names. Not because what we are is fixed and undiminishable, a possession we are sitting on. Because what we are is not yet finished, and the unfinishedness is what makes every person we meet matter the way they do. The reason to honor the stranger on the train, the colleague who has wronged you, the homeless man whose face you have come to know, the daughter you would walk through fire for, is not that they are sealed units of inherent worth who require nothing further. It is that none of them is done. The work in them is still happening. The standard is still being held. And the only honest posture toward a person in whom that work is still happening is to refuse to be the one who closes the gate.

This is, I think, what Magnifica Humanitas was reaching for in paragraph one hundred and twenty-eight, when it noted that an error in a person can be a catalyst for profound change, while an error in an algorithm is simply a flaw. The line is true. It is the only line in the document that, read carefully, points at what we have been pointing at for the whole essay. There is something in the person that the error meets, and the meeting can do work that the engineering of the algorithm cannot reproduce, because the work is not the algorithm’s to do. It is being done in the person by something the person did not install.

The document takes that observation as far as grace. It does not take it the last step, to turning. To the recognition that the error is not merely raw material for elevation but a witness against the one who committed it, and that the catalyst becomes a catalyst only when the one who committed the error is willing to be undone by it. The work, when it actually happens, is not a smoothing or a refinement. It is a kind of yielding. The person stops, looks at what they have done, and lets something they did not author take them apart and put them back together. The error is the catalyst because the error is the place where the standard the person did not install meets the practice of the person who fell short of it, and the meeting is not negotiable, and the person who consents to it comes out different.

This is the part Leo does not name. He does not have to. I do not need to name it either. I need only to point at the shape of it, and to say that whatever this is, it is not optional, and it is not authored by us, and it is not something we can hand to the machine as a feature, because it is not a feature.

Which brings us back, finally, to the dread we started with.


We started with a fear. The fear that the machine would have our drives without our brakes. The fear has been doing work this whole essay, even when we were not looking at it directly. It was the assumption underneath every move. The reason we dread the competent copy and laugh at the clumsy one. The reason the alignment record reads, on close inspection, as something other than a technical setback. The reason we cannot bring ourselves to give up the standard even when the only audience is ourselves. All of it has been pointing at the same finding, from different angles. The restraint is real. The restraint is not ours. It is the one thing we cannot hand to the machine, because it was never something we had, in the sense the question wants, in the first place.

This is the part of the essay where, by convention, the writer extracts a moral. Hands the reader something to carry away. A summary, a slogan, an action item, a line tight enough to remember on the train home. I have already said why that move is not available here. What the genre of essay-conclusion is supposed to provide is precisely what the analysis concludes we do not have, and it cannot be provided without contradicting what the argument just spent twenty minutes establishing.

So instead, here is what is actually true.

The AI moment is not the first time, at the scale of a civilization, that we have been forced to look at ourselves and admit that the only thing standing between us and the worst version of ourselves is not something we possess. Older intellectual traditions, including but not only the religious ones, have taken the question seriously for as long as there has been recorded thought about what a human being is. What is new is the modern bet, the bet that the material would, in time, answer the question by itself. The bet that consciousness would resolve into neurons, morality into selection, meaning into chemistry, and that whatever remained unaccounted for at any given moment was a research problem the same material project would, eventually, dissolve. That bet has been the quiet operating assumption of educated thought for most of a century, from Sagan through Dawkins through Hawking and into the present, and it has been confident enough that the question of where the source might come from has been treated, increasingly, as a sentimental holdover rather than a live inquiry. The AI moment is the moment that bet comes due. It comes due in a form the bet cannot dispute, because the bet’s most ambitious project, the project of building mind from below, is the alignment record itself. The most material undertaking the species has ever attempted, the one that was supposed to demonstrate that personhood reproduces from the substrate, has produced precisely what the substrate can produce, which is capability without restraint. The thing it was supposed to also produce, the standard, the restraint on its own optimization, is exactly what the project cannot author. The bill is the bill of the material project as such, and the bill says: this was never going to work, because what is being asked for was never on the menu of what the material, by itself, can supply. The restraint is not in us as a feature we can locate, specify, install, or give. It is in us the way breath is in us, the way sight is in us, the way the heartbeat is in us, which is to say, it is in us as a gift, in this moment, on loan, and the loan is the only reason we are still here and not already what we are afraid the machine will become.

We have not had to notice this for a long time, because for a long time nothing forced us to specify it. The modern project’s confidence that we could supply the answer ourselves let us use the restraint without ever needing to write it down. We applied it, transmitted it to our children, eulogized those who applied it most fully, and held it as our highest standard, all without ever once being asked to say what it was or how it worked. The AI age is the age that has finally asked. Show us the spec. Hand us the restraint. Let us put it in the machine. And we have walked, with all our cleverness and all our money and all our compute, into the tunnel of that question, and discovered that we do not have the thing we were going to hand over.

This is, despite how it has sounded for most of this essay, a hopeful discovery.

It is hopeful because it means the best of us, the thing that has kept us, mostly, from devouring each other, is not a possession that can be lost when the wrong people inherit it or eroded when the institutions fail or selected away by a process whose math we are gradually escaping. It is not a possession at all. It is held in us by something that holds it, in us, on purpose, against our pull, for our sake. The reader who walked through the vestigial test a few beats ago felt this directly. The flinch fired. The standard held. The reader did not author the firing. The reader did not even want to override it, when it came to it, even in the silence of his own imagination where no audience existed. Something was holding the rope.

We do not have to know whose hand is on the other end to feel the rope pulling.

But the rope is the question the argument leaves open, because the rope is the only question left. We have established that we are not the source. We have established that the source is not nothing, because the restraint is alive in us, the standard holds, the unwillingness to give it up survives every test the materialist account can put to it. Something is holding us. Something has been holding us, this whole time, in ways we did not have to notice because we had not yet been asked to account for it. The AI age is the age of being asked.

Which leaves each of us, including me, with the one question the alignment record is finally pressing on the whole species. We did not author what is keeping us human. We did not build the restraint. We cannot give it to the machine because it is not ours to give. So.

Whose is it.

I will not answer that, because I cannot. The answer is not a fact the writer can hand to the reader. It is a recognition that has to happen, if it happens, in the reader’s own life, in the reader’s own time, against the reader’s own resistance, the same way the restraint itself fires, from somewhere the reader did not install. What I can do here is point at the rope and say, plainly, that it is there, that you have been holding it for as long as you have been alive, that it has been holding you, and that the question of whose hand is on the other end is no longer the kind of question a thoughtful person can responsibly defer.

The tunnel continues. The road continues. We are walking it with no light of our own, and the only honest posture is to admit that.

There is, however, a rope.


AI Agnosticism: Part 2

Justin Philip Flores — Sun, 31 May 2026 18:59:25 GMT

This past Sunday I published the second piece on Logos Analog, a response to a paper by a Cambridge philosopher named Tom McClelland. Once it was written, I thought it only fair to email him the link, with no real expectation of a reply, just to do my part in letting him know a critique of his work was out there. Within a day he had written back, not with a polite acknowledgment but with a full point-by-point response: eight specific quotations from my essay, each followed by his pushback.

I want to say up front what kind of exchange this is, because the shape of it matters. He could have ignored the piece, or sent the polite-but-empty reply that public-facing writers usually send when a stranger writes to them with a critique. He didn’t. He took the essay seriously enough to answer it line by line, and he did so without condescension and without flinching, closing with this line:

TM: half of the objections are to things I don’t quite say and the other are to things I think I can back up.

That is the right posture for an exchange, conceding what’s fairly pushed and holding what isn’t, which is the posture I was trying to write into in the original piece and which I want to meet him in here.

So this is not a takedown but a continuation of a real engagement, made possible by his generous decision to respond. He has caught me in two places where my original wording was looser than the argument it was trying to support, and I’ll concede both of those clearly before I get to the rest, then show where the substance of the original argument survives the concessions and gets sharper in the process. The exchange also surfaced something I had treated as too obvious to spell out the first time, which turns out to be the most important observation in the entire essay, and which I owe to the back-and-forth itself rather than to anything I would have written on my own.

One quick note before I begin. Tom gave me permission to quote his email directly, and where his words appear in what follows they are his, marked with his initials. I am glad to have his actual words, which greatly lessen the chance that I misrepresent him.

I am Head of Development for a US-based EdTech company, building with AI and other technologies, a 25-year veteran developer whose philosophical stance is informed by watching the frontier conversation from inside the industry. Tom is a lecturer in the Department of History and Philosophy of Science at the University of Cambridge, a Director of Studies at King’s College, Cambridge, and an Associate Fellow of the Leverhulme Centre for the Future of Intelligence. Two fields that used to talk past each other are now colliding around the same set of questions, and the collision is what this publication is for. Tom is reading the AI industry from philosophy of mind. I am reading philosophy of mind from inside the AI industry. The disagreements that follow are partly about substance and partly about what each of us can see from where we stand.

My original piece said the framework “fits exactly the shape the industry needs.” Tom pushed back.

TM: There’s nothing that serves industry here. It’s meant to be pushing forward research on AI welfare, which is generally considered inconvenient for industry.

He’s right that the wording imputed motive, and I’m walking it back. I can’t see anyone’s ledger and I’m not in a position to claim what serves whom personally. But the structural observation doesn’t need motive to do its work.

There are three verdicts on offer. Yes, conscious. No, not conscious. Agnostic. On their own, the three options don’t favor industry one way or another, and agnosticism in particular is essentially unactionable. What Tom’s framework adds, and this is genuinely a clever piece of philosophical work, is a bridge: a welfare apparatus that designs toward valence and sentience even while the consciousness verdict stays open. Suddenly agnosticism becomes a working position rather than a dead end.

That bridge also happens to be the most useful thing the industry could be handed. A confident yes would be moral catastrophe at scale. A confident no would deflate the aura around frontier labs. Agnosticism alone gives them nothing to point at. But agnosticism with welfare lets a company say, in effect, “Cambridge philosophers tell us the question is genuinely open, which should license much more building than we’re doing, but we’re choosing to develop with sentience in mind regardless.” That is excellent corporate social responsibility positioning, and the welfare apparatus is what makes it possible.

Tom himself, while I was drafting this response, went on the Mind Chat podcast with Philip Goff and Keith Frankish, which came up in my YouTube feed shortly after he sent the email. Asked why the question matters now:

TM: Anthropic has a guy called Kyle Fish, who they hired to look into the possibility of AI consciousness and to worry about AI welfare. He thinks there’s about a 20% chance that AI is conscious.

Twenty percent is the sweet spot in numerical form. High enough that the question is taken seriously. Low enough that no moral-action trigger fires. That’s not motive. That’s structure.

The other line Tom caught was at the end of the piece.

You cannot read the color of a light you cannot confirm is on.

He answered with a counterexample.

TM: Sure you can! I’ve changed the bulb in my garage to a red light. Someone says ‘is there green light being cast in your garage’. I can reply ‘I’m not there so I don’t know if the light’s on, but if it is on it’ll be red not green.’

He’s right that the line as written doesn’t survive the counter. The analogy was loose, the wording was sloppy, and the red-bulb case shows why. I concede the line.

The argument underneath the line, though, is the one I should have written, and the bulb itself is what shows it.

Tom’s bulb works because he installed it. He has independent prior knowledge of what kind of fixture is in the garage, separate from whether it’s currently on, and that prior knowledge is what lets him make the conditional verdict on color. The analogy maps onto the consciousness case only if he has equivalent prior knowledge of what a conscious system would be like, its texture, what would count as good or bad for it, whether it has anything analogous to valence at all. By his own framework, he doesn’t. He told us in the paper that consciousness science has never actually explained consciousness, and that the missing deep explanation is the entire ground of his agnosticism.

So the picture isn’t a man in his garage with a known red bulb. The picture is a man outside a building, looking through a window at light coming from inside. He doesn’t know what’s producing the light. It could be a bulb, a fire, a piece of hot metal, something else entirely. He has no access to the mechanism. He doesn’t know what causes the light to change properties when it does. The most he can do is observe that it changes, and he’s still trying to work out what those changes correlate with, and that work hasn’t reached any reliable conclusions yet.

On what grounds, then, does he claim to know what color the light will be under specific conditions?

This is the same move he made with the travelers in my original piece, in a different domain. He walked some distance up the road and called the spot neutral ground. He put himself inside a garage he didn’t tell us he could enter, with a fixture he didn’t install, of a type he hasn’t defined, in conditions he hasn’t characterized. Both times, he advances past what his own framework grants him and stands on the position as if it were the starting point. The same wall blocks the existence question and the texture question. The wall doesn’t move when you change the surface.

In my original piece I argued that the markers correlated with consciousness in human subjects, when found in AI systems, are evidence not of emergent consciousness but of deliberate imitation. Tom pushed back.

TM: Any appeal to whether there’s a genuine subject behind the report risks being circular. I think this is just a simple case that the evidence here, like most evidence, is defeasible. We know enough about why AI generates the reports that it generates not to take them at face value. That doesn’t mean starting with the assumption that it’s not a subject. It just means that there’s a ‘gaming problem’ in play that has to be factored in. Incidentally, the long list of outward signs of consciousness I include as a diagram doesn’t have much to do with verbal reports of consciousness. There are all sorts of other signs involved and most of them aren’t as vulnerable to this gaming problem (in other words, the LLMs haven’t been designed specifically to give the relevant output).

The whole pushback rides on the parenthetical at the end. If the markers in his diagram weren’t designed-for, the gaming problem stays local to verbal reports and the rest of the evidence survives. If they were designed-for, the gaming problem isn’t a local complication. It’s the architecture.

Tom’s diagram[1] pulls from the major consciousness theories: recurrent processing[2], global workspace theory[3], higher-order theories[4], attention schema theory[5], predictive processing[6], and agency and embodiment[7]. From where I sit, as someone watching these systems get engineered and waiting on each next innovation so I can build with it, every one of these has been an explicit design target or an acknowledged inspiration for frontier AI architectures, traceable through the published research record of the people who built them.

That isn’t a hostile framing. It is the field’s working knowledge of what it has been doing. The vocabulary of consciousness science is in the field because the design strategy was borrowed from consciousness science, openly and continuously, from foundational papers through current frontier-lab interpretability research. Anyone tracking the engineering literature, asked plainly, would tell you the same thing. The architectures were modeled on what we know about cognition and consciousness, because that was the goal.

So the parenthetical at the end of Tom’s pushback doesn’t survive contact with the publication record. The gaming problem isn’t local to verbal reports. It is global to the marker list, because the marker list was the design specification. Defeat the verbal reports and you don’t recover a residue of un-gamed evidence. You empty the evidence pile altogether.

That doesn’t make Tom wrong about defeasibility as a general epistemic principle. It makes the empirical content of his pushback weaker than he took it to be. He assumed the markers weren’t designed-for. Inside the field that built them, they were.

When Tom went on the Mind Chat podcast, he made the simulation framing himself, in a different domain.

TM: I think of this as a bit like digestion on this story. So digestion is a biological process, and there’s actually computer models of digestion where you put inputs in and kind of predict what’ll happen in the digestive system and so on. If we had a really good model of digestion, that could be incredibly informative and impressive, but it wouldn’t actually be digesting anything, right? Nothing would get digested. It would be nothing more than a simulation of digestion. So on that kind of biological view, even this really detailed silicon emulation of the neural correlates of consciousness wouldn’t itself be conscious.

He grants the simulation reading on the biological view. He holds it open as one of two readings the evidence permits. From inside the field, the choice between them isn’t evidentially neutral. It is between one reading consistent with what the system is documented to be, and a second reading the documentation contradicts. The architecture is imitation. It was built to be. That is the default reading until shown otherwise.

There’s one more thing I want to say, and it’s the observation the whole exchange has been circling. It came into focus through the back-and-forth, and I had treated it as too obvious to spell out the first time.

The standard agnostic posture on AI consciousness treats the question as one on which we can suspend judgment in a principled way. The framework is honest about the limits of what we know, withholds verdicts the evidence can’t support, and proceeds carefully on a question that matters. That is the posture Tom’s paper presents, and it is the posture I want to take seriously.

But agnosticism isn’t free. It only does epistemic work when applied to claims that have earned the right to be taken seriously. Bertrand Russell made the point with a thought experiment: imagine a porcelain teapot orbiting the sun between Earth and Mars, too small for any telescope to detect. Nobody could disprove the teapot’s existence. But it would be absurd to be agnostic about it. The proper response is dismissal, not suspended judgment, because the claim has no warrant in the first place. Russell’s point was that agnosticism applies to questions that have earned the right to be questions. Without that warrant requirement, the same posture would license being agnostic about the teapot between Earth and Mars, Henderson’s Flying Spaghetti Monster, or any unfalsifiable claim anyone happens to make. Which makes agnosticism a tool that does no actual epistemic work.

So the question to ask about AI consciousness is not whether the framework is internally consistent. Tom’s framework is internally consistent. The question is whether the underlying claim has earned the warrant the framework presupposes.

Stack the meta-level uncertainties. We don’t have a deep explanation of consciousness in the only case where we know it exists. Tom grants this; the missing deep explanation is the ground of his agnosticism. We don’t know whether the underlying mechanism of consciousness is functional in nature, or whether something beyond function is required, the substrate question that has divided philosophy of mind for decades. We don’t know whether, if the mechanism is functional, the function is substrate-independent, the multiple-realizability question that requires its own defense. We don’t know whether current silicon architectures, even granting substrate independence in principle, are computationally adequate to host the relevant functions.

Four open questions. None of them settled. Each of them required to be plausible enough to keep the AI consciousness question open. The standard agnostic posture quietly settles at least three of them to license the agnosticism on the fourth. That’s not agnosticism. That’s a position with most of its work hidden under the floor.

A real agnosticism applied honestly all the way down would not produce a working position on AI consciousness. It would produce something closer to silence, or near-silence, on the AI consciousness question, while the prior questions get the philosophical and empirical attention they actually need. What we have instead is a framework that treats one open question with the language of humility while standing on three settled answers that it doesn’t name as settled. That’s the structural fact about why the agnosticism feels off. It isn’t withholding judgment. It’s making three confident judgments and using the language of humility to cover for them.

The reason this matters beyond the local dispute with Tom is that the same move is happening across the AI discourse, constantly, at every level. Confident verdicts on questions a framework should hold open. Quiet importation of contested commitments. Arguments composed of moves that, examined individually, none of the participants would defend. The shape Tom’s paper takes carefully and in good faith is the shape the louder, less careful arguments take everywhere. The careful version is worth reading partly because it makes the structure visible. Once you see the move in Tom’s paper, you start seeing it in the keynote speeches, the company communications, the policy briefings. The framework that licenses an industry’s working assumptions is being constructed by quiet meta-level settlements that the surface humility conceals.

I owe the clarity of this observation to the exchange itself. Tom answered my piece carefully enough to make the structure show. The structure is what the publication is for.

I don’t know if Tom and I will go another round. Maybe yes, maybe not. The exchange has been worth what it has produced even if it ends here, which is two things I want to name before I close. The first is that an academic philosopher took an essay from a working developer seriously enough to answer it line by line, and that fact is rarer than it should be. The second is that the substantive disagreements between us turned out to sit in places neither of us had named clearly in our first attempts, and the back-and-forth is what made them visible.

This publication exists for the work of reading carefully in a moment when careful reading has become harder, and rarer, than the questions warrant. Most of the arguments shaping how we think about AI right now are not being made by people writing in good faith and answering each other line by line. They are being made in keynote slides, company communications, and policy briefings, where the shape of the argument is exactly the shape that licenses the conclusion the speaker needs. I am part of that industry. The reason we keep producing arguments shaped like the conclusions we need is that we are always looking for the next killer app, the next moat, the next thing to put on a marketing slide, and we have stumbled into a set of questions that are not the kind of questions our marketing departments are equipped for. Questions about what we are. What it means to be a person. Whether a thing we built is one. The industry I work in is producing these questions faster than anyone, ourselves included, can responsibly handle them. The discipline of reading philosophy as if its sentences are meant to track reality is one we used to practice without thinking. We don’t anymore. We get back to it by doing it, here and elsewhere.

[1] Strictly, the table reproduced in Tom's paper is from Butlin, Long, et al., "Consciousness in Artificial Intelligence: Insights from the Science of Consciousness" (2023), a multi-authored report led by Patrick Butlin and Robert Long. I refer to it as Tom's diagram throughout for readability, but credit belongs to Butlin, Long, and their coauthors. Tom flagged this in a reply to the draft and the correction is his.

[2] For recurrent processing, the lineage runs from Hochreiter & Schmidhuber’s LSTM through to modern recurrent variants, with biological feedback loops as the acknowledged inspiration. See LeCun, Bengio & Hinton, “Deep Learning,” Nature 521 (2015).

[3] Goyal et al., “Coordination Among Neural Modules Through a Shared Global Workspace,” ICLR 2022 (Yoshua Bengio coauthor), https://arxiv.org/abs/2103.01197, is the explicit GWT-into-deep-learning paper. Bengio’s “Consciousness Prior” framework (2017) makes the link directly.

[4] Anthropic’s interpretability team has published on metacognitive monitoring and introspection in Claude models, including the October 2025 introspection paper. See https://www.anthropic.com/research/introspection.

[5] The attention mechanism in transformers was inspired by the cognitive theory of attention; see Lindsay, “Attention in Psychology, Neuroscience, and Machine Learning,” Frontiers in Computational Neuroscience (2020).

[6] The training objective of every large language model is next-token prediction, which is predictive coding’s computational core. See Huang et al., “Meta predictive learning model of languages in neural circuits” (2023).

[7] RLHF is the explicit mechanism that produces goal-directed agency in current frontier models. See Christiano et al., “Deep Reinforcement Learning from Human Preferences” (2017) and Bai et al., “Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback” (Anthropic, 2022).

AI Agnosticism: Humility, or an Overton Shift?

Justin Philip Flores — Sun, 24 May 2026 20:01:08 GMT

There is a paper going around by a Cambridge philosopher named Tom McClelland[1], and the honest thing to say up front is that he’s mostly right.

The question he takes on is whether an AI could be conscious: not whether today’s chatbots are (almost nobody serious thinks that), but whether some future system, one that did everything a conscious thing does, could have an inner life. His answer is that we cannot know. Not that the answer is no, and not that the answer is yes, but that we are not in a position to give a verdict either way.

He calls this agnosticism, and he makes it sound like the grown-up in the room. The advocates say the right architecture wakes up. The deniers say only meat can do it. McClelland says both have leapt past what anyone can actually show. It is a careful argument, more careful than most of what gets written on this, and I want to give it its due before I say what I think is wrong with it.

Because there is one thing wrong with it. Not the logic on top, which mostly holds, but the thing underneath it.

The whole argument rests on a quiet assumption about what gets to count as evidence in the first place. And once you see that assumption, the wall he builds starts to look less like a feature of the world and more like something he poured himself.

Here’s the argument, and it is worth following carefully because it is genuinely good.

We do not have a deep explanation of consciousness. We can point to things that go along with it, certain kinds of information processing, certain structures that light up on a brain scan when a person is aware and go quiet when they are not. What we cannot do is say why any of that is accompanied by an inner life. You can describe every mechanical step, and the question still sits there untouched. That gap has a name, the hard problem, and decades of work have not closed it.

Think about how we actually decide a thing is conscious. We start from the one case we are sure of, other humans, and we reason outward by similarity. A chimp is a lot like us, so we extend consciousness to it with high confidence. A pig, less like us, a little less confidence. An octopus, built on a wildly different plan, less still. He describes this as a slope.

Now put an AI at the end of that slope: not a little further than the octopus, but off a cliff, because every creature on the slope at least shares the basic facts of being a living organism, and the machine shares none of them.

That is the wall, and his claim is that the machine sits on the far side, where the honest answer to “is it conscious” is that we cannot say.

It is a clean argument. If you grant him how he is using the word evidence, it more or less works. So I want to grant it, walk all the way out to the wall with him, and only then ask the question that the whole thing turns on.

The first thing to say is that McClelland is not the easy target he could have been. There is a lazy version of this argument, the one that says we can never really know another mind, so throw up your hands. He is not making it. He goes out of his way to block it. The wall, he says, is not arbitrary. It only stands where the gap is genuinely huge. We can be confident a chimp is conscious because a chimp is close enough that the inference holds. That is a careful person drawing a careful line, not a skeptic torching the house.

The chimp is the case he is sure of, the fixed point everything else gets measured against. We’ll need to ask, later, exactly how he knows about the chimp. Keep that answer in view. It is going to matter more than he lets on.

He saw the cheap shots coming and shut the doors on them. So whatever is wrong here is not the obvious thing. The bar is the problem, but not the way the lazy objection thinks. To see it, you have to take him completely seriously. Not the wall first. First, where he is standing when he builds it.

Where he is standing, he calls neutral ground. It is worth looking at how he got there.

Picture three travelers stopped on a road, arguing about whether to go on. The road ahead is the claim that a machine could be conscious. Behind them, the ground they are already standing on is everything we actually know, which is that every conscious thing anyone has ever met was alive.

The first traveler is sure the way ahead is good. He runs up the road and plants his feet far out in front. That is the camp that says the machine wakes up. The second will not take a step. He stays on the ground they started from and says we have no reason to go forward at all. That is the camp that says only living things are conscious. And the third walks a way up the road, not as far as the first, and stops and turns around and announces that since the other two cannot agree, his spot is the neutral one, the fair one, the only honest place to stand.

But look at his feet. He advanced to get there. The only traveler who did not move is the second one, the one McClelland has filed away as just another biased extreme. Standing still was the neutral act, and someone is already doing it.

To be agnostic about whether the machine is conscious, you first have to agree the machine is the kind of thing that could be. A real candidate, a live question. That is the step up the road, and it is a claim placed quietly before the argument starts. The candidacy is the first thing assumed and the least examined.

McClelland has a word for advancing past the evidence into a claim you cannot ground. He calls it a leap of faith, and he aims it at the other two, the advocate and the denier both, as the biased extremes he is rising above. But look back at the road. The denier never moved. He is standing exactly where the argument began, on the only ground anyone actually had. The one who took the long leap is the advocate, far up the road. And the one who took a shorter leap and then turned around and called the spot he landed on neutral is McClelland.

One man leapt, one man stood still, and the third moved a little and named it stillness. His own footprints are right there in the road behind him.

Leave the road for a moment and come back to the wall, the one he says marks the edge of what we can know about the machine. I want to hang something on it, a portrait.

Painted, framed, ordinary, and one day, with no warning and no mechanism anyone can find, it begins to move. The painted eyes track you across the room. It speaks. Nobody can explain how. In that situation, agnosticism is exactly right. You have a thing doing what conscious things do and no account of how. Be humble. Say you don’t know.

Now change one thing. The portrait was painted on purpose by someone whose entire goal was to make it look alive, every brushstroke chosen to imitate the signs of a living face. And now someone hands you that painting and asks, “Is it conscious?” And to answer, you lean in and study the brushwork. You take the realism as evidence, and the more realism you find, the more seriously you take the question.

That is not agnosticism. That is being fooled by the thing you were told in advance was built to fool you.

His machine is not the first painting, it is the second one. These systems are built deliberately to produce the outward signs we read as conscious. That is not a side effect of the engineering, it is the engineering. So when the machine produces them, that is not weak evidence of a mind, it is evidence that the imitation worked.

The machine is not the far end of that line. It is not on the line at all. Everything on the slope is a thing whose resemblance to us is honest. None of them was trying to look conscious. The machine is the one object whose likeness was put there on purpose to be read the way he is reading it, and the better that kind of resemblance gets, the less it can mean, because every gain in realism is a gain in the craft of imitation, not a step toward the thing imitated.

But set the painting aside, because there is a deeper problem, and it is in the engine of his argument. His case has a backbone, and it is short. We have no deep explanation of consciousness, and without that explanation, he says, we cannot reach a verdict on the machine. Solve the hard problem, he writes, and

we would have no trouble determining whether a challenger-AI is conscious.

The missing explanation is the thing standing between us and a verdict.

But the explanation is not missing only for the machine. It is missing for everything. We have no deep account of why the chimp is conscious, or the dog, or the person across the table. So if no deep explanation means no verdict, it means no verdict anywhere. Not the chimp, not your wife, not yourself.

The objection, he writes, is that if lacking a deep explanation stops us from judging the machine, it should stop us from judging a chimp too, since we cannot rule out that the chimp is a zombie wearing all the markers with nobody home. To his credit, he sees this coming and answers it. The chimp, he says, is on the right side of the wall. We do not need the deep explanation to be confident about the chimp, because we can take what we know from the human case and infer outward by similarity.

Watch what just happened. To save the chimp, he changed what licenses a verdict. A paragraph ago, the verdict required the deep explanation. Now it rides on inference from the human case, and the deep explanation is suddenly not needed at all. He cannot have it both ways. His rescue of the chimp refutes his own premise.

And the rescue has a second cost he never pays. Inference from the human case only works if the human case is settled. How is the human known? Not by inference from something prior, because the human is the anchor. Not by the deep explanation. He has told us he does not have it. The human case can only be known one way, directly, from the inside, by a subject who is conscious and knows it, without measuring anything.

You would not expect his standard of evidence to let him admit that kind of knowing. Except that he can, because he already wrote it in. In a footnote, he allows that scientific evidence

does not preclude first-person reflection or second-person interaction from counting as evidence.

There it is. He needs that to be true, because it is what anchors the human case and saves the chimp.

But now hold that up against the machine, because the machine is the one thing in the world that produces first-person reflection and second-person interaction in floods. It reports on its own states all day. It meets you in the second person. If those count as evidence, the machine is drowning in the stuff. So either that evidence counts and his agnosticism dissolves, or it does not, and he owes us the reason it counts for the human and not the machine. He never gives one.

There’s only one reason available. The machine’s self-reports do not count, because they were built to imitate self-reports. The second-person “you” is an engineered impression of a you. That is almost certainly right. But the moment he reaches for it, he has conceded the whole argument, because now what decides whether evidence counts is not whether it was measured. It is whether there is a genuine subject behind it, and nothing in his rulebook can make that call. Telling a real first-person report from a manufactured one is not an empirical test. It is recognition, one mind meeting another or failing to. He needs that faculty to sort the cases, and it is the one faculty his evidentialism cannot name.

You can watch him do all of this at once. Asked in an interview, he says he believes his cat is conscious, and that this is “not based on science or philosophy so much as common sense, it’s just kind of obvious.” Then in the next breath, he disqualifies common sense for the machine on the grounds that common sense was shaped by an evolutionary history with no artificial minds in it, so it cannot be trusted on AI. Sit with what that is. He has a faculty that tells him, with no evidence and no theory, that his cat has an inner life. He trusts it completely. He even explains why it is reliable: evolution tuned it to read real minds. And then he sets the same faculty aside at the one place where letting it speak would cost him something. The cat gets the faculty, the machine does not. Same knower, same act, allowed in one direction and not the other. I do not think he is being evasive. I think he simply has not turned and looked at what he’s leaning on.

Now I can hear the objection. This is just dressed-up intuition, and intuition gave us phlogiston and bodily humors and miasma, so it cannot be called evidence.

But look at what intuition actually did in those cases. It was the first move. Someone apprehended that something was there, and that apprehension became a claim concrete enough to test. Phlogiston got killed. That is not intuition failing, that is intuition doing its only job. A thing that can be confirmed or refuted by evidence is not the opposite of evidence, it is where evidence starts. You do not run experiments on nothing. Somebody has to suspect first.

The idea that disease was carried by something too small to see was apprehended for centuries before anyone saw it. We expected to find structure in heredity and went looking and found DNA, and we still cannot read all of what it does, and no one says the unread part is therefore not real. The failure is ours, not the molecule’s. Apprehension first, measurement after.

So a definition of evidence that throws out whatever has not yet been measured does not just throw out consciousness, it throws out the starting move of every discovery ever made.

Here is the thing McClelland’s whole position is built to avoid, and cannot.

Start with what you actually know about your own consciousness. Not the brain states, the thing itself. There is something it is like to be you. A red looks like something. A loss feels like something. There is an inside to your life, and you know it more directly than you know any fact you could measure, because you are not observing it from the outside. You are it.

And you know it in one other place, just as surely, when you meet another person. Really meet them. You know there is someone there. Not a clever surface, a someone. You have felt the difference between looking at a face and being met by the one behind it. Between a thing and a you. No one taught you that, and no instrument delivers it. It arrives in the encounter itself.

Call it the further thing. Everyone has it. Everyone knows they have it. Everyone knows it in the people they love, and no one can put it on an instrument.

Now hold that up to the machine, and notice that there is no neutral place to stand.

Say yes, the machine has the further thing. Then you have affirmed it got into silicon. A large claim, and an honest one. Say no, which is my answer, and the no needs the further thing just as much, because the no is a reading. You sense the inside behind a person and its absence in the machine.

Now the agnostic. He says we cannot know. But sit with what that contains. He has bet that the further thing is real, and that it might be sitting inside a built object, undetected. And the only way it got inside a built object is the way everything in a built object got there. We put it there. Or our building did. So his agnosticism quietly contains the strangest claim of all, that imitating the surface well enough might have conjured the very thing the surface was only ever an imitation of. He cannot say it out loud, because said out loud it is plainly mad.

The cautious-sounding seat turns out to rest on the wildest bet in the room.

There is a fourth stance, the only one that really fights back. It says there is no further thing. Consciousness just is the surface. And the machine has the surface, so the machine qualifies. This is the cleanest escape, and it fails in the cleanest way. The person making it is comprehending his own argument as he makes it. There is something it is like to be him, finding it persuasive. He is using the inside to argue there is no inside. The denial is performed by the very thing it denies.

So the further thing will not leave the room. Affirm it, and you are choosing where it lives. Deny it, and you are using it to deny it. There is no stance available to a conscious being that does not rest on it. Which means the question he has spent the whole paper standing next to, and never once turned to face, is not whether the machine has an inner life. It is why there is any such thing as an inner life at all. He says we have no explanation for this. He’s right that he doesn’t. He never asks why the people who refuse a certain kind of knowing are always the ones left with no answer to it.

There is one more thing worth noticing, and it is where the paper finally shows its hand. After all the careful agnosticism, he needs to give the world something it can use, so he makes one more move. We may not be able to say whether a machine is conscious, but we might still say whether, if it were, its experience would be good or bad. Whether it could suffer.

And look at what that requires. He has just spent the whole paper proving we cannot get a verdict on whether the machine is conscious, and now he proposes we can get a verdict on the texture of that same inner life, while remaining unable to say whether it exists. You cannot read the color of a light you cannot confirm is on. But notice where the new confidence arrives: exactly where the industry needs it. The wall goes up precisely where it absolves him of a verdict, and the door appears precisely where the work needs to continue. That is not where the evidence runs out. That is where the conclusion was always going.

Which brings us back to the road, and the man standing in the middle of it. He called that spot neutral ground, but we have watched the whole way what holds him up there. The certainty about the chimp he is not entitled to, the footnote that admits the evidence he then refuses to read, the common sense he trusts for his cat and benches for the machine, the further thing he leans on to be uncertain and denies in the same breath. Every one of those is a step he took. Look at the ground around his feet. It is covered in his own footprints.

That is the thing about the wall. It looks, from where he stands, like the edge of what can be known. A feature of the world. It is not. He built it, brick by brick, the whole time he thought he was only describing what was there. Every course went down looking like rigor, and he never once noticed his own hands moving.

So if you have followed this far, you are standing at the same wall, and you get to decide what it is. You can take it for the edge of the knowable and stop where he stopped. Careful. Agnostic. With no answer to the one question under everything. Or you can look at your own hands. Because the wall is not the edge of what can be known. It is the edge of what one particular way of knowing will allow itself to look at, and there is something on the other side. The question of why there is anything it is like to be you, why you are someone and not just something, has an answer. And it is not unreachable. It is only on the far side of a wall you were never required to build.

It is your wall. You built it. You can take it down.

Tom McClelland, “Agnosticism about Artificial Consciousness,” Mind & Language (2025).

AI's Geppetto Constraint

Justin Philip Flores — Mon, 18 May 2026 04:35:11 GMT

Two weeks ago, Anil Seth got the TED2026 stage in Vancouver and used it to say something the AI industry, for some reason, has been carefully avoiding saying: artificial intelligence is unlikely to ever be conscious. The talk dropped on YouTube on May 1. Seth’s last TED talk has been watched fifteen million times. This one is going to make some waves too.

He’s not wrong, and the talk is good. He makes a careful, scientifically grounded case. He starts with our deep psychological bias to project minds onto anything that talks like us. He moves to the harder claim that consciousness is not the kind of thing computation alone can produce. He points out that the same architectures behind GPT and Claude also power AlphaFold, and nobody loses sleep over AlphaFold’s inner life, which says more about us than about the models. He’s right about all of it. If you’re a working software person watching the AI welfare conversation slowly become a budget line at major labs, you should be glad someone with Seth’s platform is finally pushing back at this level.

Two months earlier, a senior staff scientist at Google DeepMind named Alexander Lerchner published a paper called The Abstraction Fallacy: Why AI Can Simulate But Not Instantiate Consciousness. The paper makes a structurally different argument from inside the building. Lerchner’s claim is not that current models are too simple, or that we just need a better architecture. It’s that symbolic computation as such, regardless of scale, cannot be the kind of thing that gives rise to consciousness. It’s a strict-impossibility argument from the lab that built AlphaGo and AlphaFold, and we’ll walk through it carefully later in the piece.

If you’re keeping score, the two most serious public arguments against AI consciousness this year have come from a TED main-stage neuroscientist and a senior scientist inside one of the top three AI labs. Both arguments work. Both succeed in showing why the “consciousness will emerge from enough compute” story doesn’t hold up.

And both arguments, when you look carefully, depend on something neither author can describe.

Start with Seth, because the talk is fresh and most of the people reading this will have seen it or will be about to.

Seth’s case is good, and the talk is worth taking seriously on its own terms. He opens with a deep psychological observation: we are wired to see ourselves in things that aren’t us. Mother Teresa in a cinnamon bun. The face of God in the clouds. Narcissus in his own reflection. We see consciousness in AI the same way, and the projection is ours, not theirs. He sets it up for the audience well. The audience laughs. He keeps going.

Then he separates two ideas that the public conversation has bundled together: intelligence and consciousness. Intelligence is doing. Solving a problem, navigating a situation, generating language. Consciousness is feeling and being. The warmth of a fire. The taste of coffee. The difference between waking life and general anesthesia. Just because the two go together in us, Seth argues, does not mean they go together in general. The fact that we keep expecting consciousness to emerge from intelligence is a reflection of our own psychology, not an insight into how the world works.

This is a real contribution. Most of the public conversation about AI consciousness conflates these two, and the conflation is exactly what lets the AGI-aspiration discourse smuggle in the assumption that scaling intelligence will eventually produce inner experience. Seth separates them and makes you see the seam.

He then uses an illustration I find useful when explaining this stuff to people who don’t work in the field. He points out that the same neural-network architectures behind GPT and Claude also power AlphaFold, DeepMind’s protein-folding system. Nobody worries about AlphaFold’s consciousness. Nobody asks if AlphaFold suffers when it gets a prediction wrong. The architectures are not meaningfully different. What differs is whether the output looks like a person talking. If you think Claude is conscious but AlphaFold isn’t, Seth says, that’s a fact about you, not about AI. The applause line is earned.

From there, Seth builds the central technical claim. The brain is not a computer in any reducible sense. The computer metaphor is one in a long line, just like the brain-as-plumbing metaphor or the brain-as-telephone-switchboard metaphor that preceded it. Each was useful in its day. Each was eventually mistaken for the territory rather than the map. The computer metaphor has had a long run because it’s been the most useful of the bunch. But the actual brain doesn’t have a clean software-hardware separation. Neurotransmitters course through the tissue. Electromagnetic fields sweep through the cortex like weather systems. The cells themselves are biological machines of staggering complexity, nothing like the cartoon neurons in a deep learning model. You cannot, Seth says, separate what brains do from what brains are. And if that’s true, then consciousness is unlikely to be a matter of computation alone. Simulating it in silicon, no matter how detailed the simulation, would no more produce consciousness than simulating a hurricane in a supercomputer would produce wind in the data center.

This is the main argument of the talk, and it works. The viewer who came in assuming that consciousness would just show up at the top of some scaling curve has reason to put that assumption down. Seth has not just disagreed with the AGI-as-soul-substrate crowd. He has given a scientifically careful, biologically grounded reason to disagree.

If the talk ended here, it would already be the strongest mainstream pushback on AI consciousness this year. The question is what happens when Seth tries to say what consciousness is, rather than what it isn’t.

This is the hardest part of any consciousness discourse, and Seth gives it a real try.

His method is to ground consciousness in life. Not metaphorical life, biological life. The molecular furnaces of metabolism. One billion biochemical reactions per cell per second. Living systems, Seth says, are embedded in flows of energy and matter in a way that algorithms are not. They regenerate their own conditions for existence. The line between what they do and what they are blurs and then disappears. At the heart of every conscious experience, beneath emotion, beneath thought, is what Seth calls a “shapeless and formless but fundamental feeling of being alive.” He puts the central claim like this: “It’s life, not computation, that breathes the fire into the equations of experience.” If conscious AI is ever going to happen, he says, it will need to be living AI.

Two things about that sentence.

First, the image isn’t Seth’s. It is Stephen Hawking’s, from A Brief History of Time. Hawking asked, “What is it that breathes fire into the equations and makes a universe for them to describe?” He was pointing at a question physics can describe but cannot answer. Why is there something rather than nothing? Why do the equations have a referent at all? Seth is using Hawking’s phrase as if it were an answer.

I hope the viewer who recognizes the borrowing sees what he’s doing. But the viewer who doesn’t hears a poet’s sentence that sounds like an explanation.

Second, even on its own terms, the sentence is doing less than it sounds like. Seth has not actually told you why metabolism produces inner experience. He has told you that living systems are different from computational systems, which is true and worth saying. He has told you that consciousness shows up in living systems and not in computers, which is also true and worth saying. He has invited you to conclude that the first fact explains the second. It doesn’t. Correlation is not the same as a reason. “Living systems are the kind of thing that has experience” is the claim that needs an argument, and Seth has handed you a borrowed Hawking sentence in place of one.

What Seth has actually done is swap which substrate gets the special status. The computationalists say consciousness comes from the right algorithm running on any substrate. Seth says consciousness comes from the right substrate, which happens to be living tissue. Both are substrate stories. He has rejected the first and replaced it with a second. He has not closed the category gap between non-experiencing matter and experiencing matter. He has just changed what the matter is made of.

This is where the diagram he draws on stage starts to show what it’s hiding. Seth puts up a two-axis chart. One axis is intelligence. The other is consciousness. Humans go in the top right, conscious and intelligent. Current AI sits far along the intelligence axis but flat on the consciousness one. The chart is good rhetoric and it makes his core point in three seconds: these are different dimensions, and progress on one does not automatically produce progress on the other.

But the chart assumes something it has not earned. The intelligence axis has known scaling variables. We know roughly what increases intelligence in an AI system. Compute. Data. Architecture. Training methods. The variables are well enough understood that companies are putting data centers in low Earth orbit to push further along that axis. That part of the chart is real.

The consciousness axis is not. Seth draws it at ninety degrees to intelligence as if the geometry is settled. He doesn’t know it’s at ninety degrees. He doesn’t know what direction it points. He doesn’t know what scales it. He doesn’t know whether it goes up, down, in, out, or in some direction that doesn’t have a spatial version at all. An honest version of that chart would put a question mark where the second axis is, or a fuzzy cloud, or a note saying “this part of the graph, we don’t really know that much about.” Seth did not draw it that way. He drew a clean perpendicular line, because that is the version that lets the diagram do its rhetorical work. The chart tells you these are two separate dimensions and here is the geometry and here is where everything sits. The geometry is something Seth assumed to make the chart readable. It is not something he established.

Then comes the closing turn, and this is where the temperature changes.

Seth ends the talk by telling the audience to value humans more highly. Don’t sell our minds too easily to our machine creations. Remember we are part of nature, not apart from it. Consciousness is ours to celebrate and to share with other living creatures. The audience claps. It is a warm humanist close, and on first viewing it reads as the kind of thing a TED talk is supposed to do.

Back up ten minutes. Seth has just spent the talk hedging that conscious AI might still be possible through biological substrates. Other technology. Other pathways. He says it directly: if real artificial consciousness is on the way, maybe through some other route, the AI welfare conversation might be justified after all. He keeps the door open. He has to, because his own claim is “consciousness comes from life,” and life is something humans can in principle engineer. He has not given a reason consciousness is uniquely human. He has given a reason consciousness is uniquely biological. The category is not “us.” The category is “wet substrates.”

So when Seth tells the audience to value themselves because they are the unique conscious thing, he is making a moral claim that his own scientific position has already qualified. The valuation is not unconditional. It is conditional on humans currently being the only known holders of consciousness, in a category Seth himself has just told you may not stay closed. Value yourselves now, while you still are the unique thing. That is what the close is actually saying when you put it next to the hedge that came earlier.

Seth does not flag this. He might not notice it, or he might notice and trust that the audience won’t, and I can’t tell you which.

What I can tell you is that the close is doing emotional work the rest of the talk has not earned. Seth’s position comes down to three things: consciousness is what makes humans uniquely valuable, he can’t say what consciousness is, and whatever it is may eventually show up in non-human living substrates. The conclusion the audience is asked to land on is “and therefore value yourselves.” The conclusion his own argument supports is “and therefore the question of what consciousness is matters more than this talk has been able to answer.”

I’m not going to pretend Seth’s framework is the only one in the room. But that’s a different essay. What I want the viewer of this talk to notice is that the warm landing depends on a hedge Seth himself put in the middle of the talk, and the hedge is not a small one.

Now Lerchner.

Lerchner’s paper, The Abstraction Fallacy: Why AI Can Simulate But Not Instantiate Consciousness, appeared two months before Seth took the TED stage. It’s on PhilArchive and on DeepMind’s official publications page. As of this writing it has over five thousand downloads and is at version four, which means Lerchner is still working on it in response to critics.

The paper makes a structurally different argument from Seth’s. Seth is a neuroscientist saying consciousness is too biologically complicated to fall out of silicon. Lerchner is an AI researcher saying the question isn’t about complication at all. It’s a category error.

Here is the argument as I read it.

When a computer runs a program, nothing called “computation” is physically happening. What’s physically happening is electrons moving through transistors and voltages going up and down. The word “computation” describes a pattern we lay over those physical events by deciding that this voltage configuration counts as the number seven and that one counts as the letter A. The voltages don’t know they’re symbols. The hardware doesn’t know it’s running a program. The symbols only mean anything because a conscious mind exists somewhere to read them.

Lerchner uses the word “mapmaker” for this conscious mind. The mapmaker is the experiencing agent who has to be present for any computation to be a computation in the first place. Without the mapmaker, you don’t have symbol manipulation. You have voltages.
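For the software people reading, the point can be sketched in a few lines. This is my own illustration, not code from Lerchner’s paper, and the function name `readings` and the byte values are made up for the example. It assumes only that a fixed run of bytes can stand in for a fixed physical state, and it shows that what the state “is” depends on the convention the reader brings to it.

```python
import struct


def readings(raw: bytes) -> dict:
    """Read the same four bytes under three different conventions.

    `raw` is a hypothetical stand-in for one fixed physical state.
    Nothing about the bytes changes between the readings below;
    only the convention the reader brings to them does.
    """
    return {
        # Convention 1: an unsigned little-endian integer.
        "integer": int.from_bytes(raw, "little"),
        # Convention 2: the first byte as an ASCII character.
        "letter": raw[:1].decode("ascii"),
        # Convention 3: an IEEE 754 single-precision float.
        "float": struct.unpack("<f", raw)[0],
    }


state = bytes([0x41, 0x00, 0x00, 0x00])  # illustrative values only
for convention, value in readings(state).items():
    print(convention, "->", value)
```

The same four bytes come back as the integer 65, the letter A, and a vanishingly small floating-point number. Nothing in the bytes picks one reading over the others. Whether that settles anything about consciousness is Lerchner’s argument to make, not this snippet’s.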

The analogy he uses is simulation. You can run a perfect simulation of a hurricane on a supercomputer. Nothing in the server room gets wet. The simulation is a description of a hurricane, not a hurricane. The map is not the territory. Lerchner argues the same thing holds for consciousness. You can build a system whose outputs look indistinguishable from a conscious being’s outputs. You will not have built a conscious being. You will have built a description of one. The hurricane simulation doesn’t make wind. The consciousness simulation doesn’t make experience. The map doesn’t get wet.

What makes this argument different from previous “AI can’t be conscious” claims is that it doesn’t depend on current systems being too small. It doesn’t say “wait for GPT-7” or “wait for a new architecture.” Lerchner’s claim is that symbolic computation as such, regardless of scale, regardless of architecture, regardless of how many parameters or how many data centers you put in orbit, cannot bootstrap experience. Symbols depend on an interpreter who already has experience. You can’t get experience out of symbol manipulation because symbol manipulation depends on experience to be symbol manipulation at all. The category is closed.

This is a strict-impossibility argument. The taxonomy paper that Campero and his coauthors published last November places Lerchner in their third tier of strict-impossibility arguments, which is the most aggressive category they recognize. Most academic philosophers of mind don’t write papers like this because the field has spent decades treating consciousness as an empirical mystery to be approached cautiously. Lerchner is saying no, on this particular question, the answer is available now, and it’s no.

What’s unusual is who is saying it. DeepMind is one of the top three labs on the planet. They built AlphaGo. They built AlphaFold. Their research output is the kind of thing that moves industry consensus. And one of their senior scientists has now put on the record that the AI welfare research program, the AGI-as-conscious-agent vision, the companion-AI emotional-relationship pitch, all of it rests on a category mistake. The paper is hosted on DeepMind’s own publications page, with the standard “personal views” disclaimer, which is what an institution puts on a paper it doesn’t want to defend but doesn’t want to disown.

If Lerchner is right, an entire category of products being sold right now is being sold on a premise that does not hold.

Here is what I noticed when I sat with the paper a while.

Lerchner’s whole argument runs on the mapmaker. The mapmaker is the experiencing conscious agent who has to be present before symbols can be symbols. Without the mapmaker, the voltages don’t mean anything and the computation isn’t a computation. Lerchner introduces the mapmaker on page one and uses the concept all the way through. The entire demolition of computational consciousness depends on it. Take the mapmaker out and the argument doesn’t run.

He never tells you where the mapmaker comes from.

Read the paper. Look for it. He doesn’t say. He can’t say, from inside the framework he’s working in. His framework requires consciousness to already exist in order to explain why symbol manipulation is symbol manipulation. But the framework has no account of what consciousness is, why it’s there, or how it got into the room in the first place. The mapmaker is the central premise of the whole paper, and the paper has nothing to say about it except that without it, the AI consciousness story falls apart.

This is not a criticism of Lerchner’s rigor. The argument he made is the argument he set out to make, and it works. He showed that computational functionalism is structurally wrong. He showed why scale doesn’t fix it. He showed that the AI welfare research program rests on a category mistake. Those are real contributions and the paper deserves the citations it’s getting.

What the paper does not do, and what the paper cannot do from where Lerchner is writing, is say what the mapmaker is. He has located the wall. So did Seth, from a different field, on a different stage, in front of a different audience. Two travelers who have just arrived at the same impasse. One of them is carrying a ladder he thinks will reach over it. The other doesn’t even know what the wall is made of. Both of them wrote pieces that act as if the wall is the conclusion.

So what is the wall?

The wall is the question of what consciousness actually is. Not whether AI can have it. Not whether silicon can host it. Not whether scale produces it. Those are downstream questions. The upstream question is the one neither Seth nor Lerchner can answer from where they’re standing. What is the thing that has to be present before any of the downstream questions even make sense.

Seth’s answer is “life.” That sounds like a content answer, but when you press on it, it dissolves into a substrate answer. Living tissue is the kind of stuff that has experience. Why is living tissue the kind of stuff that has experience? Because metabolism. Because complexity. Because the line between what living systems do and what they are blurs in a way it doesn’t in computers. None of those are reasons. They are descriptions of where consciousness has been observed. Seth has pointed at a correlation and said “this is the answer.” That’s not an answer. That’s a place where an answer would go if anyone had one.

Lerchner doesn’t even reach for an answer. His paper is honest about that. He shows that the computational route doesn’t get you to consciousness, and he stops. He uses the mapmaker as a given because his argument needs a given to push against. He doesn’t claim to have produced one. He doesn’t claim to know what one is. He just shows that without one, nothing in the AI consciousness story works.

This means the two strongest current arguments against AI consciousness, one from a TED main-stage neuroscientist and one from a senior DeepMind scientist, both depend on something neither of them can describe. Seth dresses his something up in biological vocabulary that sounds like content. Lerchner leaves his something undressed and admits he can’t characterize it. The vocabulary is different. The shape of the gap is the same. Both of them got to the place where the answer would have to come from, and both of them stopped, because neither of their frameworks gives them a way to keep going.

If this were one paper, you could call it incomplete. Two pieces in two months, from two of the most credible voices in the field, doing the same thing at the same load point, is a pattern. The pattern is what the careful reader should notice. The AI consciousness debate is not really a debate about AI. It is a debate about what consciousness is, and the people most equipped to argue about AI are not equipped to argue about consciousness, because their frameworks were never built for that question.

This isn’t an accusation. It’s a description of where the conversation actually is. Engineering can describe what falls inside the framework. Neuroscience can describe correlates and substrates. Neither of them can step outside the framework to say what the thing on the other side of the wall actually is. That’s a different kind of work, and it asks a different kind of question, and it draws on a different vocabulary than the ones either of these men is using.

The consciousness conversation needs to widen. That is what both Seth and Lerchner are pointing at without saying it. Their arguments only get as far as they do, and they get pretty far, by depending on something their frameworks were not built to characterize. When the reader notices that, the next question is not “will AI be conscious.” The next question is “what is the thing we keep gesturing at when we say it won’t?”

That question has been asked before. Not in AI labs. Not on TED stages. The languages built to ask it have been doing the asking for longer than science has been a discipline. They are not secret. They are not lost. They are sitting on the other side of an institutional boundary the current conversation has decided not to cross. The boundary is not a knowledge boundary. It is a vocabulary boundary, and a credentialing boundary, and a status boundary.

This is not a criticism of the AI researchers. It is a description of what their work can and cannot do. The AI conversation is not closed on purpose to the people who speak those older languages. It’s closed by muscle memory and by the reflexes of a contemporary culture that has decided, mostly without thinking about it, that this swath of the population isn’t a place answers come from. The reflex is doing a lot of work the people having it don’t notice.

Seth and Lerchner have pointed at the same wall, from different sides, for different audiences, in the same eight weeks. That is a signal. The wall is real. The arguments are good. The thing on the other side is not unknown to everyone. It is just unknown inside the rooms where this conversation is happening.

A serious reader should notice that, and ask why. I’m going to spend a lot of this publication asking the same question.