The Symbolic Machine and the Paradox of Generating Intelligence

By Santiago Komadina Geffroy

January 8, 2026

Modern artificial intelligence (AI) arises from the intersection of computer science, mathematics, and linguistics¹. At its core operate logic and formal languages ²; however, reducing intelligence to the logical-formal has shown persistent limits³. Symbolic AI based on rules did not achieve general intelligence: manipulating symbols did not guarantee understanding them⁴. In humans, language structures and conveys thought without exhausting it ⁵; in artificial systems, in contrast, we tend to equate linguistic generation with intelligence⁶.

In 1950, Turing shifted the question “Can machines think?” to a criterion of textual behavior (the Imitation Game)⁷. More than seven decades later, generative models—from GPT to Gemini or Claude—seem to fulfill this criterion of indistinguishability in broad domains⁸. The success, however, reignites the question: is linguistic generation a form of thought or a statistical simulacrum that confuses fluency with understanding? ⁹

Here we propose a cross-reading between the Turing machine and Chomsky’s generative grammar to illuminate the promise and the limit of generative AI: generating language without necessarily understanding it¹⁰. Furthermore, the contemporary irony is that we attempt to generate intelligence without fully understanding what intelligence itself is: we aspire to algorithmically produce that which we can barely define conceptually.

1. Turing and Language as Procedure

The Turing machine (1936) formalizes computability¹³. In its standard definition, a Turing machine (TM) is a 7-tuple:

M = (Q, Σ, Γ, δ, q₀, q_accept, q_reject)

Where Q is the set of states, Σ the input alphabet, Γ the tape alphabet, δ the transition function, q₀ the initial state, and q_accept and q_reject the halting states¹⁴. With a theoretically infinite tape and a finite set of rules, Turing argues that every effective algorithm is, in principle, simulable¹⁵.
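To make the 7-tuple concrete, here is a minimal sketch of a deterministic single-tape simulator in Python. The example machine (it accepts binary strings with an even number of 1s), the state names, the blank symbol `_`, and the step budget are all assumptions introduced for illustration; they do not come from the definition above.

```python
from typing import Dict, Tuple

# Transition function δ: (state, symbol) -> (new state, symbol to write, head move)
Delta = Dict[Tuple[str, str], Tuple[str, str, int]]

BLANK = "_"  # assumed blank symbol: belongs to Γ but not to Σ


def run_tm(delta: Delta, q0: str, q_accept: str, q_reject: str,
           word: str, max_steps: int = 10_000) -> bool:
    """Simulate a deterministic single-tape Turing machine on `word`."""
    tape = dict(enumerate(word))  # sparse tape: position -> symbol
    state, head = q0, 0
    for _ in range(max_steps):
        if state == q_accept:
            return True
        if state == q_reject:
            return False
        symbol = tape.get(head, BLANK)
        # A missing transition is treated as an implicit move to q_reject.
        state, write, move = delta.get((state, symbol), (q_reject, symbol, 0))
        tape[head] = write
        head += move
    # The budget is a practical guard; a real TM has no such limit.
    raise RuntimeError("step budget exhausted: the machine may not halt")


# Hypothetical example machine with Q = {even, odd, acc, rej},
# Σ = {0, 1} and Γ = {0, 1, _}.
EVEN_ONES: Delta = {
    ("even", "0"): ("even", "0", +1),
    ("even", "1"): ("odd", "1", +1),
    ("odd", "0"): ("odd", "0", +1),
    ("odd", "1"): ("even", "1", +1),
    ("even", BLANK): ("acc", BLANK, 0),
    ("odd", BLANK): ("rej", BLANK, 0),
}


def accepts_even_ones(word: str) -> bool:
    """Run the example machine from its initial state on `word`."""
    return run_tm(EVEN_ONES, "even", "acc", "rej", word)
```

Calling `accepts_even_ones("1001")` walks the tape left to right and halts in the accepting state. The whole "program" is the finite table `EVEN_ONES`; everything else is generic machinery.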

In terms of computation theory, it is convenient to distinguish decidability from enumerability: there are languages for which a procedure exists that decides acceptance or rejection in finite time (decidable), and others for which we can only enumerate the accepted elements without a guarantee of halting on the rejected ones (recursively enumerable)¹⁶. The Church-Turing Thesis is not a formal mathematical theorem—it relates an informal concept (“effective calculation”) with a formal one (Turing machine)—but it constitutes a universally accepted fundamental principle after almost a century of accumulated evidence: any effective calculation we can clearly describe is simulable by a Turing machine¹⁷. This enables the bridge to the theory of languages and automata: each formal system can be seen as a device that generates or recognizes well-formed strings¹⁸.
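The asymmetry between deciding and merely enumerating can be sketched in a few lines. The toy language below (the perfect squares) is in fact decidable and is chosen only so that the code runs; the function names and the `budget` parameter are assumptions added for illustration.

```python
from itertools import count, islice
from typing import Callable, Iterator, Optional


def enumerate_squares() -> Iterator[int]:
    """Enumerator for a toy language: the perfect squares 0, 1, 4, 9, ..."""
    return (n * n for n in count())


def semi_decide(x: int, enumerator: Callable[[], Iterator[int]],
                budget: Optional[int] = None) -> Optional[bool]:
    """Return True as soon as `x` shows up in the enumeration.

    With budget=None the search never stops on a non-member, which is
    exactly the missing halting guarantee. With a finite budget the
    function returns None, meaning "no answer yet", never False.
    """
    stream = enumerator() if budget is None else islice(enumerator(), budget)
    for item in stream:
        if item == x:
            return True
    return None
```

For a member such as 49 the procedure halts with `True`. For a non-member such as 50 it can only give up after the budget, since a purely enumerative procedure has no point at which it is entitled to answer "no".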

The Turing test transfers computation to the stage of language: if textual interaction is indistinguishable from human interaction, the machine passes as “thinking”¹⁹. Contemporary LLMs embody this procedural vision: speaking as calculating²⁰.

Figure: Simple representation of a Turing machine with a three-symbol alphabet and its transition table.

2. Chomsky, Formal Languages, and their Equivalence with Machines

The Chomsky Hierarchy orders grammars by their generative power and correlates them with classes of automata²²:

Note on Type 1: A Linear Bounded Automaton (LBA) is essentially a Turing machine whose tape is restricted to a linear multiple of the input size²⁷. This limitation captures the family of context-sensitive languages, where derivations may depend on the environment but are still subject to a memory resource bounded by the input²⁸.
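As a rough illustration of how the hierarchy pairs grammars with machines, the sketch below records the standard textbook correspondence (stated here from general knowledge, not recovered from this post) and contrasts two recognizers. The example languages a*b* and aⁿbⁿ and all function names are assumptions chosen for illustration.

```python
import re

# Standard textbook correspondence:
# type -> (grammar class, language class, recognizing automaton)
CHOMSKY_HIERARCHY = {
    0: ("unrestricted", "recursively enumerable", "Turing machine"),
    1: ("context-sensitive", "context-sensitive", "linear bounded automaton"),
    2: ("context-free", "context-free", "pushdown automaton"),
    3: ("regular", "regular", "finite automaton"),
}


def is_a_star_b_star(w: str) -> bool:
    """Type 3: a*b* needs only finitely many states, so a regex suffices."""
    return re.fullmatch(r"a*b*", w) is not None


def is_an_bn(w: str) -> bool:
    """Type 2: a^n b^n needs unbounded counting, done here with a stack."""
    stack = []
    seen_b = False
    for ch in w:
        if ch == "a":
            if seen_b:          # an 'a' after a 'b' breaks the pattern
                return False
            stack.append(ch)
        elif ch == "b":
            seen_b = True
            if not stack:       # more b's than a's
                return False
            stack.pop()
        else:
            return False
    return not stack            # every 'a' must have been matched
```

The string `"aab"` is accepted by the regular recognizer but rejected by the stack-based one: matching counts is precisely what a finite automaton cannot do.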

In standard notation, the language recognized by an automaton M is:

L(M) = { w ∈ Σ* | M accepts w }²⁹

Generative grammar describes a device capable of producing well-formed sequences³⁰. Chomsky interprets it as a cognitive faculty³¹; Turing, as a computational characterization³².

The infinite productivity of generative grammar—the capacity to generate and understand an unlimited number of sentences from finite rules ³³—inspired the intuition that formal systems could capture the essence of language³⁴. For decades, symbolic AI attempted to explicitly encode these rules: feature grammars, syntactic parsers, semantic ontologies³⁵. The program assumed that describing linguistic competence was equivalent to implementing it³⁶. Modern AI inherits the strong intuition that mind and language are modeled as symbolic manipulation governed by rules³⁷.
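The idea of unlimited sentences from finite rules can be sketched with a toy grammar in which one rule is recursive. The vocabulary, the rule set, and the `depth` parameter below are hypothetical and serve only to show the mechanism.

```python
from typing import Dict, List

# Hypothetical toy grammar: the second VP rule reintroduces S,
# which is what makes the set of sentences unbounded.
RULES: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "NP": [["the", "linguist"]],
    "VP": [["sleeps"], ["thinks", "that", "S"]],
}


def derive(symbol: str, depth: int) -> List[str]:
    """Expand `symbol`, applying the recursive VP rule `depth` times."""
    if symbol not in RULES:              # terminal symbol: emit as-is
        return [symbol]
    if symbol == "VP" and depth > 0:
        expansion, depth = RULES["VP"][1], depth - 1
    else:
        expansion = RULES[symbol][0]
    words: List[str] = []
    for part in expansion:
        words.extend(derive(part, depth))
    return words


def sentence(depth: int) -> str:
    """Return the sentence with `depth` levels of embedding."""
    return " ".join(derive("S", depth))
```

`sentence(0)` yields "the linguist sleeps", and each increment of `depth` embeds one more clause. Three rules are enough to produce a distinct well-formed sentence for every natural number.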

However, the contemporary turn inverts the paradigm: rules are no longer programmed, they are learned³⁸. Neural models are parametric automata whose expressive complexity can reach that of Type 0 languages ³⁹, but they are induced from data instead of being specified by linguists⁴⁰. Where Chomsky postulated innate universals, LLMs discover statistical regularities in massive corpora⁴¹. Where generative grammar sought underlying competence (the speaker’s tacit knowledge), transformers capture observable performance (the regularities of effective use)⁴².

This methodological rupture has profound consequences: we no longer formalize language and then implement it ⁴³; we train systems to induce their own regularities⁴⁴. The result is an implicit grammar, distributed across millions of parameters, which can generate fluid text without possessing an explicit model of syntax or semantics⁴⁵. We have learned to generate linguistic intelligence without needing to specify it ⁴⁶; the pending question is whether that generation constitutes genuine understanding or mere sophisticated simulation⁴⁷.

The elegance of formal correspondence should not be confused with an ontology of meaning: that a class of languages is representable by an automaton does not imply that human language use is exhausted by that representation⁴⁸. Furthermore, that a neural system simulates linguistic competence does not imply that it replicates the underlying cognitive mechanisms⁴⁹.

3. Convergence and Rupture: Generating is Not Understanding

LLMs approximate probability distributions $p(w_t \mid w_{<t})$ over sequences, that is, the probability of each token given the tokens that precede it, using neural architectures (transformers) that can, in principle, model the complexity of Type 0 languages in the Chomsky hierarchy⁵⁰, albeit without implementing explicit rewrite rules⁵¹. Unlike formal grammars that operate through symbolic rewrite systems, LLMs are universal approximators that learn patterns from data: a statistical syntax trained on massive corpora⁵². They achieve fluency and local coherence without a guarantee of internal semantics⁵³. If Chomskyan grammar aimed at competence (underlying knowledge), LLMs primarily capture performance (regularities of use) and signals of the world distilled from text⁵⁴.
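In its simplest form, the conditional distribution above can be estimated by counting which word follows which, conditioning on the previous token only (a bigram model). This is a deliberately reduced sketch: a transformer conditions on a far longer context and learns its parameters by gradient descent rather than by counting. The toy corpus and function names are assumptions for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict


def train_bigram(corpus: str) -> Dict[str, Dict[str, float]]:
    """Estimate p(next word | previous word) by relative frequency."""
    tokens = corpus.split()
    counts: Dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {w: c / sum(following.values()) for w, c in following.items()}
        for prev, following in counts.items()
    }


# Hypothetical toy corpus, small enough to check the counts by hand.
CORPUS = "the glass falls the glass breaks the ball falls"
MODEL = train_bigram(CORPUS)


def most_likely_next(word: str) -> str:
    """Return the highest-probability continuation of `word`."""
    dist = MODEL[word]
    return max(dist, key=dist.get)
```

After "the", the model prefers "glass" (two of three occurrences) over "ball". It has no notion of glasses or balls; the preference is a property of the corpus, which is the sense in which the syntax is statistical.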

In the Chomskyan tradition, grammar is above all a mental faculty: a biological system with innate constraints that explains how humans acquire and manipulate syntactic structures⁵⁵. In generative AI, in contrast, “grammar” is the operational name of a probability function over strings, parameterized by millions or billions of weights that adjust the likelihood of each token in context⁵⁶. Both perspectives share a formal intuition, but differ in their ontology: the former refers to cognitive mechanisms ⁵⁷; the latter, to statistical procedures⁵⁸. The result is a statistical syntax without intentionality⁵⁹. Generation is not equal to understanding⁶⁰. Formal coherence does not necessarily imply referential grounding or a world-view in the human sense⁶¹.

Natural language is ambiguous by design⁶². Formal logic aspires to eliminate ambiguity⁶³; linguistic use thrives on it⁶⁴. How do models obtain “meaning” without total disambiguation?⁶⁵ Through relational density: meaning is approximated as a position in a space of relationships⁶⁶. And it is precisely this functional dimension that the technical marvel of computational linguistics exploits: embeddings⁶⁷. These are nothing more than vector representations of words learned by a statistical model⁶⁸. After “learning” on enormous text corpora, embedding models capture both semantic and syntactic relationships⁶⁹. Having vectors for words then allows us to perform mathematical operations on the words and their meanings⁷⁰. This is how we have managed to translate strings of symbols that make sense to us humans into a computable reality⁷¹. The “artificial” meaning comes from here⁷².

However, the evolution of embeddings illustrates the transition from static to dynamic semantics⁷³:

Word2Vec and GloVe (2013-2014): Each word receives a single vector based on co-occurrences in the corpus⁷⁴. The famous vector analogy “king – man + woman = queen” works because algebraic operations capture recurrent semantic relationships⁷⁵. However, these models do not distinguish polysemy: “bank” (financial institution) and “bank” (edge of a river) share the same vector⁷⁶.
Contextual Embeddings (ELMo, BERT, 2018): Each word receives a distinct vector according to its context⁷⁷. “Bank” in “I went to the bank to deposit” and in “I sat on the river bank” generates different representations⁷⁸. The model attends to the environment to disambiguate⁷⁹.
Generative Transformers (GPT, 2018-): Multi-head attention mechanisms allow each token to dynamically modulate its representation based on all other relevant tokens in the sequence⁸⁰, integrating long-range dependencies that imitate grammatical agreement, anaphoric references, and thematic coherence⁸¹.
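The “king – man + woman = queen” analogy mentioned above can be reproduced with hand-made vectors. The three-dimensional values below are invented for illustration; learned embeddings have hundreds of dimensions and no interpretable axes, so this sketch only shows the arithmetic, not how such vectors are obtained.

```python
import math
from typing import Dict, List

# Hypothetical 3-d vectors; axes loosely read as royalty, maleness, femaleness.
VECTORS: Dict[str, List[float]] = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a: str, b: str, c: str) -> str:
    """Return the word nearest to vec(a) - vec(b) + vec(c), inputs excluded."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = [w for w in VECTORS if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))
```

`analogy("king", "man", "woman")` lands nearest to "queen". Note that the dictionary holds exactly one vector per word, which is why a static model of this kind cannot separate the senses of a polysemous word.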

Embeddings condense semantic and syntactic regularities into high-dimensional vectors whose proximity reflects co-occurrence patterns and contextual transformations⁸²; they do not say what a thing is, but what it tends to appear with and how it changes when combined⁸³. Attention mechanisms allow each token's representation to be conditioned dynamically on other, potentially distant tokens⁸⁴, integrating long-range dependencies that imitate agreements and anaphoric relationships⁸⁵. So-called mechanistic interpretability has identified heads and circuits sensitive to grammatical and stylistic features, showing that the models capture useful statistical structure⁸⁶.
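The conditioning step performed by attention can be sketched as single-head scaled dot-product attention, in the form given by Vaswani et al. (2017). This version omits the learned projection matrices, multiple heads, and masking, and the input vectors in the usage note are made up; it is a sketch of the mechanism under those simplifications, not of any particular model.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: List[Vector], keys: List[Vector],
              values: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs: List[Vector] = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

When all keys are identical, the weights are uniform and the output is the plain average of the values. When one key matches the query much better than the others, its value dominates: for example, `attention([[10.0, 0.0]], [[10.0, 0.0], [0.0, 10.0]], [[1.0, 0.0], [0.0, 1.0]])` returns a vector very close to the first value. This is the sense in which a token's representation depends on the tokens around it, however distant.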

Classic vector analogies, however, work in frequent domains of the corpus⁸⁷ but break down under domain change or active polysemies: when switching from historical geography to contemporary geopolitics, or from everyday language to technical jargon, the semantic neighborhood varies and vectors drag spurious associations⁸⁸. The phenomenon illustrates a strength (capturing local regularities) and a limit: fragility when faced with conceptual transfers that require re-grounding meanings⁸⁹.

Furthermore, because embeddings are numerical objects, we can perform mathematical operations on “the words.”⁹⁰

Having rich representations is a necessary but not sufficient condition for situated understanding⁹¹. What emerges is powerful statistical competence ⁹², but without a direct ontological commitment to referents in the sense of autonomous perceptual experience ⁹³—although recent multimodal models (CLIP, GPT-4V, Gemini) are beginning to integrate visual and textual signals, establishing incipient forms of perceptual grounding⁹⁴.

4. Language between Structure, Symbol, and Culture

Chomsky’s achievement was formalizing language as a system of rules; its cost was abstracting the social dimension⁹⁵. Sociolinguistics, pragmatics, and usage-based linguistics remind us that language is negotiated in communities of practice⁹⁶. LLMs reincorporate culture via statistics: they absorb ideologies, biases, registers, and dialects without awareness⁹⁷. They produce a synthetic culture: a reflection of historical distributions of our voices, sometimes with amplification of asymmetries⁹⁸.

To say that language is symbolic is to affirm that its units mean by convention and relation (not by physical resemblance to their referents)⁹⁹. This arbitrariness allows the same medium (text, audio, code) to express an infinite number of contents and styles¹⁰⁰. That is why it is possible to model large families of practices with a single formalism: natural languages with their dialects, technical jargon, code, formulas, musical notation¹⁰¹. The symbolic dimension of language is what makes learning from heterogeneous corpora scalable¹⁰².

It does not follow from language being symbolic that there exists an operative Universal Grammar capable of recognizing everything by explicit rules¹⁰³. We seek the formalism of intelligence (the “artificial”), but when we formalize without residue, we lose ambiguity ¹⁰⁴, precisely the resource that living language exploits¹⁰⁵. That language is alive does not imply anarchy: it implies regulated variation¹⁰⁶. Rules (conventions) are learned and adjusted through use¹⁰⁷. LLMs learn a snapshot of that variation: a model trained up to 2023 does not know the idiolect of 2025 without updating¹⁰⁸. They are grammars frozen at their moment of training: they capture historical regularities of the corpus ¹⁰⁹, but not its subsequent transformative dynamics without retraining, fine-tuning, or updating mechanisms like retrieval-augmented generation (RAG)¹¹⁰.

Large-scale models absorb cultural diversity without postulating explicit universals¹¹¹, and thus can simulate local consistencies in relatively closed domains while exhibiting grounding failures when practices, referents, or norms change¹¹². Their notable functional efficacy does not constitute evidence of understanding¹¹³: rather, it reveals the power of correlations¹¹⁴. Similarly, the biases they reproduce are not a simple “bug” that can be corrected a posteriori¹¹⁵, but a cultural effect of statistics: the distillation of historical distributions with their asymmetries¹¹⁶. In short, they simulate consistency in closed domains, but fail in grounding when practices, referents, or norms change¹¹⁷. Their functional success does not prove understanding; their bias is not a merely technical “bug,” but a cultural effect of statistics¹¹⁸.

5. Beyond the Turing Test: Contemporary Criteria and Limits

Today we evaluate systems with complementary tests: Winograd/Winoground (reference resolution), ARC (abstraction and composition), BIG-bench, MMLU (knowledge and reasoning), and even ARC-AGI (which targets generality, the Holy Grail of current AI), among others¹¹⁹. The panorama is mixed: notable advancements in statistical competence coexist with limits in systematic generalization, semantic grounding, and situated adaptation¹²⁰. The leap from prediction to interpretation remains the bottleneck¹²¹.

In recent years, evaluations of grounding (temporal, factual, and multimodal) and protocols for using tools that connect the model with search engines, databases, or calculators have emerged¹²². These approaches improve accuracy when the task requires verifying or updating facts ¹²³, but they also transfer part of the intelligence to the orchestration: deciding when to consult, which source to use, and how to reconcile conflicting evidence¹²⁴. In parallel, work on emergent communication in reinforcement learning environments shows that agents can develop codes useful for their purposes, although they are rarely compositional and transferable to human practices without tutoring¹²⁵.

A critic might argue: “Babies also acquire language through massive exposure to linguistic input, detecting statistical patterns in the speech around them. If humans learn this way and develop understanding, why wouldn’t LLMs, which do essentially the same thing with more data, have it?” ¹²⁶

The objection is serious and deserves a careful answer. The differences are not merely quantitative (more data, more parameters), but qualitative¹²⁷:

Multimodal Grounding: Children learn language simultaneously with direct perceptual experience of the world¹²⁸. When they hear “ball,” they see, touch, and throw balls¹²⁹. The word is anchored in recurrent sensorimotor experiences¹³⁰. Purely textual LLMs lack this triangulation ¹³¹; recent multimodal models (GPT-4V, Gemini) are beginning to close this gap, but still without the richness of corporeal experience¹³².
Causal Models of the World: Humans develop intuitive physics (objects do not disappear, they fall down), intuitive psychology (agents have intentions), intuitive biology (living things grow and die)¹³³. These causal models structure linguistic understanding¹³⁴. An LLM can predict that “if you drop a glass, it breaks,” but not because it understands gravity or fragility, but because that sequence is frequent in its data¹³⁵.
Intrinsic Motivation and Situated Experience: Children learn language in contexts with real communicative purpose: asking, sharing attention, negotiating¹³⁶. Every linguistic act has consequences in the social world¹³⁷. LLMs are trained on a technical objective (predicting the next token) without agency or existential consequence of their “speech acts”¹³⁸.
Continuous Learning and Adaptation: Humans constantly update their understanding of language in interaction with the changing environment¹³⁹. A child learning “dinosaur” adjusts their concept by visiting a museum, reading books, watching movies¹⁴⁰. An LLM frozen at its training checkpoint does not evolve with the world¹⁴¹.

That humans and LLMs share statistical mechanisms in linguistic processing does not imply they share understanding¹⁴². The difference lies in the causal and perceptual grounding, motivated agency, and experiential continuity¹⁴³. LLMs are systems that have learned to imitate linguistic competence without developing the cognitive preconditions that sustain it in humans¹⁴⁴.

However, it is appropriate to recognize an epistemic limit to the previous argument¹⁴⁵. The four differences noted above (multimodal grounding, causal models, intrinsic motivation, continuous learning) describe observable external conditions, not internal states of understanding¹⁴⁶. We face here the classic problem of other minds: we do not have direct access to the subjective experience of any human or artificial system beyond our own consciousness¹⁴⁷.

When we affirm that a child “understands” the word “ball” because they have seen, touched, and thrown it, we infer that understanding from their coherent behavior in varied contexts ¹⁴⁸: they can ask for the ball, distinguish it from other objects, use it appropriately¹⁴⁹. But the phenomenological experience of the child—the “what it feels like” to understand—is inaccessible to us¹⁵⁰. The same applies to LLMs: we can document their systematic failures (fragility to domain changes, regression to frequent patterns) ¹⁵¹, but we cannot categorically rule out that some form of emergent “understanding” exists in their internal layers ¹⁵², distinct from the human one but functional in its own way¹⁵³. Furthermore, perhaps humans are also, at some fundamental level, sophisticated statistical approximators whose “understanding” is not qualitatively different ¹⁵⁴, but only quantitatively richer due to our multimodal and evolutionary grounding¹⁵⁵. This hypothesis, although speculative, cannot be empirically refuted from outside the system¹⁵⁶.

Does this uncertainty matter? Pragmatically, not for engineering purposes¹⁵⁷. What matters are the measurable consequences¹⁵⁸:

Robustness: Does the system maintain coherence under variations in context, domain, and formulation? ¹⁵⁹
Adaptability: Can it update its responses to new evidence without complete retraining? ¹⁶⁰
Grounding: Does it verify its statements against external sources or generate plausible fiction? ¹⁶¹
Transparency: Can we audit its decisions and correct systematic biases? ¹⁶²

From this functionalist perspective, the relevant question is not “does it truly understand?” but “does it behave as if it understood in a reliable, adaptable, and auditable manner?”¹⁶³ The current answer is: in closed domains, often yes; in open or changing domains, systematically no¹⁶⁴. We adopt a pragmatic agnosticism about the internal understanding of LLMs¹⁶⁵. We do not affirm that they lack all form of “understanding”¹⁶⁶, but given their observable failures, they do not meet robust functional criteria for situated understanding¹⁶⁷. The burden of proof lies in demonstrating generalizable reliability, not in postulating inaccessible internal states¹⁶⁸.

6. Conclusion: Towards a Critical Theory of Artificial Generation

Turing’s program presents language as a procedure and offers textual indistinguishability as a pragmatic criterion¹⁶⁹; Chomsky’s conceives it as a faculty with formal constraints that explain structural productivity at the cost of abstracting the social ¹⁷⁰; generative AI treats it as a statistical-cultural model capable of capturing correlations and styles without autonomous experiential grounding¹⁷¹. These three perspectives are not mutually exclusive, but neither are they reducible to one another¹⁷².

The contemporary project of generating intelligence through massive statistical learning reveals a foundational paradox: we seek to produce what we cannot define¹⁷³. What exactly is the intelligence we aspire to generate?¹⁷⁴ Capacity to solve problems, adaptability to the unforeseen, situated understanding, phenomenological consciousness?¹⁷⁵ Each proposed criterion (Turing test, academic benchmarks, performance on specific tasks) captures partial dimensions without exhausting the phenomenon¹⁷⁶. We have learned to generate intelligent behavior in delimited domains¹⁷⁷, but the question of whether that constitutes genuine intelligence or sophisticated simulation remains open¹⁷⁸ and perhaps, from a pragmatic perspective, is undecidable¹⁷⁹.

Generating text is not equivalent to predicting meaning¹⁸⁰: at most, it participates in its negotiation in a limited way¹⁸¹. The aspiration to “perfect prediction” is not only unattainable but conceptually flawed when faced with an object in constant change¹⁸². Therefore, intelligence is not synonymous with language¹⁸³: generative AI today functions as a computational interface between our intentions and the machine¹⁸⁴, translating instructions into operations through natural or formal languages¹⁸⁵. It is an automated statistical competence¹⁸⁶ that simulates coherence in finite formal spaces without its own semantic horizon¹⁸⁷ in the sense of autonomous agency¹⁸⁸.

The most promising horizon is not about “passing off” machines as human ¹⁸⁹, but about co-designing models that evolve with us ¹⁹⁰, mindful of the permanent negotiation between structure, culture, and novelty¹⁹¹. Let us measure artificial intelligence where prediction stumbles and interpretation begins ¹⁹²—in situated adaptation, grounding to the world, and responsibility for the consequences of what is said¹⁹³.

References

Avramides, A. (2001). Other Minds. Routledge.
Bender, E., & Koller, A. (2020). Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data. ACL.
Bender, E., Gebru, T., et al. (2021). On the Dangers of Stochastic Parrots. FAccT.
Bybee, J. (2010). Language, Usage and Cognition. CUP.
Chomsky, N. (1957). Syntactic Structures. Mouton.
Chomsky, N. (1965). Aspects of the Theory of Syntax. MIT Press.
Clark, A. (1997). Being There. MIT Press.
Dennett, D. (1991). Consciousness Explained. Little, Brown.
Floridi, L. (2019). The Logic of Information. OUP.
Halliday, M. A. K. (1978). Language as Social Semiotic. Edward Arnold.
Harnad, S. (1990). The Symbol Grounding Problem. Physica D, 42(1–3), 335–346.
Hopper, P. J. (1987). Emergent Grammar. BLS 13, 139–157.
Hymes, D. (1972). On Communicative Competence. In Pride & Holmes (eds.), Sociolinguistics.
Lakoff, G. (1987). Women, Fire, and Dangerous Things. U. Chicago Press.
Lazaridou, A., Peysakhovich, A., & Baroni, M. (2016). Multi-Agent Cooperation and the Emergence of (Natural) Language. ICLR.
Lenneberg, E. (1967). Biological Foundations of Language. Wiley.
Mikolov, T., et al. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv. [Word2Vec]
Mordatch, I., & Abbeel, P. (2018). Emergence of Grounded Compositional Language in Multi-Agent Populations. AAAI.
Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global Vectors for Word Representation. EMNLP.
Peters, M., et al. (2018). Deep Contextualized Word Representations. NAACL. [ELMo]
Radford, A., et al. (2018). Improving Language Understanding by Generative Pre-Training. OpenAI. [GPT]
Schick, T., Dwivedi-Yu, J., et al. (2023). Toolformer: Language Models Can Teach Themselves to Use Tools. arXiv.
Yao, S., et al. (2023). ReAct: Synergizing Reasoning and Acting in Language Models. ICLR.
Zhou, B., et al. (2021). TimeDial: Temporal Commonsense Reasoning in Dialog. EMNLP.
Marcus, G., & Davis, E. (2019). Rebooting AI. Pantheon.
Searle, J. (1980). Minds, Brains, and Programs. BBS, 3(3), 417–424.
Tomasello, M. (2003). Constructing a Language. Harvard.
Turing, A. (1950). Computing Machinery and Intelligence. Mind, 59(236), 433–460.
Vaswani, A., et al. (2017). Attention Is All You Need. NeurIPS. [Transformer architecture]

This article was written in Spanish and translated into English and Portuguese with the help of ChatGPT.
