From Semantic Markers to Machine Meaning: How Linguistics Prepared the Way for AI


1. The problem first emerged in Noam Chomsky’s early work

In Syntactic Structures (1957), Chomsky argued that a grammar could generate all and only the grammatical sentences of a language. But his famous example — Colorless green ideas sleep furiously — showed that syntax alone could not explain meaning. A sentence might be grammatically sound yet semantically nonsensical.

A similar anomaly — Sincerity loves John — made the point even more sharply. Grammatically, the sentence is correct; semantically, it is absurd. Chomsky recognised that his theory was incomplete without some means of excluding such violations. He later called these constraints selectional restrictions (Aspects of the Theory of Syntax, 1965): the verb love, for example, requires an animate subject and an animate object. Since sincerity is abstract, the combination fails.


2. Katz and Fodor: The semantic component

In the early 1960s, two of Chomsky’s colleagues at MIT, Jerrold J. Katz and Jerry A. Fodor, set out to integrate semantics formally into generative grammar. Their paper “The Structure of a Semantic Theory” (1963), together with the later volume by Katz and Paul M. Postal, An Integrated Theory of Linguistic Descriptions (1964), proposed that every lexical item should carry a set of semantic features or markers — small meaning components such as [+human], [+animate], [+concrete], [+count].

Thus:

Man = [+human] [+male] [+adult]
Bachelor = [+human] [+male] [+adult] [−married]
Love (verb) requires subject [+animate] and object [+animate]

Sentences violating these feature constraints, like Sincerity loves John, could then be ruled out as semantically ill-formed even though syntactically correct.
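To make the mechanism concrete, here is a minimal sketch of such a feature check in Python. The lexicon, the feature names, and the subset test are illustrative inventions, not Katz and Fodor’s actual formalism:

    # Illustrative sketch of semantic features and selectional restrictions.
    # The lexicon and feature inventory are invented for demonstration.

    LEXICON = {
        "John":      {"human", "animate", "concrete"},
        "Mary":      {"human", "animate", "concrete"},
        "sincerity": {"abstract"},   # crucially lacks "animate"
    }

    # The verb "love" selects an animate subject and an animate object.
    RESTRICTIONS = {
        "love": {"subject": {"animate"}, "object": {"animate"}},
    }

    def well_formed(subject, verb, obj):
        """True if both arguments satisfy the verb's selectional restrictions."""
        r = RESTRICTIONS[verb]
        return r["subject"] <= LEXICON[subject] and r["object"] <= LEXICON[obj]

    print(well_formed("John", "love", "Mary"))       # True
    print(well_formed("sincerity", "love", "John"))  # False: ruled out

The check is simple set inclusion: a sentence passes only if each argument’s feature bundle contains every feature the verb demands.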

Katz and Fodor’s system treated semantics as a component of linguistic competence parallel to syntax and phonology. Their model was influential, though Chomsky himself remained cautious. He accepted that meaning must be represented in the lexicon, but maintained that syntax should remain autonomous.


3. Generative Semantics and its aftermath

By the later 1960s, linguists such as George Lakoff, James McCawley, Paul Postal, and John R. Ross pushed the idea further in what became known as Generative Semantics. They argued that deep structures were not syntactic but semantic — that syntax should be derived from underlying meanings rather than the other way round.

The dispute marked the first major internal revolt within the Chomskyan tradition. It generated vigorous debate but little practical outcome. By the early 1970s, Chomsky’s Extended Standard Theory had reasserted the independence of syntax, and Generative Semantics — for all its energy — faded from view. In retrospect, it seems much ado about nothing: a theoretical battle whose noise exceeded its lasting results.

For all its intellectual energy, the wider debate about how language is generated — and even how it might have originated — led nowhere. It produced models and counter-models, but no clearer understanding of what language actually is: a living instrument of meaning between human beings. The attempt to explain creativity through rules finally showed that language resists full explanation.


4. Montague Grammar and formal semantics

In the same period, Richard Montague (1930–1971), a logician rather than a linguist, developed what became known as Montague Grammar. His papers “English as a Formal Language” (1970) and “Universal Grammar” (1970) showed that natural language could be described using the tools of formal logic.

Montague treated meaning compositionally: the meaning of a sentence is determined by the meanings of its parts and the rules combining them. His approach laid the foundations for truth-conditional semantics, later developed by David Lewis and Barbara Partee in the 1970s.
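As a toy illustration of compositionality (far simpler than Montague’s intensional logic, with an invented two-individual model), names can denote individuals and verbs can denote functions from individuals to truth values, with function application as the combining rule:

    # Toy compositional semantics in the spirit of Montague Grammar.
    # The model (who sleeps, who loves whom) is invented for demonstration.

    john, mary = "j", "m"                 # names denote individuals
    SLEEPERS   = {"j"}                    # in this model, only John sleeps
    LOVE_PAIRS = {("j", "m")}             # and John loves Mary

    sleeps = lambda subj: subj in SLEEPERS
    loves  = lambda obj: lambda subj: (subj, obj) in LOVE_PAIRS

    # Function application is the combining rule: the meaning of the
    # sentence is computed from the meanings of its parts.
    print(sleeps(john))         # True:  "John sleeps"
    print(loves(mary)(john))    # True:  "John loves Mary"
    print(loves(john)(mary))    # False: "Mary loves John"

The meaning of each sentence is a truth value determined entirely by the denotations of its words and the order in which they combine: the core of the compositional idea.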

Compared with Katz and Fodor’s intuitive feature lists, Montague’s logic-based model was far more rigorous. But it also marked the end of the early “semantic marker” tradition. There was no agreed inventory of primitives, and the marker approach could not handle context, metaphor, or ambiguity.

All these theorists treated language as a string of symbols, analysing surface structure in search of what lay beneath it. But their models remained symbolic; they described patterns rather than explaining how such patterns arise. It took a different kind of mind — the connectionist insight of Geoffrey Hinton and his colleagues in the 1980s — to grasp that the true foundation of language lies in the mechanism of learning itself.


5. From connectionism to Large Language Models

Today’s Large Language Models (LLMs) are a direct extension of that insight. An LLM is a computer model trained on vast amounts of text to detect the statistical patterns of words and phrases. Instead of applying explicit rules, it learns to predict what word is likely to come next, adjusting its internal weights through exposure to language. In this sense, it mirrors the way humans absorb linguistic structure through experience rather than instruction.
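A caricature of the prediction step, using bigram counts instead of a neural network (the corpus here is invented, and real LLMs condition on long contexts rather than a single previous word), shows how a body of text turns directly into a predictive model:

    # Minimal caricature of next-word prediction: count which word follows
    # which in a corpus, then predict the most likely continuation.
    # The corpus is invented; real LLMs learn weights over long contexts.
    from collections import Counter, defaultdict

    corpus = ("the green apple fell . the green apple was sweet . "
              "the sky was blue .").split()

    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    def predict(word):
        """Return the word most often seen after `word` in the corpus."""
        return follows[word].most_common(1)[0][0]

    print(predict("green"))   # "apple": the statistically likeliest successor

In this toy, as in the discussion of the word "model" below, the counts extracted from the corpus simply are the model: no grammar rule is written anywhere.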

It is worth noting that the word model now means something quite different from what linguists once intended. Traditional linguistics used the term to describe a theoretical system of rules — an abstract grammar meant to explain how sentences are formed. In practice, it never did; the approach was misdirected, relying on formal descriptions that could not capture how people actually speak or learn. In AI, the model is built directly from the data. The corpus itself becomes the model once mathematical procedures are applied to it, turning patterns in real language use into probabilities. The first kind tried to explain language; the second simply learns from it.

Surface language is therefore not the product of rules but the excrescence of learning — the outward sign of a system that has learnt to predict which word is most likely in a given context. Humans seem to do this instinctively, but the process is no less intricate. Our brains are not solving equations; they are recognising patterns of sound, meaning, and context built up through experience. The mathematics lies beneath awareness, embedded in the neural connections that store and compare patterns.

We know to say sweet, sour, red, or green apple, but not purple apple, because years of exposure have tuned our expectations to what language and life make probable. In that sense, human speech and machine prediction are parallel processes: both depend on statistical learning, though one is lived and intuitive, the other computed and abstract. The difference is that we know from experience that an apple is green; the machine only knows that green often appears near apple. It is remarkable that this statistical approach to language production — once dismissed as mechanical — has nevertheless achieved the sophistication of the very text in which this reflection appears.


6. Cognitive and social turns

By the late 1970s and early 1980s, the centre of gravity had shifted again. The shift reflected the fatigue that had set in after two decades of theoretical wrangling about deep and surface structures. Linguistics began to look outward once more.

  • Cognitive linguistics (Lakoff, Langacker, Fillmore) reintroduced meaning as part of human conceptualisation, drawing on psychology and embodiment.
  • Sociolinguistics and pragmatics (Labov, Grice, Leech) explored meaning in relation to use and context rather than internal representation.

The old idea of fixed semantic markers gave way to notions of prototype and frame — more flexible ways of modelling meaning that reflected how speakers actually think and interact. Yet none of these approaches solved the central problem. Linguistics had become a profession in search of an object: an elegant form of intellectual self-maintenance. Departments multiplied, conferences flourished, but the promised unifying theory never arrived.

By the end of the century many theoretical linguistics programmes were closing or merging into larger schools of communication, psychology, or computer science — a sign that the discipline’s centre had shifted elsewhere. What had begun as the search for a universal grammar ended as a series of specialised sub-fields, each absorbed into other domains. Much of modern linguistics came to seem scholastic: a science without a subject, debating how many angels can dance on the head of a pin.

Without the connectionist breakthrough of Hinton and his colleagues, theoretical linguistics might have continued for decades refining the same symbolic models and producing an ever-growing literature of diminishing insight. Connectionism at least offered a mechanism: it showed how language could arise from learning rather than from rules. That realisation redirected the study of language toward cognition and computation, and in doing so rescued it from a cycle of self-referential debate.


7. What the notion solved — and why it failed

The introduction of semantic features and selectional restrictions did solve an immediate theoretical problem: it explained why grammatically correct sentences could still be semantically wrong. It also led linguistics to recognise the need for a lexical component — an inventory of words carrying both syntactic and semantic information.

But the approach failed to mature into a general theory because:

  • There was no agreed list of basic features or markers.
  • It could not explain idioms, metaphor, or context-dependent meaning.
  • It treated words as bundles of abstract properties detached from use.

By the end of the 1970s, semantic markers had largely disappeared from mainstream theory. Their importance today is historical: they represent the first serious attempt to make meaning formally accountable within grammar.


8. Modern parallels: How AI avoids “Sincerity loves John”

Artificial intelligence faces the same problem that troubled Chomsky in 1957: how to prevent well-formed nonsense. But it solves it by different means.

Large language models and other machine-learning systems are not built on rules or features. Instead, they learn from patterns in enormous text corpora. If sincerity almost never occurs as the subject of love, the system assigns that combination a vanishingly low probability. It therefore avoids producing it — not because of a rule, but because the pattern is statistically implausible.
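A self-contained sketch of this effect (the tiny corpus and the smoothing floor of 1e-6 are arbitrary illustrative choices) scores a sentence by multiplying bigram probabilities; unattested pairs such as ("sincerity", "loves") drag the product toward zero:

    # Score sentences by multiplying estimated bigram probabilities.
    # Word pairs never seen in the corpus get only a tiny floor value,
    # so sentences containing them score vanishingly low.
    # The corpus and the 1e-6 floor are illustrative choices.
    from collections import Counter, defaultdict

    corpus = "john loves mary . mary loves john . sincerity is a virtue .".split()

    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    def bigram_prob(prev, nxt):
        count = follows[prev][nxt]
        return count / sum(follows[prev].values()) if count else 1e-6

    def score(sentence):
        words = sentence.split()
        p = 1.0
        for prev, nxt in zip(words, words[1:]):
            p *= bigram_prob(prev, nxt)
        return p

    print(score("john loves mary"))       # 0.25: well attested
    print(score("sincerity loves john"))  # 5e-07: statistically implausible

Nothing forbids the second sentence; the combination is merely so improbable that the model will, in practice, never produce it.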

In effect, machine learning has rediscovered Chomsky’s selectional restrictions, but in a probabilistic rather than logical form. Where the early linguists sought symbolic rules, AI now infers them from data. In that sense, AI has unwittingly revived a question that linguistics once abandoned — how meaning and structure arise together.


9. Continuity and convergence

The early search for semantic markers was a genuine attempt to bridge structure and meaning — the very question that still drives linguistics and computational language research today.
Traditional linguistics, for all its humanistic limitations, asked why language means what it does.
Modern AI asks how language works in practice.
Both approaches remain incomplete on their own, but together they suggest a shared horizon: a science of language that joins precision with understanding, analysis with empathy, and mechanism with mind.


References

Chomsky, N. Syntactic Structures. The Hague: Mouton, 1957.
Chomsky, N. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press, 1965.
Katz, J. J., & Fodor, J. A. “The Structure of a Semantic Theory.” Language, 39(2), 170–210, 1963.
Katz, J. J., & Postal, P. M. An Integrated Theory of Linguistic Descriptions. Cambridge, MA: MIT Press, 1964.
Montague, R. “English as a Formal Language.” In B. Visentini et al. (eds.), Linguaggi nella Società e nella Tecnica. Milan: Edizioni di Comunità, 1970.
Montague, R. “Universal Grammar.” Theoria, 36(3), 373–398, 1970.
Lakoff, G. “Linguistic Gestalts.” Papers from the Thirteenth Regional Meeting, Chicago Linguistic Society, 1977.
Langacker, R. W. Foundations of Cognitive Grammar. Stanford: Stanford University Press, 1987.
Partee, B. H. “Montague Grammar and Transformational Grammar.” Linguistic Inquiry, 6(2), 203–300, 1975.

