The M.O.

Eszter Hagyatéka. A good portrait of a psychopath (con-man) and how their manipulative charm does not wear off even when their falseness and emptiness are transparent. The only thing that is not perfectly repulsive about them (for those who, unlike Eszter, are not otherwise in their thrall) is their almost touchingly naive conviction that everyone else is a psychopath too, “righteousness” being just another con. In this film, Lajos even affects to want to co-opt Eszter’s haplessly unvindictive righteousness to complement his own “insufficiently talented” M.O.

Eszter’s Lajos is unlike Mann’s Felix Krull, whose manipulative skills are grounded in a capacity for empathic mind-reading that is then used for exploitation. But there is still the same sense of an inescapable superficiality always yearning (but only, of course, superficially) for depth, while addicted only to the allures of the surface. Perhaps it’s a mistake to say that psychopaths have no feelings: They do, but they are faint and fleeting. They need to use method acting to simulate a soul — a soul that they know so well to be false, that they cannot conceive it to be otherwise in anyone else.

Richard Wagner

The Poet
is but an Oracle.

His ventriloquist Muse
channels through him,
via his Oeuvre,
not his Life
or his obloquy.

If a man chance
to be born taller than all others,
yea, let us have him
throw our Hoops.

But let us save our hoopla
for his opera omnia,
or his DNA,
not his Character.

Except he chance
to have one.

Onomastication

onomastication: having to eat your words (not to be confused with onomasturbation, which is (1) a form of “oral” sex (aka ononanism) as well as (2) a variety of logodaedaly, also known as onomancy)

On abstraction, definition, composition and symbol grounding in dictionaries

Re: Blondin Masse, A, G. Chicoisne, Y. Gargouri, S. Harnad, O. Picard, O. Marcotte (2008) How Is Meaning Grounded in Dictionary Definitions? TextGraphs-3 Workshop, 22nd International Conference on Computational Linguistics, Coling 2008, Manchester, 18-22 August, 2008

Many thanks to Peter Turney for his close and thoughtful reading of our paper on extracting the grounding kernel of a dictionary.

Peter raises 3 questions. Let me answer them in order of complexity, from the simplest to the most complex:

— PT: “(1) Is this paper accepted for Coling08?

Yes. Apparently there are different sectors of the program, and this paper was accepted for the Textgraphs workshop, listed on the workshop webpage.

— PT: “(2) How come we claim the grounding kernel (GK) words are more concrete, whereas in our example, they are more abstract?

The example was just a contrived one, designed only to illustrate the algorithm. It was not actually taken from a dictionary.

When we do the MRC correlations using the two actual dictionaries (LDOCE and CIDE), reduced to their GK by the algorithm, GK words turn out to be acquired at a younger age, more imageable, and (less consistently) more concrete and more frequent.

However, these are separate pairwise correlations. We have since extended the analysis to a third dictionary, WordNet, and found the same pairwise correlations. But when we put them together in a stepwise hierarchical multiple regression analysis, looking at the independent contribution of each factor, the biggest effect turns out to be age of acquisition (GK words being acquired earlier), and the residual correlation with concreteness then reverses polarity: concreteness is positively correlated with earlier age of acquisition across all words in the MRC database, but once the GK correlation with age is partialled out, the remaining GK words tend to be more abstract!
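How a correlation can change sign once a third variable is partialled out can be illustrated with a first-order partial correlation. The sketch below is not the stepwise hierarchical regression described above but a simpler stand-in for it, and the three correlation values are invented purely for illustration; they are not the values obtained from the MRC data.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y once the linear effect of z
    has been removed from both of them."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Hypothetical values (assumptions, NOT results from the paper):
#   x = membership in the grounding kernel (GK)
#   y = concreteness
#   z = earliness of acquisition
r_gk_concrete = 0.20      # pairwise: GK words look more concrete
r_gk_early = 0.50         # pairwise: GK words are acquired earlier
r_concrete_early = 0.60   # concrete words are acquired earlier

residual = partial_correlation(r_gk_concrete, r_gk_early, r_concrete_early)

# The pairwise correlation is positive, but the residual one is negative.
print(r_gk_concrete > 0, residual < 0)
```

With these made-up numbers, the positive pairwise correlation between GK membership and concreteness is smaller than what age of acquisition alone would lead one to expect, so what is left over after partialling out age has the opposite sign.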

This obviously needs more testing and confirmation, but if reliable, it has a plausible explanation: the GK words that are acquired earlier are more concrete, but the GK also contains a subset of abstract words, either learned later in life, or learned through early abstraction, and these early abstract words are also important for the compositional power of dictionary definitions in reaching other words through definition alone.

The next step would be to begin to look at what the GK words, concrete and abstract, actually are, and the extent to which they may tend to be unique and universal across dictionaries.

— PT: “(3) Does our analysis overlook the process of abstraction in its focus on acquiring meaning by composition (through dictionary definition)?

Quite the contrary. We stress that word meanings must be grounded in prior sensorimotor learning, which is in fact the process of (sensorimotor) abstraction!

Peter writes: “we may understand ‘yellow’ as the abstraction of all of our experiences, verbal and perceptual, with yellow things (bananas, lemons, daffodils, etc.). When we are children, we build a vocabulary of increasingly abstract words through the process of abstraction.”

But we would agree with that completely! The crucial thing to note, however, is that abstraction, at least initially, is sensorimotor, not linguistic. We learn to categorize by abstracting, through trial-and-error experience and feedback, the invariant sensorimotor features (“affordances”) of the members of a category (e.g., bananas, lemons, daffodils, and eventually also yellow), learning to distinguish the members from the nonmembers, based on what they look and feel like, and what we can and cannot do with them. Once we have acquired the category in this instrumental, sensorimotor way, because our brains have abstracted its sensorimotor invariants, then we can attach an arbitrary label to that category (“yellow”) and use it not only to refer to the category, but to define further categories compositionally (including, importantly, the definition through description of their invariants, once those have been named).

This is in agreement with Peter’s further point that “As that abstract vocabulary grows, we then have the words that we need to form compositions.”

And all of this is compatible with finding that although the GK is both acquired earlier and more concrete, overall, than the rest of our vocabulary, it also contains abstract words (possibly early abstract words, or words that are acquired later yet important for the GK).

— PT: “The process of abstraction takes us from concrete (bananas and lemons) to abstract (yellow). The process of composition takes us from abstract (yellow and fruit) to concrete (banana).

The process of abstraction certainly takes us from concrete to abstract. (That’s what “abstract” means: selecting out some invariant property shared by many variable things.)

The process of “composition” does many things; among them it can define words. But composition can also describe things (including their invariant properties); composition also generates every expression of natural language other than isolated words, as well as every expression of formal languages such as logic, mathematics and computer programming.

A dictionary defines every word, from the most concrete to the most abstract. Being a definition, it is composite. But it can describe the rule for abstracting an invariant too. An extensional definition defines something by listing all (or enough of) its instances; an intensional definition defines something by stating (abstracting) the invariant property shared by all its instances.
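The difference between the two kinds of definition can be put in programming terms: an extensional definition is like a set that enumerates its members, an intensional one like a predicate that tests for the shared invariant. The sketch below is only an analogy; the names, the wavelength feature and the rough 570 to 590 nm band for yellow are assumptions chosen for illustration.

```python
# Extensional: define "yellow thing" by listing (enough of) its instances.
YELLOW_THINGS = {"banana", "lemon", "daffodil"}

def is_yellow_extensional(name):
    return name in YELLOW_THINGS

# Intensional: define "yellow thing" by stating the invariant that all
# instances share (here, a hypothetical dominant-wavelength feature).
def is_yellow_intensional(dominant_wavelength_nm):
    return 570 <= dominant_wavelength_nm <= 590

# A new instance that was never listed is missed by the extensional
# definition but caught by the intensional one.
print(is_yellow_extensional("canary"))   # not in the list
print(is_yellow_intensional(580))        # has the invariant
```

Both definitions are compositional, in that each is built out of other named things; only the second one describes the rule for abstracting the invariant.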

— PT: “Dictionary definitions are largely based on composition; only rarely do they use abstraction.

All definitions are compositional, because they are sentences. We have not taken an inventory (though we eventually will), but I suspect there are many different kinds of definitions, some intensional, some extensional, some defining more concrete things, some defining more abstract things — but all compositional.

— PT: “If these claims are both correct, then it follows that your grounding kernel words will tend to be more abstract than your higher-level words, due to the design of your algorithm. That is, your simple example dictionary is not a rare exception.”

The example dictionary, as I said, was just arbitrarily constructed.

Your first claim, about the directionality of abstraction, is certainly correct. Your second claim that all definitions are compositional is also correct.

Whether the words out of which all other words can be defined are necessarily more abstract than the rest of the words is an empirical hypothesis. Our data do not, in fact, support the hypothesis, because, as I said, the strongest correlate of being in the grounding kernel is being acquired at an earlier age — and that in turn is correlated, in the MRC corpus, with being more concrete. It is only after we partial out the correlation of the grounding kernel with age of acquisition (along with all the covariance that shares with concreteness) that the correlation with concreteness reverses sign. We still have to do the count, but the obvious implication is that the part of the grounding kernel that is correlated with age of acquisition is more concrete, and the part that is independent of age of acquisition is more abstract.

None of this is derived from or inherent in our arbitrary, artificial example, constructed purely to illustrate the algorithm. Nor is any of it necessarily true. It remains to see what the words in the grounding kernel turn out to be, whether they are unique and universal, and which ones are more concrete and which ones are more abstract.

(Nor, by the way, was it necessarily true that the words in the grounding kernel would prove to have been acquired earlier; but if that proves reliable, then it implies that a good number of them are likely to be more concrete.)

– PT: “As I understand your reply, you are not disagreeing with my claims; instead, you are backing away from your own claim that the grounding kernel words will tend to be more concrete. But it seems to me that this is backing away from having a testable hypothesis.”

Actually, we are not backing away from anything. These results are fairly new. In the original text we reported the direct pairwise correlation between being in the grounding kernel and, respectively, age of acquisition, concreteness, imageability and frequency. All these pairwise correlations turned out to be positive. Since then we have extended the findings to WordNet (likewise all positive) and gone on to do stepwise hierarchical multiple regression analysis, which reveals that age of acquisition is the strongest correlate, and, when it is partialled out, the sign of the correlation with concreteness reverses for the residual variance.

The hypothesis was that all these correlations would be positive, but we did not anticipate that removing age of acquisition would reverse the sign of the residual correlation. That is a data-driven finding (and we think it is both interesting, and compatible with the grounding hypothesis).

– PT: “There is an intuitive appeal to the idea that grounding words are concrete words. How do you justify calling your kernel words “grounding” when they are a mix of concrete and abstract? What independent test of “groundingness” do we have, aside from the output of your algorithm?”

The criterion is and has always been: reachability of the rest of the lexicon from the grounding kernel alone. That was why we first chose to analyze the LDOCE and CIDE dictionaries: Because they each allegedly had a “control vocabulary,” out of which all the rest of the words were defined. Unfortunately, neither dictionary proved to be consistent in ensuring that all the other words were defined out of the control vocabulary (including the control vocabulary), so that is why Alexandre Blondin-Massé designed our algorithm.
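The reachability criterion can be sketched as a simple closure computation: starting from a set of words taken as already known, keep adding any word whose definition uses only known words, and see whether the whole dictionary is eventually covered. This is a minimal sketch under stated assumptions; the function name and the toy dictionary are invented for illustration and are not taken from LDOCE, CIDE or the paper.

```python
def reachable_from(known_words, definitions):
    """Return every word that can be learned from `known_words`
    through definition alone.

    `definitions` maps each word to the set of words used to define it.
    """
    known = set(known_words)
    changed = True
    while changed:
        changed = False
        for word, defining_words in definitions.items():
            # A word becomes reachable once all its defining words are known.
            if word not in known and defining_words <= known:
                known.add(word)
                changed = True
    return known

# Hypothetical toy dictionary (an assumption, not real dictionary data).
TOY_DICTIONARY = {
    "banana": {"yellow", "fruit"},
    "lemon": {"yellow", "fruit"},
    "yellow": {"color"},
    "fruit": {"thing"},
    "color": {"thing", "light"},
    "thing": {"color"},
    "light": {"thing"},
}

candidate = {"color", "thing", "light"}
covers_all = reachable_from(candidate, TOY_DICTIONARY) == set(TOY_DICTIONARY)
print(covers_all)
```

A candidate set passes the test if the closure equals the whole dictionary; a set such as {"fruit"} fails it here, because no other word can be defined from "fruit" alone.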

The definition of symbol grounding preceded these dictionary analyses, and it was not at all a certainty that the “grounding kernel” of the dictionary would turn out to be the words we learn earliest, nor that it would be more concrete or abstract than the rest of the words. That too was an empirical outcome (and much work remains to be done before we know how reliable and general it is, and what the blend of abstract and concrete turns out to be).

I would add that “abstract” is a matter of degree, and no word — not even a proper name — is “non-abstract,” just more or less abstract. In naming objects, events, actions, states and properties, we necessarily abstract from the particular instances — in time and space and properties and experience — that make (for example) all bananas “bananas” and all lemons “lemons.” The same is true of what makes all yellows “yellows,” except that (inasmuch as vocabulary is hierarchical — which it is not, entirely), “yellows” are more abstract than “bananas” (so are “fruit,” and so are “colors”).

(There are still unresolved methodological and conceptual issues about how to sort words for degree of abstractness. Like others, we rely on human judgments, but what are those judgments really based on?)

(Nor are all the (content) words of a language ranged along a strict hierarchy of abstractness. Indeed, our overall goal is to determine the actual graphic structure of dictionary definition space, whatever it turns out to be, and to see whether some of its properties are reflected also in the mental lexicon, i.e., not only our mental vocabulary, but how word meanings are represented in our brains.)

— PT: “You suggest a variety of factors, including concreteness, imageability, and age of acquisition. You are now fitting a multilinear combination of these factors to the output of your algorithm. Of course, if you have enough factors, you can usually fit a multilinear model to your data. But this fitting is not the same as making a prediction and then seeing whether an experiment confirms the prediction.

I am not at all confident that the grounding kernel, extracted by our algorithm, was bound to be positively correlated, pairwise, with age of acquisition, concreteness, imageability and frequency, but we predicted it would be. We did not predict the change in sign of the correlation in the multiple regression, but it seems an interesting, interpretable and promising result, worthy of further analysis.

— PT: “I am willing to make a testable prediction: If my claims (1) and (2) are true, then you should be able to modify your algorithm so that the kernel words are indeed more concrete. You just need to ‘turn around your operation’.

I am not quite sure what you mean by “turn around your operation,” but we would be more than happy to test your prediction, once we understand it. Currently, the “operation” is just to systematically set aside words that can be reached (via definition) from other words, iteratively narrowing the other words to the grounding kernel that can only be reached from itself. This operation moves steadily inward. I am not sure what moving steadily outward would amount to: Would it be setting aside words that cannot be reached via definition? Would that not amount to a more awkward way of generating the same partition (grounding kernel vs. rest of dictionary)?
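One reading of the inward-moving operation just described can be sketched as follows. This is a minimal sketch and not the algorithm as implemented for the paper: the function name, the toy dictionary and the stopping rule (stop when every remaining word is used in the definition of some remaining word) are assumptions made for illustration.

```python
def grounding_kernel(definitions):
    """Iteratively set aside words that no remaining word needs in its
    definition; what is left can only be reached from itself.

    `definitions` maps each word to the set of words used to define it.
    """
    remaining = set(definitions)
    while True:
        # Words that still appear in the definition of a remaining word.
        used = set()
        for word in remaining:
            used |= definitions[word] & remaining
        # Words used by nobody can be reached from the others: set them aside.
        removable = remaining - used
        if not removable:
            return remaining
        remaining -= removable

# Hypothetical toy dictionary (an assumption, not real dictionary data).
TOY_DICTIONARY = {
    "banana": {"yellow", "fruit"},
    "lemon": {"yellow", "fruit"},
    "yellow": {"color"},
    "fruit": {"thing"},
    "color": {"thing", "light"},
    "thing": {"color"},
    "light": {"thing"},
}

kernel = grounding_kernel(TOY_DICTIONARY)
print(sorted(kernel))
```

On this toy dictionary the first pass sets aside "banana" and "lemon", the second sets aside "yellow" and "fruit", and the loop then stops with "color", "light" and "thing". Because the dictionary is invented, which words survive says nothing about what real dictionaries yield.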

Please do correct me if I have misunderstood.

Supply and Demand

Not to put too shrill a point on it, but what do the following have in common?

paparazzi
pornographers
pimps
tobacco companies
drug dealers
arms dealers

Just trying to make ends meet?
Just giving people what they want?
Just doing what Darwin (or Adam Smith) dictated?

The Dark Side of Apertude

I don’t know anything about Steven Jones, but I became virtually certain that he’s a quack from just a few glances at the links. There’s a familiar profile to all this (9/11 conspiracy theory, cold fusion, Wikipedia celeb, blogger hero).

This is the kind of urban mythology of which we will alas be seeing more and more in an age where the media have enfranchised rumor and opinion on an instant, pervasive, globalized scale. No wonder everyone wants to be a celebrity and celebrities are getting voted in as elected officials instead of people who actually have qualities:

We are headed (quite naturally) for an Opinocracy in which truth has about as much weight as it has in Wikipedia policy and chat TV, and “notability” reigns supreme…

I take OA’s struggle for its small (peer-reviewed) niche in cyberspace to be a countervailing measure, if ever so small a one. (If the peer review abolitionists have their way, as they well might, even OA won’t help.)

Stillbirth

Why would anyone desire to become a posthumous cult, like CS Peirce? If one’s ideas have any value, let them be given enough credence, that vital irrigant, while one is still compos mentis, one’s cortex not yet compost! Bury them with their author, still-born, and they might as well not have been, for never having become what they might have been.

Wacky Wikipedia

A “deletion debate” recently took place on Wikipedia about whether or not to remove what looks to be a vacuous and self-promoting entry on a set of equations (apparently self-baptised by a collaborator and compatriot of the author as the “XXX Equations”).

A remarkable series of interactions! I am neither a physicist nor a mathematician, hence I am completely unqualified to make any judgment about the substantive content at issue here, so I won’t.

But I think I may be qualified (after a quarter century of umpiring Open Peer Commentary) to make a judgment about the quality of the interactions among those who appear to be adjudicating the content in this deletion debate.

One observation is inescapable: Those who say (and sound like) they understand the substantive content under debate are the ones who are for deletion, and those who say they do not understand the content are for retention.

This is quite striking. I have never before looked into a Wikipedia deletion debate, but if this trade-off is not an uncommon one, the question that naturally arises is whether the quality of Wikipedia content (such as it is) arises because of or despite factors like this.

Those adjudicators who proudly state that they “don’t know what they are talking about but…” seem to cite two things in defense of deciding on a basis other than understanding and truth:

(1) Wikipedia is not by or for qualified experts (“peers”), but by and for “ordinary people”.

(Is this true? If true, what does it mean? Do ordinary people not need content whose truth can be relied on?)

(2) The alternative to truth or understanding is “notability”.

(Presumably this means that even if something is wrong, if it gathers enough attention, it merits a Wikipedia entry.)

There does, however, seem to be some consensus against using Wikipedia for self-promotion.

Another striking feature of Wikipedia is that most contributors (whether authors or editors) seem to prefer to contribute anonymously. (I wonder why?)

In peer review (about which I know somewhat more), referees have the option to be anonymous to authors, and authors (sometimes) have the option to be anonymous to referees, but both are answerable to the editors, who know their identities, and who are themselves openly and personally answerable to the entire peer community (including their authors and referees) for their editorial judgments.

There is no such personal answerability in Wikipedia. (Is that a problem?)

And without open personal answerability, and without the need to be qualified to judge content, hence no answerability to understanding or truth, what are we to make of “notability”?

They say that the ultimate goal of commercial “branding” is to make a product’s name so notable that you are ready to pay just for the name.

I spend a lot of my time defending and promoting open access to peer-reviewed research, and one of the chief incentives I cite is that open access increases citation impact (“notability”).

But citation impact is based on refereed work citing refereed work (not self-citation, or circle citation), and refereeing is constrained by personal answerability. (Could the growing spate of email and search-term spamming be a sign that free-floating, unanswerable “notability” may not be a value but a virus?)

There’s something to be said for a “CiteRank” version of Google’s PageRank algorithm (recursively weighting citations by the citedness of the citer, rather than just relying on flat citation counts).
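The idea of weighting citations recursively by the citedness of the citer can be sketched with a PageRank-style power iteration over a small citation graph. “CiteRank” is used here only as a label for this sketch; the function names, the damping value of 0.85, the handling of self-citations and the toy graph are all assumptions made for illustration.

```python
def flat_counts(citations):
    """Flat citation counts: how many other papers cite each paper."""
    counts = {paper: 0 for paper in citations}
    for citer, cited_papers in citations.items():
        for cited in set(cited_papers):
            if cited != citer:            # ignore self-citation
                counts[cited] += 1
    return counts

def cite_rank(citations, damping=0.85, iterations=50):
    """PageRank-style score: a citation counts for more when the
    citing paper is itself highly ranked."""
    papers = sorted(citations)
    n = len(papers)
    # Outgoing citations, with self-citations dropped.
    out = {p: [c for c in citations[p] if c != p] for p in papers}
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            # A paper that cites nothing spreads its weight evenly.
            targets = out[p] or papers
            share = damping * rank[p] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank

# Hypothetical citation graph: each paper maps to the papers it cites.
CITATIONS = {
    "P1": ["H"], "P2": ["H"], "P3": ["H"],   # three papers cite H
    "H": ["Y"],                              # H, well cited, cites Y
    "Q1": ["X"], "Q2": ["X"],                # two uncited papers cite X
    "X": ["X"],                              # a self-citation, ignored
    "Y": [],
}

counts = flat_counts(CITATIONS)
ranks = cite_rank(CITATIONS)
print(counts["X"] > counts["Y"], ranks["Y"] > ranks["X"])
```

In this toy graph X has more citations than Y by flat count, but Y outranks X once citations are weighted, because Y's single citation comes from a paper that is itself well cited.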

I notice that someone “weighted” the deletion votes here by affixing the voter’s prior number of Wikipeditorial contributions; but surely we can come up with a more sophisticated algorithm than that, lest our self-generated busybody-metric becomes our self-validating ticket to “notability.”

Perhaps it’s safer to trust a mindless algorithm for measuring “notability” (suitably designed to detect and expose self-citation, circle-citation, noncumulativity, etc.) than measures of “notability” invoked by minds that have cheerfully declared themselves to be without understanding or answerability to the truth.

At least that would be my view if it were the treatment of Myocardial infarction that was at issue, rather than “XXX equations”; but then maybe “ordinary people” are not that concerned with the truth about Myocardial infarction either…