Semantico-Phonetic Form: A Unitarianist Grammar

Ahmad Reza Lotfi, Ph. D.
Department of English Language
Azad University at Khorasgan,
Esfahan, IRAN.
E-mail: lotfi@www.dci.co.ir

ABSTRACT

Semantico-Phonetic Form is a unitarianist theory of language in two
different but inter-related senses: first, it assumes that the Concep-
tual-Intentional and Articulatory-Perceptual systems (responsible for 
semantic and phonetic interpretations respectively) access the data at 
one and the same level of interpretation; hence a single interface 
level--Semantico-Phonetic Form, SPF. Second, it is unitarianist in that 
(although it is still a formalist theory of language) it potentially 
permits the incorporation of both formalist and functionalist explana-
tions in its formulation of the architecture of language. 
                                             
Within the framework of Semantico-Phonetic Form, and as an alternative 
proposal to Chomsky's minimalist thesis of movement, the Pooled Features 
Hypothesis proposes that "movement" is the consequence of the way in 
which the language faculty is organised (rather than a simple "imper-
fection" of language). The computational system CHL for human language 
is considered to be economical in its selection of formal features from 
the lexicon so that if two LIs (to be introduced in the same derivation) 
happen to have some identical formal feature in common, the feature is 
selected only once but shared by the syntactic objects in the deriva-
tion. It follows that the objects in question must be as local in their 
relations as possible. The locality of relations as such, which is due 
to economy considerations, results in some kind of (bare) phrase struc-
ture with pooled features labelling the structural tree nodes that 
dominate the syntactic objects. Pooled features, in a sense, are 
structurally interpreted. Other features, i.e. those not pooled, will be 
interpreted at SPF.

KEY WORDS: bare phrase structure, economy, faculty of language, 
           feature checking, feature sharing, formal features,
           imperfections, lexicon, logical forms, minimalist syntax,
           Semantico-Phonetic Form, strength, unitarianist theory

1. Introduction

1. Semantico-Phonetic Form is a unitarianist theory of language in two
different but inter-related senses: first, it assumes that the Concep-
tual-Intentional and Articulatory-Perceptual systems (responsible for 
semantic and phonetic interpretations respectively) access the data at 
one and the same level of interpretation; hence a single interface 
level--Semantico-Phonetic Form, SPF. Second, it is unitarianist in that 
(although it is still a formalist theory of language) it potentially 
permits the incorporation of both formalist and functionalist explana-
tions in its formulation of the architecture of language. 

2. The paper is organised as follows. Section 2 reviews orthodox mini-
malist accounts of interface levels, and introduces a unitarianist 
theory of the interface between competence and performance systems.
Section 3 examines the limitations of a theory of language that con-
fines itself to economy-oriented explanations of language. Instead,
it offers a theory of language that favours economy on a par with 
distinctness. Sections 4 and 5 introduce the Pooled Features Hypo-
thesis as a unitarianist alternative to Chomsky's minimalist thesis
of movement. Section 6 concludes the paper with some final comments on 
the relationship between feature sharing and Semantico-Phonetic Form.

2. The Unitarianist Interface Hypothesis

2.1 Logical Forms: The Background

3. Orthodox Minimalists assume that the faculty of language FL con-
sists of a cognitive system (a computational system and a lexicon)
responsible for storing information, and performance systems (the
"external" systems A-P and C-I interacting with the cognitive system
at two interface levels of PF and LF respectively) responsible for
using and accessing information. In accord with the requirements of
conceptual simplicity, it is further assumed that "there is a single
computational system CHL for human language and only limited lexical
variety" (Chomsky, 1995:7). Being a theory of Universal Grammar (UG),
the minimalist programme considers a structural description (SD) to
be "the optimal realizations of the interface conditions, where 
'optimality' is determined by the economy of UG" (Chomsky, 1995:171).
Since the programme does not assume the existence of any conditions
(such as the Projection Principle) relating lexical properties and
interface levels (p. 220), one may conclude that (for Chomsky) the 
economy of UG can be best viewed as a function of such operations of 
the computational system CHL as Merge, Move, and Agree rather than 
other operations of the cognitive system, such as LI/FF selection for
the lexical array LA, or other components of the system like the 
lexicon. 

4. The introduction of Logical Form--a "minimalist" interface level 
where semantic interpretation takes place--goes back to the late 70s 
with the Revised Extended Standard Theory (REST) as the dominant version of Transformational 
Grammar. However, REST considered shallow structure--the level of syn-
tactic representation following the application of all transformational 
rules except filters and deletion--to be the input to the semantic 
rules. For May (1991), Logical Form was the representation of the form 
of the logical terms, or the expressions with invariant meanings, of a 
language. As a result, "on this view the syntax of natural language does
have a logical form, in that at LF it represents the structure required
for the application of semantic rules for logical terms" (May, 1991: 
55).

5. The operations Wh-Movement and Quantifier Raising were assumed to
derive LF from S-structure (in GB). In other words, such a designated
level of syntactic representation was motivated by some theory-internal
considerations such as the principles of the Binding Theory of the time.
"Indeed, if the Binding Theory could be shown to require the particular
articulation of structure found just at LF for its full application, 
this would constitute a sort of 'existence proof' for LF, and the 
devices employed in deriving it" (May, 1991:339). Typical empirical
support for such "invisible" LF operations came from sentences with 
quantifiers like (22) in May (1991), reproduced here as (1a), with a
structure satisfying Principle A ONLY AFTER the application of QR at
LF (1b) so that "both 'the women' and 'the men' locally c-command an
occurrence of 'each other' ":

(1)
    a. The men introduced each other to everyone that the women did.
    b. [everyone that the women [VP introduced each other to e-i]]i
       [the men introduced each other to e-i]

In other words, instead of revising the (binding) theory to make it
more compatible with such empirical data, it was decided to change
(the designated architecture of) language to fit in. 

6. Instead, they could designate an elliptical VP with a visible move-
ment of 'everyone' from the ellipsis site and the deletion of elements 
recoverable from the linguistic context without the unnecessary formal 
step of "copying" the WHOLE visible VP into the ellipsis site: [1]

(2) The men-j introduced each other-j to everyone-i that the women-k 
    did [VPellip introduce each other-k to e-i]

Now one of the *each other*s c-commands 'everyone' while the other c-commands
its trace. QR, and as a result LF, could then be dispensed with. But the
purported formal restrictions inherent in any logical system (natural 
languages included) did not allow the move: (2) was simply illogical as 
copying would blindly proceed with the whole VP 'introduced each other 
to everyone that the women did' copied into the site, naturally with
another elliptical VP created in the embedded clause, and this would
proceed ad infinitum.[2]

7. The GB solution--LF as the exclusive input to semantic rules, which
has been carried over to the minimalist era as the C-I interface 
level--suffers from both conceptual and empirical shortcomings.[3]
Firstly, one can still keep logical operations operating in natural
languages without assuming language to be nothing but a perfect logical
system. To be more specific, it is not conceptually necessary for a
phonetically realized VP to be fully copied into an elliptical one.
A partial copying will do as long as the semantic system (whatever it
is) can assign a plausible interpretation to the structure. Interest-
ingly, even May's formulation of LF cannot fully avoid this logical
trap: even in (1b), the cyclicity shows up as e-i is co-indexed with
the phrase [everyone that the women [VP introduced each other to e-i]]i
in which it occurs. May's formulation, however, is at a disadvantage
as it is conceptually more complicated than (2). Secondly, LF as 
specified solves one empirical problem for the Binding Theory but
creates another. If QR saves binding principles for sentences like
(1a), it actually refutes them for some other sentences that are
ungrammatical in terms of binding prior to the movement of the 
quantifier to the left-most position of the sentence at LF but fulfill
the binding requirements after QR:

(3)
      * a. She-i met everyone that Mary-i knew.
        b. [everyone that Mary [VP knew e-i]]i [she met e-i]

8. In (3b), 'Mary' is outside the c-command domain of 'she'. Then 'Mary'
can antecede 'she' with no violation of binding requirements. The pre-
diction proves to be empirically false. Based on similar cases, Chomsky 
(1981) concluded that Principle C was satisfied at S-structure, a solution
that is not available in MP, where S-structure has been dispensed with.[4]

9. LF made it possible to afford such syntactic operations as Procras-
tinate, Attract, and covert raising in order for UG principles to 
remain maximally generalizable, and to downgrade cross-linguistic 
variation.  Perhaps there is nothing wrong a priori with such a 
theoretical device. However, and despite the purported theoretical 
elegance due to the introduction of logical forms, it is a pity that 
the generativist has not been interested in the question of whether it 
is possible in the real world for LF (so vastly divorced from the 
phonetic reality of sentences, i.e. PF) to be the interface level at 
which *real speakers* semantically interpret whatever they hear (or 
whatever they don't). The issue is presumably dismissed as a matter of 
performance, one that the generativist has traditionally found an irre-
levant question to ask. More on this below.

2.2 Logical Form as the C-I Interface Level

10. Chomsky (1995) takes a particular language L to be a procedure of 
constructing pairs (pi, lambda) out of LIs selected from the lexicon 
into a lexical array/numeration to be introduced into the derivation by 
the computational system. The operation Spell-Out strips away pi 
elements from the structure sigma to be mapped to pi by the phonological 
component of the computational system and leaves the residue sigma-L for
its covert component to map to lambda so that they are interpreted at
the A-P and C-I interfaces respectively as "instructions" to the rele-
vant performance systems. If they consist entirely of interpretable 
objects, i.e. those that are legible for the external systems, the 
derivation D converges as it satisfies the condition of Full Interpreta-
tion. "A derivation converges at one of the interface levels if it 
yields a representation satisfying FI at this level, and converges if it 
converges at both interface levels, PF and LF; otherwise, it crashes"
(1995:219-220). In case more than one convergent derivation is 
possible, the most economical one blocks all others. Uriagereka (1998) 
diagrams the procedure as follows:

                                 Lexicon
                                    |
                                    V
                                    
                                    A
                                    
                                    |
                                    V
                              Merge and Move
                                    |
                                    V
                                Spell-Out
           PF component           /   \              LF component
                                /       \
                             PF         LF
                           level       level
                           /                \
                         /                    \
                        V                      V
                       A/P                    I/C
                    component               component

                                    
Figure 1. Uriagereka's formulation of the MP model (from Uriagereka, 
          1998:536)
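
The convergence and blocking conditions described in paragraph 10 can be pictured procedurally. What follows is only an illustrative sketch and not part of Chomsky's or Uriagereka's formulation: the names Derivation, converges and admissible, the two boolean flags, and the integer cost standing in for economy are all assumptions introduced here for exposition.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Derivation:
    """A toy derivation; every field is a hypothetical stand-in."""
    name: str
    pf_legible: bool  # does pi satisfy Full Interpretation at PF?
    lf_legible: bool  # does lambda satisfy Full Interpretation at LF?
    cost: int         # assumed numeric proxy for economy (lower is better)


def converges(d: Derivation) -> bool:
    # A derivation converges only if it converges at both interface levels.
    return d.pf_legible and d.lf_legible


def admissible(derivations: List[Derivation]) -> List[Derivation]:
    # Keep the convergent derivations, then let the most economical
    # one(s) block all the others.
    convergent = [d for d in derivations if converges(d)]
    if not convergent:
        return []
    best = min(d.cost for d in convergent)
    return [d for d in convergent if d.cost == best]


D = [
    Derivation("d1", True, True, cost=1),
    Derivation("d2", True, True, cost=2),   # convergent but less economical
    Derivation("d3", True, False, cost=0),  # converges at PF, crashes at LF
]
print([d.name for d in admissible(D)])
```

Note that the sketch decides convergence by inspecting both flags at once, which is exactly the kind of global view of PF and LF that the split-interface model does not obviously provide.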

11. It is not clear, however, whether Chomsky meant the model to be one
for the speaker, for the listener, or both. Chomsky talks about "instruc-
tions" sent to performance systems. This means the procedure is what 
the speaker goes through in order to articulate a sequence of sounds.[5] 
But the model cannot represent what the speaker does because this means 
the speaker first selects LIs, has the structure derived with all its 
complexity via the application of Select, Merge, Agree, Move, etc, 
strips sound from meaning (Spell-Out), and then and only then 
"understands" what she said means. On the other hand, only a miracle 
can make the model work for the listener.  The only pieces of informa-
tion she receives are those legible to her A-P system. What happened to 
sigma-L? How can she access the information that does not travel through 
the air? Is she supposed to reconstruct the derivation at her own LF and 
in reference to the PF information she receives?[6] How likely is it to 
take place given the mismatches between LF and PF? If such mismatches 
are trivial enough to make her reconstruction of LF possible, why should 
one hypothesise the existence of Logical Form in the first place?[7] If 
the computational system can distinguish pi elements from lambda ones 
(Spell-Out), why isn't it possible for the performance systems themselves 
to do so while accessing a single interface (instead of two) 
representing both types of information? Incidentally, Chomsky distinguishes 
formal features from semantic ones (1995:230) although both 
are presumably interpreted at LF. If the performance systems responsible 
for semantic interpretation can access and interpret features as 
different as formal and semantic features, why is it necessary to strip 
away phonological features after all? Couldn't the performance systems 
access and interpret all these features--phonological, semantic, and 
formal--at a single interface level? Such a level is not only concep-
tually possible but also more desirable in terms of economy considera-
tions. Chomsky would remind us that this is an empirical question: given 
any empirical proposal, we can always ask why things should work that 
way rather than some other way. Thus why do mammals have two eyes in-
stead of three (an eye in the back of the head would be very useful for 
escaping predators)? But what is the empirical support (apart from the 
dubious "existence proof" for LF reviewed earlier in the paper) for the 
split? On the other hand, there is some empirical evidence to suggest 
that such a split cannot be substantiated (see 2.3 below for details).

12. As mentioned earlier in paragraph 9, such questions are usually
dismissed as aspects of performance, which is basically irrelevant to
what the generative grammarian does. For Chomsky, "[l]inguistic theory
is concerned with an ideal speaker-listener in a completely homogeneous
speech community [...]" (1965:3). This formulation of grammar (I think 
still correctly) excludes "such grammatically irrelevant conditions as
memory limitations, distractions, shifts of attention and interest, and
errors [...] in applying [...] knowledge of the language in actual
performance" (1965:3). He further asserts that "[a] generative grammar is
not a model for a speaker or a hearer" but one that "attempts to
characterize in the most neutral possible terms the knowledge of the language 
that provides the basis for actual use of language by a speaker-hearer"
(1965:9). Thus the differences between the speaker and the hearer 
(those between production and understanding of language respectively) 
are dismissed as matters of performance. 

13. In practice, however, this neutrality of terms has proved to be 
unattainable. Chomsky takes "L to be a generative procedure that 
constructs pairs (pi and lambda) that are interpreted at the
articulatory-perceptual (A-P) and conceptual-intentional (C-I) interfaces, 
respectively, as 'instructions' to the performance systems" (1995:219). 
Moreover, he thinks of the computational system "as mapping some arrays 
A of lexical choices to the pair (pi, lambda)" (p.225). Apparently, 
lexical selection (among many other operations of the system such as 
Merge, Move, Delete, Form Chain, etc) is what only the ideal *speaker* 
rather than the listener can afford. On the other hand, interpreting the 
linguistic expressions of L (already computed by "the ideal speaker") at 
LF is what only a listener is liable to indulge in. Chomsky's account of 
competence is then concerned with "the ideal speaker-listener" in the 
sense of SOMETIMES being a model of the ideal speaker, and SOME OTHER 
TIMES that of the ideal listener. It can hardly be a model of the 
abstract idealisation of linguistic knowledge divorced from all aspects 
of language production and perception; a "speaker- and hearer-neutral" 
description of what they both have in common as the basis for their 
actual use of language.

14. Chomsky shows his concern for performance systems when he writes, 
"[f]or each language L (a state of FL), the expressions generated by L 
must be 'legible' to systems that access these objects at the interface 
between FL and external systems--external to FL, internal to the person. 
[...] SMT (the strongest minimalist thesis) or a weaker version, becomes 
an empirical thesis insofar as we are able *to determine interface 
conditions* (emphasis mine) and to clarify notions of 'good design'. 
[...] While SMT cannot be seriously entertained, there is by now reason 
to believe that in nontrivial respects some such thesis holds [...]" 
(Chomsky, 1999:1). To me, Chomsky's "Aspects" formulation of the
scope of research for the generative grammarian can no longer hold for
minimalist research if one is not to contradict oneself in terms of the
research questions one sets out to address (see Chomsky's Introduction to
MP, 1995 for these questions). Performance considerations as "external
constraints" on the functioning of the language faculty now need to play
some role not only in psycholinguistics but also in theoretical linguistics. 

15. Chomsky (1995) states that "[t]he language L [...] generates three 
relevant sets of computations: the set D of derivations, a subset Dc of 
convergent derivations of D, and a subset Da of admissible derivations 
of D. FI (Full Interpretation) determines Dc, and the economy conditions 
select Da" (p. 220). Given the assumption that the convergence of a 
derivation is conditional upon its interpretability at both interface 
levels, he hypothesises that "there are no PF-LF interactions relevant 
to convergence--which is not to deny, of course, that a full theory of 
performance involves operations that apply to the (pi, lambda) pair"
(p. 220). Now some tough empirical questions for the minimalist to 
address:[8]

(1) Suppose the derivation D converges at PF but crashes at LF. This
means D is expected to crash in the final run. Now how does PF "understand"
that D has crashed at LF and is therefore NOT to be articulated phonetically?
How do PF and LF communicate? Are sensori-motor instructions
sent to PF temporarily stored somewhere (where?) so that the case of D
is decided on at LF, and then PF is informed (how?) to proceed with its
articulation of D?

(2) Also suppose that two rival derivations have converged but only one
of them, say Da, passes the test of optimality. For example, (4a) below
is more economical than (4b) in terms of the DISTANCE/STEPS needed for
the Wh-word to move from its canonical position to [Spec C].

(4)
    a. Whom did you persuade t to meet whom?
    b. Whom did you persuade whom to meet t?

Da (4a) must be blocking the less economical but still convergent
derivation (4b). How is it signalled to the other interface level to
phonetically articulate this single admissible derivation and not the 
other? How long should PF wait before deciding to articulate a pi (it 
is too risky to articulate pi even if D has converged at LF as it may
simply prove to be less economical than another)? Can one take care of
such a mapping between PF and LF without violating the independence
assumption of interface levels? Is it the computational system that 
monitors PF and LF in this respect? Or perhaps all these questions
are to be dismissed as the concern of "a full theory of performance"
rather than that of minimalist syntax as a theory of competence?

16. One way out of the dilemma is to stipulate that the convergence/
crash of each derivation is decided on in advance as LIs are mapped 
onto the lexical array LA or introduced into the derivation. This 
stipulation does not solve the problem but only displaces it. Moreover, 
this means that (a) ill-formed derivations do not crash; they are 
always cancelled, and (b) we need another interface level--LA,
d-structure, or whatever you wish to call it--at which the crash/convergence 
issue is taken care of prior to any interfacing with performance systems.
Whatever the case, Chomsky's formulation of such basic tenets of 
the theory begs the empirical questions outlined above.


2.3 Semantico-Phonetic Form


17. In the absence of empirical support for Chomskyan split-interface 
claims, and in agreement with Liberman's (1993) requirement "that in all 
communication the processes of production and perception must somehow be
linked; their representation must, at some point, be the same" (Place,
2000: par. 40), an attempt is made here to propose a more conservative
and conceptually simpler ALTERNATIVE HYPOTHESIS--THE UNITARIANIST 
INTERFACE HYPOTHESIS--according to which at one and the same interface 
level, say the semantico-phonetic form, the derivation D containing 
bundles of diverse information types--phonological, formal, and semantic 
features--is accessible to both C-I and A-P performance systems. Compa-
tible features (phonological features for the A-P system, and formal-
semantic features for the other) are processed by each system, which 
ignores incompatible features, leaving them to the other system to 
interpret.[9] The derivation crashes if it still contains uninterpreted 
features when the processing is over. Otherwise, it converges. Although 
the truth of this hypothesis is not obvious either, I will try to show 
that it is still essentially possible to explain language data in its 
light without the extra step of stipulating LF covert operations, and 
(for the computational system) to generate optimal convergent deriva-
tions at no extra cost. I will take this to be the empirical support for 
the unitarianist claims made here. More on this below.
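
The procedure assumed by the Unitarianist Interface Hypothesis in paragraph 17 can be sketched as follows. This is a minimal illustration under assumptions made only for exposition, not a formal definition: the representation of a feature as a (kind, value) pair, the compatibility tables AP_KINDS and CI_KINDS, the "unchecked" kind standing for anything neither system can read, and the function names are all hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Feature = Tuple[str, str]  # (kind, value), e.g. ("formal", "Q")

# Assumed compatibility tables: which feature kinds each system can read.
AP_KINDS = {"phonological"}
CI_KINDS = {"formal", "semantic"}


def interpret(features: Iterable[Feature], kinds: Set[str]) -> Set[Feature]:
    # A system processes its compatible features and ignores the rest,
    # leaving them to the other system.
    return {f for f in features if f[0] in kinds}


def converges_at_spf(features: List[Feature]) -> bool:
    # Both systems access one and the same representation; the derivation
    # crashes if any feature is still uninterpreted when processing is over.
    read = interpret(features, AP_KINDS) | interpret(features, CI_KINDS)
    return set(features) <= read


ok = [("phonological", "/k/"), ("formal", "Q"), ("semantic", "animate")]
bad = ok + [("unchecked", "Case")]  # a feature no system can read
print(converges_at_spf(ok), converges_at_spf(bad))
```

Unlike a split-interface model, nothing in this sketch has to strip one class of features away before the other is read; each system simply filters the shared representation.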

18. As was pointed out earlier, LF-PF mismatches inherent in
generativist works of the past two decades make it less probable for a
more communication-oriented model (than the MP) to entertain the 
possibility of a level of representation solely interfacing the C-I
performance system. It is further claimed here that sound-meaning and 
sound-syntax correspondences also suggest that C-I and A-P access the 
relevant pieces of information at the same interface level. 

19. Metrical Phonology research into the organization of prosodic
structures known as metrical trees (with phonological constituency assumed
to be binary branching, the two sisters of a branch being [S(trong)
W(eak)] or [W(eak) S(trong)]) affords a formal representation of
strength relationships at a sentential level that is comparable only
with X-bar representations in generative syntax:

            
            ------------------------------------------------- U
                                  /\
                                /    \
                              /        \
                            /            \
           --------------- W ------------- S ---------------- phi, I
                          /\             /   \
                        /   \          /       \
           -----------W ---- S-------- W ------  S ----------- F
                     /\     /\        / \      /  \
                   /   \  /   \      /   \    /    \
          --------S----W-S----W------S----W---S-----W--------- sigma
                  |    | |    |      |    |   |     |
                  ma  ny lin guists  go   to Ess    ex

Figure 2. A metrical tree (from Durand, 1990:225)


Such striking similarities between prosodic structures (shaped by
phonological features, which are the input to A-P) and grammatical
ones (shaped by formal features of L not interpretable at PF under 
minimalist assumptions) need to be explained in terms of Chomsky's
split-interface claims outlined earlier. Furthermore, the direct
associations between such prosodic features and semantico-pragmatic
considerations like pragmatic emphasis, information structures, and
illocutionary force suggest that direct, i.e. non-symbolic, sound-meaning
correspondences are real. Also, data from Romance and Germanic languages 
(Zubizarreta, 1998) suggest that phrasal prominence (nuclear stress) 
reflects syntactic ordering with two major varieties: the asymmetric 
c-command ordering and the ordering between a head and its complement. 
These properties of language suggest that formal and/or semantic 
features and some prosodic features are mysteriously pied-piped together 
so that the selection of one requires that of another. The pied-piping 
of a formal feature and the relevant prosodic feature, say [Q] and the 
phonological feature of rising intonation, cannot take place in the 
lexicon itself: a Q particle, e.g. 'aya' in Persian, has no high tone of 
its own when it appears in isolation. Then the prosodic feature must
be added later to the whole sentence for some unknown reason via some 
unknown operation. The other alternative explanation is the possibility 
of having one and the same feature interpreted differently by both A-P 
and C-I performance systems, e.g. the formal feature [Q] interpreted by 
A-P and C-I systems as "rising intonation" and "asking a question" 
respectively, an explanation which is possible only under the Unitarian-
ist Interface Hypothesis.
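
The second alternative, one and the same feature read differently by the two systems, can be illustrated with a toy lookup. The sketch is hypothetical throughout: the tables AP_READINGS and CI_READINGS and the function read_feature are invented for exposition, and the only feature listed is the [Q] example discussed in paragraph 19.

```python
from typing import Dict, Optional

# Hypothetical readings of a single formal feature by each system.
AP_READINGS: Dict[str, str] = {"Q": "rising intonation"}
CI_READINGS: Dict[str, str] = {"Q": "asking a question"}


def read_feature(feature: str) -> Dict[str, Optional[str]]:
    # Both systems look at the same feature at the same interface level;
    # a system with no reading for the feature returns None.
    return {
        "A-P": AP_READINGS.get(feature),
        "C-I": CI_READINGS.get(feature),
    }


print(read_feature("Q"))
```

On this picture no separate operation is needed to attach the prosodic property to the sentence, since both readings derive from the single feature [Q].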

20. Research in motor theories of language [10] suggests that 
gesticulation--the (involuntary) body movements accompanying speech--
highly correlates with articulatory properties of language (prosodic 
features included), which in turn match syntactic, semantic, 
and pragmatic ones. Informally speaking, language spreads across all
body organs like a wave: it is one's existence in its totality (and 
not just one's speech organ) that "speaks" a language. In other 
words, even features intended to be interpreted by the performance 
system S may happen to be "read off" by the other performance systems S' 
and S". This supports a single interface between the competence system 
and the performance systems. Moreover, it implies that the original 
unitarianist hypothesis formulated earlier in the paper needs to be 
modified in this respect. A-P and C-I performance systems do not simply 
ignore ALL the "incompatible" features present at SPF. Syntactico-semantic
information (formal and semantic features of LIs) may happen to 
be also interpreted by A-P as those prosodic properties of speech (like 
phrasal stress patterns, intonation, etc.) that are NOT inherently 
available in the lexical entry for LI. Gesticulation, on the other hand, 
could be the "translation" of such information into the language of the 
body. While the Unitarianist Interface Hypothesis is silent on the issue 
of the mechanism of implementation for the phenomena in question, it 
takes them to be the empirical support for the unitary nature of the 
interface between the computational system for human language and exter-
nal performance systems.

21. The findings of some minimalist research into ASL also seem to lend 
support to a unitarianist theory of language. Wilbur (1998) shows that 
a purely functionalist approach to "brow-raising" in ASL--that "br" 
marks non-asserted information--cannot be the whole story. She argues 
that there is some syntactic motivation behind "br". Using a minimalist 
framework of study, she hypothesises that "br-marked structures are as-
sociated [...] with [-Wh] operators" (p. 305). The brow furrow (bf), on 
the other hand, is still assumed to be associated with [+Wh] operators 
spreading across the c-command domain (Aarons et al., 1992). In a
unitarianist reading of such findings, formal features associated with these
operators are available for interpretation at SPF. For a language like 
English, gesticulation and articulation are both possible since the re-
levant biological systems access the features in question. For ASL, on 
the other hand, such formal features--not available at PF in orthodox 
minimalist accounts as they are illegible there--are interpreted 
as brow-raising and brow furrow. It is not very probable that such 
features were consciously incorporated by Charles Michel de l'Epee, 
Thomas Gallaudet or other educators behind sign languages back in the 
eighteenth and nineteenth centuries. If they had been, the issues would not 
be as controversial as they are today since the origins of ASL forms are 
presumably more accessible than those of spoken languages. It is more 
probable that hearing users of sign languages, who had access to such 
features in their native language, unconsciously incorporated the fea-
tures in their signed performance, too. In other words, the relevant 
formal features normally associated with their L1 equivalents of ASL 
lexical items, e.g. Wh-words, AUX, and Q particles in articulated 
languages, were still accessible to their signing performance system 
while communicating in ASL. As a result, while they were signing LIs 
to other speakers of the language, a formal feature like br crept into 
their performance NOT as a conscious attempt to signal a [-Wh] operation 
of the language faculty but an unconscious move on the user's part 
because her signing performance system could not help accessing the 
relevant feature at SPF: hence, a unitary interface level between the 
competence system and the performance systems.
                         
22. Brody (1995) recognises that LF representations must be "regularly 
recovered quite fully on the basis of PF evidence" (p. 3), which seems
impossible in Chomsky's minimalist framework given the mismatches 
between PF and LF. Brody argues that LF representations "vary from 
language to language only to the extent to which language learner can 
determine the relevant parameters on the basis of PF data" (1995:3). 
He further hypothesises that "semantic interpretation rules and the 
lexicon have access to the same interface, the level of Lexico-Logical 
Form (LLF)" (1995:2). In order to achieve a higher level of conceptual
economy in his theory, Brody dismisses the operation Move altogether on
the grounds that "chains and Move Alpha cover the same class of pheno-
mena [...]" (1995:8). Since for Brody "the concept of chains is in-
dependently motivated by the principle of Full Interpretation and by
the condition that determines the distribution of the set of thematic
positions [...]" (p. 5), he concludes that a theory with both concepts
of chains and movement is wrong. Then a minimalist theory with a single
syntactic structure (LLF), which is input to SPELLOUT, emerges.
"Since the theory has no movement, categories in LLF representations
will have to occupy their PF positions" (p. 20). Then LLF representa-
tions are comparable with S-structures "with respect to the positions 
lexical categories occupy [...]" (p.21). If I understand the proposal
correctly, a schematic representation of LLF may look like the diagram
below:


                              Lexicon
                                 |
                                 |
                                 V
                             Form Chain
                                 |
                                 |
                                 V
                                LLF
                                 |
                                 |
                                 V
                             Spell-Out
                                 |
                                 |
                                 V
                                 PF


Figure 3. A formulation of Brody's LLF model


23. The model is superior to Chomsky's as it can better explain the LF-
to-PF mapping. Mismatches disappear because it is LLF that is spelled 
out as PF. Then it comes closer to my SPF. Despite that, LLF is dis-
tinctly different from SPF in two important ways. Firstly, LLF is still 
a split-interface model presumably with the C-I representation as the 
input to the A-P system. In other words, the C-I system reads the 
expression first, and the residue is sent to the A-P system for
interpretation. SPF, on the other hand, dispenses with the operation
Spell-Out altogether on the grounds that there is no need to separate pi
and lambda at the interface level: pi and lambda features--phonological
and semantico-formal ones--are mainly unreadable, and hence harmless, to
an incompatible performance system (with significant exceptions like
some prosodic features and gestures, which LLF cannot explain). Then no
mapping is needed in SPF between PF and LF at all: [11], [12] 


                                                           _
                               Meaning                      |
                                                            |
                                 ...                        |
                                                            |
        D                      Lexicon                      |
        E                         |                         | WHAT THE
        R                         |                         | SPEAKER 
        I                         V                         | DOES
        V                   Lexical Array                   |  
        A                         |                         |
        T                         |                         |
        I                         V                         |
        O                   Move and Share                  |
        N                         |                         |
                                  |                         |
                                  V                         |
        I        A-P. . . . . .> SPF <. . . . . . Motor     |
        N      system                            system    _|
        T                        /\                        -
        E                        .                          |
        R                        .                          |
        P                        .                          |
        R                        .                          | WHAT THE
        E                        .                          | LISTENER
        T                        .                          | DOES 
        A                        .                          | 
        T                        .                          |
        I                        .                          |
        O                       C-I                         |
        N                      System                      _|


Figure 4. The SPF Model

24. Brody's (1997) formulation of the level seems similar enough to SPF 
as he considers LLF to be "the input to both semantic interpretation and 
the SPELLOUT component" (Brody, 1997:139). Following the standard 
minimalist terminology, Brody (1995:34) still considered SPELLOUT to be 
an operation of some sort, but now Brody (1997) refers to it as a 
component. It might be possible then to consider LLF as the input to 
both semantic and phonological interpretation systems. If this is really 
the case, then Brody's LLF, like SPF, is a unitarianist (in the first 
sense of the word) theory of language. Despite that, the LLF model 
(contrary to SPF) remains ambiguous with regard to the question of the 
speaker/hearer orientation of a model discussed earlier in 2.2.

25. Secondly, SPF does not reduce Move to Form Chain. Brody dispenses 
with Move because whenever an element moves overtly, a trace is left be-
hind that is linked to the new position via an invisible chain. Then 
Form Chain must suffice to explain the phenomena under study. This re-
ductionism, however, suffers the same weaknesses that typical reduction-
ist approaches are open to: it equates {P iff Q} to {P=Q}, which ignores 
the causal relation between P and Q--chains as the consequence of Move. 
Since chains are the product of Move, Brody's reductionist thesis misses 
the possibility of other syntactic effects due to Move: {P iff Q} and 
{P AND S iff Q} are not contradictory. With Move banished from our 
theories, one has to either (a) reduce other effects to Form Chain 
again, or (b) dispense with them. In both cases, some empirical and 
again, or (b) dispense with them. In both cases, some empirical and 
conceptual losses will be inevitable. Movement is NOT self-motivated. 
It is then X, X being a morphological requirement in Chomsky's frame-
work or anything else conceivable, that triggers movement. Then:

             X ---> Move ---> Chains

Eliminating Move establishes a direct relation between X and chains, 
which is at least inaccurate. It is like saying

    John drinks iff he has a quarrel with his wife, and the police 
    officer gives him a ticket iff he drinks, then the police officer 
    gives him a ticket iff he has a quarrel with his wife--drinking 
    now can be reduced to having a ticket.

Although the statement 'the police officer gives John a ticket iff he  
has a quarrel with his wife' is true, the reduction itself is not an
acceptable move because it misses the whole point. It fails to explain 
why things work that way. Similarly, Brody's reduction of Move to Form
Chain offers a simpler description but a less adequate explanation.
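
The propositional skeleton of this argument can be checked mechanically.
What follows is only a rough sketch in Python, assuming that
{P AND S iff Q} is read as ((P and S) iff Q), with P standing for chains,
Q for Move, and S for some further effect of Move. The function names
are illustrative and are not taken from any existing library.

```python
from itertools import product

def iff(x, y):
    """Material biconditional over truth values."""
    return x == y

def consistent_assignments():
    """Truth assignments (P, Q, S) satisfying both biconditionals.

    Assumed reading: {P iff Q} and {(P and S) iff Q}.
    A non-empty result means the two statements are not contradictory.
    """
    return [(p, q, s)
            for p, q, s in product([True, False], repeat=3)
            if iff(p, q) and iff(p and s, q)]

def ticket_inference_is_valid():
    """Check the John example as a matter of truth values only.

    d: John drinks; w: he quarrels with his wife; t: he gets a ticket.
    Premises: d iff w, and t iff d. Conclusion: t iff w.
    """
    return all(iff(t, w)
               for d, w, t in product([True, False], repeat=3)
               if iff(d, w) and iff(t, d))

print(consistent_assignments())
print(ticket_inference_is_valid())
```

Under these assumptions the first function returns a non-empty list, so
the two biconditionals can hold together, and the second confirms that
the ticket inference is truth-functionally valid. The check is blind to
which proposition brings about which, and that is precisely the
information the reduction discards.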


3. Language and the Trade-off between Economy and Distinctness
 
3.1 Extrema: Minima and Maxima

26. Chomsky's strict attention to the economy of derivation and (to a 
lesser degree) the economy of representation along with his contention 
that "[o]ne expects 'imperfections' in morphological-formal features 
of the lexicon [...]" (Chomsky, 1995:9) raises both theoretical and 
empirical questions about the relevance of economy to other aspects 
of human cognition in general and the language faculty in particular, 
and the extent to which a minimalist grammarian can develop any taste 
for such facets of human cognition. Moreover, he does not explain the
cost at which the purported economy of human cognition is attained--
what economy is counterbalanced by. More on this in the rest
of the section.
   
27. We should bear in mind that whether the phenomenon under study is a 
minimum or maximum solely depends upon the angle from which we observe 
things. Think of two cities, say Teheran and Algiers. You can connect
these two points with a straight line (an arc, to be more precise,
because the earth is a sphere), and you may call its length the shortest
distance. But there is inevitably another arc connecting the two the
other way round the earth, which is the longest arc between these two
points along the same great circle. It is now a textbook fact for
mathematicians (among many other scientists) that maximum and minimum
always co-exist, that in almost all problems of the living world the
organism is engaged in a search for extrema--a generic term for maxima
and minima--and that the graph for many such organic problems looks like
a parabola with a single point on the curve denoting both the minimal
value of one variable and the optimal value of the other: an animal
picks the shortest route (minimal value) in order to maximize its 
efficiency of locomotion, a TV engineer minimizes the reproduction error 
of the set in order to maximize the quality of the picture, the reader 
of this paper maximizes his or her reading rate in order to reduce the 
reading time to a minimum, and even a person standing still over there 
is constantly looking for extrema in order to stay in a position of 
equilibrium. Likewise, any economical endeavour of the human mind must 
be directly associated with another extremum in the opposite direction. 
Then the maximal economy of representation as formulated in a model of 
knowledge like semantic memory might be related to some other extremum 
such as the least amount of retrieval time, the least expenditure of 
energy, etc.

28. Going back to language economy and minimalist syntax, one should
ask what the thing is that a language-user minimizes, and what the thing
is that she maximizes in return when features/syntactic objects are
dislocated across sentential boundaries. The most straightforward
"minimalist" answer to this question seems to be one with the shortest
move/covert movement (Procrastinate)/ no movement operation (move as the
last resort) as the minima and economy of derivation/representation as 
the maxima. However, Chomsky nowhere explains what, technically, the
nature of this economy is. Do we spend less and less 
energy as we proceed from left to right through the ranked sequence of 
shortest move--> covert movement--> no movement? If yes, then Chomsky's 
will be a model of performance, which he has repeatedly denied. If no,
then what? In other words, what are our concrete criteria for the 
economy of the language faculty? Secondly, and even more importantly, 
from Chomsky's colourful but vague expressions such as PF "extra bag-
gage", it may be inferred that a wh-in-situ language is more economical 
than one with overt movement, that a pro-drop language with no pronominal 
subjects is more "perfect" than a non-pro-drop language with expletives,
that synthetic languages are more elegant than analytic ones, that the
greater the number of "strong" features in a language, the less
economical and hence the less perfect that language would be, etc.,
because in all these cases the amount of PF "extra baggage" the speaker
has to carry would differ substantially from one language type to
another. All these 
seem to be very undesirable (even catastrophic) but logical inferences
for a theory of language that is expected to reduce parametric variation
among particular (I-) languages to choosing among a number of deriva-
tions all of them convergent and maximally economical. It reminds one of
the (now dead and buried) reflections of such historical linguists as 
Friedrich Muller and August Schleicher (see Otto Jespersen 1993 for 
a review) on the "perfectness" of "flexional" languages in comparison 
with agglutinating and isolating ones. 

29. Semantico-Phonetic Form as a unitarianist theory of language, on 
the other hand, seeks to explain language phenomena as different aspects
of an innately available system--most probably an exaptation of some
sort with its original biological function associated with motor 
activities of the body and their representations in the brain (see
Allott 1994, and Calvin and Bickerton, 2000)--which inevitably
interacts with other systems of human biology and sociology in the
fulfillment of its major function in human society, namely communication.
Formulated as such, a theory of the language faculty is expected to
function as a point of convergence for two different types of theories
of language: speaker-oriented theories, which focus on what happens to
the speaker as she produces language (then having a potential interest
in such considerations as economy), and functionalist theories of lan-
guage, which inevitably have a keen interest in the communicative
functions of language and how they shape language use and usage (then
implying a significant role for negotiations between the speaker AND
the listener). Such a negotiation of interests is characterised by (a) 
the speaker's natural tendency to conform to the principle of least 
effort (economy), on the one hand, and (b) the listener's interest in 
urging the speaker to remain as distinct as possible (distinctness), on
the other. 
Neither of these two per se can explain why language is structured as it 
is. This puts both formalist and functionalist explanations of language 
in perspective and encourages a more unified model of language. 

30. In this sense, as the speaker minimises her PF "extra baggage" (i.e. 
she tries to maximise her conformity to the principles of natural econo-
my), the property of distinctness of speech will be minimised, which 
disfavours the interlocutor as now he must maximise his efforts to make 
sense of what the speaker means. Maximal economy means uttering noth-
ing, which minimises distinctness to zero--a failure to communicate.
Then sentences we normally produce embody some compromise between these
two tendencies so that a reasonable balance is struck between what the
speaker and the listener each demands. A strict observation of economy 
principles, such as those assumed in purely formalist models of compe-
tence (like Chomsky's), is out of the question in real performance.
Semantico-Phonetic Form, on the other hand, does NOT assume its formal
restrictions (see Section four below) to be strict, inviolable
requirements of Universal Grammar but the natural tendency of the language
faculty to minimise the energy needed in order for the speaker to com-
pute structural descriptions; a tendency which is constrained by (the 
speaker's observation of) the listener's tendency to optimise communica-
tional utility via the maximal distinctness of the message. This means
that a Unitarianist Grammar has speaker economy counterbalanced by the
listener's desire for distinctness. When one is maximised, the other
gets minimised.

3.2 Semantico-Phonetic Form and the Convergence of Functional and 
    Formal Models of Language Architecture

31. Many linguists in both formalist and functionalist camps maintain 
that a convergence between functional and formal explanations of lan-
guage is possible (if not necessary) in principle. Newmeyer (1998a, b) 
argues that formalist and functionalist approaches can complement each 
other in that the former is concerned with the autonomous system at the 
core of language while the latter focuses on the functional motivation 
of syntactic structures in general. Each approach has its own merits and 
demerits. Formalists' focus on purely formal grammar-internal solutions 
has resulted in unnaturally complex treatments of phenomena while func-
tionalists go to the other extreme of rejecting the existence of struc-
tural systems. On the other hand, functionalists (rightly) incorporate 
some discourse-based explanations for syntactic phenomena that may prove 
to be more adequate than merely formalist accounts of language. But 
formalists do not forget that there are serious mismatches between forms 
and functions. For Newmeyer, these two approaches can converge on (a) 
what a model is constructed of, (b) developing a synchronic model of 
grammar-discourse interaction, and (c) explaining the mechanism by which 
functions shape forms.

32. With more specific issues in mind, Hale (1998) observes that overt 
nominals in Navajo--a pronominal argument language--must not be adjuncts 
as it is possible to extract from NP. Otherwise, the Condition on 
Extraction Domains will be violated. On the other hand, some other data 
from Navajo strongly suggest that such nominals must be adjuncts so that 
no pronominal can c-command any overt nominal argument. Then some 
coreference phenomena seem to be in conflict with Principle C of the 
Binding Theory while some others suggest that the principle is observed 
in Navajo.
                               
33. Kaiser's (1998) insightful paper on the Japanese post-verbal cons-
truction (PVC) suggests that both formalist and functionalist accounts 
are needed in order to explain the Japanese PVC. From a formal point of
view, PVCs are subject to such structural constraints on movement as 
Subjacency. From a functional point of view, there are certain dis-
course contexts and not others in which some PVCs can occur. Kuno 
(1978), for instance, considers a PV element to be discourse-
predictable. Vallduvi's (1992) theory of Informatics may serve as a 
basis for a unified explanation of the PVC in terms of iconicity with 
both formal and functional properties.

34. Functionalism and formalism also complement each other in explaining 
topicality and agreement (Meinunger, 1998): formalism affords a 
grammatical description of agreement (e.g. in terms of Chomsky's 
Minimalist Program) while  the phenomenon can be explained functionally 
in reference to Givon's understanding of topicality. Meinunger proposes 
that "the properties which characterize the degree of topicality of a 
given noun phrase are linked to concrete morphological features [...] 
which trigger certain operations like movement or clitic doubling" (p. 
213). Examples concerning the behaviour of direct objects in Turkish, 
Spanish, and German, the differences between Spanish and Greek clitic 
doubling, and object shift in Icelandic and Danish support the proposal.

35. Nettle (1998) focuses on the parallelism between linguistics and 
biology with respect to functionalism: for both adaptation is the result 
of the process of replication, variation and selection. Structural pat-
terns are passed from one generation to another. But this replication is 
not perfect as random errors and novel solutions to specific discourse 
problems leak in. The linguistic equivalent of natural selection has 
something to do with plasticity, economy, and communicational utility 
that language forms can afford within the user's linguistic and cogni-
tive system. 

36. The core of functionalist accounts of language seems to be the adoption 
of forms into grammar due to their communicational or cognitive useful-
ness (Nettle, 1998:449). It follows that formalist theories of language 
economy are not necessarily incompatible with functionalism as the 
speaker naturally adopts a more functional system that helps to achieve 
maximal economy of production in real speech (Lindblom et al., 1995). 
Such a system saves the speaker some physical (articulatory/cognitive) 
effort, which is desirable for any organism. Despite that, there are 
times when the speaker has to "hyper-articulate forms [...] to make 
herself understood, but will otherwise produce the most reduced variants 
she can as her speech output tends towards maximal economy of produc-
tion" (Nettle, 1998:448). A more functionalist (than Chomsky's) 
approach to economy can better explain "imperfections" in the economy of 
speech as it is the communication force--the listener's demand that the 
speaker remain comprehensible--that may force deviations from the 
principles of economy.


4. The Pooled Features Hypothesis

4.1 Lexical Economy 

37. Cognitive psychologists of the sixties had already thought of a 
model of knowledge called SEMANTIC MEMORY structured as a network of 
nodes and some paths between them (Collins and Quillian, 1969). In this 
network, two types of nodes were hypothesised to exist: set nodes, such 
as Animal, and property nodes, such as Can Move Around. The model was 
primarily concerned with economy of representation achieved through 
"strict hierarchical organization and placement of properties at their 
highest level of generalizability" (Komatsu, 1994:184).

38. Although subsequent studies suggest that semantic memory is not 
strictly hierarchical in its organization (e.g. Rips, Shoben, and 
Smith 1973), nor organized with such properties at the highest level 
of generality (e.g. Conrad 1972), some of the latest developments in 
the field of cognitive psychology, such as connectionist or distributed
array models of concepts (like McClelland and Rumelhart, 1985; Estes, 
1986; Gluck, 1991; Schyns, 1991; Shanks, 1991; and Kruschke, 1992) still 
have an eye on economy of representation as in such PDP (Parallel Dis-
tributed Processing) models one single network of simple, interconnected 
units can represent a good number of categories. Then although a concept 
is assumed to be a collection of individual representations of the 
members of a category, connectionist networks capture both abstracted 
and specific instance data while they are neither abstractive nor 
enumerative.

39. Assuming the lexicon to be a network of concepts and categories 
with some phonetic labels and certain formal features characterizing 
grammatical limitations on their use, one can hypothesise that the 
lexicon is economical in both its internal organization and its
retrieval process. Perhaps Chomsky has no objection to this contention.
While he still endorses de Saussure's view that the lexicon is "a list 
of 'exceptions', whatever does not follow from general principles", he 
further assumes that "the lexicon provides an 'optimal coding' of such 
idiosyncrasies" (Chomsky, 1995:235).

40. If we are concerned with the cognitive system of the language 
faculty, and if "for each particular language, the cognitive system 
[...] consists of a computational system CS and a lexicon" (Chomsky, 
1995:6), then it is quite natural to assume as THE NULL HYPOTHESIS that 
the system is economical in all respects--organization and retrieval of 
LIs, selection from the lexicon for the numeration, and derivation of 
structural SDs included--unless proved otherwise. Based on this, it may 
be hypothesised that those formal features that happen to be common bet-
ween two LIs (selected for the same derivation) are copied from the 
lexicon onto the lexical array only once so that such LIs will share 
these features among themselves in order to satisfy the requirements of 
the principles of economy of derivation and representation such as 
simplicity, nonredundancy, and the like. Naturally, not ALL identical
features can ALWAYS be shared, as such pooling of identical features 
requires the adjacency of the relevant lexical items: a very strong 
version of this "sharing condition" may necessitate syntactically im-
possible constellations--e.g. one in which some LIs, say A through E, 
are arranged as pairs A-B, B-C, C-D, and D-E (with some features shared 
for each) but no such union as A-D (although A and D can have some 
features in common) because this will inevitably nullify some other 
unions. A weaker version, advocated here, merely requires all lexical 
items in the structure to have SOME feature in common with a neighbour 
[13]. The hypothesis formulated as such is termed here the Pooled 
Features Hypothesis. 
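
As a rough illustration of the weaker sharing condition, the selection
step can be sketched in Python as follows. The sketch assumes that an LI
is simply a name plus a set of formal features, that "neighbour" means
the adjacent item in a linear list (a simplification of structural
adjacency), and that the feature sets are those suggested by example
(6b) below. All function names are hypothetical.

```python
# Hypothetical encoding of (6b) "He married her": each LI is a name plus
# a set of formal features (an assumption made for illustration only).
ITEMS_6B = [
    ("He", {"3MSD", "Casenom"}),
    ("married", {"Past V", "Casenom", "Caseacc"}),
    ("her", {"3FSD", "Caseacc"}),
]

def build_lexical_array(items):
    """Copy every formal feature onto the array once, recording its LIs."""
    array = {}
    for name, features in items:
        for feature in sorted(features):
            array.setdefault(feature, []).append(name)
    return array

def pooled_features(array):
    """Features selected once but shared by two or more LIs."""
    return {f: lis for f, lis in array.items() if len(lis) > 1}

def satisfies_weak_sharing(items):
    """True iff every LI has SOME feature in common with a neighbour."""
    for i, (_, features) in enumerate(items):
        neighbours = items[max(i - 1, 0):i] + items[i + 1:i + 2]
        if not any(features & other for _, other in neighbours):
            return False
    return True

array = build_lexical_array(ITEMS_6B)
copies_without_pooling = sum(len(features) for _, features in ITEMS_6B)
print(copies_without_pooling, len(array))
print(pooled_features(array))
print(satisfies_weak_sharing(ITEMS_6B))
```

On this toy encoding, seven feature tokens are reduced to five entries
on the lexical array, the two Case features are the pooled ones, and
every LI shares a feature with a neighbour. Features with a single owner
remain unpooled.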

4.2 Feature Sharing and Phrase Structure

41. The postulation of such a sharing mechanism has theoretical
consequences for unitarianist syntax that distance it further from
minimalist accounts of movement. Firstly, the Pooled Features Hypothesis 
reduces the phrase structure to a bare phrase structure in which tree 
diagrams are labelled with shared formal features rather than category 
labels. The assumption is that the phrase structure is NOT computed by 
the computational system: it is universally available in its barest form 
as a means to present an array of lexical items ((5) below). However, as
lexical items are plugged into the structure, certain local relations
(and not others) are imposed on their hierarchical organization, mainly
(but not exclusively) due to the featural composition of each lexical
item and the formal features it happens to share with some others (see
(6) below as an illustration). In other words, due to certain economy
considerations, lexical items with common formal features enter into 
the most local relations possible (between two LIs or their projections)
so that the common formal features can be pooled. Feature sharing, in a
sense, is a necessary (though not sufficient) condition on the locality 
of structural relations.

(5)
.                                        / \
.                                      /     \
.                                    /     /   \
.                                  /     /       \
.                                /     /      /    \
.                              /     /      /        \
.                            /      /     /       /    \
.                          /      /     /       /        \


42. Secondly, no distinction is made between such pairs as strong/weak 
or interpretable/uninterpretable features. Then it is not a question of 
(un)interpretability when a difference is detected between two features. 
It is rather a question of how and/or where, i.e. at which stage of the
derivation, the features are supposed to be interpreted. If a formal 
feature is shared by two LIs, the feature is structurally interpreted in 
that it has made these two LIs assume the most local structural relation 
in a bare phrase structure:[14]

  (6)
      a.   He <Casenom> may <Inf> marry <Caseacc> her.
        [3MSD]       [Pres I]      [V]           [3FSD]



 .                 <Casenom> may
 .                         /    \
 .                        /     may <Inf>
 .                       /     /    \
 .                      /     /  marry <Caseacc>
 .                     /     /    /    \
 .                   He    may  marry  her
 .               [3MSD] [Pres I] [V]   [3FSD]

  b.  He      <Casenom> married <Caseacc> her.
     [3MSD]            [Past V]          [3FSD]

 .                        <Casenom>married
 .                              /         \
 .                             /        married <Caseacc>
 .                            /         /    \
 .                           He   married    her
 .                        [3MSD]  [Past V]  [3FSD]

Unpooled features, however, cannot have any structural interpretation. 
As a result, they have to wait in line until interpreted at SPF. 
Fortunately, pooled features, as specified here, happen to be roughly 
the same as those that Chomsky refers to as uninterpretable ones. The 
inventory of unpooled features, on the other hand, corresponds to 
Chomsky's set of interpretable features. Although the Pooled 
Features Hypothesis does not maintain the distinction between 
interpretable and uninterpretable features, the distributional 
similarities between (un)interpretable and (un)pooled features 
minimize our theoretical and empirical losses. For Chomsky, such formal
features are checked and deleted. For me, (when pooled) they shape
the structure.

43. (7) represents a definition of Feature Sharing.

(7)  F is shared by alpha and beta iff F is a common formal feature 
that labels a node immediately dominating both alpha and beta or their
projections. The shared feature will label the node which is on the 
shortest path between alpha and beta or their projections. To put it 
more formally:

(8) SHFab <---> (Ex) [Fx & CMFab & Lxn & Dna & Dnb V Dna'' & Dnb'']

(9) ~SHFab ---> (Ex) [(Fx) & ~CMFab & SPFx]
  
  where SHF stands for "share the feature", a for "alpha", b for "beta",
  F for "feature", CMF for "Common feature", L for "labels", n for 
  "node", D for "dominates", '' for "a projection of", and SPF for "is 
  interpreted at Semantico-Phonetic Form".
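
Definition (7) can be made concrete on the tree in (6b). The following
Python sketch is one possible reading, not a definitive formalisation:
it assumes that a node counts as "alpha or a projection of alpha" when
it carries alpha's head, that each LI has a distinct head name, and that
an internal node bears exactly one pooled feature as its label. The
"shortest path" clause of (7) is not modelled separately. The class and
function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    head: str                        # the LI this node is, or projects from
    formal: frozenset = frozenset()  # formal features (leaves only)
    label: Optional[str] = None      # pooled feature on an internal node
    children: tuple = ()

def nodes(root):
    """Yield root and every node it dominates."""
    yield root
    for child in root.children:
        yield from nodes(child)

def shares(feature, alpha, beta, root):
    """Reading of (7): feature is common to alpha and beta, and labels a
    node immediately dominating both of them or their projections."""
    if feature not in alpha.formal or feature not in beta.formal:
        return False
    for node in nodes(root):
        if node.label != feature:
            continue
        heads = {child.head for child in node.children}
        if alpha.head in heads and beta.head in heads:
            return True
    return False

# Tree (6b), "He married her", under the assumptions stated above.
he = Node("He", frozenset({"Casenom"}))
married = Node("married", frozenset({"Casenom", "Caseacc"}))
her = Node("her", frozenset({"Caseacc"}))
lower = Node("married", label="Caseacc", children=(married, her))
top = Node("married", label="Casenom", children=(he, lower))

print(shares("Casenom", he, married, top))
print(shares("Caseacc", married, her, top))
print(shares("Casenom", he, her, top))
```

On this reading, Casenom is shared by "He" and "married" through the
projection of "married", Caseacc by "married" and "her", while "He" and
"her" share nothing, since they have no formal feature in common.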

The Pooled Features Hypothesis is compatible with Brody's (1997) radical
interpretability that requires ALL features to have semantic interpre-
tation. They are even similar in that Brody's bare checking theory 
assumes that "multiple instances of what is in fact one feature are not
tolerated at the interface" (Brody, 1997:159). But feature sharing and
bare checking cease to be similar at this point as for Brody, checking
does take place, i.e. a feature is deleted after all, because "the mul-
tiple copies of F are interpretively redundant and would violate the
principle of full interpretation" (Brody, 1997:158). Feature sharing,
on the other hand, assumes that an LI in the lexicon is a set of codes
each pointing to some feature from one of the inventories of features--
there are three of these inventories: those of phonological, semantic, 
and formal features respectively. When LIs are selected, their features 
are copied from the lexicon onto a temporary buffer one by one so that 
features common between two LIs are copied only once in the fulfillment 
of the principles of natural economy. Note that such sharing of fea-
tures can work ONLY FOR FORMAL FEATURES as FF(LI) is different from 
other subcomplexes, namely PF(LI) and SF(LI), in that formal features
are grammatical in nature, thus INTERLEXICAL. This model is even more 
economical than Brody's bare checking theory which seems to introduce 
a feature onto LA first and then check and delete all of its copies 
except one in order to fulfill the principle of full interpretation. 


5. Movement and Formal Features: How Imperfect Are "Imperfections"?

5.1 A Critique of Chomsky's Thesis of Movement

44. In Chapter Four of "The Minimalist Program", 'Categories and
Transformations' (1995), Chomsky advances several claims with the aim of 
establishing a relation between certain morphological requirements of a 
language and the operation Move. According to Chomsky, "the operation 
Move is driven by morphological considerations: the requirement that 
some feature F must be checked" (Chomsky, 1995:262). Then F (a feature) 
raises to target beta (a full-fledged category) in K = {gamma, {alpha, 
beta}} to form K = {gamma, {F, beta}}, or it raises to target K to form 
{gamma, {F, K}}. However, due to the economy condition, "F carries along 
just enough material for convergence. [...] Whatever 'extra baggage' is 
required for convergence involves a kind of 'generalized pied-piping'. 
[...] For the most part--perhaps completely--it is properties of the 
phonological component that require such pied-piping" (p.262). Chomsky 
(1995) argues that a principle of economy (Procrastinate)[15] requires 
that this movement be covert unless PF convergence forces overt raising 
(pp. 264-265).

45. This formulation of Chomsky's thesis of movement, however, crucially
relies on how the terms checking and the PF convergence condition are
defined. Otherwise, one cannot explain why alpha (whether F or K) moves
at all nor why covert raising is preferred to overt raising. Although
checking is such a central concept to Chomsky's thesis, he avoids an
explicit definition of the term. Instead, he appeals to intuitions, 
illustrations, and comments on checking such as the following (which 
seem to be the closest he comes to a definition of the term):

* "We can begin by reducing feature checking to deletion [...]. This 
cannot be the whole story" (p. 229).
* "A checked feature is deleted when possible. [...] [D]eletion is 
'impossible' if it violates principles of UG. Specifically, a 
checked feature cannot be deleted if that operation would contradict 
the overriding principle of recoverability of deletion [...]. Interpre-
table features cannot be deleted even if checked" (p.280).
* "[-Interpretable] features [...] must be inaccessible after checking. 
[...] Erasure of such features never creates an illegitimate object, so 
checking is deletion, and is followed by erasure without exception"  
(p.281).
* "Mismatch of features cancels the derivation. [...] We distinguish 
mismatch from nonmatch: thus, the case feature [accusative] mismatches 
F' = [assign nominative], but fails to match F' = I  of a raising 
infinitival, which assigns no case" (p. 309).

46. Chomsky's formulation of PF convergence as the condition on the 
"extra-baggage" accompanying F in its movement is even less clear in 
that he seems to associate it with the strength of features in 
question. Accordingly, a strong feature, one which is a feature of a 
nonsubstantive category checked by a categorial feature (p. 232),  
is a feature that can trigger movement (whereby both phonetic and 
formal features are moved together).  Chomsky asserts that

"if F is strong, then F is a feature of a nonsubstantive category and F 
is checked by a categorial feature. If so, nouns and main verbs do not 
have strong features, and a strong feature always calls for a certain 
category in its checking domain [...]. It follows that overt movement 
of beta targeting alpha, forming [Spec, alpha] or [alpha beta alpha], 
is possible only when alpha is nonsubstantive and a categorial feature of 
beta is involved in the operation" (1995:232).

47. In Chapter Four of his "Minimalist Program", Chomsky drops the 
stipulation underlying his formulation of strength because, as he puts 
it, "formulation of strength in terms of PF convergence is a restatement 
of the basic property, not a true explanation" (p. 233). Since he cannot 
think of any better formulation of strength either--"[i]n fact, there 
seems to be no way to improve upon the bare statement of the 
properties of strength" (p.233)--we have to conclude that a strong 
feature is one that "triggers a rule that eliminates it: [strength] is 
associated with a pair of operations, one that introduces it into the 
derivation (actually, a combination of Select and Merge), a second that 
(quickly) eliminates it" (p. 233). Thus:

(10) (A) If F is a feature of the target so that the target is not a 
         substantive category, 
     AND 
     (B) alpha is a substantive category that contains a categorial 
         feature SO THAT ALPHA MOVES, 
     AND 
     (C) it enters into a checking relation with the target, 
     AND 
     (D) its categorial feature eliminates F (A through D altogether
         as equivalent to saying "F is strong"), 
     THEN 
     (E) alpha moves.

(10) is of little interest because it is redundant and trivial:
     
(10')  [ (A) & (alpha moves) & (C) & (D) ---> (alpha moves) ]

It merely tells us that if (among many other events) alpha moves then 
alpha moves. This reduces Chomsky's thesis of overt movement to a 
triviality. At best, it is as informative as saying:

(11) If an element does not move overtly, then its unchecked features 
move covertly in order to be checked.
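
The triviality of (10') can be verified mechanically with a truth 
table. The sketch below is only an illustration: the function names 
are assumptions, and the propositional variables a, c, d and moves 
simply stand for clauses (A), (C), (D) and "alpha moves" in (10). It 
checks that (10') holds under every assignment of truth values, and 
that the schema stops being a tautology once "alpha moves" is dropped 
from the antecedent.

```python
from itertools import product


def implies(p, q):
    """Material implication: p ---> q."""
    return (not p) or q


def schema_10_prime(a, moves, c, d):
    """(10'): [(A) & (alpha moves) & (C) & (D)] ---> (alpha moves)."""
    return implies(a and moves and c and d, moves)


def non_circular(a, moves, c, d):
    """Same schema with 'alpha moves' removed from the antecedent."""
    return implies(a and c and d, moves)


assignments = list(product([False, True], repeat=4))

is_tautology = all(schema_10_prime(*v) for v in assignments)
non_circular_is_tautology = all(non_circular(*v) for v in assignments)

print(is_tautology)               # True under all 16 assignments
print(non_circular_is_tautology)  # False: fails when A, C, D hold
                                  # but alpha does not move
```

Only the second, non-circular schema makes a claim that evidence could 
in principle contradict.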

48. The thesis is also problematic with regard to the PF convergence 
condition on movement, Procrastinate, and feature strength (see 
note 15 above). Procrastinate, a natural economy condition, minimizes to 
zero the PF "extra-baggage" F carries with it as it raises to a new 
position to be checked, because LF movement is "cheaper" than overt 
movement. Overt movement (necessitated by the strength of a feature, 
i.e. the PF convergence condition on triggering overt movement) and LF 
movement (required by Procrastinate) are then always in complementary 
distribution, so that:

(12)
(PF convergence ---> overt movement) V (Procrastinate ---> LF movement)

This seems to be a violation of the independence assumption according 
to which PF and LF are two independent interface levels.

49. The thesis does not meet the condition of falsifiability either.
If an element moves overtly, then the theory explains the movement 
in reference to some strong feature of the element. When required to
offer some existence proof for such strong features, it resorts 
to the overt movement of the element as the syntactic evidence. If 
confronted with some disconfirming cross-linguistic evidence, the theory
replies by saying that the feature must be weak in that language. Even
if the confirming and disconfirming pieces of evidence happen to come
from the same language, e.g. the feature Q in English which is to be 
eliminated despite being Interpretable, vaguely defined notions such as 
strong/weak, delete/erase, and checking relation/checking configuration
make it difficult to challenge the thesis on empirical grounds.

50. Finally, Chomsky's checking theory does not explain why 
[-Interpretable] features should exist after all if (a) they have no 
interpretation at all, (b) they must always be checked, deleted, and 
erased without exception (p.281) in order for the derivation to con-
verge, and (c) it is not uninterpretability but strength that triggers 
overt movement. Perhaps one needs such formal features in order to 
justify Chomsky's hypothesis of covert movement--the remainder of his 
thesis of movement. It is not so clear, however, why the language 
faculty should need them. 

51. Chomsky's 'Minimalist Inquiries: the Framework' (1998) (henceforth, 
MI)--although still exploratory like "The Minimalist Program" (1995)--is 
intended to be a major rethinking of MP issues and "a clearer account 
and further development of them" (p.1). As far as Chomsky's thesis of 
movement is concerned, however, there seems to be no significant 
improvement on the original ideas discussed earlier in MP. Once more, a 
set of fresh terms and distinctions is introduced in order to explain 
the complexities of the functioning of the language faculty. Chomsky 
seems to recognize this terminological strategy himself in footnote 110, 
where, discussing the featural composition of AGR, he says: "In MP, 
it could be avoided only by recourse to the (dubious) distinction 
between deletion and erasure" (1998:55).

52. Chomsky (1998) assumes movement--or "dislocation", the term Chomsky 
prefers in his MI--to be an apparent "imperfection of language" or a 
"design flaw" which makes the strong minimalist thesis untenable (p. 
32). Chomsky assumes "two striking examples" of such imperfections to 
be:
 
 (I)  Uninterpretable features of lexical items
 (II) The "dislocation" property 

"Under (I), we find features that receive no interpretation at LF and 
need receive none at PF, hence violating any reasonable version of the 
interpretability condition [...]" (p.33). "The dislocation property (II) 
is another apparent imperfection (as) [...] the surface phonetic 
relations are dislocated from the semantic ones" (p.35). Since "such 
phenomena are  pervasive, [...] (t)hey have to be accommodated by some 
device in any adequate theory of language, whether it is called 
'transformational' or something else" (p.35).

53. What, according to Chomsky, is the role of the minimalist programme
for syntactic theory in this regard? 

"The function of the eye is to see, but it remains to determine the 
implementation; a particular protein in the lens that reflects light, 
etc. Similarly, certain semantic properties may involve dislocated 
structures, but we want to discover the mechanisms that force 
dislocation. Minimalist intuitions lead us to look at the other major 
imperfection, the uninterpretable inflectional features. Perhaps these 
devices are used to yield the dislocation property. If so, then the 
two imperfections might reduce to one, the dislocation property. But 
the latter might itself be required by design specifications. That 
would be an optimal conclusion [...]" (Chomsky, 1998, p. 36).

54. Then Chomsky seems to dispense with the concept strength altogether
saying: "The concept strength, introduced to force violation of 
Procrastinate, appears to have no place. It remains to determine 
whether the effects can be fully captured in minimalist terms or 
remain as true imperfections" (p. 49).

55. It is too early, however, to conclude that Chomsky's  MI rethinking 
of movement is significantly different from his MP formulation of the 
phenomenon. Although he dispenses with the term (but not the concept) 
strength, he introduces a new one--EPP-features--which is functionally 
similar (at least as far as movement is concerned) to strength as formu-
lated in MP,  and a new operation--Agree--in order to explain the mecha-
nisms underlying movement: 

"(The) operation [...] Move, combining Merge and Agree (,) [...] 
establishes agreement between alpha and F and merges P(F) (generalized 
'pied piping') to alphaP, where P(F) is a phrase determined by F [...] 
and alphaP is a projection headed by alpha. P(F) becomes SPEC-alpha. [...] 
All CFCs (core functional categories) may have phi-features (obligatory 
for T, v). These are uninterpretable, constituting the core of the 
systems of (structural) Case-assignment and "dislocation" (Move). [...] 
Each CFC also allows an extra SPEC beyond its s-selection: for C, a 
raised Wh-phrase; for T, the surface subject; for v, the phrase raised 
by Object Shift (OS). For T, the property is the Extended Projection 
Principle (EPP). By analogy, we can call the corresponding properties 
of C and v EPP-features, determining positions not forced by the 
Projection Principle. EPP-features are uninterpretable [...] though 
the configuration they establish has effects for interpretation"
(Chomsky, 1998:14-15).

He then formulates the configuration (22) below for CFCs "with XP the 
extra SPEC determined by the EPP-features of the attracting head H:
                   (22)  alpha  = [XP   [ (EA)   H   YP ]]        

Typical examples of (22) are raising to subject (yielding (23A)), 
Object Shift (OS, yielding (B), with XP= DO and t its trace), and overt 
A'-movement (yielding (C), with H = C and XP a Wh-phrase [...]:
         
         (23)   (A)    XP   -     [T   YP]
                (B)    XP   -     [SU    [ v   [V   t ]]]
                (C)    XP   -     [C   YP]

The EPP-features of T might be universal. For the phase heads v/C, it 
varies parametrically among languages and if available is optional. 
[...] [T]he EPP-feature can be satisfied by Merge of an expletive EXPL 
in (A), but not in (B) / (C)" (Chomsky, 1998:23).

56. The arguments against Chomsky's MP thesis of movement presented 
earlier seem to be relevant here, too. Chomsky's thesis is still 
a tautology in that it does not provide any useful information about 
the phenomenon. The thesis merely states that things move simply 
because some mysterious EPP-features up there make them move (as 
strong features did in his MP account of the thesis). And by EPP-
features he means those features we understand must be there because 
of the raising of an element to the new position. Since "[c]hoice of 
Move over Agree follows from presence of EPP-features" (p.19),  and 
since such features are uninterpretable ones presumably doomed to 
deletion in the course of the derivation, we are once more left with 
the question of why they should be there after all, and with the other 
questions discussed earlier. 

57. Chomsky's allusion to "certain semantic properties" involving dis-
located structures seems to have something to do with such functionalist 
theories as parsing or theme-rheme structure in explaining the why of 
movement. Chomsky has set himself the task of exploring the mechanisms 
involved in movement. One may then wonder how nature could have 
anticipated (if it did) our future need for such (then useless) 
uninterpretable features as that part of the computational mechanism we 
would happen to employ later when we wanted to move things for 
meaning's sake. One possibility is that such features evolved later to 
take care of our already existing needs to communicate meaning. The 
other possibility, which is 
more in line with the ideas expressed in Gould (1991) and Uriagereka 
(1998), is to consider uninterpretability an exaptation--a property of 
the language faculty that was NOT adapted for its present function, i.e.
affording movement so that certain semantic effects are achieved, but 
later co-opted for that purpose. Uninterpretability as an adaptation 
must not be particularly attractive to Chomsky as it implies that 
uninterpretable features, which are illegible to the C-I system, are 
still semantically motivated in origin. Uninterpretability as an 
exaptation, on the other hand, makes the proposal less falsifiable than 
ever. 

58. Roberts and Roussou in their manuscript 'Interface Interpretation'
criticise Chomsky's (1995, 1998) thesis of movement on similar grounds, 
namely (a) the introduction of uninterpretable features that have no 
other role except to be deleted, (b) Chomsky's formulation of a strong 
feature as one with a PF reflex while they must be deleted/erased as 
soon as possible, (c) F-checking requiring the presence of the same 
feature twice, (d) checking theory imposing a ranking of principles (a 
conceptual anomaly in minimalist approaches), (e) case features being 
uninterpretable for both the attractor and the attractee, and finally 
(f) its failure to provide a formal account for parametric variation.

59. Instead, they propose another minimalist model--Interface Inter-
pretability--in which there are only (LF) interpretable features, with 
strength associated with morphophonological realisation. The system is 
claimed to take care of parametric variation, too: "[t]he lexicon pro-
vides the information determining the mapping" designated in this model 
as the syntactic symbols +p for PF-mapping, +l for LF-mapping, and F* 
when a feature "must have a PF realisation" (pp. 5-6). Parametric 
variation may then be formulated as:

(13)  a. Is F* ?           Yes/No
      b. If F*, is it satisfied by Move or Merge?

Accordingly, "there can be no features that do not receive any inter-
pretation at all, that is they are not interpreted in either inter-
face (-p, -l)" (p. 7).

60. Roberts and Roussou's proposal, however, seems to be open to the 
same criticisms already levelled at other split-interface hypotheses. 
The proposal merely rules out a [-p, -l] feature. They do not explain 
how the listener/learner can recover a [-p, +l] feature, i.e. one to be 
interpreted in terms of meaning but bereft of any phonetic realisation. 
Moreover, their system is based on two mapping features (or whatever 
they intend these to be), namely [+/- p] and [+/- l]. But it 
artificially exploits only [F][+p] and [F][-p] as the parametric 
variants. It is not so clear why "Is F$ ?" (with $=[+l]) cannot be the 
source of any parametric variation in such a system.

61. Furthermore, Roberts and Roussou assume Merge to be a more economi-
cal option than Move/Merge + Move because Merge is costless (Chomsky, 
1995). They conclude that Merge is less marked than Move as far as
parametric values are concerned. Then "[t]he least economical option
(in 13 above) is Move" (p. 7). This means one can think of a hierarchy 
of parametric options arranged as below in terms of economy:

(14)
  (~F*) > (F* & Merge) > (F* & Move)/(F* & Move & Merge)
  where > stands for "more economical than".

This inevitably means some languages are more economical than others:
as one proceeds from left to right in the diagram, the parametric value 
results in a language that is less and less economical. Since economy is 
NOT placed on a par with distinctness, as is done in SPF, such a 
stipulation can be catastrophic for a theory of language. It implies 
that some languages are functionally more evolved than some "less 
perfect" ones. For instance, a language in which a Q particle with a PF 
index is added to the sentence via Merge to signal questions must be 
more economical than one that prefers AUX-raising for the same purpose. 
Then Chinese is a more economical language than English. 
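
The consequence drawn from (14) can be made explicit with a small 
sketch. The dictionary ECONOMY_RANK and the function more_economical 
are illustrative names only; the ranks simply encode the left-to-right 
order of (14), and the assignment of parametric values to Chinese and 
English follows the Q-particle versus AUX-raising example just given 
rather than any independent analysis of those languages.

```python
# Lower rank = more economical, following the order in (14).
ECONOMY_RANK = {
    "~F*": 0,
    "F* & Merge": 1,
    "F* & Move": 2,
    "F* & Move & Merge": 2,  # grouped with Move in (14)
}


def more_economical(value_x, value_y):
    """True if parametric value x is ranked above value y in (14)."""
    return ECONOMY_RANK[value_x] < ECONOMY_RANK[value_y]


# Parametric values for question formation, as assumed in the text.
QUESTION_STRATEGY = {
    "Chinese": "F* & Merge",  # Q particle added via Merge
    "English": "F* & Move",   # AUX-raising
}

print(more_economical(QUESTION_STRATEGY["Chinese"],
                      QUESTION_STRATEGY["English"]))
```

Once parametric values are ranked in this way, a ranking of whole 
languages follows automatically, which is exactly the unwelcome 
implication at issue.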

62. Roberts and Roussou predict that no (-p, -l) features can exist. 
They then continue that "[t]hese features (-p, -l) are precisely those 
that correspond to Chomsky's non-interpretable and in particular weak 
non-interpretable features, as well as Case features" (p. 7). Apart from
the careless phrasing of this stipulation, they forget that Chomsky's
non-interpretable features can also be [+p, -l]. The phi-features of 
a verb or any other non-nominal, e.g. the [plural] feature of AUX in
(15) below, which are definitely non-interpretable in Chomsky's system, 
obviously have a phonetic realisation:

(15)
    a. I WAS going.   (WAS: I, past, SINGULAR)
    b. We WERE going. (WERE: I, past, PLURAL)

Then Chomsky's uninterpretable features, which they rightly criticise
for reasons similar to those discussed here earlier and also in Lotfi 
(in press), can actually survive Roberts and Roussou's system because 
they are NOT [-p, -l]. And perhaps even more than that: such features 
are ~F*, and hence the most economical option the system offers. This is 
even worse than Chomsky's Procrastinate, which Chomsky himself rejects 
in his 'Minimalist Inquiries' (1998), because Chomsky has always 
considered uninterpretability as an imperfection in the design of the
language faculty rather than the most economical parametric value.

                 
5.2 Feature Sharing and Movement

5.2.1 Structural Well-Formedness

63. According to the Pooled Features Hypothesis, an LI moves from its 
original position to a higher position on the hierarchy iff this raising
is formally motivated, viz. a formal feature of the lexical item requires
alpha (the LI to move or its maximal projection) to move either because
(a) the nature of the feature in question necessitates such raising (the
c-command condition on scope), (b) alpha moves in order to have some
feature pooled with the target, i.e. to fulfill the sharing condition of
locality, or (c) a head is required to be in a position high enough to
dominate the whole grammatical structure it heads. A trace is left be-
hind that is connected to the moved element via a chain. Both the head
and the foot of the chain continue to share some formal features with
their neighbours so that they remain structurally licensed. Although
raising of pooled features may be denied in this framework on the 
grounds that shared formal features of an LI are structurally needed 
where they originated, it is still possible for a formal feature to be
drawn from the lexicon and be located somewhere on the tree without 
any lexical realization.[16] Furthermore, prior to Spell-Out, it is 
still possible for an LI to draw upon the lexicon and pick up new formal 
features. A yet unexplored possibility is to move even a POOLED feature
(in order to license a structural target position for an LI) iff the 
foot remains licensed due to some other features still shared between 
the trace and its neighbouring LI (or the projection of the latter).

64. A syntactic structure is well-formed if three inter-related 
conditions are met in the derivation: 

(a) The feature sharing condition on the locality of relations between 
    adjacent LIs (or their projections)
For each node at least one formal feature of the LIs involved must be 
pooled in order to form a legitimate local relation. When a mother node 
and its daughter(s) are identical in their lexical labels, no pooled 
feature need appear between < > as the featural label because 
for such elements all formal features are actually identical and, as a 
result, pooled. For others, the pooled features will cement lexical 
items or their projections together. 

65. The sharing condition on lexical/phrasal adjacency is not devoid 
of a functional motivation: since formal features are all assumed to 
have some interpretation, the adjacent LIs/their projections will 
inevitably be those with some semantico-pragmatic links. For instance, 
the feature <Caseacc> pooled between a transitive verb and its object 
links these two in terms of semantic predication. Both the speaker and 
the listener seem to benefit from such links, as they must find it 
easier to compute these LIs/phrases one after another, with one leading 
naturally to the next, thereby minimising the amount of information 
one has to store in working memory while computing sentences. 
Otherwise, the speaker/listener has to keep track of such lexical items 
while producing/processing the interrupting data, say the material 
separating the transitive verb and its complement in (16b): hence, the 
adjacency condition on case assignment.

(16)
    a. Mary met <Caseacc> him yesterday.
  * b. Mary met yesterday him.

This approach to adjacency is compatible with Hawkins' notion of Early
Immediate Constituents according to which the parser prefers linear 
orders that maximise the IC (immediate constituent) to non IC ratios of 
constituent recognition domains (Hawkins, 1994:77).
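
The adjacency effect in (16) can be sketched as a simple check on 
whether the two LIs pooling a feature are string-adjacent. This is a 
deliberately flat approximation that ignores projections, and the 
feature sets in FF below are hypothetical values chosen for 
illustration, not an analysis proposed in the text.

```python
# Hypothetical formal-feature sets, for illustration only.
FF = {
    "Mary": {"D", "Case:nom"},
    "met": {"V", "Case:nom", "Case:acc"},
    "him": {"D", "Case:acc"},
    "yesterday": {"Adv"},
}


def bearers(words, feature):
    """Positions of the words whose feature set contains `feature`."""
    return [i for i, w in enumerate(words) if feature in FF[w]]


def is_local(words, feature):
    """True if exactly two LIs bear `feature` and they are adjacent."""
    idx = bearers(words, feature)
    return len(idx) == 2 and idx[1] - idx[0] == 1


def intervening(words, feature):
    """Material the hearer must hold in memory between the two bearers."""
    idx = bearers(words, feature)
    return words[idx[0] + 1:idx[-1]]


s16a = ["Mary", "met", "him", "yesterday"]
s16b = ["Mary", "met", "yesterday", "him"]

print(is_local(s16a, "Case:acc"), intervening(s16a, "Case:acc"))
print(is_local(s16b, "Case:acc"), intervening(s16b, "Case:acc"))
```

In (16a) nothing separates the two bearers of <Caseacc>; in (16b) 
'yesterday' intervenes, which is the extra memory load the sharing 
condition is taken to rule out.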

(b) The c-command condition on scope

66. The c-command condition on scope is formulated here in reference to
the (non-)locality of a feature, viz. whether it semantically pertains 
to a complete sentence, or to only part of it. I define a global feature 
as one whose sphere of influence is (due to semantic considerations) 
the complete sentence. A local feature, on the other hand, pertains to 
the interpretation of a part of a sentence. In 'We will meet him at the 
station', the feature [Male] pertains to the interpretation of 'him' as 
the internal argument of the verb 'meet', while the feature 
[Declarative] pertains to the whole sentence. As a result, [Male] is 
local while [Dec] is global. The c-command condition on scope requires 
that a global feature 
should be in a structural position high enough to c-command all the 
elements of the construct and take scope over them. Mood category 
features--defined here as those which are concerned with the illocution-
ary force of the sentence, such as declarative, interrogative, and 
imperative--are good examples of such global features. [Q] is distinct 
from [Wh] in this respect as the former exclusively requires either the 
proposition P or its negation ~P to be true: hence a global feature. 
[Wh] per se, on the other hand, is concerned with the identity of a 
missing argument (among other things) but not the truth value of the 
whole proposition: hence a local feature.
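
The c-command condition on scope can be sketched over a bare binary 
tree. The encoding below (nested tuples with word forms as leaves) and 
the function names are assumptions made for illustration; c-command is 
taken in its usual sense, i.e. a node c-commands whatever its sister 
dominates. The first tree follows the constituency of the Persian 
example 'Aya pedar khahad raft?'; the second, with the Q-particle 
embedded, is a hypothetical structure constructed only to show the 
condition failing.

```python
def leaves(tree):
    """All terminal elements of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def c_commanded(tree, carrier):
    """Set of leaves c-commanded by the leaf `carrier`, or None."""
    if isinstance(tree, str):
        return None
    if carrier in tree:  # carrier is an immediate daughter of this node
        return {leaf for sister in tree if sister != carrier
                for leaf in leaves(sister)}
    for child in tree:
        found = c_commanded(child, carrier)
        if found is not None:
            return found
    return None


def satisfies_scope(tree, carrier):
    """True if the carrier of a global feature c-commands all else."""
    return c_commanded(tree, carrier) == set(leaves(tree)) - {carrier}


q_initial = ("Aya", ("pedar", ("khahad", "raft")))
q_embedded = ("pedar", ("Aya", ("khahad", "raft")))  # hypothetical

print(satisfies_scope(q_initial, "Aya"))
print(satisfies_scope(q_embedded, "Aya"))
```

A carrier of a local feature such as [Wh] or [Male] would not be 
subject to this check at all, since its sphere of influence is only 
part of the sentence.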

67. The c-command condition on scope may prove to be more than a formal 
constraint on the use of global features. Distinctness as a 
communicative constraint on production encourages the speaker to 
generate sentences that are easier for the audience to process. The 
existing literature on parsing effects and iconicity supports the view 
that these 
performance explanations are necessary in order to afford a more 
comprehensive account of grammaticality and language universals (Givon 
1979, 1995; Hawkins 1989, 1994; Haiman 1985; and Bybee 1985 among many 
others). Having a global feature like [Q] at the beginning of a sen-
tence, by means of a Q particle in the initial position and/or a raising 
intonation for the whole sentence, presumably facilitates the audience's 
processing of the sentence as a question. This is in line with Givon's 
account of the pragmatic aspects of meaning according to which we attend 
first to the most urgent task: SUBJ V word order is more frequent than 
V SUBJ; and 10,253 is ten thousand, two hundred, and fifty three rather 
than three, fifty, two hundred and ten thousand. Similarly, the lexical 
carrier of a global feature tends to appear in the initial position so 
that the listener can attend to the task of determining the mood category 
of the sentence first. This will give her more time to decide what the 
speaker expects her to do--the illocutionary force of the utterance--as 
the latter gallops towards the end of the sentence. It is also compa-
tible with the functionalist theories of thematic structure (Halliday, 
1970, 1973) according to which speakers place the "frame"--a point of 
departure for the sentence--at the beginning of the sentence in order 
to orient their listeners toward a particular area of knowledge. The 
remainder of the sentence, i.e. the "insert", enables them to narrow 
down what they want to say. With [Q] in the initial position, speakers 
give notice to listeners that they are going to ask a question.
                   
(c) The dominance condition on head position

68. A head of a phrase must be in a position high enough in order for 
its maximal projection to dominate all its phrasal elements. The rele-
vant phrases for our discussion here seem to be root and subordinate 
clauses as the maximal projections of AUX/VERB and COMP respectively. 
The condition is implied as a formal requirement on projections in both 
GB and MP as in neither framework an [X] [BAR 3] is allowed. In terms of 
the operation Merge, "[a] derivation converges only if this operation 
has applied often enough to leave us with just a single object [...]"
(Chomsky, 1995:226). Then the operation stops (in the fulfillment of 
principles of natural economy) the moment [X] projects as [X] [BAR 2] 
because it is now a single object open to other syntactic operations 
like Move and Share, which naturally apply to [X] [BAR 0] and [X] 
[BAR 2] as single syntactic objects but not to [X] [BAR 1]. Since [X] 
[BAR 3] cannot project--the merger has already produced a single 
syntactic object--a head like AUX has to raise under certain 
circumstances [17] so that it can further project up there and fulfill 
the dominance condition:

(17)
                          [AUX] [BAR 2] 
                             / \
                           /     \
                         /      [AUX] [BAR 1]
                       /       /    \
                     /       /     [AUX] [BAR 2]
                   /       /       /   \
                 /       /       /       \
               /       /       /       [AUX] [BAR 1]
             /       /       /        /     \
           /       /       /        /       doing
         /       /       /        /        /   \
       What-i  are-j   you       t-j    doing   t-i ?
               ^                 |
               |_________________|


69. In (17), the final derivation is still a root clause, and hence the 
maximal projection of AUX. Since the movement of 'what' to the initial 
position is a forced move--in the fulfillment of the c-command condition 
on scope--the auxiliary also has to move to a new position high enough 
to satisfy the dominance condition, but not so high as to violate the 
scope condition for the global feature [Q] carried by the Wh-word. If 
Merge could derive [AUX] [BAR 3], AUX would remain in its original 
position. In subordinate clauses, on the other hand, AUX does not need
to head the clause, which is the projection of COMP. Then there is no 
SUBJ-AUX inversion:

(18)
                                  what
                                 /   \
                               /    [AUX] [BAR 2]
                             /      /   \
                           /      /       \
                         /      /       [AUX] [BAR 1]
                       /      /        /     \
                     /      /        /       doing
                   /      /        /        /   \
  (I wonder)... what-i  you      are     doing   t-i 


70. My thesis of movement states that whenever one or more of these 
conditions are not satisfied in a derivation, overt raising takes place. 
If no such raising takes place, the derivation is cancelled as 
ill-formed. These possibilities are explored below in further detail.

5.2.2 Languages with Q-Particles

71. Japanese-type languages are well known to require no subject-
auxiliary inversion rule in the formation of yes-no questions:

(19)    a. Kore-wa    hon    desu.
           this-TOP   book   is
           'This is a book'
        b. Kore-wa    hon    desu  ka?
           this-TOP   book   is    Q-particle
           'Is this a book?'
           (from Kuno 1973)

Persian, though Indo-European, follows the same pattern in that a 
question particle 'aya' is added to the sentence instead of AUX movement. 
The particle, however, is placed at the very beginning of the sentence:

(20)   a. Pedar   khahad   raft.
          Father  will-3S  go
          'Father will go'
       b. (Aya)       pedar   khahad   raft?
          Q-particle  father  will-3S  go
          'Will father go?'

Both Japanese and Persian Q-particles, presumably carrying [Q], are 
high enough in the structure to have scope over the whole sentence. 
Since the Q-particle is an operator that addresses the truth value of 
the whole sentence, it is necessarily located in the highest position 
available in the sentential structure. It is hypothesised that such 
lexical items as Persian 'aya' occupy the head position immediately 
above IP. Moreover, such LIs contain the structural feature 
[Inflection] with a potentiality of being shared with a finite verb 
or auxiliary whose maximal projection is c-selected by C. Then such 
a Q-particle heads a root clause, and dominates all elements within 
it (21b). For Persian declarative sentences, the formal feature [Dec] 
does not have a lexical carrier. However, all three well-formedness 
conditions are fully met in (21a). Under such circumstances, 
[Dec] is assumed to be attached to C, a null element whose maximal 
projection both dominates all the elements of the sentence, and 
shares a structural feature with 'khahad' the head of IP.

(21)    a.                        
.                            C<Inflection>
.                          /    \
.                         /       \
.                        /          \
.                       /  <Casenom>khahad
.                      /        /      \
.                     /       /   khahad <non-finite>
.                    /      /     /       \    
.                  C      Pedar khahad  raft
.                [Dec]
.     
.
.             b. (Aya) pedar khahad raft?
. 
.                              Aya<Inflection> 
.                            /     \
.                           /        \
.                          /           \
.                         /  <Casenom>khahad
.                        /        /       \
.                       /       /       khahad <Non-finite>
.                      /      /        /      \
.                     /     /        /          \
.                   Aya  pedar    khahad       raft?
.                   [Q]  

    
72. Japanese and Persian are also wh-in-situ languages; that is, Wh- 
interrogatives are formed with no Wh-fronting. In both languages, Q- 
particles are available in a position high enough to have scope over 
the whole construct. In Persian, however, the Q-particle is not 
obligatory. Again, this must be a case of [Q] located in the initial 
position without any phonetic realization. One should bear in mind 
that the Persian Q-particle is not obligatory in yes-no questions 
either. Moreover, even in standard spoken Persian, the Q-particle 
'yani' can be employed to signal a (yes/no or Wh-) question (see note 
18 below).

(22) Japanese Wh-interrogatives:
     a.  John-wa   dare-o  korosita ka?
         John-TOP  who-DO  killed   Q-particle
          'Who did John kill?'
     b. John-wa   Mary-ga        dare-o  kiratte-iru to
        John-TOP  Mary-particle  who-DO  hating-is   that

        sinzite-ita   ka?
        believing-was Q-particle
            'Who did John believe that Mary hated?'
                                             (from Kuno 1973)
(23) Persian Wh-interrogatives:
    a. (Aya)      ke  khahad raft?
       Q-particle who will   go
            'Who will go?'
    b. (Aya)   pedar  fekr-mikonad       Hasan ke-ra  did?
       Q-part. father thought-do-3S-Pres Hasan who-DO saw-3S
             'Who does father think that Hasan saw?'

Here questions are formed with requirements and mechanisms very similar
to those of yes-no questions: the Q-particle is located in the initial
position so that the scope requirement on the use of [Q] is fulfilled. 
The position is structurally granted via [Inflection]-sharing between
the Q-particle and the auxiliary/finite verb. The Wh-phrase remains in 
situ because (a) [Wh] is a local feature whose sphere of influence is 
not the whole sentence but one of its constituents, and (b) [Wh] in such
languages is not pied-piped to the global feature [Q]. Then:

(24)
.                              Aya <Inflection>
.                             /    \
.                            /       \
.                           /          \
.                          /  <Casenom>khahad
.                         /           /   \
.                        /          /   khahad <Non-finite>
.                       /         /    /      \
.                      /        /    /          \
.                    Aya       ke  khahad       raft?
.                    [Q]      [Wh]
.                                      

5.2.3 Languages with Overt Movement

73. Let's begin the study of overt movement in such languages with the 
structure of English subordinate clauses, for the differences between 
such structures and those of root clauses in wh-in-situ languages are 
minimal. These similarities could be mainly due to the presence of a 
lexical item with a mood category feature in English subordinate 
clauses. Then no other sentential element is needed to host the feature 
for structural reasons:

(25) English subordinate clauses
   a. that you could see her
      [Dec]
   b. whether you could see her
      [Q]
   c. whom-i  you could see  t-i
      [QWh]

But even here some differences can be observed: in English subordinate 
clauses, the Wh-word still moves because [Q] has no lexical carrier of 
its own. Instead, the Wh-word contains both [Q] and [Wh]. These two 
features seem to be pied-piped together, in a sense, since the covert 
raising of a feature (such as [Q] in order to take scope) is denied in 
this model. 

74. Like Persian declarative/interrogative root clauses, English 
subordinate clauses (26a, b, c) conform to the well-formedness 
conditions outlined earlier.

(26)  a.
.                          that<Inflection>
.                         /     \ 
.                        /        \
.                       / <Casenom>could
.                      /      /       \
.                     /      /       could <Nonfinite>
.                    /      /      /      \
.                   /      /      /       see <Caseacc>
.                  /      /      /       /    \
.                that   you   could    see    him 
.               [Dec]
.
.  b.                     whether <Inflection>
.                         /      \ 
.                        /         \
.                       / <Casenom>could
.                      /        /     \
.                     /        /     could <Nonfinite>
.                    /        /    /     \
.                   /        /    /      see <Caseacc>
.                  /        /    /      /   \
.              whether    you could   see   him 
.               [Q]
.   c.
.                           whom <Inflection>
.                          /     \ 
.                         /        \
.                        / <Casenom>could
.                       /        /    \
.                      /        /    could <Nonfinite>
.                     /        /     /   \
.                    /        /     /    see <Caseacc>
.                   /        /     /    /   \
.                 whom     you  could see    t
.                 [QWh]
.

In all these cases, appropriate local relations are established between 
nodes with pooled features. Moreover, mood category features in these 
trees are carried by LIs that both c-command all other elements, and 
head the whole construct.

75. Interrogative roots in English, however, show syntactic patterns
different from those of Japanese-type languages because mood category
features are not lexicalized in English roots. As a result, other 
sentential components compete to host the feature, as required by the
well-formedness conditions. In order to form a yes-no question in Modern
Standard English (MSE), the formal feature [Q] is introduced into the
derivation. Unlike Japanese-type languages, English has no specific 
lexical item comparable with 'ka' and 'aya' to carry [Q]. Unlike Early
Modern English (EME), moreover, MSE has the auxiliary rather than the 
verb carry [Q]. This seems to be a lexical difference between EME and
MSE. Whatever the case, an existing LI functions as a host to [Q] so
that the scope condition on the use of [Q] is satisfied. In EME, when
the finite verb (27a, b) moves to the initial position, the other 
well-formedness conditions are satisfied automatically, and no further 
structural modifications are needed. In MSE yes-no questions, on the 
other hand, it is the auxiliary rather than the finite verb that is 
introduced into the derivation in order to host [Q]. Aux then moves in 
fulfillment of the well-formedness conditions (27c).

(27) 
       a.  You saw him.    (EME/MSE)
.                                C <Inflection>
.                             /    \
.                            /       \
.                           / <Casenom>saw 
.                          /         /  \
.                         /         /   saw<Caseacc>
.                        /         /    /  \
.                       C        you  saw  him.
.                      [Dec]
.
.      b.  Saw you him?   (EME)
.                                saw
.                                /  \
.                               /     \
.                              /        \
.                             / <Casenom>saw
.                            /         /   \
.                           /         /    saw <Caseacc>
.                          /         /    /   \
.                        Saw       you   t    him ?
.                        [Q]
.
.       c. Did you see him?  (MSE)
.                             did
.                           /    \ 
.                          /       \
.                         / <Casenom>did
.                        /        /   \
.                       /        /    did<Nonfinite>
.                      /        /     /  \
.                     /        /     /   see <Caseacc>
.                    /        /     /   /   \
.                   Did     you    t  see   him ?
.                   [Q]

76. Other Germanic languages like German and Dutch seem to follow the 
same pattern as EME whenever an auxiliary is not present in the 
sentence. Otherwise, the auxiliary is fronted in a way similar to MSE 
auxiliary inversion. Other structural differences are due to the SOV 
word order of German and Dutch. 

(28)  German yes-no questions
        a. Kauft Karl das Buch?
           buys  Karl the book
           'Does Karl buy the book?'
        b. Hat Karl das Buch gekauft? 
           has Karl the book bought
           'Has Karl bought the book?'
           (from Haegeman 1991)

(29)   Dutch yes-no questions
        a. Koopt Wim het boek?
           buys  Wim the book
           'Does Wim buy the book?'
        b. Heeft Wim het boek gekocht?
           has   Wim the book bought
           'Has Wim bought the book?'
           (from Haegeman 1991)

77. In order to form a Wh-question in MSE, the formal features [Q] 
and [Wh] are to be introduced into the derivation. Contrary to 
wh-in-situ languages, however, [Q] and [Wh] seem to be pied-piped 
together in interrogative words in languages with obligatory 
Wh-fronting. Then the Wh-word has to move overtly to the beginning of 
the sentence 
in order for [Q] to take scope. This can explain the syntactic 
configurations depicted in (30a) and (30b). In (30a), the Wh-word 
containing both the formal features [Q] and [Wh] is already in a 
position c-commanding all other sentential elements. Then no Wh- 
raising needs to take place. The feature-sharing account of (30a) seems 
to be more elegant and economical than orthodox accounts of the 
phenomena according to which even in (30a) the Wh-word is raised to 
Spec-CP position. In (30b), on the other hand, both the Wh-word and 
Aux (could) raise to new positions in the fulfillment of the scope 
condition on [Q] insertion, the sharing condition on structural local-
ity, and the dominance condition on head position.

(30) a.
.                          <Casenom>saw 
.                                  /   \
.                                /    saw <Caseacc>
.                              /      /      \
.                             Who    saw    her ?
.                            [QWh]
.
.b.                        could
.                           /  \
.                         /  could
.                       /     /   \ 
.                     /     /       \ 
.                   /     / <Casenom>could
.                 /     /           /  \ 
.               /     /            /  could <Non-finite>
.             /     /             /   /   \
.           /     /              /   /    see <Caseacc>
.         /     /               /   /    /   \
.       Whom  could           you   t   see   t 
.      [Q Wh]
     
78. Brody's (1997) analysis of such sentences also incorporates an 
interpretable Wh-feature. However, his [+Wh] seems to be more similar 
to my [Q] as he assumes [+Wh] to be loaded onto AUX even in yes/no 
questions. Furthermore, he assumes the feature to be carried by AUX 
(and not Wh-words) in Wh-questions, which is in sharp contrast with the 
analysis offered here because Brody's [+Wh] does not require the 
c-command condition on the scope of global features outlined earlier.

79. Given sharing assumptions, however, the ungrammaticality of (31a, 
b, and c) needs to be explained:

(31)
.        a.                    whom
.                             /    \ <Inflection>
.                            /       \
.                           / <Casenom>saw 
.                          /         /  \
.                         /         /   saw <Caseacc>
.                        /         /   /   \
.                    * Whom      you saw    t ?
.                     [Q Wh]
.
.
.        b.                  whom
.                           /     \ 
.                          /     whom<Inflection>
.                         /    /     \  
.                        /    /        \
.                       /    / <Casenom>could
.                      /    /       /     \
.                     /    /       /    could <Non-finite>
.                    /    /       /      /   \
.                   /    /       /      /    see <Caseacc>
.                  /    /       /      /     /  \
.                    * Whom   you   could   see  t ?
.                     [QWh]
.
.         c.                 whom
.                            /  \  
.                           /    \
.                          /    whom <Inflection>
.                         /    /    \  
.                        /    /       \  
.                       /    / <Casenom>could
.                      /    /        /   \ 
.                     /    /        /   could <Non-finite>
.                    /    /        /    /   \
.                   /    /        /    /    see <Caseacc>
.                  /    /        /    /    /   \
.            * Could  whom     you    t   see   t 
.                    [Q Wh]
.
In both (31a and b), the dominance condition on head position is 
violated. The finite verb 'saw' and the auxiliary 'could' are the 
legitimate heads of the relevant interrogative roots while in both 
cases it is the Wh-word that occupies the head position. (31c) is 
even worse than (31a and b) because both the c-command condition on 
scope and the dominance condition on head position are violated in 
this structure.  
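
As a rough illustration only (it is not part of the analysis itself), 
the c-command condition on scope appealed to above can be sketched in 
Python. The encoding is an assumption made here for the sake of the 
example: each tree is flattened to nested pairs of terminals, node 
labels and pooled features are ignored, [Q Wh] is taken to be carried 
by the Wh-word, and the function names are hypothetical.

```python
def terminals(tree):
    """List the terminals of a binary tree given as nested pairs of strings."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return terminals(left) + terminals(right)


def sister_of(tree, target):
    """Return the sister of the first terminal equal to `target`, or None."""
    if isinstance(tree, str):
        return None
    left, right = tree
    if left == target:
        return right
    if right == target:
        return left
    return sister_of(left, target) or sister_of(right, target)


def satisfies_scope(tree, carrier):
    """True if `carrier` c-commands every other terminal of `tree`."""
    sister = sister_of(tree, carrier)
    if sister is None:
        return False
    others = terminals(tree)
    others.remove(carrier)
    # The carrier c-commands exactly the terminals inside its sister.
    return sorted(others) == sorted(terminals(sister))


# Simplified terminal strings of (30b), (31b) and (31c); "t" marks a trace.
tree_30b = ("Whom", ("could", ("you", ("t", ("see", "t")))))
tree_31b = ("Whom", ("you", ("could", ("see", "t"))))
tree_31c = ("Could", ("whom", ("you", ("t", ("see", "t")))))

print(satisfies_scope(tree_30b, "Whom"))
print(satisfies_scope(tree_31b, "Whom"))
print(satisfies_scope(tree_31c, "whom"))
```

Under these assumptions the check succeeds for (30b) and (31b) and 
fails for (31c). This agrees with the discussion above: (31b) is ruled 
out by the dominance condition on head position alone, which this 
sketch does not model, whereas (31c) also violates the scope condition.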

80. Data from other Germanic languages seem to provide further support 
for this account of movement. In both German and Dutch, the finite verb 
is raised to the initial position of a yes-no question whereby the 
dominance and scope conditions are fulfilled. The sharing condition is 
also satisfied because the three projections of 'kauft' in (32) below 
are identical in structural features. In the case of Wh-interrogatives, 
Wh-raising is a forced move, too. Otherwise, the scope condition would 
be violated. Since German and Dutch data are syntactically identical in 
this regard, only German trees are provided below.

(32)  German interrogatives
.
.     a.                  kauft 
.                        /     \ 
.                       /        \ 
.                      / <Casenom>kauft
.                     /         /   \ 
.                    /         /   kauft <Caseacc>
.                   /         /   /   /\
.                  /         /   /  /____\
.               Kauft     Karl  t  das Buch?  
.               [Q]
.                'Does Karl buy the book?'
.
.      b.              kauft 
.                      /   \
.                     /   kauft 
.                    /   /    \ 
.                   /   /       \ 
.                  /   / <Casenom>kauft
.                 /   /         /  \
.                /   /         /    kauft <Caseacc>
.               /   /         /   /   \
.             Was kauft    Karl   t    t?  
.            [QWh]
.               'What does Karl buy?'
.
.
.
.       c.                 hat 
.                         /    \ 
.                        /       \
.                       / <Casenom>hat
.                      /       /    \
.                     /       /    hat <+n>
.                    /       /   /     \
.                   /       /   / <Caseacc>gekauft
.                  /       /   /        / \
.                 /       /   /        /    \
.                /       /   /        / \     \
.               /       /   /        /___\      \
.             Hat    Karl  t        das Buch  gekauft ?
.             [Q]
.                  'Has Karl bought the book?'
.

5.2.4  Multiple Questions and Movement

81. It is a common observation that a single-pair answer is impossible 
in English multiple questions like 'Who bought what?': it is 
infelicitous to ask such a question in a situation where, say in a 
store, someone sees somebody else buying an article of clothing but 
does not see who it is or what exactly is being bought. A pair-list 
answer, on the other hand, is felicitous when someone, say the store 
clerk who has been out for an hour, asks his assistant the same 
question expecting an answer like 'Mr Brown bought a jacket, Mrs Smith 
bought a sweater, ...' (see Grohmann 1999 for some other situations). 
In Japanese, Chinese, and Hindi, which are all wh-in-situ languages, 
either a single-pair or a pair-list answer is possible. Hagstrom (1998) 
argues that a single-pair multiple question is a set of propositions, 
while a pair-list question is a set of sets of propositions. The 
Q-morpheme in wh-in-situ languages is an existential quantifier that 
originates in a clause-internal position and then moves into CP. If Q 
moves from the lower wh-phrase, the result is a pair-list question. If 
it moves from a position higher than both wh-phrases, a single-pair 
question is the result.

82. French also permits both interpretations, depending on whether the 
in-situ or the Wh-movement strategy is used: single-pair answers are 
possible in French only with the in-situ strategy. "It is possible that the 
obligatoriness of syntactic movement of a wh-phrase to SpecCP for some 
reason forces the pair-list interpretation" (Boskovic, 1998). Then what
happens in Japanese is due to the semantically motivated movement of Q
while the overt Wh-movement in English is motivated by a strictly formal
syntactic requirement. Boskovic also observes some variation in multiple 
Wh-fronting languages. Bulgarian (in which the overt movement of a Wh-
phrase to SpecCPs is obligatory) patterns with English, and Serbo-
Croatian (no Wh-phrase overt movement to interrogative SpecCPs) with 
Japanese. Grohmann (1999) extends Boskovic's (1998) adaptation of 
Hagstrom's (1998) semantics to German. Both Wh-elements in a German 
multiple question move overtly with one of WHs targeting SpecFocP and 
the other the lower projection FP. Then German patterns with Bulgarian 
rather than English with regard to multiple questions.

83. Significantly, the data on multiple questions in Persian suggest 
that associating single-pair interpretations with wh-in-situ languages 
is not empirically borne out: Persian, a wh-in-situ language with a Q 
particle in the initial position to mark its interrogatives, seems to 
pattern with English rather than Japanese, Chinese, and Hindi. It 
normally affords only pair-list interpretations in multiple 
Wh-questions, which casts doubt on both Hagstrom's (1998) semantics of 
multiple questions and Boskovic's (1998) and Grohmann's (1999) 
adaptations.

84. A group of 40 adult native speakers of (Esfahani) Persian studying 
at Azad University were asked to indicate on a five-point scale how 
infelicitous a multiple question was in each of the two situations 
described below. In all cases, the Q particle 'yani' [18] was employed 
to signal the question rather than 'aya' because in informal Persian 
'yani' (or preferably no Q particle at all) is used in Wh-questions. 
Since multiple questions are rarely employed in written Persian, and
also because the use of '-o/-ro' as the cliticised form of 'ra' has 
certain important consequences for the felicitous interpretations of 
such questions, it was decided to adopt a more conversational style 
in writing the items in question. The ratings are tabulated for each 
case separately:

Situation I

You are in a store and off in the distance see somebody buying an 
article of clothing, but do not see who it is and do not see exactly
what is being bought. You go to the shop-assistant and ask:

    - Yani        ki  chi  kharid?
      Q-particle  who what bought
        'Who bought what?'
    
    - Ali ye pirhan kharid.
      Ali a  shirt  bought
      'Ali bought a shirt.'

Scale of infelicitousness:                0      1      2      3      4

Number of participants who
preferred each point on the scale:        0      3      2     17     18

Percentage of participants:              0%   7.5%     5%  42.5%    45%

Rating scores:                            0      3      4     51     72

Possible MAX:                           160

Possible MIN:                             0

Total:                                  130 (out of 160)


Situation II

You are paying a social visit to a newly-married couple in their apart-
ment. While having a friendly conversation about the wedding presents 
they received from their friends, you ask about both what they received 
and who sent them each:
  
     - Yani        ki  chi  avord?
       Q-particle  who what brought
       'Who gave you what?'

     - Ali ye sa'at avord,   Maryam ye angoshtar avord,   Mina ye goldan 
       Ali a  clock brought, Maryam a  ring      brought, Mina a  vase    
       
       avord, ... .
       brought

       'Ali gave us a clock, Maryam gave us a ring, Mina gave us a 
        vase, ... .'

Scale of infelicitousness:                0      1      2      3      4

Number of participants who
preferred each point on the scale:       15     15      2      5      3

Percentage of participants:           37.5%  37.5%     5%  12.5%   7.5%

Rating scores:                            0     15      4     15     12

Possible MAX:                           160

Possible MIN:                             0

Total:                                   46 (out of 160)


85. The totals (130 against 46, each out of a possible 160) suggest 
that the SP-reading of a multiple question in Persian is rated about 
2.83 times as infelicitous as its PL-reading (130/46 = 2.83). Then:

(33)
    # a. Yani ki chi kharid? (infelicitous '#')
            (single-pair answer)

      b. Yani ki chi avord? (felicitous)
       (pair-list answer)
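
The totals and the ratio reported for Situations I and II can be 
re-derived from the tabulated counts. The following Python lines are 
only a minimal arithmetic check; the variable and function names are 
introduced here for illustration, and the weighting (scale point times 
number of raters) is the one implied by the 'Rating scores' rows.

```python
# Scale points on the 0-4 infelicity scale and the number of raters
# choosing each point, copied from the tables for Situations I and II.
SCALE = [0, 1, 2, 3, 4]
counts_single_pair = [0, 3, 2, 17, 18]   # Situation I
counts_pair_list = [15, 15, 2, 5, 3]     # Situation II


def infelicity_total(counts):
    """Weighted sum: each scale point times the number of raters choosing it."""
    return sum(point * n for point, n in zip(SCALE, counts))


def percentages(counts):
    """Share of raters at each scale point, in percent."""
    total = sum(counts)
    return [100 * n / total for n in counts]


total_sp = infelicity_total(counts_single_pair)
total_pl = infelicity_total(counts_pair_list)
ratio = total_sp / total_pl

print(total_sp, total_pl, round(ratio, 2))
print(percentages(counts_single_pair))
print(percentages(counts_pair_list))
```

The first line printed is 130 46 2.83, matching the totals in the 
tables and the ratio cited in paragraph 85.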


Despite that, some other data from Persian multiple questions seem to
be more in harmony with what happens in other wh-in-situ languages:


Situation III

While resting in her office, the teacher notices that some of her 
students are hitting their friends. Not wearing her glasses, she fails 
to identify them. Later she goes to her class and asks:

      - Yani Ki  ki-o  zad?
        Q    who whom hit
         'Who hit whom?'

      - Hasan Ali-o  zad, Hamid Arash-o  zad, Sina Pedram-o  zad, ... .
        Hasan Ali-DO hit, Hamid Arash-DO hit, Sina Pedram-DO hit
        'Hasan hit Ali, Hamid hit Arash, Sina hit Pedram, ... .'
        (felicitous, pair-list answer) 
        

Situation IV

While resting in her office, the teacher notices that one of her stu-
dents is hitting his friend. Not wearing her glasses, she fails to 
identify them. Later she goes to her class and asks:
        
      - Yani Ki  ki-o  zad?
        Q    who whom hit
         'Who hit whom?'

      - Hasan Ali-o  zad.
        Hasan Ali-DO hit
        'Hasan hit Ali.'
        (felicitous, single-pair answer) 


Situation V

Ali knows that his friend Hasan used to have three cars on sale: a white 
B.M.W, a red Chevrolet, and a black Jaguar. Hasan tells him that a close 
friend of Ali's bought one of these three cars last week. Ali wants to 
know who he was and what car he bought:

     - Yani ki  chi-o kharid?
       Q    who what  bought
        'Who bought what?'

     - Hamid Jaguar-o   kharid.
       Hamid Jaguar-DO  bought
       'Hamid bought the Jaguar.'
       (felicitous, single-pair answer)


Situation VI

Back from his holiday, Hasan notices that his assistant has sold all the
cars on sale. Hasan asks:

     -Yani ki chi-o kharid?
      
     -Hamid Jaguar-o  kharid, Bahram Chevrolet-o  kharid, Ahmad B.M.W-ro
      Hamid Jaguar-DO bought, Bahram Chevrolet-DO bought, Ahmad B.M.W-DO
      
      kharid, ...
      bought
      (felicitous, pair-list answer) 


86. Interestingly, in all the multiple questions with a SP-reading the 
direct object marker 'ra' (cliticised as '-o' or '-ro' in spoken 
Persian) follows the Wh-phrase that is the internal argument of the 
verb. Although 'ra' carries an ACC feature, which will be shared by the 
object, there also seems to be a [+specific] feature borne by 'ra'. 
Browne (1970) assumes that Persian 'ra' is comparable with Turkish '-i' 
in this respect as both carry the feature, and as a result make the 
object specific, too. Karimi (1989, 1990) follows Browne in this regard 
and considers 'ra' a specificity marker for the direct object. She 
further argues that the interrogative element 'chi' may or may not 
cooccur with 'ra' when it is in the object position. If it does, it 
receives a specific reading:

  (34)   
      a. emruz ketab xarid-am
         today book  bought+1sgS
         'I bought books  today.'

      b. chi-(*ro)  xaridi?
         what       bought+2sgS
         'What did you buy?'
                                (from Karimi, 1990:149.30)

  (35)
      a. ketab-i-ro    ke   be to  gofte bud-am   xarid-am
         book+indef+ra that to you told  was+1sgS bought+1sgS
       'I bought the book that I had told you (about).'
      b. Chi-ro  xaridi?
         what+ra bought+2sgS
       'What did you buy?'
                                (from Karimi, 1990:149.31)

87. Then in '(Yani) ki chi kharid?' both 'ki' and 'chi' are [-specific],
while in '(Yani) ki chio kharid?' the first Wh-word is [-specific] 
but the second is [+specific]. It follows that in (36a) below, the only 
possible interpretation is a non-specific, generic one. It actually 
means 'someone, who can be anyone and I want to know who it is, bought 
something, which can be anything and I want to know what it is'. In 
(36b), on the other hand, what is bought is marked as specific, which 
might suggest that the subject 'ki' is specific, too. However, 'ki' in 
(36b) is [-specific] and can take scope over 'chio' in that for each 
non-specific person picked out by 'ki', there can be a [+specific] 
'chio' object. The question is therefore ambiguous: if the speaker 
decides that the specificity of the Wh-word 'chio' lies within the 
non-specificity of the c-commanding Wh-word, i.e. 'ki', then the 
sentence will still have a non-specific pair-list interpretation. 
Otherwise, the specificity of the Wh-word 'chio' makes a single-pair 
interpretation possible. 

(36)
     a.  Ki           chi          kharid?
       [-specific]  [-specific]

     b.  ki            chio         kharid?
       [-specific]   [+specific]

88. The ambiguity of (36b) is comparable with that of a sentence with 
two quantifiers; one a universal quantifier in a c-commanding position,
and the other an existential quantifier in a lower position. Then 
'everyone loves someone' is ambiguous in scope as it may be paraphrased
either as 'everyone has somebody or other that one loves' or 'there is
some particular individual whom everyone loves'. The data provided by
Fox and Sauerland (1997) suggest that a generic tense/context can also 
be behind such scope effects. Then ignoring tense, (37a) is identical to 
its counterpart (37b). (37b) is generic while (37a) is not:

(37)
     a. Yesterday, a guide ensured that every tour to the Louvre was 
        fun.
     
     b. In general, a guide ensures that every tour to the Louvre is 
        fun.
                                   (from Fox and Sauerland, 1997)

89. Similarly, it may be hypothesised that the feature [+/- specific] 
carried by Wh-phrases in multiple questions determines PL and/or SP 
reading(s) of the question. Then the differences among such languages 
as English, Japanese, Bulgarian, and Persian may be due to certain dif-
ferences in the featural composition of their Wh-phrases rather than 
the semantically motivated movement of Q, or the Wh-phrase overt move-
ment to interrogative SpecCPs. The hypothesis correctly predicts that 
for a Persian multiple question with Wh-phrases like 'koja' (where), 
'kay' (when), and 'chera' (why), a pair-list answer is obligatory as no 
DO-marker can be attached to such Wh-phrases:

(38)
    a. Ki       kay      raft?
       who      when     went
       [-spec]  [-spec]
       'Who went when?'

    b. Ki       koja     raft?
       who      where    went
       [-spec]  [-spec]
       'Who went where?'

    c. Ki       chera    raft?
       who      why      went
       [-spec]  [-spec]
       'Who went why?'


90. Interestingly enough, for Persian it is the number of Wh-phrases 
carrying each value of the [+/- specific] feature (irrespective of 
their structural height) that ultimately determines the 
pair-list/single-pair reading of the sentence. Then in a sentence with 
two [-specific] Wh-phrases, the pair-list reading is obligatory. When 
one of these two Wh-phrases but not the other is [-specific], both PL 
and SP readings are possible. Finally, in a sentence with one 
[+specific] Wh-phrase and two [-specific] ones, a PL reading is again 
obligatory:

(39)
   a. (Aya/Yani) ki     chi     kharid? (pair-list answer)
               [-spec] [-spec]
   b. (Aya/Yani) ki     chio   kharid? (pair-list/single-pair answer)
               [-spec] [+spec]
   c. (Aya/Yani) ki     chio    koja    kharid? (pair-list answer)
               [-spec] [+spec] [-spec]
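
The generalisation illustrated in (39) can be restated as a small 
decision rule. The Python sketch below is only one possible encoding 
and rests on assumptions made here for illustration: each Wh-phrase is 
reduced to a boolean (True for [+specific], False for [-specific]), the 
function name is hypothetical, and configurations not discussed in the 
text (fewer than two Wh-phrases, or no [-specific] Wh-phrase at all) 
are rejected rather than guessed at.

```python
def predicted_readings(wh_specificity):
    """Readings predicted for a Persian multiple question, following (39).

    wh_specificity: list of booleans, one per Wh-phrase,
    True for [+specific] and False for [-specific].
    """
    if len(wh_specificity) < 2:
        raise ValueError("a multiple question needs at least two Wh-phrases")
    non_specific = wh_specificity.count(False)
    if non_specific >= 2:
        # Two or more [-specific] Wh-phrases: pair-list only, as in (39a, c).
        return {"pair-list"}
    if non_specific == 1:
        # One [-specific] and one [+specific] Wh-phrase: ambiguous, as in (39b).
        return {"pair-list", "single-pair"}
    raise ValueError("configuration not covered by the examples in (39)")


print(predicted_readings([False, False]))         # (39a) ki chi
print(predicted_readings([False, True]))          # (39b) ki chio
print(predicted_readings([False, True, False]))   # (39c) ki chio koja
```

Structural height plays no role in the function, in line with the 
claim that only the count of [+/- specific] Wh-phrases matters in 
Persian.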

91. Cross-linguistic variation is then possible with regard to three 
different factors:

  (a) A Wh-phrase can be either [-specific] or [+specific]. For a lan-
      guage like Persian both types are possible. Some other languages
      are conceivable that afford only a [-specific] feature.

  (b) A marker marks the (non)specificity of the Wh-phrase.  For
      Persian, it is the DO marker 'ra' that marks the Wh-phrase as
      [+specific]. The negative value of the feature is available by
      default. Then Persian [+specific] wh-phrases can only occupy
      the object position. For other languages, completely different
      markers are conceivable.

  (c) The structural height of the Wh-carrier of [+/- specific] feature
      may or may not matter in determining the PL/SP reading of the 
      question. In Persian, the structural height of the Wh-phrase does
      not matter. 

6. Conclusion

92. The Pooled Features Hypothesis is a unitarianist hypothesis. It dis-
penses with LF in its generative account of movement, which is always 
overt because the well-formedness conditions have to be satisfied first 
in order to license the derived structure. It is only then that a 
sentence can be interpreted by performance systems. Therefore there is 
no need to postulate any LF interface level to explain how the well-
formedness of a sentence is guaranteed after Spell-Out with no PF 
realisation. Moreover, it is possible to think of some functional 
explanation for feature-sharing as it presumably makes a sentence easier 
to process (see par. 63). It follows that the Pooled Features Hypothesis 
is unitarianist in a third sense, too: since feature-sharing is both an 
economy requirement on production and a processing facility, the 
hypothesis can "unify" the speaker and the listener in a single model. 
The speaker saves time and energy as she minimises the number of 
features to be mapped onto the lexical array. She must also benefit from 
the formal links, i.e. features to be pooled, between the adjacent LIs 
and/or their phrasal projections in her selection of items from the 
lexicon: hence, easier to produce 'She will be doing that' than 'she be 
that will doing'. The listener has her own share of sharing as (due to 
the links mentioned above) she finds the former easier to process than 
the latter.

Notes

[1] As May points out, (1a) is ambiguous as it could mean 'the men
introduced each other to everyone that the women introduced the men
to'. Then we could have (2') instead of (2):

(2') The men-j introduced each other-j to everyone-i that the women-k
     did [VPellip introduce them-j to e-i].

In both (2) and (2'), however, the elliptical VP could still be
hypothesised to contain the elements it does, with their phonetic
content deleted.

[2] That the two copies of 'each other' in (2) differ in their
reference would pose no problem, as the two copies of 'each other' are
not co-referential in (1b) either.

[3] Isn't it too late to criticise generativists for what they did
about two decades ago? To me, the answer is 'no'. The argument still
has a target to address: although Minimalist syntax today is radically
different from GB in both its conceptualisation and manipulation of
logical forms, the empirical motivation for this level of
representation goes back to the GB framework. It appeared in MP as an
unquestionable assumption of the field. Chomsky (1993) had already
dispensed with the internal interfaces S-structure and D-structure.
Perhaps the move was so radical that the existence of the interface
level (LF) was not doubted at all; hence there seemed to be no need to
look for further existence proofs for LF. Although this does not
necessarily make matters worse, it does explain why I have targeted the
GB empirical evidence for LF.

[4] Extending Chomsky's conclusion, Barss (1986) had already argued for
the satisfaction of Principle A at s-structure, too. Otherwise, (i) 
would be grammatical with the anaphor licensed at LF:

(i) * David-i wonders who showed which picture of himself-i to Mary.

[5] Also note the direction of arrows on Uriagereka's diagram.

[6] As far as I remember, nowhere in Chomsky (1995), (1998), or (1999) 
is any reference made to the listener's reconstruction of LF based on PF 
as a possible interpretation of the model. There is a strong tendency in 
Chomsky (1999), however, to give more weight to interactions between 
syntax and phonology, e.g. the phonological edge, leftward TH/EX as a 
function of the phonological component, etc. Although I welcome this shift 
of interest from LF to PF, it seems to widen the existing gap between 
these two levels, which makes Chomsky's model even less comprehensible 
than before. Anyway, I discuss the issue of reconstruction as a mere 
possibility.
                                     
[7] The LF-to-PF mapping may be equated to language learning to
answer this question. The answer, however, is not satisfactory because
this "explanation" could be equally used for relating any other LF
with any other PF that the linguist wishes to link to save his 
theories. It simply displaces the problem as now one can ask how the
learner can learn about such mismatches. Because the PLD is not rich
enough to show the mismatches, one should hypothesise some innate
mechanism--like parameterised mapping--to explain the problem of learn-
ing inherent in this case. But even this cannot be the whole answer. To 
be more specific, a language-user encounters the sentence 'Jane knows 
that Mary distrusts herself'. Due to his innate knowledge of the binding 
principles, he assumes 'Mary' and the anaphor to be coreferential. 
However, if the pragmatics of the language repeatedly proves 'Mary'
in such cases to be co-indexed with 'Jane', then he sets a parameter,
like Wexler and Manzini's (1987) Governing Category Parameter, at the
relevant marked value to take care of the data. So far so good. But
what LF-advocates expect the learner to do with regard to LF-to-PF
mapping is next to a miracle. Encountering the sentence 'the men
introduced each other to everyone that the women did', he raises the
quantifier to derive its logical form: hence, 'each other' can also be 
coreferential with 'the women' after Quantifier Raising. This presumably 
needs no learning since, for Chomsky, LF is universal and hence most 
probably innately available. What is miraculous in this case is the 
learner's knowledge NOT to do the same thing to other sentences like 'He 
phoned everyone that John knew'. Otherwise, 'he' and 'John' could be 
coreferential after QR at LF, which is impossible in English.

[8] I say "tough" because I doubt that the standard generative 
methodology, which relies on the linguist's intuitions and grammatical 
judgements, is always adequate for empirically addressing the
cognitively inclined questions put forward by minimalists, such as those
raised by Chomsky in his introduction to MP, e.g. "what conditions are
imposed on the language faculty by virtue of [...] its place within the
array of cognitive systems of the mind/brain [...]" (1995:1)?

[9] This part of the hypothesis will be weakened later to take care of 
sound-meaning and sound-syntax correspondences with regard to prosodic 
features of language.

[10] See Kendon (1972, 1991), McNeill (1992), Browman and Goldstein 
(1992), Allott (1994) for relations between language and gestures.

[11] In actual performance, the listener's C-I system "reads" a copy of 
SPF perceived (via the listener's A-P system) as a "reconstruction" of 
the speaker's copy. Hence, one single SPF with two actual copies: the 
speaker's, and the listener's.

      SPEAKER:                        |                         
                                      V                         
                     A-P. . . . . .> spf1 
                   system                
                     |                  
                     |    
                     V
               ACTUAL SPEECH 
      - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
      LISTENER:      /\ 
                     . 
                     .
                     . 
                    A-P
                  system - - - - - -> spf2
                                       /\
                                       .
                                       .
                                      C-I
                                    system

 interpret . . .>  
 produce   - - ->


At any rate, spf1 and spf2 must be similar enough in order for actual
speakers and hearers to communicate. The listener's A-P system can 
manage the reconstruction of spf1 as spf2 due to the similarities 
between the speaker's and the listener's A-P systems. The SPF model 
(depicted in Figure 4) captures this with its assumption of a single 
interface level at which a derivation becomes accessible to both the 
speaker and the listener. The LF-to-PF mapping is dispensed with.
                                                           
[12] An interesting empirical question to ask is whether the listener's 
A-P and motor systems can also access SPF via the listener's copy:
                    
                .
                .    
               A-P                                    Motor
             system - - - - - - > spf2 < . . . . . . system
                .                /\ /\
                . . . . . . . . .:  .         
                                    .
                                    .
                                   C-I
                                 system



If yes, the listener may subconsciously "echo" the speaker's speech 
and/or "mirror" her gesticulations and physical movements even when the 
listener has no visual access to the speaker's performance. Even if 
true, such "echoing-mirroring" behaviour need not be easily
observable under normal circumstances. It could be as silent as lip 
movements, muscle tension, or laryngeal movements.

[13] As mentioned earlier, a stronger version of the hypothesis, which 
requires ANY formal feature to be ALWAYS selected only once and then 
shared by ALL relevant lexical items, is obviously false. Otherwise, as 
one of the referees notes, a sentence like 'all the dog-s which Mary 
feeds will bite chicken-s' will be problematic because the plural 
feature in this case is selected twice. 'Dogs' and 'chickens' cannot 
share the feature because in order to afford that they first need to be 
structurally adjacent, which is barred due to the feature <Caseacc> 
shared by 'bite' and 'chickens'.


[14] Pooled features appear between < >; others between [ ]. In (6a), 
<Casenom> is shared by 'he' and 'may', and <Inf> by 'may' and 'marry'.
Pooled features appear here on that side of a tree node that is closer 
to the LI/NODE to function as a partner: hence, <Inf> to the right and 
<Casenom> to the left of the first and second projections of 'may' 
respectively.

[15] Chomsky (1998, 1999) finally abandoned the principle Procrastinate. 
He thinks the principle can no longer be formulated as before, as the 
overt-covert distinction has collapsed (Chomsky, 1999:12). The
principle is abandoned because it is dispensable as another case of
look-ahead (1998:49). He even advocates "something like the opposite:
perform computations as quickly as possible, the 'earliness principle'
of Pesetsky (1989)" (Chomsky, 1999:12). Since the concept of strength
was originally introduced to explain the violation of the principle, he 
concludes that strength has no place in this framework either. Later 
I argue that Chomsky's new term "EPP-features" is essentially the same 
(at least in the terms discussed here) as "strong features".

[16] The feature must normally have some SPF realisation, however.
Otherwise, the listener simply has no chance to become aware of it. A 
suprasegmental like stress or intonation can be a convenient SPF
realisation of such isolated formal features without a lexical item 
specifically selected to bear the feature in question. In Persian,
for instance, the lexical item 'aya' can be dropped from a yes/no 
question. Despite the absence of this lexical device to carry [Q], the
rising intonation of the sentence makes the listener understand the 
presence of [Q] because even in the 'aya'-less question, the feature [Q] 
still has some SPF realisation.

[17] See Section 5.2.3.

[18] Unlike 'aya', 'yani' is an Arabic word which has been borrowed
into Persian. In standard formal Persian, the word means 'means/is 
equal to'. Thus:

   'Rehlat' yani 'margh'.
    demise  means death
    ' 'Demise' means 'death'.'

Even so, and unlike Persian verbs, it can neither be inflected nor
occur in the final position of the sentence. In spoken Persian, 
on the other hand, 'yani' also replaces 'aya' in both yes/no and Wh-
questions:

      Hasan raft.
      Hasan go+3sg+past
      'Hasan went.'      
      
      Yani Hasan raft?
           Hasan went
      'Did Hasan go?'

      Yani Hasan koja  raft?
           Hasan where went
      'Where did Hasan go?'


References

Aarons, D., B. Bahan, J. Kegl, and C. Neidle. (1992). Clausal Struc-
          ture and a Tier for Grammatical Marking in ASL. Nordic Journal 
          of Linguistics 15:103-142.
Allott, R. (1994). Gestural Equivalence (Equivalents) of Language.
          Berkeley, [http://www.percep.demon.co.uk/index.htm].
Barss, A. (1986). Chains and Anaphoric Dependence: On Reconstruction
          and its Implications. Doctoral dissertation, MIT.
Boskovic, Z. (1998). On the Interpretation of Multiple Questions.
          Chomsky Celebration Website, [http://mitpress.mit.edu/\
          celebration].
Brody, M. (1995). Lexico-Logical Form: A Radically Minimalist Theory.
          Cambridge, MA: MIT Press.
Brody, M. (1997). Perfect Chains. In Elements of Grammar: Handbook of
          Generative Syntax, ed. L. Haegeman. Dordrecht: Kluwer.
Browne, W. (1970). More on Definiteness Markers: Interrogatives in
          Persian. Linguistic Inquiry 1:359-363.
Browman, C. and L. Goldstein. (1989). Gestural Structures and Phonolo-
          gical Patterns. Status Report on Speech Research SR-97/98 pp. 
          1-23. New Haven, Conn.: Haskins Laboratories.
Browman, C. and L. Goldstein. (1991). Gestural Structures: Distinctive-
          ness, Phonological Processes, and Historical Change. In Modu-
          larity and the Motor Theory of Speech Perception, eds. 
          Mattingly, I. M. and M. Studdert-Kennedy.
Browman, C. and L. Goldstein. (1992). Articulatory Phonology: An over-
          view. Phonetica 49:155-180.
Bybee, J. (1985). Morphology: A Study of the Relation Between Meaning
          and Form. Philadelphia: Benjamins.
Calvin, W. and D. Bickerton. (2000). Lingua ex Machina: Reconciling
          Darwin and Chomsky with the Human Brain. Cambridge, MA:
          MIT Press.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA:
          MIT Press.
Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht:
          Foris.
Chomsky, N. (1993). A Minimalist Program for Linguistic Theory. In The
          View from Building 20: Essays in Linguistics in Honor of
          Sylvain Bromberger, eds. Hale, K. and S. Jay Keyser, 
          Cambridge, MA: MIT Press.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. (1998). Minimalist Inquiries: The Framework. MIT Occasional 
          Papers in Linguistics 15, Cambridge, MA: MIT Press.
Chomsky, N. (1999). Derivation by Phase. MIT Occasional Papers in Lin-
          guistics 18, Cambridge, MA: MIT Press.
Collins, A. M. & M. R. Quillian (1969). Retrieval Time from Semantic 
          Memory. Journal of Verbal Learning and Verbal Behavior 8. 
          240-247.
Conrad, C. (1972). Cognitive Economy in Semantic Memory. Journal of
          Experimental Psychology 92. 149-154.
Estes, W. K. (1986). Array Models for Category Learning. Cognitive 
          Psychology 18. 500-549.
Fox, D. and U. Sauerland. (1997). Illusive Scope of Universal
          Quantifiers, [http://web.mit.edu/afs/athena.mit.edu/user/s\
          /a/sauerlan/www/Scope_Illusions.html].
Givon, T. (1979). On Understanding Grammar. New York: Academic Press.
Givon, T. (1984). Syntax: A Functional-Typological Introduction. 
          Amsterdam: Benjamins.
Givon, T. (1995). Functionalism and Grammar. Amsterdam: Benjamins.
Gluck, M. A. (1991). Stimulus Generalization and Representation in 
          Adaptive Network Models of Category Learning. Psychological 
          Science 2. 50-55.
Gould, S.J. (1991). Exaptation: A Crucial Tool for Evolutionary Psycho-
          logy. Journal of Social Issues 47, 43-65.
Grohmann, K. (1999). German is a Multiple Wh-Fronting Language! 
          Colloque de syntaxe et semantique a Paris 3.
Haegeman, L. (1991). Introduction to Government and Binding Theory. 
          Basil Blackwell.
Hagstrom, P. (1998). Decomposing Questions. Doctoral dissertation, MIT.
Haiman, J. (1985). Natural Syntax. Cambridge: Cambridge University 
          Press.
Hale, K. (1998). Conflicting Truths. In Functionalism and Formalism in 
          Linguistics, eds. M. Darnell, E. Moravcsik, F. Newmeyer, M. 
          Noonan, and K. Wheatley. Philadelphia: Benjamins.
Halliday, M. A. K. (1970). Language Structure and Language Function.
          In New Horizons in Linguistics, ed. J. Lyons. Baltimore:
          Penguin Books.
Halliday, M. A. K. (1973). Explorations in the Functions of Language.
          London: Edward Arnold.
Hawkins, J. A. (1989). Competence and Performance in the Explanation of
          Language Universals. In Essays on Grammatical Theory and 
          Universal Grammar, eds. D. Arnold, M. Atkinson, J. Durand,
          C. Grover, and L. Sadler. Oxford University Press.
Hawkins, J. A. (1994). A Performance Theory of Order and Constituency.
          Cambridge: Cambridge University Press.
Jespersen, O. (1993). Progress in Language with Special Reference to
          English. Philadelphia: Benjamins.
Kaiser, L. (1998). Representing the Structure-Discourse Iconicity of
          the Japanese Post-Verbal Construction. In Functionalism and
          Formalism in Linguistics, eds. M. Darnell, E. Moravcsik, F.
          Newmeyer, M. Noonan, and K. Wheatley. Philadelphia: Benjamins.
Karimi, S. (1989). Aspects of Persian Syntax, Specificity, and the 
          Theory of Grammar. Doctoral dissertation, University of 
          Washington.
Karimi, S. (1990). Obliqueness, Specificity, and Discourse Functions:
          Ra in Persian. Linguistic Analysis 20:139-191.
Kendon, A. (1972). Some Relationships between Body Motion and Speech: An 
          Analysis of One Example. In Studies in Dyadic Communication, 
          eds. A. W. Siegman and B. Pope. New York: Pergamon.
Kendon, A. (1991). Revisiting the Gesture Theory of Language Origins. 
          Paper for LOS Meeting, De Kalb, Illinois.
Komatsu, L. K. (1994). Experimenting with the Mind: Readings in 
          Cognitive Psychology. California: Brooks/Cole Publishing 
          Company.
Kruschke, J. K. (1992). ALCOVE: An Exemplar-Based Connectionist Model
          of Category Learning. Psychological Review 99. 22-44.
Kuno, S. (1973). The Structure of the Japanese Language. Cambridge, MA:
          MIT Press.
Kuno, S. (1978). Japanese: A Characteristic OV Language. In Syntactic
          Typology, ed. W. Lehmann. Austin: University of Texas Press.
Liberman, A. M. (1993). Haskins Laboratories Status Report on Speech
          Research 113:1-32.
Lindblom, B., S. Guion, S. Hura, S. Moon, and R. Willerman. (1995). 
          Is Sound Change Adaptive? Rivista di Linguistica 7:5-37.
Lotfi, A. R. (to appear). Minimalist Program Revisited: Chomsky's 
          Strength to Trigger Movement. Proceedings of the 34th Col-
          loquium of Linguistics.
May, R. (1985). Logical Form: Its Structure and Derivation. Cambridge, 
          MA: MIT Press.
May, R. (1991). Syntax, Semantics, and Logical Form. In The Chomskyan
          Turn, ed. A. Kasher. Oxford: Blackwell.
McClelland, J. L. & D. E. Rumelhart (1985). Distributed Memory and 
          the Representation of General and Specific Information. Journal 
          of Experimental Psychology 114. 159-188.
McNeill, D. (1992). Hand and Mind: What Gestures Reveal about Thought. 
          Chicago: University of Chicago Press.
Meinunger, A. (1998). Topicality and Agreement. In Functionalism and
          Formalism in Linguistics, eds. M. Darnell, E. Moravcsik, F.
          Newmeyer, M. Noonan, and K. Wheatley. Philadelphia: Benjamins.
Nettle, D. (1998). Functionalism and Its Difficulties in Biology and
          Linguistics. In Functionalism and Formalism in Linguistics, 
          eds. M. Darnell, E. Moravcsik, F. Newmeyer, M. Noonan, and K. 
          Wheatley. Philadelphia: Benjamins.
Newmeyer, F. (1998a). Language Form and Language Function. Cambridge,
          MA: MIT Press.
Newmeyer, F. (1998b). Some Remarks on the Functionalist-Formalist 
          Controversy in Linguistics. In Functionalism and Formalism in 
          Linguistics, eds. M. Darnell, E. Moravcsik, F. Newmeyer, M. 
          Noonan, and K. Wheatley. Philadelphia: Benjamins.
Pesetsky, D. (1989). Language-Particular Processes and the Earliness
          Principle. Ms. MIT.
Place, U. T. (2000). The Role of the Hand in the Evolution of Language.
          Psycoloquy 11(007), [http://www.cogsci.soton.ac.uk/cgi/psyc/n\
          ewpsy?11.007].
Rips, L. J., E. J. Shoben, &  E. E. Smith (1973). Semantic Distance and 
          the Verification of Semantic Relations. Journal of Verbal 
          Learning and Verbal Behavior 12. 1-20.
Roberts, I. and A. Roussou. ms. Interface Interpretation. University of
          Stuttgart.
Schyns, P. G. (1991). A Modular Neural Network Model of Concept 
          Acquisition. Cognitive Science 15. 461-508.
Shanks, D. R. (1991). Categorization by a Connectionist Network. Journal 
          of Experimental Psychology: Learning, Memory, and Cognition 17.
          433-443.
Uriagereka, J. (1998). Rhyme and Reason: An Introduction to Minimalist
          Syntax. Cambridge, MA: MIT Press.
Vallduvi, E. (1992). The Informational Component. New York: Garland.
Wilbur, R. (1998). A Functional Journey with a Formal Ending: What do
          Brow Raises Do in American Sign Language? In Functionalism and
          Formalism in Linguistics, eds. M. Darnell, E. Moravcsik, F.
          Newmeyer, M. Noonan, and K. Wheatley. Philadelphia: Benjamins.
Zubizarreta, M.L. (1998). Prosody, Focus, and Word Order. Cambridge, MA:
          MIT Press.

Author's address:     Department of English Language
                      Azad University at Khorasgan,
                      Esfahan, IRAN.
                      E-mail: lotfi@www.dci.co.ir