Monday, September 24, 2012

Sentential fusion

Aside from anaphora, the chief way that languages achieve efficiency is by fusing sentences at their common parts, replacing conjoined sentences by single sentences with conjoined parts and thus eliminating repetition: from "Dogs eat and dogs drink" to "Dogs eat and drink", for example. Given the SOV (or OSV) nature of developing Xorban and other LoCCan3s, examples like this are fairly easy: place the argument specifiers out front and then conjoin the predicate bits: la grka je ctka pnxa. However, when it is the arguments that vary while the predicate remains constant, the task is a little harder -- or at least less obvious.

To start with a simple case: "Dogs and cats eat meat". Even leaving "eat meat" unanalyzed for now, the question is how to begin. The obvious attempt, je la grka le mlte ctkV, runs into two problems immediately: it parses as incomplete ((lg(lmc)) and it is unclear what vowel to put at the end of ctk, since either a or e will leave the other group out. Dropping the je would relieve the first problem but make the two quantifiers subordinate one to another rather than parallel -- which doesn't matter for l, but would for others -- and doesn't solve the V problem. When dealing only with l, as here, it is tempting to think that, since the two lines are parallel, we could give them the same variable: la grka la mlta ctka. But the loss of the overt parallelism (or, rather, the failure to show it at all) means that the variable a has officially been rebound, and we end up talking only about cats, with a vacuous mention of dogs. Or we could fuse the two groups into one: la grka le mlte li gnikake ctki. But this solution only works for sentences joined by "and". We need a solution that will work for "or" or even "if ... then ...", and one that can be generalized to more than a couple of arguments.
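
The rebinding problem can be pictured with a toy model (not a Xorban parser): treat each quantifier phrase as a (variable, restriction) pair and read the pairs left to right into a table. The function name and the pair representation are assumptions made purely for illustration.

```python
def bind_prefix(terms):
    """Read (variable, restriction) pairs left to right.

    Returns the final bindings and a list of earlier bindings that
    were overwritten by a later quantifier on the same variable.
    Hypothetical helper, for illustration only.
    """
    bindings = {}
    shadowed = []
    for var, restriction in terms:
        if var in bindings:
            # A later quantifier on the same variable rebinds it.
            shadowed.append((var, bindings[var]))
        bindings[var] = restriction
    return bindings, shadowed


# "la grka la mlta ctka": both quantifiers use the variable a.
same_var = bind_prefix([("a", "grk"), ("a", "mlt")])
# "la grka le mlte ...": distinct variables, so nothing is lost.
distinct = bind_prefix([("a", "grk"), ("e", "mlt")])
```

In the first call only the cats survive as the value of a, and the dogs show up in the list of overwritten bindings, which is the "vacuous mention of dogs" described above.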

It would seem that, to do this, we have to move away from the strictly logical into encryption of some sort. In the present system, no strictly logical system works, since the requirements for fusion interfere with the logical structure based on separation. So we need to encode both the separation and the fact that the separated terms are parallel but share the predicate under a certain truth functional situation. This looks like a three step process: 1) mark off the separate parts, 2) join them in the appropriate sentential way, and 3) indicate that they come together at the predicate. So we mark each block with X (tbd), join blocks with connectives (I think the sentential ones can be used here unambiguously -- if followed immediately by a block, it joins blocks), and then use some fusion of variables to mark the coincidence at the predicate. The easiest fusion of variables is a new variable which is just V1'V2 (...) for those used before. There are a couple of problems: the preassigned cases of a'e and o'e, but those can probably be worked around somehow. Of course, the new variables are not real -- they are strictly unbound and cannot be bound, they are mere code. So we could end up with je X le grke X lo mlto ctke'o. Of course, on a syllable count, it is not clear that this is an actual saving over je la grka ctka le mlte ctke, but perhaps more complex cases give further advantages.
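
The three steps can be sketched as plain string assembly. This is only an illustration under simplifying assumptions: each block holds a single quantifier phrase, the predicate is given as a bare stem, the function names are made up, and nothing here works around the preassigned a'e and o'e.

```python
def fuse_variables(variables):
    """Join the block variables with apostrophes: ['e', 'o'] -> "e'o"."""
    return "'".join(variables)


def fuse_sentence(connective, blocks, predicate_stem, marker="X"):
    """Build a fused sentence from parallel one-term blocks.

    Each block is (quantifier_phrase, variable), e.g. ("le grke", "e").
    Hypothetical helper, not part of any Xorban tool.
    """
    # Step 1: mark off each block with the block marker.
    marked = " ".join(f"{marker} {phrase}" for phrase, _ in blocks)
    # Step 3: fuse the block variables into one code variable.
    fused = fuse_variables([var for _, var in blocks])
    # Step 2: the connective joins the blocks; the predicate comes last.
    return f"{connective} {marked} {predicate_stem}{fused}"


fused = fuse_sentence("je", [("le grke", "e"), ("lo mlto", "o")], "ctk")
```

With these inputs the result is the fused sentence given above, je X le grke X lo mlto ctke'o.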

This could easily be expanded to the next more complex cases, where subjects and objects run in parallel, by just marking the ends of blocks as well as the start.  Again, the question of actual efficiency arises; ordinary languages just leave stuff out, without marking the absence in any formal way.  But we can go from  je la mlta le ldre pnxake li xrmi lo djco pnxiko to je X la mlta le ldre Y X li xrmi lo djco Y pnxa'eki'o (correcting for the a'e conflict).  And this can be carried to almost any length, with, I fear, increasing loss of clarity or strain on memory: ja na X la ma 'djan' le me 'roma' li mi 'lyndyn' Y lo vnjo X lu mu 'frank' la'o ma'o 'paris' le'a ma'u 'berlin' Y klma'uke'a'oki'a'uke'oko. Aloha!

9/25/12
A summary of the thread on this subject called my attention to the fact that the parallel blocks are parallel and that the problem of rebinding thus does not occur. So, the combining variables are not needed, as the same ones can be used in both blocks. So, we can reduce the first example to je X la grka X la mlta ctka. And, indeed, we could continue with X le ldre X le djce pnxake. This now begins to show some signs of real savings. The last example would be the not so Hawai'ian ja na X la ma 'djan' le me 'roma' li mi 'lyndyn' X la ma 'frank' le me 'paris' li mi 'berlin' Y lo vnjo klmakekiko'eko. (I assume that klama will actually be simplified from five arguments eventually.) Only one Y was needed here, since the second X automatically also closes the first block.
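
Once the blocks share their variables, the fused and unfused forms differ only in where the predicate sits. A minimal sketch for the two-block case follows; it assumes the fused form abbreviates the sentence in which the predicate is repeated after each block, and both function names are hypothetical.

```python
def fuse(connective, blocks, predicate, marker="X"):
    """Mark each block and state the shared predicate once, at the end."""
    marked = [f"{marker} {block}" for block in blocks]
    return " ".join([connective] + marked + [predicate])


def expand(connective, blocks, predicate):
    """Undo the fusion: repeat the shared predicate after every block."""
    repeated = [f"{block} {predicate}" for block in blocks]
    return " ".join([connective] + repeated)


blocks = ["la grka", "la mlta"]
fused = fuse("je", blocks, "ctka")
unfused = expand("je", blocks, "ctka")
```

Here fused is je X la grka X la mlta ctka and unfused is je la grka ctka la mlta ctka. Any saving in the fused form depends on how long the shared predicate is compared with the block markers.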

10/1/12
The notion of dropping the end marker for the first of a pair of parallels fails because of the possibility of further parallels within a single one: "If Bob or Frank goes to New York or Chicago, then Harry or Sally goes to Los Angeles": ja na X ja X la ma 'bab' Y X la ma 'frank' Y ja X le me 'nuyork' Y X le me 'cikago' Y Y X ja X la ma 'hari' Y X la ma 'sali' Y le me 'lasandjelis' Y klmake (surely some of these markers can be elided -- to be worked on).

10/7/12

Starting more or less afresh.  The goal here is to reduce as much as practicable the amount of literal repetition in utterances without losing the logical structure (though typically burying it a bit).  There are several cases (neither exclusive nor exhaustive) that deserve attention.

The easiest of these are cases where the same reference is made several times -- the usual case for anaphoric pronouns. The solution here is inherent in the construction of Xorban, where terms are replaced by bound variables: bring all the occurrences of the same reference into the scope of a single quantifier defining that reference and replace all the (other) occurrences by bound variables. In the case of universally and particularly bound variables, this is presumably already the case so far as actual reference agreement is expressed. In the case of the salience quantifier, all of the exact repetitions, or repetitions with different variables but the same intended referent, can be brought together into a single quantifier expression with the repetitions replaced by the cited variable -- provided that this does not involve moving a bound variable outside the scope of its quantifier. This follows from the general fact that the salience quantifier moves freely across most syntactic boundaries, excepting especially binding (and, as we will see, worlds), and that a string of salience quantifiers with the same referent together reduces to (the latest) one -- once the variables are brought into line.

<Examples>

Along the same line, if the same group of terms serve as arguments to several predicates (not necessarily all with all, of course), the terms can all be pulled to the front (under the usual condition about not changing binding or the nature of the bond) and their various occurrences replaced by occurrences of the cited variables. This is probably done more or less automatically in many cases, say where the connections between the predicates are simple.

<Examples>


Sometimes, the repetition is between a term and a predicate. In a simple case like "Every time John goes to the store, Alice does, too", the English solution -- substituting the propredicate "do" for the predicate "goes to the store" -- is virtually automatic. The best Xorban solution (courtesy of the Engelang thread on Termsets) seems to follow the same pattern, with the first he marking what follows as the thing to be repeated, the second marking where the repetition is to go: ra le qdjanqe he fa li zrci klmei lo qalisqo he. The bit before each he is what changes from occurrence to occurrence of the repetition (mutatis mutandis). There are some more complicated (and controversial) cases to be discussed later.

The final case (to be discussed here) is when two or more sets of terms occur in the same places in two or more occurrences of the same predicate. These situations lend themselves to two rather different lines of solution, some cases more readily to one than the other. The first is simply a variant of the propredicate solution used just above. The second is to treat each set of terms as a unit and arrange them around the predicate used once and without a propredicate. So "If Bob goes to Chicago then Mary will go to Detroit" would become, with he, ja na la qbaba le qcikagoqe he klmake li qmarisqi lo qditroitqo he. Using termsets would give ja na X la qbabqa le qcikagoqe Y X la qmarisqa le qditroitqe Y klmake, where X marks the beginning of each parallel termset, and Y the end (probably some simplifications are possible here -- for example, the common predicate might be moved to the position of the first Y, making the reading somewhat clearer and saving at least one Y and maybe both). In this example, the two approaches seem about equally efficient and so, since it is already needed elsewhere, the advantage lies with the propredicate. But as plans get more complicated, the efficiency advantage shifts to the termsets (though intelligibility may yet be a countering factor): "If Bob or Bill goes to Washington or Chicago then Sally and Harry go to Detroit" seems to require at least five "goes to" (one and four propredicates), while the termset version takes only one (but an uncertain number of Ys): ja na X ja X la qbabqa Y X la qbilqa Y ja X le qwacintynqe X le qcikagoqe Y klmake X je X la qsaliqa X la qhariqa Y le qditroitqe. Of course, the "termset" notation is not restricted to terms: "Some but not all cats are black" becomes je X sa X na ra mlta xkra.
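
For comparison, the unfused content of "Some but not all cats are black" can be written in ordinary predicate logic. This is a sketch that assumes sa is the particular quantifier, ra the universal one, and na negation; the predicate names Cat and Black are labels chosen here, not Xorban.

```latex
\[
  \exists x\,\bigl(\mathrm{Cat}(x) \land \mathrm{Black}(x)\bigr)
  \;\land\;
  \lnot\,\forall x\,\bigl(\mathrm{Cat}(x) \rightarrow \mathrm{Black}(x)\bigr)
\]
```

The restriction and predicate occur twice in the formula, once in each conjunct, which is the repetition that the block notation states only once.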
