Monday, September 24, 2012

Sentential fusion

Aside from anaphora, the chief way that languages achieve efficiency is by fusing sentences at their common parts, replacing conjoined sentences by single sentences with conjoined parts and thus eliminating repetition: from "Dogs eat and dogs drink" to "Dogs eat and drink", for example.  Given the SOV (or OSV) nature of developing Xorban and other LoCCan3s, examples like this are fairly easy: place the argument specifiers out front and then conjoin the predicate bits: la grka je ctka pnxa.  However, when it is the arguments that vary while the predicate remains constant, the task is a little harder -- or at least less obvious.

To start with a simple case: "Dogs and cats eat meat".  Even leaving "eat meat" unanalyzed for now, it is not clear how to begin: je la grka le mlte ctkV runs into two problems immediately: it parses as incomplete ((lg(lmc)) and it is unclear what vowel to put at the end of ctk, since either a or e will leave the other group out.  Dropping the je would relieve the first problem but make the two quantifiers subordinate one to another rather than parallel -- which doesn't matter for l, but would for others -- and doesn't solve the V problem.  When dealing only with l, as here, it is tempting to think that, since the two lines are parallel, we could give them the same variable: la grka la mlta ctka.  But the loss of the overt parallelism (or, rather, the failure to show it at all) means that the variable a has officially been rebound and we end up talking only about cats, with a vacuous mention of dogs.  Or we could fuse the two groups into one: la grka le mlte li gnikake ctki.  But this solution only works for sentences joined by "and".  We need a solution that will work for "or" or even "if ... then ...", and one that can be generalized to more than a couple of arguments.

It would seem that, to do this, we have to move away from the strictly logical into encryption of some sort.  In the present system, no strictly logical approach works, since the requirements for fusion interfere with the logical structure based on separation.  So we need to encode both the separation and the fact that the separated terms are parallel but share the predicate under a certain truth functional situation.  This looks like a three-step process: 1) mark off the separate parts, 2) join them in the appropriate sentential way, and 3) indicate that they come together at the predicate.  So we mark each block with X (tbd), join blocks with connectives (I think the sentential ones can be used here unambiguously -- if followed immediately by a block, it joins blocks), and then use some fusion of variables to mark the coincidence at the predicate.  The easiest fusion of variables is a new variable which is just V1'V2 (...) for those used before.  There are a couple of problems: the preassigned cases of a'e and o'e, but those can probably be worked around somehow.  Of course, the new variables are not real -- they are strictly unbound and cannot be bound; they are mere code.  So we could end up with je X le grke X lo mlto ctke'o.  Of course, on a syllable count, it is not clear that this is an actual saving over je la grka ctka le mlte ctke, but perhaps more complex cases give further advantages.

This could easily be expanded to the next more complex cases, where subjects and objects run in parallel, by just marking the ends of blocks as well as the start.  Again, the question of actual efficiency arises; ordinary languages just leave stuff out, without marking the absence in any formal way.  But we can go from  je la mlta le ldre pnxake li xrmi lo djco pnxiko to je X la mlta le ldre Y X li xrmi lo djco Y pnxa'eki'o (correcting for the a'e conflict).  And this can be carried to almost any length, with, I fear, increasing loss of clarity or strain on memory: ja na X la ma 'djan' le me 'roma' li mi 'lyndyn' Y lo vnjo X lu mu 'frank' la'o ma'o 'paris' le'a ma'u 'berlin' Y klma'uke'a'oki'a'uke'oko. Aloha!

9/25/12
A summary of the thread on this subject called my attention to the fact that the parallel blocks are parallel and that the problem of rebinding thus does not occur.  So, the combining variables are not needed, as the same ones can be used in both blocks.  So, we can reduce the first example to je X la grka X la mlta ctka.  And, indeed, we could continue with X le ldre X le djce pnxake.  This now begins to show some signs of real savings.  The last example would be the not so Hawai'ian ja na X la ma 'djan' le me 'roma' li mi 'lyndyn' X la ma 'frank' le me 'paris' li mi 'berlin' Y lo vnjo klmakekiko'eko.  (I assume that klama will actually be simplified from five arguments eventually.)  Only one Y was needed here, since the second X automatically also closes the first block.

10/1/12
The notion of dropping the end marker for the first of a pair of parallels fails because of the possibility of further parallels within a single one:  If Bob or Frank goes to New York or Chicago, then Harry or Sally goes to Los Angeles: ja na XjaX la ma 'bab' Y Xla ma 'frank' Y ja X le me 'nuyork' Y X le me 'cikago'YY X ja X la ma 'hari' YX la ma 'sali'Y le me 'lasandjelis' Y klmake (surely some of these markers can be elided -- to be worked on).

10/7/12

Starting more or less afresh.  The goal here is to reduce as much as practicable the amount of literal repetition in utterances without losing the logical structure (though typically burying it a bit).  There are several cases (neither exclusive nor exhaustive) that deserve attention.

The easiest of these are cases where the same reference is made several times -- the usual case for anaphoric pronouns.  The solution here is inherent in the construction of Xorban, where terms are replaced by bound variables: bring all the occurrences of the same reference into the scope of a single quantifier defining that reference and replace all the (other) occurrences by bound variables.  In the case of universally and particularly bound variables, this is presumably already the case so far as actual reference agreement is expressed.  In the case of the salience quantifier, all of the exact repetitions, or repetitions with different variables but the same intended referent, can be brought together into a single quantifier expression with the repetitions replaced by the cited variable -- provided that this does not involve moving a bound variable outside the scope of its quantifier.  This comes from the general fact that the salience quantifier moves freely across most syntactic boundaries, excepting especially binding (and, as we will see, worlds), and that a string of salience quantifiers with the same referent together reduces to (the latest) one -- once the variables are brought into line.

<Examples>

Along the same line, if the same group of terms serves as arguments to several predicates (not necessarily all with all, of course), the terms can all be pulled to the front (under the usual condition about not changing binding or the nature of the bond) and their various occurrences replaced by occurrences of the cited variables.  This is probably done more or less automatically in many cases, where the connections between the predicates are simple, say.

<Examples>


Sometimes, the repetition is between a term and a predicate.  In a simple case like "Every time John goes to the store, Alice does, too", the English solution -- substituting the propredicate "do" for the predicate "goes to the store" -- is virtually automatic.  The best Xorban solution (courtesy of the Engelang thread on Termsets) seems to follow the same pattern, with the first he marking what follows as to be repeated, the second marking where the repetition is to go:  ra le qdjanqe he fa li zrci klmei lo qalisqo he.  The bit before each he is what changes from occurrence to occurrence of the repetition (mutatis mutandis).  There are some more complicated (and controversial) cases to be discussed later.

The final case (to be discussed here) is when two or more sets of terms occur in the same places in two or more occurrences of the same predicate.  These situations lend themselves to two rather different lines of solution, some cases more readily for one than the other.  The first is simply a variant of the propredicate solution used just above.  The second is to treat each set of terms as a unit and arrange them around the predicate used once and without a propredicate.  So "If Bob goes to Chicago then Mary will go to Detroit" would become, with he, ja na la qbabqa le qcikagoqe he klmake li qmarisqi lo qditroitqo he.  Using termsets would give ja na X la qbabqa le qcikagoqe Y X la qmarisqa le qditroitqe Y klmake, where X marks the beginning of each parallel termset, and Y the end (probably some simplifications are possible here -- for example, the common predicate might be moved to the position of the first Y, making the reading somewhat clearer and saving at least one Y and maybe both).  In this example, the two approaches seem about equally efficient and so, since it is already needed elsewhere, the advantage lies with the propredicate, but as plans get more complicated, the efficiency advantage shifts to the termsets (though intelligibility may yet be a countering factor): "If Bob or Bill goes to Washington or Chicago then Sally and Harry go to Detroit" seems to require at least five "goes to" (one and four propredicates), while the termset version takes only one (but an uncertain number of Ys): ja na X ja X la qbabqa Y X la qbilqa Y ja X le qwacintynqe X le qcikagoqe Y klmake X je X la qsaliqa X la qhariqa Y le qditroitqe.  Of course, the "termset" notation is not restricted to terms: "Some but not all cats are black" becomes je X sa Xna ra mlta xkra.

Tuesday, September 18, 2012

What the l

Once upon a time in the history of Logjam, it was carved in Silly-putty for all eternity that lo broda meant su'o lo ro broda.  Like Manicheanism in Christianity, this heresy was quickly quelled and yet continues to affect many sects within the loglang community.  It lingers in the cult that requires that lo broda always refers to the same thing, whether Mr. Broda or brodahood or just the solid lump of Broda.  It is behind most of the other views as well, though usually with more emphasis on the su'o.  The question is, do these different cliques have enough in common to make a language which is indifferent to their various claims?  The general success of xorlo, which is based in one group, but whose basis is generally ignored (or not understood), suggests that they have.  This is an inquiry into whether there are any sentences which would be true on one ontology but not on some other.  I needs must also take into account the kind of mental adjustments that each group has to make to accommodate the sayings of another group in the vulgar (i.e., not minutely accurate) form.  In keeping with the times, I will attempt to do this in terms of emerging Xorban, in particular, taking the quantifier l, rather than the term-maker lo from the older languages.

From familiarity, I begin by considering a relatively conservative ontology in which the universe of discourse consists of discrete things, which are grouped by predicates into classes and relations and the like.  Variables range over exactly these, but may be instantiated to several at once (or they range over L-sets of these -- the distinction does not make a formal difference, whatever it may mean materially).  Since there are no terms but variables in this system, the question of truth comes down to whether values taken on by a variable in a context fall in the class assigned to the attached predicate.  In the case of sa Ra Pa and la Ra Pa, it is enough that some such values work.  For ra Ra Pa they all must.  The difference between l and s is that the one requires the same values each time, the other does not, beyond certain syntactically defined limits.
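As a rough illustration of these truth conditions, here is a minimal Python sketch.  It rests on simplifying assumptions that are mine, not part of Xorban: the domain is finite, each value is a single individual (plural instantiation and L-sets are ignored), and the requirement that l keep the same values across occurrences is not modelled at all.  The function name holds and the sample names are hypothetical.

```python
from typing import Callable, Iterable

def holds(quantifier: str,
          domain: Iterable[str],
          restriction: Callable[[str], bool],
          predicate: Callable[[str], bool]) -> bool:
    """Evaluate a sentence of the shape Qa Ra Pa over a finite domain."""
    # The restriction R limits the range of the variable first.
    restricted = [x for x in domain if restriction(x)]
    if quantifier == "r":
        # ALL: every value satisfying R must also satisfy P
        # (vacuously true here if nothing satisfies R).
        return all(predicate(x) for x in restricted)
    if quantifier in ("s", "l"):
        # SOME / CERTAIN: it is enough that some such value works.
        return any(predicate(x) for x in restricted)
    raise ValueError(f"unknown quantifier: {quantifier!r}")

# Hypothetical toy model: two dogs, one cat, one barker.
domain = ["rex", "fido", "tom"]
is_dog = lambda x: x in {"rex", "fido"}
barks = lambda x: x in {"rex"}

print(holds("s", domain, is_dog, barks))  # some dog barks
print(holds("r", domain, is_dog, barks))  # every dog barks
```

On this toy model the s and l sentences come out true and the r sentence false; what separates l from s only shows up once the same variable is used again later, which this sketch does not attempt.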

The question of whether a value is in the set defined by a predicate is not an entirely straightforward matter.  There are at least three cases to consider (using L-set language for simplicity): the value as a whole is in the set assigned the predicate, though not all its members are; all its members are (and perhaps the set as a whole as well); or some (but not necessarily all) of its members are.  (This last includes cases where some members form classes which are in as classes.)  Generally, these distinctions are ignored, since it is usually obvious from the nature of the predicate which is involved or it does not really matter.  When it is an issue, however, devices are needed to indicate which is the appropriate sense of "satisfy" that is intended.  As a rough rule, when precision begins to be an issue, l alone indicates the first case, a subordinate r the second and a subordinate s the third.  I suppose more explicit modifiers might also be used, especially when the set is referred to several times in the following sentence and thus might be involved in different ways.
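The three senses of "satisfy" can be set side by side in a small Python sketch.  The modelling is an assumption made only for illustration: a plural value is a frozenset of individuals, and a predicate's extension is a set that may contain individuals, whole groups, or both.  The function names and the sample predicates are hypothetical.

```python
def satisfies_as_whole(value: frozenset, extension: set) -> bool:
    # First case: the value as a whole is in the set (roughly, l alone).
    return value in extension

def satisfies_by_all_members(value: frozenset, extension: set) -> bool:
    # Second case: every member is in the set (roughly, a subordinate r).
    return all(member in extension for member in value)

def satisfies_by_some_members(value: frozenset, extension: set) -> bool:
    # Third case: at least one member is in the set (roughly, a subordinate s).
    return any(member in extension for member in value)

team = frozenset({"ann", "bob", "cy"})
surrounders = {team}                 # only the group as a whole surrounds
hat_wearers = {"ann", "bob", "cy"}   # each member wears a hat
smokers = {"ann"}                    # just one member smokes

print(satisfies_as_whole(team, surrounders))
print(satisfies_by_all_members(team, hat_wearers))
print(satisfies_by_some_members(team, smokers))
```

Each of the three calls succeeds only under its own sense: the team surrounds as a whole but no member does, every member wears a hat but the team as a unit is not in that extension, and only one member smokes.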

I cannot speak with confidence about the other ontologies.  So far as I can see, the Mr. Broda story and the lump of Broda story do not differ formally very much from the above plan.  To be sure, the "class" is always the same on these views and the "members" are derivative from it rather than the other way around, but the way that sentences behave when total precision is not the issue seems not to differ much.  To be sure, a powder puff tail may count as a rabbit in the lump Rabbit class, but it is unlikely to fare any differently in comparison to other ontologies: if it's all you see, it probably counts as a rabbit; if it's all you catch, it probably won't.  Whether the same quantifier expansions would have the same effect (or even be intelligible) on these other views is unclear.

There are indications that the layers of concepts ontology also comes down to the same formality.  But I confess that I can't see how to fit myopic singulars in at all, mainly because I just do not understand the notion.  My chief problem (though probably not the only one, if this were removed) is the claim that la Ra Pa entails ra Ra Pa <=> sa Ra Pa, that there is either only one R or none at all (or that they are totally uniform, at least with respect to P).  This becomes puzzling, I think, because it appears from other things said that la Ra encompasses all Rs.  If the expression brings in only a portion of the Rs (or if P is very general) there does not seem to be a problem, though the insistence on the formula seems forced.  I am sure there is something more here that I am missing but don't (obviously) see what it can be.

9/26/12
Some remarks today, though rather opaque, encouraged my belief that we are all about the same pattern with very different words and images.  The short form seems to be:
   The extension of a predicate is a thing which satisfies that predicate in some way.
   This thing is broken down using some form of the jest relation (variously interpreted according to the case, but formally the same), generating a join semi-lattice of things, each of which also satisfies the predicate in some way (maybe the same, maybe differently).
   The quantifiers r, s, l  indicate things in the lattice b  In a particular case, Qa Ra Pa says that the selected thing in the R lattice also satisfies P in some way (is also a node in the P lattice).
   The way of satisfaction may be spelled out but typically will not, as being obvious.


Friday, September 14, 2012

Xorban: commentary

Xorban is a project to build a loglang from scratch but keeping in mind both the ideals and the problems of Loglan and Lojban.  Among those ideals on the language side are the unique decomposability of the speech stream into words, the unique assignment of words to grammatical roles and the unique parsing of every utterance.  On the logic side, the chief goal is that the logical structure of each sentence -- and of the discourse as a whole -- be obvious at a glance.  In all of this, the aim is not to be significantly longer than, say, raw English.  The inherited problems are many, but the ones that have come to be dealt with first include such core matters as the need for unambiguous anaphora, the uncertainty of the limits of the scope of quantifiers (and, perhaps, other operators), and, at the more peripheral end, technical issues like "donkey sentences" and statistical claims and parallel arguments.  What follows is an attempt to bring up to date all the definite points established in the roughly 600 messages on Engelang in the last month and a half (as of 9/14/12) and lay out some of the disputes that have arisen and some of the tasks that remain.

To provide for the unique decomposability, Xorban continues the tradition of unique phonological forms for each type of linguistic item.  Although phonology and morphology are not yet being worked on, temporary forms are being used in discussions and examples.  Under the present scheme, predicates are strings of three or more consonants CCC*, either borrowings from Lojban with the vowels removed, or schematic forms like bcd and fgh.  Variables, the only terms, are vowels or V'V sequences, V('V)*.  Quantifiers and connectives and other special operators are CV, and a few special predicates with mainly logical functions are CC only.  Again, many of these bear suggestive relations to Lojban words.
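Because each type of item has its own shape, a bare form can be sorted by shape alone.  The following Python sketch shows the idea under assumptions of my own: the consonant and vowel inventories are guesses (the post does not fix them), only bare forms are handled, and a predicate with its variables attached, like grka, is deliberately left unclassified.  The name classify is hypothetical.

```python
import re

# Assumed inventories; the temporary Xorban forms do not fix these yet.
CONSONANTS = "bcdfghjklmnpqrstvwxz"
VOWELS = "aeiou"

FORMS = [
    # Ordinary predicates: three or more consonants.
    ("predicate", re.compile(rf"[{CONSONANTS}]{{3,}}")),
    # Special, mainly logical predicates: exactly two consonants.
    ("special predicate", re.compile(rf"[{CONSONANTS}]{{2}}")),
    # Variables: a vowel, optionally extended by 'V steps.
    ("variable", re.compile(rf"[{VOWELS}](?:'[{VOWELS}])*")),
    # Quantifiers, connectives and other operators: CV.
    ("operator", re.compile(rf"[{CONSONANTS}][{VOWELS}]")),
]

def classify(form: str) -> str:
    """Return the type of a bare form, judged by its shape alone."""
    for name, pattern in FORMS:
        if pattern.fullmatch(form):
            return name
    return "unknown"

for form in ["grk", "mn", "a'e", "je", "grka"]:
    print(form, "->", classify(form))
```

The patterns are mutually exclusive, so the order of the table does not matter; that exclusivity is the point of giving each type its own phonological form.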

For unique parsability, the proven structure of FOPL is adhered to.  For efficiency (keeping things as short as possible), the propositional connectives are used in Polish Notation form: each connective connects the following two (or, occasionally, one) sentences, so no parentheses are required to indicate grouping.  This requires thinking through what you are to say before you say it -- probably not a bad requirement for a logical language.  The sentential connectives are the usual set, AND (currently je), IOR (ja), and NOT (na), with a couple of interesting additions we will discuss directly.  The quantifiers are restricted, that is, the quantifier attaches not merely to a variable but to a variable and a sentence containing that variable free and restricts the range of that variable to those things which satisfy that sentence (more details on this later).  The quantifiers are the standard ALL (r followed by the vowel which represents the variable) and SOME (s) and a third, CERTAIN (say) (l), that is meant to form constants through a following context.  This last is controversial and will turn up prominently in the discussion of disputed topics later.  Finally, but importantly, every predicate has a fixed number of terms and all of those terms must be in place.  Various shortcuts have been and will no doubt be further introduced to reduce some of the redundancy or irrelevance of some of these requirements.
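To see why Polish Notation needs no parentheses, here is a minimal recursive parser in Python for the connective skeleton only.  It is a sketch under stated assumptions: quantifiers are ignored, simple formulas are treated as opaque tokens (P and Q below are placeholders, not Xorban), and the names parse and _parse are hypothetical.

```python
BINARY = {"je": "AND", "ja": "IOR"}   # each takes the following two sentences
UNARY = {"na": "NOT"}                 # takes the following one sentence

def _parse(tokens):
    """Parse one sentence from the front; return (tree, remaining tokens)."""
    if not tokens:
        raise ValueError("incomplete: a connective is missing a sentence")
    head, rest = tokens[0], tokens[1:]
    if head in BINARY:
        left, rest = _parse(rest)
        right, rest = _parse(rest)
        return (BINARY[head], left, right), rest
    if head in UNARY:
        arg, rest = _parse(rest)
        return (UNARY[head], arg), rest
    return head, rest                 # anything else is an atomic sentence

def parse(tokens):
    """Parse a full token list, rejecting leftovers."""
    tree, rest = _parse(list(tokens))
    if rest:
        raise ValueError(f"unexpected extra material: {rest}")
    return tree

# "if P then Q", written as IOR of NOT P and Q.
print(parse("ja na P Q".split()))
```

Each connective knows how many sentences it consumes, so the grouping falls out of the left-to-right reading; a je followed by only one sentence is reported as incomplete rather than guessed at.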

So, at the core of every Xorban utterance (so far) is a simple formula, an n-place predicate and n variables, separated, for now, by k: bcdakeki, say, Fxyz in familiar logical terms.  But such a unit is not strictly speaking a sentence, since variables don't designate things and thus this does not say anything about anything in particular.  To bring this into focus, then, we need to add quantifiers which bind the variables and restrict them to particular sorts of things.  So, a fully formed sentence would be like ra cdfa se dfge li fghi bcdakeki, [AxGx][SyHy][\zJz]Fxyz: "Each of the Gs stands in relation F to some of the Hs and certain of the Js".  It is worth noting that the positions of the ALL and the SOME are fixed in these formulas, once set down, but that of CERTAIN is not, so that in the above case, while the Hs which stand F to a particular G may be different from the ones that are in that relation to another G, the Js are the same throughout -- which is what it means to say that l, CERTAIN, forms a constant.  It also means that the same claim could be written as li fghi ra cdfa se dfge bcdakeki, [\zJz][AxGx][SyHy]Fxyz.
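Taking a simple formula apart is mechanical.  The Python sketch below assumes, as my own reading of the examples, that the predicate is the longest run of consonants at the front and that whatever follows is a list of variables separated by k.  The consonant inventory and the name split_simple are hypothetical.

```python
import re

CONSONANTS = "bcdfghjklmnpqrstvwxz"                    # assumed inventory
VARIABLE = re.compile(r"[aeiou](?:'[aeiou])*")          # V('V)*
SIMPLE = re.compile(rf"([{CONSONANTS}]{{2,}})(.+)")     # predicate, then the rest

def split_simple(formula: str):
    """Split a simple formula into (predicate, list of variables)."""
    match = SIMPLE.fullmatch(formula)
    if not match:
        raise ValueError(f"not a simple formula: {formula!r}")
    predicate, tail = match.groups()
    # Variables are separated by k; each piece must be a well-formed variable.
    variables = tail.split("k")
    if not all(VARIABLE.fullmatch(v) for v in variables):
        raise ValueError(f"malformed variable list in {formula!r}")
    return predicate, variables

print(split_simple("bcdakeki"))
print(split_simple("jkla'ake'eko'e"))
```

Since variables always begin with a vowel, a k that belongs to the predicate (as in ctka) is never mistaken for a separator.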

Of course, neither the formula to which the quantifier attaches nor the restriction formula of the quantifier need be a simple formula.  Simple formulae can be negated, na bcdakeki, or two can be joined together with sentential connectives, ja bcdakeki jkla'ake'eko'e.  Nor need all the variables be quantified at once.  Thus, any of these processes, quantifying, connecting, and negating, can be done in any order and any number of times, eventually creating sentences, but then going on to compound sentences from there.

So far, Xorban is a complete FOPL in a fairly efficient format, ignoring the things that are moving toward a language: predicates distinguished by form rather than typeface, variables and predicates that run on for several characters, and the need to separate the various variables that attach to a predicate with k.  The next few features of Xorban seem to be also moving in the language direction -- indeed, most of what will happen with Xorban (or any loglang) after this point is directed toward the language side.  The logic is (pretty much -- we may want to go beyond FOPL some, as Lojban does) done.  The trick will always be to add these language features without losing the parsing uniqueness of the logic.

Item 1.  In addition to na, negation, Xorban has two other unary sentential connectives, ni, affirmation, "it is the case that", and nu, tautology, "whether or not it is the case that".  Since these behave syntactically just like na, they present no problems for parsing.  Their exact purpose is somewhat obscure at the moment, however.  ni is strictly redundant, apparently, since an unnegated declarative sentence declares itself to be the case.  One can imagine a rhetorical use, however, to reassert the correctness of a challenged claim.  The second, nu, doesn't seem to have a use on its own at all.  It might be used with je, however, to reproduce the effect of Lojban binary ju and its variants (though those have never played much of a role).  These perhaps lay the groundwork for some effects later.  Right now they are merely curiosities.

Item 2.  In addition to the usual connectives noted, Xorban has ju.  This is AND with the additional twist that the two sentences connected describe two events that are inherently part of a single event.  The example, of a person reading a poem and another person hearing that poem being parts of the single event of one person reading that poem TO the other person, shows the idea well.  And also indicates the probable usefulness of the connective: keeping the number of separate predicates and the number of arguments for each predicate down by treating complex events as composed of partial events joined together (whether this is more efficient or not is just unclear, as is whether this line will be pursued in later developments).

Item 3.  On much the same line, Xorban now has unary operators which make new predicates from old formulae or sentences.  bV is the agentive marker: if cdfa has come to pass, be cdfa says that e made it come to pass.  Similarly, f makes a predicate true of the following state of affairs: if bcd a has come to pass, then fe bcda says that e is what has come to pass.

Item 4.  So far, terms have all been variables, bound ones at that.  The operator m takes a term and a word and claims that the referent of the term is called by the word: ma 'John' says that a is called 'John' (following the Lojbanic style, the example is usually 'djn').  At the moment, however, this is just an interesting fact without any linguistic significance, because we are not allowed to replace a by John anywhere.  The best we can do is introduce la ma 'John' and then use this, bound, a ever after (this is, of course, standard Lojban la djan with the understanding that la is a sort of lo).  While this looks counterproductive, it does point toward the development of an anaphora system, where names and other descriptions are introduced, followed up by their variables for a while and then refreshed with a new use of the description, exactly the same as before, say, to carry human memory on for a spell.  The new description can be viewed as the old one, since l moves smoothly over all manner of intervening objects (well, maybe not all, but more on that later), so all of them can be taken as occurring at the first point, where duplicated quantifiers reduce to a single one.  It is even possible that some s might be picked up in this process, matching the pattern of introductory "a man" and subsequent "the man".  But this has not been clearly developed yet.

Item 5.  The predicate-making d is related to m, in that it assigns an item rather arbitrarily to its argument.  In this case, the item is a predicate with a free space to take the term for d.  There is no requirement, however, that the argument have that property; it is merely a convenient indicator.  The effect of this mixed with l is, of course, the Lojban {le}.

Item 6.  However, some more or less constants have been fixed to work.  The personal pronouns I, you, we are always a'a, e'e, a'e.  These are, however, official abbreviations for quantifier expressions, involving further new predicates.  For "I", the predicate is mslf ("myself"?), apparently only applicable to the speaker of the moment.  bcd a'a then expands to la'a mslf a'a bcd a'a, with the leftward exodus of l stopped at the change of speakers.  "You" is handled similarly with rslf ("yourself"?).  The inclusive "we" requires mn ("is a member of/among") and is then defined as la'e je mna'aka'e mne'eka'e, a bunch to which both I and you belong.  There is also a defined gap-filler, o'e, for those arguments that we don't care about but which have to be present so that a predicate has all its arguments.  This uses the vacuous restriction sm ("something"?) and is given as lo'e smo'e.  But this is clearly wrong, since that would make the gap filler the same everywhere, which is clearly not the intention.  It would be better as so'e smo'e with the understanding that this quantifier should be absolutely next to the predicate involved.  However, even this will not quite work, since two gap-fillers in the same predicate would still be equated, as is not generally the intention.  (These problems seem inherited from Lojban {zo'e} and require a similar solution -- whatever that is.  Probably a separate set of V'V for this role.)  [Even if there is only one sm -- and every other predicate, apparently -- the manner or mode or subspecies or whatever of that one is different in the two places and ought to be marked for ordinary, as opposed to transcendental or meta-, speech.]
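The pronoun abbreviations amount to a small lookup table, which the Python sketch below spells out.  Several things here are my assumptions rather than settled Xorban: the "you" expansion le'e rslfe'e is extrapolated from the "I" case, the expansions are written without internal spaces, the stopping of l at a change of speakers is not modelled, and the disputed gap-filler o'e is left out.  The names PRONOUN_PREFIX and expand are hypothetical.

```python
# Quantifier expressions that the pronoun variables abbreviate.
PRONOUN_PREFIX = {
    "a'a": "la'a mslfa'a",                 # I: certain a'a that is "myself"
    "e'e": "le'e rslfe'e",                 # you: assumed parallel to "I"
    "a'e": "la'e je mna'aka'e mne'eka'e",  # we: a bunch both I and you are among
}

def expand(predicate: str, variables: list) -> str:
    """Prefix a simple formula with the expansions of any pronouns it uses."""
    # dict.fromkeys keeps first-occurrence order and drops repeats.
    prefixes = [PRONOUN_PREFIX[v] for v in dict.fromkeys(variables)
                if v in PRONOUN_PREFIX]
    formula = predicate + "k".join(variables)
    return " ".join(prefixes + [formula])

print(expand("bcd", ["a'a"]))
print(expand("klm", ["a'e", "o"]))
```

Variables that are not pronouns pass through untouched, so they still need their own quantifiers elsewhere in the sentence.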