ELG. Writing systems: Difference between revisions
=Writing systems= | |||
==What's a letteral, anyway?== | |||
{{ind|general-imported|letter|alphabet}} {{ind|general-imported|letteral|definition}} {{ind|general-imported|Brown|James Cooke|and "letteral"}} James Cooke Brown, the founder of the Loglan Project, coined the word “letteral” (by analogy with “numeral”) to mean a letter of the alphabet, such as “f” or “z”. A typical example of its use might be | |||
{{dsp|example-random-id-tvHm}} | |||
{{judri|c17e1d1}} | |||
{{ind|example|fourteen "e"s}} | |||
There are fourteen occurrences of the letteral “e” in this sentence. | |||
{{ind|general-imported|lerfu|definition}} (Don't forget the one within quotation marks.) Using the word | |||
“letteral” avoids confusion with | |||
“letter”, the kind you write to someone. Not surprisingly, there is a Lojban gismu for | |||
“letteral”, namely | |||
{{vla|lerfu}}, and this word will be used in the rest of this chapter. | |||
{{ind|general-imported|alphabet|Latin used for Lojban}} {{ind|general-imported|Latin|alphabet of Lojban}} Lojban uses the Latin alphabet, just as English does, right? Then why is there a need for a chapter like this? After all, everyone who can read it already knows the alphabet. The answer is twofold: | |||
{{ind|general-imported|alphabet|words for letters in|rationale}} First, in English there is a set of words that correspond to and represent the English lerfu. These words are rarely written down in English and have no standard spellings, but if you pronounce the English alphabet to yourself you will hear them: ay, bee, cee, dee, and so on. They are used in spelling out words and in pronouncing most acronyms. The Lojban equivalents of these words are standardized and must be documented somehow.
{{ind|general-imported|alphabets|words for non-Lojban letters|rationale}} Second, English has names only for the lerfu used in writing English. (There are also English names for Greek and Hebrew lerfu: English-speakers usually refer to the Greek lerfu conventionally spelled | |||
“phi” as | |||
“fye”, whereas | |||
“fee” would more nearly represent the name used by Greek-speakers. Still, not all English-speakers know these English names.) Lojban, in order to be culturally neutral, needs a more comprehensive system that can handle, at least potentially, all of the world's alphabets and other writing systems. | |||
Letterals have several uses in Lojban: in forming acronyms and abbreviations, as mathematical symbols, and as pro-sumti – the equivalent of English pronouns. | |||
{{ind|general-imported|letter|contrasted with word for the letter}} {{ind|general-imported|lerfu word|contrasted with lerfu}} {{ind|general-imported|lerfu|contrasted with lerfu word}} In earlier writings about Lojban, there has been a tendency to use the word {{vla|lerfu}} for both the letterals themselves and for the Lojban words which represent them. In this chapter, that tendency will be ruthlessly suppressed, and the term “lerfu word” will invariably be used for the latter. The Lojban equivalent would be {{jbo|lerfu valsi}} or {{vla|lervla}}. | |||
{{ssp|section-lerfu-liste}} | |||
==A to Z in Lojban, plus one== | |||
{{ind|general-imported|lerfu words|Lojban coverage requirement}} The first requirement of a system of lerfu words for any language is that they must represent the lerfu used to write the language. The lerfu words for English are a motley crew: the relationship between | |||
“doubleyou” and | |||
“w” is strictly historical in nature; | |||
“aitch” represents | |||
“h” but has no clear relationship to it at all; and | |||
“z” has two distinct lerfu words, | |||
“zee” and | |||
“zed”, depending on the dialect of English in question. | |||
{{ind|general-imported|lerfu word|for "'"}} {{ind|general-imported|lerfu words|for consonants}} {{ind|general-imported|lerfu words|for vowels}} {{ind|general-imported|lerfu words|formation rules}} All of Lojban's basic lerfu words are made by one of three rules: | |||
*to get a lerfu word for a vowel, add {{vla|bu}}; | |||
*to get a lerfu word for a consonant, add {{lerfu|y}}; | |||
*the lerfu word for {{lerfu|'}} is {{vla|.y'y}}. | |||
{{ind|general-imported|lerfu words|table of Lojban}} Therefore, the following table represents the basic Lojban alphabet: | |||
<!-- FIXME: should this list be displayed more like this: | |||
' a b c d e | |||
.y'y. .abu by. cy. dy. .ebu | |||
f g i j k l | |||
fy. gy. .ibu jy. ky. ly. | |||
m n o p r s | |||
my. ny. .obu py. ry. sy. | |||
t u v x y z | |||
ty. .ubu vy. xy. .ybu zy. | |||
--> | |||
;'''{{lerfu|'}}''':{{vla|.y'y.}} | |||
;'''{{lerfu|a}}''':{{vla|.abu}} | |||
;'''{{lerfu|b}}''':{{vla|by.}} | |||
;'''{{lerfu|c}}''':{{vla|cy.}} | |||
;'''{{lerfu|d}}''':{{vla|dy.}} | |||
;'''{{lerfu|e}}''':{{vla|.ebu}} | |||
;'''{{lerfu|f}}''':{{vla|fy.}} | |||
;'''{{lerfu|g}}''':{{vla|gy.}} | |||
;'''{{lerfu|i}}''':{{vla|.ibu}} | |||
;'''{{lerfu|j}}''':{{vla|jy.}} | |||
;'''{{lerfu|k}}''':{{vla|ky.}} | |||
;'''{{lerfu|l}}''':{{vla|ly.}} | |||
;'''{{lerfu|m}}''':{{vla|my.}} | |||
;'''{{lerfu|n}}''':{{vla|ny.}} | |||
;'''{{lerfu|o}}''':{{vla|.obu}} | |||
;'''{{lerfu|p}}''':{{vla|py.}} | |||
;'''{{lerfu|r}}''':{{vla|ry.}} | |||
;'''{{lerfu|s}}''':{{vla|sy.}} | |||
;'''{{lerfu|t}}''':{{vla|ty.}} | |||
;'''{{lerfu|u}}''':{{vla|.ubu}} | |||
;'''{{lerfu|v}}''':{{vla|vy.}} | |||
;'''{{lerfu|x}}''':{{vla|xy.}} | |||
;'''{{lerfu|y}}''':{{vla|.ybu}} | |||
;'''{{lerfu|z}}''':{{vla|zy.}} | |||
{{ind|general-imported|bu|effect on preceding word}} {{ind|general-imported|lerfu words|composed of compound cmavo}} {{ind|general-imported|lerfu words|composed of single cmavo}} {{ind|general-imported|lerfu words|vowel words contrasted with consonant words}} {{ind|general-imported|lerfu words|consonant words contrasted with vowel words}} {{ind|general-imported|lerfu words for vowels|pause requirement before}} There are several things to note about this table. The consonant lerfu words are a single syllable, whereas the vowel and | |||
{{lerfu|'}} lerfu words are two syllables and must be preceded by a pause (since they all begin with a vowel). Another fact, not evident from the table but important nonetheless, is that
{{vla|by}} and its like are single cmavo of selma'o BY, as is | |||
{{vla|.y'y}}. The vowel lerfu words, on the other hand, are compound cmavo, made from a single vowel cmavo plus the cmavo | |||
{{vla|bu}} (which belongs to its own selma'o, BU). All of the vowel cmavo have other meanings in Lojban (logical connectives, sentence separator, hesitation noise), but those meanings are irrelevant when | |||
{{vla|bu}} follows. | |||
Here are some illustrations of common Lojban words spelled out using the alphabet above: | |||
{{dsp|example-random-id-qHRb}}{{example|interlinear-gloss-example|example-random-id-qHRb}} | |||
{{judri|c17e2d1}} | |||
:'''ty. .abu ny. ry. .ubu''' | |||
:''“t” “a” “n” “r” “u”'' | |||
{{dsp|example-random-id-qhrx}}{{example|interlinear-gloss-example|example-random-id-qhrx}} | |||
{{judri|c17e2d2}} | |||
:'''ky. .obu .y'y. .abu''' | |||
:''“k” “o” “'” “a”'' | |||
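The three formation rules and the two spellings above can be reproduced mechanically. The following is a minimal sketch in Python; the function names <code>lerfu_word</code> and <code>spell</code> are illustrative assumptions, not part of any Lojban tool, and the pauses are written out as in the table.

```python
# Basic Lojban alphabet, split as in the table above (y is treated as a vowel).
VOWELS = set("aeiouy")
CONSONANTS = set("bcdfgjklmnprstvxz")


def lerfu_word(letter: str) -> str:
    """Return the basic lerfu word for a single Lojban lerfu."""
    if letter == "'":
        return ".y'y."              # rule 3: the apostrophe has its own word
    if letter in VOWELS:
        return f".{letter}bu"       # rule 1: vowel + bu, preceded by a pause
    if letter in CONSONANTS:
        return f"{letter}y."        # rule 2: consonant + y, followed by a pause
    raise ValueError(f"not a basic Lojban lerfu: {letter!r}")


def spell(word: str) -> str:
    """Spell out a Lojban word as a sequence of lerfu words."""
    return " ".join(lerfu_word(ch) for ch in word)


print(spell("tanru"))  # ty. .abu ny. ry. .ubu
print(spell("ko'a"))   # ky. .obu .y'y. .abu
```

Letters outside the basic alphabet (such as “h”, “q”, and “w”) deliberately raise an error in this sketch.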
{{ind|general-imported|lerfu words|effect of systematic formulation}} {{ind|general-imported|spelling out words|Lojban contrasted with English in usefulness}} Spelling out words is less useful in Lojban than in English, for two reasons: Lojban spelling is phonemic, so there can be no real dispute about how a word is spelled; and the Lojban lerfu words sound more alike than the English ones do, since they are made up systematically. The English words “fail” and “vale” sound similar, but just hearing the first lerfu word of either, namely “eff” or “vee”, is enough to discriminate easily between them – and even if the first lerfu word were somehow confused, neither “vail” nor “fale” is a word of ordinary English, so the rest of the spelling determines which word is meant. Still, the capability of spelling out words does exist in Lojban. | |||
{{ind|general-imported|lerfu words ending with "y"|pause after|rationale}} Note that the lerfu words ending in | |||
{{lerfu|y}} were written (in {{lex|example-random-id-qHRb}} and {{lex|example-random-id-qhrx}}) with pauses after them. It is not strictly necessary to pause after such lerfu words, but failure to do so can in some cases lead to ambiguities: | |||
{{dsp|example-random-id-6dMS}}{{example|interlinear-gloss-example|example-random-id-6dMS}} | |||
{{judri|c17e2d3}} | |||
:'''mi cy. claxu''' | |||
:<code>I lerfu-“c” without</code> | |||
:''I am without (whatever is referred to by) the letter “c”.'' | |||
without a pause after {{vla|cy}} would be interpreted as: | |||
{{dsp|example-random-id-qBLA}}{{example|interlinear-gloss-example|example-random-id-qBLA}} | |||
{{judri|c17e2d4}} | |||
:'''micyclaxu''' | |||
:<code>(Observative:) doctor-without</code> | |||
:''Something unspecified is without a doctor.'' | |||
A safe guideline is to pause after any cmavo ending in | |||
{{lerfu|y}} unless the next word is also a cmavo ending in | |||
{{lerfu|y}}. The safest and easiest guideline is to pause after all of them. | |||
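The first guideline can be written as a small rule over a list of words. This is only a sketch: the function names are hypothetical, and it assumes that every input word ending in {{lerfu|y}} is a cmavo of the kind discussed here.

```python
def ends_in_y(word: str) -> bool:
    """True if the word, ignoring a written pause mark, ends in 'y'."""
    return word.rstrip(".").endswith("y")


def add_pauses(words):
    """Mark a pause after each y-final word unless the next word is y-final too."""
    out = []
    for i, word in enumerate(words):
        nxt = words[i + 1] if i + 1 < len(words) else None
        if ends_in_y(word) and not (nxt is not None and ends_in_y(nxt)):
            out.append(word.rstrip(".") + ".")   # pause is written as a period
        else:
            out.append(word)                     # left exactly as given
    return out


print(" ".join(add_pauses(["mi", "cy", "claxu"])))  # mi cy. claxu
print(" ".join(add_pauses(["by", "cy", "claxu"])))  # by cy. claxu
```

The safest guideline, pausing after every such word, would simply drop the check on the following word.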
{{ssp|section-upper-case}} | |||
==Upper and lower cases== | |||
{{ind|general-imported|lower case letters|use in Lojban}} {{ind|general-imported|capital letters|use in Lojban}} {{ind|general-imported|stress|irregular marked with upper-case}} {{ind|general-imported|lower-case letters|English usage contrasted with Lojban}} {{ind|general-imported|lower-case letters|Lojban usage contrasted with English}} {{ind|general-imported|upper-case letters|English usage contrasted with Lojban}} {{ind|general-imported|upper-case letters|Lojban usage contrasted with English}} Lojban doesn't use lower-case (small) letters and upper-case (capital) letters in the same way that English does; sentences do not begin with an upper-case letter, nor do names. However, upper-case letters are used in Lojban to mark irregular stress within names, thus: | |||
{{dsp|example-random-id-Fam2}}{{example|interlinear-gloss-example|example-random-id-Fam2}} | |||
{{judri|c17e3d1}} | |||
:'''.iVAN.''' | |||
:''the name “Ivan” in Russian/Slavic pronunciation.'' | |||
{{ind|general-imported|case|upper/lower specification}} {{ind|general-imported|lower-case|lerfu word for}} {{ind|general-imported|upper-case|lerfu word for}} It would require far too many cmavo to assign one for each upper-case and one for each lower-case lerfu, so instead we have two special cmavo {{vla|ga'e}} and {{vla|to'a}} representing upper case and lower case respectively. They belong to the same selma'o as the basic lerfu words, namely BY, and they may be freely interspersed with them. | |||
{{ind|general-imported|lower-case word|effect on following lerfu words}} The effect of | |||
{{vla|ga'e}} is to change the interpretation of all lerfu words following it to be the upper-case version of the lerfu. An occurrence of {{vla|to'a}} causes the interpretation to revert to lower case. Thus, {{jbo|ga'e .abu}} means not “a” but “A”, and Ivan's name may be spelled out thus: | |||
{{dsp|example-random-id-q6pw}}{{example|interlinear-gloss-example|example-random-id-q6pw}} | |||
{{judri|c17e3d2}} | |||
:'''.ibu ga'e vy. .abu ny. to'a''' | |||
:<code>i [upper] V A N [lower]</code> | |||
The cmavo and compound cmavo of this type will be called “shift words”. | |||
{{ind|general-imported|shift word|scope}} How long does a shift word last? Theoretically, until the next shift word that contradicts it or until the end of text. In practice, it is common to presume that a shift word is only in effect until the next word other than a lerfu word is found. | |||
{{ind|general-imported|shift|single-letter|grammar of}} {{ind|general-imported|shift word|for single letter}} It is often convenient to shift just a single letter to upper case. The cmavo | |||
{{vla|tau}}, of selma'o LAU, is useful for the purpose. A LAU cmavo must always be immediately followed by a BY cmavo or its equivalent: the combination is grammatically equivalent to a single BY. (See {{ls|section-lerfu-cmavo-summary}} for details.) | |||
{{ind|general-imported|chemical elements|use of single-letter shift for}} A likely use of | |||
{{vla|tau}} is in the internationally standardized symbols for the chemical elements. Each element is represented using either a single upper-case lerfu or one upper-case lerfu followed by one lower-case lerfu: | |||
{{dsp|example-random-id-qhS7}}{{example|interlinear-gloss-example|example-random-id-qhS7}} | |||
{{judri|c17e3d3}} | |||
:'''tau sy.''' | |||
:<code>[single shift] S</code> | |||
:''S (chemical symbol for sulfur)'' | |||
{{dsp|example-random-id-qhsD}}{{example|interlinear-gloss-example|example-random-id-qhsD}} | |||
{{judri|c17e3d4}} | |||
:'''tau sy. .ibu''' | |||
:<code>[single shift] S i</code> | |||
:''Si (chemical symbol for silicon)'' | |||
{{ind|general-imported|single-letter shift|as toggle}} If a shift to upper-case is in effect when | |||
{{vla|tau}} appears, it shifts the next lerfu word only to lower case, reversing its usual effect. | |||
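The behaviour of the three case shifts can be summarized in a short decoder. This is a sketch under simplifying assumptions: it handles only the basic lerfu words, lets a shift last until it is contradicted, and uses the hypothetical names <code>letter_of</code> and <code>decode_case</code>.

```python
def letter_of(word: str) -> str:
    """Map a basic lerfu word (with or without pause marks) to its lerfu."""
    bare = word.strip(".")
    if bare == "y'y":
        return "'"
    if len(bare) == 3 and bare.endswith("bu"):
        return bare[0]                      # vowel word such as .abu
    if len(bare) == 2 and bare.endswith("y"):
        return bare[0]                      # consonant word such as vy.
    raise ValueError(f"not a basic lerfu word: {word!r}")


def decode_case(text: str) -> str:
    """Decode lerfu words, honouring ga'e, to'a and the single shift tau."""
    upper = False        # persistent case, set by ga'e and cleared by to'a
    flip_next = False    # set by tau; applies to the next lerfu word only
    out = []
    for token in text.split():
        if token == "ga'e":
            upper = True
        elif token == "to'a":
            upper = False
        elif token == "tau":
            flip_next = True
        else:
            letter = letter_of(token)
            use_upper = upper != flip_next  # tau reverses the case in effect
            out.append(letter.upper() if use_upper else letter)
            flip_next = False
    return "".join(out)


print(decode_case(".ibu ga'e vy. .abu ny. to'a"))  # iVAN
print(decode_case("tau sy. .ibu"))                 # Si
print(decode_case("ga'e tau sy. .ibu"))            # sI
```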
{{ssp|section-bu}} | |||
==The universal {{vla|bu}}== | |||
{{ind|general-imported|lerfu word set extension|with bu}} {{ind|general-imported|bu|for extension of lerfu word set}} So far we have seen | |||
{{vla|bu}} only as a suffix to vowel cmavo to produce vowel lerfu words. Originally, this was the only use of | |||
{{vla|bu}}. In developing the lerfu word system, however, it proved to be useful to allow | |||
{{vla|bu}} to be attached to any word whatsoever, in order to allow arbitrary extensions of the basic lerfu word set. | |||
{{ind|general-imported|fa'o|interaction with bu}} {{ind|general-imported|su|interaction with bu}} {{ind|general-imported|sa|interaction with bu}} {{ind|general-imported|si|interaction with bu}} {{ind|general-imported|lo'u|interaction with bu}} {{ind|general-imported|la'o|interaction with bu}} {{ind|general-imported|zoi|interaction with bu}} {{ind|general-imported|zo|interaction with bu}} {{ind|general-imported|zei|interaction with bu}} {{ind|general-imported|za'e|interaction with bu}} {{ind|general-imported|ba'e|interaction with bu}} {{ind|general-imported|bu|interaction with ba'e}} {{ind|general-imported|bu|and compound cmavo}} {{ind|general-imported|bu|grammar of}} Formally, | |||
{{vla|bu}} may be attached to any single Lojban word. Compound cmavo do not count as words for this purpose. The special cmavo | |||
{{vla|ba'e}}, {{vla|za'e}}, {{vla|zei}}, {{vla|zo}}, {{vla|zoi}}, {{vla|la'o}}, {{vla|lo'u}}, {{vla|si}}, {{vla|sa}}, {{vla|su}}, and {{vla|fa'o}} may not have {{vla|bu}} attached, because they are interpreted before {{vla|bu}} detection is done; in particular, | |||
{{dsp|example-random-id-WvFu}}{{example|interlinear-gloss-example|example-random-id-WvFu}} | |||
{{judri|c17e4d1}} | |||
{{ind|example|word "bu"}} | |||
:'''zo bu''' | |||
:''the word “bu”'' | |||
{{ind|lojban-word-imported|bubu}} {{ind|general-imported|names|pause requirement in lerfu words}} {{ind|general-imported|bu|effect of multiple}} is needed when discussing {{vla|bu}} in Lojban. It is also illegal to attach {{vla|bu}} to itself, but more than one {{vla|bu}} may be attached to a word; thus {{jbo|.abubu}} is legal, if ugly. (Its meaning is not defined, but it is presumably different from | |||
{{vla|.abu}}.) It does not matter if the word is a cmavo, a cmene, or a brivla. All such words suffixed by {{vla|bu}} are treated grammatically as if they were cmavo belonging to selma'o BY. However, if the word is a cmene it is always necessary to precede and follow it by a pause, because otherwise the cmene may absorb preceding or following words. | |||
{{ind|general|happy face|example}} {{ind|general|smiley face|example}} | |||
{{ind|general-imported|logograms|words for}} {{ind|general-imported|smiley face|word for}} {{ind|general-imported|unusual characters|words for}} The ability to attach | |||
{{vla|bu}} to words has been used primarily to make names for various logograms and other unusual characters. For example, the Lojban name for the “happy face” is {{jbo|.uibu}}, based on the attitudinal {{vla|.ui}} that means “happiness”. Likewise, the “smiley face”, written “:-)” and used on computer networks to indicate humor, is called {{jbo|zo'obu}}. The existence of these names does not mean that you should insert {{jbo|.uibu}} into running Lojban text to indicate that you are happy, or {{jbo|zo'obu}} when something is funny; instead, use the appropriate attitudinal directly.
{{ind|general|ampersand|example}} | |||
{{ind|general-imported|ampersand character|word for}} {{ind|general-imported|"&"|word for}} Likewise, | |||
{{jbo|joibu}} represents the ampersand character, “&”, based on the cmavo {{vla|joi}} meaning “mixed and”. Many more such lerfu words will probably be invented in future.
{{ind|general-imported|"|"|word for}} {{ind|general-imported|"."|word for}} {{ind|general-imported|syllable break|word for}} {{ind|general-imported|pause|word for}} {{ind|general-imported|syllable break|symbol for}} {{ind|general-imported|pause|symbol for}} The {{lerfu|.}} and {{lerfu|,}} characters used in Lojbanic writing to represent pause and syllable break respectively have been assigned the lerfu words {{jbo|denpa bu}} (literally, “pause bu”) and {{jbo|slaka bu}} (literally, “syllable bu”). The written space is mandatory here, because {{vla|denpa}} and {{vla|slaka}} are normal gismu with normal stress: {{jbo|denpabu}} would be a fu'ivla (word borrowed from another language into Lojban) stressed {{jbo|denPAbu}}. No pause is required between | |||
{{vla|denpa}} (or {{vla|slaka}}) and {{vla|bu}}, though. | |||
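The attachment rules of this section can be collected into one helper. The sketch below is an illustration only: the name <code>attach_bu</code> is hypothetical, the caller is assumed to state whether the word is a cmavo, a brivla, or a cmene, and the written forms follow the conventions used in this section (cmavo joined, brivla followed by a space, cmene surrounded by pauses).

```python
# cmavo that are interpreted before bu detection and so cannot take bu
NO_BU = {"ba'e", "za'e", "zei", "zo", "zoi", "la'o", "lo'u",
         "si", "sa", "su", "fa'o"}


def attach_bu(word: str, kind: str) -> str:
    """Return the written lerfu word formed by suffixing bu to a single word."""
    bare = word.strip(".")
    if bare == "bu" or bare in NO_BU:
        raise ValueError(f"bu may not be attached to {bare!r}")
    if kind == "cmavo":
        return word + "bu"          # .ui -> .uibu, joi -> joibu
    if kind == "brivla":
        return f"{bare} bu"         # the written space keeps normal stress
    if kind == "cmene":
        return f".{bare}. bu"       # pauses before and after the name
    raise ValueError(f"unknown word kind: {kind!r}")


print(attach_bu(".ui", "cmavo"))     # .uibu
print(attach_bu("joi", "cmavo"))     # joibu
print(attach_bu("denpa", "brivla"))  # denpa bu
```

Because an existing {{vla|bu}} form is passed through as a cmavo, calling the helper on {{jbo|.abu}} yields {{jbo|.abubu}}, matching the rule that more than one {{vla|bu}} may be attached.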
{{ssp|section-alien-alphabets}} | |||
==Alien alphabets== | |||
As stated in | |||
{{ls|section-letterals-introduction}}, Lojban's goal of cultural neutrality demands a standard set of lerfu words for the lerfu of as many other writing systems as possible. When we meet these lerfu in written text (particularly, though not exclusively, mathematical text), we need a standard Lojbanic way to pronounce them. | |||
There are certainly hundreds of alphabets and other writing systems in use around the world, and it is probably an unachievable goal to create a single system which can express all of them, but if perfection is not demanded, a usable system can be created from the raw material which Lojban provides. | |||
{{ind|general|alpha|example}} | |||
{{ind|general-imported|letters|non-Lojban|representation with names}} One possibility would be to use the lerfu word associated with the language itself, Lojbanized and with | |||
{{vla|bu}} added. Indeed, an isolated Greek | |||
“alpha” in running Lojban text is probably most easily handled by calling it | |||
{{jbo|.alfas. bu}}. Here the Greek lerfu word has been made into a Lojbanized name by adding
{{lerfu|s}} and then into a Lojban lerfu word by adding
{{vla|bu}}. Note that the pause after
{{jbo|.alfas.}} is still needed.
{{ind|general-imported|letters|non-Lojban|representation with consonant-word + bu}} Likewise, the easiest way to handle the Latin letters | |||
“h”, “q”, and “w” that are not used in Lojban is by a consonant lerfu word with {{vla|bu}} attached. The following assignments have been made: | |||
;'''{{jbo|.y'y.bu}}''':h | |||
;'''{{jbo|ky.bu}}''':q | |||
;'''{{jbo|vy.bu}}''':w | |||
As an example, the English word “quack” would be spelled in Lojban thus: | |||
{{dsp|example-random-id-0oAR}}{{example|interlinear-gloss-example|example-random-id-0oAR}} | |||
{{judri|c17e5d1}} | |||
{{ind|example|quack|example}} | |||
:'''ky.bu .ubu .abu cy. ky.''' | |||
:''“q” “u” “a” “c” “k”'' | |||
{{ind|general-imported|letters|symbol contrasted with sound for spelling}} {{ind|general-imported|letters|sound contrasted with symbol for spelling}} It is irrelevant that the letter “c” in this word has nothing to do with the sound of the Lojban letter {{lerfu|c}}: we are spelling an English word, so English rules control the choice of letters, but we are speaking Lojban, so Lojban rules control the pronunciation of those letters.
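Spelling an English word therefore needs only the basic lerfu words plus the three assignments for “h”, “q”, and “w”. A minimal sketch follows; the names <code>latin_lerfu_word</code> and <code>spell_latin</code> are assumptions made for illustration, and case is ignored.

```python
# Assignments given above for Latin letters that Lojban itself does not use.
EXTRA = {"h": ".y'y.bu", "q": "ky.bu", "w": "vy.bu"}
VOWELS = set("aeiouy")
CONSONANTS = set("bcdfgjklmnprstvxz")


def latin_lerfu_word(letter: str) -> str:
    """Return the lerfu word for one of the 26 Latin letters (case ignored)."""
    letter = letter.lower()
    if letter in EXTRA:
        return EXTRA[letter]
    if letter in VOWELS:
        return f".{letter}bu"
    if letter in CONSONANTS:
        return f"{letter}y."
    raise ValueError(f"no lerfu word assigned for {letter!r}")


def spell_latin(word: str) -> str:
    """Spell a word letter by letter; the letters follow the source language."""
    return " ".join(latin_lerfu_word(ch) for ch in word)


print(spell_latin("quack"))  # ky.bu .ubu .abu cy. ky.
```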
A few more possibilities for Latin-alphabet letters used in languages other than English: | |||
;'''{{jbo|ty.bu}}''':þ (thorn) | |||
;'''{{jbo|dy.bu}}''':ð (edh) | |||
However, this system is not ideal for all purposes. For one thing, it is verbose. The native lerfu words are often quite long, and with | |||
{{vla|bu}} added they become even longer: the worst-case Greek lerfu word would be | |||
{{jbo|.Omikron. bu}}, with four syllables and two mandatory pauses. In addition, alphabets that are used by many languages have separate sets of lerfu words for each language, and which set is Lojban to choose?
{{ind|general-imported|letters|non-Lojban|representation with language-shift}} {{ind|general-imported|language shift|choice of Lojban-lerfu-word counterpart}} {{ind|general-imported|language shift|effect on following words}} {{ind|general-imported|language shift|rationale for}} {{ind|general-imported|letters|non-Lojban|representation with consonant-word + bu, drawback}} The alternative plan, therefore, is to use a shift word similar to those introduced in | |||
{{ls|section-upper-case}}. After the appearance of such a shift word, the regular lerfu words are re-interpreted to represent the lerfu of the alphabet now in use. After a shift to the Greek alphabet, for example, the lerfu word | |||
{{vla|ty}} would represent not Latin “t” but Greek “tau”. Why “tau”? Because it is, in some sense, the closest counterpart of “t” within the Greek lerfu system. In principle it would be all right to map {{vla|ty.}} to “phi” or even “omega”, but such an arbitrary relationship would be extremely hard to remember. | |||
{{ind|general-imported|bu|interaction with language shift}} {{ind|general-imported|language shift|interaction with bu}} Where no obvious closest counterpart exists, some more or less arbitrary choice must be made. Some alien lerfu may simply not have any shifted equivalent, forcing the speaker to fall back on a | |||
{{vla|bu}} form. Since a | |||
{{vla|bu}} form may mean different things in different alphabets, it is safest to employ a shift word even when | |||
{{vla|bu}} forms are in use. | |||
Shifts for several alphabets have been assigned cmavo of selma'o BY: | |||
;{{vla|lo'a}}:Latin/Roman/Lojban alphabet | |||
;{{vla|ge'o}}:Greek alphabet | |||
;{{vla|je'o}}:Hebrew alphabet | |||
;{{vla|jo'o}}:Arabic alphabet | |||
;{{vla|ru'o}}:Cyrillic alphabet | |||
{{ind|general-imported|language shift|based on name + bu}} {{ind|general-imported|language shift|compound}} {{ind|general-imported|language shift|formation of shift alphabet name}} {{ind|general-imported|Cyrillic alphabet|language shift word for}} {{ind|general-imported|Arabic alphabet|language shift word for}} {{ind|general-imported|Hebrew alphabet|language shift word for}} {{ind|general-imported|Greek alphabet|language shift word for}} {{ind|general-imported|Latin alphabet|language shift word for}} The cmavo | |||
{{vla|zai}} (of selma'o LAU) is used to create shift words to still other alphabets. The BY word which must follow any LAU cmavo would typically be a name representing the alphabet with | |||
{{vla|bu}} suffixed: | |||
{{dsp|example-random-id-qHT3}}{{example|interlinear-gloss-example|example-random-id-qHT3}} | |||
{{judri|c17e5d2}} | |||
{{ind|example|Devanagari|example}} | |||
:'''zai .devanagar. bu''' | |||
:''Devanagari (Hindi) alphabet'' | |||
{{dsp|example-random-id-qhTV}}{{example|interlinear-gloss-example|example-random-id-qhTV}} | |||
{{judri|c17e5d3}} | |||
{{ind|example|Japanese katakana|example}} | |||
{{ind|example|katakana|example}} | |||
:'''zai .katakan. bu''' | |||
:''Japanese katakana syllabary'' | |||
{{dsp|example-random-id-qhud}}{{example|interlinear-gloss-example|example-random-id-qhud}} | |||
{{judri|c17e5d4}} | |||
{{ind|example|Japanese hiragana|example}} | |||
{{ind|example|hiragana|example}} | |||
:'''zai .xiragan. bu''' | |||
:''Japanese hiragana syllabary'' | |||
{{ind|general-imported|language shift|standardization of}} Unlike the cmavo above, these shift words have not been standardized and probably will not be until someone actually has a need for them. (Note the {{lerfu|.}} characters marking leading and following pauses.) | |||
{{ind|general|bold|example}} {{ind|general|italic|example}} | |||
{{ind|general-imported|shift words|for face}} {{ind|general-imported|shift words|for font}} {{ind|general-imported|face|specifying for letters}} {{ind|general-imported|font|specifying for letters}} In addition, there may be multiple visible representations within a single alphabet for a given letter: roman vs. italics, handwriting vs. print, Bodoni vs. Helvetica. These traditional “font and face” distinctions are also represented by shift words, indicated with the cmavo {{vla|ce'a}} (of selma'o LAU) and a following BY word: | |||
{{dsp|example-random-id-qhV0}}{{example|interlinear-gloss-example|example-random-id-qhV0}} | |||
{{judri|c17e5d5}} | |||
{{ind|example|font|example}} | |||
{{ind|example|Helvetica font|example}} | |||
:'''ce'a .xelveticas. bu''' | |||
:''Helvetica font'' | |||
{{dsp|example-random-id-qhv2}}{{example|interlinear-gloss-example|example-random-id-qhv2}} | |||
{{judri|c17e5d6}} | |||
{{ind|example|font|example}} | |||
{{ind|example|handwriting|example}} | |||
:'''ce'a .xancisk. bu''' | |||
:''handwriting'' | |||
{{dsp|example-random-id-qhVb}}{{example|interlinear-gloss-example|example-random-id-qhVb}} | |||
{{judri|c17e5d7}} | |||
{{ind|example|font|example}} | |||
{{ind|example|12-point|example}} | |||
:'''ce'a .pavrel. bu''' | |||
:<code>12-point font size</code> | |||
{{ind|general-imported|lo'a|contrasted with na'a}} {{ind|general-imported|na'a|contrasted with lo'a}} {{ind|general-imported|canceling letter shifts}} {{ind|general-imported|shift words|canceling effect}} The cmavo | |||
{{vla|na'a}} (of selma'o BY) is a universal shift-word cancel: it returns the interpretation of lerfu words to the default of lower-case Lojban with no specific font. It is more general than | |||
{{vla|lo'a}}, which changes the alphabet only, potentially leaving font and case shifts in place. | |||
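The difference between {{vla|lo'a}} and {{vla|na'a}} is easiest to see by tracking the shift state explicitly. The following sketch models alphabet, case, and font as three independent settings; the class <code>ShiftState</code>, the function <code>apply_shifts</code>, and the representation of a {{vla|zai}} or {{vla|ce'a}} argument as a name token followed by a separate <code>bu</code> token are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Standard alphabet shifts of selma'o BY listed above
ALPHABETS = {"lo'a": "Latin", "ge'o": "Greek", "je'o": "Hebrew",
             "jo'o": "Arabic", "ru'o": "Cyrillic"}


@dataclass
class ShiftState:
    alphabet: str = "Latin"      # default: the Lojban (Latin) alphabet
    upper: bool = False          # default: lower case
    font: Optional[str] = None   # default: no specific font


def apply_shifts(tokens):
    """Return the shift state in effect after a sequence of shift words."""
    state = ShiftState()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ALPHABETS:
            state.alphabet = ALPHABETS[tok]   # lo'a etc. touch the alphabet only
        elif tok == "ga'e":
            state.upper = True
        elif tok == "to'a":
            state.upper = False
        elif tok == "na'a":
            state = ShiftState()              # universal cancel: all defaults
        elif tok in ("zai", "ce'a"):
            # LAU cmavo: the following BY word is assumed to be "name bu"
            if i + 2 >= len(tokens) or tokens[i + 2] != "bu":
                raise ValueError(f"{tok} must be followed by a name and bu")
            name = tokens[i + 1].strip(".")
            if tok == "zai":
                state.alphabet = name
            else:
                state.font = name
            i += 2                            # skip the name and its bu
        i += 1
    return state


shifted = "ge'o ga'e ce'a .xelveticas. bu".split()
print(apply_shifts(shifted + ["lo'a"]))  # Latin again, still upper case and Helvetica
print(apply_shifts(shifted + ["na'a"]))  # everything back to the defaults
```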
Several sections at the end of this chapter contain tables of proposed lerfu word assignments for various languages. | |||
{{ssp|section-accents}} | |||
==Accent marks and compound lerfu words== | |||
{{ind|general|tilde|a diacritical mark}} {{ind|general|cedilla|a diacritical mark}} {{ind|general|circumflex|a diacritical mark}} {{ind|general|umlaut|a diacritical mark}} {{ind|general|accent mark|a diacritical mark}} | |||
{{ind|general-imported|letters|non-Lojban|representation of diacritical marks on}} {{ind|general-imported|diacritical marks|as lerfu}} Many languages that make use of the Latin alphabet add special marks to some of the lerfu they use. French, for example, uses three accent marks above vowels, called (in English) | |||
“acute”, | |||
“grave”, and | |||
“circumflex”. Likewise, German uses a mark called | |||
“umlaut”; a mark which looks the same is also used in French, but with a different name and meaning. | |||
{{ind|general-imported|diacritical marks|problem of position}} These marks may be considered lerfu, and each has a corresponding lerfu word in Lojban. So far, no problem. But the marks appear over lerfu, whereas the words must be spoken (or written) either before or after the lerfu word representing the basic lerfu. Typewriters (for mechanical reasons) and the computer programs that emulate them usually require their users to type the accent mark before the basic lerfu, whereas in speech the accent mark is often pronounced afterwards (for example, in German | |||
“a umlaut” is preferred to | |||
“umlaut a”). | |||
{{ind|general-imported|diacritical marks|specifying with tei…foi}} Lojban cannot settle this question by fiat. Either it must be left up to default interpretation depending on the language in question, or the lerfu-word compounding cmavo | |||
{{vla|tei}} (of selma'o TEI) and | |||
{{vla|foi}} (of selma'o FOI) must be used. These cmavo are always used in pairs; any number of lerfu words may appear between them, and the whole is treated as a single compound lerfu word. The French word | |||
“été”, with acute accent marks on both | |||
“e” lerfu, could be spelled as: | |||
{{dsp|example-random-id-NQgb}}{{example|interlinear-gloss-example|example-random-id-NQgb}} | |||
{{judri|c17e6d1}} | |||
{{ind|example|ete}} | |||
:'''tei .ebu .akut. bu foi ty. tei .akut. bu .ebu foi''' | |||
<natlang>( “e” acute ) “t” ( acute “e” )</natlang>
{{ind|general|accent mark|example}} | |||
{{ind|general-imported|diacritical marks|order of specification within tei…foi}} and it does not matter whether | |||
{{jbo|.akut. bu}} appears before or after
{{vla|.ebu}}; the | |||
{{vla|tei}}…{{vla|foi}} grouping guarantees that the acute accent is associated with the correct lerfu. Of course, the level of precision represented by | |||
{{lex|example-random-id-NQgb}} would rarely be required: it might be needed by a Lojban-speaker when spelling out a French word for exact transcription by another Lojban-speaker who did not know French. | |||
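The grouping performed by {{vla|tei}}…{{vla|foi}} can be sketched as a small parser that collects the lerfu words of each pair into one unit. The function names are hypothetical, nested groups are not handled, and a free-standing <code>bu</code> is simply glued to the word before it.

```python
def tokenize(text):
    """Split on spaces, gluing a free-standing 'bu' onto the preceding word."""
    words = []
    for tok in text.split():
        if tok == "bu" and words:
            words[-1] += " bu"
        else:
            words.append(tok)
    return words


def group_lerfu(text):
    """Return lerfu words, with each tei...foi span collected into a tuple."""
    result, current = [], None
    for word in tokenize(text):
        if word == "tei":
            if current is not None:
                raise ValueError("nested tei is not handled in this sketch")
            current = []
        elif word == "foi":
            if current is None:
                raise ValueError("foi without a matching tei")
            result.append(tuple(current))
            current = None
        elif current is not None:
            current.append(word)     # inside a compound lerfu word
        else:
            result.append(word)      # ordinary lerfu word
    if current is not None:
        raise ValueError("tei without a matching foi")
    return result


def same_compound(a, b):
    """Compare two accent-plus-letter compounds, ignoring spoken order."""
    return sorted(a) == sorted(b)


groups = group_lerfu("tei .ebu .akut. bu foi ty. tei .akut. bu .ebu foi")
print(groups[1])                            # ty.
print(same_compound(groups[0], groups[2]))  # True
```

Ignoring order is appropriate only for a basic lerfu with an accent mark; for letter combinations the spoken order would have to be kept.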
{{ind|general-imported|diacritical marks|problem with multiple on one lerfu}} This system breaks down in languages which use more than one accent mark on a single lerfu; some other convention must be used for showing which accent marks are written where in that case. The obvious convention is to represent the mark nearest the basic lerfu by the lerfu word closest to the word representing the basic lerfu. Any remaining ambiguities must be resolved by further conventions not yet established. | |||
{{ind|general|Spanish ch|example}} {{ind|general|Spanish ll|example}} | |||
{{ind|general-imported|compound letters|native language|representing as distinct letters}} {{ind|general-imported|accented letters|considered as distinct from unaccented}} {{ind|general-imported|diacritical marks|considered as forming distinct letters}} Some languages, like Swedish and Finnish, consider certain accented lerfu to be completely distinct from their unaccented equivalents, but Lojban does not make a formal distinction, since the printed characters look the same whether they are reckoned as separate letters or not. In addition, some languages consider certain 2-letter combinations (like | |||
"ll" and | |||
"ch" in Spanish) to be letters; this may be represented by enclosing the combination in | |||
{{vla|tei}}…{{vla|foi}}. | |||
{{ind|general-imported|lerfu words|forming new for non-Lojban letters using bu}} In addition, when discussing a specific language, it is permissible to make up new lerfu words, as long as they are either explained locally or well understood from context: thus Spanish | |||
"ll" or Croatian | |||
"lj" could be called | |||
{{vla|libu}}, but that usage would not necessarily be universally understood.
{{ls|section-accents-multiple-letters}} contains a table of proposed lerfu words for some common accent marks. | |||
{{ssp|section-punctuation}} | |||
==Punctuation marks== | |||
{{ind|general-imported|lau|effect on following lerfu word}} {{ind|general-imported|punctuation lerfu words|mechanism for creating}} Lojban does not have punctuation marks as such: the denpa bu and the slaka bu are really a part of the alphabet. Other languages, however, use punctuation marks extensively. As yet, Lojban does not have any words for these punctuation marks, but a mechanism exists for devising them: the cmavo | |||
{{vla|lau}} of selma'o LAU. | |||
{{vla|lau}} must always be followed by a BY word; the interpretation of the BY word is changed from a lerfu to a punctuation mark. Typically, this BY word would be a name or brivla with a | |||
{{vla|bu}} suffix. | |||
{{ind|general-imported|punctuation lerfu words|rationale for lau}} Why is | |||
{{vla|lau}} necessary at all? Why not just use a | |||
{{vla|bu}}-marked word and announce that it is always to be interpreted as a punctuation mark? Primarily to avoid ambiguity. The | |||
{{vla|bu}} mechanism is extremely open-ended, and it is easy for Lojban users to make up | |||
{{vla|bu}} words without bothering to explain what they mean. Using the | |||
{{vla|lau}} cmavo flags at least the most important of such nonce lerfu words as having a special function: punctuation. (Exactly the same argument applies to the use of | |||
{{vla|zai}} to signal an alphabet shift or | |||
{{vla|ce'a}} to signal a font shift.) | |||
{{ind|general-imported|punctuation lerfu words|interaction with different alphabet systems}} Since different alphabets require different punctuation marks, the interpretation of a | |||
{{vla|lau}}-marked lerfu word is affected by the current alphabet shift and the current font shift. | |||
{{ssp|section-chinese-characters}} | |||
==What about Chinese characters?== | |||
{{ind|general-imported|Amharic writing}} {{ind|general-imported|syllabaries|lerfu word representation}} {{ind|general-imported|hiragana|contrasted with kanji}} {{ind|general-imported|kanji|contrasted with alphabets and syllabaries}} {{ind|general-imported|Chinese characters|contrasted with alphabets and syllabaries}} Chinese characters ( | |||
"han | |||
<sup>4</sup> zi | |||
<sup>4</sup>" in Chinese, | |||
{{vla|kanji}} in Japanese) represent an entirely different approach to writing from alphabets or syllabaries. (A syllabary, such as Japanese hiragana or Amharic writing, has one lerfu for each syllable of the spoken language.) Very roughly, Chinese characters represent single elements of meaning; also very roughly, they represent single syllables of spoken Chinese. There is in principle no limit to the number of Chinese characters that can exist, and many thousands are in regular use. | |||
It is hopeless for Lojban, with its limited lerfu and shift words, to create an alphabet which will match this diversity. However, there are various possible ways around the problem. | |||
{{ind|general-imported|romaji|as a basis for kanji characters in Lojban lerfu words}} {{ind|general-imported|pinyin|as a basis for Chinese characters in Lojban lerfu words}} {{ind|general-imported|kanji|representing based on romaji spelling}} {{ind|general-imported|Chinese characters|representing based on pinyin spelling}} First, both Chinese and Japanese have standard Latin-alphabet representations, known as | |||
“pinyin” for Chinese and | |||
“romaji” for Japanese, and these can be used. Thus, the word | |||
"han | |||
<sup>4</sup> zi | |||
<sup>4</sup>" is conventionally written with two characters, but it may be spelled out as: | |||
{{dsp|example-random-id-fBfe}}{{example|interlinear-gloss-example|example-random-id-fBfe}} | |||
{{judri|c17e8d1}} | |||
{{ind|example|han^{4}zi^{4}}} | |||
:'''.y'y.bu .abu ny. vo zy. .ibu vo''' | |||
:'' “h” “a” “n” 4 “z” “i” 4 '' | |||
{{ind|general-imported|lerfu words with numeric digits|grammar considerations}} {{ind|general-imported|numeric digits in lerfu words|grammar considerations}} The cmavo | |||
{{vla|vo}} is the Lojban digit | |||
“4”. It is grammatical to intersperse digits (of selma'o PA) into a string of lerfu words; as long as the first cmavo is a lerfu word, the whole will be interpreted as a string of lerfu words. In Chinese, the digits can be used to represent tones. Pinyin is more usually written using accent marks, the mechanism for which was explained in | |||
{{ls|section-accents}}. | |||
The Japanese company named | |||
“Mitsubishi” in English is spelled the same way in romaji, and could be spelled out in Lojban thus: | |||
{{dsp|example-random-id-pLUV}}{{example|interlinear-gloss-example|example-random-id-pLUV}} | |||
{{judri|c17e8d2}} | |||
{{ind|example|Mitsubishi|example}} | |||
:'''my. .ibu ty. sy. .ubu by. .ibu sy. .y'y.bu .ibu''' | |||
<natlang> | |||
“m” | |||
“i” | |||
“t” | |||
“s” | |||
“u” | |||
“b” | |||
“i” | |||
“s” | |||
“h” | |||
“i” | |||
</natlang> | |||
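The two spellings above can be reproduced mechanically. The following is a minimal Python sketch, not part of any existing Lojban tool: the names <code>lerfu_word</code> and <code>spell</code> are hypothetical, and the table covers only the Lojban consonants and vowels, the letter "h" (rendered as in the examples above), and the ten decimal digits.

```python
# Hypothetical helper for illustration only; not an official Lojban tool.
CONSONANTS = "bcdfgjklmnprstvxz"
VOWELS = "aeiou"
DIGITS = ["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so"]


def lerfu_word(ch):
    """Return the lerfu word (or digit cmavo) for one Latin character."""
    if ch in CONSONANTS:
        return ch + "y."          # consonant lerfu words: by., cy., dy., ...
    if ch in VOWELS:
        return "." + ch + "bu"    # vowel lerfu words: .abu, .ebu, ...
    if ch == "h":
        return ".y'y.bu"          # "h" as spelled in this section's examples
    if ch in "0123456789":
        return DIGITS[int(ch)]    # digits of selma'o PA, e.g. tone numbers
    raise ValueError(f"no lerfu word known for {ch!r}")


def spell(text):
    """Spell out a romanized word, one lerfu word per character."""
    return " ".join(lerfu_word(ch) for ch in text.lower() if not ch.isspace())


print(spell("han4 zi4"))
print(spell("Mitsubishi"))
```

The sketch does not check the grammatical condition stated above, namely that the first cmavo of the string must be a lerfu word rather than a digit.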
{{ind|general-imported|kanji|representing based on strokes}} {{ind|general-imported|Chinese characters|representing based on strokes}} Alternatively, a really ambitious Lojbanist could assign lerfu words to the individual strokes used to write Chinese characters (there are about seven or eight of them if you are a flexible human being, or about 40 if you are a rigid computer program), and then represent each character with a | |||
{{vla|tei}}, the stroke lerfu words in the order of writing (which is standardized for each character), and a | |||
{{vla|foi}}. No one has as yet attempted this project. | |||
{{ssp|section-lerfu-pro-sumti}} | |||
==lerfu words as pro-sumti== | |||
{{ind|general-imported|lerfu string|definition}} So far, lerfu words have only appeared in Lojban text when spelling out words. There are several other grammatical uses of lerfu words within Lojban. In each case, a single lerfu word or more than one may be used. Therefore, the term | |||
“lerfu string” is introduced: it is short for | |||
“sequence of one or more lerfu words”. | |||
{{ind|general-imported|lerfu string|as pro-sumti}} A lerfu string may be used as a pro-sumti (a sumti which refers to some previous sumti), just like the pro-sumti | |||
{{vla|ko'a}}, | |||
{{vla|ko'e}}, and so on: | |||
{{dsp|example-random-id-2wo8}}{{example|interlinear-gloss-example|example-random-id-2wo8}} | |||
{{judri|c17e9d1}} | |||
{{ind|example|A loves B|example}} | |||
:'''.abu prami by.''' | |||
:''A loves B'' | |||
In | |||
{{lex|example-random-id-2wo8}}, | |||
{{vla|.abu}} and | |||
{{vla|by.}} represent specific sumti, but which sumti they represent must be inferred from context. | |||
{{ind|general-imported|lerfu string|as pro-sumti assigned by goi}} Alternatively, lerfu strings may be assigned by | |||
{{vla|goi}}, the regular pro-sumti assignment cmavo: | |||
{{dsp|example-random-id-i7Ny}}{{example|interlinear-gloss-example|example-random-id-i7Ny}} | |||
{{judri|c17e9d2}} | |||
:'''le gerku goi gy. cu xekri .i gy. klama le zdani''' | |||
:''The dog, or G, is black. G goes to the house.'' | |||
{{ind|general-imported|lerfu string|as pro-sumti|assumption of reference}} There is a special rule that sometimes makes lerfu strings more advantageous than the regular pro-sumti cmavo. If no assignment can be found for a lerfu string (especially a single lerfu word), it can be assumed to refer to the most recent sumti whose name or description begins in Lojban with that lerfu. So | |||
{{lex|example-random-id-i7Ny}} can be rephrased: | |||
{{dsp|example-random-id-7hVs}}{{example|interlinear-gloss-example|example-random-id-7hVs}} | |||
{{judri|c17e9d3}} | |||
:'''le gerku cu xekri .i gy. klama le zdani'''
:''The dog is black. G goes to the house.'' | |||
(A less literal English translation would use | |||
“D” for | |||
“dog” instead.) | |||
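The lookup order just described (an explicit assignment first, otherwise the most recent sumti beginning with that lerfu) can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, it handles single lerfu words only, and it assumes that a leading descriptor such as {{vla|le}} is skipped when finding the first lerfu of a description, as the {{vla|gerku}} example implies.

```python
# Hypothetical sketch of the reference rule; names are illustrative only.
DESCRIPTORS = {"le", "lo", "la", "lei", "loi", "lai"}  # assumed to be skipped


def lerfu_to_letter(word):
    """Map a single lerfu word to its letter: 'gy.' -> 'g', '.abu' -> 'a'."""
    w = word.replace(".", "")
    if w.endswith("bu"):
        return w[:-2]
    if w.endswith("y"):
        return w[:-1]
    raise ValueError(f"not a simple lerfu word: {word!r}")


def resolve(word, assignments, history):
    """Return the sumti a lerfu pro-sumti refers to, or None if unknown.

    assignments: explicit goi assignments, e.g. {"gy.": "le gerku"}
    history: earlier sumti in order of appearance, oldest first
    """
    if word in assignments:
        return assignments[word]
    letter = lerfu_to_letter(word)
    for sumti in reversed(history):  # most recent sumti first
        words = [w for w in sumti.split() if w not in DESCRIPTORS]
        if words and words[0].lstrip(".").startswith(letter):
            return sumti
    return None


print(resolve("gy.", {}, ["le gerku"]))
```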
Here is an example using two names and longer lerfu strings: | |||
{{dsp|example-random-id-uAAF}}{{example|interlinear-gloss-example|example-random-id-uAAF}} | |||
{{judri|c17e9d4}} | |||
{{ind|example|Alexander Pavlovitch Kuznetsov|example}} | |||
{{ind|example|Steven Mark Jones|example}} | |||
:'''la stivn. mark. djonz. merko .i la .aleksandr. paliitc. kuzNIETsyf. rusko .i symyjy. tavla .abupyky. bau la lojban.''' | |||
:<code>Steven Mark Jones is-American. Alexander Pavlovitch Kuznetsov is-Russian.</code> | |||
:<code>SMJ talks-to APK in Lojban.</code> | |||
Perhaps Alexander's name should be given as | |||
:{{jbo|ru'o.abupyky}} instead. | |||
{{ind|general-imported|lerfu strings|as pro-sumti|for multiple sumti separated by boi}} What about | |||
{{dsp|example-random-id-gJFz}}{{example|interlinear-gloss-example|example-random-id-gJFz}} | |||
{{judri|c17e9d5}} | |||
{{ind|example|A gives BC|example}} | |||
:'''.abu dunda by. cy.''' | |||
:''A gives B C'' | |||
{{ind|general-imported|boi|eliding from lerfu strings}} Does this mean that A gives B to C? No. | |||
:{{jbo|by. cy.}} is a single lerfu string, although written as two words, and represents a single pro-sumti. The true interpretation is that A gives BC to someone unspecified. To solve this problem, we need to introduce the elidable terminator | |||
{{vla|boi}} (of selma'o BOI). This cmavo is used to terminate lerfu strings and also strings of numerals; it is required when two of these appear in a row, as here. (The other reason to use | |||
{{vla|boi}} is to attach a free modifier – subscript, parenthesis, or what have you – to a lerfu string.) The correct version is: | |||
{{dsp|example-random-id-Hdwz}}{{example|interlinear-gloss-example|example-random-id-Hdwz}} | |||
{{judri|c17e9d6}} | |||
{{ind|example|A gives B to C|example}} | |||
:'''.abu [boi] dunda by. boi cy. [boi]''' | |||
:''A gives B to C'' | |||
where the two occurrences of | |||
{{vla|boi}} in brackets are elidable, but the remaining occurrence is not. Likewise: | |||
{{dsp|example-random-id-L9op}}{{example|interlinear-gloss-example|example-random-id-L9op}} | |||
{{judri|c17e9d7}} | |||
:'''xy. boi ro [boi] prenu cu prami''' | |||
:<code>X all persons loves.</code> | |||
:''X loves everybody.'' | |||
{{ind|general-imported|pro-sumti|lerfu strings|interaction with quantifiers and boi}} {{ind|general-imported|boi|required between pro-sumti lerfu string and quantifier}} requires the first | |||
{{vla|boi}} to separate the lerfu string | |||
{{vla|xy.}} from the digit string | |||
{{vla|ro}}. | |||
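The rule for when {{vla|boi}} may not be elided can be sketched as a small Python function. It is an illustration only: the name <code>join_terms</code> and the three labels "lerfu", "number" and "other" are assumptions made for this sketch, and the caller is expected to have already divided the sentence into terms.

```python
# Hypothetical sketch: emit boi only where this section says it is required.
def join_terms(terms):
    """Join (kind, text) terms, adding boi between adjacent strings.

    kind is "lerfu" for a lerfu string, "number" for a digit string,
    and "other" for anything else (selbri, cu, descriptors, ...).
    """
    out = []
    for idx, (kind, text) in enumerate(terms):
        out.append(text)
        next_kind = terms[idx + 1][0] if idx + 1 < len(terms) else None
        # boi is needed only when two lerfu or digit strings are adjacent
        if kind in ("lerfu", "number") and next_kind in ("lerfu", "number"):
            out.append("boi")
    return " ".join(out)


print(join_terms([("lerfu", ".abu"), ("other", "dunda"),
                  ("lerfu", "by."), ("lerfu", "cy.")]))
```

All other occurrences of {{vla|boi}} are left out, which matches the bracketed (elidable) occurrences in the examples above.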
{{ssp|section-meho}} | |||
==References to lerfu== | |||
{{ind|general-imported|pro-sumti|lerfu string|effect on reference to lerfu itself}} {{ind|general-imported|lerfu|reference to}} The rules of | |||
{{ls|section-lerfu-pro-sumti}} make it impossible to use unmarked lerfu words to refer to lerfu themselves. In the sentence: | |||
{{dsp|example-random-id-CYny}}{{example|interlinear-gloss-example|example-random-id-CYny}} | |||
{{judri|c17e10d1}} | |||
{{ind|example|"a" is letteral|example}} | |||
:'''.abu. cu lerfu''' | |||
:<code>A is-a-letteral.</code> | |||
{{ind|general-imported|lerfu|referring to with me'o}} the hearer would try to find what previous sumti | |||
{{vla|.abu}} refers to. The solution to this problem makes use of the cmavo | |||
{{vla|me'o}} of selma'o LI, which makes a lerfu string into a sumti representing that very string of lerfu. This use of | |||
{{vla|me'o}} is a special case of its mathematical use, which is to introduce a mathematical expression used literally rather than for its value. | |||
{{dsp|example-random-id-Yy32}}{{example|interlinear-gloss-example|example-random-id-Yy32}} | |||
{{judri|c17e10d2}} | |||
{{ind|example|"a" is letteral|example}} | |||
:'''me'o .abu cu lerfu''' | |||
:''The-expression “a” is-a-letteral.'' | |||
Now we can translate | |||
{{lex|example-random-id-tvHm}} into Lojban: | |||
{{dsp|example-random-id-UT1J}}{{example|interlinear-gloss-example|example-random-id-UT1J}} | |||
{{judri|c17e10d3}} | |||
{{ind|example|four "e"s|example}} | |||
:'''dei vasru vo lerfu po'u me'o .ebu''' | |||
:<code>this-sentence contains four letterals which-are the-expression “e”.</code> | |||
:''This sentence contains four “e”s.''
Since the Lojban sentence has only four
{{lerfu|e}} lerfu rather than fourteen, the translation is not a literal one – but
{{lex|example-random-id-UT1J}} is a Lojban truth just as
{{lex|example-random-id-tvHm}} is an English truth. Coincidentally, the colloquial English translation of
{{lex|example-random-id-UT1J}} is also true!
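The three counts claimed here are easy to check by machine. The snippet below is a small Python sketch; the function name <code>count_letteral</code> is hypothetical, and the sentences are copied from the examples in this chapter.

```python
# Hypothetical helper: count occurrences of one letteral in a sentence.
def count_letteral(sentence, letteral):
    return sentence.lower().count(letteral)


english = "There are fourteen occurrences of the letteral “e” in this sentence."
lojban = "dei vasru vo lerfu po'u me'o .ebu"
colloquial = "This sentence contains four “e”s."

for text in (english, lojban, colloquial):
    print(count_letteral(text, "e"), text)
```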
{{ind|lojban-word-imported|la'e}} {{ind|general-imported|la'e lu|compared with me'o}} {{ind|general-imported|me'o|compared with la'e lu}} {{ind|general-imported|representing lerfu|lu contrasted with me'o}} {{ind|general-imported|lu|contrasted with me'o for representing lerfu}} {{ind|general-imported|me'o|contrasted with lu…li'u for representing lerfu}} {{ind|general-imported|me'o|contrasted with quotation for representing lerfu}} {{ind|general-imported|quotation|contrasted with me'o for representing lerfu}} The reader might be tempted to use quotation with | |||
{{vla|lu}}…{{vla|li'u}} instead of | |||
{{vla|me'o}}, producing: | |||
{{dsp|example-random-id-pbDf}}{{example|interlinear-gloss-example|example-random-id-pbDf}} | |||
{{judri|c17e10d4}} | |||
:'''lu .abu li'u cu lerfu''' | |||
:<code>[quote] .abu [unquote] is-a-letteral.</code> | |||
(The single-word quote | |||
{{vla|zo}} cannot be used, because | |||
{{vla|.abu}} is a compound cmavo.) But | |||
{{lex|example-random-id-pbDf}} is false, because it says: | |||
{{dsp|example-random-id-P8Ag}} | |||
{{judri|c17e10d5}} | |||
{{ind|example|word "abu"|example}} | |||
The word | |||
{{vla|.abu}} is a letteral | |||
which is not the case; rather, the thing symbolized by the word | |||
{{vla|.abu}} is a letteral. In Lojban, that would be: | |||
{{dsp|example-random-id-Da4r}}{{example|interlinear-gloss-example|example-random-id-Da4r}} | |||
{{judri|c17e10d6}} | |||
:'''la'e lu .abu li'u cu lerfu''' | |||
:<code>The-referent-of [quote] .abu [unquote] is-a-letteral.</code> | |||
which is correct. | |||
{{ssp|section-math}} | |||
==Mathematical uses of lerfu strings== | |||
{{ind|general-imported|lerfu strings|uses in mathematics}} {{ind|general-imported|mathematics|use of lerfu strings in}} This chapter is not about Lojban mathematics, which is explained in | |||
{{lch|chapter-mekso}}, so the mathematical uses of lerfu strings will be listed and exemplified but not explained. | |||
* {{ind|general-imported|mathematical variables|lerfu strings as}} {{ind|general-imported|lerfu string|as mathematical variable}} A lerfu string as mathematical variable: | |||
{{dsp|example-random-id-1Nuz}}{{example|interlinear-gloss-example|example-random-id-1Nuz}} | |||
{{judri|c17e11d1}} | |||
:'''li .abu du li by. su'i cy.''' | |||
:<code>the-number a equals the-number b plus c</code> | |||
:''a = b + c'' | |||
* {{ind|general-imported|function name|lerfu string as}} {{ind|general-imported|lerfu string|as function name}} A lerfu string as function name (preceded by | |||
{{vla|ma'o}} of selma'o MAhO): | |||
{{dsp|example-random-id-H0SM}}{{example|interlinear-gloss-example|example-random-id-H0SM}} | |||
{{judri|c17e11d2}} | |||
{{ind|example|function f of x|example}} | |||
:'''li .y.bu du li ma'o fy. boi xy.''' | |||
:<code>the-number y equals the-number the-function f of x</code>
{{math|y = f(x)}} | |||
Note the | |||
{{vla|boi}} here to separate the lerfu strings | |||
{{vla|fy}} and | |||
{{vla|xy}}. | |||
* {{ind|general-imported|selbri|lerfu string as}} {{ind|general-imported|lerfu string|as selbri}} A lerfu string as selbri (followed by a cmavo of selma'o MOI): | |||
{{dsp|example-random-id-X4KM}}{{example|interlinear-gloss-example|example-random-id-X4KM}} | |||
{{judri|c17e11d3}} | |||
{{ind|example|Nth rat|example}} | |||
:'''le vi ratcu ny.moi le'i mi ratcu''' | |||
:<code>the here rat is-nth-of the-set-of my rats</code> | |||
:''This rat is my Nth rat.'' | |||
* | |||
{{ind|general-imported|utterance ordinal|lerfu string as}} {{ind|general-imported|lerfu string|as utterance ordinal}} A lerfu string as utterance ordinal (followed by a cmavo of selma'o MAI): | |||
{{dsp|example-random-id-Jw40}}{{example|interlinear-gloss-example|example-random-id-Jw40}} | |||
{{judri|c17e11d4}} | |||
{{ind|example|Nthly|example}} | |||
:'''ny.mai''' | |||
:''Nthly'' | |||
* | |||
{{ind|general-imported|subscripts|lerfu string as}} {{ind|general-imported|lerfu string|as subscript}} A lerfu string as subscript (preceded by | |||
{{vla|xi}} of selma'o XI): | |||
{{dsp|example-random-id-oTgS}}{{example|interlinear-gloss-example|example-random-id-oTgS}} | |||
{{judri|c17e11d5}} | |||
{{ind|example|x sub k|example}} | |||
:'''xy. xi ky.''' | |||
:<code>x sub k</code> | |||
* | |||
{{ind|general-imported|quantifier|lerfu string as}} {{ind|general-imported|lerfu string|as quantifier}} A lerfu string as quantifier (enclosed in | |||
{{vla|vei}}…{{vla|ve'o}} parentheses): | |||
{{dsp|example-random-id-bbnL}}{{example|interlinear-gloss-example|example-random-id-bbnL}} | |||
{{judri|c17e11d6}} | |||
{{ind|example|n people|example}} | |||
:'''vei ny. [ve'o] lo prenu''' | |||
:<code>(“n”) persons</code>
{{ind|general-imported|lerfu strings|as quantifiers|avoiding interaction with sumti quantified}} The parentheses are required because | |||
:{{jbo|ny. lo prenu}} would be two separate sumti, | |||
{{vla|ny.}} and | |||
:{{jbo|lo prenu}}. In general, any mathematical expression other than a simple number must be in parentheses when used as a quantifier; the right parenthesis mark, the cmavo | |||
{{vla|ve'o}}, can usually be elided. | |||
{{ind|general-imported|lerfu juxtaposition interpretation|contrasted with mathematical interpretation}} {{ind|general-imported|lerfu string|interpretation|contrasted with mathematical interpretation}} All the examples above have exhibited single lerfu words rather than lerfu strings, in accordance with the conventions of ordinary mathematics. A longer lerfu string would still be treated as a single variable or function name: in Lojban, | |||
:{{jbo|.abu by. cy.}} is not the multiplication | |||
“{{math|a × b × c}}” but is the variable | |||
abc. (Of course, a local convention could be employed that made the value of a variable like | |||
abc, with a multi-lerfu-word name, equal to the values of the variables | |||
a, | |||
b, and | |||
c multiplied together.) | |||
{{ind|general-imported|lerfu shift scope|exception for mathematical texts}} {{ind|general-imported|mathematical texts|effect on lerfu shift scope}} There is a special rule about shift words in mathematical text: shifts within mathematical expressions do not affect lerfu words appearing outside mathematical expressions, and vice versa. | |||
{{ssp|section-acronyms}} | |||
==Acronyms== | |||
{{ind|general-imported|acronym|definition}} An acronym is a name constructed of lerfu. English examples are | |||
“DNA”, | |||
“NATO”, | |||
“CIA”. In English, some of these are spelled out (like | |||
“DNA” and | |||
“CIA”) and others are pronounced more or less as if they were ordinary English words (like | |||
“NATO”). Some acronyms fluctuate between the two pronunciations: | |||
“SQL” may be | |||
“ess cue ell” or | |||
“sequel”. | |||
{{ind|general-imported|lerfu words|as a basis for acronym names}} {{ind|general-imported|acronyms|using names based on lerfu words}} In Lojban, a name can be almost any sequence of sounds that ends in a consonant and is followed by a pause. The easiest way to Lojbanize acronym names is to glue the lerfu words together, using | |||
{{lerfu|'}} wherever two vowels would come together (pauses are illegal in names) and adding a final consonant: | |||
{{dsp|example-random-id-736i}}{{example|interlinear-gloss-example|example-random-id-736i}} | |||
{{judri|c17e12d1}} | |||
:'''la dyny'abub. .i la ny'abuty'obub. .i la cy'ibu'abub. .i la sykybulyl. .i la .ibubymym. .i la ny'ybucyc.''' | |||
:''DNA. NATO. CIA. SQL. IBM. NYC.'' | |||
{{ind|general-imported|acronym names from lerfu words|assigning final consonant}} There is no fixed convention for assigning the final consonant. In | |||
{{lex|example-random-id-736i}}, the last consonant of the lerfu string has been replicated into final position. | |||
{{ind|general-imported|bu|omitting in acronyms names based on lerfu words}} {{ind|general-imported|acronyms names based on lerfu words|omitting bu}} Some compression can be done by leaving out | |||
{{vla|bu}} after vowel lerfu words (except for | |||
{{vla|.y.bu}}, wherein the | |||
{{vla|bu}} cannot be omitted without ambiguity). Compression is moderately important because it's hard to say long names without introducing an involuntary (and illegal) pause: | |||
{{dsp|example-random-id-0sin}}{{example|interlinear-gloss-example|example-random-id-0sin}} | |||
{{judri|c17e12d2}} | |||
{{ind|example|DNA|example}} | |||
{{ind|example|NATO|example}} | |||
{{ind|example|CIA|example}} | |||
{{ind|example|SQL|example}} | |||
{{ind|example|IBM|example}} | |||
{{ind|example|NYC|example}} | |||
:'''la dyny'am. .i la ny'aty'om. .i la cy'i'am. .i la sykybulym. .i la .ibymym. .i la ny'ybucym.''' | |||
:''DNA. NATO. CIA. SQL. IBM. NYC.'' | |||
In | |||
{{lex|example-random-id-0sin}}, the final consonant | |||
{{lerfu|m}} stands for | |||
{{vla|merko}}, indicating the source culture of these acronyms. | |||
{{ind|general-imported|"z" instead of "'"|in acronyms names based on lerfu words}} {{ind|general-imported|acronyms names based on lerfu words|using "z" instead of "'" in}} Another approach, which some may find easier to say and which is compatible with older versions of the language that did not have a | |||
{{lerfu|'}} character, is to use the consonant | |||
{{lerfu|z}} instead of | |||
{{lerfu|'}}: | |||
{{dsp|example-random-id-Js6m}}{{example|interlinear-gloss-example|example-random-id-Js6m}} | |||
{{judri|c17e12d3}} | |||
:'''la dynyzaz. .i la nyzatyzoz. .i la cyzizaz. .i la sykybulyz. .i la .ibymyz. .i la nyzybucyz.''' | |||
:''DNA. NATO. CIA. SQL. IBM. NYC.'' | |||
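The three gluing conventions shown in this section follow one procedure, which can be sketched in Python. The function names are hypothetical, the letter table covers only the Lojban consonants, the vowels, "y", and "q" (as {{vla|ky.bu}}, following the SQL example), and the choice of final consonant is left to the caller because no convention is fixed.

```python
# Hypothetical sketch of the acronym-gluing procedure; names are illustrative.
CONSONANTS = "bcdfgjklmnprstvxz"
VOWELS = "aeiou"


def piece(letter, drop_bu):
    """Return the lerfu word for one letter, without pauses."""
    if letter in CONSONANTS:
        return letter + "y"
    if letter in VOWELS:
        return letter if drop_bu else letter + "bu"
    if letter == "y":
        return "ybu"   # bu cannot be dropped after y without ambiguity
    if letter == "q":
        return "kybu"  # "q" as ky.bu, as in the SQL example
    raise ValueError(f"unsupported letter {letter!r}")


def acronym_name(acronym, final=None, drop_bu=False, glue="'"):
    """Glue lerfu words into a name ending in a consonant and a pause."""
    if not acronym:
        raise ValueError("empty acronym")
    name = ""
    for letter in acronym.lower():
        p = piece(letter, drop_bu)
        # every piece ends in a vowel, so a vowel-initial piece needs the glue
        if name and p[0] in VOWELS + "y":
            name += glue
        name += p
    if final is None:
        # default: replicate the last consonant of the lerfu string
        final = next(ch for ch in reversed(name) if ch in CONSONANTS)
    lead = "." if name[0] in VOWELS + "y" else ""
    return lead + name + final + "."


print(acronym_name("DNA"))                                      # full form
print(acronym_name("NATO", final="m", drop_bu=True))            # compressed
print(acronym_name("CIA", final="z", drop_bu=True, glue="z"))   # z for '
```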
{{ind|general-imported|acronyms|as lerfu strings using "me"}} {{ind|general-imported|lerfu strings|as acronyms using "me"}} One more alternative to these lengthy names is to use the lerfu string itself prefixed with | |||
{{vla|me}}, the cmavo that makes sumti into selbri: | |||
{{dsp|example-random-id-iMRB}}{{example|interlinear-gloss-example|example-random-id-iMRB}} | |||
{{judri|c17e12d4}} | |||
:'''la me dy ny. .abu''' | |||
<natlang>that-named what-pertains-to | |||
“d” | |||
“n” | |||
“a”</natlang> | |||
This works because | |||
{{vla|la}}, the cmavo that normally introduces names used as sumti, may also be used before a predicate to indicate that the predicate is a (meaningful) name: | |||
{{dsp|example-random-id-7KLi}}{{example|interlinear-gloss-example|example-random-id-7KLi}} | |||
{{judri|c17e12d5}} | |||
:'''la cribe cu ciska''' | |||
:<code>That-named “Bear” writes.</code>
:''Bear is a writer.'' | |||
{{lex|example-random-id-7KLi}} does not of course refer to a bear ( | |||
:{{jbo|le cribe}} or | |||
:{{jbo|lo cribe}}) but to something else, probably a person, named | |||
“Bear”. Similarly, | |||
:{{jbo|me dy ny. .abu}} is a predicate which can be used as a name, producing a kind of acronym which can have pauses between the individual lerfu words. | |||
{{ssp|section-character-codes}} | |||
==Computerized character codes== | |||
{{ind|general-imported|letter encoding schemes|application to lerfu words}} {{ind|general-imported|character encoding schemes|application to lerfu words}} {{ind|general-imported|lerfu words|using computer encoding schemes with se'e}} {{ind|general-imported|characters|definition}} {{ind|general-imported|character codes|definition}} Since the first application of computers to non-numerical information, character sets have existed, mapping numbers (called | |||
“character codes”) into selected lerfu, digits, and punctuation marks (collectively called | |||
“characters”). Historically, these character sets have only covered the English alphabet and a few selected punctuation marks. International efforts have now created Unicode, a unified character set that can represent essentially all the characters in essentially all the world's writing systems. Lojban can take advantage of these encoding schemes by using the cmavo | |||
{{vla|se'e}} (of selma'o BY). This cmavo is conventionally followed by digit cmavo of selma'o PA representing the character code, and the whole string indicates a single character in some computerized character set: | |||
{{dsp|example-random-id-r2jv}}{{example|interlinear-gloss-example|example-random-id-r2jv}} | |||
{{judri|c17e13d1}} | |||
{{ind|example|$}} | |||
{{ind|example|American dollars}} | |||
:'''me'o se'ecixa cu lerfu la .asycy'i'is. loi merko rupnu''' | |||
:<code>The-expression [code] 36 is-a-letteral in-set ASCII for-the-mass-of American currency-units.</code> | |||
:''The character code 36 in ASCII represents American dollars.'' | |||
:''“$” represents American dollars.'' | |||
{{ind|general-imported|ASCII|application to lerfu words}} Understanding | |||
{{lex|example-random-id-r2jv}} depends on knowing the value in the ASCII character set (one of the simplest and oldest) of the | |||
“$” character. Therefore, the | |||
{{vla|se'e}} convention is only intelligible to those who know the underlying character set. For precisely specifying a particular character, however, it has the advantages of unambiguity and (relative) cultural neutrality, and therefore Lojban provides a means for those with access to descriptions of such character sets to take advantage of them. | |||
{{ind|general-imported|peace symbol}} {{ind|general-imported|Unicode}} As another example, the Unicode character set (also known as ISO 10646) represents the international symbol of peace, an inverted trident in a circle, using the base-16 value 262E. In a suitable context, a Lojbanist may say: | |||
{{dsp|example-random-id-MXET}}{{example|interlinear-gloss-example|example-random-id-MXET}} | |||
{{judri|c17e13d2}} | |||
:'''me'o se'erexarerei sinxa le ka panpi''' | |||
:<code>the-expression [code] 262E is-a-sign-of the quality-of being-at-peace</code> | |||
{{ind|general-imported|se'e|and number base convention}} When a | |||
{{vla|se'e}} string appears in running discourse, some metalinguistic convention must specify whether the number is base 10 or some other base, and which character set is in use. | |||
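As an illustration of the convention, the following Python sketch builds a {{vla|se'e}} word from a character. The function name <code>sehe_word</code> is hypothetical, and the sketch assumes that the character set in use is Unicode (whose first 128 codes coincide with ASCII) and that the base is agreed on in advance, as the paragraph above requires.

```python
# Hypothetical sketch: build a se'e lerfu word from a character's code.
# Digit cmavo of selma'o PA for the values 0 through 15.
PA = ["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so",
      "dau", "fei", "gai", "jau", "rei", "vai"]


def sehe_word(char, base=10):
    """Return se'e followed by the digits of the character code."""
    if base not in (10, 16):
        raise ValueError("this sketch supports only base 10 and base 16")
    code = ord(char)  # Unicode code point; equals the ASCII code below 128
    digits = []
    while True:
        code, d = divmod(code, base)
        digits.append(PA[d])
        if code == 0:
            break
    return "se'e" + "".join(reversed(digits))


print(sehe_word("$"))                  # decimal code 36
print(sehe_word("\u262e", base=16))    # hexadecimal code 262E
```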
{{ssp|section-lerfu-cmavo-summary}} | |||
==List of all auxiliary lerfu-word cmavo== | |||
:{{c|bu}}
:{{s|BU}}
:''makes previous word into a lerfu word''
:{{c|ga'e}}
:{{s|BY}}
:''upper case shift''
:{{c|to'a}}
:{{s|BY}}
:''lower case shift''
:{{c|tau}}
:{{s|LAU}}
:''case-shift next lerfu word only''
:{{c|lo'a}}
:{{s|BY}}
:''Latin/Lojban alphabet shift''
:{{c|ge'o}}
:{{s|BY}}
:''Greek alphabet shift''
:{{c|je'o}}
:{{s|BY}}
:''Hebrew alphabet shift''
:{{c|jo'o}}
:{{s|BY}}
:''Arabic alphabet shift''
:{{c|ru'o}}
:{{s|BY}}
:''Cyrillic alphabet shift''
:{{c|se'e}}
:{{s|BY}}
:''following digits are a character code''
:{{c|na'a}}
:{{s|BY}}
:''cancel all shifts''
:{{c|zai}}
:{{s|LAU}}
:''following lerfu word specifies alphabet''
:{{c|ce'a}}
:{{s|LAU}}
:''following lerfu word specifies font''
:{{c|lau}}
:{{s|LAU}}
:''following lerfu word is punctuation''
:{{c|tei}}
:{{s|TEI}}
:''start compound lerfu word''
:{{c|foi}}
:{{s|FOI}}
:''end compound lerfu word''
{{ind|general-imported|LAU selma'o|grammar of following BY cmavo}} {{ind|general-imported|lerfu word cmavo|list of auxiliary}} Note that LAU cmavo must be followed by a BY cmavo or the equivalent, where | |||
“equivalent” means: either any Lojban word followed by | |||
{{vla|bu}}, another LAU cmavo (and its required sequel), or a | |||
{{vla|tei}}…{{vla|foi}} compound cmavo. | |||
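The note above amounts to a small recursive definition, which can be sketched as a Python recognizer. Everything in it is an assumption made for illustration: the function names are hypothetical, the input is a list of words written without pauses (so {{vla|.abu}} is given as "a" followed by "bu"), and the BY set lists only the cmavo that appear in this chapter rather than the whole selma'o.

```python
# Hypothetical recognizer for one lerfu word, following the note above.
LAU = {"lau", "tau", "zai", "ce'a"}
BY = {"by", "cy", "dy", "fy", "gy", "jy", "ky", "ly", "my", "ny", "py",
      "ry", "sy", "ty", "vy", "xy", "zy", "y'y", "ga'e", "to'a", "lo'a",
      "ge'o", "je'o", "jo'o", "ru'o", "se'e", "na'a"}  # partial list


def parse_lerfu_word(tokens, i=0):
    """Return the index just past one lerfu word at tokens[i], else None."""
    if i >= len(tokens):
        return None
    tok = tokens[i]
    if tok == "tei":
        # tei, one or more lerfu words, then foi
        j = parse_lerfu_word(tokens, i + 1)
        while j is not None and j < len(tokens) and tokens[j] != "foi":
            j = parse_lerfu_word(tokens, j)
        if j is not None and j < len(tokens) and tokens[j] == "foi":
            return j + 1
        return None
    if tok in LAU:
        # a LAU cmavo must be followed by a complete lerfu word
        return parse_lerfu_word(tokens, i + 1)
    if i + 1 < len(tokens) and tokens[i + 1] == "bu":
        return i + 2  # any word followed by bu
    if tok in BY:
        return i + 1
    return None


def is_lerfu_word(tokens):
    """True if the whole token list is exactly one lerfu word."""
    return parse_lerfu_word(tokens) == len(tokens)


print(is_lerfu_word(["lau", "denpa", "bu"]))
print(is_lerfu_word(["lau", "klama"]))
```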
{{ssp|section-proposed-lerfu-words}} | |||
==Proposed lerfu words – introduction== | |||
{{ind|general-imported|lerfu words|list of proposed|notation convention}} The following sections contain tables of proposed lerfu words for some of the standard alphabets supported by the Lojban lerfu system. The first column of each list is the lerfu (actually, a Latin-alphabet name sufficient to identify it). The second column is the proposed name-based lerfu word, and the third column is the proposed lerfu word in the system based on using the cmavo of selma'o BY with a shift word. | |||
{{ind|general-imported|proposed lerfu words|as working basis}} These tables are not meant to be authoritative (several authorities within the Lojban community have niggled over them extensively, disagreeing with each other and sometimes with themselves). They provide a working basis until actual usage is available, rather than a final resolution of lerfu word problems. Probably the system presented here will evolve somewhat before settling down into a final, conventional form. | |||
For Latin-alphabet lerfu words, see {{ls|section-lerfu-liste}} (for Lojban) and
{{ls|section-alien-alphabets}} (for non-Lojban Latin-alphabet lerfu). | |||
{{ssp|section-greek}} | |||
==Proposed lerfu words for the Greek alphabet== | |||
<tab class=wikitable header=true> | |||
alpha {{jbo|.alfas. bu}} {{vla|.abu}} | |||
beta {{jbo|.betas. bu}} {{vla|by}} | |||
gamma {{jbo|.gamas. bu}} {{vla|gy}} | |||
delta {{jbo|.deltas. bu}} {{vla|dy}} | |||
epsilon {{jbo|.Epsilon. bu}} {{vla|.ebu}} | |||
zeta {{jbo|.zetas. bu}} {{vla|zy}} | |||
eta {{jbo|.etas. bu}} {{jbo|.e'ebu}} | |||
theta {{jbo|.tetas. bu}} {{jbo|ty. bu}} | |||
iota {{jbo|.iotas. bu}} {{vla|.ibu}} | |||
kappa {{jbo|.kapas. bu}} {{vla|ky}} | |||
lambda {{jbo|.lymdas. bu}} {{vla|ly}} | |||
mu {{jbo|.mus. bu}} {{vla|my}} | |||
nu {{jbo|.nus. bu}} {{vla|ny}} | |||
xi {{jbo|.ksis. bu}} {{jbo|ksis. bu}} | |||
omicron {{jbo|.Omikron. bu}} {{vla|.obu}} | |||
pi {{jbo|.pis. bu}} {{vla|py}} | |||
rho {{jbo|.ros. bu}} {{vla|ry}} | |||
sigma {{jbo|.sigmas. bu}} {{vla|sy}} | |||
tau {{jbo|.taus. bu}} {{vla|ty}} | |||
upsilon {{jbo|.Upsilon. bu}} {{vla|.ubu}} | |||
phi {{jbo|.fis. bu}} {{jbo|py. bu}} | |||
chi {{jbo|.xis. bu}} {{jbo|ky. bu}} | |||
psi {{jbo|.psis. bu}} {{jbo|psis. bu}} | |||
omega {{jbo|.omegas. bu}} {{jbo|.o'obu}} | |||
rough {{jbo|.dasei,as. bu}} {{vla|.y'y}} | |||
smooth {{jbo|.psiles. bu}} {{jbo|xutla bu}} | |||
</tab> | |||
{{ssp|section-cyrillic}} | |||
==Proposed lerfu words for the Cyrillic alphabet== | |||
{{ind|general-imported|Cyrillic alphabet|proposed lerfu words for}} {{ind|general-imported|lerfu words|proposed for Cyrillic alphabet}} The second column in this listing is based on the historical names of the letters in Old Church Slavonic. Only those letters used in Russian are shown; other languages require more letters which can be devised as needed. | |||
<tab class=wikitable header=true> | |||
a {{jbo|.azys. bu}} {{vla|.abu}} | |||
b {{jbo|.bukys. bu}} {{vla|by}} | |||
v {{jbo|.vedis. bu}} {{vla|vy}} | |||
g {{jbo|.glagolis. bu}} {{vla|gy}} | |||
d {{jbo|.dobros. bu}} {{vla|dy}} | |||
e {{jbo|.iestys. bu}} {{vla|.ebu}} | |||
zh {{jbo|.jivet. bu}} {{vla|jy}} | |||
z {{jbo|.zemlias. bu}} {{vla|zy}} | |||
i {{jbo|.ije,is. bu}} {{vla|.ibu}} | |||
short i {{jbo|.itord. bu}} {{jbo|.itord. bu}} | |||
k {{jbo|.kakos. bu}} {{vla|ky}} | |||
l {{jbo|.liudi,ies. bu}} {{vla|ly}} | |||
m {{jbo|.myslites. bu}} {{vla|my}} | |||
n {{jbo|.naciys. bu}} {{vla|ny}} | |||
o {{jbo|.onys. bu}} {{vla|.obu}} | |||
p {{jbo|.pokois. bu}} {{vla|py}} | |||
r {{jbo|.riytsis. bu}} {{vla|ry}} | |||
s {{jbo|.slovos. bu}} {{vla|sy}} | |||
t {{jbo|.tyvriydos. bu}} {{vla|ty}} | |||
u {{jbo|.ukys. bu}} {{vla|.ubu}} | |||
f {{jbo|.friytys. bu}} {{vla|fy}} | |||
kh {{jbo|.xerys. bu}} {{vla|xy}} | |||
ts {{jbo|.tsis. bu}} {{jbo|tsys. bu}} | |||
ch {{jbo|.tcriyviys. bu}} {{jbo|tcys. bu}} | |||
sh {{jbo|.cas. bu}} {{vla|cy}} | |||
shch {{jbo|.ctas. bu}} {{jbo|ctcys. bu}} | |||
hard sign {{jbo|.ier. bu}} {{jbo|jdari bu}} | |||
yeri {{jbo|.ierys. bu}} {{vla|.y.bu}} | |||
soft sign {{jbo|.ieriys. bu}} {{jbo|ranti bu}} | |||
reversed e {{jbo|.ecarn. bu}} {{jbo|.ecarn. bu}} | |||
yu {{jbo|.ius. bu}} {{jbo|.iubu}} | |||
ya {{jbo|.ias. bu}} {{jbo|.iabu}} | |||
</tab> | |||
{{ssp|section-hebrew}} | |||
==Proposed lerfu words for the Hebrew alphabet== | |||
<tab class=wikitable header=true> | |||
aleph {{jbo|.alef. bu}} {{jbo|.alef. bu}} | |||
bet {{jbo|.bet. bu}} {{vla|by}} | |||
gimel {{jbo|.gimel. bu}} {{vla|gy}} | |||
daled {{jbo|.daled. bu}} {{vla|dy}} | |||
he {{jbo|.xex. bu}} {{vla|.y'y}} | |||
vav {{jbo|.vav. bu}} {{vla|vy}} | |||
zayin {{jbo|.zai,in. bu}} {{vla|zy}} | |||
khet {{jbo|.xet. bu}} {{jbo|xy. bu}} | |||
tet {{jbo|.tet. bu}} {{jbo|ty. bu}} | |||
yud {{jbo|.iud. bu}} {{jbo|.iud. bu}} | |||
kaf {{jbo|.kaf. bu}} {{vla|ky}} | |||
lamed {{jbo|.LYmed. bu}} {{vla|ly}} | |||
mem {{jbo|.mem. bu}} {{vla|my}} | |||
nun {{jbo|.nun. bu}} {{vla|ny}} | |||
samekh {{jbo|.samex. bu}} {{jbo|samex. bu}} | |||
ayin          {{jbo|.ai,in. bu}}      {{jbo|.ai,in. bu}} | |||
pe {{jbo|.pex. bu}} {{vla|py}} | |||
tzadi {{jbo|.tsadik. bu}} {{jbo|tsadik. bu}} | |||
quf {{jbo|.kuf. bu}} {{jbo|ky. bu}} | |||
resh {{jbo|.rec. bu}} {{vla|ry}} | |||
shin {{jbo|.cin. bu}} {{vla|cy}} | |||
sin {{jbo|.sin. bu}} {{vla|sy}} | |||
taf           {{jbo|.taf. bu}}        {{vla|ty}} | |||
dagesh {{jbo|.daGEC. bu}} {{jbo|daGEC. bu}} | |||
hiriq {{jbo|.xirik. bu}} {{vla|.ibu}} | |||
tzeirekh {{jbo|.tseirex. bu}} {{jbo|.eibu}} | |||
segol {{jbo|.seGOL. bu}} {{vla|.ebu}} | |||
qubbutz {{jbo|.kubuts. bu}} {{vla|.ubu}} | |||
qamatz {{jbo|.kamats. bu}} {{vla|.abu}} | |||
patach {{jbo|.patax. bu}} {{jbo|.a'abu}} | |||
sheva {{jbo|.cyVAS. bu}} {{vla|.y.bu}} | |||
kholem {{jbo|.xolem. bu}} {{vla|.obu}} | |||
shuruq {{jbo|.curuk. bu}} {{jbo|.u'ubu}} | |||
</tab> | |||
{{ssp|section-accents-multiple-letters}} | |||
==Proposed lerfu words for some accent marks and multiple letters== | |||
{{ind|general-imported|multiple letters|proposed lerfu words for}} {{ind|general-imported|diacritic marks|proposed lerfu words for}} {{ind|general-imported|accent marks|proposed lerfu words for}} {{ind|general-imported|lerfu words|proposed for multiple letters}} {{ind|general-imported|lerfu words|proposed for diacritic marks}} {{ind|general-imported|lerfu words|proposed for accent marks}} This list is intended to be suggestive, not complete: there are lerfu such as Polish | |||
“dark” l and Maltese h-bar that do not yet have symbols. | |||
<tab class=wikitable header=true> | |||
acute {{jbo|.akut. bu}} or {{jbo|.pritygal. bu}} [{{vla|pritu}} {{vla|galtu}}] | |||
grave {{jbo|.grav. bu}} or {{jbo|.zulgal. bu}} [{{vla|zunle}} {{vla|galtu}}] | |||
circumflex {{jbo|.cirkumfleks. bu}} or {{jbo|.midgal. bu}} [{{vla|midju}} {{vla|galtu}}] | |||
tilde {{jbo|.tildes. bu}} | |||
macron {{jbo|.makron. bu}} | |||
breve {{jbo|.brevis. bu}} | |||
over-dot {{jbo|.gapmoc. bu}} [{{vla|gapru}} {{vla|mokca}}] | |||
umlaut/trema {{jbo|.relmoc. bu}} [{{vla|re}} {{vla|mokca}}] | |||
over-ring {{jbo|.gapyjin. bu}} [{{vla|gapru}} {{vla|djine}}] | |||
cedilla {{jbo|.seDIlys. bu}} | |||
double-acute  {{jbo|.re'akut. bu}} [{{vla|re}} akut.] | |||
ogonek {{jbo|.ogoniek. bu}} | |||
hacek {{jbo|.xatcek. bu}} | |||
ligatured fi {{jbo|tei fy. ibu foi}} | |||
Danish/Latin ae ae {{jbo|tei .abu .ebu foi}} | |||
Dutch ij {{jbo|tei .ibu jy. foi}} | |||
German es-zed {{jbo|tei sy. zy. foi}} | |||
</tab> | |||
{{ssp|section-ICAO-alphabet}} | |||
==Proposed lerfu words for radio communication== | |||
{{ind|general-imported|Phonetic Alphabet|proposed lerfu words for}} {{ind|general-imported|ICAO Phonetic Alphabet|proposed lerfu words for}} {{ind|general-imported|noisy environments|proposed lerfu words for}} {{ind|general-imported|radio communication|proposed lerfu words for}} {{ind|general-imported|lerfu words|proposed for radio communication}} {{ind|general-imported|lerfu words|proposed for noisy environments}} There is a set of English words which are used, by international agreement, as lerfu words (for the English alphabet) over the radio, or in noisy situations where the utmost clarity is required. Formally they are known as the | |||
“ICAO Phonetic Alphabet”, and are used even in non-English-speaking countries. | |||
This table presents the standard English spellings and proposed Lojban versions. The Lojbanizations are not straightforward renderings of the English sounds, but make some concessions both to the English spellings of the words and to the Lojban pronunciations of the lerfu (thus {{jbo|carlis. bu}}, not {{jbo|tcarlis. bu}}). | |||
;'''Alfa''' | |||
:{{jbo|.alfas. bu}} | |||
;'''Bravo''' | |||
:{{jbo|.bravos. bu}} | |||
;'''Charlie''' | |||
:{{jbo|.carlis. bu}} | |||
;'''Delta''' | |||
:{{jbo|.deltas. bu}} | |||
;'''Echo''' | |||
:{{jbo|.ekos. bu}} | |||
;'''Foxtrot''' | |||
:{{jbo|.fokstrot. bu}} | |||
;'''Golf''' | |||
:{{jbo|.golf. bu}} | |||
;'''Hotel''' | |||
:{{jbo|.xoTEL. bu}} | |||
;'''India''' | |||
:{{jbo|.indias. bu}} | |||
;'''Juliet''' | |||
:{{jbo|.juliet. bu}} | |||
;'''Kilo''' | |||
:{{jbo|.kilos. bu}} | |||
;'''Lima''' | |||
:{{jbo|.limas. bu}} | |||
;'''Mike''' | |||
:{{jbo|.maik. bu}} | |||
;'''November''' | |||
:{{jbo|.novembr. bu}} | |||
;'''Oscar''' | |||
:{{jbo|.oskar. bu}} | |||
;'''Papa''' | |||
:{{jbo|.paPAS. bu}} | |||
;'''Quebec''' | |||
:{{jbo|.keBEK. bu}} | |||
;'''Romeo''' | |||
:{{jbo|.romios. bu}} | |||
;'''Sierra''' | |||
:{{jbo|.sieras. bu}} | |||
;'''Tango''' | |||
:{{jbo|.tangos. bu}} | |||
;'''Uniform''' | |||
:{{jbo|.Uniform. bu}} | |||
;'''Victor''' | |||
:{{jbo|.viktas. bu}} | |||
;'''Whiskey''' | |||
:{{jbo|.uiskis. bu}} | |||
;'''X-ray''' | |||
:{{jbo|.eksreis. bu}} | |||
;'''Yankee''' | |||
:{{jbo|.iankis. bu}} | |||
;'''Zulu''' | |||
:{{jbo|.zulus. bu}} |
Latest revision as of 08:17, 1 July 2014
Writing systems
What's a letteral, anyway?
James Cooke Brown, the founder of the Loglan Project, coined the word “letteral” (by analogy with “numeral”) to mean a letter of the alphabet, such as “f” or “z”. A typical example of its use might be
- Example .1:
There are fourteen occurrences of the letteral “e” in this sentence.
(Don't forget the one within quotation marks.) Using the word “letteral” avoids confusion with “letter”, the kind you write to someone. Not surprisingly, there is a Lojban gismu for “letteral”, namely lerfu, and this word will be used in the rest of this chapter.
Lojban uses the Latin alphabet, just as English does, right? Then why is there a need for a chapter like this? After all, everyone who can read it already knows the alphabet. The answer is twofold:
First, in English there are a set of words that correspond to and represent the English lerfu. These words are rarely written down in English and have no standard spellings, but if you pronounce the English alphabet to yourself you will hear them: ay, bee, cee, dee ... . They are used in spelling out words and in pronouncing most acronyms. The Lojban equivalents of these words are standardized and must be documented somehow.
Second, English has names only for the lerfu used in writing English. (There are also English names for Greek and Hebrew lerfu: English-speakers usually refer to the Greek lerfu conventionally spelled “phi” as “fye”, whereas “fee” would more nearly represent the name used by Greek-speakers. Still, not all English-speakers know these English names.) Lojban, in order to be culturally neutral, needs a more comprehensive system that can handle, at least potentially, all of the world's alphabets and other writing systems.
Letterals have several uses in Lojban: in forming acronyms and abbreviations, as mathematical symbols, and as pro-sumti – the equivalent of English pronouns.
In earlier writings about Lojban, there has been a tendency to use the word lerfu for both the letterals themselves and for the Lojban words which represent them. In this chapter, that tendency will be ruthlessly suppressed, and the term “lerfu word” will invariably be used for the latter. The Lojban equivalent would be lerfu valsi or lervla.
A to Z in Lojban, plus one
The first requirement of a system of lerfu words for any language is that they must represent the lerfu used to write the language. The lerfu words for English are a motley crew: the relationship between “doubleyou” and “w” is strictly historical in nature; “aitch” represents “h” but has no clear relationship to it at all; and “z” has two distinct lerfu words, “zee” and “zed”, depending on the dialect of English in question.
All of Lojban's basic lerfu words are made by one of three rules:
- to get a lerfu word for a vowel, add bu;
- to get a lerfu word for a consonant, add y;
- the lerfu word for ' is .y'y.
Therefore, the following table represents the basic Lojban alphabet:
- ': .y'y.
- a: .abu
- b: by.
- c: cy.
- d: dy.
- e: .ebu
- f: fy.
- g: gy.
- i: .ibu
- j: jy.
- k: ky.
- l: ly.
- m: my.
- n: ny.
- o: .obu
- p: py.
- r: ry.
- s: sy.
- t: ty.
- u: .ubu
- v: vy.
- x: xy.
- y: .ybu
- z: zy.
There are several things to note about this table. The consonant lerfu words are a single syllable, whereas the vowel and ' lerfu words are two syllables and must be preceded by a pause (since they all begin with a vowel). Another fact, not evident from the table but important nonetheless, is that by and its like are single cmavo of selma'o BY, as is .y'y. The vowel lerfu words, on the other hand, are compound cmavo, made from a single vowel cmavo plus the cmavo bu (which belongs to its own selma'o, BU). All of the vowel cmavo have other meanings in Lojban (logical connectives, sentence separator, hesitation noise), but those meanings are irrelevant when bu follows.
Here are some illustrations of common Lojban words spelled out using the alphabet above:
- Example .2:
- ty. .abu ny. ry. .ubu
- “t” “a” “n” “r” “u”
- Example .3:
- ky. .obu .y'y. .abu
- “k” “o” “'” “a”
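The three rules and the two spellings above can be sketched in a few lines of Python. This is only an illustration: the names `lerfu_word` and `spell` are invented here, and the sketch writes the pause after each consonant lerfu word, as the examples do.

```python
VOWELS = "aeiouy"               # y is grouped with the vowels, as in the table above
CONSONANTS = "bcdfgjklmnprstvxz"

def lerfu_word(lerfu):
    """Return the basic lerfu word for one Lojban lerfu."""
    if lerfu == "'":
        return ".y'y."              # rule 3: the apostrophe has its own word
    if lerfu in VOWELS:
        return "." + lerfu + "bu"   # rule 1: vowel plus bu, with a leading pause
    if lerfu in CONSONANTS:
        return lerfu + "y."         # rule 2: consonant plus y, pause written after
    raise ValueError("not a basic Lojban lerfu: " + repr(lerfu))

def spell(word):
    """Spell out a Lojban word as a sequence of lerfu words."""
    return " ".join(lerfu_word(ch) for ch in word)

print(spell("tanru"))
print(spell("ko'a"))
```

Any character outside the basic alphabet raises an error in this sketch rather than guessing at a lerfu word.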
Spelling out words is less useful in Lojban than in English, for two reasons: Lojban spelling is phonemic, so there can be no real dispute about how a word is spelled; and the Lojban lerfu words sound more alike than the English ones do, since they are made up systematically. The English words “fail” and “vale” sound similar, but just hearing the first lerfu word of either, namely “eff” or “vee”, is enough to discriminate easily between them – and even if the first lerfu word were somehow confused, neither “vail” nor “fale” is a word of ordinary English, so the rest of the spelling determines which word is meant. Still, the capability of spelling out words does exist in Lojban.
Note that the lerfu words ending in y were written (in Example .2 and Example .3) with pauses after them. It is not strictly necessary to pause after such lerfu words, but failure to do so can in some cases lead to ambiguities:
- Example .4:
- mi cy. claxu
I lerfu-“c” without
- I am without (whatever is referred to by) the letter “c”.
without a pause after cy would be interpreted as:
- Example .5:
- micyclaxu
(Observative:) doctor-without
- Something unspecified is without a doctor.
A safe guideline is to pause after any cmavo ending in y unless the next word is also a cmavo ending in y. The safest and easiest guideline is to pause after all of them.
Upper and lower cases
Lojban doesn't use lower-case (small) letters and upper-case (capital) letters in the same way that English does; sentences do not begin with an upper-case letter, nor do names. However, upper-case letters are used in Lojban to mark irregular stress within names, thus:
- Example .6:
- .iVAN.
- the name “Ivan” in Russian/Slavic pronunciation.
It would require far too many cmavo to assign one for each upper-case and one for each lower-case lerfu, so instead we have two special cmavo ga'e and to'a representing upper case and lower case respectively. They belong to the same selma'o as the basic lerfu words, namely BY, and they may be freely interspersed with them.
The effect of ga'e is to change the interpretation of all lerfu words following it to be the upper-case version of the lerfu. An occurrence of to'a causes the interpretation to revert to lower case. Thus, ga'e .abu means not “a” but “A”, and Ivan's name may be spelled out thus:
- Example .7:
- .ibu ga'e vy. .abu ny. to'a
i [upper] V A N [lower]
The cmavo and compound cmavo of this type will be called “shift words”.
How long does a shift word last? Theoretically, until the next shift word that contradicts it or until the end of text. In practice, it is common to presume that a shift word is only in effect until the next word other than a lerfu word is found.
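The behaviour of ga'e and to'a can be sketched as a small state machine. The following Python is an illustration only; the function names are invented, and the choice to close with to'a whenever the text ends in upper case is an assumption made so that the output matches the spelling of “Ivan” above.

```python
VOWELS = "aeiouy"

def lerfu_word(lerfu):
    """Basic lerfu word for one lower-case Lojban lerfu."""
    if lerfu == "'":
        return ".y'y."
    if lerfu in VOWELS:
        return "." + lerfu + "bu"
    return lerfu + "y."

def spell_with_case(text):
    """Spell text, emitting ga'e / to'a whenever the case changes."""
    words = []
    upper = False                       # the default interpretation is lower case
    for ch in text:
        if ch.isalpha() and ch.isupper() != upper:
            upper = ch.isupper()
            words.append("ga'e" if upper else "to'a")
        words.append(lerfu_word(ch.lower()))
    if upper:
        words.append("to'a")            # assumption: return to lower case at the end
    return " ".join(words)

print(spell_with_case("iVAN"))
```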
It is often convenient to shift just a single letter to upper case. The cmavo tau, of selma'o LAU, is useful for the purpose. A LAU cmavo must always be immediately followed by a BY cmavo or its equivalent: the combination is grammatically equivalent to a single BY. (See Section .13 for details.)
A likely use of tau is in the internationally standardized symbols for the chemical elements. Each element is represented using either a single upper-case lerfu or one upper-case lerfu followed by one lower-case lerfu:
- Example .8:
- tau sy.
[single shift] S
- S (chemical symbol for sulfur)
- Example .9:
- tau sy. .ibu
[single shift] S i
- Si (chemical symbol for silicon)
If a shift to upper-case is in effect when tau appears, it shifts the next lerfu word only to lower case, reversing its usual effect.
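A single-letter shift is simpler to model than ga'e and to'a, because no state has to be carried forward. The sketch below is illustrative Python with invented names; it assumes the default lower-case interpretation is in effect and handles only symbols written with lerfu of the basic Lojban alphabet.

```python
VOWELS = "aeiouy"

def lerfu_word(lerfu):
    """Basic lerfu word for one lower-case Lojban lerfu."""
    if lerfu in VOWELS:
        return "." + lerfu + "bu"
    return lerfu + "y."

def spell_symbol(symbol):
    """Spell a chemical symbol, marking each upper-case lerfu with tau."""
    words = []
    for ch in symbol:
        if ch.isupper():
            words.append("tau")         # shifts only the lerfu word that follows
        words.append(lerfu_word(ch.lower()))
    return " ".join(words)

print(spell_symbol("S"))
print(spell_symbol("Si"))
```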
The universal bu
So far we have seen bu only as a suffix to vowel cmavo to produce vowel lerfu words. Originally, this was the only use of bu. In developing the lerfu word system, however, it proved to be useful to allow bu to be attached to any word whatsoever, in order to allow arbitrary extensions of the basic lerfu word set.
Formally, bu may be attached to any single Lojban word. Compound cmavo do not count as words for this purpose. The special cmavo ba'e, za'e, zei, zo, zoi, la'o, lo'u, si, sa, su, and fa'o may not have bu attached, because they are interpreted before bu detection is done; in particular,
- Example .10:
- zo bu
- the word “bu”
is needed when discussing bu in Lojban. It is also illegal to attach bu to itself, but more than one bu may be attached to a word; thus .abubu is legal, if ugly. (Its meaning is not defined, but it is presumably different from .abu.) It does not matter if the word is a cmavo, a cmene, or a brivla. All such words suffixed by bu are treated grammatically as if they were cmavo belonging to selma'o BY. However, if the word is a cmene it is always necessary to precede and follow it by a pause, because otherwise the cmene may absorb preceding or following words.
The ability to attach bu to words has been used primarily to make names for various logograms and other unusual characters. For example, the Lojban name for the “happy face” is .uibu, based on the attitudinal .ui that means “happiness”. Likewise, the “smiley face”, written “:-)” and used on computer networks to indicate humor, is called zo'obu. The existence of these names does not mean that you should insert .uibu into running Lojban text to indicate that you are happy, or zo'obu when something is funny; instead, use the appropriate attitudinal directly.
Likewise, joibu represents the ampersand character, “&”, based on the cmavo joi meaning “mixed and”. Many more such lerfu words will probably be invented in future.
The . and , characters used in Lojbanic writing to represent pause and syllable break respectively have been assigned the lerfu words denpa bu (literally, “pause bu”) and slaka bu (literally, “syllable bu”). The written space is mandatory here, because denpa and slaka are normal gismu with normal stress: denpabu would be a fu'ivla (word borrowed from another language into Lojban) stressed denPAbu. No pause is required between denpa (or slaka) and bu, though.
Alien alphabets
As stated in Section , Lojban's goal of cultural neutrality demands a standard set of lerfu words for the lerfu of as many other writing systems as possible. When we meet these lerfu in written text (particularly, though not exclusively, mathematical text), we need a standard Lojbanic way to pronounce them.
There are certainly hundreds of alphabets and other writing systems in use around the world, and it is probably an unachievable goal to create a single system which can express all of them, but if perfection is not demanded, a usable system can be created from the raw material which Lojban provides.
One possibility would be to use the lerfu word associated with the language itself, Lojbanized and with bu added. Indeed, an isolated Greek “alpha” in running Lojban text is probably most easily handled by calling it .alfas. bu. Here the Greek lerfu word has been made into a Lojbanized name by adding s and then into a Lojban lerfu word by adding bu. Note that the pause after .alfas. is still needed.
Likewise, the easiest way to handle the Latin letters “h”, “q”, and “w” that are not used in Lojban is by a consonant lerfu word with bu attached. The following assignments have been made:
- .y'y.bu: h
- ky.bu: q
- vy.bu: w
As an example, the English word “quack” would be spelled in Lojban thus:
- Example .11:
- ky.bu .ubu .abu cy. ky.
- “q” “u” “a” “c” “k”
Note that the fact that the letter “c” in this word has nothing to do with the sound of the Lojban letter c is irrelevant; we are spelling an English word and English rules control the choice of letters, but we are speaking Lojban and Lojban rules control the pronunciations of those letters.
A few more possibilities for Latin-alphabet letters used in languages other than English:
- ty.bu: þ (thorn)
- dy.bu: ð (edh)
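The assignments for the extra Latin letters amount to a small lookup table consulted before the basic rules. The Python below is a sketch with invented names (`EXTRA`, `spell`); it covers only the five assignments listed above and rejects anything else.

```python
VOWELS = "aeiouy"
CONSONANTS = "bcdfgjklmnprstvxz"

# Lerfu words for Latin letters that Lojban itself does not use.
EXTRA = {
    "h": ".y'y.bu",
    "q": "ky.bu",
    "w": "vy.bu",
    "\u00fe": "ty.bu",   # thorn
    "\u00f0": "dy.bu",   # edh
}

def lerfu_word(lerfu):
    if lerfu in EXTRA:
        return EXTRA[lerfu]
    if lerfu == "'":
        return ".y'y."
    if lerfu in VOWELS:
        return "." + lerfu + "bu"
    if lerfu in CONSONANTS:
        return lerfu + "y."
    raise ValueError("no lerfu word assigned for " + repr(lerfu))

def spell(word):
    """Spell a lower-case Latin-alphabet word using Lojban lerfu words."""
    return " ".join(lerfu_word(ch) for ch in word)

print(spell("quack"))
```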
However, this system is not ideal for all purposes. For one thing, it is verbose. The native lerfu words are often quite long, and with bu added they become even longer: the worst-case Greek lerfu word would be .Omikron. bu, with four syllables and two mandatory pauses. In addition, alphabets that are used by many languages have separate sets of lerfu words for each language, and which set is Lojban to choose?
The alternative plan, therefore, is to use a shift word similar to those introduced in Section .2. After the appearance of such a shift word, the regular lerfu words are re-interpreted to represent the lerfu of the alphabet now in use. After a shift to the Greek alphabet, for example, the lerfu word ty would represent not Latin “t” but Greek “tau”. Why “tau”? Because it is, in some sense, the closest counterpart of “t” within the Greek lerfu system. In principle it would be all right to map ty. to “phi” or even “omega”, but such an arbitrary relationship would be extremely hard to remember.
Where no obvious closest counterpart exists, some more or less arbitrary choice must be made. Some alien lerfu may simply not have any shifted equivalent, forcing the speaker to fall back on a bu form. Since a bu form may mean different things in different alphabets, it is safest to employ a shift word even when bu forms are in use.
Shifts for several alphabets have been assigned cmavo of selma'o BY:
- lo'a: Latin/Roman/Lojban alphabet
- ge'o: Greek alphabet
- je'o: Hebrew alphabet
- jo'o: Arabic alphabet
- ru'o: Cyrillic alphabet
The cmavo zai (of selma'o LAU) is used to create shift words to still other alphabets. The BY word which must follow any LAU cmavo would typically be a name representing the alphabet with bu suffixed:
- Example .12:
- zai .devanagar. bu
- Devanagari (Hindi) alphabet
- Example .13:
- zai .katakan. bu
- Japanese katakana syllabary
- Example .14:
- zai .xiragan. bu
- Japanese hiragana syllabary
Unlike the cmavo above, these shift words have not been standardized and probably will not be until someone actually has a need for them. (Note the . characters marking leading and following pauses.)
In addition, there may be multiple visible representations within a single alphabet for a given letter: roman vs. italics, handwriting vs. print, Bodoni vs. Helvetica. These traditional “font and face” distinctions are also represented by shift words, indicated with the cmavo ce'a (of selma'o LAU) and a following BY word:
- Example .15:
- ce'a .xelveticas. bu
- Helvetica font
- Example .16:
- ce'a .xancisk. bu
- handwriting
- Example .17:
- ce'a .pavrel. bu
12-point font size
The cmavo na'a (of selma'o BY) is a universal shift-word cancel: it returns the interpretation of lerfu words to the default of lower-case Lojban with no specific font. It is more general than lo'a, which changes the alphabet only, potentially leaving font and case shifts in place.
Several sections at the end of this chapter contain tables of proposed lerfu word assignments for various languages.
Accent marks and compound lerfu words
Many languages that make use of the Latin alphabet add special marks to some of the lerfu they use. French, for example, uses three accent marks above vowels, called (in English) “acute”, “grave”, and “circumflex”. Likewise, German uses a mark called “umlaut”; a mark which looks the same is also used in French, but with a different name and meaning.
These marks may be considered lerfu, and each has a corresponding lerfu word in Lojban. So far, no problem. But the marks appear over lerfu, whereas the words must be spoken (or written) either before or after the lerfu word representing the basic lerfu. Typewriters (for mechanical reasons) and the computer programs that emulate them usually require their users to type the accent mark before the basic lerfu, whereas in speech the accent mark is often pronounced afterwards (for example, in German “a umlaut” is preferred to “umlaut a”).
Lojban cannot settle this question by fiat. Either it must be left up to default interpretation depending on the language in question, or the lerfu-word compounding cmavo tei (of selma'o TEI) and foi (of selma'o FOI) must be used. These cmavo are always used in pairs; any number of lerfu words may appear between them, and the whole is treated as a single compound lerfu word. The French word “été”, with acute accent marks on both “e” lerfu, could be spelled as:
- Example .18:
- tei .ebu .akut. bu foi ty. tei .akut. bu .ebu foi
( “e” acute ) “t” ( acute “e” )
and it does not matter whether .akut. bu appears before or after .ebu; the tei…foi grouping guarantees that the acute accent is associated with the correct lerfu. Of course, the level of precision represented by Example .18 would rarely be required: it might be needed by a Lojban-speaker when spelling out a French word for exact transcription by another Lojban-speaker who did not know French.
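For text stored on a computer, the tei…foi grouping can be produced mechanically by splitting each accented character into its base lerfu and its marks. The Python sketch below uses the standard `unicodedata` module for the split; the function names are invented, the accent lerfu words are the proposed ones from the table of accent marks at the end of this chapter, and placing the accent after the base lerfu is an arbitrary choice, since the order inside tei…foi does not matter.

```python
import unicodedata

VOWELS = "aeiouy"

# Unicode combining marks mapped to the proposed accent lerfu words.
ACCENTS = {
    "\u0301": ".akut. bu",          # acute
    "\u0300": ".grav. bu",          # grave
    "\u0302": ".cirkumfleks. bu",   # circumflex
    "\u0308": ".relmoc. bu",        # umlaut/trema
}

def lerfu_word(lerfu):
    """Basic lerfu word for one lower-case Lojban lerfu."""
    if lerfu in VOWELS:
        return "." + lerfu + "bu"
    return lerfu + "y."

def spell_accented(word):
    """Spell a word, wrapping each accented lerfu in tei ... foi."""
    words = []
    for ch in word:
        # NFD decomposition yields the base character followed by its marks.
        base, *marks = unicodedata.normalize("NFD", ch)
        if marks:
            parts = [lerfu_word(base)] + [ACCENTS[m] for m in marks]
            words.append("tei " + " ".join(parts) + " foi")
        else:
            words.append(lerfu_word(base))
    return " ".join(words)

print(spell_accented("\u00e9t\u00e9"))
```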
This system breaks down in languages which use more than one accent mark on a single lerfu; some other convention must be used for showing which accent marks are written where in that case. The obvious convention is to represent the mark nearest the basic lerfu by the lerfu word closest to the word representing the basic lerfu. Any remaining ambiguities must be resolved by further conventions not yet established.
Some languages, like Swedish and Finnish, consider certain accented lerfu to be completely distinct from their unaccented equivalents, but Lojban does not make a formal distinction, since the printed characters look the same whether they are reckoned as separate letters or not. In addition, some languages consider certain 2-letter combinations (like "ll" and "ch" in Spanish) to be letters; this may be represented by enclosing the combination in tei…foi.
In addition, when discussing a specific language, it is permissible to make up new lerfu words, as long as they are either explained locally or well understood from context: thus Spanish "ll" or Croatian "lj" could be called libu, but that usage would not necessarily be universally understood.
Section .18 contains a table of proposed lerfu words for some common accent marks.
Punctuation marks
Lojban does not have punctuation marks as such: the denpa bu and the slaka bu are really a part of the alphabet. Other languages, however, use punctuation marks extensively. As yet, Lojban does not have any words for these punctuation marks, but a mechanism exists for devising them: the cmavo lau of selma'o LAU. lau must always be followed by a BY word; the interpretation of the BY word is changed from a lerfu to a punctuation mark. Typically, this BY word would be a name or brivla with a bu suffix.
Why is lau necessary at all? Why not just use a bu-marked word and announce that it is always to be interpreted as a punctuation mark? Primarily to avoid ambiguity. The bu mechanism is extremely open-ended, and it is easy for Lojban users to make up bu words without bothering to explain what they mean. Using the lau cmavo flags at least the most important of such nonce lerfu words as having a special function: punctuation. (Exactly the same argument applies to the use of zai to signal an alphabet shift or ce'a to signal a font shift.)
Since different alphabets require different punctuation marks, the interpretation of a lau-marked lerfu word is affected by the current alphabet shift and the current font shift.
What about Chinese characters?
Chinese characters ("han 4 zi 4" in Chinese, kanji in Japanese) represent an entirely different approach to writing from alphabets or syllabaries. (A syllabary, such as Japanese hiragana or Amharic writing, has one lerfu for each syllable of the spoken language.) Very roughly, Chinese characters represent single elements of meaning; also very roughly, they represent single syllables of spoken Chinese. There is in principle no limit to the number of Chinese characters that can exist, and many thousands are in regular use.
It is hopeless for Lojban, with its limited lerfu and shift words, to create an alphabet which will match this diversity. However, there are various possible ways around the problem.
First, both Chinese and Japanese have standard Latin-alphabet representations, known as “pinyin” for Chinese and “romaji” for Japanese, and these can be used. Thus, the word "han 4 zi 4" is conventionally written with two characters, but it may be spelled out as:
- Example .19:
- .y'y.bu .abu ny. vo zy. .ibu vo
- “h” “a” “n” 4 “z” “i” 4
The cmavo vo is the Lojban digit “4”. It is grammatical to intersperse digits (of selma'o PA) into a string of lerfu words; as long as the first cmavo is a lerfu word, the whole will be interpreted as a string of lerfu words. In Chinese, the digits can be used to represent tones. Pinyin is more usually written using accent marks, the mechanism for which was explained in
The Japanese company named “Mitsubishi” in English is spelled the same way in romaji, and could be spelled out in Lojban thus:
- Example .20:
- my. .ibu ty. sy. .ubu by. .ibu sy. .y'y.bu .ibu
- “m” “i” “t” “s” “u” “b” “i” “s” “h” “i”
Alternatively, a really ambitious Lojbanist could assign lerfu words to the individual strokes used to write Chinese characters (there are about seven or eight of them if you are a flexible human being, or about 40 if you are a rigid computer program), and then represent each character with a tei, the stroke lerfu words in the order of writing (which is standardized for each character), and a foi. No one has as yet attempted this project.
lerfu words as pro-sumti
So far, lerfu words have only appeared in Lojban text when spelling out words. There are several other grammatical uses of lerfu words within Lojban. In each case, a single lerfu word or more than one may be used. Therefore, the term “lerfu string” is introduced: it is short for “sequence of one or more lerfu words”.
A lerfu string may be used as a pro-sumti (a sumti which refers to some previous sumti), just like the pro-sumti ko'a, ko'e, and so on:
- Example .21:
- .abu prami by.
- A loves B
In Example .21, .abu and by. represent specific sumti, but which sumti they represent must be inferred from context.
Alternatively, lerfu strings may be assigned by goi, the regular pro-sumti assignment cmavo:
- Example .22:
- le gerku goi gy. cu xekri .i gy. klama le zdani
- The dog, or G, is black. G goes to the house.
There is a special rule that sometimes makes lerfu strings more advantageous than the regular pro-sumti cmavo. If no assignment can be found for a lerfu string (especially a single lerfu word), it can be assumed to refer to the most recent sumti whose name or description begins in Lojban with that lerfu. So Example .22 can be rephrased:
- Example .23:
- le gerku cu xekri. .i gy. klama le zdani
- The dog is black. G goes to the house.
(A less literal English translation would use “D” for “dog” instead.)
Here is an example using two names and longer lerfu strings:
- Example .24:
- la stivn. mark. djonz. merko .i la .aleksandr. paliitc. kuzNIETsyf. rusko .i symyjy. tavla .abupyky. bau la lojban.
Steven Mark Jones is-American. Alexander Pavlovitch Kuznetsov is-Russian.
SMJ talks-to APK in Lojban.
Perhaps Alexander's name should be given as ru'o.abupyky instead.
What about
- Example .25:
- .abu dunda by. cy.
- A gives B C
Does this mean that A gives B to C? No. by. cy. is a single lerfu string, although written as two words, and represents a single pro-sumti. The true interpretation is that A gives BC to someone unspecified. To solve this problem, we need to introduce the elidable terminator boi (of selma'o BOI). This cmavo is used to terminate lerfu strings and also strings of numerals; it is required when two of these appear in a row, as here. (The other reason to use boi is to attach a free modifier – subscript, parenthesis, or what have you – to a lerfu string.) The correct version is:
- Example .26:
- .abu [boi] dunda by. boi cy. [boi]
- A gives B to C
where the two occurrences of boi in brackets are elidable, but the remaining occurrence is not. Likewise:
- Example .27:
- xy. boi ro [boi] prenu cu prami
X all persons loves.
- X loves everybody.
requires the first boi to separate the lerfu string xy. from the digit string ro.
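The effect of boi on adjacent lerfu words can be pictured as a simple grouping step. The Python below is a deliberately narrow sketch, not a parser for the Lojban grammar: it assumes its input is a run consisting only of lerfu words and boi, and the function name is invented.

```python
def split_lerfu_strings(words):
    """Group a run of lerfu words into lerfu strings, ending a string at each boi."""
    strings, current = [], []
    for word in words:
        if word == "boi":
            if current:                 # boi closes the lerfu string in progress
                strings.append(current)
                current = []
        else:
            current.append(word)        # adjacent lerfu words join one string
    if current:
        strings.append(current)         # the final boi may be elided
    return strings

print(split_lerfu_strings("by. cy.".split()))
print(split_lerfu_strings("by. boi cy.".split()))
```

Without boi the two lerfu words form one pro-sumti; with it they form two, which is the difference between “A gives BC” and “A gives B to C”.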
References to lerfu
The rules of Section .8 make it impossible to use unmarked lerfu words to refer to lerfu themselves. In the sentence:
- Example .28:
- .abu. cu lerfu
A is-a-letteral.
the hearer would try to find what previous sumti .abu refers to. The solution to this problem makes use of the cmavo me'o of selma'o LI, which makes a lerfu string into a sumti representing that very string of lerfu. This use of me'o is a special case of its mathematical use, which is to introduce a mathematical expression used literally rather than for its value.
- Example .29:
- me'o .abu cu lerfu
- The-expression “a” is-a-letteral.
Now we can translate Example .1 into Lojban:
- Example .30:
- dei vasru vo lerfu po'u me'o .ebu
this-sentence contains four letterals which-are the-expression “e”.
- This sentence contains four “e”s.
Since the Lojban sentence has only four e lerfu rather than fourteen, the translation is not a literal one – but Example .30 is a Lojban truth just as Example .1 is an English truth. Coincidentally, the colloquial English translation of Example .30 is also true!
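The counts claimed for the English sentence, the Lojban sentence, and the colloquial English translation are easy to check mechanically. The short Python sketch below (the function name is invented) simply counts occurrences of a lerfu, ignoring case.

```python
def count_lerfu(text, lerfu):
    """Count the occurrences of one lerfu in a text, ignoring case."""
    return text.lower().count(lerfu)

english = 'There are fourteen occurrences of the letteral "e" in this sentence.'
lojban = "dei vasru vo lerfu po'u me'o .ebu"
colloquial = 'This sentence contains four "e"s.'

print(count_lerfu(english, "e"))
print(count_lerfu(lojban, "e"))
print(count_lerfu(colloquial, "e"))
```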
The reader might be tempted to use quotation with lu…li'u instead of me'o, producing:
- Example .31:
- lu .abu li'u cu lerfu
[quote] .abu [unquote] is-a-letteral.
(The single-word quote zo cannot be used, because .abu is a compound cmavo.) But Example .31 is false, because it says:
- Example .32:
The word .abu is a letteral
which is not the case; rather, the thing symbolized by the word .abu is a letteral. In Lojban, that would be:
- Example .33:
- la'e lu .abu li'u cu lerfu
The-referent-of [quote] .abu [unquote] is-a-letteral.
which is correct.
Mathematical uses of lerfu strings
This chapter is not about Lojban mathematics, which is explained in a separate chapter, so the mathematical uses of lerfu strings will be listed and exemplified but not explained.
- A lerfu string as mathematical variable:
- Example .34:
- li .abu du li by. su'i cy.
the-number a equals the-number b plus c
- a = b + c
- A lerfu string as function name (preceded by ma'o of selma'o MAhO):
- Example .35:
- li .y.bu du li ma'o fy. boi xy.
the-number y equals the number the-function f of x
- y = f(x)
Note the boi here to separate the lerfu strings fy and xy.
- A lerfu string as selbri (followed by a cmavo of selma'o MOI):
- Example .36:
- le vi ratcu ny.moi le'i mi ratcu
the here rat is-nth-of the-set-of my rats
- This rat is my Nth rat.
A lerfu string as utterance ordinal (followed by a cmavo of selma'o MAI):
- Example .37:
- ny.mai
- Nthly
A lerfu string as subscript (preceded by xi of selma'o XI):
- Example .38:
- xy. xi ky.
x sub k
A lerfu string as quantifier (enclosed in vei…ve'o parentheses):
- Example .39:
- vei ny. [ve'o] lo prenu
(“n”) persons
The parentheses are required because
- ny. lo prenu would be two separate sumti,
ny. and
- lo prenu. In general, any mathematical expression other than a simple number must be in parentheses when used as a quantifier; the right parenthesis mark, the cmavo
ve'o, can usually be elided.
All the examples above have exhibited single lerfu words rather than lerfu strings, in accordance with the conventions of ordinary mathematics. A longer lerfu string would still be treated as a single variable or function name: in Lojban, .abu by. cy. is not the multiplication “a × b × c” but is the variable abc. (Of course, a local convention could be employed that made the value of a variable like abc, with a multi-lerfu-word name, equal to the values of the variables a, b, and c multiplied together.)
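The point that a lerfu string names one variable, not a product, can be illustrated with a small Python sketch. The function name `lerfu_string_to_name` is hypothetical, and the sketch assumes only the basic Lojban lerfu words (a vowel followed by bu, or a consonant followed by y); shift words and compound lerfu words are not handled.

```python
VOWELS = "aeiouy"
CONSONANTS = "bcdfgjklmnprstvxz"

def lerfu_string_to_name(text):
    """Collapse a string of basic lerfu words into one variable name.

    Assumption: only words of the form vowel+bu (.abu, .y.bu) or
    consonant+y (by., cy.) appear; anything else is rejected.
    """
    letters = []
    for word in text.split():
        bare = word.replace(".", "")  # pauses do not affect the letter
        if len(bare) == 3 and bare[0] in VOWELS and bare.endswith("bu"):
            letters.append(bare[0])
        elif len(bare) == 2 and bare[0] in CONSONANTS and bare[1] == "y":
            letters.append(bare[0])
        else:
            raise ValueError(f"not a basic lerfu word: {word!r}")
    return "".join(letters)

# One variable named "abc", not the product of a, b and c.
print(lerfu_string_to_name(".abu by. cy."))
```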
There is a special rule about shift words in mathematical text: shifts within mathematical expressions do not affect lerfu words appearing outside mathematical expressions, and vice versa.
Acronyms
An acronym is a name constructed of lerfu. English examples are “DNA”, “NATO”, “CIA”. In English, some of these are spelled out (like “DNA” and “CIA”) and others are pronounced more or less as if they were ordinary English words (like “NATO”). Some acronyms fluctuate between the two pronunciations: “SQL” may be “ess cue ell” or “sequel”.
In Lojban, a name can be almost any sequence of sounds that ends in a consonant and is followed by a pause. The easiest way to Lojbanize acronym names is to glue the lerfu words together, using ' wherever two vowels would come together (pauses are illegal in names) and adding a final consonant:
- Example .40:
- la dyny'abub. .i la ny'abuty'obub. .i la cy'ibu'abub. .i la sykybulyl. .i la .ibubymym. .i la ny'ybucyc.
- DNA. NATO. CIA. SQL. IBM. NYC.
There is no fixed convention for assigning the final consonant. In Example .40, the last consonant of the lerfu string has been replicated into final position.
Some compression can be done by leaving out bu after vowel lerfu words (except for .y.bu, wherein the bu cannot be omitted without ambiguity). Compression is moderately important because it's hard to say long names without introducing an involuntary (and illegal) pause:
- Example .41:
- la dyny'am. .i la ny'aty'om. .i la cy'i'am. .i la sykybulym. .i la .ibymym. .i la ny'ybucym.
- DNA. NATO. CIA. SQL. IBM. NYC.
In Example .41, the final consonant m stands for merko, indicating the source culture of these acronyms.
Another approach, which some may find easier to say and which is compatible with older versions of the language that did not have a ' character, is to use the consonant z instead of ':
- Example .42:
- la dynyzaz. .i la nyzatyzoz. .i la cyzizaz. .i la sykybulyz. .i la .ibymyz. .i la nyzybucyz.
- DNA. NATO. CIA. SQL. IBM. NYC.
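The three conventions shown so far (full lerfu words with a replicated final consonant, compression with a final m, and z in place of ') can be sketched in Python. This is an illustrative sketch, not a defined algorithm: the names `lerfu_piece` and `acronym_name` and their parameters are hypothetical, and only the Lojban letters plus q (taken as ky.bu, as in the SQL examples) are assumed to be supported.

```python
VOWELS = "aeiouy"
CONSONANTS = "bcdfgjklmnprstvxz"

def lerfu_piece(letter, compress):
    """Return the lerfu word for one letter, with pauses removed."""
    letter = letter.lower()
    if letter == "q":
        return "kybu"            # ky.bu, as used for SQL in the examples
    if letter == "y":
        return "ybu"             # bu is never dropped from .y.bu
    if letter in VOWELS:
        return letter if compress else letter + "bu"
    if letter in CONSONANTS:
        return letter + "y"
    raise ValueError(f"no lerfu word assumed here for {letter!r}")

def acronym_name(acronym, compress=False, joiner="'", final=None):
    """Glue lerfu words into a name ending in a consonant and a pause."""
    if not acronym:
        raise ValueError("empty acronym")
    glued = ""
    for letter in acronym:
        piece = lerfu_piece(letter, compress)
        # Every piece ends in a vowel, so a piece that starts with a
        # vowel must be separated from what precedes it.
        if glued and piece[0] in VOWELS:
            glued += joiner
        glued += piece
    if final is None:
        # Replicate the last consonant of the lerfu string.
        final = next(ch for ch in reversed(glued) if ch in CONSONANTS)
    name = glued + final + "."
    # A name beginning with a vowel is written with a leading pause.
    return "." + name if name[0] in VOWELS else name

for acronym in ["DNA", "NATO", "CIA", "SQL", "IBM", "NYC"]:
    print(acronym_name(acronym),
          acronym_name(acronym, compress=True, final="m"),
          acronym_name(acronym, compress=True, joiner="z", final="z"))
```

Since there is no fixed convention for the final consonant, it is left as a parameter rather than built into the function.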
One more alternative to these lengthy names is to use the lerfu string itself prefixed with me, the cmavo that makes sumti into selbri:
- Example .43:
- la me dy ny. .abu
that-named what-pertains-to “d” “n” “a”
This works because la, the cmavo that normally introduces names used as sumti, may also be used before a predicate to indicate that the predicate is a (meaningful) name:
- Example .44:
- la cribe cu ciska
That-named “Bear” writes.
- Bear is a writer.
Example .44 does not of course refer to a bear (le cribe or lo cribe) but to something else, probably a person, named “Bear”. Similarly, me dy ny. .abu is a predicate which can be used as a name, producing a kind of acronym which can have pauses between the individual lerfu words.
Computerized character codes
Since the first application of computers to non-numerical information, character sets have existed, mapping numbers (called “character codes”) into selected lerfu, digits, and punctuation marks (collectively called “characters”). Historically, these character sets have only covered the English alphabet and a few selected punctuation marks. International efforts have now created Unicode, a unified character set that can represent essentially all the characters in essentially all the world's writing systems. Lojban can take advantage of these encoding schemes by using the cmavo se'e (of selma'o BY). This cmavo is conventionally followed by digit cmavo of selma'o PA representing the character code, and the whole string indicates a single character in some computerized character set:
- Example .45:
- me'o se'ecixa cu lerfu la .asycy'i'is. loi merko rupnu
The-expression [code] 36 is-a-letteral in-set ASCII for-the-mass-of American currency-units.
- The character code 36 in ASCII represents American dollars.
- “$” represents American dollars.
Understanding Example .45 depends on knowing the value in the ASCII character set (one of the simplest and oldest) of the “$” character. Therefore, the se'e convention is only intelligible to those who know the underlying character set. For precisely specifying a particular character, however, it has the advantages of unambiguity and (relative) cultural neutrality, and therefore Lojban provides a means for those with access to descriptions of such character sets to take advantage of them.
As another example, the Unicode character set (also known as ISO 10646) represents the international symbol of peace, an inverted trident in a circle, using the base-16 value 262E. In a suitable context, a Lojbanist may say:
- Example .46:
- me'o se'erexarerei sinxa le ka panpi
the-expression [code] 262E is-a-sign-of the quality-of being-at-peace
When a se'e string appears in running discourse, some metalinguistic convention must specify whether the number is base 10 or some other base, and which character set is in use.
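As a rough illustration of the se'e convention, the following Python sketch builds the lerfu word for a character code in base 10 or base 16. The function name `sehe_word` is hypothetical. The digit cmavo for 0 to 9 and the hexadecimal digit cmavo (dau, fei, gai, jau, rei, vai for A to F) are supplied here as an assumption about selma'o PA; only ci, xa, re and rei are attested in the examples above.

```python
# Digit cmavo of selma'o PA for values 0..15 (assumed, see lead-in).
DIGIT_CMAVO = ["no", "pa", "re", "ci", "vo", "mu", "xa", "ze", "bi", "so",
               "dau", "fei", "gai", "jau", "rei", "vai"]

def sehe_word(code, base=10):
    """Return se'e followed by the digits of a character code."""
    if base not in (10, 16):
        raise ValueError("this sketch only handles base 10 and base 16")
    if code < 0:
        raise ValueError("character codes are non-negative")
    digits = []
    while True:
        code, digit = divmod(code, base)
        digits.append(DIGIT_CMAVO[digit])
        if code == 0:
            break
    # Digits were collected least significant first.
    return "se'e" + "".join(reversed(digits))

print(sehe_word(ord("$")))          # ASCII dollar sign, base 10
print(sehe_word(0x262E, base=16))   # Unicode peace symbol, base 16
```

The base is an explicit argument because nothing in the se'e string itself says which base is meant; that has to come from the surrounding convention.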
List of all auxiliary lerfu-word cmavo
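- bu (BU): makes previous word into a lerfu word
- ga'e (BY): upper case shift
- to'a (BY): lower case shift
- tau (LAU): case-shift next lerfu word only
- lo'a (BY): Latin/Lojban alphabet shift
- ge'o (BY): Greek alphabet shift
- je'o (BY): Hebrew alphabet shift
- jo'o (BY): Arabic alphabet shift
- ru'o (BY): Cyrillic alphabet shift
- se'e (BY): following digits are a character code
- na'a (BY): cancel all shifts
- zai (LAU): following lerfu word specifies alphabet
- ce'a (LAU): following lerfu word specifies font
- lau (LAU): following lerfu word is punctuation
- tei (TEI): start compound lerfu word
- foi (FOI): end compound lerfu word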
Note that LAU cmavo must be followed by a BY cmavo or the equivalent, where “equivalent” means: either any Lojban word followed by bu, another LAU cmavo (and its required sequel), or a tei…foi compound cmavo.
Proposed lerfu words – introduction
The following sections contain tables of proposed lerfu words for some of the standard alphabets supported by the Lojban lerfu system. The first column of each list is the lerfu (actually, a Latin-alphabet name sufficient to identify it). The second column is the proposed name-based lerfu word, and the third column is the proposed lerfu word in the system based on using the cmavo of selma'o BY with a shift word.
These tables are not meant to be authoritative (several authorities within the Lojban community have niggled over them extensively, disagreeing with each other and sometimes with themselves). They provide a working basis until actual usage is available, rather than a final resolution of lerfu word problems. Probably the system presented here will evolve somewhat before settling down into a final, conventional form.
For Latin-alphabet lerfu words, see Section .1 (for Lojban) and Section .4 (for non-Lojban Latin-alphabet lerfu).
Proposed lerfu words for the Greek alphabet
<tab class=wikitable header=true>
alpha	.alfas. bu	.abu
beta	.betas. bu	by
gamma	.gamas. bu	gy
delta	.deltas. bu	dy
epsilon	.Epsilon. bu	.ebu
zeta	.zetas. bu	zy
eta	.etas. bu	.e'ebu
theta	.tetas. bu	ty. bu
iota	.iotas. bu	.ibu
kappa	.kapas. bu	ky
lambda	.lymdas. bu	ly
mu	.mus. bu	my
nu	.nus. bu	ny
xi	.ksis. bu	ksis. bu
omicron	.Omikron. bu	.obu
pi	.pis. bu	py
rho	.ros. bu	ry
sigma	.sigmas. bu	sy
tau	.taus. bu	ty
upsilon	.Upsilon. bu	.ubu
phi	.fis. bu	py. bu
chi	.xis. bu	ky. bu
psi	.psis. bu	psis. bu
omega	.omegas. bu	.o'obu
rough	.dasei,as. bu	.y'y
smooth	.psiles. bu	xutla bu
</tab>
Proposed lerfu words for the Cyrillic alphabet
The second column in this listing is based on the historical names of the letters in Old Church Slavonic. Only those letters used in Russian are shown; other languages require more letters which can be devised as needed.
<tab class=wikitable header=true>
a	.azys. bu	.abu
b	.bukys. bu	by
v	.vedis. bu	vy
g	.glagolis. bu	gy
d	.dobros. bu	dy
e	.iestys. bu	.ebu
zh	.jivet. bu	jy
z	.zemlias. bu	zy
i	.ije,is. bu	.ibu
short i	.itord. bu	.itord. bu
k	.kakos. bu	ky
l	.liudi,ies. bu	ly
m	.myslites. bu	my
n	.naciys. bu	ny
o	.onys. bu	.obu
p	.pokois. bu	py
r	.riytsis. bu	ry
s	.slovos. bu	sy
t	.tyvriydos. bu	ty
u	.ukys. bu	.ubu
f	.friytys. bu	fy
kh	.xerys. bu	xy
ts	.tsis. bu	tsys. bu
ch	.tcriyviys. bu	tcys. bu
sh	.cas. bu	cy
shch	.ctas. bu	ctcys. bu
hard sign	.ier. bu	jdari bu
yeri	.ierys. bu	.y.bu
soft sign	.ieriys. bu	ranti bu
reversed e	.ecarn. bu	.ecarn. bu
yu	.ius. bu	.iubu
ya	.ias. bu	.iabu
</tab>
Proposed lerfu words for the Hebrew alphabet
<tab class=wikitable header=true>
aleph	.alef. bu	.alef. bu
bet	.bet. bu	by
gimel	.gimel. bu	gy
daled	.daled. bu	dy
he	.xex. bu	.y'y
vav	.vav. bu	vy
zayin	.zai,in. bu	zy
khet	.xet. bu	xy. bu
tet	.tet. bu	ty. bu
yud	.iud. bu	.iud. bu
kaf	.kaf. bu	ky
lamed	.LYmed. bu	ly
mem	.mem. bu	my
nun	.nun. bu	ny
samekh	.samex. bu	samex. bu
ayin	.ai,in. bu	.ai,in bu
pe	.pex. bu	py
tzadi	.tsadik. bu	tsadik. bu
quf	.kuf. bu	ky. bu
resh	.rec. bu	ry
shin	.cin. bu	cy
sin	.sin. bu	sy
taf	.taf. bu	ty.
dagesh	.daGEC. bu	daGEC. bu
hiriq	.xirik. bu	.ibu
tzeirekh	.tseirex. bu	.eibu
segol	.seGOL. bu	.ebu
qubbutz	.kubuts. bu	.ubu
qamatz	.kamats. bu	.abu
patach	.patax. bu	.a'abu
sheva	.cyVAS. bu	.y.bu
kholem	.xolem. bu	.obu
shuruq	.curuk. bu	.u'ubu
</tab>
Proposed lerfu words for some accent marks and multiple letters
This list is intended to be suggestive, not complete: there are lerfu such as Polish “dark” l and Maltese h-bar that do not yet have symbols.
<tab class=wikitable header=true>
acute	.akut. bu or .pritygal. bu [pritu galtu]
grave	.grav. bu or .zulgal. bu [zunle galtu]
circumflex	.cirkumfleks. bu or .midgal. bu [midju galtu]
tilde	.tildes. bu
macron	.makron. bu
breve	.brevis. bu
over-dot	.gapmoc. bu [gapru mokca]
umlaut/trema	.relmoc. bu [re mokca]
over-ring	.gapyjin. bu [gapru djine]
cedilla	.seDIlys. bu
double-acute	.re'akut. bu [re akut.]
ogonek	.ogoniek. bu
hacek	.xatcek. bu
ligatured fi	tei fy. ibu foi
Danish/Latin ae ae	tei .abu .ebu foi
Dutch ij	tei .ibu jy. foi
German es-zed	tei sy. zy. foi
</tab>
Proposed lerfu words for radio communication
There is a set of English words which are used, by international agreement, as lerfu words (for the English alphabet) over the radio, or in noisy situations where the utmost clarity is required. Formally they are known as the “ICAO Phonetic Alphabet”, and are used even in non-English-speaking countries.
This table presents the standard English spellings and proposed Lojban versions. The Lojbanizations are not straightforward renderings of the English sounds, but make some concessions both to the English spellings of the words and to the Lojban pronunciations of the lerfu (thus carlis. bu, not tcarlis. bu).
- Alfa: .alfas. bu
- Bravo: .bravos. bu
- Charlie: .carlis. bu
- Delta: .deltas. bu
- Echo: .ekos. bu
- Foxtrot: .fokstrot. bu
- Golf: .golf. bu
- Hotel: .xoTEL. bu
- India: .indias. bu
- Juliet: .juliet. bu
- Kilo: .kilos. bu
- Lima: .limas. bu
- Mike: .maik. bu
- November: .novembr. bu
- Oscar: .oskar. bu
- Papa: .paPAS. bu
- Quebec: .keBEK. bu
- Romeo: .romios. bu
- Sierra: .sieras. bu
- Tango: .tangos. bu
- Uniform: .Uniform. bu
- Victor: .viktas. bu
- Whiskey: .uiskis. bu
- X-ray: .eksreis. bu
- Yankee: .iankis. bu
- Zulu: .zulus. bu
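For readers who want to experiment with these proposals, the table can be held in a simple lookup. The Python sketch below is only an illustration: the names `ICAO_LERFU` and `spell_by_radio` are hypothetical, and the lerfu words are copied from the proposed list above, which is itself not authoritative.

```python
# Proposed Lojbanized ICAO lerfu words, keyed by English letter.
ICAO_LERFU = {
    "a": ".alfas. bu",   "b": ".bravos. bu",  "c": ".carlis. bu",
    "d": ".deltas. bu",  "e": ".ekos. bu",    "f": ".fokstrot. bu",
    "g": ".golf. bu",    "h": ".xoTEL. bu",   "i": ".indias. bu",
    "j": ".juliet. bu",  "k": ".kilos. bu",   "l": ".limas. bu",
    "m": ".maik. bu",    "n": ".novembr. bu", "o": ".oskar. bu",
    "p": ".paPAS. bu",   "q": ".keBEK. bu",   "r": ".romios. bu",
    "s": ".sieras. bu",  "t": ".tangos. bu",  "u": ".Uniform. bu",
    "v": ".viktas. bu",  "w": ".uiskis. bu",  "x": ".eksreis. bu",
    "y": ".iankis. bu",  "z": ".zulus. bu",
}

def spell_by_radio(word):
    """Spell a word letter by letter; other characters are skipped."""
    return [ICAO_LERFU[ch] for ch in word.lower() if ch in ICAO_LERFU]

print(" ".join(spell_by_radio("DNA")))
```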