interpretive conventions for lerfu formatting cmavo: Difference between revisions

Revision as of 16:52, 4 November 2013

There is a class of letterals that don't stand for a specific written symbol, but rather modify the following letterals in a letteral string. I have chosen to call these "lerfu formatting cmavo".

The CLL is rather skimpy on the details of how these cmavo are to be interpreted and how they interact, and this is one of the issues the BPFK has to decide.

As I (tsali) promised in [1], I'm making this page to show what we know about lerfu formatting cmavo from the examples in CLL, along with some possible solutions to situations that aren't described in the book. Please keep any comments separate from the main text. Thanks.

We assume for the time being that the lerfu formatting cmavo are the following: ga'e, to'a, lo'a, ge'o, je'o, jo'o, ru'o, na'a, plus BU-letterals prefixed with the following: zai, ce'a.

We assume that all other letterals stand for a distinct symbol, because otherwise all letteral sequences with nonce letterals would be ambiguous as to what is formatting and what is a printable character.

What we already know about the Lojban lerfu sequence formatting cmavo

A lerfu formatting cmavo modifies the immediately following ordinary letteral.
If X and Y are ordinary letterals, Z is a lerfu formatting cmavo, X precedes Y, and X is modified by Z, then Y is also modified by Z, except where Z is a single-lerfu modifier such as TAU or LAU. In other words, most lerfu formatting cmavo apply at least to the entire contiguous sequence of letterals following them.
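
The two rules above can be sketched in code. This is only an illustration under stated assumptions: the function name and the two sets are hypothetical, and the choice of which cmavo go in which set is made for the example rather than taken from CLL. The sketch also does not try to resolve conflicts between modifiers, since that is exactly what is still undefined.

```python
# Hypothetical classification, for illustration only.
SEQUENCE_SHIFTS = {"ga'e", "to'a"}      # persist over the following letterals
SINGLE_LERFU_MODIFIERS = {"tau"}        # affect the next letteral only


def apply_formatting(words):
    """Return (letteral, modifiers) pairs for one contiguous letteral string."""
    active = []    # modifiers that stay in force for the rest of the sequence
    pending = []   # modifiers that apply to the next letteral only
    result = []
    for word in words:
        if word in SEQUENCE_SHIFTS:
            active.append(word)
        elif word in SINGLE_LERFU_MODIFIERS:
            pending.append(word)
        else:
            # An ordinary letteral: it receives everything currently in force.
            result.append((word, tuple(active + pending)))
            pending = []   # single-lerfu modifiers are used up
    return result
```

With this sketch, "ga'e abu by" marks both letterals as modified by ga'e, while "tau abu by" marks only abu.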

Models for handling cases that are as yet undefined

All of these are intended to be baseline-compliant, i.e. to render none of the examples in CLL invalid. Please add a note if you think this is incorrect.

The Microsoft Word model (state machine)

Formatting applies by default to all ordinary letterals of the current sequence
Formatting is by default additive. For example, "ce'a bold. bu ce'a .italik. bu .abu" results in a lowercase a that is both bold and italic.

- We could establish a system of "categories" that lerfu formatting cmavo can encode "features" of, and say that subsequent lerfu formatting cmavo supersede previous ones in the same category. Thus, "ce'a pavrel. bu .abu ce'a pavnon. bu by." results in a 12pt lowercase a followed by a 10pt lowercase b.
Formatting may be canceled by a special format that says "revert to state before the application of the last lerfu formatting cmavo".
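
A minimal sketch of this state machine follows, assuming the "categories" variant. The class name, method names, and the category labels ("weight", "slant", "size") are all hypothetical and chosen only for the example.

```python
class WordModelState:
    """Running format state for one letteral sequence (Word-style model)."""

    def __init__(self):
        self.features = {}   # category -> value currently in force
        self.history = []    # snapshots taken before each formatting cmavo

    def apply(self, category, value):
        # Different categories accumulate; the same category is superseded.
        self.history.append(dict(self.features))
        self.features[category] = value

    def revert(self):
        # Restore the state from before the last formatting cmavo was applied.
        if self.history:
            self.features = self.history.pop()

    def emit(self, letteral):
        # Pair an ordinary letteral with a copy of the current format state.
        return (letteral, dict(self.features))
```

Applying "weight" bold and then "slant" italic before emitting abu gives a letteral that is both bold and italic. Applying "size" 12 and later "size" 10 leaves only the 10 in force, and a single revert brings the 12 back.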

The CSS model

Formatting applies to all following letterals in the string that are of "the same kind" as the lerfu formatting cmavo. Unformatted lerfu forming cmavo apply to the entire remaining text; formatting that appears when Cyrillic shift is active applies to all remaining Cyrillic text; formatting that appears in XI subscripts N levels deep applies to all following subscripts N levels deep; and so on.
Less crunchy, because it relieves the speaker of the burden of updating many different kinds of formatting when switching contexts.

Undesirable because it establishes a hierarchical interpretation of a structure that is supposed to be "linear" (actually left-recursive, see grammar rule jbocre: lerfu_string_root_986).
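
The scoping idea can be sketched as a lookup keyed by context. The names below are hypothetical, and one detail is an assumption not stated above: when the default context and a specific context both set the same category, the specific one wins.

```python
DEFAULT = "default"   # hypothetical label for text with no shift active


class CssModelState:
    """Formatting stored per context, in the spirit of the CSS model."""

    def __init__(self):
        self.rules = {}   # context -> {category: value}

    def apply(self, context, category, value):
        # Record formatting against the context in which it was uttered.
        self.rules.setdefault(context, {})[category] = value

    def style_for(self, context):
        # Formatting given in the default context covers all remaining text.
        style = dict(self.rules.get(DEFAULT, {}))
        if context != DEFAULT:
            # Assumption: context-specific formatting is layered on top.
            style.update(self.rules.get(context, {}))
        return style
```

A context can be any hashable label, for example "cyrillic" or a tuple such as ("subscript", 2) for a subscript two levels deep. The speaker sets formatting once per context and it is picked up again whenever that context recurs.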

A baseline-violating proposal

Create a new set of grammar rules:

GAhE_1620 (terminal for lerfu formatting cmavo) - move all formatting cmavo here from BY
ZAI_1621 (terminal for zai and ce'a)

XUhU_1629 (terminal for new terminator cmavo)
formatted_lerfu_string_1988 : GAhE_1620 lerfu_string_root_986 XUhU_gap_1410 | ZAI_1621 lerfu_word_987 lerfu_string_root_986 XUhU_gap_1410 ;

XUhU_gap_1410 : XUhU_1629 | XUhU_1629 free_modifier_32 | error ;

Modify lerfu_string_root_986 thus:

lerfu_string_root_986 : lerfu_word_987
    | lerfu_string_root_986 lerfu_word_987
    | lerfu_string_root_986 PA_672
    | formatted_lerfu_string_1988
    ;
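
To see what these rules would accept, here is a rough recognizer sketch. It makes several simplifying assumptions: tokens arrive pre-tagged as (kind, text) pairs with the hypothetical kinds "GAhE", "ZAI", "XUhU", "lerfu_word" and "PA"; the terminator must be explicit (the `error` alternative and free modifiers are ignored); and the function names are invented for the example.

```python
def parse_root(tokens, i=0):
    """Recognize lerfu_string_root at tokens[i]; return the index after it."""
    kind = tokens[i][0] if i < len(tokens) else None
    if kind in ("GAhE", "ZAI"):
        i = parse_formatted(tokens, i)
    elif kind == "lerfu_word":
        i += 1
    else:
        raise SyntaxError(f"expected a letteral at position {i}")
    # The left recursion in the rule becomes a loop here.
    while i < len(tokens) and tokens[i][0] in ("lerfu_word", "PA"):
        i += 1
    return i


def parse_formatted(tokens, i):
    """Recognize formatted_lerfu_string at tokens[i]; return the index after it."""
    if tokens[i][0] == "ZAI":
        i += 1
        # ZAI must be followed by the lerfu word that names the format.
        if i >= len(tokens) or tokens[i][0] != "lerfu_word":
            raise SyntaxError(f"expected a lerfu word at position {i}")
        i += 1
    else:
        i += 1   # a bare GAhE formatting cmavo
    i = parse_root(tokens, i)
    if i >= len(tokens) or tokens[i][0] != "XUhU":
        raise SyntaxError(f"expected the terminator at position {i}")
    return i + 1
```

Note that, as the rules are written, a formatted string can only open a lerfu string root; after its terminator only plain lerfu words and PA may follow within the same root.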

I don't see the point of having formatting cmavo in the first place, we don't really need such a wide variety of pronouns, but if they are going to be there then the restriction you propose would make sense. Of course, that means we won't be able to say things like ga'e klama le zarci anymore... xorxes

The point of them is for use in mathematics, where plain-text distinctions require lots of formatting. For example, A might be a scalar, and bold A a corresponding vector.

But do we really want to refer to the vector as ce'a bold bu abu? It seems extremely inconvenient, especially if you have to repeat it several times and together with non-bold variables. (BTW, why are we using the English word "bold" for this?) Why even mention "bold", why not just "-vector bu abu"?
- Probably. But I couldn't be bothered to think of a native term for "bold", and hyphen notation is confusing to many, as well. --tsali

- - I proposed nacmei for "vector", [2]. So perhaps something like {nacmei bu abu} is easier to understand than {ce'a bold bu abu}. xorxes

@@ Line 1: / Line 1: @@
-. la maks <font color="#FF0000">to le do</font> gerku toi vikmi lo pinca di'o lemi sasfoi
+See also: [[BPFK Section: lerfu Shifts]], [[BPFK Section: lerfu Forming cmavo]]
-. la itiopias cu glare .i la dja<font color="#FF0000">mal ibu</font> pu klama
+There is a class of [[jbocre: letteral|letteral]]s that don't stand for a specific written
-. a'unai la mykydanldz lo vatyg<font color="#FF0000">au gusta</font>
+symbol, but rather modify the following letterals in a letteral
-. la sam pu zifygau leri cakyrespa gi'e zgana lo nu le zi<font color="#FF0000">fre sno</font>da'u cu klama
+string. I have chosen to call these "lerfu formatting cmavo".
-. lo ska<font color="#FF0000">mi a mi</font> kakne lo nu snada le cipra ne la turing
+The [[jbocre: CLL|CLL]] is rather skimpy on the details on how these cmavo are to be
-. le <font color="#FF0000">pa ris</font>na cu banzu
+interpreted, and how they interact, and this is one of the issues the
-. le terjbe po la <font color="#FF0000">brus sels</font>la le lanzu pe by
+BPFK has to decide.
-. le bimple <font color="#FF0000">montre a l</font>e jbugai montre cu drata
+As I ([[User:tsali|(tsali]]) promised in
+[http://www.lojban.org/bpfk/viewtopic.php?t=44], I'm making this page to
+show what we know about lerfu formatting cmavo from the examples in CLL,
+and some possible solutions to situations that aren't described in the
+book. '''Please keep any comments separate from the main text. Thanks.'''
+We assume for the time being that the lerfu formatting cmavo are the
+following: [[jbocre: ga'e|ga'e]], [[jbocre: to'a|to'a]], [[jbocre: lo'a|lo'a]], [[jbocre: ge'o|ge'o]], [[jbocre: je'o|je'o]], [[jbocre: jo'o|jo'o]], [[jbocre: ru'o|ru'o]], [[jbocre: lo'a|lo'a]], [[jbocre: na'a|na'a]]
+plus BU-letterals prefixed with the following: [[jbocre: zai|zai]], [[jbocre: ce'a|ce'a]]
+We assume that all other letterals stand for a distinct symbol, because
+otherwise, all letteral sequences with nonce letterals would be
+ambiguous as to what is formatting and what is printable characters.
+=== What we already know about the Lojban lerfu sequence formatting cmavo ===
+* The lerfu formatting cmavo modifies the immediately following ordinary letteral.
+* If X and Y are ordinary letterals, and Z is a lerfu formatting cmavo, and X precedes Y, and X is modified by Z, then Y is also modified by Z, except in cases where Z is a single-lerfu modifier, such as TAU or LAU. In other words, most lerfu formatting cmavo applies at least to the entire contiguous sequence of letterals following it.
+=== Models for handling cases that are as yet undefined ===
+All of these are intended to be baseline-compliant, ie. render none of
+the examples in [[jbocre: CLL|CLL]] invalid. Please add a note if you think this is
+incorrect.
+==== The Microsoft Word model (state machine) ====
+* Formatting applies by default to all ordinary letterals of the current sequence
+* Formatting is by default additive. For example, "ce'a bold. bu ce'a .italik. bu .abu" results in a lowercase a that is both bold and italic.
+** We could establish a system of "categories" that lerfu formatting cmavo can encode "features" of, and say that subsequent lerfu formatting cmavo supersede previous ones '''in the same category'''. Thus, "ce'a pavrel. bu .abu ce'a  pavnon. bu by." results in a 12pt lowercase a followed by a 10pt lowercase b.
+* Formatting may be canceled by a special format that says "revert to state before the application of the last lerfu formatting cmavo".
+==== The CSS model ====
+* Formatting applies to all following letterals in the string which are of "the same kind" as the lerfu formatting cmavo is. Unformatted lerfu forming cmavo apply to the entire remaining text; formatting that appears when Cyrillic shift is active applies to all remaining Cyrillic text, formatting that appears in [[jbocre: XI|XI]] subscripts N levels deep apply in all following subscripts N levels deep, and so on.
+* Less crunchy, because it relieves the speaker of the burden of updating many different kinds of formatting when switching contexts.
+* Undesirable because it establishes a hierarchical interpretation of a structure that is supposed to be "linear" (actually left-recursive, see grammar rule '''[[jbocre: lerfu_string_root_986]]''').
+=== A baseline-violating proposal ===
+Create a new set of grammar rules:
+* GAhE_1620 (terminal for lerfu formatting cmavo) - move all formatting cmavo here from BY
+* ZAI_1621 (terminal for zai and ce'a)
+* XUhU_1629 (terminal for new terminator cmavo)
+* formatted_lerfu_string_1988 : GAhE_1620 lerfu_string_root_986 XUhU_gap_1410 | ZAI_1621 lerfu_word_987 lerfu_string_root_986 XUhU_gap_1410 ;
+* XUhU_gap_1410 : XUhU_1629 | XUhU_1629 free_modifier_32 | error ;
+Modify lerfu_string_root_986 thus:
+lerfu_string_root_986 : lerfu_word_987
+;:|  lerfu_string_root_986  lerfu_word_987
+;:|  lerfu_string_root_986  PA_672
+;:|  formatted_lerfu_string_1988
+;:;
+--------
+I don't see the point of having formatting cmavo in the first place, we don't really need such a wide variety of pronouns, but if they are going to be there then the restriction you propose would make sense. Of course, that means we won't be able to say things like ''ga'e klama le zarci'' anymore... [[User:xorxes|xorxes]]
+The point of them is for use in mathematics, where plain-text distinctions require lots of
+formatting.  For example, A might be a scalar, and '''A''' a corresponding vector.
+*But do we really want to refer to the vector as ''ce'a bold bu abu''? It seems extremely inconvenient, especially if you have to repeat it several times and together with non-bold variables. (BTW, why are we using the English word "bold" for this?) Why even mention "bold", why not just "-vector bu abu"?
+**Probably. But I couldn't be bothered to think of a native term for "bold", and [[jbocre: hyphen notation|hyphen notation]] is confusing to many, as well. --[[User:tsali|tsali]]
+***I proposed ''nacmei'' for "vector", [http://www.lojban.org/jbovlaste/dict/nacmei]. So perhaps something like {nacmei bu abu} is easier to understand than {ce'a bold bu abu}. [[User:xorxes|xorxes]]
