BPFK Section: ZOI



  • Outstanding Issues
    • The CLL needs to be updated to reflect the behavior of the PEG grammar with regard to ZOI (see the CLL PEG Errata page).



Latest revision as of 07:03, 11 June 2015

  • Decided Issues
    • Point 1, below, is the official viewpoint: ZOI divides the input stream into words before looking for a token. Points 2 and 3 are not carried.

This page resulted from a discussion about the way ZOI is handled in camxes and the specification for it in the CLL:

http://groups.google.com/group/lojban/browse_frm/thread/aa27ff25e6dd5201

zoi, as described in the CLL:

http://mw.lojban.org/extensions/cll/19/10/

states, about delimiters:

"...and which is not found in the written text or spoken phoneme stream."

"Within written text, the Lojban written word used as a delimiting word may not appear..."

It then goes on to provide an example that is claimed to be ungrammatical:

mi djuno fi le valsi po'u zoi gy. gyrations .gy.

Given this example, one could infer that "gy" may not appear *at all* in the quoted text. The official parser (grammar.300) does not behave this way, however: it breaks the text into word tokens first, and so does not detect a delimiter that occurs only as a substring of a quoted word.

camxes also allows the form above as quoted by the CLL. It does not allow the following:

mi djuno fi le valsi po'u zoi gy. gyrate .gy.

because "gyrate" is read as three separate Lojban words (gy, ra, te). camxes parses the quoted text as if it were Lojban, so the first parsed word, "gy", matches the terminator and the quotation ends there.
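
The word-by-word matching described above can be sketched as follows. This is a minimal illustration in Python, not camxes itself: the function name `zoi_quote_by_words` is hypothetical, and the word lists are hand-written segmentations assumed for this example rather than actual parser output.

```python
def zoi_quote_by_words(words, delim):
    """Split already-segmented words at the first word equal to delim.

    `words` is the list of Lojban words that follow the opening delimiter.
    Returns (quoted_words, remaining_words), or None if no word matches.
    """
    for i, word in enumerate(words):
        if word == delim:
            return words[:i], words[i + 1:]
    return None


# Assumed segmentations (hand-written, not produced by a real parser).
gyrations = ["gyrations", "gy"]   # "gyrations" taken as a single word
gyrate = ["gy", "ra", "te", "gy"]  # "gyrate" taken as three words

print(zoi_quote_by_words(gyrations, "gy"))  # (['gyrations'], [])
print(zoi_quote_by_words(gyrate, "gy"))     # ([], ['ra', 'te', 'gy'])
```

In the second case the quotation closes immediately and the words "ra te gy" are left over outside the quote, which is why the sentence as a whole is rejected.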

Exactly how a parser should parse text between zoi delimiters needs to be decided. There are three extant proposals:

  1. Consider the PEG grammar as it is written now to be correct, and update the CLL to describe more accurately how zoi works.
  2. Replace the rule for zoi-word so that it matches non-Lojban words (strings of non-whitespace) rather than Lojban words, so 'gyrate' won't be divided into three words. The CLL will need to be updated for this behavior too.
  3. Replace the PEG grammar with something that reads the text between the ZOI delimiters a character at a time, either requiring a pause before the final delimiter or not. This is consistent with the CLL.
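
To make the difference between proposals 2 and 3 concrete, here is a rough Python sketch. It is not taken from any existing parser: the function names, the treatment of "." as a pause marker, and the choice to pass in only the text after the opening delimiter are all assumptions made for illustration.

```python
import re


def split_tokens(text):
    """Split on whitespace and '.' (treated here as a pause marker)."""
    return [tok for tok in re.split(r"[\s.]+", text) if tok]


def zoi_by_tokens(text, delim):
    """Proposal 2 (sketch): each non-whitespace chunk is one word.

    The quote ends at the first chunk that equals the delimiter.
    """
    tokens = split_tokens(text)
    for i, tok in enumerate(tokens):
        if tok == delim:
            return " ".join(tokens[:i])
    return None  # no closing delimiter found


def zoi_by_chars(text, delim, require_pause=True):
    """Proposal 3 (sketch): scan the raw character stream.

    With require_pause, only '.' followed by delim ends the quote;
    without it, the first occurrence of delim anywhere ends the quote.
    """
    needle = "." + delim if require_pause else delim
    end = text.find(needle)
    if end == -1:
        return None
    return text[:end].strip()


after_opening = " gyrate .gy."
print(repr(zoi_by_tokens(after_opening, "gy")))                      # 'gyrate'
print(repr(zoi_by_chars(after_opening, "gy")))                       # 'gyrate'
print(repr(zoi_by_chars(after_opening, "gy", require_pause=False)))  # ''
```

Under these assumptions, only the character-based scan without a pause requirement ends the quote inside "gyrate", which matches the strict reading that the delimiter may not appear in the quoted text at all.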

ZOhOI

The proposed selma'o ZOhOI quotes a single word, allowing a non-Lojban word to be quoted without using delimiters. The behavior of ZOhOI is worth considering when deciding how ZOI should parse the text between its delimiters.

What ZOhOI considers a single word should probably be what ZOI considers a single word, unless it is decided that ZOI works a character at a time.
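
One way to picture a shared notion of "word" is a single helper used by both quoting forms. The Python sketch below assumes the whitespace-or-pause definition of a word; the names `next_foreign_word`, `zohoi_quote` and `zoi_quote` are hypothetical and do not correspond to rules in any existing grammar.

```python
import re


def next_foreign_word(text):
    """Return (word, rest) for the next run of non-space, non-'.' characters.

    Leading whitespace and pauses are skipped. Returns None if no word is left.
    """
    match = re.match(r"[\s.]*([^\s.]+)(.*)", text, re.DOTALL)
    if match is None:
        return None
    return match.group(1), match.group(2)


def zohoi_quote(text):
    """ZOhOI (sketch): quote exactly one word, no delimiters."""
    return next_foreign_word(text)


def zoi_quote(text, delim):
    """ZOI (sketch): collect words until one equals the delimiter."""
    words, rest = [], text
    while True:
        step = next_foreign_word(rest)
        if step is None:
            return None  # ran out of input before the closing delimiter
        word, rest = step
        if word == delim:
            return " ".join(words), rest
        words.append(word)


print(zohoi_quote(" gyrate .i mi klama"))  # ('gyrate', ' .i mi klama')
print(zoi_quote(" gyrate .gy. .i", "gy")[0])  # gyrate
```

Because both functions call the same helper, any change to what counts as a word applies to ZOhOI and ZOI together.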