Tansky-Lechevalier scoring algorithm

Latest revision as of 08:12, 8 June 2015

This is the commonly assumed algorithm for selecting a canonical form (dictionary form) of a lujvo. It was created by Bob and Nora Lechevalier in 1989, and is printed in section 4.12 of The Book. The following is a mostly verbatim quotation:

  1. Count the total number of letters, including hyphens and apostrophes; call it "L".
  2. Count the number of apostrophes; call it "A".
  3. Count the number of "y"-, "r"-, and "n"-hyphens; call it "H".
  4. For each rafsi, find the value in the following table. Sum this value over all rafsi; call it "R":

     rafsi form                 example   value
     CVC/CV (final)             -sarji    1
     CVC/C                      -sarj-    2
     CCVCV (final)              -zbasu    3
     CCVC                       -zbas-    4
     CVC                        -nun-     5
     CVV with an apostrophe     -ta'u-    6
     CCV                        -zba-     7
     CVV with no apostrophe     -sai-     8

  5. Count the number of vowels, not including "y"; call it "V".
  6. The score is then: (1000 * L) - (500 * A) + (100 * H) - (10 * R) - V
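
The six steps above can be sketched in Python as follows. This is only an illustration, not a reference implementation: the names `score`, `shape` and `RAFSI_VALUES` are made up here, and the sketch assumes the lujvo has already been split into its rafsi and hyphens (the splitting itself is not covered). The sample words are assembled from the example rafsi in the table purely to exercise the arithmetic.

```python
VOWELS = set("aeiou")        # "y" is deliberately excluded (step 5)
HYPHENS = {"y", "r", "n"}    # single-letter hyphens between rafsi (step 3)

# Consonant/vowel shape of each rafsi form, mapped to its table value (step 4).
RAFSI_VALUES = {
    "CVCCV": 1,  # CVC/CV (final), e.g. -sarji
    "CVCC": 2,   # CVC/C, e.g. -sarj-
    "CCVCV": 3,  # CCVCV (final), e.g. -zbasu
    "CCVC": 4,   # e.g. -zbas-
    "CVC": 5,    # e.g. -nun-
    "CV'V": 6,   # CVV with an apostrophe, e.g. -ta'u-
    "CCV": 7,    # e.g. -zba-
    "CVV": 8,    # CVV with no apostrophe, e.g. -sai-
}


def shape(rafsi):
    """Reduce a rafsi to its pattern of C, V and apostrophe characters."""
    return "".join(
        "'" if ch == "'" else "V" if ch in VOWELS else "C" for ch in rafsi
    )


def score(parts):
    """Score a lujvo given as a list of rafsi and hyphens (assumed pre-split)."""
    word = "".join(parts)
    L = len(word)                                  # letters, hyphens, apostrophes
    A = word.count("'")                            # apostrophes
    H = sum(1 for p in parts if p in HYPHENS)      # y-, r-, n-hyphens
    R = sum(RAFSI_VALUES[shape(p)] for p in parts if p not in HYPHENS)
    V = sum(1 for ch in word if ch in VOWELS)      # vowels, not counting "y"
    return 1000 * L - 500 * A + 100 * H - 10 * R - V


print(score(["zba", "sai"]))               # 6000 - 150 - 3 = 5847
print(score(["nun", "y", "nau"]))          # 7000 + 100 - 130 - 3 = 6967
print(score(["sai", "r", "zba", "ta'u"]))  # 11000 - 500 + 100 - 210 - 5 = 10385
```

Passing the word pre-split keeps the sketch short; an unknown rafsi shape raises a `KeyError` rather than being scored silently.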

This score is calculated for all possible forms of the lujvo, and the one with the lowest score is selected as the canonical form. The algorithm has no provision for ties, but ties are rare, given the large number of coefficients that factor into the score.
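
The selection step can be sketched as follows, assuming the scores of the candidate forms have already been computed. The helper name `pick_canonical` is hypothetical, and because the algorithm does not say how to break ties, this sketch simply returns every form that shares the lowest score and leaves the decision to the caller. The two sample forms, gerzda and ge'uzda, and their scores were worked out by hand from the formula above and are included only as an illustration.

```python
def pick_canonical(scores):
    """Return the form(s) with the lowest score; more than one means a tie."""
    if not scores:
        raise ValueError("no candidate forms given")
    best = min(scores.values())
    # Sorting makes the result deterministic when several forms tie.
    return sorted(form for form, value in scores.items() if value == best)


# gerzda:  (1000 * 6) - (500 * 0) + (100 * 0) - (10 * 12) - 2 = 5878
# ge'uzda: (1000 * 7) - (500 * 1) + (100 * 0) - (10 * 13) - 3 = 6367
candidates = {"gerzda": 5878, "ge'uzda": 6367}
print(pick_canonical(candidates))  # ['gerzda']
```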

This algorithm was written on the basis of the personal tastes of its authors (which appear to be quite compatible with the tastes of the rest of the Lojban community). It prefers short words over long ones, and vowels over consonant clusters. It also ranks the different rafsi forms according to which of them the authors find more pleasing.