Tansky-Lechevalier scoring algorithm: Difference between revisions

From Lojban
m (Gleki moved page tansky-Lechevalier scoring algorithm to Tansky-Lechevalier scoring algorithm over a redirect without leaving a redirect)
m (convert table and quote syntax to work with Mediawiki)
Line 2: Line 2:
This is the commonly assumed algorithm for selecting a canonical form (dictionary form) of a [[lujvo|lujvo]]. It was created by Bob and Nora Lechevalier in 1989, and is printed in section 4.12 in [[The Book|The Book]]. The following is a mostly verbatim quotation:
This is the commonly assumed algorithm for selecting a canonical form (dictionary form) of a [[lujvo|lujvo]]. It was created by Bob and Nora Lechevalier in 1989, and is printed in section 4.12 in [[The Book|The Book]]. The following is a mostly verbatim quotation:


# Count the total number of letters, including hyphens and apostrophes; call it ``L''.
# Count the total number of letters, including hyphens and apostrophes; call it "L".


# Count the number of apostrophes; call it ``A''.
# Count the number of apostrophes; call it "A".


# Count the number of ``y''-, ``r''-, and ``n''-hyphens; call it ``H''.
# Count the number of "y"-, "r"-, and "n"-hyphens; call it "H".


# For each rafsi, find the value in the following table. Sum this value over all rafsi; call it ``R'': CVC/CV (final) (-sarji) 1 CVC/C (-sarj-) 2 CCVCV (final) (-zbasu) 3 CCVC (-zbas-) 4 CVC (-nun-) 5 CVV with an apostrophe (-ta'u-) 6 CCV (-zba-) 7 CVV with no apostrophe (-sai-) 8
# For each rafsi, find the value in the following table. Sum this value over all rafsi; call it "R":


# Count the number of vowels, not including ``y''; call it ``V''.
::{| class="wikitable"
! rafsi form !! example !! value
|-
| CVC/CV (final) || -sarji || 1
|-
| CVC/C || -sarj- || 2
|-
| CCVCV (final) || -zbasu || 3
|-
| CCVC  || -zbas- || 4
|-
| CVC || -nun- || 5
|-
| CVV with an apostrophe || -ta'u- || 6
|-
| CCV || -zba- || 7
|-
| CVV with no apostrophe || -sai- || 8
|}
# Count the number of vowels, not including "y"; call it "V".


# The score is then: (1000 * L) - (500 * A) + (100 * H) - (10 * R) - V
# The score is then: (1000 * L) - (500 * A) + (100 * H) - (10 * R) - V

Revision as of 02:22, 8 June 2015

This is the commonly assumed algorithm for selecting a canonical form (dictionary form) of a lujvo. It was created by Bob and Nora Lechevalier in 1989, and is printed in section 4.12 in The Book. The following is a mostly verbatim quotation:

  1. Count the total number of letters, including hyphens and apostrophes; call it "L".
  2. Count the number of apostrophes; call it "A".
  3. Count the number of "y"-, "r"-, and "n"-hyphens; call it "H".
  4. For each rafsi, find the value in the following table. Sum this value over all rafsi; call it "R":

     rafsi form                example   value
     CVC/CV (final)            -sarji    1
     CVC/C                     -sarj-    2
     CCVCV (final)             -zbasu    3
     CCVC                      -zbas-    4
     CVC                       -nun-     5
     CVV with an apostrophe    -ta'u-    6
     CCV                       -zba-     7
     CVV with no apostrophe    -sai-     8

  5. Count the number of vowels, not including "y"; call it "V".
  6. The score is then: (1000 * L) - (500 * A) + (100 * H) - (10 * R) - V
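
The steps above can be sketched in Python roughly as follows. This is only an illustration, not code from The Book: the names `lujvo_score`, `shape`, and `RAFSI_VALUES` are made up here, and the sketch assumes the caller has already split the lujvo into its rafsi and its hyphen letters, since the quoted steps do not say how to do that.

```python
VOWELS = set("aeiou")  # "y" is deliberately excluded, as in the V step

# Values from the rafsi table, keyed by consonant/vowel shape.
RAFSI_VALUES = {
    "CVCCV": 1,  # CVC/CV (final), e.g. -sarji
    "CVCC": 2,   # CVC/C, e.g. -sarj-
    "CCVCV": 3,  # CCVCV (final), e.g. -zbasu
    "CCVC": 4,   # e.g. -zbas-
    "CVC": 5,    # e.g. -nun-
    "CV'V": 6,   # CVV with an apostrophe, e.g. -ta'u-
    "CCV": 7,    # e.g. -zba-
    "CVV": 8,    # CVV with no apostrophe, e.g. -sai-
}


def shape(rafsi):
    """Map a rafsi to its C/V pattern, keeping apostrophes as they are."""
    return "".join(
        "'" if ch == "'" else ("V" if ch in VOWELS else "C") for ch in rafsi
    )


def lujvo_score(rafsi, hyphens=()):
    """Score one lujvo form given its rafsi and its hyphen letters.

    Assumption: `rafsi` is a list such as ["nun", "nau"] and `hyphens`
    lists the "y", "r", or "n" hyphens used to join them, e.g. ["y"].
    """
    text = "".join(rafsi) + "".join(hyphens)
    L = len(text)                       # all letters, hyphens, apostrophes
    A = text.count("'")                 # apostrophes
    H = len(hyphens)                    # y-, r-, and n-hyphens
    R = sum(RAFSI_VALUES[shape(r)] for r in rafsi)
    V = sum(ch in VOWELS for ch in text)  # vowels, not counting "y"
    return 1000 * L - 500 * A + 100 * H - 10 * R - V


print(lujvo_score(["zba", "sai"]))         # 5847
print(lujvo_score(["nun", "nau"], ["y"]))  # 6967
```

The two calls reuse example rafsi from the table purely for illustration. A rafsi whose shape is not in the table raises a `KeyError` in this sketch, which seemed safer than guessing a value.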

This score is calculated for all possible forms of the lujvo, and the one with the lowest score is selected as the canonical form. The algorithm has no provision for ties, but ties are rare, given the number of weighted terms that enter the score.
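
A minimal sketch of that selection step, assuming the scores of the candidate forms have already been computed. The function name `lowest_scoring_forms` is hypothetical. Because the algorithm itself does not say what to do with a tie, the sketch returns every form that shares the lowest score and leaves the choice to the caller.

```python
def lowest_scoring_forms(scores):
    """Return every form that shares the minimum score, sorted by name.

    `scores` maps each candidate form of a lujvo to its score.
    A result with more than one entry means the candidates are tied.
    """
    if not scores:
        raise ValueError("no candidate forms given")
    best = min(scores.values())
    return sorted(form for form, s in scores.items() if s == best)


# Illustrative candidates built from the example rafsi in the table:
# zba + sarji and zbas + y + sarji, scored with the formula above.
candidates = {"zbasarji": 7917, "zbasysarji": 10047}
print(lowest_scoring_forms(candidates))  # ['zbasarji']
```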

This algorithm was written on the basis of the personal tastes of its authors (which appear to be quite compatible with the tastes of the rest of the Lojban community). It prefers short words over long ones, and vowels over consonant clusters. It also ranks the different rafsi forms according to which of them the authors find more pleasing.