me lu ju'i lobypli li'u 1 moi


For a full list of ju'i lobypli publications, see ju'i lobypli.
Next issue: me lu ju'i lobypli li'u 2 moi.

Me la Uacintyn Loglytuan *    Number 1 - July 1986

* (Washington loglan-worker - the old metaphor behind Loglantan (Loglan language talk) would result in Loglylentaa(n) which seems too long. Of course log is free as an affix and could be assigned to generically reflect all of the L-prims related to Loglan (culture, language, and people), resulting in Loglentaa(n). In any case, there are as of yet, no speakers of the language, and I want people who are working purposefully, which is what turka means).


WHO - From Bob LeChevalier(rjl). Washington DC area Loglan worker, and currently coordinating Loglan dictionary update work. Intended originally for DC area loglanists, the audience has expanded to include any loglanist who is sufficiently local that they may be interested in special interest group (SIG) activities, and those who have evidenced by past contribution that they are capable and possibly willing to work to complete the current language effort. It is my hope to inspire other local areas with a high concentration of loglanists to form their own SIG; I have therefore included all known loglanists in the U.K., Boston area, and Raleigh NC area, as areas where I know that there is interest in establishing such SIGs. I've called the majority of you, and the interest is certainly there.

When SIG membership is clear from your responses, I'll include an address list. Please let me know phone numbers, and whether you wish them released. I do a lot of business by phone, and I expect others do as well. But I also have a personal standard to not release phone numbers without permission. If you have volunteered your number in Lognet, I will assume it is releasable.

WHAT - This newsletter was originally intended to bring together all of the loglanists in the DC area. There has been a dearth of information about what is going on in the language, and there are a lot of 'coiled springs' out there looking for a language to work with. I found that there was desire for information about other projects besides the dictionary, and a significant interest in volunteering to help in those efforts. In attempting to determine the status of those efforts for this newsletter, I have committed to sending this to a much larger audience than originally intended. As such, at least for the nonce, the newsletter is available for anyone who wishes to be active in the Loglan community.

I am putting this out unofficially. The newsletter, and the DC SIG that I am trying to organize, are not part of the Loglan Institute. We overlap Members of the Institute, ex-subscribers of TL, and miscellaneous others who have expressed interest in the language. As such, all statements made in this are the opinions of the author, and are not in any way to be construed as official Institute policy. As such, the author of all material will be identified. At the moment, I'm paying all costs - which will be some 20 to 30 dollars this issue (not to mention the phone bills); I will accept contributions, but there is nothing required except interest at this time.

On the other hand, because the Institute is a vital part of the Loglan effort, and Jim Brown(jcb) is the inventor of the language, all material will be made available to him with an open invitation for him to respond to any and all topics raised in these pages. In addition, all material is being given to Ed Prentice, or to anyone else who serves as Lognet editor, with permission to use any or all material herein. Since neither I, nor anyone else except possibly jcb, can be considered a master of either the language or the ongoing work, I expressly invite him or anyone else to offer any corrections of fact stated in this publication, subject only to an attempt to constrain against 'political dispute' (see below), which is not the purpose of this effort.

Anyone may submit material. Send it to me at the address below. Please distinguish private correspondence from SIG submissions. Preferred material is:

  1. self identification, especially if you have never been identified in Lognet. Among other things, I suggest giving your background in Loglan related subjects, your Loglan-related interests, areas you have or would like to work in. How much time you are willing to spend on the language, and whether you especially are interested in working face-to-face, by mail, network, or phone with others. Computer information will help others who are designing Loglan software.
  2. anything related to ongoing projects covered by this newsletter.
  3. news about available Loglan-related materials
  4. reactions to language proposals, GPA materials, etc., of general interest
  5. questions and answers of general interest
  6. technical material of general interest. This may be published as Appendices, so people can read the news without being bogged down in details. I also may eventually include such material for selected interested people only, to reduce costs. In such case, I'll list the material in the newsletter, and let people write to ask for it. If we start publishing word lists or lengthy translations, this may become necessary.

Since I'm a slow typist, I'll obviously give priority to computerized submissions (CP/M or MS-DOS), or short handwritten or typewritten pieces. Longer stuff not in computer format will get in when I (or some other volunteer typist) have time.

WHERE - The DC SIG is primarily intended for those who live in the DC area. jcb has contacted at least one Delaware resident and suggested he contact me. As such, I am opening the SIG to any who express interest without geographical constraint. My address, for correspondence related to this newsletter or the SIG or dictionary work is:

                          Robert LeChevalier
                            2904 Beau Lane
                           Fairfax, VA 22031
                      phone: 703-385-0273 (home)
                             703-847-4465 (work)

WHEN - The DC SIG will tentatively hold an organizational meeting on Saturday, 26 July, so we can all meet each other. DC area people will be contacted with specific times. Others who would attend, please call or write and I'll let you know. Guest space is available at my place for the weekend, and I am located on the DC Metro so that I can be easily reached without a rental car from Amtrak Union Station or Washington National airport.

There are also plans to hold a Logfest at my house in the late August/early September time-frame. I am suggesting any of the weekends including 16 August, 7 September, or possibly Labor Day weekend - 30 August. (I have plans for the 23rd.) The time-frame would take advantage of anyone who may pass through on the way to the World Con SF convention in Atlanta, since I know several Loglanists who usually go to such things. Later tends to be better - DC is noted for August heat - but I'm flexible. All Loglanists are invited, and we'll be trying to have versions of the various unpublished software available for demo, at least on PC-compatibles, by then. If you are interested, and would like to vote on a date, let me know. Try to express your interest prior to the SIG meeting on the 26th.

WHY - Loglan is achieving stability again as a language. jcb is writing a summary of the language (NB3), and then is planning on dropping out for three years or so. I'd like to talk him into an extra 6 months after NB3 to help consolidate one or more SIGs like this one to carry on active Loglan work without requiring his active involvement. To do this, we must convince him of the dedication and interest of the community that would make further such effort worthwhile. I believe the existing community has more potential to carry on active Loglan work than will a half-time graduate assistant (HAGA), although I agree that there is more than enough work in Gainesville with the Institute to justify such a person.

Most Loglanists I have talked to have cited the following as reasons for their current lack of involvement. The SIG is specifically intended to counter those reasons - by any and all means feasible.

  1. The deaths of The Loglanist (TL) and Lognet (LN) have left the community with little current news as to what is going on. This is coupled with the fragmentation of the community into 'members', 'TLers', and various other interested people including those of the above who lapsed or resigned.
  2. A dispute that occurred in the 1983-85 time-frame which has been labelled as 'politics'. There were some personality conflicts among active workers, as well as some significant technical disputes, which culminated in an apparent struggle for control over the Institute. (I was not involved, and intentionally so. I consider myself strongly loyal to the Institute, and to jcb's efforts as father of the language, and saw no benefit in the dispute to either the membership or the language.) It is my intention to support no debates over Institute policies or decisions, and to maintain as apolitical a stance as possible, given that most Loglanists are highly intelligent, creative, and individualistic.
  3. A large portion of the Loglan community are not skilled in either or both of languages or linguistics, and the technical level of the material in TL was far above that of L1 in those areas. The vocabulary is arcane and somewhat jargonized, and many L1 readers have not been able to follow it.
  4. The work has in general not been divided up so as to allow a person with little time to feel that da's contribution can be worth the time commitment, even if small. The primary low-effort activities, taste-tests and word-making, have been slow, multi-person efforts with little response given to give da a strong feeling of participation.

    To get the various tasks on track, accurate direction must be given, and the workers must be kept informed of progress.

    This newsletter will attempt to do so for all projects which SIG participants are involved in. It is my belief that 2 hours a week on average by any of you will enable a major contribution. If the whole group of about 30-40 that I expect will continue to receive this can overall average 3-4 hours a week, our total effort will be the equivalent of 2 full time people or 4 HAGA's.

  5. While there have been calls for volunteers for many activities, there has been too little information for people to judge whether they can meet the commitments of volunteering. There has also been a lack of clear direction as to the procedures of the various activities. In the case of the dictionary, luckily, jcb has written me a 17-page Manual of Dictionary Reformatting, and I've had a weekend in Gainesville to gain further details, coupled with 9 years of TL and LN commentary suggesting changes and means for dictionary rewriting. Most of the other tasks, on the other hand, have relied on close direction, interaction, and feedback with and from jcb - who is after all only one man. This has been heightened by the haphazard entry of Loglan into the computer age, with inherent compatibility problems, and a general tendency away from correspondence as a means of communication. Those efforts being coordinated through the SIG will obviously require less of jcb's time.

    I'm considering building SIG membership on a couple of computer networks. Compuserve and PeopleLink (American Home Network) have been suggested. The former has wider coverage. The latter is cheaper. Any others, or votes for the above? Such networks, and an emphasis on quick redistribution of input data, will hopefully keep you better informed and interested.

  6. Many people want to learn the language. Books and computer programs will help, and more current tapes would also be useful. But most people want someone to speak to or write to. This is the main purpose of a SIG - to form a subcommunity of people to work together and to interact. Those SIG members who are geographically close to each other should meet. It's a lot easier to learn to speak a language if there is someone to listen and to give feedback.
  7. There has been no accurate information available as to the specifics of the language since about 1979-80, when the original Loglan books (grammar - L1, and dictionary - L4/5) were sufficiently outdated to no longer serve as a basis for active work. As such, many Loglanists have not felt capable of participating due to a lack of current knowledge.

There have been several updates and partial summaries since 1979, but many have not reached the entire community. The following summarizes these, which are frequently referred to in Loglan discussions. Many are still available from the Institute, and if you ordered or expected any and did not receive them (except for MacTeach and NB3), please contact the Institute. Your balance may be too small, or your order may have been lost. I'm suggesting to the Institute that any new orders for old material be billed at a higher rate - say 10 cents per page - to cover inflation and to conform to the new policy of supporting the Institute with product sales, rather than just paying costs.

TL 4/2 was intended as a supplement to update L1, but it was written at a much more technical level, and generally stated the issues more than resolved them. Notebook 1 (NB1) documented the machine validated grammar (MacGram), as it currently existed. Notebook 2 (NB2) documented changes to the morphology, primarily in complex-making, as a result of a research effort known as the Great Morphological Revolution (GMR). Both NB1 and NB2 were apparently obsolete within a month after each appeared, but too little about the problems in each, and their resolution, appeared in LN, the only active publication at that time. The last 2 issues of TL were TL6/1 and TL7/1, which updated the morphology and grammar, respectively. On the whole, each is reasonably accurate, though there are significant changes to each, and TL7/1 does not contain the complete grammar as did NB1, in the detail needed to actively work with the language. There have been abortive or sluggish attempts to update other material, including the Primer, L1, and the L4/L5 dictionary, but none has been completed. jcb is currently working on a current state-of-the-language document (NB3), which may resolve this problem, and Glen Haydon and jcb have worked off-and-on to complete an automated flashcard program called MacTeach. Other efforts, such as my dictionary work, are now rapidly accelerating in light of the potential for NB3 to bring everyone back up to date. The response to my telephone calls to start this SIG has indicated an underlying enthusiasm that should allow many of the rest of the ongoing efforts to be completed within a few months after NB3 becomes generally available, thus allowing the Loglanist masses to help out again.


The Status of the Loglan Project

The following is as complete and accurate as I'm able to determine. I'll keep everyone up to date with any new information I acquire.

Going Public Again (GPA) - Our goal. This has evolved over the years as participation has risen and fallen. The following is a summary of projects that people have included in the effort in previous descriptions, and a short status. I'll then go into detail on those which are still (to my knowledge) active.

  1. L4/L5 Dictionary Update - This has cycled between being a reformatting, and a true update. Time and participation will determine whether the update will consist only of correcting typos, updating existent word changes, and adding approved new words, or a more major revision that might include redefinitions and even new changes (such as argument structure) to old words. The time-frame is also negotiable, but I think a good product can be out in a year with preliminary drafts available in 6 to 8 months for workers.
  2. L1 Update - considered by many to be the most essential element, this is on hold and may not be done given jcb's other priorities. However:
  3. NB3 - this is to be jcb's statement of the language as it is. 40 odd pages are done, and if he does the whole language as he has covered the portion so far, the L1 update may be superfluous. NB3 is apparently what used to be called L6, but seems to be going beyond at least what I expected. Probably out by the end of the year.
  4. MacTeach - this is a series of 8 or so flashcard programs for computers, along with cassette tapes to allow the user to hear what da is working on. Glen Haydon and jcb have worked off and on for a few years, and have settled what is wanted. I have an early version of the first program, and it seems useful. SIG members will be helping finish the programs, which may be in FORTH as currently, or rewritten in other languages. Machine transportability, and program size are problems, especially for CP/M machines. A problem exists in timing responses, since various machines have different clock speeds, or means of determining time. The first program will probably be out within a month after jcb's return. Depending on our approach, the rest could take from a couple of months to a year to complete.
  5. LYCES/MacGram - The computer verification of the language's unambiguity. Scott Layson has updated this to run on an MS-DOS machine. With the larger memory, the whole language will again fit. jcb apparently has solutions to three old problems to be confirmed and the work will be done.
  6. Loglan Interactive Parser (LIP) - This is a subset of LYCES that verifies whether a given Loglan statement is grammatical, and how it should be grammatically interpreted. This is the test for an English-to-Loglan translation, or for an original Loglan composition, and a possible basis for a Loglan-to-English translator. Until we have significant experience writing/speaking the language, this tool will be vital. I've been trying to reach Scott Layson to determine the status; we have people willing to work on the update. One known change, besides updating the grammar, will be the capability to parse larger chunks of text.
  7. Trial N Grammar - Related to LIP is a simple definition of the current grammar. NB1 had the last published version. TL7/2, which never came out, was to have the final grammar. I'm trying to get the current grammar so that people have a good tool to supplement the TL7/1 Teaching Corpus. In any case, NB3 should include this with the last of the LYCES-verified changes if I can't get it out sooner.

    jcb has asked for volunteers to write a short description of the grammar. Such a description is needed for the dictionary, as described in my Lognet submission. With no current formal definition of the grammar, I've heard reluctance to try to use the language. (TL7/1 does not say what has been changed since NB1) Any volunteers, or is someone already working on it?

  8. SA Translation Project - as a test of the language, jcb has translated portions of Scientific American articles from 8 languages into Loglan. This has added many new words, some new letters to the alphabet, and in general has proven the language. jcb intends to include excerpted translations in NB3.
  9. Other translations - there is little in the way of other translation in progress. A couple of linguists have agreed that the SA effort should be confirmed with some Bible (and perhaps Shakespeare) translations. Literary use of Loglan will be verified, and The Bible, regardless of one's religious beliefs, is the most linguistically studied text in history. The current scientific bent of translations will also weight down the dictionary with specialized scientific terminology.
  10. Primer - a written grammar of the language. MacTeach is only useful to those with computers. This would be an expansion and completion of the original primer that is now found in TL2. Chuck Barton was working on this, and may be interested in completing it when details of the current grammar are available to him.
  11. Other computer aids - Nora Tansky had a working flashcard program a few years ago, which should be compared with the current MacTeach. Anita Lees also was developing such a program, but I have no status on it. Nora also had a random Loglan sentence generator and a Loglan-to-pidgin-English translator. These might be combined with LIP and the MacTeach framework to complete the MacTeach set, or may be developed separately. Nora also has programmed an appendage to her machine to 'speak' Loglan. She says it has a few problems, and the device doesn't fully support the Loglan phoneme set. Perhaps newer devices are more powerful, and a tool can therefore be devised that will speak text while it is being worked on, say in conjunction with MacTeach or LIP.
  12. Eaton Interface/Shakedown Cruise - originally a test of GMR; this has become a cornerstone of GPA as the primary organized Loglan work being worked on outside of Gainesville the past few years. jcb wants good coverage of the most used concepts in the natural languages, and Eaton has been the standard reference in this field. Kieran Carroll has patiently coordinated this effort, but volunteers have drifted away and progress is slow. With his new job and an impending marriage, Kieran's effort faces continued slow progress unless a new infusion rescues it.
  13. Complexing - my proposed solution to the Eaton project, which has stemmed from a need to verify the complex-making algorithm. I'll describe this below, but it will closely tie the Eaton work to the dictionary, and generate lots of words. I'm hoping to involve all of you in this.
  14. Universals - Chuck Barton has told me that there are new theories of linguistic universals that may be supplanting Eaton as the measure of language commonality and coverage. Someone needs to look into this further.
  15. Yngve - jcb has requested people to look into this researcher's work. I don't know what it is about, but it may have to do with the adequacy of the Corpus as a test of the grammar.
  16. Publications - Loglan has seen little general print over the years since the June 1960 SA article. There have been other articles published which I'll list below. With GPA, we need to get more national attention. Birrell Walsh has offered to write a piece for the publishers of the Whole Earth catalog, who are interested.
  17. Funding - jcb is going to seek funding for his graduate assistant to support the Gainesville Institute work. This will give some stability to the Institute and its finances.

Apparently, though, a lot of linguistics work is being funded now through the Defense Department. There are applications for Loglan in artificial intelligence (AI) and in international communications systems, which could bring DOD funding. But there is the obvious need to prevent possible classification of any funded research, and the DOD is not the most popular of institutions. But some people I've talked to in the DOD environment say that the time has come for a Loglan-like language to be used by computers in AI-aided communication and translation, and the DOD may develop one of its own within a few years. The side-effects of such funding could ensure Loglan's acceptance, especially for those interested in its use as an international language or in AI, as well as giving the Institute a steady income to fulfill the language's potential. A controversial topic - opinions are welcome.

If I've forgotten your project, let me know, and I'll include it next issue.

More on L1 update and NB3

As I said above, NB3 could completely replace L1 if jcb continues writing at the level, and with the completeness and clarity he seems to be using so far. But if he is not going to cover the whole language at this level in NB3, an update to L1 should be a GPA priority. I'm hoping that a massive show of support may result in both.

There was talk at one time that instead of rewriting L1, that jcb would paste up a copy into a notebook and annotate it with updates and corrections. I should think that this would be comparatively easy after NB3 is done, especially if someone else volunteers to paste up the master for him. After he marks it up, we can set up a team to get it typed into a machine. He could choose to just outline the changes he wants and cross-reference to NB3, and the team could flesh out the text, subject to his approval when it's done. This minimizes jcb's time spent on this high priority project that would interfere with more technical tasks that require greater time investment.

I personally would like the revised L1, or NB3 to mention all little words(LWs), if only in an Appendix, with an example or two of the usage of each. That and an index would be the most significant improvements to L1 besides correcting for changes in the language.

More on the Dictionary work

I am including in the Appendices a copy of my lengthy submission to LN last year which includes several notes and suggested plans for the dictionary update. I've updated these to reflect the current status as much as possible, but I tried to retain my original flavor, so I could not cover everything current about the dictionary. The following are items about the status, and where you can help.

  1. New Format - I have converted the raw L4/L5 keypunched card image data to a delimited field format. The format and a sample of the text can be found in the LN submission Appendix 2. I believe my conversion is better than 95% accurate, which will minimize the need for manual massaging. For practical reasons, I may have to do that massaging myself, or use volunteers with directly compatible computers who can give short turnaround. The data is voluminous! As noted in the LN text, the comma-delimited format (see Appendix 1) given in LN for wordmakers is useful for their short-form lists. But the longer Universal format is needed to support the dictionary work, since much of the data - especially English Translation (E-trans) data - doesn't fit in the comma-delimited format. I am asking for volunteers to write programs to convert between the two formats, and from Universal format to dictionary text. (The Dictionary Reformatting Manual is needed for the latter.)
  2. Remade Primitives/Etymologies - I have replaced all of the old primitives with the new primitives. This list is substantially identical with that in TL6/1 and NB2. I have also updated the primitive etymologies. These are the same as in NB2. Three of the etymologies had to be changed. Appendix 3 lists corrections to the NB2 primitives, affixes, and etymologies, so as to bring everyone current. I have a complete set of primitive etymologies on disk, if any linguists want to check the data vs. their own dictionaries/knowledge. I wouldn't doubt that there are a few typos in there, as well as possible phonological errors - and I have no knowledge of most of the languages in question. I will try to get a list of the dictionaries jcb uses for his etymology work, since standardization of our references is highly desirable.
  3. Word Types - Dictionary data has been sorted by word type to allow groups of people to work on specialized sections of the text. With the primitive changes and the upcoming remaking of the complexes, the dictionary can no longer be maintained in sorted order. However the Universal format was designed to enable most sort programs to be able to produce a sorted listing easily, although it will be time-consuming. The current text takes over 6 Meg on my disk. I expect it to double (or more) before it is done.
  4. Priority Efforts - My commitments to Jim mean that the highest proportion of time will be spent on complex regeneration. He has also approved of my coordinating an analysis of the little words. Some must be remade because the primitives they were derived from changed. And all the new little words must be written up in Universal format, preferably with many examples for the E-trans section. A third effort is the reconstruction of the Element Names and primitives. The final rules on borrowings made the NB2 list obsolete, and the new alphabet makes it possible to make the names more phonologically correct. With help from jcb and Rebecca Bach (a new recruit), I have a new list which I'll try to have ready for kibbitzing within the month. Any kibbitzers out there?
  5. Old New Words - A most important task to be done with the dictionary is to comb the back issues of TL and LN for little words, new primitives, metaphors, etc. that need to be added to the dictionary. I believe pc, Chuck Barton, and Colin Fine have notes on these topics, but I don't know how complete they are. And they must be converted to Universal format. I'm hoping to get one of them to volunteer to collect all of the cards and notes from everyone and preferably to write (or better word process) them in Universal format. We can then generate word lists for others to review in whatever format is found most useful. All proposals should be so included, but the collector will inherently have primary kibbitzing rights as to which should become 'official'. The final decisions of course, are up to the Academy.
  6. New New Words - A similar task is to collect all of the new words that jcb has collected from the translation work, Eaton work, and words of opportunity, and put them in Universal format to be merged with the others. Some kibbitzing has been done on the Eaton words. But the conversion is still necessary to get them into the dictionary.
  7. Complex-Making Algorithm - There is now a formal algorithm for complex making. It is in 3 parts, and I'll try to include all of them in the Appendices. Part 1 is Kieran Carroll's algorithm for choosing the best metaphor for an English word. Parts 2 and 3 are the integrated generation of the Loglan complexes from the metaphor affixes. First a set of all legal complexes for a metaphor must be made (Part 2), and then the best must be selected (Part 3). There are actually two proposed Part 3 algorithms so far, and discussion indicates that a new one will eventually be necessary due to the 'untastiness' of certain consonant pairs, or interactions between affixes.

    A computer program must be written to automate Parts 2 and 3. I've had a couple of volunteers, but the project is important enough that I want several versions. I will try to distribute the best to all who do word-making. Word-makers need a user-friendly interface. The dictionary effort needs a version which can take the various Universal format inputs, and either print the best choice, all choices, and/or substitute a selected choice back into the Universal format dictionary where appropriate. A sophisticated user interface is needed here.

  8. Complexing - As part of evaluating various Part 3 algorithms, I came up with an idea which may solve Eaton interface problems while helping verify the algorithms. While in Gainesville, the two versions were developed by generating random test case complexes using convenient affixes. jcb agreed that algorithms should be tested like the old taste tests - by generating complexes, and letting human evaluators verify the computer choices.
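As a rough illustration of the Part 2 / Part 3 split described in item 7 above, the sketch below first enumerates every candidate complex for a metaphor and then picks one. Everything specific in it is an assumption made for illustration only: the affix table uses invented placeholder strings rather than real Loglan affixes, `is_legal` is a stand-in that accepts every candidate because the actual morphology rules are not modelled, and the "shortest, then alphabetical" selection rule is hypothetical and is not either of the Part 3 algorithms developed in Gainesville.

```python
from itertools import product

# Hypothetical affix table: each primitive maps to the affixes that may stand
# for it in a complex. These strings are invented placeholders.
AFFIXES = {
    "prim_a": ["pa", "paf"],
    "prim_b": ["bo", "bol", "bolo"],
}


def is_legal(candidate):
    """Placeholder legality check; the real morphology rules are not modelled."""
    return True


def generate_complexes(metaphor):
    """Part 2: build every combination of one affix per primitive."""
    choices = [AFFIXES[primitive] for primitive in metaphor]
    joined = ("".join(parts) for parts in product(*choices))
    return [candidate for candidate in joined if is_legal(candidate)]


def select_best(candidates):
    """Part 3 (assumed rule): shortest candidate, ties broken alphabetically."""
    return min(candidates, key=lambda c: (len(c), c))


candidates = generate_complexes(["prim_a", "prim_b"])
best = select_best(candidates)
print(candidates)
print(best)
```

Keeping selection in its own function means a different Part 3 rule can be swapped in and its choices put in front of human evaluators, in the manner of the old taste tests, without touching the generation step.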

The logical extension of this is to take the set of primitives, and generate 'metaphor pairs' to test the algorithm. This primitive set exists separately from the dictionary (as part of the MacTeach development effort), in a form most suitable for a computer program to play metaphor generator. A variety of the complex building program could then take the output of that program and generate taste tests for any given Part 3 algorithm.

I then realized that the intermediate product is even more useful. We have the capability, by computer, to generate all of the n-part metaphors possible with the set of primitives. There are, of course, millions. But if we had a systematic approach to selecting which n-combinations were most productive of useful metaphors, we could easily generate lots of new words. The affix set provides this approach - the most useful words in metaphors have the largest affix sets.

A computer program which generates the 2-part metaphors possible with the affix set would generate tens of thousands of possible metaphor words. Human reviewers can then go down lists, and determine which, if any, bring to mind useful English (or other languages) translations/concepts. Da would then write these down. The results of all reviews would be sorted by E-trans, giving cases of English words with multiple metaphors, of which the best can be chosen.

Since a similar data set for argument structures of primitives is needed for MacTeach, a similar program could apply quantifier and conversion little word affixes to primitives to generate all of the other kind of 2-part complex metaphors.

An interesting thing about this approach is that it exactly models the process by which a Loglan listener will recognize a new metaphor. Hearing a complex, da will decompose it into the metaphorical roots, and try to determine the concept that is intended. And because we are attempting to model the process, experienced list workers should be able to analyze prospective metaphors at several per minute. We should literally be able to produce thousands of words if a significant percentage of metaphors are useful. (I suspect they won't - if even 10% are useful, the 900 odd primitives and thus 800000+ 2-part metaphors would produce a staggering 80000 new words). As such, 1000 metaphor-blocks should be able to be analyzed in an hour or two, at most. With a discriminator to reduce the volume of useless algorithm output produced by the computer, the effort, while time consuming, could be completed within a few months, if everyone gets involved.
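As a rough check on those figures, here is a minimal Python sketch. The function names are illustrative only, and it assumes that a 2-part metaphor is an ordered pair of two different primitives; if a primitive may also be paired with itself, the total is 900 x 900 instead.

```python
from itertools import permutations


def count_two_part_metaphors(n_primitives: int) -> int:
    """Number of ordered pairs of distinct primitives: n * (n - 1)."""
    return n_primitives * (n_primitives - 1)


def two_part_metaphors(primitives):
    """Yield every ordered pair of distinct primitives as a candidate metaphor."""
    return permutations(primitives, 2)


total = count_two_part_metaphors(900)  # 900 * 899 = 809100 candidate metaphors
useful = total // 10                   # the 10% case discussed above

# A tiny demonstration with three primitives from Appendix 3.
sample = list(two_part_metaphors(["godzi", "marte", "sluko"]))
```

With 900 primitives this gives 809100 candidates, in line with the "800000+" figure, and a tenth of that is about 80000.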

The other interesting thing about this approach is that it will produce a uniquely Loglan set of common concepts. With Zipfian assumptions, the 2-part metaphors should represent the most common and useful words in Loglan. We will end up with a complete set. I have no doubt that, when applied against the Eaton list, this set will cover most Eaton concepts. The remainder can be analyzed for patterns that might suggest additional primitives and affix requirements, but the job will be much smaller, and it may be superfluous since the word set will most likely exceed the goals of the original Eaton project in coverage.

We obviously need some programs as described above, and we need someone to generate the raw argument list for the primitives. The latter will be useful for MacTeach in any case.

This idea was developed after jcb left for Europe. I would like to have useful (and impressive) results for him when he returns. A coordinator other than myself may be required eventually, since this work is voluminous enough to conflict with the dictionary effort. But it will be fun, perhaps more fun than the Eaton effort, which some have told me was often slow, tedious, and unproductive. And working on this will help us all learn the language and contribute, without having to have dictionaries, grammars, or a knowledge of computers or linguistics. I intend to involve everyone in this if it looks like it will work, and there are at least 90 Loglanists out there still active according to jcb. Watch out world!

Changes to the Language

I won't try to describe all of these. NB3 does so. But since the last reporting in LN, some things of note have occurred.

Hyphens - My proposal for the use of y and iy as hyphens, described in the last LN, was modified and merged with the TL6/1 comments on the same subject, and accepted into the language. The details of hyphenation are in the complex-making algorithm - see Appendix 4. In general, when hyphenation is required, use y (schwa) after all affixes except CVV/Cvv's; use r after the latter, unless the first letter in the next affix is r, in which case use n. The latter preserves the mandatory consonant pair in all cases, and is easier to say. The n is necessary to preserve permissible medials. If a buffered dialect is being used (as when a speaker cannot manage specific consonant pairs), then use y as the buffer, and iy (yuh) as the hyphen.
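The choice of hyphen described above can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, the sketch decides which hyphen to use and not whether one is required, it ignores the buffered dialect, and the affix "rea" in the usage note is a made-up placeholder.

```python
VOWELS = set("aeiou")


def is_cvv(affix: str) -> bool:
    """True for a 3-letter consonant-vowel-vowel affix (CVV or Cvv)."""
    return (
        len(affix) == 3
        and affix[0] not in VOWELS
        and affix[1] in VOWELS
        and affix[2] in VOWELS
    )


def hyphen_after(affix: str, next_affix: str) -> str:
    """Pick the hyphen to place after `affix` when one is required."""
    if not is_cvv(affix):
        return "y"  # schwa hyphen after everything except CVV/Cvv
    # After CVV/Cvv use r, unless the next affix itself starts with r.
    return "n" if next_affix.startswith("r") else "r"
```

For instance, hyphen_after("men", "nai") gives y (as in menynai), hyphen_after("sai", "sai") gives r (as in sairsai), and hyphen_after("sai", "rea") gives n.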

Tosmabru - The *Tosmabru test, as reported in TL6/1 was erroneously stated, and the example of a word that should pass should instead have been detected to decompose. The complex-making algorithm in Appendix 4 has the correct test.

Alphabet - jcb has devised a scheme for representing Linnaean binomials (the formal names for plants and animals). Partly from this and the SA translations, he determined a need for additional coverage of phonemes in the alphabet to replicate names and binomials. Thus:

q         is the 'th' in 'theta', the Greek letter
x         is the hard ch in 'Bach'
w         is the u umlaut in the German name for Munich

Thus the latter is now la Mwnxen, Bach is la Bax and Khrushchev is now la Xrustcev.

These letters are never found in 'true' Loglan words, except in the names for themselves as used in acronyms. They specifically cannot be used in predicates, borrowings, or little words.

Cases - jcb has devised an approach to representing cases in Loglan. All of the possible arguments have been classed into 13 types. The details have not yet been described, and the LWs have not been assigned. LN48.9 described the basis for the case system, which would expand the lexeme set consisting of the pua family. A set of 13 words, similar to sau would be used as 'case-tags', in the same way that pua words number arguments. Sau for example could be redefined to cover 'sources, authors, or starters' since it now means 'from source/origin'. He suggests dii for 'destinations or receivers', and mou for 'moveables or transmissibles'. The labels would make learning arguments easier, and eliminate problems caused by the sometimes arbitrary argument order. Perhaps they could even be used to solve the problem of how to relate an argument not normally part of the predicate definition, by providing a clue as to how that argument relates to the predicate. The problem of reordering arguments for style and emphasis would obviously be made easier.

The negatives include the possible standardization of these words in all utterances, which would be redundant and inelegant. They are, in effect, like prepositions in English, which are not always necessary, but required. The English 'I go store' is clear to the listener, but not grammatical. (These would be optional, of course, in Loglan; if the speaker wishes to use either the conversion operators nu, fu, ju, or the numerical argument pua series labels, they will cover all usages.)

I am personally undecided on cases. There might be a tendency for people to retain more of their old cultural thought patterns when using Loglan, which could reduce Whorfian effects. And they add more semantic content to LWs, which as lexemes have been almost entirely grammatical. And they seem to eliminate the simple elegance of the predicate as a universal word type with all of the rest of the utterance implicitly related to it. On the other hand, cultures such as German which have strong case structures will find the language easier to use. English is relatively case-free, and we don't have a good feel for how other cultures will find the language. And the change is potentially major in its usage implications - too little time remains before GPA to thoroughly study those implications.

Commas - the written form of the language uses written commas in multiple ways. They represent pauses for emphasis, and close-commas embedded in vowel strings can be used to alter the default left-pairing of vowels, as described in TL6/1.

Borrowings - for those without TL6/1, borrowings are no longer required to be 1 mod 3 in length. The algorithm for defining the borrowing space is non-trivial. See my comments in the LN submission Appendix, as an alternate or supplementary proposal to the current system.

Optional Vowel Disyllables - the iV and uV vowel pairs can all be pronounced monosyllabically or disyllabically, depending on circumstances. The circumstances have been stated narrowly (only after vocalic consonants), and I personally prefer less restriction, since I've found some possible complexes almost un-mouthable monosyllabically, while others are easy. Try mreduo, glopuo, and polpuo as examples that I find difficult to say without tongue-twisting, or splitting the disyllable. (Incidentally, similar tongue twisting effects are why I believe some changes will be needed to the Part 3 complex-building algorithm. Too many vocalics or monosyllables, and certain of the permissible medials, seem very untasty compared to hyphenated forms.)

Material I Have and Policy

jcb has given or sold me copies of nearly everything the Institute has put out over the years, including a complete set of TLs and LNs, and several things such as the Dictionary Reformatting Manual, and advance copies of NB3 text and MacTeach. He has given me no clear direction (and the Institute has never published to my knowledge) about any restrictions on the reproduction of this material for Institute business, such as the dictionary work, or his requested assistance in reviewing NB3 and completing MacTeach. Since the Institute sells much of this material, I do not feel at liberty to haphazardly reproduce it. Yet some things, like the new primitive/affix list, are needed by anyone who wishes to work actively on the language.

Since I wear 2 hats, editing this unofficial newsletter, and serving the Institute as a task foreman, I'm going to have to keep my hats separate. Thus, as editor, I will not be printing or reproducing any Institute-proprietary material, until jcb gets back from Europe and says it's OK. As foreman, however, it should be OK for me to make copies for those who need them to work on tasks that I am coordinating. I will list such material that I have that seems useful in such contexts, and in some cases, I will send it to you directly. If something I list seems like it will be useful in your Institute-related work, please request it and I'll oblige. This policy may, of course, be amended when jcb returns, if he so directs.

Bibliography

Things of Interest to Loglan-workers

There is an article on Loglan in Dr. Dobb's Journal, by Dave Cortesi, in the September 1982 issue. I do not yet have a copy.

jcb and Bill Greenhood wrote an article in Cultural Futures Research, a quarterly publication of a consortium including the American Anthropological Association. It takes up the entire issue of the Winter 1983/84 edition, and is entitled Paternity, Jokes, and Song: A Possible Evolutionary Scenario for the Origin of Mind and Language. The article is intriguing, and is partially derived from what has been learned in developing Loglan. Since the Institute did not copyright it, I can't make copies. The Institute does have a set of reprints available at the issue price of $7.50. If you can't find it in your local university library, and don't want to wait for jcb's return, I have an address for the journal editor. A related article appeared in TL6/1. jcb is presenting material on the subject at Oxford in England this August, for any of you in the U.K. whom he didn't tell. I don't know the exact dates.

jcb has been working on a paper with Scott Layson and several others on the machine grammar for the Communications of the ACM. I don't know if it has been published yet.

I have A Manual for Dictionary Reformatting, 17 pages, for those who work with me on the dictionary.

I have a complete list of primitive etymologies post-NB2, for linguists who would like to verify them. (Any volunteers?)

The following are references to LN articles I think are particularly useful to active workers. If you were not a member and so did not get them, or have lost your copy, I have a complete set up to the last (#52).

47.6 - jcb's request for a short summary of the current grammar.
48.9 - details of the 3 post-TL7 grammar changes. I believe these are the ones as yet un-YACCed. The 1st relates to a new use of ga to mark untensed predicates, the 2nd is the case system referred to above, and the 3rd relates to a way to collapse jia and jio into jo.
50.10 - the current policy on when and how to borrow scientific words.
50.15 - more on cases.
51.7 and 52.30 - Jeff Taylor's question and jcb's answer on terminating descriptions.
52.11 - jcb makes clear that old hands who may have disappeared during rough times are welcome to come back and work on the language. (James Carter, pc, Bob McIvor, and Chuck Barton were specifically mentioned as having volunteered to help on Eaton).
52.26 - Jeff Taylor and jcb, a letter in 'New Loglan', with jcb's analysis and translation. The detailed analysis is very helpful on grammatical problems. If there is any problem, it is that the number of comments is so large that it causes me to doubt I'll ever master the grammar. But those who are more familiar with it may reassure the rest of us by producing more such material and translations.

Most everyone has TL6/1, which is the latest summary of GMR, and TL7/1, which has the Teaching Corpus and is devoted to grammar. If you don't, the last Institute price was $3.25 for TL6/1 and $2.95 for TL7/1. It's hard to work without them.

Incidentally, while there are stacks of old dictionaries, there are effectively no remaining copies of either L1 or the Supplement (TL4/3).

Notebook #1 ($10.00 at last word) has the last complete corpus, and the formal grammar. TL7/1 had the updated Teaching Corpus, but this is apparently a subset of the full Corpus, and is missing all the background and many of those little words I never have really mastered. Some have found TL7/1 to be very useful; its inductive approach models natural language learning methods. But inductive methods do not give confidence that ba correctly understands the nitty gritty rules behind ba's learnings. I hope NB3's summary of the grammar includes the formal description, definitions of terms, a complete list of little words, and explanatory material like NB1, as well as the excellent teaching style of TL7/1.

Notebook #2 (also $10.00) describes the process used in remaking primitives, devising affixes, and otherwise completing GMR. There are several useful tables and lists, but most are outdated. The updated list of primitives and affixes in the Appendix to this newsletter will be a better reference, especially when coupled with the Complex-Making algorithm.

Notebook #3 (to be published this fall - estimated $50.00 at 15 cents/page) will be the most useful recent work on the language, and may supersede L1. I have the first 40-odd pages for review, and may ask for help in areas such as the phonology.

MacTeach (current deposit $50.00 - price will be a multiple of production and mailing costs to include small royalties to the authors) The first of these programs should be out by fall, and is an updated teaching-oriented flashcard manager. About 7 or 8 others, covering the whole language in ascending difficulty, are expected - though I have some ideas on how to extend the series using LIP to include translation training. As currently advertised, this includes permanent free update rights, but I think the Institute is risking something in making such a promise without requiring a shipping and handling fee for such updates. As planned, each ordered list in the series will come with a cassette tape so that you can hear the words/sentences as you work on them when you work in sequence order.

LIP (last listed at $60.00) This also allowed unlimited update rights, and since the current version needs update, will test that policy. (I think we should all be willing to contribute to costs given the Institute's financial state). jcb asked Scott for a version that could digest larger portions of text, as well as include new grammar changes. This may require larger machines than those that run CP/M. LYCES was rewritten to use the larger memory of jcb's new Z-100 under MS-DOS.

What Can You Do

Nearly everyone I've talked to has been interested in working. Some have been concerned about time commitments; some want to meet with other Loglanists, while others would prefer to work alone. Some feel relatively familiar with the new changes, others are completely lost and will be until a new L1 or similar lay-oriented text appears. Some are computer-proficient, others are 'users', and others don't have (or want to have) a computer. Some have remained active over the years; others have been recently lying low; others have been TLers, or even non-subscribers who do not know what has been changing in the last 5 years.

                  YOU ALL HAVE THE CAPABILITY TO HELP!

We have tasks, like complexing, that don't take a lot of time or knowledge of Loglan, nor do they require a computer. But lots of you are needed. The original Eaton work, while slower in apparent progress, is equally undemanding of time.

We have materials necessary to learn what is needed to do some of the more technical tasks.

If you want to help on any of the computer-related efforts, there are more than enough programs for everyone to write. Let me know.

And if you are uncertain whether you can do a particular job in the time requested, just say so. Neither the Institute nor myself are slave-drivers. There will be coordinators for each of the tasks, and we'll repartition the work as necessary to keep everyone busy and get the language done.

And then everyone can get on with the fun of learning and using our new language.


List of Appendices/Attachments

  1. Comma-Delimited Format
  2. rjl submission to LN
  3. Affix and etymology changes since NB2
  4. Complex-Making Algorithm, Parts 2 and 3
  5. Current List of Prims and Affixes
  6. Complex-Making Algorithm, Part 1 (if received from Kieran Carroll prior to publication).
  7. Manual of Dictionary Reformatting - This 17 page document is not going out to everyone, but only to those who have volunteered for dictionary work. Since it was written by jcb, I am not yet releasing it generally. Anyone who thinks that they need it may have a copy. See #7 for my policy on giving out copies.


Appendix 1 - Comma-Delimited Format

When to use which formats:

  1. Universal format is preferred for direct dictionary work, since it contains everything necessary for building dictionary entries.
  2. Comma-delimited format is preferred by jcb and Kieran Carroll for Eaton work, and is OK for dictionary work if you want to save time - we can fill in the other data later.
  3. For complexing, it is sufficient to just return word-lists with the English word (or phrase) suggested by the metaphor written alongside. Cross out metaphors with no obvious meanings, so we know you looked at them. If a list of taste-test trial words for a metaphor is given, circle the best, and/or put a number by them to indicate an order of preference. Simple!

All fields are separated by commas. If you need a comma in the text, use a semi-colon instead. Don't use semi-colons for any other purpose (they'll be changed to commas). If you do not have data for a given field, just put the commas together with nothing in between. If you don't know what to put in a field, put a question mark.

Field     Contents

1         5 digit Eaton number (3 digit page and 2 digit number indicating order on the page)
2         English keyword - 1 word to help identify the E-trans (and sort)
3         Loglan word/phrase
4         Etymology - if any. Use the format in the old dictionaries for primitives. If a complex, give the complete metaphor.
5         A full English translation of the word
6         Any Affixes for primitives, otherwise null
7         Derivative Complexes, if any (give the metaphor, we'll build the word)
8         Morphological Type - possible values are the same as for Line Type 1 in Universal format
9         Author's Initials
10        Year the word is invented
11        Eaton Rank in 1000's.

If submitted on computer, add notes with an ** in the first column of the first line, and at the end, so any computer processors don't try to figure it out. Notes could include references for etymologies, related E-trans, such as listed in the lines below each main entry in L4/L5, etc. Anything that will aid the conversion to Universal and then Dictionary format, or will explain your logic to any kibbitzers, is welcome.
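To make the field layout concrete, here is a minimal Python sketch of how a program might read one comma-delimited entry. It is an illustration, not an Institute tool: the field names, the `parse_record` function, and the example line are all assumptions made here (in particular the Eaton number and rank in the example are invented placeholders).

```python
# Hypothetical short names for the 11 fields listed above, in order.
FIELDS = [
    "eaton_number", "keyword", "loglan", "etymology", "translation",
    "affixes", "derivatives", "morph_type", "author", "year", "eaton_rank",
]


def parse_record(line: str) -> dict:
    """Split one entry into named fields, restoring ';' to ',' inside fields."""
    parts = line.rstrip("\n").split(",")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    # An empty string means "no data"; a lone "?" means "don't know".
    return {name: value.strip().replace(";", ",")
            for name, value in zip(FIELDS, parts)}


def is_note_marker(line: str) -> bool:
    """Lines starting with ** open or close a block of free-form notes."""
    return line.startswith("**")


# Invented example entry: the number 12345 and the rank 1 are placeholders.
EXAMPLE = "12345,go,godzi,,x goes to y from z; or from here,god goz goi,,?,rjl,1986,1"
record = parse_record(EXAMPLE)
```

Here record["translation"] comes back with its comma restored, record["etymology"] is empty, and record["morph_type"] is the question mark.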


Appendix 3 - Changes to the NB2 Primitive/Affix lists (pg 37-42)

                        Changes to Etymologies

(These can also be used to mark up the list in TL6/1, though some are already there.)

Old       New       Affixes        English keyword

blabo     bulbi     bui bul        bulb
brani     brona                    brown
bulju               buj            boil
carta     curtu                    shirt
cidjo     cibra                    bridge
citre               cie            thread
dampu     pudja                    thumb
dertu               deu der        dirt
detri     detra     dea det        daughter
durzo               dru dur duo    do
dutci               dut            doubt
flofu               -              float
folma               flo            full
forma               fom foa        form
gotri               got            industry
gotso     godzi     god goz goi    go
gusta     tasgu                    disgusted
kampe     kambi                    compare
klira     kalra                    collar
krena     kurti                    curtain
madji               maj            magician
marke     marte     mae            market
matca     metca                    match
matci               mai mac        machine
metro               meo            meter
pento     penta     pea pet        point
petri               -              distribute
pidra     hompi     hom hoi        drink
pirle               pie            parallel
rorno     horno                    horn
selba     helba     hel hea        help
slano               sla            slow
tatro               tat            theater
virsa               vir            poetry
virta               vit            ad

godzi  3v x goes to y from z. 2v x goes to y (from here).
godzi  -b 2/2e go 2/4c dzou 2/4R idti

marte  2n x is a market of town/district y, a place for trading
marte  -b 4/4e mart 4/5g markt 3/4f marche 4/6j maketto 3/7s
mercado 2/5h bazar 2/5R bazar

sluko  2n x is a/the lock of/on/in y. 1n lock
sluko  -b 2/3c suo 2/3e lock 2/5G schloss


[1993 note - the following is obsolete, but is the version as written by JCB and Lojbab in May-June 1986.]


CPX - Making Algorithm

The following is the algorithm for generating Loglan complexes that is being implemented as part of the dictionary reformatting project.

Given an n-term metaphor P1 P2 ... Pn and the (unflagged)* instruction to find the best** reduced word:

  1. Look up or generate all of the affixes (3- and 4- letter forms) of P1, P2, ... Pn-1, forming the sets {A1}, {A2}, ... {An-1}.
  2. Look up or generate all of the affixes (3- and 5- letter forms) of the final term Pn, forming set {An}. Eliminate any CVC affixes from this set.
  3. Form all of the combinations, without hyphens, a1I a2J ... anK, where a1I ∈ {A1}, a2J ∈ {A2}, etc. (There will be N{A1} x N{A2} x ... x N{An} such combinations. Form them all.)
  4. Install hyphens where necessary in any combination. Specifically:
    1. Put y at any proscribed C/C joint (e.g. mekykiu). See Section 10, page 29, of TL6.
    2. Put y at any proscribed C/CC joint (e.g. menydjo). See Section 11, page 29, of TL6.
    3. Put y after any 4-letter affix form (e.g. mrenysai).
    4. Test all CVC ... CVC + CVCCV forms for "Tosmabru failure", as follows. Examine all the C/C joints between the CVC affixes, and between the last CVC and the CVCCV term. If the first one or more of those C/C joints are "bridged" by permissible initials, listed in Section 13, page 30, of TL6, then the trial word will break up. But if the first C/C joint is unbridged, i.e., is impermissible as an initial CC, the trial word will not break up. It has passed the Tosmabru Test. Only the first joint in a trial word needs to be unbridged in order to ensure resolvability. (Note that this definition of the Tosmabru Test is a change from the version in Section 24, page 35, of TL6. ?Gusnilbo'tci can resolve to gu snilbo'tci, and the Tosmabru Test must therefore detect this failure.) Install y at the first bridged joint if the Tosmabru Test fails (e.g. tosymabru).
  5. Evaluate all combinations and select the best**.
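Steps 1 to 3 of the list above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program being written for the dictionary project: the function names are invented here, the 4-letter form is assumed to be the primitive with its final vowel dropped, the 5-letter form is assumed to be the primitive itself, and the metaphor in the usage note is made up. Hyphenation (step 4) and evaluation (step 5) are not attempted.

```python
from itertools import product

VOWELS = set("aeiou")


def is_cvc(affix: str) -> bool:
    """True for a 3-letter consonant-vowel-consonant affix."""
    return (len(affix) == 3 and affix[0] not in VOWELS
            and affix[1] in VOWELS and affix[2] not in VOWELS)


def affix_sets(metaphor, short_affixes):
    """Build {A1} ... {An} for a list of primitives.

    short_affixes maps a primitive to its 3-letter affixes.
    """
    sets = []
    for i, prim in enumerate(metaphor):
        short = list(short_affixes.get(prim, []))
        if i == len(metaphor) - 1:
            # Step 2: final term keeps non-CVC short forms plus the 5-letter form.
            sets.append([a for a in short if not is_cvc(a)] + [prim])
        else:
            # Step 1: non-final terms use short forms plus the 4-letter form.
            sets.append(short + [prim[:-1]])
    return sets


def unhyphenated_combinations(metaphor, short_affixes):
    """Step 3: every combination, one affix per term, without hyphens."""
    return list(product(*affix_sets(metaphor, short_affixes)))


# Affixes taken from Appendix 3; the metaphor "matci godzi" is only a demo.
AFFIXES = {"matci": ["mai", "mac"], "godzi": ["god", "goz", "goi"]}
combos = unhyphenated_combinations(["matci", "godzi"], AFFIXES)
```

For this demo the non-final set is mai, mac, matc and the final set is goi, godzi (god and goz are dropped as CVC finals), so six combinations are formed.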

* Unflagged: It will be possible to specify certain flags to constrain the algorithm in its evaluation and selection. The currently identified flags are:

   NR - Non-reduced. The non-reduced (4- or 5- letter affix) form will be automatically selected as the best, overriding the evaluation.

   RR - Right-reduced. Only the right-most term Pn may be reduced using a short affix form. All other terms will use the non-reduced form.

   LR - Left-reduced. The opposite of RR. All terms except the right-most term may be reduced using a short affix form, but Pn will use the non-reduced 5-letter form.

   LL - Leftmost-reduced. Only the left-most term P1 may be reduced using a short affix form. All other terms will be left unreduced.

   MO - Manual override. When previously processed through this algorithm, human review of the evaluation caused the manual selection of another form instead of the best scoring form. Determine the best form and report it, but indicate the manual override condition to allow human re-review. If available, also report the 'old' form that was determined by human review. (This is intended to cover two cases - those where some constraint on the word form is needed that is not covered by the other flags, such as 'middle term reduced' or, more likely, a form such as long-short-long-short, which reflects the compounding of 2 2-term metaphors. The second case is when usage or some non-algorithmic determination of 'tastiness' dictates that some form is preferable to the algorithmically selected form.) In case of manual override, the dictionary reformatting processing will not automatically change the entry, thus possibly leaving an improper word - if affixes have been changed - until the manual review takes place.

** Best: Options will exist in the implementation of the algorithm to report out any of the following:

  • all wordforms and their scores. If flags constrain the algorithm, wordforms excluded by the flags may be optionally either excluded or listed separately below the forms permitted by the constraints. This option will be used until an evaluation algorithm has stabilized.
  • the 5 best wordforms and their scores, again with optional separate listing of the 5 best non-constrained wordforms and their scores. This option will be used to reduce output for human review when a degree of confidence has been established in an evaluation algorithm, but while it is still considered desirable to allow human 'kibbitzers' to verify the evaluation.
  • the one best scoring wordform, and its score. (for dictionary reformatting). Optionally, the best non-constrained form and its score will also be reported. In case of manual override, the previous form and its score, and an indication of manual override will be reported as well, and no change will be made to the dictionary.


Evaluation Algorithms

Two types of evaluation algorithms are currently being implemented. Results from each will be produced, compared, and evaluated by the community of word-makers. After tuning, one algorithm will eventually be used to actually reformat the dictionary and to compose words from metaphor lists submitted by word-makers.

Algorithm 1 (JCB) - This algorithm form is similar to that of the Taste Test 5 evaluation. (See pp. 20-26 of NB2, especially Table 3.) Values are attached to individual affix forms, to hyphens, and to certain specified interactions between affixes. The score is the sum of these values.

The advantages include a short mathematically determinate result where the effect of each component can be clearly seen. A disadvantage is that the setting of values for each component or interaction is difficult, given that it is desirable to achieve a particular order of wordform scores as results of the evaluation. The effect of a small change in any value can be the remaking of many words. It is thus difficult to accurately reflect the results of taste tests in this algorithm form. The current set of component/interaction values being implemented is:

                   AFFIX FORMS

         Non-Finals               Finals

    Cvv- (sai-)    7         -Cvv (-sai)         7.6
    CVV- (veo-)    5.25      -CVV (-veo)         4.4
    CCV- (gra-)    4         -CCV (-gra)         3
    CVC- (men-)    5

    CCVC- (mren-)  -20.5     -CVCCV (-sadji)     -9
    CVCC- (matm-)  -25.5     -CCVCV (-brudi)     -9.5

                   INTERACTIONS

                   y hyphen       -6
                   r/n hyphen     -14
                   C/CC juncture  -1.3

Resolve ties by using the wordform with the minimum number of consonants.

                   EXAMPLES

men + sai                     mensai
 5  + 7.6                          = 12.6
gra + sai                     grasai
 4  + 7.6                          = 11.6
sai + gra                     saigra
 7  +  3                           = 10
men + gra                     mengra
 5  +  3 - 1.3                     = 6.7
men + y + nai                 menynai
 5  - 6 + 7.6                      = 6.6
sai + r + sai                 sairsai
 7  -14 + 7.6                      = 0.6
gra + sadji                   grasadji            (Left-Reduced)
 4     - 9                         = -5.0
sai + r + sadji               sairsadji           (Left-Reduced)
 7  -14    - 9                     = -16.0
mren + y + sai                mrenysai            (Right-Reduced)
-20.5 - 6 + 7.6                    = -18.9
mren + y + sadji              mrenysadji          (Unreduced)
-20.5 - 6   - 9                    = -35.5

gra + gra + sai               gragrasai
 4  +  4  + 7.6                    = 15.6
men + gra + veo               mengraveo
 5  +  4  + 4.4 - 1.3              = 12.1
sai + r + gra + veo           sairgraveo
 7  - 14 + 4  + 4.4                = 1.4
gra + mren + y + sai          gramrenysai         (only possible by Manual Override)
 4  - 20.5 - 6 + 7.6               = -14.9
gra + mren + y + sadji        gramrenysadji       (Leftmost-Reduced)
 4  - 20.5 - 6    - 9              = -31.5
gra + gra + sadji             gragrasadji         (Left-Reduced)
 4  +  4     - 9                   = -1.0
mren + y + mren + y + sai     mrenymrenysai       (Right-Reduced)
-20.5 - 6 - 20.5 - 6 + 7.6         = -45.4
mren + y + mren + y + sadji   mrenymrenysadji     (Unreduced)
-20.5 - 6 - 20.5 - 6   - 9         = -62
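The arithmetic in these examples can be reproduced with a short Python sketch. It is an illustration rather than the program being written for the dictionary project: the `score` function, the shape labels, and the convention of passing hyphens as separate parts are assumptions made here. The caller must supply each affix's shape, since the sketch does not try to tell Cvv from CVV, and the tie-breaking rule is not implemented.

```python
# Values copied from the AFFIX FORMS and INTERACTIONS tables above.
NON_FINAL = {"Cvv": 7.0, "CVV": 5.25, "CCV": 4.0, "CVC": 5.0,
             "CCVC": -20.5, "CVCC": -25.5}
FINAL = {"Cvv": 7.6, "CVV": 4.4, "CCV": 3.0, "CVCCV": -9.0, "CCVCV": -9.5}
HYPHEN = {"y": -6.0, "r": -14.0, "n": -14.0}
C_CC_JUNCTURE = -1.3


def score(parts):
    """Sum the Algorithm 1 values for a wordform.

    parts is a list of (text, shape) pairs; a hyphen is ("y", "hyphen") etc.
    """
    last = max(i for i, (_, shape) in enumerate(parts) if shape != "hyphen")
    total = 0.0
    for i, (text, shape) in enumerate(parts):
        if shape == "hyphen":
            total += HYPHEN[text]
            continue
        total += (FINAL if i == last else NON_FINAL)[shape]
        # C/CC juncture: a consonant-final affix directly followed,
        # with no hyphen between, by an affix starting with CC.
        if i > 0:
            prev_shape = parts[i - 1][1]
            if (prev_shape != "hyphen" and prev_shape.endswith("C")
                    and shape.startswith("CC")):
                total += C_CC_JUNCTURE
    return round(total, 2)


mensai = score([("men", "CVC"), ("sai", "Cvv")])
sairsai = score([("sai", "Cvv"), ("r", "hyphen"), ("sai", "Cvv")])
mengraveo = score([("men", "CVC"), ("gra", "CCV"), ("veo", "CVV")])
```

These three calls give 12.6, 0.6 and 12.1, matching the entries for mensai, sairsai and mengraveo.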

Algorithm 2 (RJL) - This algorithm is the opposite of Algorithm 1. Algorithm 1 attempts to define the factors governing tastiness of individual components, and builds a mathematical model composed of those factors, and then modifies the results minimally by the effects of interactions between components. This algorithm uses the tastiness of 2 adjacent interacting components as a gestalt, and does not directly reveal specific reasons or factors for the tastiness.

The algorithm assumes the following types of interacting components are found in complexes:

   Initial/Final (all 2-part complexes)
   Initial/Medial (the beginning of a 3-or-more part complex)
   Medial/Medial (The middle of a 4-or-more part complex)
   Medial/Final (the end of a 3-or-more part complex)

All legal forms of each interaction are enumerated (there are 37 forms where the 2nd component is medial, and 30 where the 2nd component is final) and evaluated via taste test. The ordered list is then scored descending from 37/30 for the tastiest or most desirable form to 1 for the least desirable. A given wordform is then evaluated by looking up the component interactions, and summing the scores. For a 2-complex, no summing is necessary, making this algorithm very straightforward in reflecting tastiness patterns. For more than 2 terms, there is a question whether middle terms might be improperly over-weighted over initial and final terms. (e.g., in a 3-term complex, the sum of the scores for Initial/Medial and Medial/Final would appear to give double weight to the single Medial term). Only review of algorithm results will determine if this bias exists.
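The lookup-and-sum idea can be sketched in Python as below. This is only a sketch of the mechanism: the function names are invented here, and the rank values in PLACEHOLDER_TABLES are made-up placeholders, not taste-test results. Real tables would hold one rank for every legal pair of forms.

```python
def interaction_types(n_terms: int):
    """Name the adjacent-pair interactions in an n-term complex."""
    if n_terms < 2:
        raise ValueError("a complex needs at least 2 terms")
    if n_terms == 2:
        return ["initial/final"]
    return ["initial/medial"] + ["medial/medial"] * (n_terms - 3) + ["medial/final"]


def score(shapes, tables):
    """Sum the rank of each adjacent pair of affix shapes.

    tables maps an interaction type to a dict of (shape, shape) -> rank.
    """
    kinds = interaction_types(len(shapes))
    return sum(tables[kind][(a, b)]
               for kind, a, b in zip(kinds, shapes, shapes[1:]))


# PLACEHOLDER ranks, invented for illustration only.
PLACEHOLDER_TABLES = {
    "initial/final": {("CVC", "Cvv"): 30},
    "initial/medial": {("CVC", "CCV"): 20},
    "medial/final": {("CCV", "CVV"): 12},
}

two_term = score(["CVC", "Cvv"], PLACEHOLDER_TABLES)           # one lookup
three_term = score(["CVC", "CCV", "CVV"], PLACEHOLDER_TABLES)  # two lookups summed
```

In the 3-term case the medial CCV shape contributes to both lookups, which is the possible double weighting of middle terms mentioned above.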

It is possible to correct for such biases by weighting the scores, and by adding in factors similar to those in algorithm 1 to reflect preferences in initial and final terms. It is also possible to adjust scores for interactions not reflected in the separate component scores, such as the interaction between specific consonants or vowels appearing in adjacent components (e.g. -saicai- might be down-weighted because of the proximity of s and c causing the Loglan equivalent of the English tongue twister 'She sells seashells ...').

The advantages of this approach include the direct reflection of the choices made by a speaker or auditor in the algorithm. As such, tuning the algorithm to reflect changed perceptions of the relative tastiness of words is relatively simple. The disadvantages include the hiding of the reasons behind the ordering used, and thus the possibility that hidden factors in the specific words comprising the taste tests might bias the scores in the algorithm. For example, the original auditor detected a personal bias against test words containing mre, presumably because the CC pair is so unfamiliar to an English-native speaker. Such biases can be corrected when identified, but this algorithm can be most effective only when many independent reviewers have confirmed the tastiness ordering used by examining its results.

The specific initial algorithm and values are as follows:

  1. Determine the interacting components in the wordform to be evaluated.
  2. Obtain the values for each such component.
  3. Apply any appropriate weighting factors.
  4. Sum the values to get a total score.
  5. Apply any corrections for initial and final components.
  6. Apply any corrections for special vowel or consonant patterns.
  7. Automatically evaluate all unhyphenated wordforms as better than any hyphenated wordforms regardless of score. (Or apply a negative correction for hyphens as in Algorithm 1. The correction is different for different numbers of terms.)
  8. Resolve ties in favor of the wordform with fewer consonants.

Values for interacting components may be found on the following table:

Interaction                     2-term CPX 3-or-more-term CPX
Pattern                         1st/Final  1st/Medial    Medial/Medial     Medial/Final

CCV + Cvv                       30         37            30                29
CVC + Cvv                       29         36            28                21
Cvv + CCV                       28         illegal       27                28
Cvv + CVC                       illegal    illegal       26                illegal

CCV + CVV                       27         31            35                22
CVC + CVV                       26         30            32                20
CVV + CCV                       25         illegal       37                27
CVV + CVC                       illegal    illegal       34                illegal

CCV + CVC                       illegal    35            25                illegal
CVC + CVC                       illegal    34            24                illegal
CCV + CCV                       24         33            36                23
CVC + CCV                       23         32            29                26

Cvv + Cvv                       illegal    illegal       22                30
Cvv + CVV                       illegal    illegal       33                24
CVV + Cvv                       illegal    illegal       31                25
CVV + CVV                       illegal    illegal       23                19

CVC + y + Cvv                   22         29            21                18
CVC + y + CVV                   21         28            20                17
CVC + y + CCV                   20         27            19                16
CVC + y + CVC                   illegal    18            18                illegal

Cvv + r/n + Cvv                 19         26            illegal           illegal
Cvv + r/n + CVV                 18         25            illegal           illegal
Cvv + r/n + CCV                 illegal    24            illegal           illegal
CVV + r/n + Cvv                 17         23            illegal           illegal
CVV + r/n + CVV                 16         22            illegal           illegal
CVV + r/n + CCV                 illegal    21            illegal           illegal
Cvv + r/n + CVC                 illegal    20            illegal           illegal
CVV + r/n + CVC                 illegal    19            illegal           illegal

Cvv + CVCCV/CCVCV               illegal    illegal       illegal           15
CVV + CVCCV/CCVCV               illegal    illegal       illegal           14

CCV + CVCCV                     15         illegal       illegal           13
CCV + CCVCV                     14         illegal       illegal           12
CVC + CVCCV                     13         illegal       illegal           11
CVC + CCVCV                     12         illegal       illegal           10

CCV + CVCC                      illegal    17            17                illegal
CCV + CCVC                      illegal    16            16                illegal
CVC + CVCC                      illegal    15            15                illegal
CVC + CCVC                      illegal    14            14                illegal

CVC + y + CVCC                  illegal    13            13                illegal
CVC + y + CCVC                  illegal    12            12                illegal

CVC + y + CVCCV                 11         illegal       illegal           9
CVC + y + CCVCV                 10         illegal       illegal           8

Cvv + r/n + CVCCV/CCVCV         9          illegal       illegal           illegal
CVV + r/n + CVCCV/CCVCV         8          illegal       illegal           illegal

Cvv + r/n + CVCC/CCVC           illegal    11            11                illegal
CVV + r/n + CVCC/CCVC           illegal    10            10                illegal

CCVC + y + Cvv                  7          9             9                 7
CCVC + y + CVV                  6          8             8                 6
CCVC + y + CCV                  5          7             7                 5
CCVC + y + CVC                  illegal    6             6                 illegal
CVCC + y + Cvv                  4          5             5                 4
CVCC + y + CVV                  3          4             4                 3
CVCC + y + CCV                  2          3             3                 2
CVCC + y + CVC                  illegal    2             2                 illegal

CCVC/CVCC + y + CVCCV/CCVCV     1          illegal       illegal           1
CCVC/CVCC + y + CVCC/CCVC       illegal    1             1                 illegal
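
Steps 1 through 4 of the algorithm amount to a table look-up followed by a sum. The following is a minimal sketch in Python, not part of the original algorithm write-up. It copies only a few rows of the table above, and the names SCORES, COLUMNS and basic_score are invented for illustration.

```python
# Column order follows the table above.
COLUMNS = ("1st/Final", "1st/Medial", "Medial/Medial", "Medial/Final")

# Excerpt of the table; None marks an entry the table lists as illegal.
SCORES = {
    "CCV+Cvv":     (30, 37, 30, 29),
    "CVC+Cvv":     (29, 36, 28, 21),
    "CCV+CVV":     (27, 31, 35, 22),
    "CCV+CCV":     (24, 33, 36, 23),
    "CVC+CCV":     (23, 32, 29, 26),
    "Cvv+r/n+CCV": (None, 24, None, None),
}


def basic_score(interactions):
    """Sum the table values for a list of (pattern, position) pairs."""
    total = 0
    for pattern, position in interactions:
        value = SCORES[pattern][COLUMNS.index(position)]
        if value is None:
            raise ValueError(f"{pattern} is illegal as {position}")
        total += value
    return total


# gra + gra + sai: CCV+CCV is the Initial/Medial pair,
# CCV+Cvv is the Medial/Final pair.
gragrasai = [("CCV+CCV", "1st/Medial"), ("CCV+Cvv", "Medial/Final")]
print(basic_score(gragrasai))  # 62
```

A 2-term complex has a single (pattern, "1st/Final") pair, so the look-up is the whole score.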


Variants for 3-or-more Term Complexes

Variant 1 - Multiply values for Medial/Final by 37/30 before summing for all 3-or-more term complexes. This adjusts for the unequal maximum value for that lookup. Alternatively, the following table summarizes the adjustment from the unmultiplied value above. This value may thus be added to the non-variant algorithm score.

Raw Value Adjust    Raw Value Adjust    Raw Value Adjust

     1    .2             11   2.5            21   4.8
     2    .5             12   2.8            22   5.1
     3    .7             13   3.0            23   5.3
     4    .9             14   3.2            24   5.5
     5    1.2            15   3.5            25   5.8
     6    1.4            16   3.7            26   6.0
     7    1.6            17   3.9            27   6.2
     8    1.9            18   4.2            28   6.5
     9    2.1            19   4.4            29   6.7
     10   2.3            20   4.6            30   7.0
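
The Variant 1 adjustment can also be computed directly rather than looked up. This is a small sketch with invented function names. Note that it rounds the exact product, so for a few raw values (29, for example, printed above as 6.7) it gives a result one tenth away from the printed table.

```python
def variant1_adjust(medial_final_raw):
    """Amount added when a Medial/Final value is scaled by 37/30."""
    return medial_final_raw * 37 / 30 - medial_final_raw


def variant1_score(initial_medial, medial_final):
    """Variant 1 score of a 3-term complex: only Medial/Final is scaled."""
    return initial_medial + medial_final * 37 / 30


# men + gra + veo: Initial/Medial = 32, Medial/Final = 22
print(round(variant1_adjust(22), 1))     # 5.1
print(round(variant1_score(32, 22), 1))  # 59.1
```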


Variant 2 - Apply the Variant 1 correction. Then apply a correction for the Initial and Final terms as follows. The values are determined by averaging those values for a given affix-form for Initial position in the Initial/Medial look-up, and for Final position in the Medial/Final look-up.

Initial Affix-form       Values                   Adjustment

Cvv + r/n                26+25+24+20+11 /5        21.2
CVV + r/n                23+22+21+19+10 /5        19.0
CCV                      37+35+33+31+17+16 /6     28.2
CVC                      36+34+32+30+15+14 /6     26.8
CVC + y                  29+28+27+18+13+12 /6     21.2
CCVC + y                 9+8+7+6+1 /5             6.2
CVCC + y                 5+4+3+2+1 /5             3.0

Final Affix-form                                  x 37/30

Cvv                      29+21+30+25 /4           32.4
y + Cvv                  7+4 /2                   6.8
CVV                      22+20+24+19 /4           26.2
y + CVV                  6+3 /2                   5.6
CCV                      28+27+23+26 /4           32.1
y + CCV                  5+2 /2                   4.3
CVCCV/CCVCV              15+14+13+12+11+10 /6     15.4
y + CVCCV/CCVCV          9+8+1 /3                 7.4
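
The Variant 2 adjustments above are plain averages, with the Final ones additionally scaled by 37/30. A short sketch (function names are invented) that reproduces two rows of each table from the listed values:

```python
from statistics import mean


def initial_adjustment(values):
    """Average of the Initial/Medial values for one initial affix-form."""
    return round(mean(values), 1)


def final_adjustment(values):
    """Average of the Medial/Final values for one final affix-form, x 37/30."""
    return round(mean(values) * 37 / 30, 1)


print(initial_adjustment([37, 35, 33, 31, 17, 16]))  # CCV initial: 28.2
print(initial_adjustment([9, 8, 7, 6, 1]))           # CCVC + y initial: 6.2
print(final_adjustment([29, 21, 30, 25]))            # Cvv final: 32.4
print(final_adjustment([28, 27, 23, 26]))            # CCV final: 32.1
```

If the underlying tastiness table is retuned, the adjustments can be regenerated this way instead of being recomputed by hand.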


EXAMPLES (listed in the same order as Algorithm 1 for comparison)

                                   Basic     Variations
                                   Score     1         2

men + sai                     mensai
                                   = 29
gra + sai                     grasai
                                   = 30
sai + gra                     saigra
                                   = 28
men + gra                     mengra
                                   = 23
men + y + nai                 menynai
                                   = 22(hyph)
sai + r + sai                 sairsai
                                   = 19(hyph)
gra + sadji                   grasadji            (Left-Reduced)
                                   = 15
sai + r + sadji               sairsadji           (Left-Reduced)
                                   = 9(hyph)
mren + y + sai                mrenysai            (Right-Reduced)
                                   = 7(hyph)
mren + y + sadji              mrenysadji          (Unreduced)
                                   = 1(hyph)

(Due to the hyphen rule, grasadji would be preferred over sairsai.)
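
The hyphen rule (step 7) and the tie-break (step 8) can be expressed as a single sort key. This is only a sketch: the tuple layout and function names are invented, and it assumes that the hyphen y counts as a vowel while the hyphens r and n count as consonants.

```python
VOWELS = set("aeiouy")  # assumption: the hyphen 'y' is treated as a vowel


def consonant_count(word):
    """Number of consonant letters in a wordform."""
    return sum(1 for ch in word if ch not in VOWELS)


def rank(candidates):
    """Order (wordform, score, hyphenated) tuples from best to worst.

    Unhyphenated forms come first regardless of score, then higher
    scores, then fewer consonants to break ties.
    """
    return sorted(
        candidates,
        key=lambda c: (c[2], -c[1], consonant_count(c[0])),
    )


candidates = [
    ("sairsai", 19, True),
    ("grasadji", 15, False),
    ("mengra", 23, False),
    ("menynai", 22, True),
]
for word, score, hyphenated in rank(candidates):
    print(word, score, "(hyph)" if hyphenated else "")
```

With the scores listed above, grasadji sorts ahead of sairsai even though its score is lower.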

gra + gra + sai               gragrasai
    33 + 29                                  +6.7      +28.2+32.4
                                   = 62      = 68.7    = 129.3
men + gra + veo               mengraveo
    32 + 22                                  +5.1      +26.8+26.2
                                   = 54      = 59.1    = 112.1
sai + r + gra + veo           sairgraveo
    24   +   22                              +5.1      +21.2+26.2
                                   = 46      = 51.1    = 98.5
gra + mren + y + sai          gramrenysai         (only possible by
                                                       Manual Override)
    16   +   7                               +1.6      +28.2+6.8
                                   = 23      = 24.6    = 59.6
gra + mren + y + sadji        gramrenysadji       (Leftmost-Reduced)
    16   +   1                               +.2       +28.2+7.4
                                   = 17      = 17.2    = 52.8
gra + gra + sadji             gragrasadji         (Left-Reduced)
    33   +   11                              +2.5      +28.2+15.4
                                   = 44      = 46.5    = 90.1
mren + y + mren + y + sai     mrenymrenysai       (Right-Reduced)
    1    +        7                          +1.6      +6.2+6.8
                                   = 8       = 9.6     = 22.6
mren + y + mren + y + sadji   mrenymrenysadji     (Unreduced)
    1    +        1                          +.2       +6.2+7.4
                                   = 2       = 2.2     = 15.8


The following is material that I wrote last year and submitted to John Lees for Lognet. I sent some advance copies to others for comment, and received enough such comments that I decided to rewrite it before publication. Unfortunately, Lognet died before I finished the rewrite. I have revised the information to be current.


x.1 Net from Bob LeChevalier (RJL), Background

I was an inactive Loglanist starting in 1979, and have been a member for some 4 or 5 years now. I was JCB's only San Diego resident Loglanist during the last few years that the institute was located there. I wasn't active because my practical skills at languages and linguistics were negligible. However, I was JCB's local sounding board during the early years of GMR. I thus feel a sense of the history of the project even though I haven't really contributed much time before the dictionary effort.

I am 32 and single. I have a B.S. in Astrophysics from Michigan State (which I haven't used since joining the computer field). By trade I am a computer systems requirements engineer working for Systems Development Corporation. I am prone to occasional periods of heavy travel and lots of work deadlines which have made my contributions to Loglan subject to a lot of interruption.

I am the "dictionary update foreman" originally defined in LN29.22. jcb has recently redefined this more narrowly to include only those aspects relating to updating the old data to the new format, and changing that data to match GMR and MacGram. Part of the reason for this reduced scope was the lack of progress on the dictionary.

I personally want to serve the broader original dictionary update function, and to build on work done by Colin Fine, pc, Chuck Barton, and others in correcting flaws in the dictionary, improving definitions, adding new words, etc. Jim has not granted me this, but I believe with your help, we can convince him that the dictionary effort requires this breadth, and that the team I'm trying to build can handle the effort on a timely basis. His sabbatical this summer should give us the time we need to so demonstrate. Of course, pending jcb's decision, this extra work I am doing is unofficial, and will take lower priority to the tasks he has specifically approved.

I am interested in the dictionary project since it is a long term effort. As such, my irregular time availability isn't as noticeable. I am also especially interested in the interface between Loglan and the 'real world'. As GPA approaches, I haven't heard a lot of discussion about how our esoteric research can be communicated to the lay public and to a generally disinterested or even antagonistic academic community. Faced with my own difficulties in learning Loglan, I'm sympathetic with those to whom "affix", "morphology", "predicate", "intervocalic glottal stops", "Eaton", and "penultimate" are a bunch of meaningless jargon (not to mention our peculiar algebra of "CCV"s and "CVC"s, etc.). I'm afraid the technical aspects of our effort will thus tend to turn off those without linguistic background.

The dictionary will be a major resource to the new Loglanist. My contributions will hopefully help render it more useful and intelligible to the lay reader. The dictionary also happens to be an area where my weakness as a linguist and as a linguisticist(?) will less impede my contribution, while my particular variety of skills will be especially useful. Since we have several Loglanists skilled in translation, and thus familiar with lexicography, my reliance on them should prevent any problems that might result from my technical weakness in the subject. On the other hand, I bring skills from my professional life that are necessary to this effort; specifically: organizational skill, an ability to coordinate very complex tasks, and a reputation for enthusiasm that encourages people to be active because they know that progress is being made.

In volunteering, I have offered my computer as a repository for the new Loglan, and my time in an attempt to make it intelligible to the non-Loglanist. I am going to use some basic computer techniques to make the job easier. I am NOT, in general, planning on doing a lot of programming. The weaknesses of the last dictionary are principally those aspects that are most difficult to devise general algorithms for. I spent several hours trying to write a parser that could interpret 100% of the computer files used to generate L4/5, and turn them into something easily manipulable by word-rebuilding algorithms, human text massagers, and formatters. My results show that it would take too much special case processing (and programming time) to have such generalized programs at this time.

Instead, I am automating only a few of the steps, including the final formatting, and planning on a lot of hand massaging to fill in the gaps.


x.2 Net from RJL, Computer Resources

jcb has occasionally alluded to my computer problems, which luckily no longer exist. I have a Zenith Z-151 PC-clone with a good supply of software, including Microsoft WORD, DBASE III, and Turbo-Pascal. I also have a C-compiler (CI's C86 - which I've never used) and the Zenith equivalent of BASICA. I have an installed 22 Meg hard disk, of which the partially reformatted dictionary already uses over 6 Meg.

I have a program called MEDIA-MASTER (by Intersecting Concepts) that allows me to read most 5-1/4" 40-track disk formats (especially CP/M and MS-DOS varieties). It is a cheaper version of the type that was mentioned in Lognet last year. It costs $40 for the PC version and handles some 72 disk formats. I can probably read most CP/M or PC-related diskettes, and a few others. It does not handle the Apple format, but we have Apples at work (and also Burroughs B-20 Series workstations). We also have a Z-90 with an 8" high-density drive at work that is pretty much a clone of JCB's system. I also have a modem (never used) to handle data exchange that can't be done by diskette. I am not a member of any net, however, and have no immediate plans to change that. Therefore, diskette is the preferred means of data transfer unless you can afford long distance bills. If you send diskettes to me, just let me know what machine/format you used.


x.3 Net from RJL, Ideas on The Dictionary Reformatting/Update Task

JCB spent many hours generating a concept for the new dictionary format. As a result, he generated "A Manual for Dictionary Re-formatting" for our use. The printed dictionary formats that he would like to see have been published in the March Bulletin (#5). I believe this format, or any similar format, would require too many complex algorithms to be processed easily by computer, especially since the dictionary text has embedded commas.

An early part of the dictionary update task will have to be the conversion of the bulk of the dictionary to a format that is as general as possible, and yet human readable to allow for the extensive manual work that must be done. That portion which cannot be converted automatically must be performed by hand (or word-processor). Contributors can assist by generating all dictionary inputs in either of 2 formats:

  1. the "dictionary style" format with all the commas given in LN47.5.
  2. my new Universal processing Format described below (x.5).

The latter is preferred, since I will be running the former format through a parser to create the latter format. Remember, comma-delimiting means that da can't use commas in da's definition, and that such definitions cannot contain the complete data that is included in the Universal format. The dictionary workers must work with both formats because I don't want to cause a lot of work for those who have been creating entries with all the commas.

My view of the dictionary task involves taking the reformatted dictionary and massaging it in several ways:

  1. Substituting new primitives and their derivations into the old entries.
  2. Checking all the old metaphors for consistency with the current word-building philosophy.
  3. Building new complexes using affixes and the word-building algorithm.
  4. Incorporating new metaphors generated by Kieran Carroll's word-building team.
  5. Standardizing English translations and formats for maximum clarity.
  6. Adding complexes for metaphors suggested by the clarifying and standardization process.
  7. Adding selected acronyms, phrases, names, etc. that exemplify the use of those aspects of Loglan.
  8. Adding new grammar words and constructs that have changed since the last dictionary.
  9. Adding any approved borrowed words.
  10. Generating appendices of various sorts that aid the dictionary user.
  11. Generating and massaging a format suitable for typesetting and publishing, in both Loglan/English and English/Loglan directions.

JCB has assigned me some portions of the reformatting, and I am hoping to demonstrate my capability to coordinate the rest, as well. If that sounds like a lot of work, it will be; therefore, the dictionary effort can use volunteers. I am especially interested in getting some of the members who have worked with Kieran's team to perform double-duty in smoothing the interface between our two efforts. The effort also needs someone comfortable with Loglan grammar (especially the changes), so that it can be properly documented in both dictionary entries and text. In the months since JCB appointed me, I have not been contacted by anyone seeking to contribute, which in turn diminished my enthusiasm for quite a while.


x.4 Net from RJL, The Plan

With the complex task described above, I have analyzed the dictionary effort and tried to order it in such a way as to maximize the use of everyone's talents. I expect the dictionary to become the critical obstacle to GPA, since it is one essential that an outsider can't do without, and it will be the most time consuming of the efforts. JCB has to coordinate many other tasks, so I want to try to minimize the requirement for his time in the dictionary work.

I've submitted this plan to jcb, but feel that it should be subject to all of your comments, as well, because you will be the workers who will make it possible.

  1. Obtain the raw files used in the old dictionary. (done 6/85)
  2. Convert the raw files by computer program to the Universal format. This format must be compatible with the word-builder's format, and must be sufficient to allow easy human manipulation, sorting, and automated formatting into both Loglan-English and English-Loglan dictionary entries. (done 7/86)
  3. Massage these files manually to eliminate the special case problems that I couldn't easily program around.
  4. Sort the old words by word type. (done 7/86) Divide lists among volunteers who will make additions, deletions and corrections that are word-type dependent. In addition to what is listed below, changes are necessary to incorporate the material that has appeared over the years in TL, Lognet, and the notebooks. pc, Chuck Barton, Colin Fine, and possibly Bob McIvor should have collections of data in addition to that which jcb has been accumulating.
    1. Names, letters, chemical words, etc. - redo to match the new standard formats. Generate standardized definitions. Add fields found in the Universal format but not in the raw data. Some may be remade to utilize the modified alphabet.
    2. Primitives - automatically substitute remade primitives (done 11/85), and their etymologies (done 5/86). Add affix data, complete Eaton references, and other fields found in the Universal format but not in the raw data.
      1. Analyze English definitions. Make them semantically neutral.
      2. Where necessary for semantic neutrality, determine the multiple English connotations and derive metaphors for them based on the primitive. (See x.6). Coordinate these with Kieran's word-making team.
      3. Generate additional metaphors common in English phraseology, but not necessarily in Eaton. In some cases, these will become classes of borrowed words (See x.7). Coordinate these with Kieran's team.
      4. Analyze the argument structure. Determine if changes are required. Add English translations to support all argument reorderings (See x.8).
    3. Complexes
      1. Determine if the metaphor used is still optimal, given the current word-building philosophy.
      2. Generate new metaphors as applicable, coordinating with Kieran's team.
      3. Build new complexes according to the word-building algorithm. (Algorithm written 6/86).
      4. Verify and split out multiple English meanings. Treat these as b.2 and b.3 above.
      5. Determine an argument structure deducible from the primitive components. Process arguments as in b.4 above.
      6. Generate standardized English definitions. Semantic neutrality is less necessary than for primitives, but is a goal, especially for those complexes included from Eaton. As such, additional metaphors may be needed as in b.2 above, which will in turn be used to generate 3 and more place complexes.
      7. (7/86) Partially as a test of the complex building algorithm, I am now proposing that we use a computer to generate primitive pairs (and perhaps some triplets) using some or all primitives. The resulting metaphors may or may not mean anything; human reviewers must so determine. This process will thus model the way a Loglan listener will determine the meaning of an unknown metaphor, and thus make them more learnable (at least for English speakers, since that is the common denominator of Loglan workers - it seems to me that a Chinese speaker may be unlikely to recognize many English metaphors, but I'll leave that for the linguists to argue.) It is my belief that we can take care of most of Eaton by this method. Remaining Eaton words may show patterns that will suggest new primitives to complete the effort. In some cases multiple metaphors may yield the same word. Both could be used, especially if nuances in the metaphors cause connotation differences, or the kibbitzers could select just the best. Meanwhile the complex-making algorithm can be fine-tuned to give the tastiest words representing each metaphor.
    4. Grammar words should be treated as in a. However they will not be done until step 9. below.
  5. Enter all of the above changes into Universal format (manually, unfortunately - but if people are able to use word processors and/or data base programs to create either of the two acceptable formats, we can minimize this step.) By this time the parser to reformat the comma delimited format into Universal format must be written.
  6. Add words generated by Kieran's team. These must be subjected to the same procedure used in 4. above, so that this may be an iterative process.
  7. Add words generated by the Translation Project and other sources. JCB has been collecting these up to this point. Also add words that were submitted and published in TL and LN, or that others have collected over the years. Primitives and complexes should be treated as in 4. above. Borrowings may require special screening and processing (See x.7).
  8. Meanwhile, the language description currently found in the beginning of the dictionary must be updated. This should be written by one person to ensure consistency of style; someone who knows and has used both the old grammar and the current one. Others can then review it for clarity. Hopefully JCB has the rough text from the old dictionary in typist-readable size. I don't expect that the text is on any computer. Putting it on computer may require Institute resources; that's a lot of typing.

    I believe that this update is more important than many of our other activities, primarily because it is the shortest description of the language. Once it is updated, we could conceivably GPA with just a dictionary. The completed update should be distributed to all active Loglanists as a short summary of the current language. In the new summary should be the MacGram Normal Form description of the grammar. (Kieran Carroll has suggested that a longer term goal should be to create this language description in Loglan. Then, any Loglan-XXX dictionary would have both the Loglan description of the language, and its translation in language XXX. This is probably too much effort for this next edition.)

    At this time, the language must stop changing until significantly after GPA. In computerese, this is called a baseline. When a baseline occurs, any later changes that are required are saved up, and then distributed after a period of time with full documentation to all users. If no baseline occurs, GPA will only put the language where it was in the early 1980's, when no one had complete current information about the language except jcb, and the active portion of the community started declining rapidly. On the other hand, the L1/L4/L5 publication established a baseline which effectively stood until 1979-80. I believe that most people felt that those books reflected the 'official' language, and that no change (such as those in TL, which were considered only proposals) became 'official' until the supplement was published. The new dictionary and other publications must recognize their 'officialness', and procedures to periodically update them must be put in place. But in the meantime, proposals approved by the Academy should be accumulated for the next 'release' (another computer systems term).

  9. Add grammar words and constructs derived from the new language description. Definitions should be consistent with the wording used in the description from 8, since both will be in the same book.
  10. Add any desired examples of acronyms, names, borrowed words, or other special words in Universal format.
  11. Hopefully, at this point, Kieran's wordmaking team will have completed its task. Otherwise the above should continue iteratively until we can baseline the dictionary word set.
  12. An ongoing parallel task will be to derive a set of appendices to the dictionary. Some of these will stem from the language description, including the very useful lists of little words found on the front and back cover. Others will include:
    1. A description of how c-Prims are built, algorithmically, to serve as a reference in case new ones must be derived. Preferred reference standards for each source language should be given.
    2. A list of affixes.
    3. A description of the word building algorithm.
    4. A statement of the process for formal approval of new words.
    5. Perhaps an Eaton-ordered list of concepts and the associated Loglan words.
    6. Useful lists such as colors, numbers, chemical names, plants, animals, acronyms, musical instruments etc., and algorithms for borrowing or deriving new words in each class that are not in the lists.
  13. Merge and sort the baseline word set.
  14. Generate a trial text. This should be partially formatted (single column, both translation directions). Two teams will be needed for review. One team will be primarily concerned with proofreading and ensuring that the Loglan is correct. A second team will hopefully be composed of linguists who have done a lot of translation work (not necessarily in Loglan) and are thus familiar with bilingual dictionaries. English and grammar experts can also participate. The latter team will be examining the text from the lay public and user point of view. Their review is critical to GPA, and hopefully the reviewers will include some people who can take a fresh look at the language.
  15. By this time, we should have determined the publisher. The final printed format should be coordinated with printing professionals for the best 'look'. We also need to generate typesetting commands as part of the output. Since I have Microsoft WORD, which supports most anything in the way of fancy typesetting, this should not be too difficult.
  16. Incorporate any last minute changes and corrections that have taken place as a result of the reviews, as well as any changes since the language was baselined that are absolutely essential.
  17. Publish and GPA.
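
The computer generation of primitive pairs proposed in step 4.3.7 of the plan is a small enumeration problem. The sketch below is only an illustration of the idea: the function name is invented, the three strings stand in for the full primitive list, and each ordered pair is treated as a separate candidate on the assumption that the order of the terms in a metaphor matters. Human reviewers would still have to decide which pairs mean anything.

```python
from itertools import permutations


def candidate_metaphors(primitives):
    """Yield every ordered pair of distinct primitives."""
    for first, second in permutations(primitives, 2):
        yield (first, second)


# Placeholder input; a real run would read the whole primitive list.
primitives = ["sadji", "turka", "mrenu"]

for pair in candidate_metaphors(primitives):
    print(" + ".join(pair))
```

The number of ordered pairs grows as n(n-1), so a list of several hundred primitives yields a review list on the order of a hundred thousand entries or more, which is one reason to restrict the run to "some" primitives at first.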

The following areas need immediate volunteers to do this plan (7/86):

  1. metaphor builders/complex reviewers
  2. someone to review old TLs, Lognets, and Notebooks for previously suggested changes to the dictionary, and new words. These must be written in Universal format.
  3. appendix generators
  4. someone to rewrite the language summary for the dictionary
  5. lots of text reviewers, kibitzers, and hopefully people who can word process.


x.5 Net from RJL, Universal Format

The ideal format for word submissions is a multi-line format with the word and a line or record-type repeated on each line (to make sorting easy). Each field on each line is separated by a backslash (\). For each word, include only the line types that are appropriate, but include all fields. If computer generated, no line should be longer than 256 characters, and the character positions should be followed.

Line Type 1 - Main Word Line (required for each word)

Position
1:16      Loglan word, left-justified, space filled
17        line type '1,'
19        Word-type :    'v-LW  '
                         'vv-LW '
                         'cv-LW '
                         'cvv-LW'
                         'Cpd'
                         '2-Cpx ', '3-Cpx ', '4-Cpx ', etc.
                         'Phrase'
                         'Name  '
                         'Acron '
                         'I-prim', 'C-prim', 'N-prim', 'S-prim'
                         'Borrow'
          \
          Eaton frequency (n.n) in thousands and tenths
          \
          Creation year with apostrophe (e.g. '75 )
          \
          Eaton Number (page and sequence number on page)
          \
          Author's initials, if applicable
          \
          percent score ('nn%')
          \
          Eaton page and line number, as per comma-delimited format
          \
          affixes, if any, in an order (and with delimiters?) to be
          specified by the complex algorithm programmer.

Line Type 2 - Complex Etymology Lines - required for each complex.

Position
1:16      Loglan word
17        line type '2,'
19        Loglan primitives comprising metaphor, in order, separated by \
          \
          flags as defined in complex making algorithm

Line Type 3 - Derivation line(s) - required for those primitives with language derivations. More than 1 line of this type is permitted if necessary to fit all derivation data.

Position
1:16      Loglan word
17        line type '3,'
19        Derivation text

Line Type 4 - English Definition line - required for all words. It is preferable to put only one definition/part of speech on a line if more than one exists. The raw dictionary parser cannot do so and multiple definitions from the old dictionary will have to be manually separated.

Position
1:16      Loglan word
17        line type '4,'
19        Loglan Part of speech, in parentheses (e.g. '(3n)',
          '(2a)', '(na)')
          \
          English definition (preferably under 80 characters)

Line Type 5 - English expansion line - at least one required for all words. These are the raw lines for use in building the English-Loglan side of the dictionary. The Eaton English key-word will be used from the comma-delimited format.

Position
1:16      Loglan word
17        line type '5,'
19        English word or phrase
          \
          English Part of speech
          \
          English word definition/explanation
          \
          Loglan translation, including little words
          \
          Loglan word type. If blank, assumed to be that in word 1.

Line Type 6 - Derived Complexes - One line for every derived complex using this Loglan word. If a complex of 3 or more parts uses metaphors based on a shorter complex, the latter should have this line type, as well as all of the component primitives. (e.g. rojmadsesmao would have type 6 lines for rodja, madzo, and sensi; however, it also might be listed under sesmao (scientist), rojmao (farmer), and rojmadsensi (agronomy)). This will aid workers building definition-refining metaphors by showing them what already has been built. Metaphor makers who can generate Universal format would assist greatly if they generate these for the components of the metaphors. The comma-delimited format does not support this. Only 1 metaphor per line.

Position
1:16      Loglan word
17        line type '6,'
19        old complex
          \
          each Loglan word in the metaphor, separated by \
          \
          flags as per complex-making algorithm
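
The fixed-position layout described above can be illustrated with a short sketch. This is only a minimal Python illustration, not part of any existing dictionary software; the function names format_line and derive_type6 are hypothetical. It assumes, following the position tables, that the word fills columns 1 to 16, the line-type digit sits in column 17, a comma in column 18, and backslash-separated fields begin in column 19. The empty field placed before the flags is an assumption inferred from the sample lines in this section. The second function derives Type 6 lines only for the component primitives of a complex; listing a long complex under its shorter component complexes would still have to be done by hand.

```python
SEP = "\\"          # field separator used by the Universal format
WORD_WIDTH = 16     # columns 1:16 hold the Loglan word
MAX_LEN = 256       # stated maximum length for computer-generated lines


def format_line(word, line_type, fields):
    """Build one Universal-format line from a word, a line type, and fields."""
    if len(word) > WORD_WIDTH:
        raise ValueError(f"word longer than {WORD_WIDTH} characters: {word!r}")
    if line_type not in range(1, 7):
        raise ValueError(f"unknown line type: {line_type}")
    # Word left-justified and space filled, then type digit, comma, fields.
    line = word.ljust(WORD_WIDTH) + str(line_type) + "," + SEP.join(fields)
    if len(line) > MAX_LEN:
        raise ValueError(f"line exceeds {MAX_LEN} characters")
    return line


def derive_type6(complex_word, primitives, flags=""):
    """Return one Type 6 line per component primitive of a complex.

    Fields: the derived complex, each word of the metaphor, an empty
    field (assumed from the sample data), then the flags.
    """
    fields = [complex_word, *primitives, "", flags]
    return [format_line(p, 6, fields) for p in primitives]
```

Calling derive_type6("cupli", ["cupri", "clika"]) would give one Type 6 line filed under cupri and one under clika, each naming cupli as the derived complex.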

A sample from the dictionary raw file and the resulting Universal format follow. Note that not all required lines are present, and that minor parsing errors may exist. I haven't corrected these so that volunteers can see what typically has to be done.

Raw data:

   -                                  -
cupli  2a is more coppery/copper-like than                cupriclika047
 a  coppery - more coppery than /copper-like/     cupli        2 7 cupri clika
   -                                  -
cupri  1n is a piece/particle/atom of copper; comb. forms: cup/cur.       s306
 a  copper - made of copper             cupri        1 6
 n  copper - a piece/atom of element 29      cupri /cu/ 1 6
 n  copper - the mass of all such atoms      lo cupri /cu/ 1 6
1 cupli = cupri clika
   -                                  -
cuprium, la  na copper /cu/, the 29th element; also la cup/lo cupri, q.v.    300
na  copper - its long international name:    la cuprium          0
   -                                  -
curdi  4v x insures y against hazard w for fee h. 1n insurer        44&5--3---106
curdi  -b 3/3e sure 2/4c i ding 2/6s asegur-ar                         1f
vt  insure - insure...against...for fee...   curdi        4 6
 n  insurer - one who -s           curdi        3 6
 n  insurance - a spec. act of -ing          po curdi     3 6
 n  insurance - mass term of acts    lo po curdi     3 6
 n  insured - an -ed person, conv. of insure nu curdi     3 6
 n  insurable - can be -ed              nu curdi     3 6
 n  risk - that which is insured against, qv fu curdi     3 6
   -                                  -
curdu  2v x waters y/pours/sprinkles water on y.          cutridurzo247
vt  water - put water on... /water-do/       curdu        2 7 cutri durzo

generates the Universal format:

cupli           1,2-Cpx\7.4\'75\  %\
cupli           2,cupri\clika\\
cupli           4,(2a)\is more coppery/copper-like than
cupli           5,coppery\(a)\- more coppery than /copper-like/\cupli\2 7
cupri           1,S-prim\3.0\'75\  %\
cupri           4,(1n)\is a piece/particle/atom of copper; comb. forms: cup/cur.
cupri           5,copper\(a)\- made of copper\cupri\1 6
cupri           5,copper\(n)\- a piece/atom of element 29\cupri\/cu
cupri           5,copper\(n)\- the mass of all such atoms\lo cupri\/cu
cupri           6,cupli \cupri\clika\\
cuprium, la     1,Name\3.0\'75\  %\
cuprium, la     4,(la)\na copper /cu/, the 29th element; also la cup/lo cupri, q.v.
cuprium, la     5,copper\(na)\- its long international name:\la cuprium\0
curdi           1,C-prim\1.0\'75\44%\
curdi           3,-b 3/3e sure 2/4c i ding 2/6s asegur-ar
curdi           4,(4v)\x insures y against hazard w for fee h. 1n insurer
curdi           5,insure\(vt)\- insure...against...for fee...\curdi\4 6
curdi           5,insurer\(n)\- one who -s\curdi\3 6
curdi           5,insurance\(n)\- a spec. act of -ing\po curdi\3 6
curdi           5,insurance\(n)\- mass term of acts\lo po curdi\3 6
curdi           5,insured\(n)\- an -ed person, conv. of insure\nu curdi\3 6
curdi           5,insurable\(n)\- can be -ed\nu curdi\3 6
curdi           5,risk\(n)\- that which is insured against,\qv fu curdi\3 6
curdu           1,2-Cpx\2.4\'75\  %\
curdu           2,cutri\durzo\\
curdu           4,(2v)\x waters y/pours/sprinkles water on y.
curdu           5,water\(vt)\- put water on... /water-do/\curdu\2 7
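
To show how lines like the sample above might be read back by a program, here is a minimal Python sketch. The names parse_line and group_by_word are hypothetical, and the column positions are taken from the position tables earlier in this section. It makes no attempt to repair the parsing errors deliberately left in the sample.

```python
from collections import defaultdict

SEP = "\\"  # field separator used by the Universal format


def parse_line(line):
    """Split one Universal-format line into (word, line_type, fields)."""
    if len(line) < 18 or line[17] != ",":
        raise ValueError(f"not a Universal-format line: {line!r}")
    word = line[:16].rstrip()          # columns 1:16, space filled
    line_type = int(line[16])          # column 17 holds the line type
    fields = line[18:].split(SEP)      # fields start at column 19
    return word, line_type, fields


def group_by_word(lines):
    """Collect parsed lines into {word: {line_type: [fields, ...]}}."""
    entries = defaultdict(lambda: defaultdict(list))
    for line in lines:
        if not line.strip():
            continue                   # skip blank lines
        word, line_type, fields = parse_line(line)
        entries[word][line_type].append(fields)
    return entries
```

Because the word and line type are repeated on every line, the lines can be sorted or regrouped freely before parsing, which is the point of the format.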


x.6 Net from RJL, Metaphors and Semantic Neutrality

One of the first problems I want to deal with in reformatting definitions is that of semantic neutrality. Whenever I try to discuss Loglan with a prospective new recruit, I describe the language in terms of its unique features, especially its unambiguity. Almost always, though, upon turning to the dictionary, the recruit notices that the Loglan primitives and complexes represent concepts that are abstract, and often ambiguous. The most concrete words are usually complexes; the longer ones are most concrete. Most of the primitives' English definitions carry all the ambiguity and cultural stereotypes of 20th century American usage. To the lay recruit, this almost always seems to cancel the theoretical attraction of the multi-lingual roots found in primitives, and the supposed cultural neutrality necessary to test Whorfian concepts.

Presuming that the Eaton tables justifiably present a set of culturally neutral concepts that must be represented by primitives and short complexes, it is necessary in Loglan to define these Eaton terms to preserve that cultural neutrality. Some of the most common Loglan primitives do not do so. Cluva (formerly clivu) represents the concept of love. But the English 'I love you' conveys only one meaning of the word. 'I love my work' is another. There are also meanings never adequately conveyed in English, like the Greek concepts of Eros and Agape. My limited linguistics library shows over 20 Hebrew and Greek words representing the concept of love as used in English in the Bible. This would render the most basic abstract discussion of the topic totally tied to whatever prior cultural associations the Loglanist speaker has for the word 'love'.

Similarly, the English word 'run' has dozens of denotations, of which prano, as defined in the current dictionary, could represent any of several. Even the arguments for various predicates, which can limit the scope of meaning, leave prano ambiguous. Colloquial English "I'm going to run out to the store" usually does not mean the same as "I'm going to run in the Boston Marathon" or "Will you run this upstairs". In Loglan, hopefully, each of these meanings will have a different predicate available. And the primitive has to be defined so as to be sufficiently abstract to cover all such meanings consistent with the Eaton concept, while avoiding (preferably) the connotative English jargon of, say, "run this computer program", which does not fit most interpretations of the abstract concept or the Loglan argument structure. Note that Eaton, and most concept frequency analyses, are probably limited in that they do not account for common semantic misuse. The counted frequency of the word 'run' covers all the definitions listed in the dictionary, while related words that are passed over (such as 'convey quickly', 'execute', or 'stocking defect') appear less common in such analyses even though the concepts may be more common.

Zipf has already struck English, causing ambiguity. It will probably affect Loglan implicitly in ways that cannot be planned for. We must therefore try to allow room for Zipfean effects, while not preserving those implicit in English already. Yet, attempts to decide Zipfean effects in advance will tend to put heavy English undertones in the language. Since most Loglanists are English-speakers, it is essential for Whorfian testing to preserve the unique concepts in the language against this cultural bias. For these reasons, I tend to be opposed to changes which are added only to make Loglan seem more like 'natural languages'. Let those who use the language determine that there is a need for such changes before making them.

Incidentally, I asked JCB about the abstract concept expressed by cluva, in this context. He defined it in the abstract sense of "strong emotional attachment". Since this definition doesn't even convey a sense of 'positive' feeling, I would be reluctant to use cluva to express my feelings toward a lady from a different cultural background. In a related example, an earlier Lognet referred to two different metaphors for 'politics' which reflected two different philosophies, and in fact two different English denotations of the word.

Thus, we need a culturally neutral definition of all primitives, and preferably of those Eaton concepts expressed as complexes. And we need a set of derived complexes, hopefully including the primitive in the metaphor, to express the various denotations of the English words representing the concept in a Loglan-English dictionary. Presumably another, possibly distinct set of complexes would be needed to write a Loglan-Chinese or Loglan-Hebrew dictionary. But that is luckily not my problem.

A lot of careful thought and wording needs to go into each definition. Each definition should be reviewed for semantic neutrality by linguists familiar enough with one or more other languages that they can catch any English cultural bias in the concept definitions. And a lot of looking at Webster's, as well as perhaps a few bi-lingual dictionaries, will be necessary to create 'families' (another ambiguous word - I should use the mathematically precise 'sets') of related predicates. These should be reviewed by Kieran's team.

One can sense that those working on the dictionary definitions will end up creating as many words for the dictionary as Kieran's team, or more. And without this effort, the 'interface' between Loglan and English will remain incomplete.


x.7 Net from RJL, Borrowings

As I described above, my limited background involvement in Loglan is tied to discussions in San Diego with JCB. One of our most frequent discussion topics, and one that perhaps affected many of the early GMR ideas, was the subject of adding words to the language to discuss complex ideas of narrowly defined meaning. One of my few attempts at Loglan translation was to try to convert the theme song from 'Man of La Mancha', a very emotional poem, to what seemed a dry form of Loglan. The song was full of its own set of metaphors, and I was curious whether they would come across in translation.

The project died in its infancy, partly because my knowledge of the grammar was too weak to manipulate the complex structures of the poem. But I stopped trying when I realized that more than half the words in the poem had no Loglan equivalent. I could not even develop good metaphors that conveyed the original metaphors. Try to come up with a metaphor for 'trumpet' that denotes a musical instrument, and connotes the spirit one thinks of when one hears 'the sound of the trumpets of glory'. 'Loud-cone-shaped-musical-instrument', a 5-part metaphor, got me to a dry definition of the term that did not exclude trombones, tubas, or sousaphones. But it just didn't make it. There is really no way to invent a satisfactory metaphor for such concrete terms. Try 'I hurl down my gauntlet to thee' for an easier example; I solved that one with existent Loglan words to create a usable, if weak, metaphorical 'I (x) throw to the ground (y) my safe-make(protect)-hand-shoes(gloves)(z)'. (I still won't attempt the grammar). The concept of challenge is completely lost, without the cultural association.

My point is that we won't, at this point, succeed in solving such problems. Loglan does not yet have its own culture to translate such metaphors into. *Iglu (no longer a legal borrowing) will always be something more than bisli hasfa (ice-house) to an Eskimo, if not to a Japanese or Italian. And the primarily English/American Loglanists developing the language owe it to our Loglandian descendants not to impose on them the cultural bias of including or excluding a separate word for igloo.

In short, I speak in opposition to making iglu-type words automatically acceptable (at the coiner's discretion) as Loglan words. In fact, the Loglandization of borrowings should be avoided until significant usage proves the need. If we don't, we have to examine every other tribal language for their specific words for their house-structures, or we present a cultural bias toward Inuit that only an American could understand. The use of simba for lion is similar. Most Loglanists have never been exposed to Swahili. It is comparatively well-known in America because of the heritage of our black culture. So I oppose arbitrarily incorporating any n-Prim or 'borrowed word' into the Loglan wordspace in a formal sense without careful consideration of the cultural decisions being made.

This cannot be done by any individual, and perhaps shouldn't be done until there is a substantial body of speaking Loglanists, and thus a 'Loglandian culture' that can adopt such borrowings permanently. At that point, the Loglan Academy, as the French Academy, can choose which borrowings should be adopted based on their usage in cultural discussion.

Having said that, I know it won't work, for the same reason that French, as now spoken, goes beyond the 'official' language, due to the borrowings necessary to discuss modern concepts.

This brings me to the related problem of jargon. Loglan should avoid this vice of English like the plague. In the interests of concise speech, Americans probably coin more words per day than Kieran's group will do in 3 months. Most of these will be accepted as, and be indistinguishable from, normal English. And they will cause incipient confusion among unknowing listeners (part of the intent, I'm sure). Loglan has had phases of the trend towards jargon in its evolution. Back in TL1 and TL2, there was an effort to coin primitives for various computer terms. God spare us from bit-byte-nibble-core-dump-and-patch (all of which are English metaphors that may or may not come across in any other language). Then there were color words. And I remember soksu, the carefully derived (from all 8 base languages) word for oak. As if there is only one kind of named tree family worth having a primitive for (not to mention that there are dozens of distinct species of oak). I note that the 1 mod 3 format for borrowings, which JCB originally proposed in response to my jargon argument, has died, and borrowings are morphologically distinguishable from primitives and complexes only by recognizing that they are not of the latter form, and are not one of the 'known' c-Primitives. This is equivalent to asking an English speaker to identify words as jargon and/or borrowings from other languages simply by the fact that one doesn't know the meaning. When I was younger, 'compiler', and 'deja vu', representing jargon and borrowed words, were as Greek to me as 'quantitative', which is a Loglan primitive, 'australopithecine', a proposed borrowing, and 'morphology', much used among Loglanists but not yet in the dictionary.

I have not yet decided whether the approach to acronyms in TL6 sufficiently discourages them to render them less used in the jargon-filled military-industrial-government complex that I serve each day. I doubt it, and the easy use of acronyms in Loglan would make it prey to the same tendency as American English to abbreviate without pause. This particular problem is less worrisome to me, though, since Loglan has the option, at least, of writing acronyms as words. What if it were the only option? In the government documents I read, it would enhance understanding immensely if the extra effort to spell out an acronym led to fewer acronyms being used. Of course, the acronymic concept may be culturally basic to Loglan, since the free variables da, de, di, do, and du are acronymic in nature, but necessary parts of Loglan grammar. Incidentally, jargon has already hit Loglandia, since acronyms like GPA, GMR, jcb, MacGram, MacTeach, and pc are so common as to be manipulated as words, even by me.

Whatever the decision on the problems of borrowings, existing n-Primitives, 1 mod 3, jargon, and acronyms, as dictionary foreman, I'm stuck with the problem of determining which borrowings to put into the dictionary, and how to determine the preferred word forms for various concepts. I see no easy algorithm, so I will repropose some of the ideas I came up with 6 years ago, before GMR made it impossible to coherently discuss the problem.

I propose that four morphological/grammatical forms exist for words and concepts derived from sources other than primitives and metaphors. These forms are:

  1. True-borrowing
  2. Designator-form
  3. Pseudo-complex
  4. Approved borrowing

True borrowings are those words and/or phrases used in Loglan speech with no pretense or attempt to call them 'Loglan'. The grammar supports this currently with quotations li ... lu, and lae. (See TL4/2 4.6.4). These obviously would not belong in a Loglan dictionary (except for the grammatical lexeme words).

Designator form would be the first step in incorporating a borrowed predicate into Loglan. If one wishes to 'borrow' a word not currently in the language such as 'gauntlet' or 'trumpet', one could form a borrowing in designator format, and be spared the overhead prescribed in TL4 for true borrowings. (TL4 requires that one specify that the English word 'stingy' be associated with the Loglan katli, as well as enveloped in li#...#lu, and prefixed by lae.) Note that when introducing a new borrowing, say iglu, under the current corpus, one is not required to identify that an iglu has any relation to either a residence or architectural form. Designator form would be restricted only to, and defined as, predicates, as are the current borrowings. They would be grammatically and morphologically similar to names, except that a different Little Word (the designator) instead of la would be used to designate the word as a new or infrequent borrowing. Lae may be usable here - I leave it to the grammarians to determine if that would be unambiguous - or another Little Word, preferably similar to la and lae, must be assigned. Stress and the ending consonant-pause would be required to be the same as for names, and spelling would have to use Loglan phonetics. However, the user would not have to worry about consonant pairs or possible resolution as a complex because the designator-consonant-pause would render the word inherently resolvable. The listener would know the word was a borrowing. The user could apply the word grammatically as any predicate or argument (as presumably a name can be). The listener would be warned that the word is an infrequent borrowing and would therefore be prompted to ask its meaning if it is not already known.

Pseudo-complexes are a new concept that I have recently come up with, so their form may need resolvability analysis to determine what is acceptable. The concept is based on the idea of borrowing-complexes mentioned by JCB on page 37 of TL6/1. This form would be initially used for the large families of borrowings that are required to support normal, but non-jargon, conversation and metaphorical literary speech. Without it, the typical TL or Scientific American translation, as well as the poetry of 'Man of La Mancha', would be filled either with quoted borrowings, constructs similar to names that convey no meaning other than their grammatical nature, or coined borrowings of the arknidia form that carry no more meaning to the listener/reader than the designator form, but seem like 'real words'; i.e. jargon. The form of a pseudo-complex may still require a Little Word prefix like designator-form (I defer to the grammarians and morphologists). To be easily recognizable as opposed to merely resolvable, I would prefer a form such as 1 mod 3 with at least one CC pair, followed by a legal complex form that represents a primitive or metaphor which 'types' or 'classes' the meaning of the word. Thus, an acceptable word would be igluhaa (igloo-house, the common usage of igloo) or iglutektosensi (igloo-architecture, a less likely but plausible usage of the root). Iglumao (igloo-maker), the borrowing complex form suggested by JCB, would be ambiguous, indicating only that an iglu is a 'made thing'. But igluhasmao would be a preferred form. It is not clear to me whether meanings would be totally clear - the latter word could be interpreted as a borrowed form of 'house-builder' as well as its intended meaning. But then all complexes are based on metaphors that are subject to misinterpretation.

The pseudo-complex form would be true Loglan with a clear metaphorical emphasis while retaining its obvious background as a borrowing. Most borrowings that we put in the dictionary should be of this type. A list of animal borrowings would have a borrowed 1 mod 3 (preferably, but see below) root followed by -nia to indicate the meaning. Plants would have -herba appended since there is no affix assigned. (This might lead to such common suffixes being assigned an affix even though they cannot be justified for an affix based on Eaton concepts.) Trees would end with -tri, so that oak could be rendered as oksutri. For scientific-related terminology, however, I would prefer to stick to Greek roots and Latin taxonomic names unless the root can be legitimately claimed to be International in scientific usage (This can be determined by looking at a German, Russian, or Japanese translation of a scientific article and seeing whether the English root has been borrowed. 'Astronaut' would not be a basis for a borrowing - the Russian 'cosmonaut' form is more internationally recognizable and is equally based on a classical root. (It is accepted in English text, whereas 'astronaut' is not used in the Russian.)) Four (or possibly five, if German is counted) of the eight basic languages use a significant number of Greek and Latin roots. Thus we would lose simba, or perhaps merge it with the root leo to form simbleo- that would be metaphorized to form simbleonia. I would consider an argument that roots need not be 1 mod 3 if the pseudo-complex remains easily resolvable. But in practice, using 1 mod 3 roots is a preferable convention, allowing easy recognition by non-computer resolvers.

The last form, the approved borrowing, is reserved for those roots that through common Loglan usage become commonly accepted throughout Loglandia. In effect, these would be Loglandian I-Primitives. Thus iglu-, which initially would be required to be defined metaphorically with -haa, could become a stand-alone Loglan primitive if the usage and acceptability of this particular borrowing became universal. This status should be reserved for only the most frequently used borrowings, and should require adoption by the Loglan Academy before they could be added to a dictionary. I see no words except perhaps the N-Primitives and I-Primitives in the current dictionary and those that were remade as part of GMR that could have this rule applied. Of course, if forms for roots other than 1 mod 3 were permitted, or if roots could end in consonants (a question for the resolution algorithm experts), the new dotci could be written as dotcypiu.

Thus a typical borrowing might progress through all four forms as its usage and general acceptance became common. Yet it would always preserve its distinctive borrowed flavor, as well as its Zipfean emphasis on word length vs. usage frequency. It also eases my problem of determining what types and forms of borrowed words should be included in the dictionary, as well as which lists are appropriate in the Appendix.


x.8 Net from RJL, Arguments

No, I am not referring to what the last entry is likely to cause.

Rather, I am referring to the one most significantly undecipherable aspect of Loglan not remedied by GMR - the multiple arguments permitted to each primitive and/or complex. The best way to explain is by example: mutzavkao (the former vedzafka - evildoer) has the current dictionary definition "X acts wickedly in doing Y, by moral standard W". A clear definition, if one uses the dictionary. But the moral standard argument 'W' is an invention of the previous dictionary team. None of the three components of the metaphor used in the complex refers to a moral standard. If I were hearing the metaphor for the first time, I would make the guess that I would use one of the components' arguments, probably that of kakto, the last one. But W in kakto (and zavlo, for that matter) delineates a purpose, not a moral standard. In usage, I could get an erroneous (or at least confusing) idea of the intent of the metaphor if I had never seen it before. (I will note in passing that fu zavlo or fu kakto are acceptable, if limited, translations of English purpose, that are not in the dictionary. Shouldn't they be?)

The dictionary editor thus has two problems to solve:

  • What is the argument structure of a newly defined metaphor (hopefully algorithmically determined as easily as the word was built from the metaphor).
  • What passive argument exchanges are defined. (Shouldn't they all be, if possible, since we have no usage history?)

In addition, argument structures should be made parallel and simple for primitives to allow them to be memorized as easily as the word itself. Dorja (war) and kamda (fight) have similar 3-place argument structures. But consider cteki (tax) and lilfa (law): are they imposed by Y on W for purpose H (in monetary or barter units Q, or is it on goods Q)? That plausible meaning, which one might guess when learning the two primitives, is correct for neither. Compounding the non-parallel structure, each is permitted to be used without specifying all the arguments.

There is probably no easy cure for the argument structure of primitives, though we can try to specify sufficient arguments to cover most conceivable contexts. We will have to, of course, compare with Eaton's concepts. But the metaphor argument algorithm is a must, as is any consensus on passive exchanges and permissibly omitted argument forms to be included in the dictionary. Any ideas?

Note, possibly unrelated to the dictionary work:

I still have some uncertainty on the grammatical uses of multi-argument predicates. Specifically, I am wondering if it is permitted to use more than one such predicate in an utterance. I do not see the Little Words that allow me to refer to arguments of more than one predicate in a single utterance. Thus, if I wish to speak of an evil tax, using mutzavkao and cteki, I see no way to refer to the moral standard W of the first predicate, if the subject of my utterance is the latter predicate, and I am using its arguments throughout the discussion. Is this covered? I may have missed something in the TL7 corpus. I could always combine the two metaphorically, hopefully retaining all the arguments of each according to whatever the metaphorical argument algorithm allows. But this may lead to requiring undefined place-holding little words. I have an admittedly contrived example that combines 35 primitives of various argument structures into a single desired utterance - even using metaphors doesn't help much, and breaking up the utterance in the contrived context would be confusing, and would also cause difficulties in emphasis. I'm not including it here, but if anyone answers my grammatical question, I may make it available if I still can't figure out how to translate it.