Any idea if French language is (will be) supported?

Issue #333 new
Former user created an issue

Originally reported on Google Code with ID 333

Dear Sir,

My name is Michel Bastien and I work for Université du Québec à Montréal as a research
and development agent. We are developing a French language admission test using Moodle's
questions. Because we want to evaluate writing skills, which cannot be done with Moodle's
existing question types, we looked for third-party question types that could. I found your
"Correct writing question type", tried it and was very impressed by its written text
correcting power.

The problem is that the only supported languages I see are C and English. Here are my questions:
do you know whether a French language parser for the "Correct writing question type" has
been developed? Could the various English files be adapted to French? (Any hints on
which files I should look at to do this?)

Thank you,

Michel Bastien

Reported by bastien.michel.mtl on 2015-03-02 17:18:50

Comments (5)

  1. Oleg Sychev repo owner
    Hello, Michel.
    
    The developers of the CorrectWriting question type have no knowledge of the French language.
    There are some questions whose answers will determine whether French language support
    is possible and how much work it will require.
    
    1) The basic idea of CorrectWriting question analysis is that the order of the words in
    a sentence is strictly defined. If there are several ways to write a sentence by
    shifting words, the teacher should enter all possible variants. There are languages where
    word order matters quite strictly (like English) and languages where word order
    does not matter (like Russian, for example). How strict is French about the order
    of the words in a sentence?
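
To illustrate the "teacher enters all variants" idea, here is a minimal sketch that picks the teacher variant sharing the most in-order tokens with a response. It assumes whitespace tokenization and a longest-common-subsequence score; the names `lcs_length` and `closest_variant` are hypothetical, and this is not claimed to be CorrectWriting's actual analysis.

```python
from typing import List, Sequence, Tuple


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            # Extend the diagonal on a match, otherwise carry the best neighbour.
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def closest_variant(response: str, variants: List[str]) -> Tuple[str, int]:
    """Return the variant with the most in-order tokens in common, and that count."""
    tokens = response.split()  # assumption: plain whitespace tokenization
    scored = [(lcs_length(tokens, v.split()), v) for v in variants]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score
```

With variants "she often reads books" and "often she reads books", the response "often she read books" is matched to the second variant, because three of its tokens appear there in the same order.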
    
    2) The CorrectWriting question does not use natural language parsing (for now), but it
    does perform lexing (i.e. breaking the sentence into tokens: words, punctuation marks, etc.).
    Can the English language lexer be used for French (i.e. can French words contain characters
    that English words cannot?), or are there additional rules that should be taken into account?
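
The lexer question can be made concrete with a small sketch. It assumes, purely for illustration, that an English-only word rule restricts letters to ASCII a-z; the real rules live in simple_english.lex and may differ. The second pattern accepts any Unicode letter.

```python
import re

# Assumed English-only rule: a word is a run of ASCII letters;
# any other non-blank character becomes a token of its own.
ASCII_WORD = re.compile(r"[A-Za-z]+|\S")

# Unicode-aware rule: a word is a run of letters of any script
# (\w minus digits and underscore), so accented letters stay inside words.
UNICODE_WORD = re.compile(r"[^\W\d_]+|\S")


def tokenize(text, pattern):
    """Split text into tokens using the given compiled pattern."""
    return pattern.findall(text)


if __name__ == "__main__":
    print(tokenize("un élève", ASCII_WORD))
    print(tokenize("un élève", UNICODE_WORD))
```

Under the ASCII rule each accented letter is split off and breaks the word around it; under the Unicode rule "élève" stays a single token.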
    
    The code doing language-dependent lexing is in the Formal languages block (blocks/formal_langs).
    If you are a programmer and know regular expressions and the basics of lexer generators,
    you may want to look at blocks/formal_langs/langs_src/simple_english.lex to see the English
    lexing rules and think about how a French variant could be done.
    The CorrectWriting question type itself contains the language-independent code that analyzes
    token order.
    
    We are also currently working on a typo correction module based on the Damerau-Levenshtein
    edit distance, so information on how applicable it is to French will also be
    useful.
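
For reference when judging applicability to French, here is a minimal sketch of the restricted Damerau-Levenshtein variant (optimal string alignment), which counts insertions, deletions, substitutions and swaps of adjacent characters. The function name `osa_distance` and the NFC normalization step are assumptions of this sketch, not a description of the module under development.

```python
import unicodedata


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    # Normalize so that "e" + combining accent and precomposed "é" compare equal.
    a = unicodedata.normalize("NFC", a)
    b = unicodedata.normalize("NFC", b)
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent transposition
    return d[-1][-1]
```

On normalized text a wrong or missing accent counts as one substitution, so "eleve" is at distance 2 from "élève". Whether accent errors should weigh the same as other typos is a separate design question.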
    

    Reported by oasychev on 2015-03-08 14:54:29 - Labels added: Component-WritingCompetently

  2. Former user Account Deleted
    Hello Oleg,
    
    Thank you for taking the time to answer.
    
    First, some answers to your questions:
    
    1) French is relatively strict about word order, except for some adjective
    (relative to nouns) and adverb (relative to auxiliary and verb) positions.
    *But* that's not a problem as far as our own needs are concerned: we intend
    to use the CorrectWriting question for dictation purposes (students have to
    write down a text read aloud to them).
    2) I tried the CorrectWriting question using French. It works well overall
    except for accented characters: they're treated as individual tokens. I
    guess we need to add some rules for those characters (é, è, à, ù,
    etc.) to the English lexing rules. Also, I don't remember what happened
    when I submitted the French text, but the apostrophe (') has a different
    function in French. It should be part of the preceding string, never the
    following one. For example, "jusqu'à l'éviter" should be tokenized as
    
    jusqu'
    à
    l'
    éviter
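
The apostrophe rule described above can be sketched with one regular expression. This is an illustrative assumption, not the syntax used in simple_english.lex, and the name `tokenize_french` is hypothetical. It keeps a trailing apostrophe (straight or typographic) with the word before it.

```python
import re

# A word is a run of Unicode letters optionally followed by an apostrophe
# (straight ' or typographic U+2019); any other non-blank character is
# emitted as a token of its own.
FRENCH_TOKEN = re.compile(r"[^\W\d_]+['\u2019]?|\S")


def tokenize_french(text):
    """Split French text so that an apostrophe stays with the preceding word."""
    return FRENCH_TOKEN.findall(text)


if __name__ == "__main__":
    for token in tokenize_french("jusqu'à l'éviter"):
        print(token)
```

Run on "jusqu'à l'éviter", this yields the four tokens listed above. A rule this simple also splits "aujourd'hui" after the apostrophe and treats hyphens as separate tokens, so real dictation texts would need to be checked against it.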
    
    I will take a close look at the simple_english.lex file to see if there is
    something I can do (I'm not a great programmer, but know a bit of PHP). I
    will write back if I have particular questions.
    
    Thanks again.
    
    Sincerely,
    
    Michel Bastien
    
    

    Reported by bastien.michel.mtl on 2015-03-09 13:56:45

  3. Oleg Sychev repo owner
    Hello, Michel,
    
    It seems that writing a French lexer for the CorrectWriting question is possible, but
    we are quite busy integrating the typo detection code, so we can't do it right now.
    
    Also, if you are doing research, do you plan to publish its results? And if so,
    would you consider collaborating with us on it? Our current situation is that a
    publication indexed in the Scopus or Web of Science databases would seriously improve
    the project's standing in our university and give us more time and resources that we
    could spend on implementing French language support and other new features. However,
    a lack of native English speakers and almost no experience with foreign scientific
    publications make this a somewhat difficult goal to achieve on our own. Please let us
    know whether you would consider such joint work, and on what conditions.
    

    Reported by oasychev on 2015-03-18 01:30:52

  4. Former user Account Deleted
    Hello Oleg,
    
    Yes, writing a French lexer for CorrectWriting is not an issue. On our side,
    the issue is that CorrectWriting uses Moodle's shortanswer question type,
    which uses a one-line text input element. The text input size isn't suited
    to our purpose. We need to use a text area element. Is there a quick way to
    change the CorrectWriting code so that a text area is used instead of
    a text input?
    
    
    We do plan to publish when (and if) we get results with the CorrectWriting
    question. Of course, I'll keep you informed of this and we'll discuss the
    best way to acknowledge your work behind the CW question's design.
    
    Best,
    
    Michel
    
    

    Reported by bastien.michel.mtl on 2015-03-23 14:40:55

  5. Oleg Sychev repo owner
    A good start to writing a French lexer would be to list the necessary additional
    characters with their Unicode code points.
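
Such a list can be generated rather than typed by hand. The sketch below starts from an assumed set of French accented letters and ligatures, which should be checked against real test material; `FRENCH_EXTRA` and `describe` are hypothetical names.

```python
import unicodedata

# Assumed starting set of lowercase letters beyond a-z; extend or trim as needed.
FRENCH_EXTRA = unicodedata.normalize("NFC", "àâæçéèêëîïôœùûüÿ")


def describe(chars):
    """Return (character, 'U+XXXX', Unicode name) for each character."""
    return [(c, f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]


if __name__ == "__main__":
    # Print lowercase and uppercase forms with their code points.
    for char, code, name in describe(FRENCH_EXTRA + FRENCH_EXTRA.upper()):
        print(char, code, name)
```

The printed code points can then be turned into character ranges for whatever lexer syntax the French rules end up using.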
    
    I do not know how quickly you can change code. To change the input you should look at the
    renderer.php file and override the function print_formulation_and_controls (or something
    like that), copying it from shortanswer and changing how the input control is printed. There is
    no such function in correctwriting/renderer.php at the moment; it is inherited from
    shortanswer/renderer.php. Note that you should not change the element name ("answer" currently).
    If the textarea returns its result in a way similar to the one-line text input, it will probably
    work. If you need the HTML editor, things are more complex, though.
    
    Note that if you want your change to be accepted into the CorrectWriting code and available
    in all future versions without effort on your part, you should give the user a choice of which
    control to use (one-line text input or textarea) at the question level or at least at the admin
    settings level. Actually, we could discuss the change with Tim Hunt to make it part of
    the shortanswer question type instead.
    

    Reported by oasychev on 2015-03-24 16:39:21
