Store the suggestion lists inside wikipedia

Issue #89 open
Peeter Tinits created an issue

The suggestion lists seem a bit static. How do you add them?

We could store these suggestion lists on a Wikipedia page, so that anyone could edit them (I think the danger of vandalism is very small and manageable). It would also be easier to see what's in them. Downloading would take somewhat longer, though.

This wiki page would then be "List of lists used in MT for suggestions or smth".

This would also make it easy to add small explanations of what the lists are, as each entry could include a link, a human-readable name and a small description separately.

Eventually, if there are too many lists, the user could select which lists of lists they want to look at.

Comments (38)

  1. Andrjus Frantskjavitsius repo owner

    Have a look at http://mwtranslate2.keeleleek.ee/userslists !

    Most of what you said is already implemented.

    If someone wants to add his own custom list to the application, he/she can ask us to add the wiki page to userslists.json file and MT should automatically pick up all the articles listed on the page.

  2. Peeter Tinits reporter

    Thanks, good to know!

    If you say that the software first picks up the links from wikipedia pages anyway, then why not store it first as a list of lists, and then look for links in all those mentioned lists. This would be a slight innovation and would entail migrating the current json to a wikipedia page. But it would bring many benefits for usage I think.

    The point is exactly that if someone would like to add a list, they could just add it to that public wiki page (why would we want to be bothered by being asked about these lists?). This also allows a nice overview of what is there. The GUI is fast now, but it still isn't mere milliseconds to see all the subcategories of 10,000 articles. Indeed, it's more than a minute, for me at least.

  3. Peeter Tinits reporter
    • changed status to open

    I wouldn't make it invalid just yet. It would be in our interests to keep the content as decentralized as possible. This would be one way to do it, and I think an improvement on having a static json file.

  4. Andrjus Frantskjavitsius repo owner

    I understand what you are trying to say. I guess it could be a good way of advertising the application.

    Can you look at the userslists and try creating a Wikipedia page for it?

    There have to be three pieces of information: language code (prefix), full/page/Name and group name. And, of course, a systematic way of fetching that information.
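
    A minimal sketch of the three pieces of information listed above, modelled as one record per list. The key names ("lang", "page", "group") and the example values are assumptions for illustration; the actual schema of userslists.json is not shown in this thread.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SuggestionList:
    lang: str   # language code / wiki prefix, e.g. "et"
    page: str   # full page name on that wiki
    group: str  # group name under which the list is shown


def parse_lists(raw: str) -> list[SuggestionList]:
    """Parse a JSON array of objects into SuggestionList records.

    The key names are hypothetical, chosen only for this sketch.
    """
    return [
        SuggestionList(item["lang"], item["page"], item["group"])
        for item in json.loads(raw)
    ]


# Hypothetical example entry
example = '[{"lang": "et", "page": "Vikipeedia:Eesti_100", "group": "Local lists"}]'
lists = parse_lists(example)
```

    Whatever format the wiki page ends up using, it only has to be convertible into records like these in a systematic way.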

  5. Peeter Tinits reporter

    Yep, it could increase visibility.

    Ok, I can try to make something. But how are you retrieving them now? E.g. you have the page https://et.wikipedia.org/wiki/Vikipeedia:Eesti_100, yes? And then you just retrieve the links? Or is it stored in some other way too?

    In this case the wiki page would just include links to the pages. The language code could actually be retrieved from the link too, although it can be specified separately if needed, and groups could be just regular wiki headers (though some other arrangement could be found).

    I would also try to add a small description of the list (this would also make it easier to translate these between languages if needed). This short description could be displayed just above the current "Articles" box.
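
    As a rough illustration of retrieving the language code from the link itself, the sketch below splits a wiki URL into its host prefix and page name. The function name `split_wiki_link` is hypothetical, and it assumes links of the form https://<prefix>.<project>.org/wiki/<page>.

```python
from urllib.parse import unquote, urlparse


def split_wiki_link(url: str) -> tuple[str, str]:
    """Return (prefix, page_name) for a link of the form https://<prefix>.../wiki/<page>."""
    parts = urlparse(url)
    host = parts.hostname or ""
    # First host label: "et" for et.wikipedia.org, "meta" for meta.wikimedia.org
    prefix = host.split(".")[0]
    marker = "/wiki/"
    if not parts.path.startswith(marker):
        raise ValueError(f"not a /wiki/ link: {url}")
    # Undo percent-encoding so the page name matches what is shown on the wiki
    page_name = unquote(parts.path[len(marker):])
    return prefix, page_name


prefix, page_name = split_wiki_link("https://et.wikipedia.org/wiki/Vikipeedia:Eesti_100")
```

    For pages hosted on meta, the prefix comes out as "meta" rather than a language code, which is one case where specifying the language separately would still be needed.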

  6. Peeter Tinits reporter

    Ok, see here https://meta.wikimedia.org/wiki/User:Pusle8/List_of_suggestion_lists_for_Minority_Translate

    There are a few options to organize it. 1) Each link title is the title of the list. 2) Title of list is written just after each list.

    Heading can be the title of the group. Or this could be elsewhere.

    It would be good if an explanation could be given for a list. For each individual list, this could maybe be on its own page, but for a group of lists, it could be on the main page, just linked. Maybe retrieve the first paragraph just after the heading for the meta-level description?

    This organization also allows the lists to be translated easily. We can place the current issue into the english language space, and then anyone wanting to translate it, can just do it as articles are normally translated. And then we'll have it in another language too.
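
    One way the layout discussed in this comment could be read, sketched under assumptions: level-2 wiki headings are group names, the first plain paragraph after a heading is the group description, and each wiki link is one list whose label (if any) is its title. The sample wikitext is made up for illustration and is not the content of the page linked above.

```python
import re

# Level-2 headings only, e.g. "== Local lists =="
HEADING = re.compile(r"^==([^=].*?)==\s*$")
# [[Target]] or [[Target|Label]]
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def parse_list_of_lists(wikitext: str) -> list[dict]:
    """Group wiki links by the level-2 heading they appear under."""
    groups = []
    current = None
    for raw_line in wikitext.splitlines():
        line = raw_line.strip()
        heading = HEADING.match(line)
        if heading:
            current = {"group": heading.group(1).strip(), "description": "", "lists": []}
            groups.append(current)
            continue
        # Ignore text before the first heading, blank lines and deeper headings
        if current is None or not line or line.startswith("="):
            continue
        links = LINK.findall(line)
        if links:
            for target, label in links:
                current["lists"].append(
                    {"page": target.strip(), "title": (label or target).strip()}
                )
        elif not current["description"] and not current["lists"]:
            # First plain paragraph after the heading is taken as the description
            current["description"] = line
    return groups


# Hypothetical sample page
sample = """
== Local lists ==
Lists maintained by individual language communities.
* [[:et:Vikipeedia:Eesti_100|Estonia 100]]
"""
groups = parse_list_of_lists(sample)
```

    This covers both options mentioned above only partly: it handles the title as the link label, not a title written after the link.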

  7. Andrjus Frantskjavitsius repo owner

    This is generally good but I think the headings should be as short as possible. Otherwise the GUI will show: "List of articles every W|"

  8. Peeter Tinits reporter

    Can't you just increase the box size? It is very useful to be able to understand what the lists are on first view.

    On afterthought, I'm starting to think that an explanatory text box is not altogether necessary, provided that this wiki page has them, and it can be easily found via a link somewhere in the menus or help, or preferences.

  9. Andrjus Frantskjavitsius repo owner

    The box below is there to fit the whole link.

    I can increase the box size, but that would be bad design. Lots of empty space.

  10. Peeter Tinits reporter

    Unfortunately, I think the title should match what it is known by in wiki communities (e.g. see Kristian looking for it). I would want to consult Ivo & Kristian here.

    In my opinion the text box can be made much larger without great losses for design.

    You should be able to write "Finno-Ugric World Finno-Ugric World Finno-Ugric World" there easily, I think.

    If there is any success, lists may emerge that are even more difficult to explain, and they will really benefit from the extra space.

    I'd be keen on hearing other opinions though. I don't know how to call Ivo and Kristian on bitbucket however.

  11. Andrjus Frantskjavitsius repo owner

    I had a look at the GUI and some expansion (but not too much) will not hurt the design.

    To call someone, start writing @name and Bitbucket will offer name suggestions.

  12. Kristian K

    I wrote this earlier today in another Issue:

    I personally don't find long lists a problem. I see the problem more as one of usage patterns for different kinds of lists. Shorter lists are more like personal "to do" lists, while a long list represents some long-term goal that should be achieved, e.g. "add the articles on fauna to the X-language Wikipedia".

    The issue of overly long lists is relevant in the context of making the program usable in "language teams". The language team should have a group leader who chooses which articles should be translated (this could be linked with some kind of funding or grant), and the group as a whole translates them.

    Could the management of such a list be done via Wikipedia? E.g. for each translation project, a page is created where the article names are added (and later more articles can be appended or deleted).

    Could such a list be dynamically divided to each user? Or should it be up to the users to choose what they want to translate?

    In the workflow of shrinking the long list into a shorter, personal to-do list, it could be useful to have an "exploration view" of the list. In many contexts, "choosing photos" is done by showing an MxN table of smaller images and letting the user mark the ones they want. Similarly, the title and categories could be shown to the user, and s/he can make a selection. A manual filter.

  13. Kristian K

    I want to point out that lists should be seen as something language community specific. There is no such thing as "essential articles". That is essentially colonialism. Yes, the English Wikipedia is the biggest, but also Wikimedia is trying to fight this centrism. Their Mission statement tries to state this with the word "multilingual".

    The lists should be definable by the respective community. Wikipedia is obviously the best place for this. In a group-focused workflow, I think users of MT could add lists by pasting a URL to the wiki page with the list. This way the lists could reside both a) as public articles/categories and b) on some user's discussion pages (presumably this user would be the group leader of the translation task).

    This way also splitting of the "big list" into smaller sublists can be managed by the group leader and be handed to translators/writers by simply giving the URL.

  14. Peeter Tinits reporter

    @keeleleek - I guess any of these things could be done by changes to their. The way I understand it works now, is that it keeps the tree of the wiki page intact for the program - eg. https://meta.wikimedia.org/wiki/List_of_articles_every_Wikipedia_should_have/Expanded/People. Instead of "Painters" you could add "For Julie" etc. It's just up to the users to organize their suggestion list page as they want, and post the link to this page on the main list of lists page.

    Anything else practically needs to be added? Should the main list of lists page be restructured somehow?

  15. Peeter Tinits reporter

    Any idea how to best make the translation process streamlined with wikipedia? Special:Minority Translate suggestion lists page for each language? All languages in the meta section?

    Any computational requirements here?

  16. Peeter Tinits reporter

    Ok, it is unfortunate if there is no chance to translate these lists. Maybe far future then.

    Yes, the location was of course meant just for testing. It could maybe even be more public when it's done, and not in the userpage, but a project page or smth. @Kruusamagi knows best how to make it easiest to find.

  17. Peeter Tinits reporter

    It seems that it would need just:

    1) A dropdown menu in the preferences, à la interface languages, based on a possibly predefined list of available languages. The list can be stored on a wiki page somewhere or just hardcoded into a file, and then extended in bitbucket if translations emerge.

    2) Make the address of the wiki page that hosts the list of lists dependent on that dropdown. The dropdown data can include the correspondences, or it could be simply done with language code.

    Then these pages could just be located in different languages such as this. https://en.wikimedia.org/wiki/User:MinorityTranslate/Suggestion_Lists https://et.wikimedia.org/wiki/User:MinorityTranslate/Suggestion_Lists

    And that could be it.
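
    A small sketch of point 2, assuming the page address is derived from the language code alone. The URL pattern is copied from the two example addresses above; the list of available languages and the fallback to English are assumptions made for this illustration.

```python
# Page name taken from the example addresses in this comment
BASE_PAGE = "User:MinorityTranslate/Suggestion_Lists"

# Hypothetical set of languages that have a translated list-of-lists page
AVAILABLE_LANGUAGES = ("en", "et")


def suggestion_page_url(lang: str, fallback: str = "en") -> str:
    """Build the list-of-lists address for the language chosen in the dropdown."""
    code = lang if lang in AVAILABLE_LANGUAGES else fallback
    return f"https://{code}.wikimedia.org/wiki/{BASE_PAGE}"


url_for_estonian = suggestion_page_url("et")
url_for_missing = suggestion_page_url("fi")  # no translation yet, falls back to "en"
```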

  18. Peeter Tinits reporter

    But if it's difficult, we can leave it as an update based on need. That is, we leave instructions on how to translate the lists, and collect translations before implementing anything. Once we have maybe 4-5 translations and the need for them, we can implement them. Until then, English can suffice I guess.

    (Edited the previous post too.)

  19. Andrjus Frantskjavitsius repo owner

    Remember that each call to wikipedia takes time, which means I need to thread the call. A fallback is also needed. While this can be done, it gets messy.

    There is also the problem of needing to update multiple lists.

    Let's keep it simple for now.
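
    For reference, the threaded call with a fallback mentioned in this comment could look roughly like the sketch below. It is written in Python purely for illustration, not in MT's own code; `fetch_with_fallback` and the demo fetchers are hypothetical names, and the fallback is assumed to be a locally cached copy of the lists.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_with_fallback(fetch, fallback, timeout_s=5.0):
    """Run fetch() in a worker thread; return fallback if it is too slow or fails."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fetch)
    try:
        return future.result(timeout=timeout_s)
    except Exception:
        # Covers both a timeout and any error raised inside fetch()
        return fallback
    finally:
        # Do not block the caller on a slow request that is still running
        pool.shutdown(wait=False)


def slow_fetch():
    time.sleep(0.5)  # stands in for a slow call to the wiki
    return ["fresh list"]


def broken_fetch():
    raise ConnectionError("wiki unreachable")


cached = ["cached list"]
result_slow = fetch_with_fallback(slow_fetch, cached, timeout_s=0.05)
result_broken = fetch_with_fallback(broken_fetch, cached)
```

    In a GUI the result would more likely be delivered through a callback than by blocking on `result()`, which is part of what makes the real thing messier than this sketch.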

  20. Peeter Tinits reporter

    Ok, thanks for explaining. I wasn't thinking that lists embedded in lists could create computational difficulties.

    Updating won't be a problem though, as it will be in the interests of the translator (or their audience) to keep it updated, so updating just the ENG version would be enough. No big losses if translations stay behind in time either.

    Ok, let's keep it simple then. Having it in Wikipedia and community-editable is a very big step forward already! :)

  21. Kristian K

    I still think this all sounds very culturo-centric. It is good that there are such global and/or universal lists as the "list of articles every Wikipedia..." but I think we need something more local. Local doesn't necessarily mean language-dependent either, since it's obvious that a language community consists of independent people with their own interests. It should be a bottom-up structure, because we are dealing with grassroots anyway.

    Does the infrastructure you discuss contain the possibility that individuals make their own interest lists? I understand your structure as "one list per language code".

    What I envision is that a user copy-pastes the address of an article to a wiki page. This wiki page converts the address to the wikidata id of the article.

    Now this user-created list can be added to MT in much the same way as the lists.json works. This json only needs an interface for adding the URL of a list and perhaps simple categorisation tags (i.e. what is called "group" in the current json file). Other list editing features are available in the GUI widget, e.g. sorting and moving the entries. Last but not least, we make this json file accessible to the user via the filesystem.

    It is understandable that the only user who does this will be the so-called group leader of the translation task, but there is nothing in the structure that hinders other workflows. One possibility that comes to mind is a "translator" in the group who is interested in flowers and creates a list of flowers that grow in the region. She then communicates with the group in any of the following ways:
    a) sends the URL of her created list to her fellow group members or the group leader
    b) adds the URL in her MT and sends the json file to her fellow group members or the group leader
    c) saves the json in her local version storage and commits it to the members' shared repository
    d) sends a diff to her fellow group members
    e) depending on her knowledge, does something else

    Why am I so reluctant about global and universal lists? Because I believe the first article that is going to get written in a minority language Wikipedia is about the language. The second will be about its orthography. The third will be about some local culture-specific phenomenon. Even though these are articles not yet existing in the Wikipedia universe, the structure of the articles will be drawn from similar articles in other languages. It will take some time until a balance is reached between the local culture and the universal western Wikipedia culture. Until then, the categories will be narrowed down to what is considered local, as with the flowers example above.

  22. Andrjus Frantskjavitsius repo owner

    Let's not bring language codes and other complications into this. At least for now. MT is a translation tool, hence it relies on lists from other languages.

    Remember that the lists are grouped "10 000 articles" etc. This is how locality can be introduced.

    If the need arises, additional complexity can be added.

    Also, this goes to both of you: try keeping your replies manageable. Long replies have the opposite effect. If I see a wall of text, I usually try to extract the meaning by skimming through it. Why? Because there is another wall of text waiting for me around the corner!

  23. Peeter Tinits reporter

    @keeleleek, currently, you could add a local list just under the category "local lists" (or arrange it somehow else). It would be in the list of lists for everyone, but only the people interested would select it. In this case it doesn't really have to be in English to start either (each list referred to in the list of lists will be just a wiki-page from which the links are retrieved as wikidata id-s).

    Currently Estonia 100 is exactly this kind of local list at https://meta.wikimedia.org/wiki/User:MinorityTranslate/Suggestion_Lists.

    It would be great to have translation down the line, but it's functional enough for now without it too I guess.

  24. Peeter Tinits reporter

    What's the status? So currently, they are still stored as a static list? Are there plans to change? Just checking in, thanks.

  25. Kristian K

    I'm convinced the solution we find or should search for, should be compatible with ContentTranslate.

  26. Peeter Tinits reporter

    @andrjus was it difficult to make? - if you would just add the link to a wikipedia page with suggestion links, as the only link on the server, and make it go one level deeper, does it not solve the problem?

  27. Andrjus Frantskjavitsius repo owner

    The problem is the 1 + N HTTP calls. One for the list and the remaining for the wiki articles containing the lists. I don't want to make a lot of calls.

    While it is not hard to write, it's not trivial either. The code gets very messy and I'm trying to avoid that at this stage. If there is an easier solution I would go with that.
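
    One possible way to cut down the 1 + N calls, sketched under assumptions: the MediaWiki API accepts several page titles in one `action=query` request, separated by "|", so the links of all list pages on the same wiki can be requested together. The function name is hypothetical, the batch size of 50 reflects the usual per-request title limit for ordinary clients, and the sketch only builds the request URLs without sending them.

```python
from collections import defaultdict
from urllib.parse import urlencode


def batched_link_queries(pages, batch_size=50):
    """Build MediaWiki API URLs that request the links of many pages at once.

    pages is an iterable of (host, title) pairs. Titles are grouped per host,
    because one request can only address one wiki.
    """
    by_host = defaultdict(list)
    for host, title in pages:
        by_host[host].append(title)

    urls = []
    for host, titles in by_host.items():
        for start in range(0, len(titles), batch_size):
            params = {
                "action": "query",
                "prop": "links",
                "titles": "|".join(titles[start:start + batch_size]),
                "pllimit": "max",
                "format": "json",
            }
            urls.append(f"https://{host}/w/api.php?{urlencode(params)}")
    return urls


urls = batched_link_queries([
    ("meta.wikimedia.org", "List_of_articles_every_Wikipedia_should_have/Expanded/People"),
    ("et.wikipedia.org", "Vikipeedia:Eesti_100"),
])
```

    With this grouping the number of calls depends on the number of distinct wikis rather than the number of lists, although long lists may still need follow-up requests to page through all their links.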

  28. Peeter Tinits reporter

    Ok, how about periodically updating the server copy with the list at the link? This way, all that would be needed is a small script that updates the server file according to the Wikipedia list of lists that any user can edit. It could be done e.g. twice a day, or on command to begin with.

    I wasn't aware that making lots of calls is an issue. Ok, in this sense, I don't know how necessary it is at this point.
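
    A minimal sketch of such an update script, assuming the server copy is a JSON file and that some function (here the hypothetical `fetch_lists`) returns the current list of lists from the wiki page. The existing file is kept whenever the fetch fails or returns nothing, so the application always has a usable copy.

```python
import json
import os
import tempfile


def refresh_server_copy(fetch_lists, target_path):
    """Replace the server-side JSON file with freshly fetched lists.

    Returns True if the file was updated, False if the old copy was kept.
    """
    try:
        lists = fetch_lists()
    except Exception:
        return False  # wiki unreachable: keep the existing file
    if not lists:
        return False  # an empty result is treated as a failed fetch

    # Write to a temporary file first, then swap it in, so readers never
    # see a half-written file
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(lists, handle, ensure_ascii=False, indent=2)
    os.replace(tmp_path, target_path)
    return True
```

    Run from a scheduler twice a day, or by hand on command, this would keep the application itself at a single call to the server.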
