Allow filtering by language

Issue #6 resolved
Waldir Pimenta created an issue

Via a blacklist (e.g. I can't understand non-latin scripts) and/or whitelist.

A whitelist could function more as a preference filter, i.e. show me items in these languages first, and items in other languages only after those run out; or show me the description/Wikipedia excerpt in one of my whitelisted languages, if there are several linked Wikipedia articles for a given item.

The reason I suggest that the whitelist not be an absolute cutoff filter is that I found myself surprisingly able to understand some of the articles even in languages I don't speak, just from the similarity of the sentence structure and some cognate words (I mean, of course, languages from the same roots, e.g. Latin-influenced languages).

Even better yet, of course, would be to allow automatic translation of the texts via some API :D (I believe translatewiki.net implements something of this sort)

Comments (16)

  1. Bohdan Melnychuk

    +1 would be very nice. E.g. I can guess gender in English and in Slavic languages but skipping French, Spanish, Chinese, Japanese and others takes time.

  2. Giso Broman

    Each of us has his or her own unique set of language skills, and filtering would be a great way to capture those abilities. Using a translation program would lead to a lot of errors, however, especially with regard to gender. He/she pronouns are very often translated incorrectly.

  3. Andre Engels

    Apart from a blacklist or whitelist, I'd also like a preference list. There's no language that I would blacklist, but especially with the merge game there is often a choice of language. Right now it seems the largest one is chosen. It might be nice to change that order. For example, I had a case where Ukrainian was shown, but I would have preferred Romanian.

  4. Waldir Pimenta reporter

    @andreengels yes, that was the original idea :) see the issue description:

    A whitelist could function more as a preference filter (...)

    Although I agree, the name "whitelist" is definitely misleading, since its literal meaning is different.

  5. Luc Béasse

    Let me try to put all this together. It seems what is asked for here is:

    • a language blacklist: don't show me items with descriptions in these languages
    • a language whitelist: only show me items with descriptions in these languages
    • a language priority list: show me the description among the available languages in this order of priority.

    Then the use case indicated in the initial request (show me these languages first, then these when the 1st batch runs out) can be implemented by saying that the selection behavior falls back to blacklist filtering whenever the whitelist filtering returns an empty result.
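One way to read the fallback rule above, as a minimal sketch: the function names, the `(item_id, language)` pair shape, and the language codes are assumptions made for illustration, not anything taken from the game's actual code.

```python
def select_items(items, whitelist, blacklist):
    """Return whitelisted items, falling back to blacklist-only filtering.

    `items` is assumed to be a list of (item_id, language_code) pairs.
    """
    allowed = set(whitelist)
    blocked = set(blacklist)

    # First pass: whitelist filtering (blacklisted languages never pass).
    preferred = [item for item in items
                 if item[1] in allowed and item[1] not in blocked]
    if preferred:
        return preferred

    # Whitelist result is empty: fall back to blacklist filtering only.
    return [item for item in items if item[1] not in blocked]


def pick_description(descriptions, priority):
    """Pick one description from a {language_code: text} dict by priority."""
    for lang in priority:
        if lang in descriptions:
            return lang, descriptions[lang]
    # No prioritized language is available: take whatever comes first.
    for lang, text in descriptions.items():
        return lang, text
    return None


items = [("Q1", "uk"), ("Q2", "ro"), ("Q3", "ja")]
print(select_items(items, whitelist=["ro"], blacklist=["ja"]))
print(select_items(items, whitelist=["fr"], blacklist=["ja"]))
print(pick_description({"uk": "опис", "ro": "descriere"}, ["ro", "uk"]))
```

In this reading the whitelist acts as the "first batch", and the blacklist stays in force even after the whitelist is exhausted.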

  6. Waldir Pimenta reporter

    Hm, I don't see why there needs to be a pure whitelist if a preference sorting is implemented (e.g. we could even specify our fluency with babel-like parameters).

    If that is done, then the languages at the topmost level would work as an implicit whitelist; when this level is exhausted, the languages in the next level come forward, and so forth. When all the fluency levels are exhausted, the game would start presenting items in the remaining languages, which would likely still work quite often because most people can get the general idea even for languages they don't know anything about (I think anyone who has played the "persons" game can attest that).

    Only at this point would the blacklist come into play, filtering out the languages that one is sure not to understand, for instance those written in foreign scripts (and again, it would effectively be equivalent to the babel 0 level).

    So in essence what we'd need is just a babel-like set of user preferences; how the game interprets these (whitelist, blacklist...) should be transparent to the user.

  7. Unknown Name

    Please, please do not create a complicated blacklist + whitelist + priority list system. That would be a user interface nightmare. Most people only know 1-5 languages, so just make it a simple whitelist.

  8. Giso Broman

    I agree that an overly complex solution would be an awful user experience, but a whitelist for many people would include many more languages than five. While it's true that most people only really know a few languages, there are potentially dozens of very closely related languages (e.g., Romance languages). No one is going to want to have to pore over a list of 100+ language checkbox options just to ensure that they can avoid looking at a couple of pages that they can't help identify. This is, after all, just a game…

  9. Waldir Pimenta reporter

    What Giso says. A babel-like language fluency list doesn't seem like an overly complex implementation to me. It would essentially be a simple associative array (pt=n, en=4, fr=2, ja=0, etc.), and the filtering of languages would simply favor the highest-rated languages by fluency and avoid the zero-rated languages.

    By contrast, a whitelist would require pretty much the same complexity (an array with keys only, no values), but would lack the "serendipity" effect (for lack of a better term) already mentioned by many in this thread, since anything not on the whitelist would implicitly be blacklisted.
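The fluency-array idea could be sketched roughly as follows. Everything here is an assumption for illustration: the function names, mapping the native level `n` to a score of 6, and ranking unlisted languages below every rated one (which is how comment 6 appears to treat "the remaining languages").

```python
NATIVE_SCORE = 6       # assumption: "n" (native) ranks above levels 1-5
UNLISTED_SCORE = 0.5   # assumption: unlisted languages come after rated ones


def fluency_score(level):
    """Convert a babel-like level ("n" or 0-5) to a sortable number."""
    if isinstance(level, str) and level.lower() == "n":
        return NATIVE_SCORE
    return int(level)


def order_languages(available, fluency):
    """Order available languages by fluency, dropping zero-rated ones."""
    scored = []
    for lang in available:
        if lang in fluency:
            score = fluency_score(fluency[lang])
            if score == 0:
                continue  # level 0 behaves like a blacklist entry
        else:
            score = UNLISTED_SCORE  # kept, so the "serendipity" effect survives
        scored.append((score, lang))
    # sorted() is stable, so ties keep their original relative order.
    return [lang for _, lang in sorted(scored, key=lambda pair: -pair[0])]


fluency = {"pt": "n", "en": 4, "fr": 2, "ja": 0}
print(order_languages(["ja", "fr", "es", "en", "pt"], fluency))
# prints ['pt', 'en', 'fr', 'es']
```

A plain whitelist would correspond to dropping the unlisted branch entirely, which is exactly where the serendipity effect would be lost.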

  10. Jesse Chandler

    I confess I've done it myself, but I think people should be careful about working with languages that they are not native in or nearly so. A key term like "transgender man" may be hard to detect, for example, or the fact that the item is incorrectly labelled as a human to begin with. I would suggest limiting the game to one language per instance via an argument in the URL. People who are fluent in more than one language and want to work with them all at once can open more tabs. As a programmer, if it were me I would really want to avoid writing all the code necessary to maintain whitelists or blacklists, and I don't think it's really necessary or helpful.
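The one-language-per-instance suggestion needs very little code. A minimal sketch, assuming a hypothetical `lang` query parameter and an example URL (neither is confirmed by the thread):

```python
from urllib.parse import parse_qs, urlparse


def language_from_url(url, default="en"):
    """Read a single language code from a hypothetical `lang` URL argument."""
    query = parse_qs(urlparse(url).query)
    values = query.get("lang")
    # parse_qs returns a list per key; use the first value if one was given.
    return values[0].lower() if values else default


print(language_from_url("https://example.org/game?mode=person&lang=PT"))
print(language_from_url("https://example.org/game?mode=person"))
```

Each browser tab would then carry its own `lang` value, with no per-user list to store or maintain.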
