Collect usage data

Issue #82 on hold
Peeter Tinits created an issue

To improve the program, it would be good to know how people use it, what they use it for, etc.

It could even be included as an opt-out tick box if needed, and I think most won't have a problem with this. It shouldn't really take much space to store either (at least for now, in the beginning).

But generally, collecting all sorts of usage data on any upload would be a really great source of information on what's useful. I don't think we'll always have hands-on personal contact with the users for feedback.

Comments (9)

  1. Andrjus Frantskjavitsius repo owner

    Useful... yes. Useful enough to sink development time into... maybe.

    What would you collect?

    I will probably put this on hold later. Keeping it open for discussion.

  2. Peeter Tinits reporter

    Yea, ok to keep on hold. I would collect interface settings (which templates and features they used), etc. More thoroughly, one could monitor all the registered clicks on the interface (e.g. changing templates back and forth, to see how people work with the interface). It seems to me that collecting should be easier than analysing, and the analysis can be worried about later.

    But I agree that this can be skipped for now, though the issue could be kept open.

  3. Andrjus Frantskjavitsius repo owner

    While it's nice to see how people use the application, it's time-consuming to implement and there is no concrete use for it right now. Putting on hold.

  4. Peeter Tinits reporter

    Ok, I have to say that some data is already being collected: the operating system that people use. This doesn't really have much to do with a corpus and could be included in a separate dataset stored alongside the corpus. This would justify having two tick boxes in the menu (one for user metadata and one for usage data); later, more data collection features can be added.

    This is just to tie it into the preferences restructuring. But usage collection can be kept on hold.

  5. Kristian K

    The reason this data is collected is that it is easy to do so. There is no need to split the dataset into many; that kind of separation is done with filters, views and queries. From the user's point of view, the technical division of how the data is stored is not relevant.

  6. Peeter Tinits reporter

    The only reason to separate them is that it makes sense to me for the user to tick the usage data and corpus data boxes separately, though I'm sure that most would tick both anyway. It's just cleaner and more polite this way, I think. The data can still be stored in one file if needed (though the usage data file might potentially grow very large, depending on what we collect). Collecting lots of usage data, however, is probably a topic for the future, as I understand it is not easy to implement.

  7. Andrjus Frantskjavitsius repo owner

    Unless you build your application from the ground up with the intention to collect usage data, it's not easy. Or at least it will be a very messy solution. This is why I'm reluctant to implement usage data collection.

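To make the discussion above concrete, here is a minimal sketch of how the two tick boxes and the "one dataset, separated by filters" idea could fit together. It is not taken from the application's code: the names `Consent`, `EventLog`, `record` and `view`, and the record kinds `"metadata"` and `"usage"`, are all hypothetical, and the opt-out default is only the suggestion made in the issue description.

```python
import json
import time
from dataclasses import dataclass, field


@dataclass
class Consent:
    # Two separate tick boxes (hypothetical names); both default to
    # ticked, which corresponds to the opt-out idea from the issue.
    user_metadata: bool = True
    usage_data: bool = True


@dataclass
class EventLog:
    consent: Consent
    records: list = field(default_factory=list)

    def record(self, kind, name, value=None):
        """Store one record if the matching box is ticked.

        Returns True if the record was stored, False if it was dropped.
        """
        allowed = {
            "metadata": self.consent.user_metadata,
            "usage": self.consent.usage_data,
        }
        if not allowed.get(kind, False):
            return False
        self.records.append(
            {"kind": kind, "name": name, "value": value, "time": time.time()}
        )
        return True

    def view(self, kind):
        """Filter the single dataset by record kind instead of splitting files."""
        return [r for r in self.records if r["kind"] == kind]

    def to_json(self):
        """Serialise everything into one JSON document for storage or upload."""
        return json.dumps(self.records)


# Example: the user left "user metadata" ticked but unticked "usage data".
log = EventLog(Consent(user_metadata=True, usage_data=False))
log.record("metadata", "operating_system", "Linux")     # stored
log.record("usage", "template_changed", "two_column")   # dropped
```

The sketch keeps everything in one list and separates it only when reading, which is the approach Kristian K describes, while still giving the user the two separate choices Peeter Tinits asks for. It does not address the harder part raised by the repo owner, namely hooking `record` calls into an interface that was not designed for it.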