BibSonomy issue #2573: Tag cloud seems to be wrong

Issue #2573 open

Andreas Hotho created an issue 2015-12-02

The tag cloud is wrong (regarding post counts): http://www.bibsonomy.org/search/clustering

Comments (19)

Daniel Zoller
the size is correct, dblp with ~1000 dominates the cloud.
- 2015-12-03T09:24:06+00:00
Daniel Zoller
- changed status to duplicate
Duplicate of ~~#2439~~.
- 2015-12-03T09:24:14+00:00
Andreas Hotho reporter
But 30 for dblp seems to be the wrong number and clustering should be something like 1700 as /tag/clustering has 1700 posts.
- 2015-12-03T19:55:50+00:00
Daniel Zoller
removed old rawsearch and replaced it with search (modified search was used by the database queries)

getPosts used rawsearch while getTags used search (which added boolean operators to the query)

addresses #2573; reopening #2573

→ <<cset ebef8c5242d1>>
- 2015-12-03T20:54:21+00:00
Daniel Zoller
- changed status to open
- 2015-12-03T20:54:21+00:00
Daniel Zoller
- changed component to search
- changed title to Tag cloud seems to be wrong
- changed milestone to 3.5
- edited description
- 2015-12-03T20:56:09+00:00
Daniel Zoller
one problem was that the methods for retrieving posts and tags did not use the same query:

getPosts used "clustering methods" and getTags used "+clustering +methods"
- 2015-12-03T20:59:26+00:00
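The mismatch described above can be illustrated with a small sketch. This is not BibSonomy code: the toy posts, the matching function, and the names get_posts and get_tags are hypothetical stand-ins, and the '+' handling only approximates Lucene-style required terms. It shows why a post list built from "clustering methods" and a tag cloud built from "+clustering +methods" cannot agree on counts.

```python
from collections import Counter

# Toy posts; titles and tags are invented for illustration only.
POSTS = [
    {"title": "clustering methods survey", "tags": ["clustering", "survey"]},
    {"title": "spectral clustering", "tags": ["clustering", "spectral"]},
    {"title": "sampling methods", "tags": ["sampling"]},
]


def matches(post, query):
    """Return True if the post title matches the query.

    Terms prefixed with '+' are required. If the query has no required
    terms, matching any single term is enough (OR semantics).
    """
    words = set(post["title"].split())
    terms = query.split()
    required = [t[1:] for t in terms if t.startswith("+")]
    optional = [t for t in terms if not t.startswith("+")]
    if required:
        return all(t in words for t in required)
    return any(t in words for t in optional)


def get_posts(query):
    """Hypothetical stand-in for the post retrieval."""
    return [p for p in POSTS if matches(p, query)]


def get_tags(query):
    """Hypothetical stand-in for the tag retrieval: count tags of matching posts."""
    return Counter(tag for p in get_posts(query) for tag in p["tags"])


# Different queries for posts and tags: the tag cloud covers fewer posts
# than the result list shows.
posts_shown = get_posts("clustering methods")
tags_shown = get_tags("+clustering +methods")
print(len(posts_shown), dict(tags_shown))
```

Passing the same query string to both functions makes the tag counts describe exactly the posts that are listed, which is what the changeset aims for.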
Daniel Zoller
The limit and offset handling for search is still broken: the tag cloud is computed only on the posts returned within the limit. Currently we compute the tag cloud from the tags of the 1000 most recent posts that matched the query. Do we want to change this behaviour? (Note: this handling was not changed when switching from Lucene to Elasticsearch.) I don't know how well computing the tag cloud on all results would perform. Maybe we could use the aggregation feature of Elasticsearch.

@jaeschke Do you have any experience with aggregations in Elasticsearch?
- 2015-12-03T21:33:45+00:00
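As a rough sketch of the aggregation idea, the following builds an Elasticsearch search body with a terms aggregation and reads the counts back from the response. The field name "tags", the aggregation name "tag_cloud", and all function names are assumptions made for illustration, not the actual BibSonomy index mapping; the field would need to be indexed without analysis for exact tag counts. The local tag_counts helper only demonstrates how a result limit truncates the counts.

```python
from collections import Counter


def tag_cloud_request(query, max_tags=50):
    """Build a search body that counts tags over all hits, not just one page.

    "tags" is a hypothetical field name.
    """
    return {
        "size": 0,  # return no hits, only the aggregation result
        "query": {"query_string": {"query": query, "default_operator": "AND"}},
        "aggs": {"tag_cloud": {"terms": {"field": "tags", "size": max_tags}}},
    }


def parse_tag_cloud(response):
    """Extract {tag: post count} from a terms aggregation response."""
    buckets = response["aggregations"]["tag_cloud"]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}


def tag_counts(posts, limit=None):
    """Count tags over the first `limit` posts, or over all posts if limit is None."""
    window = posts if limit is None else posts[:limit]
    return Counter(tag for post in window for tag in post["tags"])


# Invented posts, most recent first, to show the effect of the limit.
posts = [
    {"tags": ["clustering", "dblp"]},
    {"tags": ["clustering"]},
    {"tags": ["clustering", "kmeans"]},
]
print(tag_counts(posts, limit=2))  # counts from a truncated window
print(tag_counts(posts))           # counts over every matching post
```

Whether this performs acceptably on the full index is exactly the open question in the comment above; the sketch only shows the shape of the request and response.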
Robert Jäschke
No, unfortunately, not yet. It sounds interesting and like a good option. For single-tag queries we can, of course, use the database which provides the related tags. For more than two tags, however, we have to rely on Elasticsearch.
- 2015-12-04T15:08:34+00:00
Andreas Hotho reporter
Even for single-tag queries, the numbers can differ, as tags can also appear in the description of other posts. The best way would be to use Elasticsearch. I suggest implementing it as a kind of lazy loading with Ajax. This would allow the page to load very quickly, with the tag cloud following after some delay.
- 2015-12-04T16:35:35+00:00
Robert Jäschke
Yes, that's possible. We have/had AJAX loading of the tag cloud implemented anyway (for the case where a user changed the number of shown tags, etc.). What I don't understand is the remark "tags can be in description of other posts as well". Which description and which posts are meant? Normally, the tag cloud shows all tags of all posts of the user.
- 2015-12-07T07:57:50+00:00
Andreas Hotho reporter
OK, "tags" is the wrong term; I mean words that can be found in the description field. The tags of such posts should be counted as well.
- 2015-12-07T08:19:21+00:00
Robert Jäschke
Yes, that makes sense indeed.
- 2015-12-07T08:22:57+00:00
Daniel Zoller
- changed milestone to 3.6
- 2016-02-08T09:11:20+00:00
Daniel Zoller
- changed milestone to 3.6.0
- 2016-02-24T17:23:14+00:00
Daniel Zoller
- changed milestone to 3.7.0
- 2016-06-05T21:17:11+00:00
Daniel Zoller
no system for testing aggregation -> next release
- 2016-11-06T16:34:18+00:00
Daniel Zoller
- changed milestone to 3.8.0
- 2016-11-06T16:34:27+00:00
Daniel Zoller
- removed responsible
- 2017-01-26T10:35:33+00:00

Assignee: –

Type: bug

Priority: major

Status: open

Component: search

Milestone: 3.8.0

Version: 3.4

Votes: 0

Watchers: 2
