Search Engine for Java Method-Signatures
Ákos Kádár

Introduction

This is my Master's thesis project, submitted in partial fulfillment of the requirements for the degree of Master of Science in Communication and Information Sciences, Master Track Human Aspects of Information Technology, at the Faculty of Humanities of Tilburg University.

It is a source code component retrieval application: given an English query, it retrieves Java method-signatures from the Java Standard Library.

It works in a fairly unorthodox way: it retrieves methods using bag-of-words translation. The translation model is a Ridge Regression model trained on the term-document matrices of two parallel document collections: Java method-signatures and their descriptions.
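The idea of bag-of-words translation can be illustrated with a minimal sketch. This is not the code from learn.py; it assumes scikit-learn's TfidfVectorizer and Ridge, and the toy data, the `search` helper and the `alpha` value are all hypothetical. A ridge regression maps description vectors to signature vectors, and a query is answered by translating it and ranking the signatures by cosine similarity to the translated vector.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.metrics.pairwise import cosine_similarity

# Toy parallel collections (hypothetical): description i documents signature i.
descriptions = [
    "parse text into integer number",
    "compute length of characters",
    "append element to list",
]
signatures = [
    "int parseInt String s",
    "int length",
    "boolean add E e",
]

# One tf*idf vocabulary per "language": English descriptions and Java signatures.
desc_vectorizer = TfidfVectorizer()
sig_vectorizer = TfidfVectorizer()
X = desc_vectorizer.fit_transform(descriptions).toarray()
Y = sig_vectorizer.fit_transform(signatures).toarray()

# The translation model: regress signature vectors on description vectors.
# alpha=1.0 is an assumed regularization strength, not a value from the thesis.
model = Ridge(alpha=1.0)
model.fit(X, Y)


def search(query, top_k=1):
    """Translate an English query and return the closest signatures."""
    query_vector = desc_vectorizer.transform([query]).toarray()
    translated = model.predict(query_vector)          # bag of signature terms
    scores = cosine_similarity(translated, Y)[0]      # one score per signature
    ranked = np.argsort(scores)[::-1][:top_k]
    return [signatures[i] for i in ranked]


print(search("parse integer"))
```

The query never has to share a word with a signature: the regression weights carry the association between English terms and signature terms.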

Usage

  • Required packages: Gensim, scikit-learn, argparse and climate
  • First run learn.py to create a model
  • Run the search engine by running GUI.py

  • vectorspace.py can be used to create tf*idf vectors from texts

  • search.py contains methods to use Gensim's search interface
  • learn.py trains the regression model

References