virusbattle-sdk / Scalability


In general, computational cost depends on the algorithm, the implementation, the underlying hardware, and the dataset. The times below are ballpark figures for analyzing a binary of around 650 KB (the 95th percentile in our collection). They were measured with an unoptimized implementation on modest hardware: a workhorse machine with 4 cores and 8 GB of memory, plus separate machines for the database server and the web server.

  • Unpacking: 15 seconds (sometimes up to 3 minutes)
  • Semantic Reverse Engineering: 15 seconds
  • Searching, per binary: around 15 seconds, on a database of over 30,000 samples (with a naive algorithm)
  • Searching, per procedure: around 10 ms, on a database of over 30,000 samples

The implementations are unoptimized in that they do not exploit the parallelism inherent in these computations. The estimates above therefore leave ample room for improvement through better hardware and an improved software architecture.
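
To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It is not part of the SDK. It assumes that the three per-binary stages run one after another for each binary, and that adding workers scales throughput linearly, which is an idealization. The names `STAGE_SECONDS`, `seconds_per_binary`, and `binaries_per_day` are hypothetical and exist only for this illustration.

```python
# Ballpark per-stage times in seconds, copied from the list above.
# Assumption: the stages run sequentially for each binary.
STAGE_SECONDS = {
    "unpacking": 15.0,                     # typical case; sometimes up to 180 s
    "semantic_reverse_engineering": 15.0,
    "search_per_binary": 15.0,             # naive algorithm, 30,000+ samples
}

SECONDS_PER_DAY = 86_400


def seconds_per_binary(stages=STAGE_SECONDS):
    """Total time for one binary if the stages run back to back."""
    return sum(stages.values())


def binaries_per_day(workers=1, stages=STAGE_SECONDS):
    """Estimated daily throughput.

    Assumption: ideal linear scaling with the number of workers.
    Real systems lose some of this to I/O and database contention.
    """
    return workers * SECONDS_PER_DAY / seconds_per_binary(stages)


if __name__ == "__main__":
    print(seconds_per_binary())           # 45.0
    print(binaries_per_day())             # 1920.0
    print(binaries_per_day(workers=4))    # 7680.0
```

Under these assumptions a single sequential pipeline handles roughly 1,900 binaries per day. The occasional 3-minute unpacking case would lower that figure, while processing binaries in parallel would raise it.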