MICO Extractors

This repository contains the extractors that have been developed for the MICO Platform during the MICO project and beyond by members of the MICO team. Please feel free to fork and add your own MICO extractors. Guidelines on how to do this are given in the sections below.

MICO extractors are components written in Java or C++ that are triggered by the MICO Platform broker in order to extract metadata and/or media data from multimedia sources such as text, images, audio and video. They receive so-called parts as input and create one or more parts as output. Please refer to the metadata model spec to understand what exactly that means.

If you just intend to try out the existing extractors, consider installing their binary versions. For this, please follow the instructions on the MICO project website.

Extractor Build

If you intend to build the MICO extractors on your own on a native system (e.g. to debug/modify them) please follow the steps below.

OS Requirements

All extractors are built and run on Linux (x64). Our current reference system is Debian 8.0 (Jessie). However, we have also successfully built the extractors on a Fedora 20 system (though some dependencies must be installed manually). Java extractors can in principle be compiled on any system with JDK 8 or above, but since they run as daemons, only Linux is supported at run time.

Windows is NOT SUPPORTED as a build platform.

Please note that you need a working internet connection during build time!

MICO platform package repository

The MICO platform contains all MICO system components as well as the development components needed for C++ development. You may install the whole platform in order to test/debug your extractors in a real-world scenario. To get access to the MICO platform API package, follow these steps as root/sudo:

  1. Create the file /etc/apt/sources.list.d/mico.list and add the line:

    deb http://apt.mico-project.eu/ mico main contrib
    
  2. Run

    apt-get update && apt-get install mico-apt-key && apt-get update
    
  3. [optional]: You may install the whole MICO Platform now for testing extractors:

    apt-get install mico-platform
    

C++ extractor build

For building the C++ extractors you need at least the following build tools and dev libraries to be installed on the system:

Name               Debian package
-----------------  ------------------------------------------------------------
Git                git
gcc & Co           build-essential
CMake              cmake
Boost              libboost-all-dev
OpenCV             libopencv-dev
libavcodec/libav   libavcodec-dev libavformat-dev libavfilter-dev libswscale-dev
                   libavresample-dev
ImageMagick        libmagick++-dev
libatlas           libatlas-dev
ZenLib             libzen-dev
MediaInfo          libmediainfo-dev
MICO C++ API       libmico-api-dev (see section above)
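
As a convenience, the package names from the table can be collected in one place. The following is only a sketch: the array and function names are made up for illustration, it assumes the first toolchain package is build-essential, and it prints the install command for review rather than running it.

```shell
#!/usr/bin/env bash
# Package names copied from the dependency table.
# libmico-api-dev is only installable once the MICO apt repository is configured.
MICO_BUILD_DEPS=(
  git build-essential cmake libboost-all-dev libopencv-dev
  libavcodec-dev libavformat-dev libavfilter-dev libswscale-dev libavresample-dev
  libmagick++-dev libatlas-dev libzen-dev libmediainfo-dev
  libmico-api-dev
)

# Hypothetical helper: print the apt-get call so it can be reviewed before use.
print_install_command() {
  echo "apt-get install ${MICO_BUILD_DEPS[*]}"
}
```

Calling print_install_command prints a single apt-get line that can be copied into a root shell.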

Please note that some external dependencies (libccv, yolo, etc.) are downloaded, built and installed during the build because no corresponding Debian package exists.

To build all C++ extractors:

  1. Clone the repository

    git clone https://bitbucket.org/mico-project/extractors.git
    
  2. Create a build directory "build_mico_extractors"

    mkdir build_mico_extractors
    
  3. Enter build directory

    cd build_mico_extractors
    
  4. Execute CMake

    cmake ../extractors/c++

  5. Run make

    make
    

After the build, all extractor executables reside in their respective build directories. For example, you'll find the yolo animal detector by typing

cd c++/TE-202_OAD_yolo_detector/
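
The build steps above, followed by make as described under "C++ development build", can be wrapped in a small script. This is a minimal sketch, not a script shipped with the repository: the run helper, the DRY_RUN variable and the function name are assumptions made for illustration. With DRY_RUN=1 the commands are only printed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Clone, create and enter the build directory, configure, then build.
# The chain stops at the first failing step.
build_all_cpp_extractors() {
  run git clone https://bitbucket.org/mico-project/extractors.git &&
  run mkdir build_mico_extractors &&
  run cd build_mico_extractors &&
  run cmake ../extractors/c++ &&
  run make
}
```

Running DRY_RUN=1 build_all_cpp_extractors first shows what would be executed without touching the file system or the network.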

To build Debian packages for all C++ extractors:

  1. Follow steps 1-3 above

  2. Run

    ../extractors/c++/build_released_extractor_packages.sh
    

After the build, all extractor Debian packages reside in the "package" directory.

New extractor development

Environment and Build

We use CMake for C++ based and Maven for Java-based extractor services. Please make sure that dependencies to external libraries are properly documented (especially in the C++ case) and can be resolved by the build configuration.

Repository structure

The top-level folder specifies the programming language used, i.e. Java or C++. On the second level, each extractor is placed in its own folder with a meaningful name. For extractors created during the MICO project, the folder name should refer to the technology enabler ID in some way. This makes finding specific extractors easy. For C++ extractors, a 3rd_party folder exists which contains closed-source binaries that are required by an extractor. Open-source dependencies must not be placed in the repository but should rather be installed in the system and resolved by the build script of the extractor.

extractors
|
|-README.md
|
|--c++
|   |
|   |--3rd_party
|   |  |
|   |  |-bin
|   |  |-include
|   |  |-lib
|   |
|   |--modules
|   |
|   |--TE-XXX_extractor_serviceA
|   |  |
|   |  |-README.md
|   |  |-CMakeLists.txt
|   |  |-[submoduleX]
|   |  |-[submoduleY]
|   |
|   |
|   |--TE-XXX_extractor_serviceB
|
|--java
    |
    |-pom.xml
    |
    |--TE-XXX_extractor_serviceC
    |  |
    |  |--src
    |  |  |
    |  |  |-main
    |  |  |-deb
    |  |  |-test
    |  |
    |  |-README.md
    |  |-pom.xml
    |
    |--TE-XXX_extractor_serviceD

C++ Implementation

C++ extractors must follow these coding rules in order to be run by the MICO platform:

  • Must be compiled as executables
  • Must have mico-extractor-[name] as executable name
  • Must accept 3 positional CLI arguments in the following order, plus 1 CLI switch:

    extractor_executable_name  [hostname] [username] [password] -k
    

    where

    [hostname] - is the server name of the mico platform (broker, marmotta, RabbitMQ)

    [username] - is the user name for the MICO platform

    [password] - is the user password for the MICO platform

    -k - is used to kill the daemon service in a controlled manner

    You may add additional CLI switches and options for the specific extractor configuration. Please also refer to C++ extractor examples in the repository.

  • Must run as Linux daemons when executed. This can simply be achieved by using the mico-daemon provided by the MICO platform API. To start the daemon use mico::daemon::start. To stop it with the -k option use mico::daemon::stop. A minimal main() function could look like:

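    //minimal extractor main (TBD)
    //also include the MICO platform API daemon header and your service class here
    #include <cstring>
    #include <iostream>

    int main(int argc, char **argv)
    {
      if(argc < 4) {
        std::cerr << "usage: " << argv[0] << " [hostname] [username] [password] [-k]" << std::endl;
        return 1;
      }
      const char *server_name = argv[1];
      const char *mico_user   = argv[2];
      const char *mico_pass   = argv[3];
      //set doKill according to the command line parameter -k
      bool doKill = (argc > 4 && std::strcmp(argv[4], "-k") == 0);

      if(doKill) {
        return mico::daemon::stop("MyMicoExtractorDaemon");
      } else {
        return mico::daemon::start("MyMicoExtractorDaemon", server_name, mico_user, mico_pass, {new MyMicoExtractorService()});
      }
    }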
    

C++ development build

The extractor repository provides a module for finding the MICO platform C++ API for convenience. Use

find_package(MICOPlatformAPI REQUIRED)

in your CMake-script to find it. If the MICO platform C++ API has not been installed into the Linux system you can give hints using CMAKE_PREFIX_PATH or MICOPlatformAPI_HOME as arguments to the cmake script. A typical call could then look like:

cmake -DMICOPlatformAPI_HOME=[path to MICO platform API] -DCMAKE_PREFIX_PATH=[path to local installs such as protobuf] [path to extractor source]

All dependencies introduced by the MICO platform C++ API should be resolved by the find script for the platform and by using

target_link_libraries(${TARGET_NAME}
    ...
    ${MICOPlatformAPI_LIBRARIES}
    ...
)

in your extractor's CMakeLists.txt.

Please check the build configurations of the example extractors for more ideas on how to configure the build.

Once you've successfully run CMake you may run

make -j

in your build directory to build your extractor.

Java implementation

Java extractors have to implement the interface eu.mico.platform.event.api.AnalysisService. A sample extractor is available in the platform repository under api/java/sample.

Java extractors should use mico-extractors as their parent to easily manage the correct versions of dependencies and build plugins needed to run with the latest platform release.

  <parent>
    <groupId>eu.mico-project.extractors</groupId>
    <artifactId>mico-extractors</artifactId>
    <version>1.0.0-SNAPSHOT</version>
  </parent>

Java build

To build an executable standalone jar, the maven-shade-plugin can be used:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <createDependencyReducedPom>false</createDependencyReducedPom>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer"/>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>[qualified.name.of.main.class]</mainClass>
                    </transformer>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ApacheNoticeResourceTransformer">
                        <addHeader>false</addHeader>
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>

The build can be triggered by the standard call of mvn package.

MICO Extractors deployment

Although residing in the same git repository, each MICO extractor is versioned, packaged and deployed individually. While C++ extractors are deployed as daemon executables, Java extractors are deployed as jar-files. All extractors must be provided as versioned Debian files. See guides below on how to achieve this.

Packaging extractors

Each MICO extractor package must contain and install the following files in the locations specified below:

Native (C++) extractor package content

/.
|
|--bin
|  | 
|  |--mico-extractor-[name]
|  |--mico-extractor-[name]-config
|
|--share
    |
    |--doc
    | |
    | |--mico-extractor-[name]
    |    | 
    |    |--copyright
    |
    |--mico-extractor-[name]
          | 
          |--[resource 1 required by extractor]
          |--[resource 2 required by extractor]
          |--...

Java extractor package content

/.
|
|--share
    |
    |--mico
    |  |
    |  |--mico-extractor-[name].jar
    |  |--mico-extractor-[name].jar-config
    |
    |--doc
    | |
    | |--mico-extractor-[name]
    |    | 
    |    |--copyright
    |
    |--mico-extractor-[name]
          | 
          |--[resource 1 required by extractor]
          |--[resource 2 required by extractor]
          |--...
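
The two layouts above can be turned into a list of paths that a package has to contain. The following sketch assumes a hypothetical helper name and prints paths relative to the install prefix. The optional resource folder is left out because its content differs per extractor.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the files a package must install,
# relative to the install prefix.
# $1 = extractor name (the [name] part), $2 = native | java
expected_package_paths() {
  local name="$1" system="$2"
  case "$system" in
    native)
      echo "bin/mico-extractor-${name}"
      echo "bin/mico-extractor-${name}-config"
      ;;
    java)
      echo "share/mico/mico-extractor-${name}.jar"
      echo "share/mico/mico-extractor-${name}.jar-config"
      ;;
    *)
      echo "unknown system: ${system}" >&2
      return 1
      ;;
  esac
  # The copyright file is required for both package types.
  echo "share/doc/mico-extractor-${name}/copyright"
}
```

The printed list can be compared by hand against the content listing of a generated package, for example the output of dpkg-deb -c.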

Configuration file specification

The current version of the MICO broker requires a simple description file that accompanies each extractor in order to configure extractor pipelines. This file is essentially a bash script and must follow these rules

  • Name: [extractor-name]-config
  • Location: see sections above
  • Content for native extractors:

    #!/usr/bin/env bash
    EXTRACTOR_NAME="mico-extractor-[name]"
    EXTRACTOR_DESC="[a description of the extractor]"
    EXTRACTOR_SYSTEM=native
    
  • Content for Java extractors: In addition to name and system the starting class and the required class paths (including the one of the own JAR) must be added

    #!/usr/bin/env bash
    EXTRACTOR_NAME="mico-extractor-[name].jar"
    EXTRACTOR_DESC="[a description of the extractor]"
    EXTRACTOR_SYSTEM=java
    
    EXTRACTOR_CLASS="[the java daemon entry class]"
    EXTRACTOR_CLASS_PATH="[class path 1]":"[class path 2]":"/usr/share/mico/$EXTRACTOR_NAME"
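
Since the configuration file is a bash script, the rules above can be checked by sourcing it and testing the variables. This is a minimal sketch with a hypothetical function name. It is not part of the MICO platform, and it only checks that the variables are set, not that the values are meaningful. Sourcing executes the file, so it should only be used on configuration files you trust.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that a [extractor-name]-config file defines
# the variables required for its EXTRACTOR_SYSTEM.
check_extractor_config() {
  local config_file="$1"
  # Run in a subshell so sourced variables do not leak into the calling shell.
  (
    unset EXTRACTOR_NAME EXTRACTOR_DESC EXTRACTOR_SYSTEM EXTRACTOR_CLASS EXTRACTOR_CLASS_PATH
    # shellcheck disable=SC1090
    source "$config_file" || exit 1
    required="EXTRACTOR_NAME EXTRACTOR_DESC EXTRACTOR_SYSTEM"
    case "${EXTRACTOR_SYSTEM:-}" in
      native) ;;
      java) required="$required EXTRACTOR_CLASS EXTRACTOR_CLASS_PATH" ;;
      *) echo "EXTRACTOR_SYSTEM must be native or java" >&2; exit 1 ;;
    esac
    for var in $required; do
      # ${!var} is bash indirect expansion: the value of the variable named by $var.
      if [ -z "${!var:-}" ]; then
        echo "missing $var in $config_file" >&2
        exit 1
      fi
    done
  )
}
```

The function returns 0 for a complete file and a non-zero status otherwise. Pass a path containing a slash (for example ./mico-extractor-demo-config) so that source does not search PATH.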
    

Native (C++) extractor deployment work flow

Packaging preparation

For simplicity, you may use CMake / CPack to create Debian packages. While easy to implement, this is not as flexible as using the tools provided by Debian. Nevertheless, the resulting packages are largely accepted by the Debian packaging test tool and, with a bit more effort, can also be added to the MICO package repository. Packaging consists of these steps:

  • Add installation instructions to the CMakeLists.txt for your extractor according to the file structure above. Example:

    install(TARGETS ${TARGET_NAME} RUNTIME DESTINATION bin COMPONENT ${TARGET_NAME})
    install(FILES ${CMAKE_CURRENT_BINARY_DIR}/copyright DESTINATION share/doc/${TARGET_NAME}/ COMPONENT ${TARGET_NAME})
    install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/face.sqlite3 DESTINATION share/${TARGET_NAME}/ COMPONENT ${TARGET_NAME})
    install(PROGRAMS ${CMAKE_CURRENT_BINARY_DIR}/${TARGET_NAME}-config DESTINATION bin COMPONENT ${TARGET_NAME})
    

    Please refer to the CMake documentation for install for more details.

  • Add CPack information to the CMakeLists.txt. The following snippet (which may need adaptations for your extractor) produces acceptable Debian packages:

    # configuration for Debian packaging via CMake
    IF(EXISTS "${CMAKE_ROOT}/Modules/CPack.cmake")
        INCLUDE(InstallRequiredSystemLibraries)
        SET(CPACK_GENERATOR "DEB") 
        SET(CPACK_PACKAGE_NAME ${TARGET_NAME})
        SET(CPACK_PACKAGE_FILE_NAME "${TARGET_NAME}_${TARGET_VERSION}_${CMAKE_SYSTEM_NAME}_${CMAKE_SYSTEM_PROCESSOR}")
        SET(CPACK_PACKAGE_VENDOR "Fraunhofer IDMT")
        SET(CPACK_PACKAGE_DESCRIPTION ${TARGET_DESC})
        SET(CPACK_PACKAGE_DESCRIPTION_SUMMARY ${TARGET_SHORT_DESC})
        SET(CPACK_SET_DESTDIR On)
        SET(CPACK_INSTALL_PREFIX /usr)
        SET(CPACK_PACKAGE_CONTACT "Christian Weigel <christian.weigel@idmt.fraunhofer.de>")
        SET(CPACK_PACKAGE_VERSION_MAJOR "${TARGET_VERSION_MAJOR}")
        SET(CPACK_PACKAGE_VERSION_MINOR "${TARGET_VERSION_MINOR}")
        SET(CPACK_PACKAGE_VERSION_PATCH "${TARGET_VERSION_PATCH}")
        SET(CPACK_STRIP_FILES On)
        #Debian-specific package information (add package dependencies here!)
        SET(CPACK_DEBIAN_PACKAGE_SHLIBDEPS On)
        SET(CPACK_DEBIAN_PACKAGE_DEPENDS "libmico-api1 (=${MICOPlatformAPI_VERSION}), libmagick++-6.q16-5, libswscale3, libavfilter5, libavformat56, libavcodec56, libboost-regex1.55.0, libboost-filesystem1.55.0, libboost-program-options1.55.0, libboost-log1.55.0, libboost-system1.55.0")
        SET(CPACK_DEBIAN_PACKAGE_SECTION "non-free/science")
        INCLUDE(CPack)
    ENDIF(EXISTS "${CMAKE_ROOT}/Modules/CPack.cmake")
    

    Please make sure that your extractor adds all runtime libraries as Debian package dependencies to the CPACK_DEBIAN_PACKAGE_DEPENDS variable. Please refer to the CMake documentation for packaging and Debian-specific packaging for more details.

Repository preparation and deployment

For every native extractor release please:

  1. Prepare additionally required files (broker configuration file)
  2. Prepare install and packaging environment - see sections above
  3. Increment the version in the CMakeLists.txt
  4. Make a release preparation commit on the master branch
  5. Tag this commit with the release version, example:

    git tag tvs-1.1.0 HEAD
    

    Tagging rules: Use a 3-letter acronym followed by a hyphen and a dot-separated three-part version number:

    ABC-1.0.0
    
  6. Push the commit

  7. Push the tag with

    git push --tags
    
  8. Prepare your build directory by running CMake on the source tree you've just committed (see section "C++ development build"; we recommend using a separate build directory)

  9. Run

    make package
    

    or

    cpack
    

    in your build directory.

  10. Test install the generated Debian package in a clean Debian system (you may use Docker or some other fancy mechanism for that)

  11. Provide the Debian package to Salzburg Research so that they can add it to the MICO Debian repository.
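
The tagging rule from step 5 can be checked before pushing a tag. The sketch below is an assumption-laden illustration: the function name is made up, both upper- and lower-case acronyms are accepted because the examples above use tvs and ABC, and each version part may have more than one digit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the tag looks like ABC-1.0.0,
# i.e. three letters, a hyphen, and three dot-separated numbers.
is_valid_release_tag() {
  [[ "$1" =~ ^[A-Za-z]{3}-[0-9]+\.[0-9]+\.[0-9]+$ ]]
}
```

A call such as is_valid_release_tag tvs-1.1.0 && git tag tvs-1.1.0 HEAD only creates the tag when the name matches the pattern.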

Notice: For your convenience, a bash script is provided that automatically builds the extractors known to be release-ready. It is located at c++/build_released_extractor_packages.sh in the extractor repository. To use it, follow these steps:

  • Add any release-ready extractor to the array within the script.
  • Follow steps 1 to 7 above
  • Prepare an empty build directory, change into it and run

    [src-location]/c++/build_released_extractor_packages.sh
    

    Afterwards you'll find all extractor Debian packages in the [build directory]/packages folder.

Java extractor deployment work flow

The main part of the debian package configuration for an extractor is in pom.xml (jdeb build plugin). For a more complex configuration sample see TE-202 or TE-206.

<plugin>
    <artifactId>jdeb</artifactId>
    <groupId>org.vafer</groupId>
    <version>1.4</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>jdeb</goal>
            </goals>
            <configuration>
                <verbose>true</verbose>
                <controlDir>${project.build.directory}/deb/debian</controlDir>
                <snapshotExpand>true</snapshotExpand>
                <snapshotEnv>buildNumber</snapshotEnv>
                <attach>false</attach>
    ....
</plugin>

Additional files that control the installation of the Debian package are located in src/deb/debian/:

  • control
  • postinst
  • postrm
  • preinst
  • prerm

The configuration to start and stop the extractor daemon is located at src/deb/resources/mico-extractor-[name].jar-config

#!/usr/bin/env bash
# global setup variables
EXTRACTOR_NAME="mico-extractor-${project.artifactId}.jar"
EXTRACTOR_DESC="Converts object (animal, face) detection results into RDF"
EXTRACTOR_SYSTEM=java

EXTRACTOR_CLASS="de.fhg.idmt.mico.extractors.or.ObjectToRDFDaemon"
EXTRACTOR_CLASS_PATH="/usr/share/java/commons-daemon.jar":"/usr/share/mico/$EXTRACTOR_NAME"

A complete sample configuration is available in the TE-202-object-detection-To-RDF extractor. To create the Debian package, run Maven with the debian profile enabled: mvn package -Pdebian