Plugin parsers file structure

General

The plugin parser is a .json file. The general structure is as follows:

{
    "version": 1, 
    "title": "MangaTown", 
    "code": "mangatown", 
    "host": "http://www.mangatown.com/", 
    "language": "en", 
    "online_search": {},
    "cover": {},
    "chapters": {},
    "chapter_page": {}	
}

version: the version of the file.

title: title of the source. REQUIRED.

code: unique identifier of the source. REQUIRED.

host: url of the source. REQUIRED.

language: language of the manga hosted by the source. REQUIRED.

online_search: data needed to search manga on the source. REQUIRED.

cover: data to retrieve the cover thumbnail of a manga.

chapters: data needed to retrieve the list of chapters of a series. REQUIRED.

chapter_page: data needed to get the number of pages of a chapter and the image of each page. REQUIRED.
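
As a rough illustration of the REQUIRED markers above, a loader could check a parsed plugin file as follows. This is only a sketch: the function name missing_required_keys and the incomplete sample file are hypothetical, and the checks the application really performs are not documented here.

```python
import json

# Top-level keys marked REQUIRED in the list above.
REQUIRED_KEYS = ("title", "code", "host", "language",
                 "online_search", "chapters", "chapter_page")


def missing_required_keys(plugin):
    """Return the REQUIRED top-level keys absent from a parsed plugin file."""
    return [key for key in REQUIRED_KEYS if key not in plugin]


# Hypothetical, deliberately incomplete plugin file.
sample = json.loads("""
{
    "version": 1,
    "title": "MangaTown",
    "code": "mangatown",
    "host": "http://www.mangatown.com/",
    "language": "en",
    "online_search": {},
    "cover": {}
}
""")

print(missing_required_keys(sample))
```

Here the sample lacks "chapters" and "chapter_page", so those two keys are reported.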

Online search

To search online, the app needs information such as the URL, the search criteria, and how to parse the results.

"online_search": { 
    "url": "http://www.mangatown.com/search.php?advopts=1",
    "search_criteria": [ 
        "name", "author", "artist", "type", "genres", "year", "rating", "complete"
    ],
    "search_name": "name_method=cw&name=%1$s",
    "search_author": "author_method=cw&author=%1$s",
    "search_artist": "artist_method=cw&artist=%1$s",
    "search_type": "type=%1$s",
    "search_year": "released_method=eq&released=%1$s",
    "search_rating": "rating_method=eq&rating=%1$s",
    "search_complete": "is_completed=%1$s",
    "type_code": [
        "", "manga", "manhwa", "manhua"
    ],
    "type_label": [
        "All", "Japanese Manga", "Korean Manhwa", "Chinese Manhua"
    ],
    "rating_code": [
        "", "0", "1", "2", "3", "4", "5"
    ],
    "rating_label": [
        "Any star", "No star", "1 star", "2 stars", "3 stars", "4 stars", "5 stars"
    ],
    "complete_code": [
        "", "1", "0"
    ],
    "complete_label": [
        "All", "Yes", "No"
    ],
    "genres_code": [
        "all", "genres[4+koma]", "genres[action]", ..., "genres[youkai]"
    ],
    "genres_label": [
        "All", "4 Koma", "Action", ..., "Youkai"
    ],
    "css_path": "ul.manga_pic_list li p.title a",
    "css_path_name": "div.info p.subj",
    "next_page_css_path": "a.next",
    "no_more_page_value": "javascript:void(0)",
    "manga_id": "http://www.mangatown.com/manga/{manga_id}/" 
}

url: base url of the search engine. REQUIRED.

search_criteria: search criteria displayed in the app. You don't need to use them all. REQUIRED.

search_name: value appended to the end of the base URL to search by name. %1$s is replaced by the value entered in the application. REQUIRED if search_criteria contains "name".

search_author: value appended to the end of the base URL to search by author. %1$s is replaced by the value entered in the application. REQUIRED if search_criteria contains "author".

search_artist: value appended to the end of the base URL to search by artist. %1$s is replaced by the value entered in the application. REQUIRED if search_criteria contains "artist".

search_type: value appended to the end of the base URL to search by type (manga, manhwa, ...). %1$s is replaced by the value selected in the application (see "type_code" and "type_label"). REQUIRED if search_criteria contains "type".

search_year: value appended to the end of the base URL to search by year. %1$s is replaced by the value entered in the application. REQUIRED if search_criteria contains "year".

search_rating: value appended to the end of the base URL to search by rating. %1$s is replaced by the value selected in the application (see "rating_code" and "rating_label"). REQUIRED if search_criteria contains "rating".

search_complete: value appended to the end of the base URL to search by status (complete, incomplete, ...). %1$s is replaced by the value selected in the application (see "complete_code" and "complete_label"). REQUIRED if search_criteria contains "complete".

type_code, type_label: parallel lists used to generate the dropdown list for selecting the type. type_code holds the codes sent to the server; type_label holds the labels shown to the user. If the user selects the second type (Japanese Manga), the corresponding code (manga) replaces %1$s in search_type. With search_type = "type=%1$s", "type=manga" is sent to the search engine. REQUIRED if search_criteria contains "type".

rating_code, rating_label: parallel lists used to generate the dropdown list for selecting the rating. rating_code holds the codes sent to the server; rating_label holds the labels shown to the user. If the user selects the fifth rating (3 stars), the corresponding code (3) replaces %1$s in search_rating. With search_rating = "rating_method=eq&rating=%1$s", "rating_method=eq&rating=3" is sent to the search engine. REQUIRED if search_criteria contains "rating".

complete_code, complete_label: parallel lists used to generate the dropdown list for selecting the status. complete_code holds the codes sent to the server; complete_label holds the labels shown to the user. If the user selects the third value (No), the corresponding code (0) replaces %1$s in search_complete. With search_complete = "is_completed=%1$s", "is_completed=0" is sent to the search engine. REQUIRED if search_criteria contains "complete".

genres_code, genres_label: used to include or exclude genres. The first value of each list is not sent; it is an entry that selects or unselects all the other entries. REQUIRED if search_criteria contains "genres".
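
The way a label is mapped to its code and then substituted for %1$s can be sketched in Python as follows. The helper names label_to_code and build_search_url are made up for this illustration. Joining the fragments with "&" and URL-encoding the values are assumptions: the text above only says that each value is added at the end of the base URL. Genres are left out because their request format is not described.

```python
from urllib.parse import quote_plus

# Excerpt of the "online_search" section shown above.
online_search = {
    "url": "http://www.mangatown.com/search.php?advopts=1",
    "search_name": "name_method=cw&name=%1$s",
    "search_type": "type=%1$s",
    "type_code": ["", "manga", "manhwa", "manhua"],
    "type_label": ["All", "Japanese Manga", "Korean Manhwa", "Chinese Manhua"],
}


def label_to_code(config, criterion, label):
    """Map a dropdown label to the code at the same index in <criterion>_code."""
    index = config[criterion + "_label"].index(label)
    return config[criterion + "_code"][index]


def build_search_url(config, values):
    """Append one fragment per filled-in criterion to the base URL.

    ASSUMPTION: fragments are joined with "&" and values are URL-encoded.
    """
    url = config["url"]
    for criterion, value in values.items():
        # Criteria backed by a dropdown send the code, not the label.
        if criterion + "_label" in config:
            value = label_to_code(config, criterion, value)
        fragment = config["search_" + criterion].replace("%1$s", quote_plus(value))
        url += "&" + fragment
    return url


print(build_search_url(online_search,
                       {"name": "one piece", "type": "Japanese Manga"}))
```

With these inputs the sketch produces a URL ending in "name_method=cw&name=one+piece&type=manga".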

css_path: CSS path to get the <a href="url"></a> elements of the series (Reference: http://www.w3.org/TR/CSS21/selector.html#id-selectors). "ul.manga_pic_list li p.title a" returns all <a> tags in <ul class="manga_pic_list"><li><p class="title"><a href="...">series1</a></p></li><li><p class="title"><a href="...">series2</a></p></li></ul>. The href attribute contains the URL of the series page. By default, the name of the series is the text between <a> and </a>. REQUIRED

css_path_name: if the name of the series cannot be found directly in the <a> tag, this path is used to search inside it. For example, in the case <a><span class="name">series</span>...</a>, the path "span.name" can be defined to find, inside the <a> tag, the part containing the name of the series.

next_page_css_path: path to the <a> tag containing the URL of the next page of results. Ex: <a class="next" href="...">Next</a> can be found with the path "a.next". Used by the endless list to load more results when the user reaches the bottom of the list.

no_more_page_value: stop loading more results when this value is found in place of a next-page URL (in the sample above, "javascript:void(0)").

manga_id: pattern used to extract the id of the series from its URL. If the value of this property is "http://www.mangatown.com/manga/{manga_id}/" and the URL found is "http://www.mangatown.com/manga/naruto/", the id "naruto" is extracted and used to build the URLs that load the cover and the chapters. REQUIRED
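
One possible way to implement the {manga_id} extraction is to turn the pattern into a regular expression. The sketch below is an illustration in Python, not the application's code. The function name extract_manga_id is hypothetical, and it assumes that an id never contains "/".

```python
import re

PLACEHOLDER = "{manga_id}"


def extract_manga_id(template, url):
    """Return the part of url located where {manga_id} is in template, or None."""
    # Text before and after the placeholder must match the URL literally.
    prefix, suffix = template.split(PLACEHOLDER, 1)
    # ASSUMPTION: an id never contains "/".
    pattern = re.escape(prefix) + r"([^/]+)" + re.escape(suffix)
    match = re.fullmatch(pattern, url)
    return match.group(1) if match else None


template = "http://www.mangatown.com/manga/{manga_id}/"
print(extract_manga_id(template, "http://www.mangatown.com/manga/naruto/"))
```

For the URL from the description above, the sketch prints naruto. A URL that does not fit the pattern yields None.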

Loading the cover

The "cover" section describes how to download the cover of a series.

"cover": {
    "url": "http://www.mangatown.com/manga/%1$s/",
    "image_css_path": "div.detail_info > img"
}

url: url of the series page containing a tag <img> with a cover of the series. %1$s will be replaced by the manga id. REQUIRED

image_css_path: CSS path to find the <img> tag in the page. REQUIRED

Loading the list of chapters

The "chapters" section describes how to load the list of chapters of a series.

"chapters": {
    "url": "http://www.mangatown.com/manga/%1$s/",
    "chapters_css_path": "ul.chapter_list li a",
    "chapters_css_path_name": "span.subj span",
    "next_page_generator": "&page=%1$s"
}

url: URL of the series page containing the list of chapters. %1$s will be replaced by the manga id. REQUIRED

chapters_css_path: CSS path to get the list of links (tag <a>) of the chapters. REQUIRED

chapters_css_path_name: if the chapter name is not the text of the <a> tag, this CSS path finds it in a child tag.

next_page_generator: if the list of chapters is paginated, the value will be added to the url to load the next page. %1$s is the page number.
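
How the chapter list URL is assembled from the manga id and the page number can be sketched as below. The function name chapter_list_url is hypothetical. The sketch assumes that the first page is the plain url and that next_page_generator is appended only from page 2 onward; the description above does not say which page number the application starts from.

```python
def chapter_list_url(chapters, manga_id, page=1):
    """Build the URL of one page of the chapter list.

    ASSUMPTION: page 1 is the plain url; next_page_generator is appended
    only for later pages.
    """
    url = chapters["url"].replace("%1$s", manga_id)
    generator = chapters.get("next_page_generator")
    if generator and page > 1:
        url += generator.replace("%1$s", str(page))
    return url


# Excerpt of the "chapters" section shown above.
chapters = {
    "url": "http://www.mangatown.com/manga/%1$s/",
    "next_page_generator": "&page=%1$s",
}

print(chapter_list_url(chapters, "naruto"))
print(chapter_list_url(chapters, "naruto", page=2))
```

The second call appends "&page=2" to the series URL, exactly as the generator is written in the sample file.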

Loading the pages of the chapter

"chapter_page": {
    "pages_css_path": "div.manga_read_footer div.page_select option",
    "selector_mode": "full_url_option",
    "ignore_last_option": "true",
    "next_page": "%1$s.html",
    "image_css_path": "div#_imageList img._images",
    "image_src_attr": "data-url"
}

pages_css_path: CSS path to get the tags <option> used to select a page.

selector_mode: if the value is "full_url_option", the <option> found contains the URL of the pages (<option value="http://server/manga/series/chapter/pagex">). If the value is "num_page_option", the option contains the page number. The page number will be added at the end of the url of the first page to load it. If the value is "all_pages", it's a single page (all images on a page). REQUIRED

ignore_last_option: if the last <option> is used to load something other than a page (like comments), set it to true.

next_page: pattern to generate the next page if selector_mode = num_page_option.

image_css_path: CSS path to find the <img> tag containing the image. If selector_mode = all_pages, the CSS path can return a collection of <img> tags. REQUIRED

image_src_attr: if the URL of the image is not in the src attribute of the <img> tag, indicate the name of the attribute containing it.
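
The three selector_mode values can be compared with a short Python sketch. The function name page_urls and the option values in the usage example are hypothetical. For "num_page_option", the sketch assumes that next_page, with %1$s replaced by the page number, is appended to the URL of the first page.

```python
def page_urls(chapter_page, first_page_url, option_values):
    """Return the URLs to load for one chapter, depending on selector_mode.

    option_values: the value of each <option> matched by pages_css_path.
    """
    mode = chapter_page["selector_mode"]
    if mode == "all_pages":
        # Every image is on the chapter page itself.
        return [first_page_url]

    values = list(option_values)
    # The sample file stores the flag as the string "true".
    if str(chapter_page.get("ignore_last_option", "false")).lower() == "true":
        values = values[:-1]

    if mode == "full_url_option":
        # Each option already holds the full URL of a page.
        return values
    if mode == "num_page_option":
        # ASSUMPTION: the filled-in next_page pattern is appended to the
        # URL of the first page.
        pattern = chapter_page.get("next_page", "%1$s")
        return [first_page_url + pattern.replace("%1$s", v) for v in values]
    raise ValueError("unknown selector_mode: " + mode)


config = {
    "selector_mode": "num_page_option",
    "ignore_last_option": "true",
    "next_page": "%1$s.html",
}
print(page_urls(config, "http://server/manga/series/chapter/",
                ["1", "2", "3", "comments"]))
```

In this usage example the last option ("comments") is dropped and the remaining page numbers become 1.html, 2.html and 3.html under the chapter URL.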

CSS path

CSS paths (selectors) are used to parse the HTML and find the information needed.

Ex: Given this simple HTML

<html>
 <body>
  <a href="url">outside</a>
  <ul>
    <li><a href="url1">1</a></li>
    <li><a href="url2">2</a></li>
    <li><a href="url3">3</a></li>
  </ul>
  <ul id="chLst">
    <li><a href="url4">4</a></li>
    <li><a href="url5">5</a></li>
    <li><a class="myClass" href="url6">6</a></li>
  </ul>
 </body>
</html>

The CSS path "ul li a" returns every <a> tag located in an <li> of a <ul>

    <a href="url1">1</a>
    <a href="url2">2</a>
    <a href="url3">3</a>
    <a href="url4">4</a>
    <a href="url5">5</a>
    <a class="myClass" href="url6">6</a>

The CSS path "ul#chLst li a" returns the <a> tags in the <ul> having the id chLst

    <a href="url4">4</a>
    <a href="url5">5</a>
    <a class="myClass" href="url6">6</a>

The CSS path "ul#chLst li a.myClass" returns the <a> tags with the class myClass in the <ul> having the id chLst

    <a class="myClass" href="url6">6</a>
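
The three paths above can be tried out with a small script. The following is a minimal sketch in Python that uses only the standard library. It supports only tag, #id and .class parts separated by spaces, and assumes well-formed HTML. It is not the parser used by the app, and the names select and Selector are made up for this illustration.

```python
import re
from html.parser import HTMLParser


def parse_compound(compound):
    """Split e.g. 'ul#chLst' or 'a.myClass' into (tag, id, classes)."""
    tag = re.match(r"[A-Za-z0-9]*", compound).group(0)
    element_id, classes = None, set()
    for kind, name in re.findall(r"([#.])([\w-]+)", compound):
        if kind == "#":
            element_id = name
        else:
            classes.add(name)
    return tag, element_id, classes


def matches(element, compound):
    tag, element_id, classes = compound
    return ((not tag or element["tag"] == tag)
            and (element_id is None or element["id"] == element_id)
            and classes <= element["classes"])


class Selector(HTMLParser):
    """Collect (href, text) of elements matching a descendant-only CSS path."""

    def __init__(self, css_path):
        super().__init__()
        self.compounds = [parse_compound(c) for c in css_path.split()]
        self.stack = []          # currently open elements
        self.results = []        # one [href, text] pair per match
        self.match_depth = None  # stack depth of the match being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.stack.append({"tag": tag, "id": attrs.get("id"),
                           "classes": set((attrs.get("class") or "").split())})
        if self.match_depth is None and self._selected():
            self.results.append([attrs.get("href"), ""])
            self.match_depth = len(self.stack)

    def handle_data(self, data):
        if self.match_depth is not None:
            self.results[-1][1] += data

    def handle_endtag(self, tag):
        # Assumes every start tag has a matching end tag.
        if self.stack and self.stack[-1]["tag"] == tag:
            self.stack.pop()
        if self.match_depth is not None and len(self.stack) < self.match_depth:
            self.match_depth = None

    def _selected(self):
        # The last part must match the current element; the other parts
        # must match its ancestors, from the nearest one upward.
        *ancestors, target = self.compounds
        if not matches(self.stack[-1], target):
            return False
        position = len(self.stack) - 2
        for compound in reversed(ancestors):
            while position >= 0 and not matches(self.stack[position], compound):
                position -= 1
            if position < 0:
                return False
            position -= 1
        return True


def select(css_path, html):
    parser = Selector(css_path)
    parser.feed(html)
    return [tuple(result) for result in parser.results]


SAMPLE = """
<html><body>
  <a href="url">outside</a>
  <ul><li><a href="url1">1</a></li></ul>
  <ul id="chLst">
    <li><a href="url4">4</a></li>
    <li><a class="myClass" href="url6">6</a></li>
  </ul>
</body></html>
"""

print(select("ul li a", SAMPLE))
print(select("ul#chLst li a", SAMPLE))
print(select("ul#chLst li a.myClass", SAMPLE))
```

On this shortened sample, the first path selects url1, url4 and url6, the second url4 and url6, and the third only url6; the link outside any <ul> is never selected.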
