Frederic De Groef / csxj-crawler

Commits

Frederic De Groef committed c530ace

[lalibre] extract_article_data() supports url and file-like objects

  • Parent commits e5f1bd1
  • Branches default

Files changed (1)

File csxj/datasources/lalibre.py

-def extract_article_data(source_url):
+def extract_article_data(source):
     """
     """
+    if hasattr(source, 'read'):
+        html_content = source.read()
+    else:
+        html_content = fetch_html_content(source)
 
-    html_content = fetch_html_content(source_url)
-    return extract_article_data_from_html(html_content, source_url)
+    return extract_article_data_from_html(html_content, source)
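
The change above relies on duck typing: anything with a `read` attribute is treated as a file-like object, and anything else is assumed to be a URL. A minimal, self-contained sketch of that dispatch follows. The bodies of `fetch_html_content` and `extract_article_data_from_html` are hypothetical stand-ins, not the real csxj implementations, so that the example runs without network access.

```python
import io


def fetch_html_content(url):
    # Hypothetical stand-in: the real function presumably downloads the page.
    return "<html><body>fetched from {0}</body></html>".format(url)


def extract_article_data_from_html(html_content, source):
    # Hypothetical stand-in: the real function parses the article markup.
    return {"source": source, "html": html_content}


def extract_article_data(source):
    # File-like objects expose read(); anything else is treated as a URL.
    if hasattr(source, 'read'):
        html_content = source.read()
    else:
        html_content = fetch_html_content(source)
    return extract_article_data_from_html(html_content, source)


# Usage: the same entry point accepts an in-memory file or a URL string.
from_file = extract_article_data(io.StringIO("<html>x</html>"))
from_url = extract_article_data("http://example.com/article")
```

Note that in the file-like case the second argument passed on is the file object itself rather than a URL string, so any downstream code that expects a URL there may need to account for that.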