Peter Hosey committed 3e2fb51

Added test cases and fixes for a couple of bugs in Amazon shortening that ate characters after the URL.

Files changed (2)

 	echo 'http://amzn.com/1556229119' >> amazon.out.correct
 	echo 'http://www.amazon.com/Planets-Op-32-Bringer-Allegro/dp/album-redirect/B0013D9YB4/ref=sr_1_album_1?ie=UTF8&s=dmusic&qid=1311514320&sr=1-1' | ./Shorten-URLs.py >> amazon.out
 	echo 'http://amzn.com/B0013D9YB4' >> amazon.out.correct
+	echo '((http://www.amazon.com/gp/product/B005GMR9LK/ref=kinw_myk_ro_title))' | ./Shorten-URLs.py >> amazon.out
+	echo '((http://amzn.com/B005GMR9LK))' >> amazon.out.correct
+	echo '((http://www.amazon.com/Java-Bibliography-ebook/dp/B005GMR9LA/ref=sr_1_1?s=digital-text&ie=UTF8&qid=1336400865&sr=1-1))' | ./Shorten-URLs.py >> amazon.out
+	echo '((http://amzn.com/B005GMR9LA))' >> amazon.out.correct
 	diff -u amazon.out.correct amazon.out
 
 test-amazon-wishlist:
 		return short_URL
 
 class AmazonURLShortener(URLShortener):
-	canonical_URL_exp = re.compile('(?:http://)?(?:www\.)?(?:amazon\.com(?:(?:/[-_a-zA-Z0-9]+){2}|/dp)/)(?:album-redirect/)?(B[A-Z0-9]+|[0-9]{10})(/ref=.*)?(\?.*)?')
+	canonical_URL_exp = re.compile('(?:http://)?(?:www\.)?(?:amazon\.com(?:(?:/[::identifier::]+){2}|/dp)/)(?:album-redirect/)?(B[A-Z0-9]+|[0-9]{10})(/ref=[::identifier::]*)?([?&][::identifier::]*(=[::identifier::]*)?)*'.replace('[::identifier::]', '[-_a-zA-Z0-9]'))
 
 	def shorten_URL_from_match(self, match):
 		item_ID = match.group(1)
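
The effect of the regex change can be reproduced in isolation. The sketch below is not the script's actual code path: `shorten_all` is a hypothetical stand-in that assumes each match is replaced by `http://amzn.com/` plus the captured item ID, which is what the expected test output suggests. The two patterns are copied from the removed and added lines of the diff, written as raw strings.

```python
import re

# Pattern removed by the commit: the '.*' after '/ref=' and after '?' is
# greedy and unbounded, so it runs to the end of the line.
OLD_EXP = re.compile(
    r'(?:http://)?(?:www\.)?(?:amazon\.com(?:(?:/[-_a-zA-Z0-9]+){2}|/dp)/)'
    r'(?:album-redirect/)?(B[A-Z0-9]+|[0-9]{10})(/ref=.*)?(\?.*)?'
)

# Pattern added by the commit: the '[::identifier::]' placeholder is expanded
# with str.replace before compiling, as in the diff.
NEW_EXP = re.compile(
    (
        r'(?:http://)?(?:www\.)?(?:amazon\.com(?:(?:/[::identifier::]+){2}|/dp)/)'
        r'(?:album-redirect/)?(B[A-Z0-9]+|[0-9]{10})(/ref=[::identifier::]*)?'
        r'([?&][::identifier::]*(=[::identifier::]*)?)*'
    ).replace('[::identifier::]', '[-_a-zA-Z0-9]')
)


def shorten_all(exp, text):
    """Hypothetical substitution step (an assumption, not the script's code):
    replace every match of exp with the short amzn.com form."""
    return exp.sub(lambda m: 'http://amzn.com/' + m.group(1), text)


sample = '((http://www.amazon.com/gp/product/B005GMR9LK/ref=kinw_myk_ro_title))'

print(shorten_all(OLD_EXP, sample))  # ((http://amzn.com/B005GMR9LK
print(shorten_all(NEW_EXP, sample))  # ((http://amzn.com/B005GMR9LK))
```

With the old pattern the closing `))` is swallowed into the `/ref=` group and lost in the replacement. The new pattern only accepts identifier characters in the `/ref=` part and in each `key=value` query pair, so the match stops at the first `)`.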