the whole world burns

Archive for category 'parser'

Webstemmer

Webstemmer is a web crawler and HTML layout analyzer that automatically extracts the main text of a news site, without banners, ads, or navigation links mixed in.

Small things, links and miscellany, sparkling with light. Sam's tumblelog.
