This is the first stable release of the Clear Read API. Clear Read is a service for extracting article text and metadata from a URL. The API takes any article URL as input and turns it into a full-text XML feed that can be integrated into a third-party application. Along with the article text, Clear Read pulls out metadata such as title, description and link. The API uses RESTful calls and returns responses in XML or JSON.
- worded by programmableweb.com
Join the Discussion on Hacker News
Turns the following URL into a Full-Text XML/JSON Response:
<title>Balsamiq ❤ UX.StackExchange.com | Mockups Product Blog</title>
<description>Full Article Text (in encoded HTML)</description>
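A minimal Python sketch of reading these fields out of an XML response. It assumes each field arrives as an element named `title`, `description` or `link`; the enclosing structure is not shown here, so the `<item>` wrapper in the sample and the helper name `parse_clear_read_xml` are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_clear_read_xml(xml_text):
    """Collect title, description and link from an XML response.

    Assumption: each field is a descendant element named after the field.
    The surrounding structure is not documented here, so the whole tree
    is searched and the first match for each name is used.
    """
    root = ET.fromstring(xml_text)
    fields = {}
    for name in ("title", "description", "link"):
        node = root.find(f".//{name}")
        fields[name] = node.text if node is not None else None
    return fields


# Hypothetical sample; the <item> wrapper is an assumption for illustration.
sample = (
    "<item>"
    "<title>Example title</title>"
    "<description>&lt;p&gt;Full article text&lt;/p&gt;</description>"
    "<link>http://example.com/article</link>"
    "</item>"
)
article = parse_clear_read_xml(sample)
print(article["title"])
print(article["description"])
```

The XML parser already undoes the entity encoding of the description, so the value comes back as an HTML string ready to render or process further.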
Point your App here: http://api.thequeue.org/v1/clear?url=[ArticleURL]&format=[xml/json]
Important: The url parameter must always come first in the query string.
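A rough Python sketch of building a request with the url parameter first. The helper names `build_clear_read_url` and `fetch_article` are hypothetical, and percent-encoding the article URL is an assumption; if the service expects the raw URL, drop the `quote` call.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE = "http://api.thequeue.org/v1/clear"


def build_clear_read_url(article_url, fmt="xml"):
    """Return the request URL with url first and format second."""
    if fmt not in ("xml", "json"):
        raise ValueError("format must be 'xml' or 'json'")
    # Assumption: the article URL is percent-encoded so that its own
    # '?' and '&' characters are not read as API parameters.
    encoded = quote(article_url, safe="")
    # The string is assembled by hand so that url always comes first.
    return f"{BASE}?url={encoded}&format={fmt}"


def fetch_article(article_url, fmt="xml"):
    """Fetch the response body as text (assumes a UTF-8 response)."""
    with urlopen(build_clear_read_url(article_url, fmt)) as resp:
        return resp.read().decode("utf-8")


print(build_clear_read_url("http://example.com/post?id=1", "json"))
```

Call `fetch_article` once per article you actually need rather than in a crawl loop, in keeping with the usage limits.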
I may add API keys in the future. As long as you don't crawl the entire web with this API, feel free to use it.
*Once again, DO NOT crawl the web with this API.
Premium Status: If you enjoy using Clear Read, please consider making a donation through PayPal or Flattr. Any donation above $10/month will get you Premium Status and I'll do my best to prioritise your support requests.
10/27/12: Major new feature: now stripping every inline attribute except the essentials (href, alt, title, etc.). Added new extraction patterns. Other minor fixes to the code.
9/28/12: Added a Google ad to the homepage to pay for hosting. This does not and will not affect the API endpoints.
9/7/12: Various bug fixes: improved reliability, more and better extraction patterns, can now handle AJAX blogs from Google and more.
7/18/12: Fixed bug that affected the extraction of some URLs.
7/17/12: Introducing the Toolbox. If an extraction fails or you need to clear Clear Read's cache for a particular URL, visit the Toolbox. Additional fixes: script removal, cleaner HTML output and inline CSS stripping.
5/1/12: Fixed Wikipedia extraction. Clarified usage limits.
4/28/12: Over a million pages extracted since the Clear Read API launched two months ago.