Tidy Extension for Scheme


Last Saturday I needed to clean up some HTML (which I'd read into Scheme as a string) and turn it into valid XML for storing in Sleepycat's DbXml database. HTML Tidy is a great way to do this, so I put together a small, single-function PLT Scheme extension that wraps the Tidy library.

The library's easy to build and use. Here's an example:

(require (lib "tidy.ss" "tidy"))

(define bad_string "<p>Foo!<ul><li>first<li>second")

(display (tidy:string bad_string))

This displays:

<p>Foo!</p>
<ul>
<li>first</li>
<li>second</li>
</ul>

Over the last week I've run several hundred HTML snippets pulled from RSS feeds through the function, and it's worked great.
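For a batch of snippets like that, a small wrapper can keep one bad snippet from stopping the whole run. The following is only a sketch: clean-all and its parameter names are hypothetical and not part of the extension, and it assumes that a failed cleanup raises an exception, which may not be how tidy:string actually reports problems.

```scheme
;; Sketch only: clean-all, cleaner and snippets are hypothetical names,
;; not part of the tidy extension.
;; cleaner is any string -> string procedure, such as tidy:string.
(define (clean-all cleaner snippets)
  (map (lambda (snippet)
         ;; Assumption: a failed cleanup raises an exception. If it does,
         ;; record #f for that snippet instead of aborting the batch.
         (with-handlers ((exn? (lambda (e) #f)))
           (cleaner snippet)))
       snippets))
```

After requiring the library as in the example above, (clean-all tidy:string (list bad_string)) would return a list holding either the cleaned string or #f for each snippet. Passing the cleaning procedure as an argument keeps the sketch independent of the extension, so it can be tried with any stand-in procedure.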



Last modified: Thu Oct 10 09:47:19 2019.