
Baen Book related stuff

Baen eBook HTML enhancement scripts

There are currently two choices:

Baen HTML to Rocket eBook enhancement script

baen2rocket (Version 4.0.2)

Baen HTML to MS Reader enhancement script

baen2lit (Version 1.0.0)

I have created two Perl scripts that take one or more Baen WebScription books (HTML eBooks) and enhance the HTML for reading on either the Rocket eBook reading device or MS Reader (software and PocketPC versions).

Both scripts clean up the HTML to remove some web-oriented HTML and both enhance the punctuation to make it look nicer (though you can optionally disable this, if you like).

The baen2rocket script generates a single HTML page with some simplified HTML (e.g. book-style paragraphs with no style sheets), so some people without the Rocket eBook have also found the script useful.

The baen2lit script also fixes up a problematic table on the copyright page and inserts some helpful page breaks.

Using a Perl Script

To use one of the scripts, you need perl installed -- if you don't have it, you can find out how to download perl at perl.com.

All that you need to do to use the scripts is to put all the HTML pages (along with the book's cover image, if desired) into a single directory and then run the script on those files. For the WebScription books, you can simply grab the zip file of the books you want to read and unzip all the files into the directory of your choice.

After you have one or more books in a directory, you would then run the script of your choice on the files. From your command-line interpreter, you would "cd" into the directory that contains your files, and then type something like this:

    perl baen2rocket *

or:

    perl baen2lit *

The script will ignore files which aren't part of the Baen eBooks. You can also get more specific in what you want to convert, if you like. The baen2rocket script will generate just one book if you specify any file from that book (or even just specify the ISBN prefix number, sans dashes). The baen2lit script will only tweak the files that you specifically mention on the command line.

More info for the baen2rocket script

The result of running the baen2rocket script will be a single .htm file that has its name based on the title of the book. The format of the file will be a cover page, followed by a contents page (with hyperlinks to the chapters), a book blurb/info/dedication section, and all the chapter pages (including any preface, epilogue, and appendices). You would then simply import this file using your Rocket Librarian software and send it to your Rocket eBook as normal.

My script attempts to format the paragraphs in a normal book fashion (e.g. using indented paragraphs), and it does a pretty good job of upgrading the quotes into real opening and closing quotes too. I have also attempted to fix some of the HTML glitches that I've encountered in the various books that I have seen so far. I will probably continue to tweak this script as time goes on to make this better.
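
As a rough illustration of what "upgrading the quotes" means, here is a simple heuristic written as a shell function. This is only a sketch and not the actual logic inside baen2rocket, which is not shown on this page. The function name curl_quotes and the choice of the &ldquo; and &rdquo; entities are assumptions made for the example.

```shell
# Rough heuristic, NOT baen2rocket's real logic (that is not shown here):
# a straight double quote at the start of a line, or right after whitespace,
# "(" or ">", is treated as an opening quote; every remaining straight
# double quote is treated as a closing quote.
# Assumption: the input has no quoted tag attributes (e.g. href="..."),
# since those quotes would be rewritten too.
curl_quotes() {
    sed -E \
        -e 's/(^|[[:space:](>])"/\1\&ldquo;/g' \
        -e 's/"/\&rdquo;/g'
}
```

For example, piping the line `<p>"Hello," she said. "Go."</p>` through curl_quotes gives `<p>&ldquo;Hello,&rdquo; she said. &ldquo;Go.&rdquo;</p>`. A real script needs more care than this (nested quotes, apostrophes, quotes inside tags), which is presumably where most of the tweaking goes.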

More info for the baen2lit script

My script rewrites the .htm files that are part of the Baen eBooks. If you didn't download the .zip file version, you may want to save a copy of the files before running the script. As far as I've seen, re-running the script on an already-enhanced file does no harm, though.
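
One way to save a copy of the files first is a small shell helper like the one below. It is only a sketch: the function name backup_htm and the default directory name "originals" are made up for this example and are not part of baen2lit.

```shell
# Hypothetical helper (not part of baen2lit): copy every .htm file in the
# current directory into a backup subdirectory before the files are rewritten.
backup_htm() {
    backup_dir="${1:-originals}"    # "originals" is an assumed default name
    mkdir -p "$backup_dir" || return 1
    for f in *.htm; do
        [ -e "$f" ] || continue     # the glob matched nothing
        cp -p "$f" "$backup_dir/" || return 1
    done
}
```

You could then run something like `backup_htm && perl baen2lit *.htm` from the book's directory, so that the script only runs if the copy succeeded and only the .htm files are named on the command line.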

Conclusion

If you encounter any problems converting a book, feel free to contact me about it. If necessary, send me a small excerpt of the raw HTML that the script is having trouble with and I'll try to fix it. Or, better yet, send me any improvements that you've made to the script.
