`org.paneris.bibliomania`
Requirements Analysis and Plan

(document $Revision: 1.6 $)

This document provides rough notes about some aspects of the requirements which the system is intended to meet, and indications as to the outstanding requirements. See also the system's master QA document.

Background

The project has evolved, and its documents have been retrofitted to match, so this document describes the current state of the system and the outstanding requirements. In addition, the business model has changed from an advertising- and associate-based model to a direct selling model.

The customer

Bibliomania Ltd is an Oxford-based private limited company.

Project genesis

Bibliomania has been in existence for some while. The current site evolved out of a plain HTML site.

Business need

The site has had a large injection of capital, and the pressing business need is to establish a sizable income stream. Income is expected to be generated from the following:

Sale of paper books
Sale of electronic books
- Word format
- .lit format
- pdf
- Palm format
Sale of MP3 format lectures
Sale of CD of site
Advertising
Associate payments from other booksellers

See http://www.memoware.com/mw-helpm.htm for a useful list of converters.

Application concepts

Users

The users who are going to interact with the system fall into the following categories:

System Administrators
Text preparation people
Bibliomania editorial team
Registered users
Unregistered users

Pagination

The input texts are split into files by chapter. The Chapter files are then split into pages using TeX to determine the page breaks.

Old URL Redirection

See 30221, 30224, 30230, 30267.

The old site used directory-based URLs of the form Section/Author/Book/. Initially it would be good to ensure that redirects from these URLs are case-insensitive. In the longer run the noframes sites should revert to the path-based URLs.
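One way to make the old-URL lookup case-insensitive is to index the known paths by their lower-cased form. The following is only a sketch: the function names and the sample path are hypothetical, and it says nothing about how the servlet would actually issue the redirect.

```python
from typing import Dict, Iterable, Optional


def build_redirect_index(canonical_paths: Iterable[str]) -> Dict[str, str]:
    """Map the lower-cased form of each known path to its canonical spelling."""
    return {path.strip("/").lower(): path for path in canonical_paths}


def resolve_old_url(requested: str, index: Dict[str, str]) -> Optional[str]:
    """Return the canonical path for an old-style URL, ignoring case.

    Returns None when the path is unknown, so the caller can decide what to serve.
    """
    return index.get(requested.strip("/").lower())


# Hypothetical Section/Author/Book/ path, for illustration only.
index = build_redirect_index(["Fiction/Dickens/GreatExpectations/"])
print(resolve_old_url("/fiction/DICKENS/greatexpectations", index))
```

Leading and trailing slashes are ignored in the comparison, so a request with or without the final slash resolves to the same canonical path.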

Constraints

Platform

The external factors influencing the choice of OS, language, hardware, software are:

Price and performance: Only free, Open source software is to be used.

Performance

The external factors influencing the performance required of the system are:

The system must be able to support 20,000 hits per day.
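To put the figure in perspective, the arithmetic below converts 20,000 hits per day into an average rate per second. The peak factor of 10 is purely an assumption for illustration; no measured traffic profile is given in this document.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_hits_per_second(hits_per_day: int) -> float:
    """Average request rate if hits were spread evenly over the day."""
    return hits_per_day / SECONDS_PER_DAY


def peak_hits_per_second(hits_per_day: int, peak_factor: float) -> float:
    """Rate during a busy period, given an assumed peak-to-average ratio."""
    return average_hits_per_second(hits_per_day) * peak_factor


print(round(average_hits_per_second(20_000), 2))               # 0.23
print(round(peak_hits_per_second(20_000, peak_factor=10), 2))  # 2.31
```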

Security

The external factors influencing the security required of the system are:

The system should be maintained in an up-to-date fashion, so that loopholes and known exploits are guarded against.

Robustness

The external factors influencing the robustness required of the system in the face of poor user/admin input are:

The system should not allow incorrect usage to mess it up.

Stability

The external factors influencing the stability required of the system are:

Stability is not crucial; however, the caching mechanism means that the site continues to function even when the servlet runner has died.

Required Functionality

Personalisation

There has been a request to use cookies to enable people to return to where they were and/or bookmark pages. If the site were not frames-based then users could use their browser's normal bookmark facility, which I think would be better.

noFrontPage and Pagination

There appears to be a small bug, where pagination fails for a book without an index.wm even when the book has noFrontPage set, unless the book also has paginated set.

Searching

The FTI system for searching books must have the following features:

The ability to search for contiguous phrases as well as individual words is considered essential.
End-of-word wildcards (for rough stem searching) are considered nice for advanced users but not critical.
Contextual presentation of search results, as Google and htdig do but Altavista does not, is considered essential.
Restriction to particular areas: a single author, book or chapter.
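The first three features can be illustrated with a small regular-expression sketch. This is not how the FTI system works internally; it only shows the intended behaviour for contiguous phrases, end-of-word wildcards (written here as a trailing `*`, which is an assumed syntax) and contextual snippets. Restriction to an author, book or chapter is not covered.

```python
import re
from typing import List


def find_in_context(text: str, query: str, width: int = 30) -> List[str]:
    """Return a snippet of surrounding text for every match of `query`.

    The words of `query` must appear contiguously and in order. A trailing
    `*` on a word matches any word ending (a rough stem search).
    """
    parts = []
    for word in query.split():
        if word.endswith("*"):
            parts.append(re.escape(word[:-1]) + r"\w*")
        else:
            parts.append(re.escape(word))
    if not parts:
        return []
    pattern = re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append(text[start:end].strip())
    return snippets


# Made-up sample text, for illustration only.
sample = "A tale of great expectations and more."
print(find_in_context(sample, "great expect*", width=10))
```

Because the pattern is anchored on word boundaries, a search for "great" does not match inside "greater" unless the wildcard form "great*" is used.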

Search Interface

There are two separate searches on the system: one on the text of the books, the other on the data held about the books. The two searches are identical in functionality but take their source from the texts and the database respectively.

If the user selects '0' as the number of hits per document then the default (5) is used.
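The rule above amounts to a one-line substitution, sketched here. The function and constant names are hypothetical; only the default of 5 comes from this document.

```python
DEFAULT_HITS_PER_DOCUMENT = 5  # default stated in the requirement above


def effective_hits_per_document(requested: int) -> int:
    """Treat a request for 0 hits per document as a request for the default."""
    return DEFAULT_HITS_PER_DOCUMENT if requested == 0 else requested


print(effective_hits_per_document(0))   # 5
print(effective_hits_per_document(12))  # 12
```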

Shop

A standard PanEris shopping trolley is grafted on to Bibliomania by adding a book key to the Product table.

Multiple Fulfillment Centers

We intend to start adding many more books by a range of different publishers. The fulfillment partner will be different depending on both the publisher and the location of the purchaser.

We will try to keep the combinations to as few as possible. At the moment the possibilities are as follows:

Publisher	Customer	Fulfillment
Stratus	Non-US	Stratus London
Stratus	US	Netpub, US
Continuum	Non-US	Gardners
Continuum	US	Continuum US
Others	Non-US	Gardners
Others	US	Ingrams

So the program that generates the purchase order sent to the fulfillment centre will need to recognise both the publisher and the location (country) of the purchaser. In fact it will only need to distinguish between US and non-US delivery addresses.
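The table above can be read as a lookup keyed on publisher and on whether the delivery address is in the US. The sketch below assumes the delivery country is available as a country code string such as "US" or "GB"; that representation and the function name are assumptions, not part of the existing shop code.

```python
# (publisher, is_us) -> fulfillment centre, taken from the table above.
FULFILLMENT = {
    ("Stratus", False): "Stratus London",
    ("Stratus", True): "Netpub, US",
    ("Continuum", False): "Gardners",
    ("Continuum", True): "Continuum US",
}

# Fallback for every publisher not listed explicitly ("Others" in the table).
OTHERS = {False: "Gardners", True: "Ingrams"}


def fulfillment_centre(publisher: str, country_code: str) -> str:
    """Choose the fulfillment centre for a purchase order."""
    is_us = country_code.strip().upper() == "US"
    return FULFILLMENT.get((publisher, is_us), OTHERS[is_us])


print(fulfillment_centre("Stratus", "GB"))           # Stratus London
print(fulfillment_centre("Some Other Press", "us"))  # Ingrams
```

Keeping the combinations in one table means a new publisher agreement only needs new rows, not new branching logic.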

For now we intend to use secure trading for processing all purchases.

Products

Paper books from Stratus

Currently only books from the publisher Stratus are sold.

Word versions of Study guides

Word versions of Books

MP3 versions of lectures

It is intended to record lectures given by selected Oxford academics. These lectures are to be recorded on a handheld MP3 recorder and will be associated with the appropriate Study guide.

Bibliomania on CD

See Bibliomania CD and its thread.

The requirement for a Bibliomania CD is still fluid, but the idea is that one should be able to sell a CD which replicates, as nearly as possible, the experience of the site.

Buying books from Associates

Bibliomania did not originally have its own physical fulfilment service, but is affiliated with the major online book retailers; the Comparison shopper uses the toolkits they provide to offer a tailored and context-sensitive service for Bibliomania's user base.

Desirable goals:

Stopping people getting sucked away by the booksellers' sites. It's likely that the affiliate toolkits make it easy up to a point to put together (for example) "Amazon-like" pages which still have Bibliomania content/links on.
Context-sensitivity. The "Buy" links which lead from the book content pages to the shop should be tailored according to the context at the appropriate level. For instance, if you are looking at the Dickens page, you get taken to a page of links to the retailers' own Dickens pages. At a lower level, if you are looking at Great Expectations, you are led to their information about Great Expectations. At a higher level, you might be taken to their Classics page (if there is one).
Tracking user activity. Exactly what's possible/interesting depends on the details of the affiliate programs.

Currently the comparison shopper is not active; it should be set up as a separate site.

Access to essays and study guides

Essays and study guides are not sold through the site. They are available for free, but only to registered users. The demand-driven login capability of Melati (PanEris's application framework) is well suited to providing this protection; the functionality will be the same as that of paneris.org.

Styling

Frames site

The current, frames based design evolved from one designed by Januzzi Smith. It is now generated when alterations are made to the database. There are mechanisms to ensure that pages are only viewed within the frameset.

No Frames site

See No frames Version

There is a requirement to create a new version of the site, which may replace or live alongside the current version. This version is to be navigable by the old-style URLs and is to employ a tree-based navigation scheme so that no frames are required.

Validation and QA

Text Validation

The current texts are HTML fragments, whose only validation has been that they look OK in a browser and that they can be interpreted by the HotJava parser. All texts should be validated using an SGML validating parser and the errors corrected.

Data Validation

There is a requirement to produce pages showing the state of the db. For example: a page showing, for all nullable fields in the database, the percentage which are not null, possibly in graphical form; a page showing all Books with or without a TOC; and a page showing Units which have been modified since they were last encached.
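The first of these reports could be computed along the following lines. This is a sketch only: SQLite stands in for whatever database the site really uses, the schema is discovered through SQLite's `PRAGMA table_info`, and any table or column names passed in are the caller's.

```python
import sqlite3
from typing import Dict


def non_null_percentages(conn: sqlite3.Connection, table: str) -> Dict[str, float]:
    """Percentage of rows holding a non-null value, for each nullable column."""
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    total = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    result = {}
    for _cid, name, _type, notnull, _default, _pk in columns:
        if notnull:
            continue  # declared NOT NULL, so not interesting for this report
        # COUNT(column) counts only the rows where the column is not null.
        filled = conn.execute(f'SELECT COUNT("{name}") FROM "{table}"').fetchone()[0]
        result[name] = 100.0 * filled / total if total else 0.0
    return result
```

The resulting dictionary maps column name to percentage, which is enough to drive either a plain table or a simple bar chart.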

Cache Validation

Although the cache is deleted prior to a complete re-run, it is still possible for webmacro errors to slip through. A script to grep the cache for webmacro errors is required.
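Such a script could be as small as the sketch below. The text that webmacro leaves in a page when it fails is not described in this document, so the error pattern is a parameter that must be supplied by whoever knows what the errors look like; the function name is hypothetical.

```python
import re
from pathlib import Path
from typing import Iterator, Tuple


def scan_cache(cache_dir: str, error_pattern: str) -> Iterator[Tuple[Path, int, str]]:
    """Yield (file, line number, line) for every cached line matching the pattern."""
    regex = re.compile(error_pattern)
    for path in sorted(Path(cache_dir).rglob("*")):
        if not path.is_file():
            continue
        # latin-1 can decode any byte sequence, so odd encodings cannot abort the scan.
        with path.open(encoding="latin-1") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

Run after a complete re-run of the cache, an empty result would indicate that no page contains the supplied error pattern.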

Changes to Input Files

Book Table of Contents

The system looks for a file called index.wm in the book directory. If that file does not exist then a default TOC is created from the book's chapters. In many instances the index.wm is redundant, in as much as it is no different from the generated text, and so can be deleted.
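The fallback, and a check for redundant index.wm files, might be sketched as follows. The format of the default TOC used here (a numbered list of chapter titles) is an assumption made for illustration; the real generated text will differ, and the function names are hypothetical.

```python
from pathlib import Path
from typing import List


def default_toc(chapter_titles: List[str]) -> str:
    """Build a default TOC from the chapter titles (assumed format)."""
    return "\n".join(f"{n}. {title}" for n, title in enumerate(chapter_titles, start=1))


def book_toc(book_dir: str, chapter_titles: List[str]) -> str:
    """Use index.wm if the book directory has one, else the default TOC."""
    index = Path(book_dir) / "index.wm"
    if index.exists():
        return index.read_text(encoding="latin-1")
    return default_toc(chapter_titles)


def index_is_redundant(book_dir: str, chapter_titles: List[str]) -> bool:
    """True when index.wm exists but is no different from the default TOC."""
    index = Path(book_dir) / "index.wm"
    if not index.exists():
        return False
    return index.read_text(encoding="latin-1").strip() == default_toc(chapter_titles).strip()
```

Running the redundancy check over every book directory would give a list of index.wm files that are candidates for deletion.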

Foreign Language texts

We already have Collected French Verse, which has issues with regard to character encodings. These problems need to be addressed either by replacing high-ASCII characters with SGML entities or by getting the encoding right.
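The entity-replacement option could look like the sketch below, which uses Python's standard `html.entities.codepoint2name` table. The function name is hypothetical, and the assumption in the usage lines that the source bytes are Latin-1 would need checking against the actual files.

```python
from html.entities import codepoint2name


def to_sgml_entities(text: str) -> str:
    """Replace every non-ASCII character with a named or numeric entity."""
    out = []
    for char in text:
        code = ord(char)
        if code < 128:
            out.append(char)  # plain ASCII, including existing markup, is left alone
        elif code in codepoint2name:
            out.append(f"&{codepoint2name[code]};")
        else:
            out.append(f"&#{code};")  # no named entity, fall back to numeric
    return "".join(out)


# Assumes the raw bytes are Latin-1 encoded.
raw = b"No\xebl"
print(to_sgml_entities(raw.decode("latin-1")))  # No&euml;l
```

Only characters above the ASCII range are touched, so the HTML fragments keep their existing tags and entities unchanged.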

We also have permission to publish a large number of Brazilian (Portuguese) documents (see Brazilian texts).

I am of the opinion that we should build separate sites for each language, i.e. remove the French text and set up two separate sites for French and Portuguese.

Changes to database structure

Additional Fields

All Unit fields should have the user who last modified them added.

Redundant Fields

User	Description
Chapter	Oldtextid

Open issues

See 30471.
A more general tree structure should be considered, to avoid reference books being input as authors and to remove the need for 'nofrontpage' flags.

Junk mailer

See Email Lists. A function to select users who have opted into email alerts and send them emails.

System Administration Requirements

Backups

Currently TimP takes a snapshot of the source data and writes it to CD. Ideally a snapshot of the cache, a dump of the db and the whole of the CVS tree would also be backed up.

Visitor statistics

To sell advertising we need statistics on page impressions broken down by Sectiongroup.

Search recording

A record of the searches made on the engine needs to be kept to give us an idea of its typical usage. This record should also give an indication of the number of hits found.
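A minimal in-memory sketch of such a record is given below. The class and method names are hypothetical, and a real implementation would presumably persist the records in the database rather than in a list.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class SearchRecord:
    when: datetime
    query: str
    hits: int  # number of hits found for this search


class SearchLog:
    """Keeps one record per search so that typical usage can be summarised."""

    def __init__(self) -> None:
        self.records: List[SearchRecord] = []

    def record(self, query: str, hits: int, when: Optional[datetime] = None) -> None:
        self.records.append(SearchRecord(when or datetime.now(), query, hits))

    def most_common(self, n: int = 10) -> List[Tuple[str, int]]:
        """The n most frequent queries, compared case-insensitively."""
        return Counter(r.query.lower() for r in self.records).most_common(n)

    def zero_hit_queries(self) -> List[str]:
        """Queries that found nothing, which may point at gaps in the texts."""
        return sorted({r.query.lower() for r in self.records if r.hits == 0})
```

Recording the hit count alongside each query makes it possible to list the searches that found nothing as well as the most popular ones.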

Bibliomania Email domain

Bibliomania currently have a number of email addresses administered by Maytech. These need to be transferred to a domain under Bibliomania's control. We need to determine which machine this should be: the dev machine or the live machine.

The following email addresses are required:

Alan@bibliomania.com
Marianne@bibliomania.com
David@bibliomania.com
Kate@bibliomania.com
Anna@bibliomania.com
TimP@bibliomania.com
TimJ@bibliomania.com
JamesW@bibliomania.com
Ask@bibliomania.com
comments@bibliomania.com

Messageboard

Messageboard Listing

A list of all messageboards available, browsable from top down is required.

Messageboard Replies

Replies to emails from the messageboards are not currently set up.

Messageboard Relationship to Content Hierarchy

See message 37820.

Top level boards:
Comments - General comments
Press - Press releases
Technical - The technical side of bibliomania

Group related boards:
Read = Favourite Classics - Tell fellow Bibliomaniacs your favourite classic texts
Study = General Revision - General revision/help questions
Shop = Book Shop - Discuss what authors/books you would like to see in the shop

Section related boards:
Teachers - Share experiences/resources in teaching English, literature and e-education (this board already exists)

I propose that we enable the group boards to be seen, adding:
Research - Share tips and links with other researchers
Search - Assistance with finding out of print resources

About this document

Authors

William Chesters <williamc@paneris.org>

Most recent CVS $Author: timp $ @paneris.org

Readership and purpose

The customer should feel confident that the work will be useful, and understand its scope, if this is considered relevant: it is.
The project leader should feel happy about taking responsibility for leading the spec of the system.
The developers should feel informed about the broad picture surrounding what each of them is doing.
Future maintainers should be able to understand the motivation behind the system so that it falls into place easily for them.

History

The important points in the life of this document are listed below (for detailed change history consult its CVS log).

$Log: RequirementsAndPlan.html,v $
Revision 1.6 2001/07/29 15:03:05 timp
Add general boards comments

Revision 1.5 2001/07/28 14:41:06 timp
Tidy up

Revision 1.4 2001/07/27 02:13:17 timp
Typo

Revision 1.3 2001/07/27 01:52:12 timp
Collate all requirements from messages into one document.
Needs further work.

Revision 1.2 2000/05/13 16:47:46 williamc
Update after discussion with Alan (some days before ...)
