How we maintained the GCNA Website
This page was written in 1999, and its textual content was last updated in 2009,
before the second redesign of the GCNA Website.
It has not been revised to reflect that major change (and the accompanying
personnel changes) nor the migration of the pages of the Data section
of the GCNA Website to the TowerBells Website in March-April 2012.
Nevertheless, those sections of this page which do not explicitly reference
the GCNA remained relevant, and many others remained accurate,
from that time until mid-2021
if the words "GCNA Website" are read as "TowerBells Website".
This page remains relevant as historical background for the current (post-June 2021)
description of how this Website is now maintained.
Maintenance of the GCNA Website is shared by two members
of the committee appointed by the Guild for that purpose.
This page presents a moderately detailed overview of how some of that work is done,
for the enlightenment of those visitors who are interested in such things.
It also serves to provide documentation of the technology in use for this work.
Finally, it explains why some requested updates don't get done as promptly as
some might like to see.
Host system
The computer system which hosts this Website belongs to a commercial enterprise,
whose services were contracted
by Wylie Crawford.
It provides the basic hardware/software infrastructure
and the Internet access point.
The infrastructure includes the host computer and its operating system (Unix),
utilities and disk storage management, plus Web server software (Apache).
It supports long, mixed-case filenames;
these are case-sensitive (unlike the previous host, a Windows NT system,
which supported long filenames but was not case-sensitive).
Public access is unrestricted for display of Webpages and downloading of
selected files; maintenance access for adding or updating Webpages and other files
is restricted to authorized personnel.
Main pages
The "main pages" on this Website are those Webpages
which are in the home directory of the Website.
For the most part, they have to do with the Guild itself,
its operations and activities.
They include the GCNA Home Page
and all others which have Uniform Resource Locators (URLs) in which
the domain name "www.gcna.org" is immediately followed by a solidus (/)
and then a filename.
Most of these pages were originally designed and maintained by Norman Bliss,
but all are now maintained by Carl Scott Zimmerman.
Each page has his email address (csz_stl@swbell.net)
at the bottom of the page,
since he should be contacted for any questions, suggestions or comments
which you may have about that page.
No matter how they were originally constructed,
maintenance of these pages is now done using a sophisticated HTML-aware
text editor, operating on an offline mirror of the Website.
After verification of a set of changes,
all revised pages are immediately uploaded directly to the GCNA Website.
For small changes, this update process can be completed in less than an hour
from the time the Webmaster receives a request from a Guild officer or committee;
more complex changes will naturally take longer.
(For the technorati:
The offline mirror for this part of the GCNA Website
is resident on an Apple Macintosh PowerPC G4 computer
running OS X 10.3.9.
The HTML editor is BBEdit 8.2.6.
The FTP utility used for upload is Fetch 5.3.)
Displayed as parts of the main pages are various images.
They may serve as stylistic elements (background or page heading)
or as illustrations for portions of the text.
All are graphic files which reside in a subdirectory of the Website.
Data pages
The "data pages" on this Website are those Web pages which are
in the "data" subdirectory or below it,
i.e., those which have URLs of the same form as that of the
page you are now reading.
Specifically, they all have the "/data/" directory name
in the middle of their URLs.
These pages have to do not with the Guild itself but with the instruments
which are the reason for the Guild's existence -
carillons and their "relatives" in the world of tower bells.
"World" is not only symbolic but geographically literal, because these pages
are derived directly or indirectly from the author's database
of carillons of the world.
These pages are maintained offline by him,
using a pair of computers with the technologies described below,
and uploaded to the GCNA Website.
Like the main pages, all have his email address (csz_stl@swbell.net)
at the bottom of the page,
since he should be contacted for any questions, suggestions or comments
which you may have about such pages.
The data pages may be further subdivided into text data pages,
site data pages and index pages.
There are also a very few hybrid data pages.
Text data pages
Text data pages provide the framework within which all data fits -
introductory and general descriptive material, as well as access to the
various kinds of indexes.
Text data pages all have mixed-case filenames ending in ".html",
e.g., "Data_Top.html".
They were composed and are maintained using a methodology
just like that used for maintaining the main pages (see above),
though on a different computer.
(For the technorati:
The offline mirror for this part of the GCNA Website
is resident on an Apple Power Macintosh 9600/200MP computer (Mac)
running MacOS 8.1.
The HTML editor is BBEdit 6.1.
The FTP utility used for upload is Fetch 3.0.1.)
Text data pages may be uploaded directly to the GCNA Website
whenever they are revised,
usually accompanied by an appropriately descriptive addition to the
What's New page.
Site data pages
Site data pages are the reason for the existence of the /data/
section of this Website,
since each such page describes one carillon or other tower bell instrument
somewhere in the world.
All site data pages have uppercase filenames ending in ".HTM",
e.g., "CODENVUD.HTM", and are derived from a database which is
resident on an IBM-compatible personal computer (PC),
using a specially-written Turbo Pascal program which runs under DOS.
The process of producing these pages works as follows:
- Run the extraction program to select
information from the site database (further described below)
and organize it into HTML files (Web pages) that reflect
all substantive changes to the database since the last such run;
one of those pages will be a revision index (see below).
- Copy the generated files from the PC to the Mac via a diskette,
a transfer method sometimes referred to as "sneakernet".
- For each transferred file, apply the HTML editor as follows:
  - If it is a revised site data page, use the file-compare feature
    to find and copy forward any information which was manually added
    in previous versions and remains relevant.
    This includes all site-specific internal and external Weblinks.
  - If revisions to the site data page necessitate changes
    to any indexes which reference it, update those indexes appropriately.
  - If it is a new site data page,
    add the appropriate site-specific internal Weblinks, using standard templates;
    also copy the revision index entry for this site
    into all relevant existing index files, with appropriate adaptations for each.
  - Convert from DOS format (CR-LF line ends)
    to Macintosh format (CR line ends).
    (Both are different from Unix format, which has LF line ends.)
    This is a single-click process within BBEdit.
  - Convert any remaining characters with diacritical marks
    (such as á,é,É,ç)
    from what works on the PC to the equivalent Macintosh character.
  - If it is a site data page, and new external Weblinks have
    been found for that site, incorporate those into the page.
- Add an appropriate descriptive entry to the "What's New" page,
including a link to the revision index.
- Verify the consistency of all internal links, and verify that
all external links are still current.
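The internal-link check in the last step lends itself to partial automation. The following Python sketch is not the tool described on this page (the check is described only as part of the editing work); it assumes that the offline mirror is a single directory of HTML files and that internal links are plain relative "href" values. Names such as broken_internal_links are invented for illustration. The comparison is exact, because the Unix host treats filenames as case-sensitive.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def broken_internal_links(page_html, existing_names):
    """Return relative link targets that are not in existing_names.

    The match is exact, so "codenvud.htm" does not satisfy "CODENVUD.HTM".
    """
    collector = HrefCollector()
    collector.feed(page_html)
    broken = []
    for href in collector.hrefs:
        parts = urlparse(href)
        if parts.scheme or parts.netloc:
            continue  # external link (http:, mailto:, ...); not checked here
        if not parts.path:
            continue  # pure fragment such as "#top"
        if parts.path not in existing_names:
            broken.append(parts.path)
    return broken


def check_directory(mirror_dir):
    """Check every page in one directory; links into other directories
    (e.g. "../index.html") are outside the scope of this sketch."""
    directory = Path(mirror_dir)
    names = {p.name for p in directory.iterdir() if p.is_file()}
    report = {}
    for page in sorted(directory.iterdir()):
        if page.suffix.lower() in (".html", ".htm"):
            broken = broken_internal_links(
                page.read_text(encoding="latin-1"), names)
            if broken:
                report[page.name] = broken
    return report
```

Verifying that external links are still current would additionally require network access and is left out here.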
Once the batch of new and revised pages is ready, the pages are uploaded
to the GCNA Website together.
Then BBEdit is used to find all of the email addresses in the batch,
in order to send a standard notification message to as many of those sites
as possible; a copy of that message also goes to other interested persons.
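The address-gathering step is done in BBEdit, but the idea can be illustrated in a few lines of Python. This is a sketch under two assumptions that the page does not state: that site contact addresses appear in the pages as "mailto:" links, and that the maintainer's own address should be left out of the notification list.

```python
import re

# Capture the address part of a mailto: link, stopping at a quote,
# a "?" (start of ?subject=...), a ">" or whitespace.
MAILTO = re.compile(r'mailto:([^"\'?>\s]+)', re.IGNORECASE)


def batch_addresses(pages, exclude=()):
    """Return the sorted, de-duplicated addresses found in a batch.

    pages   -- iterable of HTML page contents (strings)
    exclude -- addresses to leave out, compared without regard to case
    """
    skip = {address.lower() for address in exclude}
    found = set()
    for html_text in pages:
        for address in MAILTO.findall(html_text):
            if address.lower() not in skip:
                found.add(address.lower())
    return sorted(found)
```

For example, batch_addresses(pages, exclude=["csz_stl@swbell.net"]) yields each site contact once, however many pages in the batch mention it.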
Index pages
Website revision indexes are pages with names such as "IX990924.HTM".
They are relevant only to the batch of site data pages with which they are uploaded,
and they are referenced only from the
What's New page entry for the date of upload.
They are generated by the same Turbo Pascal program as the site data pages,
as part of the same data extraction process,
and are handled in the same manner (see above).
Each revision index includes links to the site data pages which were
generated in the same batch run.
Permanent index pages, which enable the finding of site data pages
using any of several different criteria, have mixed-case filenames
like text data pages, but are a composite of text plus database material.
Most began as special extracts from the database, in a format essentially
identical to revision indexes, to which appropriate text material was added.
Over time, these permanent indexes have been expanded by copy-and-paste
methods using revision indexes as the raw material (see above).
Normally, they are revised and uploaded only
in conjunction with the new or changed site data pages which are
associated with changes in the index lines which link to such pages.
This stricture is followed in an attempt to make sure that index pages
are never out of sync with the site data pages which they index.
(It is still possible for human error in editing an index page
to result in a discrepancy between the index and what is being indexed.
Please report such discrepancies as soon as you find them,
using the "mailto" link at the very bottom of the affected page.
Doing so will automatically generate an appropriate subject line
which identifies the page.
Please send a separate message for each discrepant page.)
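A "mailto" link can carry a pre-filled subject line by appending a "subject" parameter to the address. The Python sketch below shows how such a link might be built for a given page. The wording of the subject is hypothetical; this page does not say what text the real links generate, only that it identifies the page.

```python
from urllib.parse import quote


def mailto_link(address, page_name):
    """Build a mailto: URL whose subject line names the page."""
    subject = f"Comment on {page_name}"  # hypothetical wording
    # Spaces and other reserved characters must be percent-encoded.
    return f"mailto:{address}?subject={quote(subject)}"
```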
Hybrid site data pages
A very small number of site data pages have filenames ending in ".htm",
e.g., "DENEWCAS.htm".
These are for 6-bell rings in North America,
which are too small to be included in the database
but too important to be omitted as we support our sister organization,
the North American Guild of Change Ringers (NAGCR).
These pages were constructed by hand to resemble normal site data pages,
and are also maintained by hand.
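The filename conventions described in the preceding sections are regular enough that the kind of a data page can be read from its name. The Python sketch below summarizes them. The revision-index pattern ("IX" followed by six digits) is inferred from the single example "IX990924.HTM" and is an assumption, as is the function name.

```python
import re

# Assumed from the one example given, "IX990924.HTM".
REVISION_INDEX = re.compile(r"IX\d{6}\.HTM")


def classify_data_page(filename):
    """Classify a file in the data directory by its naming convention."""
    if REVISION_INDEX.fullmatch(filename):
        return "revision index"
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        return "unknown"
    if extension == "HTM" and stem == stem.upper():
        return "site data page"          # generated from the database
    if extension == "htm":
        return "hybrid site data page"   # built and maintained by hand
    if extension == "html":
        # Text data pages and permanent indexes share this convention.
        return "text data or permanent index page"
    return "unknown"
```

Because the comparison is case-sensitive, this works only on a host (or mirror) that preserves filename case, as the Unix host does.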
Database
Several aspects of the underlying database are particularly relevant
to the process of Website maintenance.
Special characters
Different types of computers represent character data in different ways,
so one of the most troublesome problems for computer users is managing the
translation of character data from one system to another.
American-made computers all share a common base of ASCII (American
Standard Code for Information Interchange), which includes upper and lower case
alphabetic characters, numeric digits, and common punctuation marks.
But they have differing ways of extending ASCII
to represent various "special" characters.
In the database for carillons of the world, the encoding of special characters
was designed for optimal display of western European languages on an
HP LJ IIIP laser printer.
As the data portion of the GCNA Website expanded to cover
areas beyond North America, procedures were developed to manage the
translation of these special characters from one system to another.
The extraction program translates most extended ASCII characters
(e.g., those with diacritical marks, such as á,é,É,ç)
within site data pages into HTML entities
so as to be independent of font and character set.
Those few special characters which are not handled this way
are corrected by hand in the editing step in which they are first seen on the Mac.
Thereafter, these alterations are carried forward semi-automatically
as described above.
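The translation of accented characters into HTML entities can be sketched in Python as follows. This is an illustration only: the real extraction program is written in Turbo Pascal and starts from the printer-oriented encoding described above, whereas this sketch assumes the text has already been decoded into ordinary Unicode strings.

```python
from html.entities import codepoint2name


def to_entities(text):
    """Replace every non-ASCII character with an HTML entity.

    Named entities (such as &eacute;) are used where HTML defines one;
    anything else falls back to a numeric reference. ASCII characters,
    including any HTML markup, pass through unchanged.
    """
    pieces = []
    for character in text:
        code = ord(character)
        if code < 128:
            pieces.append(character)
        elif code in codepoint2name:
            pieces.append(f"&{codepoint2name[code]};")
        else:
            pieces.append(f"&#{code};")
    return "".join(pieces)
```

With this function, the four sample characters á, é, É and ç become &aacute;, &eacute;, &Eacute; and &ccedil; respectively, which display correctly regardless of the character set in which the page is served.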
The Fetch utility automatically translates any remaining
extended ASCII characters from the Macintosh character set to the
ISO-8859-1 character set during the upload process.
Please report incorrect special characters (which may appear as asterisks or
other "garbage" characters) as soon as you find them,
using the "mailto" link at the bottom of the page where they appear.
Fetch also translates Mac line ends to Unix line ends during the upload process
(see above).
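The two translations performed during upload (Macintosh character set to ISO-8859-1, and Mac line ends to Unix line ends) can be imitated in Python as shown below. This is a sketch of the effect, not a description of how Fetch works internally; the function name is invented. Characters that have no ISO-8859-1 equivalent come out as "?", which is one way that "garbage" characters can arise.

```python
def mac_to_unix_latin1(data):
    """Convert Mac-encoded bytes with CR line ends to ISO-8859-1 with LF.

    data -- the raw bytes of a file saved on the Macintosh
    """
    text = data.decode("mac_roman")
    # Normalize CR-LF first so that it does not become two line ends.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Characters outside ISO-8859-1 are replaced by "?".
    return text.encode("iso-8859-1", errors="replace")
```

For instance, the Macintosh byte 0x8E (é) becomes the ISO-8859-1 byte 0xE9, and each CR becomes an LF.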
Database maintenance
Tracking of changes to the database is done using the two dates which are
presented near the bottom of every site data page.
These dates show when the textual and technical parts of each site's data
were last revised in the database.
They also control which sites will be extracted in each batch run,
when report-generator criteria such as "changed on date" or
"changed since date" are used.
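The date-driven selection can be pictured with the following Python sketch. The record layout and names are invented for illustration, and whether "changed since" includes the named date itself is not stated on this page; the sketch assumes that it does.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SiteRecord:
    site_id: str             # hypothetical identifier for one site
    text_revised: date       # last revision of the textual data
    technical_revised: date  # last revision of the technical data


def changed_since(records, cutoff):
    """Select sites where either date is on or after the cutoff."""
    return [r for r in records
            if max(r.text_revised, r.technical_revised) >= cutoff]


def changed_on(records, day):
    """Select sites where either date is exactly the given day."""
    return [r for r in records
            if day in (r.text_revised, r.technical_revised)]
```

Running a "changed since" selection with the date of the previous batch would pick up every site touched in the interim, which is what lets each batch reflect all substantive changes since the last one.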
All new and revised pages,
together with a concurrent revision of the What's New page,
are uploaded to the GCNA Website in one batch.
This minimizes the possibility that a visitor to the Website might follow an index link
to a site data page which did not agree with the content of the index.
PDF files
Linked from the Hardcopy data page
are various files in Adobe Portable Document Format (PDF),
intended to be downloaded for viewing and printing.
They are all contained in the "pdf" directory below the "data" directory,
i.e., they all have the "/data/pdf/" path name in the middle of their URLs.
All of these files originated as printable output from the database extraction
program, though some were produced by processing plain text files which are
not, strictly speaking, part of the database itself.
Those print files are copied from PC to Macintosh via sneakernet,
using a custom conversion utility which translates all special characters
to the Mac character set.
On the Mac, they are imported into a WordPerfect template document,
lightly edited to polish the pagination,
and "printed" to PDF files through a shareware PDF "printer" driver.
Precautions
No information extracted from the database is ever altered after extraction,
to minimize the risk of inconsistencies or loss of information.
(Occasionally, information from the database may be slightly re-formatted
for the sake of appearance.
Also, since the Technical data section of each site data page is
a limited interpretation of only certain aspects of the database contents,
some editorial changes are occasionally made to overcome those limitations
and present such information more accurately.)
Whenever information kept in the database is changed,
the affected pages are extracted anew.
(Note the distinction between information and its format.
Some changes in format are a necessary part
of the cross-platform transport process.)
Archiving
Each set of uploaded files is copied to archives on three different storage
devices for backup purposes.
One of those devices is part of a set which is periodically rotated off-site
to provide backup against natural disaster.
The others provide backup against device failure and/or human error.
Mirroring
A local mirror of the GCNA Website is maintained on the author's
primary Web-linked computer for test and reference purposes,
so that statistics of public access to the "real" Website are not biased by
access for development work.
Process timeline
The batch process which is used for extraction,
finishing and upload of site data pages
explains in part why additions and corrections sent to the maintainer
may not appear promptly on the GCNA Website.
Such changes are first collected and organized,
then used to update the database itself.
That process involves a different plain-text editor operating on one or
more of a set of card-image text files.
(This editor is Kedit 5.0, a PC-based equivalent of Xedit,
a powerful mainframe-based program.)
Then the extraction program is run to produce a printed report of the
changes, for purposes of proofreading.
After all changes have been confirmed as correct,
the extract and upload process described above can take place.
Although this may appear to be a tedious process,
it is vital to ensure that the information which we present
is as accurate and consistent as we can make it.
If we violated process rules for the sake of speed, we would not only
risk introducing discrepancies into what we publish, but also lose
track of what is current versus what is not,
or else lose the ability to proofread changes to the database effectively.
The same program used for data extraction is also used to produce
hardcopy reports.
This program and the database are the direct descendants of those used to produce the
very first published edition of Carillons of the World in 1979,
as well as a series of six articles for the Bulletin of the GCNA
in the same time frame.
Database history
As indicated under "Process timeline", above,
the database which underlies the "data" pages
of the Website is nothing more than a set of card-image text files,
maintained with a plain-text editor.
What is meant by "card-image" is that each record of a file contains nothing but
human-readable characters as if it were an 80-column punched card
of the sort that once was used on mainframe computers.
In its original form, the database was in fact a box of such cards,
and revision with a card punch machine was tedious.
The basic format of those original cards is still in use today,
just as it was designed more than 40 years ago.
Although it has been extended and expanded to add new categories of information,
it has never been necessary to change it.
Conversion from a box of cards to a set of files on a floppy disk, and now
a set of files on a hard drive, has greatly eased the maintenance process.
Not only can changes be made more easily and quickly, but the risk of error has
been considerably reduced, because the result of each change is immediately visible
on the computer screen.
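To make the term "card-image" concrete, the Python sketch below reads a text file as a sequence of 80-column records and rejects anything that could not have come from a punched card. It says nothing about the actual field layout of the database, which this page does not describe; the function name and the checks are assumptions.

```python
CARD_WIDTH = 80  # columns on a standard punched card


def read_card_images(text):
    """Split text into card-image records, each padded to 80 columns.

    Raises ValueError for a record that is too wide or that contains
    non-printable characters (tabs, control codes).
    """
    cards = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > CARD_WIDTH:
            raise ValueError(
                f"record {number} is {len(line)} columns wide")
        if not all(character.isprintable() for character in line):
            raise ValueError(
                f"record {number} contains a non-printable character")
        # Trailing blanks are implied on a card; make them explicit.
        cards.append(line.ljust(CARD_WIDTH))
    return cards
```

Because every record has the same width, a field can then be taken from fixed column positions, which is how card-oriented programs traditionally read their input.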
The program which processes the database has undergone considerably more change.
Originally it was written in the high-level programming language FORTRAN,
and itself resided in a box of punched cards.
Several variants of FORTRAN were used, as the program migrated from one mainframe
to another.
Eventually, the data (and the program) migrated from actual punched cards
to card-image files on reels of half-inch magnetic tape,
of the type that used to be common on mainframe systems.
When personal computers became not only affordable but also sufficiently powerful
to handle the processing requirements for the database, the program was completely
rewritten in Borland Turbo Pascal, another high-level programming language.
All of the fundamental logic of the original program was retained,
including the subprogram structures,
though a number of minor implementation details had to be changed.
It was at this point that both program and data became resident on direct-access
storage ("hard disk"), eliminating the need for either punched cards or
magnetic tape.
The increased ease of editing in this environment affected not only maintenance
of the data but also maintenance of the program.
As new categories of information were added to the database, changes
were made to the program to display that information appropriately.
The largest single change was the addition of an option to display information as
HTML files, i.e., Web pages.
Previously, all program output was in the form of print files,
i.e., files formatted for delivery to a printer for producing hardcopy.
Database extraction
The data processing program mentioned above, which is used for extraction of
information from the database, can be viewed conceptually as having three principal
components: a control statement interpreter, a data loader,
and a report generator.
Control statement interpreter
The control statement interpreter accepts input from either the console or
a control file.
Control statements can cause loading of a data file,
setting of various processing options,
selection of data (based on a wide variety of simple or complex criteria),
sorting of selected data (based on single or multiple parameters),
and generation of several different kinds of reports.
There is online help in the use and format of all control statements,
though some can only be used in control files.
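The general shape of such an interpreter can be sketched in Python as follows. The real program is written in Turbo Pascal and its statement syntax is not documented on this page, so the verbs (LOAD, SET, HELP), the option format and the file name used in the usage note are all hypothetical. The sketch shows only the idea of reading statements from either source and dispatching each to a handler.

```python
class ControlInterpreter:
    """Dispatch control statements to handlers (verbs are invented)."""

    def __init__(self):
        self.data_files = []  # files named in LOAD statements
        self.options = {}     # processing options from SET statements
        self.messages = []    # what a real program would display
        self.handlers = {
            "LOAD": self.do_load,
            "SET": self.do_set,
            "HELP": self.do_help,
        }

    def do_load(self, argument):
        self.data_files.append(argument)

    def do_set(self, argument):
        name, _, value = argument.partition("=")
        self.options[name.strip().upper()] = value.strip()

    def do_help(self, argument):
        self.messages.append(
            "statements: " + ", ".join(sorted(self.handlers)))

    def run(self, lines):
        """Interpret statements from any iterable of lines.

        Passing sys.stdin reads from the console; passing an open
        file reads from a control file.
        """
        for raw in lines:
            line = raw.strip()
            if not line:
                continue
            verb, _, argument = line.partition(" ")
            handler = self.handlers.get(verb.upper())
            if handler is None:
                self.messages.append(f"unknown statement: {verb}")
            else:
                handler(argument.strip())
```

Selection, sorting and report statements would be added as further entries in the handler table. A control file is then nothing more than a saved sequence of statements, e.g. "LOAD NORTHAM.DAT" followed by "SET FORMAT = HTML" (both invented examples).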
Data loader
The data loader reads one flat file and builds three temporary data structures,
one in memory and two on disk.
It can be invoked repeatedly to load any number of data files together,
subject only to the constraints of available space for the in-memory table.
(Loading additional flat files simply expands the three temporary data structures,
as if a single large flat file had been read.)
Unfortunately, the total number of carillons and chimes in the world is so large
that it is now impossible to load all data simultaneously.
This makes the production of certain types of reports impractical
(e.g., world-wide summaries).
The in-memory table contains all of the condensed technical information found
in the flat file, some of it converted from character strings to integers.
This table also has forward and backward pointers which connect all of the rows
which describe a particular tower bell instrument, with each row describing a
particular stage in the instrument's history.
The newest (or only) row for a particular instrument always describes its
present (or last known) configuration, and is considered the primary record;
other rows are secondary to it, in reverse chronological order.
One of the on-disk temporary data structures is a sequential list of the
condensed technical information records found in the flat file,
in un-converted form.
The in-memory table has pointers to the records in this list,
which are used to produce Condensed Information Listing (CIL) reports.
The other on-disk temporary data structure is a sequential list of the
textual information records found in the flat file, in their original form.
The in-memory table also has pointers to the records in this list,
which are used to produce Master Information Listing (MIL) reports, etc.
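The linked in-memory table can be pictured with the Python sketch below. It is a loose model, not the program's actual layout: the field names are invented, list indexes stand in for pointers, and ordering the rows by a "year" field is an assumption, since this page does not say how the program determines which row is newest.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    site_id: str                 # hypothetical instrument identifier
    year: int                    # hypothetical: year of this stage
    cil_index: int               # position in the on-disk condensed list
    mil_index: int               # position in the on-disk textual list
    newer: Optional[int] = None  # backward pointer, toward the primary row
    older: Optional[int] = None  # forward pointer, to the next older stage


def link_rows(table):
    """Chain the rows of each instrument, newest first.

    Returns a dict mapping each site_id to the index of its primary
    (newest or only) row.
    """
    by_site = {}
    for index, row in enumerate(table):
        by_site.setdefault(row.site_id, []).append(index)
    primaries = {}
    for site_id, indexes in by_site.items():
        indexes.sort(key=lambda i: table[i].year, reverse=True)
        primaries[site_id] = indexes[0]
        for newer, older in zip(indexes, indexes[1:]):
            table[newer].older = older
            table[older].newer = newer
    return primaries


def history(table, primary):
    """Follow the forward pointers from a primary row to the oldest."""
    chain = []
    index = primary
    while index is not None:
        chain.append(index)
        index = table[index].older
    return chain
```

A report generator would walk such a chain and use cil_index or mil_index to fetch the corresponding record from the appropriate on-disk list, so that only the compact table needs to stay in memory.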
Report generator
What is conceptually a single report generator is actually a collection
of special purpose report generators.
It is trivially simple to combine several different reports into a single
printout, or a print file which can be used to make a PDF file.
Several of the possible reports are described on the
Hardcopy page, and several combinations of them
are available there as PDF downloads.
The production of Web pages (both site data pages and index pages,
as described above) is accomplished with one of these report generators.
The future of the database
From time to time, various people have urged the author to convert to use of
a conventional relational database program (e.g., Microsoft Access).
The motivation for such a conversion is that it would enable the database to
be maintained by anyone who has a reasonable degree of expertise in the use
of such a program.
Of course that is in principle an excellent idea; very few (if any) people have
the author's peculiar combination of expertise
in relatively obscure programming languages and data design,
and so such a conversion would make continuation of the author's work
after his inevitable death a relatively straightforward matter.
Unfortunately, from the author's viewpoint it is not such a good idea
(at least not just now), for two reasons.
Firstly, the present organization of the data files does not fit the requirements
of relational databases.
Relational database programs (RDBs) are excellent for handling data
which fit the constraints of the relationships for which they are designed,
and many kinds of commonly used data do that.
Nevertheless, RDBs are not a panacea.
On the one hand, they are unnecessarily complex for managing small quantities
of simple data.
On the other hand, there are data relationships which can be forced into
the "relational" mold only with great difficulty, requiring extraordinary
contortions in design and programming and maintenance.
In the author's professional opinion, his existing set of data
regarding carillons and chimes falls into this category.
It seems highly probable that significant information might actually be lost
in the course of conversion to such a database.
Secondly, such a conversion would have almost no direct benefit for the author,
and indeed would be to his detriment.
The total redesign of the database, the conversion of the existing
data into an entirely new format, and the construction and debugging of
an entirely new report-generator system would be a very large task that
would contribute nothing to the current project for extending the scope
of the Website to cover all of the carillons and chimes of the world.
Indeed, the time and effort required would considerably delay that project,
the basic data for which already exists in the present format.
The one possible direct benefit to the database owner from such a conversion
would be the ability to view the entire database at once
for production of worldwide summaries and reports.
At present, that is an insufficient incentive for the author.
/signed/ Carl Scott Zimmerman, database owner.
This page was created on 1999/10/04 and last revised on 2021/08/07.
Please send comments or questions about this page to
csz_stl@swbell.net.