The Clementine Vulgate Project: source files
This repository contains part of the source data that underpins the Clementine Vulgate Project. It will be of interest to anyone who wants to manipulate the text or modify the accompanying software.
The project is split into four git repositories:
- VulSearch4: the source code to the Windows .NET program VulSearch 4; also serves as a container for the other repos.
- Text: just the Clementine text in its raw format (see below).
- Scripts: various scripts used to convert the text into different formats for publication on the website.
- Website: the source of the project website on Sourceforge; only of interest to the project maintainer.
Downloading the source
To download everything, the simplest thing is to do a
git clone https://bitbucket.org/clementinetextproject/vulsearch4.git VulSearch4
then in the resulting directory run
Of course, you can also clone the repositories individually, but some of the scripts expect a directory structure that looks like
.
├── Scripts
├── Text
└── Website
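A quick way to confirm that a checkout matches this layout is sketched below in Python. The function name `missing_repos` is made up for this example and is not part of the project's scripts; the only thing taken from this README is the three directory names.

```python
from pathlib import Path

# Directory names the scripts expect to find side by side (from the tree above).
EXPECTED = ("Scripts", "Text", "Website")


def missing_repos(root):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).is_dir()]
```

Calling `missing_repos(".")` from the top-level directory returns an empty list when all three repositories are present.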
The raw format of the text is an embarrassing artefact of history: the markup is described below.
Versions of the text
The first complete proof-read text was released on April 3rd 2005. Subsequent corrections were kept in a Subversion repository on Sourceforge. Unfortunately, technology changed, and Sourceforge's Subversion server stopped operating. I've replayed the Subversion commits up to 2006 into this git repository; there is then a gap until 2009, after which individual dated changes can be seen in the RSS feed as it stood when the git repo was created. Later corrections are recorded in the git repository itself.
Description of the markup
The text is plain text, codepage 1252, with DOS-style line endings.
Commas and periods have no space before and a single space after
(unless they end a line; there is never a space at the end of a verse).
The marks : ; ? ! each have a single space before and a single space
after (unless they end a line). In general, the first word of a verse is
not capitalized, nor the first word of a line of poetry, but the first
word of a sentence, as well as the first word of direct speech or
quotation, is capitalized.
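The spacing rules above lend themselves to a mechanical check. The following Python sketch is an illustration only, not one of the project's scripts. It assumes it is handed the text of a single verse with any verse reference already stripped, and it lets a markup character (`]`, `/` or `\`) follow punctuation directly, since the description does not say whether a space precedes those characters. The names `read_lines` and `check_verse` are made up for the example.

```python
import re

# (pattern, message) pairs; each pattern matches one kind of spacing slip.
# After punctuation, a markup character (] / \) is tolerated as well as a
# space: that tolerance is an assumption, not something the README states.
RULES = [
    (re.compile(r" $"), "space at end of verse"),
    (re.compile(r" {2,}"), "more than one consecutive space"),
    (re.compile(r" [,.]"), "space before comma or period"),
    (re.compile(r"[,.](?=[^ \]/\\])"), "no space after comma or period"),
    (re.compile(r"(?<! )[:;?!]"), "no space before : ; ? !"),
    (re.compile(r"[:;?!](?=[^ \]/\\])"), "no space after : ; ? !"),
]


def read_lines(path):
    """Read a codepage 1252 file; text mode turns DOS line endings into \\n."""
    with open(path, encoding="cp1252") as handle:
        return [line.rstrip("\n") for line in handle]


def check_verse(text):
    """Return a message for every spacing rule that `text` appears to break."""
    return [message for pattern, message in RULES if pattern.search(text)]
```

A verse that follows the conventions yields an empty list. Note that `read_lines` strips only the newline, so a stray trailing space survives and can be reported. The checker is deliberately naive and may flag legitimate text such as abbreviations or numerals.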
The text really has two structures: the traditional division into books, chapters and verses, and a 'natural' structure as sentences and paragraphs. This latter structure is not an intrinsic part of the text, and has been imposed differently by each editor of the Vulgate through the centuries; for my part I have tried to use punctuation both to make the meaning transparent, and to reflect the natural cadences in the text.
- Paragraph divisions are indicated by a backslash \, though this is omitted at the very start or end of a chapter. The backslash is followed by a space if it occurs in the middle of a verse.
- When text is set as verse, the start and end of a section of verse are indicated by brackets [ (preceded by a space) and ] (followed by a space unless it ends the verse) respectively. Line breaks within the verse are indicated by a slash / (followed by a space unless it ends the verse).
- When different speakers are indicated (e.g. in the Lamentations), the speaker's name is placed between angle brackets <...>, with no space after the closing bracket.
- Lamentations and Ecclesiasticus have prologues (which may be non-canonical?). In the source, the prologue appears at the start of 1:1, though logically it belongs before the start of ch. 1. The prologue is preceded by <Prologus> and in both books the text of verse 1 begins at the first bracket.
- Information on the creators and proof-readers of each book can be found in data.txt in the Scripts repository; a description of the format of this file appears at its head.
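The markup described in this list can be split into tokens with a short regular expression. The Python sketch below is a minimal illustration and not the project's own parser: the function name `tokenize`, the token names, and the sample strings in the usage note are all invented for the example, and it assumes the verse text has already been separated from any verse reference.

```python
import re

# One alternative for <...> tags, one for the single-character markers.
TOKEN = re.compile(r"<([^<>]*)>|([\\\[\]/])")

# Token names are made up for this sketch.
KINDS = {
    "\\": "paragraph",
    "[": "verse_start",
    "]": "verse_end",
    "/": "line_break",
}


def tokenize(verse):
    """Split verse text into (kind, value) pairs in document order."""
    tokens = []
    pos = 0
    for match in TOKEN.finditer(verse):
        # Plain text between the previous marker and this one.
        text = verse[pos:match.start()].strip()
        if text:
            tokens.append(("text", text))
        if match.group(1) is not None:
            # Anything in angle brackets, including <Prologus>.
            tokens.append(("speaker", match.group(1)))
        else:
            tokens.append((KINDS[match.group(2)], match.group(2)))
        pos = match.end()
    tail = verse[pos:].strip()
    if tail:
        tokens.append(("text", tail))
    return tokens
```

For instance, `tokenize("Primum. \\ Secundum.")` gives a text token, a paragraph token and a second text token. A `<Prologus>` tag comes out as a speaker token whose value is `Prologus`, so a caller that cares about prologues would need to special-case that name.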
The publish.py script is used by the maintainer to generate
new files of all formats when a correction is made to the text. It invokes:
- makehtml.pl, a Perl script to produce HTML output for the website, which further calls
  - maketxt.sed, a Sed script to produce plain text output for the website
- makemarkdown.pl, a Perl script forked from a repository contributed by Fr Jacques Peron; the publish script combines the output with Fr Peron's front matter files and Pandoc invocation to produce an ebook in .epub format.
Building the program requires Microsoft Visual Studio 2013. To build the
installer, you also need InstallShield Limited Edition, which is free for users of all Visual Studio editions except the Express edition. Open
src/VulSearch4.sln and go from there.
Note on hard-coded paths: to build the installer project or to run the main
program in the debug configuration, the top-level VulSearch4 directory
needs to be at D:\Projects\VulSearch4. This isn't ideal, but I don't have
a need to fix it (and I'm not sure it's possible to fix the installer issue
nicely): if you find a better way of doing things, please submit a pull
request.
The maintainer's current contact information is on the Clementine Vulgate Project website. Pull requests for any of the repositories are also welcome.