GUIDELINES UPDATE: As of 16th November 2011, new guidelines for dashes and punctuation marks have been implemented for both the online text correction and online moderation projects. Please see below for further information or contact us if you have any queries. |
||||||||||||||||||||||||||||||||||
HEADING |
TEXT |
EXAMPLES / IMAGE / SCREENSHOT |
||||||||||||||||||||||||||||||||
Accents – Advertisements – Christmas Numbers – Currency symbols – Dashes – Font sizes – Footnotes – Headings – Household Narrative: currency symbols – Household Narrative: first pages – Household Narrative: fractions – Household Narrative: tables – Household Words Almanac: images – Household Words Almanac: calendar days of the week – Household Words Almanac: textflow – Hyphenation – Images – Italics – Line-length – Line-breaks (double) – Line-breaks (enforcing) – Logged in? – Moderation – Missing text – Paragraphing – Poems/poetry – Punctuation – Saving pages (glitch) – Selecting a 2nd magazine – Spacing – Spelling mistakes – Symbols – Tables and charts – Titles and Headings – Unusual characters |
||||||||||||||||||||||||||||||||||
See under ‘Unusual characters’ etc. |
||||||||||||||||||||||||||||||||||
Sometimes advertisements appear on the final page of a magazine. Please do not remove these but retain the text as it appears in the original (it is not necessary to replicate the original layout). |
||||||||||||||||||||||||||||||||||
Yes, please retain the masthead, title and index on the cover page of Extra Christmas Numbers. |
||||||||||||||||||||||||||||||||||
Dickens’s journals quote British currency in pounds, shillings and pence, using the symbols/abbreviations l. or ₤ for pounds, s. for shillings, and d. for pence. The abbreviations ‘s.’ and ‘d.’ should be reproduced in italics, and the ‘₤’ symbol (from the unusual characters sub-menu) should always be used in front of a numerical sum of money, rather than an ‘l.’ placed before or after it. TTS rendition will make sense of the pound symbol. The same applies to dollars. (This is particularly important when working on the final page of issues of the Household Narrative, which abounds in currency abbreviations.) |
||||||||||||||||||||||||||||||||||
Dashes (double hyphens, ‘em’ dash, ‘en’ dash etc.), why missing and how to insert? |
IMPORTANT NOTE: As of 16th November 2011, the guidelines for dashes have changed. Please read the guidelines below carefully. Corrections begun BEFORE 16th November should continue to follow the old rule.
NEW RULE (after 16th Nov.): OCR very seldom, if ever, reads horizontal lines – you will notice therefore that most divisions between articles are missing, and also most dashes. There is no need to reproduce divisions between articles, but please do reproduce/insert a long (‘em’) dash (—) with no spaces on either side, as it appears in the body of the original text. You can find a long (‘em’) dash in the Symbol/Unusual characters menu (Ω) or by clicking the em dash icon at the bottom of the editing panel. Once you have inserted one, copy it (by highlighting it and pressing Ctrl + c) and paste it (by pressing Ctrl + v) to repeat insertions. Where the page image shows an even longer dash, use a double (or even triple) long (‘em’) dash if necessary.
OLD RULE (before 16th Nov.): OCR very seldom, if ever, reads horizontal lines – you will notice therefore that most divisions between articles are missing, and also most dashes (double hyphens, ‘em’ dashes, ‘en’ dashes, etc.). Please insert a double hyphen with a space on either side ( -- ), in place of a long (‘em’) dash (—). Alternatively, you can insert an ‘em’ dash (with no spaces on either side) or an ‘en’ dash (–) (with spaces on either side) from the Symbol/Unusual characters menu (and copy+paste to repeat insertions). Where the page image shows an even longer dash, use a triple or quadruple hyphen if necessary. In due course, we shall automate the replacement of the double hyphens with ‘em’ dashes. |
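The automated replacement of old-rule double hyphens mentioned above might look something like the following. This is only a rough sketch, not the project's actual script: the function name `upgrade_double_hyphens` is hypothetical, and the mapping of longer hyphen runs (three hyphens to two em dashes, four to three) is an assumption made for illustration.

```python
import re

EM_DASH = "\u2014"  # the long ('em') dash


def upgrade_double_hyphens(text: str) -> str:
    """Replace runs of two or more hyphens with em dashes.

    Assumption (not from the guidelines): a run of n hyphens becomes
    n - 1 em dashes, so '--' gives one em dash and '---' gives two.
    Spaces or tabs on either side of the run are removed, since the
    new rule asks for no spaces around the dash.
    """
    def replace_run(match: re.Match) -> str:
        hyphen_count = len(match.group(1))
        return EM_DASH * (hyphen_count - 1)

    return re.sub(r"[ \t]*(-{2,})[ \t]*", replace_run, text)
```

Single hyphens, as in ‘hand-picked’, are left alone because the pattern only matches two or more hyphens in a row.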
|||||||||||||||||||||||||||||||||
Given that readers who can view the page image will have access to an exact replica, there is no need to attempt to style fonts according to size or family. A plain text in a uniform size of sans serif font is all that is aimed at in manual text correction. So, for example, the Gothic font used on the masthead of All the Year Round should appear in the same font as everything else. |
Gothic font1.jpg |
|||||||||||||||||||||||||||||||||
Occasionally you will come across footnotes (maximum one or two per magazine), which, if they come at the foot of column 1, break the flow of text in the main body of a page. For the time being, we suggest relocating the footnote to the bottom of the paragraph of body text in which the footnote symbol (usually '*') is inserted, using the 'cut and paste' keys: select the footnote, then press Control+X to cut and Control+V to paste. |
||||||||||||||||||||||||||||||||||
(see ‘Titles and Headings’) |
||||||||||||||||||||||||||||||||||
Household Narrative: first pages |
For the first twenty-four months of its publication, the Household Narrative ran as its opening article an overview essay on the month's most significant events in England, Scotland and Wales, called 'THE THREE KINGDOMS'. These articles (probably penned for the most part by Dickens's friend, the journalist and historian John Forster) run across the whole width of the page, rather than in two narrow columns, and they occupy the first few pages of each of these 24 issues. For text correctors with moderator status (only moderators may correct Narratives), the correction of these articles does not require any different set of procedures, but we recommend you take a little time to optimise the page view, both in 'Edit' mode, and again, in 'Read' mode. Use the corner icons in either the facsimile or the edit panel, or both, to adjust the width of each so that the lines do not wrap in an unnatural way. Please take extra care with lineation, so that each line in the transcript consists of the same set of words as in the facsimile. When you reach the end of 'THE THREE KINGDOMS', the page will revert to the usual 2-column format with which you are familiar. |
|||||||||||||||||||||||||||||||||
Household Narrative: fractions |
3318 South Eastern 205/8 185/8 201/4
To create the required fractions, simply add { } around the relevant numbers, with no spaces, thus:
33{1/8} South Eastern 20{5/8} 18{5/8} 20{1/4}
On pressing 'Save Now', each group of numbers within curly brackets will be replaced by the same numbers shaded grey, thus:
On pressing 'Exit', these shaded groups will convert to fractions, thus:
|
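To illustrate what the curly-bracket markup is asking for, here is a small sketch of the kind of conversion that takes place on exit. It is an assumption about the end result only, not the editor's actual code: the function name `render_fractions` and the lookup table are hypothetical, and the fallback for other fractions (digits joined by a fraction slash) is an illustrative choice.

```python
import re

# Unicode single-character fractions for quarters and eighths.
# The table is illustrative; the editor may render fractions differently.
VULGAR_FRACTIONS = {
    (1, 4): "\u00BC", (1, 2): "\u00BD", (3, 4): "\u00BE",
    (1, 8): "\u215B", (3, 8): "\u215C", (5, 8): "\u215D", (7, 8): "\u215E",
}


def render_fractions(text: str) -> str:
    """Convert each {n/d} group into a fraction character."""
    def replace_group(match: re.Match) -> str:
        numerator, denominator = int(match.group(1)), int(match.group(2))
        # Fall back to digits joined by a fraction slash (U+2044)
        # when no single-character fraction is in the table.
        fallback = f"{numerator}\u2044{denominator}"
        return VULGAR_FRACTIONS.get((numerator, denominator), fallback)

    return re.sub(r"\{(\d+)/(\d+)\}", replace_group, text)
```

Note that the pattern only matches properly closed groups with no spaces, which is why a mistyped bracket or a stray space inside the braces would leave the numbers unconverted in this sketch.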
|||||||||||||||||||||||||||||||||
Household Narrative: table template: STOCKS |
|
|
||||||||||||||||||||||||||||||||
Household Words Almanac: calendar days of the week | The first page of each month in the 1856 Almanac presents, in the middle of the page, a table containing key events occurring on each day of the week, surrounded by a decorative border. Please reproduce this table in a single column. For reasons of economy, the compositors abbreviate the days of the week: M for Monday, Tu for Tuesday, W for Wednesday etc. As we have no limits with line length for this feature, it will make sense (for TTS etc.) for us to spell out the days of the week in full. Spaces can be inserted to even the columns, or, a three-column table could be inserted, to taste -- so long as the end result is not ragged to the eye. For consistency: please also embolden 'SUNDAY.' |
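For anyone who prefers to prepare the calendar text outside the editor, spelling out the days could be sketched as below. Only M, Tu and W are given in the guidance above; the other abbreviations in the table (Th, F, S), the assumption that each entry begins with the abbreviation, and the function name `expand_day` are all hypothetical and should be checked against the page image.

```python
# M, Tu and W come from the guideline; Th, F and S are assumed abbreviations.
DAY_NAMES = {
    "M": "Monday",
    "Tu": "Tuesday",
    "W": "Wednesday",
    "Th": "Thursday",
    "F": "Friday",
    "S": "Saturday",
}


def expand_day(line: str) -> str:
    """Spell out a leading day abbreviation, leaving other lines unchanged."""
    head, separator, rest = line.strip().partition(" ")
    full_name = DAY_NAMES.get(head.rstrip("."))
    if full_name is None:
        return line  # e.g. 'SUNDAY.' is already printed in full
    return f"{full_name}{separator}{rest}"
```

Emboldening 'SUNDAY.' still has to be done by hand in the editor; this sketch deals only with the plain text.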
|||||||||||||||||||||||||||||||||
As you will see from consulting the image pages for the 1856 Almanac, both left and right hand pages for each month are, in terms of design and textflow, designed as a spread. For better or worse, we have decided that we cannot reproduce this, so the text transcript has, from the outset, some dead ends and flow issues. Where, in the 1856 Almanac, the double column at the foot of the verso/LH page flows direct to the foot of the recto/RH page, we have followed this in laying out the blocks of OCR text for correction. This means that the transcript for rectos almost always starts at the foot of the page, moves to the top, then flows down to the end of the section headed 'SERVICEABLE INFORMATION.' Please do not adjust or re-arrange the order of text blocks in the Almanac pages: these have been placed deliberately. Markers in curly brackets, such as '{continued from foot of previous page}', have been inserted to signal the ordering, and to assist audio listeners to make slightly better sense of the movement. Given that the content of the Almanac consists of an aggregation of factual information, the non sequiturs matter a good deal less than they would, say, in discursive prose. |
|
|||||||||||||||||||||||||||||||||
Yes, where a single word has been broken in two, across two lines. With their page layout of 2 narrow columns, Dickens's journals frequently needed to hyphenate in order to balance line lengths. The corrected text need not preserve this, and it will make TTS rendition, word searches and various other kinds of text-mining run better if it is removed. As a rule of thumb, take UP into the preceding line the half-word from the next line IF the number of letters involved is less than half of the total number of characters in the complete word; move DOWN into the next line the half-word from the preceding line IF the same is true of that half. If it is evenly balanced, then you may decide! Where you see a hyphenated double word broken over two lines (e.g. road-mender, hand-picked, organ-grinder) you should not remove the hyphen, and may leave the structure as it is. |
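The rule of thumb for rejoining a broken word can be expressed as a short sketch. The function name `rejoin_direction` and its return values are hypothetical, chosen only to illustrate the comparison; it does not detect hyphenated double words such as ‘road-mender’, which should keep their hyphen and be left as they are.

```python
def rejoin_direction(first_part: str, second_part: str) -> str:
    """Say which way to move the smaller half of a word split over two lines.

    first_part is the fragment ending the upper line (with or without its
    trailing hyphen); second_part is the fragment starting the lower line.
    Returns 'up', 'down' or 'either'.
    """
    first_length = len(first_part.rstrip("-"))
    second_length = len(second_part)
    if second_length < first_length:
        return "up"      # lower fragment is under half the word: take it up
    if first_length < second_length:
        return "down"    # upper fragment is under half the word: move it down
    return "either"      # evenly balanced: the corrector may decide
```

For instance, with ‘impor-’ ending one line and ‘tant’ starting the next, the lower fragment is the shorter, so it is taken up to give ‘important’ on the upper line.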
Example 1: |
|||||||||||||||||||||||||||||||||
Images |
Images are rare across Dickens's suite of publications, but are very common in the Household Words Almanac. For obvious reasons, they cannot be reproduced in the transcript, but it is important to signal their presence. A marker should have already been inserted for you, beginning {image:} followed by a category description e.g. {image: 'historiated initial letter 'X', showing ...?} and we would like you to insert a four- to ten-word (i.e. as brief as possible) description after the colon e.g. {image: 'historiated initial letter 'T', showing a mare with her foal}. If the image contains text, e.g. handwriting, use {image: text: contents}, replacing contents with the actual wording, or an approximation of the wording. Thus, when the page is being listened to (using TTS), the listener will be able to acquire the gist of what the image signifies, and anyone searching the archive will be able to find all the images by searching under {image:. |
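Because every marker begins with {image: and ends with a closing curly bracket, the markers are easy to find mechanically. The following sketch shows one way a search of that kind could work on a plain-text transcript; the function name `find_image_descriptions` is hypothetical and this is not the archive's own search code.

```python
import re

# Matches '{image: ...}' and captures everything between the colon and
# the closing bracket. Assumes descriptions never contain curly brackets.
IMAGE_MARKER = re.compile(r"\{image:\s*([^{}]*)\}")


def find_image_descriptions(transcript: str) -> list[str]:
    """Return the description inside each {image: ...} marker, in order."""
    return [description.strip() for description in IMAGE_MARKER.findall(transcript)]
```

A marker left without its closing bracket would not be found by this pattern, which is one reason to check that each marker is complete after adding a description.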
|||||||||||||||||||||||||||||||||
Italics should be reproduced whenever possible. To insert italics, either press Ctrl+i before commencing to type the requisite word(s) and press Ctrl+i again afterwards; or type normally, then go back and select the word(s) to be italicised, and press Ctrl+i. |
||||||||||||||||||||||||||||||||||
Please reproduce the line length as it appears in the original scanned copy (with the exception of words split over two lines/pages by a hyphen — see above). This can be done by pressing SHIFT + ENTER after the appropriate word. |
||||||||||||||||||||||||||||||||||
Please remove double linebreaks that are not present in the body text of the page image: for example, after the column break at the foot of the first column of a page of text, or within the stanzas of a poem (stanzas should be separated by paragraphs, however). The way OCR makes sense of line breaks, and how the transcript displays them, can appear arbitrary, and is often at variance with the page image. On the one hand, this is unlikely to affect TTS rendition, but on the other such variance, if unexplained, can look random and unnecessary. |
||||||||||||||||||||||||||||||||||
Often, in attempting to remove an unnecessary double or single linebreak, you will find the lower line is taken back up into the line before. To enforce a single linebreak, place your cursor where the line ought to end (check against page image) and press Shift + Return (or Shift + Enter). Repeat, to insert a forced double linebreak. |
||||||||||||||||||||||||||||||||||
As of 08 August 2011, you will not be logged out of your session, and an Autosave (set to kick in after 5 minutes) has been installed. Nevertheless, our advice remains simple: use the 'Save Now' button frequently during page correction, and always after returning from any mid-session breaks. |
||||||||||||||||||||||||||||||||||
Occasionally (rarely) the OCR will simply miss text: perhaps a word at the start of a line in col. 1 of a recto or at the end of a line in col. 2 of a verso – i.e. close to the gutter. Simply insert the word or words as appropriate. If a page has lots of ‘missing’ words, it may be that the OCR has misread whole chunks of the 2-column structure of the page – if so, text will not be missing so much as mixed up. See the advice about content order on the main ‘How do I Select and Correct a Magazine?’ page. Using the cut and paste shortcuts (Control + X to cut selected text; Control + V to paste it), it is possible to rearrange the page, but it takes a little time, and is rather like working on a jigsaw, with the page image acting as the box lid! |
||||||||||||||||||||||||||||||||||
Moderation, how does it work? |
The submission and moderation process for your magazine is described on the 'Getting Started' page for Online Text Correction. If your magazine correction is accepted on first submission, congratulations! If it is returned for some further corrections, these will be clearly indicated on the 'Correction Record', to allow you to focus on exactly which pages need fixing. If 'Assorted, on various pages' is flagged as a response from your moderator, then they will have come across assorted kinds of correction that need implementing. These will have been noted carefully on the 'Correction Record' for the first few pages examined. As these will have been quite consistent, not every page has needed to be consulted, so if you are willing to implement these changes, please check THROUGHOUT the magazine, even on pages where there is no note beneath the thumbnail on the 'Correction Record.' Thank you very much! |
|||||||||||||||||||||||||||||||||
Paragraphs, should I rejoin when they are split across pages? How should I reproduce them? |
Please rejoin paragraphs that have been split across columns within a page, but not across two pages. Put each article title, including those split over multiple columns, into one paragraph. Each paragraph you see in the original should appear within its own paragraph box in the Text Editor, even if only a short line of dialogue. New paragraph boxes are created simply by pressing ENTER. |
|||||||||||||||||||||||||||||||||
A potentially complex question! However, here are some simple rules of thumb.
1) Where the poem/verse has mainly short lines (short relative to the width of a column of text), you should try to reproduce the indented margin. It will look and read better as poetry in the transcript.
2) Where the poem/verse has a long line – about as long as a column of text – there isn’t room, so each new line should be left justified to the same position as normal text.
3) Complications begin when, within a verse or poem, the original typesetting offers a pattern of tabs and indents for new lines. YOU CAN REPRODUCE THESE IF YOU WISH BUT IT IS NOT ESSENTIAL.
4) There is room in the DJO transcript for longer lines than were possible in the columns of Dickens’s journals. Sometimes the original typesetting is forced to break a line unnaturally, and a word or so has to stand on its own in the line below, WITHOUT A CAPITAL LETTER. These breaks should not be reproduced, so take the word or words back up into the previous line.
5) Please insert a new paragraph for each new verse. |
||||||||||||||||||||||||||||||||||
|
See ‘Spacing, between words’ |
|||||||||||||||||||||||||||||||||
|
A number of our text editors have reported that occasionally a page refuses to save corrections: on pressing the 'Save Corrections and Exit' button or the 'Save Corrections and Return' button, the page freezes and refuses to accept the corrections. This is annoying, and we apologise! At present, however, this particular glitch is hard to resolve from the front end of the website, so we must ask you please to send us, by e-mail, a note of the journal, volume number and page which has the glitch; it will be reported to our webmaster, and fixed in due course. The problem originates in some corrupt code in the database cell for the page in question, and our advice is simply to report the page and move on to the next, rather than attempting to re-correct the glitchy page and losing the corrections again. Eventually, on completion of the magazine, please drop a line to the same e-mail address, and the magazine can be approved. At present it is unclear whether there is a generic solution that can be implemented, or whether each page needs to be fixed individually: we will update this FAQ as soon as there is more information available. |
|||||||||||||||||||||||||||||||||
Selecting a second magazine to correct – why is this not working? |
Currently correctors can only select another magazine to correct after their first magazine has been both submitted and approved. |
|||||||||||||||||||||||||||||||||
IMPORTANT NOTE: As of 16th November 2011, the guidelines for spacing have changed. Please read the guidelines below carefully. Corrections begun BEFORE 16th November should continue to follow the old rule. NEW RULE (after 16th Nov.): Throughout the magazines, there are what the modern eye recognises as unnecessary spaces between final letters of sentences and closing punctuation (e.g. a question mark, or exclamation mark). Most of these have been automatically closed-up. However, some spaces still remain which need to be manually closed-up by the corrector. Also, spaces between quotations/words of direct speech and single ( ' ' ) and double ( " " ) quotation marks remain and should also be manually closed-up by the corrector. OLD RULE (before 16th Nov.): There is no urgent need to remove what the modern eye recognises as unnecessary spaces between final letters of sentences and closing punctuation (e.g. a question mark, or exclamation mark). At present it looks as though we will be able to automate the removal of such spaces. TTS rendition will not be disturbed by them, either. However, it will be useful to remove unnecessary spaces between words of direct speech or quotation, and their opening/closing quotation marks. |
PLEASE CORRECT “ I have no idea why he went, ” retorted the Sailor, roughly. TO “I have no idea why he went,” retorted the Sailor, roughly. |
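The closing-up described under the new rule can be pictured with a short sketch. This is an illustration only, not the automated routine used on the site: the function name `close_up_spaces` is hypothetical, it handles only curly quotation marks, and it deliberately leaves the closing single mark alone because that character is also the apostrophe.

```python
import re


def close_up_spaces(text: str) -> str:
    """Remove stray spaces before ? and ! and just inside curly quotation marks."""
    # Space(s) before a question mark or exclamation mark.
    text = re.sub(r"[ \t]+([?!])", r"\1", text)
    # Space(s) straight after an opening double or single curly quote.
    text = re.sub(r"([\u201C\u2018])[ \t]+", r"\1", text)
    # Space(s) straight before a closing double curly quote.
    text = re.sub(r"[ \t]+(\u201D)", r"\1", text)
    return text
```

Straight quotation marks ( ' and " ) are not touched, since the same character opens and closes a quotation and a simple pattern cannot tell which is which; those still need the corrector's eye.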
|||||||||||||||||||||||||||||||||
You will come across typographical errors and mistakes occasionally in Dickens’s journals, which should be silently corrected. However, it is worth checking in a good dictionary whether the spelling given is an acceptable Victorian variation on modern spelling, or not. Dickens’s journals often give ‘recal’ for ‘recall’, ‘befal’ for ‘befall’ etc. – these variants should not be modernised. Nor need any factual errors you may espy be corrected – we are reproducing the original content, warts and all! |
||||||||||||||||||||||||||||||||||
See under ‘Unusual characters’ etc. |
||||||||||||||||||||||||||||||||||
Occasionally, Household Words and All the Year Round will feature tables and charts. If you feel confident about using the ‘Insert Table’ function on the JCE Editor (similar to the Insert Table function in Word), and patiently styling the table to look like the original, then please go ahead. If not, there are two alternatives.
Tables are very common in the Household Narrative of Current Events and they are, generally, very complex and time consuming to reproduce. It is best to use the Insert Table function on the JCE Editor throughout, so please do not undertake the correction of a Household Narrative if this is a feature you feel uncertain about using. The layout of the tables does not have to absolutely replicate the original -- this is actually quite difficult to achieve using the JCE Editor -- so don't be worried if your table(s) isn't an exact copy of the original(s). |
||||||||||||||||||||||||||||||||||
|
As the introductory OTC tutorial explains, we ask you to remove the standard Masthead, and the running headers of internal pages (the exception being the Masthead of Extra Christmas numbers); these will be replaced by automated means in due course. However, please retain titles of articles, advertisements, and announcements. Titles, sub-titles, and chapter headings etc. should each be in a separate paragraph box (press Return/Enter to create a new paragraph box). Reproduce/leave what look like unnecessary full stops, as these will improve text-to-speech (TTS) rendition. Leave block capitals where this reproduces the original, but do not attempt to reproduce different font sizes or families. A plain rendition in a uniform size of sans serif font is all that is aimed at with manual text correction. |
|||||||||||||||||||||||||||||||||
The JCE Editor has a sub-menu for inserting the most common of these; its button has the Greek ‘Omega’ symbol (in capital), thus Ω, as its icon. If a character or symbol used in the page image is not available via this sub-menu (for example, an ancient runic character) then the best option, to assist TTS rendition is to use the nearest modern English equivalent (if there is one), or, as a last resort, to put in square brackets the briefest description imaginable. |
Unusual characters sub-menu.jpg |