Real Example

Let's look at a real example: Student's History of England, written by Samuel R. Gardiner, a book from gutenberg.org.

 

To keep the download small, we extracted the NCX file from the book, and the example is performed on this NCX file. You can try it in ePub Magic TOC via menu > Open NCX.
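For readers who want to pull the NCX file out of an ePub themselves, here is a minimal Python sketch. It relies only on the facts that an ePub is a ZIP archive and that the NCX is an XML file in the DAISY NCX namespace. The function names `read_ncx` and `toc_labels` are made up for this illustration and are not part of ePub Magic TOC.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by NCX documents.
NCX_NS = {"ncx": "http://www.daisy.org/z3986/2005/ncx/"}


def read_ncx(epub_file):
    """Return the raw bytes of the first .ncx member of an ePub (ZIP) archive.

    `epub_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(epub_file) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".ncx"):
                return archive.read(name)
    raise FileNotFoundError("no .ncx file found in the archive")


def toc_labels(ncx_bytes):
    """List the label text of every navPoint, in document order."""
    root = ET.fromstring(ncx_bytes)
    path = ".//ncx:navPoint/ncx:navLabel/ncx:text"
    return [node.text for node in root.iterfind(path, NCX_NS)]
```

Calling `toc_labels(read_ncx("book.epub"))` on a local file (the file name here is only a placeholder) would print the same entries that the TOC editor shows.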

Original TOC file (before Text Modification)

You can also download the complete original ePub file.

Original ePub File (0.4MB)

 

Analysis

Open the book (here, the NCX file) and you will see that the TOC is complex and hard to read.

 

 

It contains these major problems

Task Plan

So we decided to make these improvements

 

The goal is a TOC that looks simple and clear, like this

 

1 Capitalization

First, let's change all TOC items into a readable capitalized form.

 

After Capitalization.

(The S after the apostrophe, as in Student'S, is not lowercased; we have to change it manually or use Replace.)
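The apostrophe issue is a common side effect of word-by-word capitalization, and it can be reproduced outside the tool. The Python sketch below is only an illustration, not how ePub Magic TOC works internally: `str.title()` shows the same Student'S behavior, and the hypothetical helper `capitalize_entry` avoids it by treating an apostrophe as part of the word.

```python
import re


def capitalize_entry(entry):
    """Capitalize each word, keeping an apostrophe suffix inside the word."""
    return re.sub(
        r"[A-Za-z]+('[A-Za-z]+)?",
        lambda match: match.group(0).capitalize(),
        entry,
    )


raw = "STUDENT'S HISTORY OF ENGLAND"
naive = raw.title()            # "Student'S History Of England"
fixed = capitalize_entry(raw)  # "Student's History Of England"
```

Inside ePub Magic TOC itself, the manual fix or the Replace function mentioned above remains the way to correct such entries.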

 

2 Convert Roman Numerals

Now let's convert the complex Roman numerals into simple Arabic numbers.

The function converts only the first number in each entry, so other Roman numerals in the text are not changed. For example, Henry VI is left unchanged here, which is exactly what we want. If you want other numbers converted as well, you have to change them manually, because ePub Magic TOC cannot judge which ones should be converted and which should not.
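The "first number only" rule can be sketched in a few lines of Python. This is an assumption about the behavior described above, not the tool's actual code. The names `roman_to_int` and `convert_first_roman` are invented for the example, the sample entry is made up, and the sketch assumes the numerals are written in uppercase.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# A standalone run of Roman-numeral letters (no validation of the numeral itself).
ROMAN_TOKEN = re.compile(r"\b[IVXLCDM]+\b")


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'XIV' to an integer."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN_VALUES[ch]
        following = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        # A smaller value before a larger one is subtracted (IV = 4, IX = 9).
        total += -value if value < following else value
    return total


def convert_first_roman(entry):
    """Replace only the first Roman-numeral token in the entry."""
    return ROMAN_TOKEN.sub(lambda m: str(roman_to_int(m.group(0))), entry, count=1)


converted = convert_first_roman("Chapter XX. Henry VI")  # "Chapter 20. Henry VI"
```

Limiting the substitution to one match per entry is what leaves names such as Henry VI alone. An entry with no leading number would still have its first numeral converted, which is why some manual checking is needed.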

 

After number conversion

 

3 Regular Expression

If you already know Regular Expressions, you know how powerful they are. In a complicated task, a Regular Expression can turn your work into a piece of cake, saving much time and labor.

Now let's move on to our main target: the numbers for periods or years. These years are not really necessary in a TOC. To shorten the entries and eliminate the errors in the hyphens, we can simply remove them all. Of course you can delete them one by one manually, but with a Regular Expression this can be done in a second.

If you don't know Regular Expressions, you can read a book or look them up on a professional website. If you are working on many books, this technique is well worth learning, because it can speed up your work considerably.

Now let's begin.

Input Regular Expression

Note: this Regular Expression only works for this file; you should write your own expression for each different situation. Please see the discussion of this Regular Expression.
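The expression is not reproduced in the text here. The pattern in the sketch below is reconstructed from the Discussion of the Expression section (an optional comma, a space, two `\d{4}` groups joined by `.+?`, and a closing dot), so treat it as an assumption. The sketch uses Python's `re` module, whose syntax may differ in details from the engine inside ePub Magic TOC, and `strip_years` is a hypothetical helper name.

```python
import re

# Reconstructed pattern: optional comma, space, year, anything (lazy), year, dot.
YEAR_RANGE = re.compile(r",? \d{4}.+?\d{4}\.")


def strip_years(entry):
    """Remove a trailing year range such as ', 1216-1272.' from a TOC entry."""
    return YEAR_RANGE.sub("", entry)


with_comma = strip_years("Henry III, 1216-1272.")     # "Henry III"
without_comma = strip_years("Edward III 1360-1377.")  # "Edward III"
```

Replacing every match with an empty string is the equivalent of leaving the replacement field empty in a find-and-replace dialog.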

 

Now you can see your TOC like this.

Thus we get a very neat TOC at last.

 

You can try the example in ePub Magic TOC.

Original TOC file (before Text Modification)

New TOC file (after Text Modification)

 

We still have to simplify the structure. Please see the next page.

 

Discussion of the Expression

This expression begins with a comma and ends with a dot, and there is a space after the "?".

It catches everything like ", 1216-1272." no matter what is between the two year numbers.

The "?" after the comma ensures that " 1360-1377" is also matched.

If your year numbers are not four digits long, change the number inside {}.

If you are certain that the only symbols between the years are - and —, you can also use the expression ",? \d{4}[-—]\d{4}\.", which uses "[-—]" instead of ".+?".

Be careful: if there are two separate single years in an entry, ".+?" can cause an error by matching all the words between the two years.
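The last caveat is easy to see in a small Python comparison. The entry below is invented for the illustration and is not taken from the book. The loose pattern is the `.+?` form as reconstructed from this discussion, and the strict pattern is the `[-—]` form quoted above.

```python
import re

LOOSE = re.compile(r",? \d{4}.+?\d{4}\.")     # anything between the two years
STRICT = re.compile(r",? \d{4}[-—]\d{4}\.")   # only a hyphen or a dash between them

# Made-up entry containing two separate single years.
entry = "Battle of Crecy, 1346. Treaty of Bretigny, 1360."

loose_result = LOOSE.sub("", entry)    # "Battle of Crecy"
strict_result = STRICT.sub("", entry)  # entry is left unchanged

# Both patterns agree on a genuine year range.
range_result = STRICT.sub("", "Henry III, 1216—1272.")  # "Henry III"
```

The loose pattern swallows everything from the first year to the second, including the words in between, while the strict pattern leaves the entry alone because no hyphen or dash joins the two years.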