Fixing Module Broken Lines

Fixing Module Broken Lines is a theWord Module Creation class on reformatting text. First, I need to clarify what I am talking about, so I will use an example file. Because you cannot see formatting marks in theWord, I am using LibreOffice to show them. I also prefer LibreOffice over Microsoft Word; more on that later.

What are broken lines?

A broken line is a line feed at the end of a line. It usually does not mark the end of a paragraph; it is simply where a line ended in the PDF file and the next one began.

broken-lines-1 theWord

Notice that while the space for text continues to the right, a line feed mark cuts the line short and begins a new one.

broken-lines-2 theWord

In this image from LibreOffice you can see the line feeds. Below is a close up.

broken-lines-3 theWord

These “line feeds” are a problem. They break up the text and make it harder to read. They cause a disaster if you resize theWord to a smaller footprint, or use a Bookview pane alongside other panes in the program, because the text does not reflow well.

In the image below, I have a normal 4-pane view of theWord. Look how the lines are broken up.

broken-lines-4

Without a doubt, there is more than one way to fix this. A long time ago, I used to delete each line feed by hand. Considering that a typical topic with this problem may have 1000+ line feeds, that is a lot of time wasted on each topic. So just use search and replace, right? theWord's search and replace does not handle regular expressions, so I have not found a way to do this in theWord itself. Instead, I copy and paste the text into a word processor.

“In the good ole days” I used to work with Microsoft Word and create macros in it to do this kind of thing. But over the years Microsoft's macro language has become so sophisticated that I find it impractical to use. I have tried, and you have to learn a whole programming language to do anything in it. It is also clumsy and awkward.

I now prefer LibreOffice because its search and replace language is robust but not so complicated. I consider these macros “quick and dirty”: you create them as you need them, and if you only need one for a single use, that works too.

STEP BY STEP PROCESS TO FIX

The problem is that if you simply replace all the line feeds (often they are simple paragraph marks), you also lose the places that should remain a new paragraph or line feed. So Step 1 is to protect those valid paragraph marks.

  1. Replace valid new paragraph marks with a placeholder.
  2. Replace all new line feeds or paragraph marks with a space.
  3. Restore valid paragraph marks.
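
For readers who would rather script this outside a word processor, the three steps above can be sketched in Python. This is only an illustration under assumptions: it works on plain text, it treats a blank line as the sign of a valid paragraph break, and the function name and placeholder string are hypothetical choices, not part of theWord or LibreOffice.

```python
import re

# Hypothetical placeholder; any marker that never occurs in the text will do.
PLACEHOLDER = "[newparagraph]"


def fix_broken_lines(text: str) -> str:
    """Join broken lines while keeping real paragraph breaks."""
    # Step 1: protect valid paragraph breaks (a newline followed by one or
    # more blank lines) by swapping them for the placeholder.
    text = re.sub(r"\n(?:[ \t]*\n)+", PLACEHOLDER, text)
    # Step 2: every remaining newline is a broken line; turn it into a space.
    text = re.sub(r"[ \t]*\n[ \t]*", " ", text)
    # Step 3: restore the protected breaks as a single newline.
    return text.replace(PLACEHOLDER, "\n")


sample = (
    "This sentence was cut\n"
    "in the middle by the PDF.\n"
    "\n"
    "This is a new paragraph\n"
    "that was also cut."
)
fixed = fix_broken_lines(sample)
print(fixed)
```

The placeholder is inserted and removed as a plain string, so it only needs to be something that never appears in the module text.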

So why do I prefer LibreOffice over other word processors (for this text formatting only really)? It is because LibreOffice has a good handle on special characters in the search and replace dialog box. Many word processors will let you find a paragraph mark, but not replace it with one.

Alt Search and Replace — LibreOffice

There is a plugin for LibreOffice called Alt Search and Replace. It has a lot of flexibility that is impossible to find elsewhere.

Alt Search and Replace LibreOffice

Notice here the vast number of special regular expression options that it can look for when searching and replacing. Some of them differ from the usual syntax, like the paragraph mark, which is written \p in the batch file below.

See the green binoculars in the menu bar? After you install the plugin in LibreOffice (it is on their official website), this icon will appear. LibreOffice is free, by the way.

See the button “Batch”? You can make a series of macros (search and replaces) very quickly here.

Here is my batch file for fixing broken lines, called “Linebreaks newline to space”:

[Name] Text [all] Linebreaks newline to space
[Find]^$
[Replace][newparagraph]
[Parameters] MsgOff Regular
[Command] ReplaceAll

[Find]\p
[Replace]
[Parameters] MsgOff Regular
[Command] ReplaceAll

[Find]\[newparagraph\]
[Replace]\p
[Parameters] MsgOff Regular
[Command] ReplaceAll

There are three passes through the text.

Pass 1 – Replace valid paragraph breaks with a placeholder, [newparagraph]. My basis for discerning a valid paragraph break is an empty paragraph, that is, two paragraph marks together (the ^$ in the batch file). This misses some cases, such as

This is the title
1st Point,
2nd Point,
3rd Point
blah blah blah

All of these line feeds will be replaced with spaces. You will either have to edit the result yourself or just live with it compressed:

This is the title 1st Point, 2nd Point, 3rd Point blah blah blah
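
The same limitation appears in any approach that only treats blank lines as valid breaks. A short Python sketch, assuming plain text input and using a hypothetical helper name, reproduces the compressed result above. It does not show how the plugin works internally.

```python
import re


def join_lines(text: str) -> str:
    # Split on blank lines (the only breaks treated as valid), then join
    # the lines inside each chunk with single spaces.
    chunks = re.split(r"\n(?:[ \t]*\n)+", text)
    return "\n".join(" ".join(chunk.split()) for chunk in chunks)


outline = (
    "This is the title\n"
    "1st Point,\n"
    "2nd Point,\n"
    "3rd Point\n"
    "blah blah blah"
)
compressed = join_lines(outline)
print(compressed)
```

Because no blank line separates the title from the points, everything ends up in one paragraph. Adding a blank line after the title and after each point in the source text before running the passes would keep them apart.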

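Pass 2 – Replace all remaining line feeds or paragraph marks with spaces. The [Replace] field of the second block should hold a single space, which may not be visible in the listing above.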

Pass 3 – Replace all placeholders with paragraph marks.

Note that since I opted for a placeholder I didn’t think would ever be in the text of a module, “[newparagraph]”, I have to escape the brackets, [ ], with a backslash.
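
To see why the escaping matters, here is a small Python illustration. It uses Python's own `re` module as a stand-in, on the assumption that the plugin's regular expressions treat square brackets as a character class in the same way. The sample text is made up.

```python
import re

placeholder = "[newparagraph]"  # same marker as in the batch file above
text = "first paragraph[newparagraph]second paragraph"

# Unescaped, the brackets form a character class: any single one of the
# letters n, e, w, p, a, r, g, h matches, so ordinary words get mangled.
mangled = re.sub(placeholder, "\n", text)

# Escaped, the pattern matches the literal marker only.
pattern = re.escape(placeholder)
restored = re.sub(pattern, "\n", text)

print(pattern)
print(restored)
```

The escaped pattern comes out as `\[newparagraph\]`, the same form used in the [Find] line of the third pass.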

Observations

Since these are quick and dirty macros, there may be better ways of doing this. The Alt Search and Replace plugin is itself kind of clunky and slow, but it does what is very difficult to do otherwise. Depending on how long your text is, you could see it run for a minute or two while it makes 1000s of replacements. Be aware of this. But doing a thousand search and replaces by hand would take you the afternoon, and this is the beauty of these quick and dirty macros: they just work so you don’t have to.
