A dead man fell from the sky...

Spaced out

Some time ago I wrote about advanced searching in Word, and also about advanced use of replace. Here's another one: a quick way to find extra spaces so you can remove them. If you're like me, you occasionally put three spaces after a full stop (that's a period for US readers), or simply hit the space bar twice in the middle of a sentence. You can go blind searching by eye for extra white space on the screen, so save your eyesight and let the computer do it for you.

Open the Find/Replace dialog in Word. That's Ctrl-F on the keyboard or Edit --> Find on the menu bar.

The string you're searching for is this:


[!\!\.\"\?^l]  


I've inverted the colours and made it large so you can see there are two spaces at the end of the search string. You must include those two spaces in the search, or this won't work. Make sure you click the More button and then check the Use wildcards checkbox. Here's a screenshot:




If you don't care about the technicalities, you can stop reading now and just use the search as I described. Otherwise, here's how it works.

The two spaces at the end of the string will match any two spaces in your manuscript.

The problem is we want to catch two spaces in the middle of sentences, but two spaces is always correct when a sentence ends and a new one begins. We have to exclude the start of sentences; that's what the gobbledygook does.

Everything inside the square brackets [ ... ] will match a single character. The exclamation mark ! right at the start means match any one character except those that follow. A sentence could end in a full stop (that's the \.), an exclamation mark (that's the \!), or a question mark (\?). We have to deal with the situation where a sentence ends with dialogue, in which case the final character is a speech mark (\"). We also don't care about spaces that come straight after a line break (that's the ^l). The backslash character \ before the punctuation in each case is merely an instruction to Word to treat the following character as a literal, because characters such as ! and ? have a special meaning when wildcards are turned on, and escaping the others does no harm. The \ escapes the special meaning. The ^l is a special symbol: strictly speaking it denotes a manual line break rather than the end of a paragraph.
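If you think in regular expressions rather than Word wildcards, the same idea can be sketched in Python. This is only a rough translation, not something Word runs: the names EXTRA_SPACES and find_extra_spaces are made up for illustration, and I'm assuming the manual line break (^l) shows up as "\n" once the text is outside Word.

```python
import re

# Rough Python equivalent of the Word wildcard search [!\!\.\"\?^l] plus two spaces.
# [^...] plays the role of Word's [!...]: any one character EXCEPT those listed.
# Assumption: Word's manual line break (^l) appears as "\n" in plain text.
EXTRA_SPACES = re.compile(r'[^!."?\n] {2}')

def find_extra_spaces(text):
    """Return the offset of each character that is followed by suspect spaces."""
    return [m.start() for m in EXTRA_SPACES.finditer(text)]

# Two spaces in mid-sentence are reported; two spaces after a full stop are not.
print(find_extra_spaces("Hello  world"))   # [4]
print(find_extra_spaces("Wait.  What?"))   # []
```

Inside a Python character class none of the punctuation needs a backslash, which is why the pattern looks a little shorter than the Word version.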

So three spaces will be caught anywhere. Two spaces will be ignored if they're preceded by standard end-of-sentence punctuation. Two spaces preceded by any other character will be caught, because the square brackets will match anything except the end of a sentence.
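The search only finds the problem; you still fix each hit by hand. For text that lives outside Word, here is a minimal sketch of how the same rule could drive an automatic clean-up in Python. It is my own extension rather than part of the Word recipe: collapse_extra_spaces is a hypothetical helper, the line break is again assumed to be "\n", and the pattern is widened to "two or more" spaces so a long run is fixed in one pass.

```python
import re

# Capture the character before the run so it can be put back unchanged.
# Assumption: the manual line break (^l in Word) is "\n" in plain text.
RUN_OF_SPACES = re.compile(r'([^!."?\n]) {2,}')

def collapse_extra_spaces(text):
    """Replace each suspect run of spaces with a single space."""
    return RUN_OF_SPACES.sub(r'\1 ', text)

# Mid-sentence doubles shrink to one space.
print(collapse_extra_spaces("Hello  world"))  # Hello world
# Three spaces after a full stop shrink to the two the article treats as correct.
print(collapse_extra_spaces("End.   Next") == "End.  Next")  # True
```

The three-space case works because the first space after the full stop is itself "any other character", so it is kept and the run that follows it is reduced to one more space.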