Mobile app version of vmapp.org

Why does InDesign not always recognize page breaks from Word?

@Deb5748823

Posted in: #MicrosoftWord

Some of my page breaks were not carried over into InDesign from Word. Knowing why would be great before I start designing my next book. Thank you for your help.


2 Comments


 

@Miguel516

SAVE AS…

This tip took me almost 20 years to discover. I have used Microsoft Word since its 1.05 beta version, and I use Adobe, Scribus, and Quark page layout programs. All of them have issues importing formatted text, but you can minimize the problems, which arise for different reasons and under different conditions.

Word has different kinds of "save". A normal save is often incremental: it works like a trailer hitched to the end of the existing file. A change to an existing document is appended to the end of the file. The next change is appended after that, and so on, regardless of whether the changes were made sequentially or at random places in the document.

When the file is loaded, the first-saved version is loaded. Then each change is applied in turn, first in, first out, until the latest alteration has been applied.

Importing this "freight train" of a document with changes is a recipe for something to go wrong.

Therefore, when working with Word, the final step should be a "SAVE AS". It consolidates all the changes, rewriting the dog's breakfast as a single, defragmented, contiguous file with no appended pieces, which can be imported with a minimum of file errors, drop-outs, etc.



 

@Murray976

Because Adobe's programmers are really bad at reading Microsoft's publicly available documentation. (Here, for example, is the lowdown on DOC.)

Well ... maybe not entirely.

The .doc format of Microsoft Word is more than 30 years old, the .docx format nigh on 9 years. In the meantime, there have been several versions of Word, with MS attempting to maintain backwards and forwards compatibility at each version. That means that apart from its current data, a typical Word file also contains a huge backlog of "old" data. Word (any version you like) knows which formatting items supersede which, and is thus able to reconstruct a document. InDesign, however, lags behind, so it attempts to use old structures that are no longer in use, and therefore sometimes drops the ball.

A few things it cannot handle (mostly observed in my daily use of CS4, although I am always - surly - delighted to hear these problems persist in newer versions) are:


- Restarting footnotes. InDesign not only lacks the infrastructure for this, it will also crash if the footnote numbering scheme changes from 'symbol' (an asterisk for the first note, for example) to digits.
- Footnotes in general may get lost entirely, or appear (Inception-like) as a footnote within a footnote. Lost footnotes are recognizable because their number changes to the "invalid character" box.
- Nested tables. It cannot read these (it freezes), although they are no problem in InDesign itself.
- General formatting. At times, InDesign fails to reset formatting for an entire paragraph, or from a certain 'new' attribute in the middle of a paragraph all the way to where that attribute gets used again. So you get random fonts, font sizes, and bolded/italicized text.
- Microsoft Word internally uses Windows' character maps, which are all well-defined. However, some characters do not get translated to InDesign's native Unicode scheme, for example en and em dashes and single and double quotes. These appear as 'unavailable' characters.
- Oh yes: the page breaks. InDesign has more break types than Word, and the ones it has should be fully compatible with Word's. However, at times a break is not recognized at all (text runs together, no hard return), is of the wrong type, or the import filter forgets to change its internal meta-code U+E012 to an actual page break.
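
Before importing, it can help to see which kinds of page break a Word file actually contains, because Word can force a new page in more than one way. The following is a minimal sketch, not anything InDesign itself does. It assumes the file is a .docx and looks only at the main document part, so a "page break before" setting that comes from a paragraph style will not show up. The function name `count_page_breaks` and the result keys are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_page_breaks(docx_file):
    """Count three ways the main document part can force a new page.

    docx_file is a path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    counts = {
        "manual_break": 0,            # <w:br w:type="page"/> (Ctrl+Enter)
        "page_break_before": 0,       # <w:pageBreakBefore/> on a paragraph
        "section_break_new_page": 0,  # section break that starts a new page
    }

    for br in root.iter(f"{{{W}}}br"):
        if br.get(f"{{{W}}}type") == "page":
            counts["manual_break"] += 1

    for ppr in root.iter(f"{{{W}}}pPr"):
        pbb = ppr.find(f"{{{W}}}pageBreakBefore")
        # The attribute may be absent (meaning "on") or explicitly switched off.
        if pbb is not None and pbb.get(f"{{{W}}}val", "true") not in ("0", "false", "off"):
            counts["page_break_before"] += 1

        # Only a sectPr inside a paragraph is a break within the text; the
        # final sectPr directly under <w:body> just describes the last section.
        sect = ppr.find(f"{{{W}}}sectPr")
        if sect is not None:
            kind = sect.find(f"{{{W}}}type")
            # A missing <w:type> is treated as the default, nextPage.
            val = kind.get(f"{{{W}}}val") if kind is not None else "nextPage"
            if val in ("nextPage", "oddPage", "evenPage"):
                counts["section_break_new_page"] += 1

    return counts
```

Calling `count_page_breaks("book.docx")` (the file name is a placeholder) returns the three counts. If the breaks that went missing in InDesign all fall into one category, that narrows down which kind of break the import filter is mishandling.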


Some of these problems can be lessened by always re-saving your Word document without Word's "quicksave" feature. "Quicksave", as I understand it, pastes changes at the end of a Word file in a compressed format, which makes the file harder for ID to read. This can show up as bits and pieces of text that cannot be put back in place and get pasted at the very end of the imported text.

There seem to be fewer problems with the .doc reader, so I always resave as .doc.
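
Whether a given .doc was last saved incrementally can, in principle, be read from the file header. The sketch below rests on an assumption about the layout in Microsoft's published MS-DOC specification: the File Information Block at the start of the `WordDocument` stream begins with the identifier 0xA5EC, and the 16-bit flag word at byte offset 10 holds `fComplex` (bit 2, set after an incremental save) and `cQuickSaves` (bits 4 to 7). Check that against the specification before relying on it. The function name `fast_save_info` is made up, and extracting the stream from the OLE container is left to a separate tool (the third-party `olefile` package is one option).

```python
import struct

WORD_MAGIC = 0xA5EC  # assumed wIdent value marking a Word binary file


def fast_save_info(word_document_stream: bytes) -> dict:
    """Read the save-related flags from the start of a WordDocument stream.

    Hypothetical helper; offsets and bit positions are assumptions taken
    from the MS-DOC FibBase description.
    """
    if len(word_document_stream) < 12:
        raise ValueError("stream too short to hold the FIB header")

    # Bytes 0-1: wIdent, bytes 2-3: nFib (file format version number).
    w_ident, n_fib = struct.unpack_from("<HH", word_document_stream, 0)
    if w_ident != WORD_MAGIC:
        raise ValueError("not a Word binary WordDocument stream")

    # Bytes 10-11: packed flag bits, least significant bit first.
    (flags,) = struct.unpack_from("<H", word_document_stream, 10)
    return {
        "nFib": n_fib,
        "fComplex": bool(flags & 0x0004),    # last save was incremental
        "cQuickSaves": (flags >> 4) & 0x0F,  # 4-bit quick-save counter
    }
```

If `fComplex` comes back true, a full "Save As" from Word before importing is the safer route.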

A final anecdote:

Once, I had had enough of those flaky import errors, as Word is the only formatted text format that ID can read (RTF doesn't really count; it has the same problems). So I sent a Word file to Adobe, plus the malformed InDesign document, plus annotated screenshots displaying some of the aforementioned import errors, highlighted and with hand-drawn arrows pointing them out.

The answer I got from an Adobe engineer was: "Well, of course you cannot expect the import filter to create the exact same document in InDesign. Page size and margins are ignored, and so are headers and footers. Also, the formatting engine is different, so you cannot expect paragraphs to break at the same position. Case closed."

The document I sent had missing footnotes, random characters instead of page breaks, and misplaced text fragments. I guess it just didn't have a high priority - and neither did reading my mail and looking at the screenshots.
