Opening PDF in Illustrator breaks up text objects
I recently had to convert some PDF files to SVG, and did this by opening the PDF in Illustrator, and saving out to SVG.
The issue was that when Illustrator opened the PDF, many (but not all) text objects were broken up into several separate text objects. For instance, the word "policy" wouldn't be one text object, but rather three: "po", "l", "icy". There didn't seem to be any rhyme or reason to it.
How can I prevent this from happening?
More posts by @Kevin459
2 Comments
If you'd like to merge broken text while preserving as much of the formatting, placement, paragraphs and other typography of the existing text as you can, rather than pasting into a newly created text area as plain text, you can try John Wundes' amazing Join Text Frames script.
It does exactly what it says on the tin: it merges snippets of text into one snippet of text, by making intelligent judgements based on where they are relative to each other:
It merges adjacent text snippets (e.g. from broken lines of text from PDFs) into single lines, with some control offered over how close is considered to be the same line.
It merges separate lines of text into one multi-line text object with the text in the right order (top to bottom), fixing broken paragraphs.
There's then an option to restore the original formatting of all the merged text. This can take a while, but it helpfully gives you pretty accurate estimates of how long it will take and the option to skip if it's not worth it.
It's really good!
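To make the idea of position-based merging concrete, here is a minimal Python sketch. It is not the script's actual code (the real script runs inside Illustrator); the `Snippet` class, the `join_snippets` function, the `line_tolerance` parameter and the coordinate convention (larger `y` means further down the page) are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str
    x: float  # left edge of the snippet
    y: float  # baseline position; larger means further down the page (assumed)


def join_snippets(snippets, line_tolerance=2.0):
    """Group snippets into lines by baseline, then join them top to bottom."""
    lines = []  # each entry is a list of snippets sharing roughly one baseline
    for s in sorted(snippets, key=lambda s: s.y):
        # Stay on the current line if this snippet is close to its first baseline.
        if lines and abs(lines[-1][0].y - s.y) <= line_tolerance:
            lines[-1].append(s)
        else:
            lines.append([s])
    # Within each line, order left to right and concatenate the fragments.
    return "\n".join(
        "".join(s.text for s in sorted(line, key=lambda s: s.x)) for line in lines
    )


# Hypothetical fragments, in the scrambled order a PDF import might produce.
fragments = [
    Snippet("icy", 18.0, 10.2),
    Snippet("po", 0.0, 10.0),
    Snippet("l", 12.0, 10.1),
    Snippet("applies", 0.0, 24.0),
]
print(join_snippets(fragments))
```

Here `line_tolerance` plays the role of the "how close is considered to be the same line" control. The Join Text Frames script itself works on Illustrator text frames rather than plain tuples like these.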
Note that it only works on point text, not area text (fine here since PDFs are almost always point text). If you're trying to merge area text for any reason, you can convert it with the Kelso Cartography 'Make point text' script.
It's also handy used in conjunction with AjarProductions' Convert to Text Area script (Kelso Cartography also have a similar script, see link above), if you want to turn broken text back into proper text areas with auto-flowing paragraphs:
Select the broken text snippets, run the Join Text Frames script
Copy and paste the text into a text editor that lets you find/replace paragraph characters (e.g. InDesign, a coder's plain text editor, or maybe even something like (whispers) Word...)
Find/replace away the unwanted end of line breaks. If there are many separate paragraphs which you want to preserve, 1) are you sure you wouldn't be better off using InDesign? 2) you could do it like this:
Find/replace two consecutive paragraph markers with some text placeholder that doesn't appear anywhere else in the text (e.g. |C.L.O.W.N.H.O.R.R.O.R/|/)
Find/replace paragraph markers with nothing or spaces: turning it into one long line of text with occasional bursts of |C.L.O.W.N.H.O.R.R.O.R/|/
Find/replace |C.L.O.W.N.H.O.R.R.O.R/|/ with a new paragraph character, which places one paragraph break wherever two were before.
Copy the text back in, and run the Convert to Text Area script on it. It's now one flowing text area with paragraph breaks in the right places.
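If you'd rather script the find/replace part than do it by hand, the three placeholder steps can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name `unwrap_paragraphs` is made up, the text is assumed to be already copied out of Illustrator into a string, and unwanted line breaks are replaced with spaces rather than nothing.

```python
PLACEHOLDER = "|C.L.O.W.N.H.O.R.R.O.R/|/"


def unwrap_paragraphs(text, placeholder=PLACEHOLDER):
    """Remove single line breaks but keep one break where two were before."""
    if placeholder in text:
        raise ValueError("placeholder already appears in the text; pick another")
    # Normalise Windows and old Mac line endings so every break is "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = text.replace("\n\n", placeholder)  # 1. protect real paragraph breaks
    text = text.replace("\n", " ")            # 2. remove unwanted line breaks
    return text.replace(placeholder, "\n")    # 3. restore one break per paragraph


sample = "The policy\napplies to all\nstaff.\n\nIt takes effect\nin May."
print(unwrap_paragraphs(sample))
```

The placeholder check at the top is there for the same reason the placeholder is so silly: the trick only works if that string appears nowhere else in the text.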
Unfortunately, nothing can fix outlined text except for stopping it from being outlined, re-typing it, or trusting potentially dodgy OCR software.
Generally this happens to maintain appearance. If the text interacts with other objects, it may be broken up. If the font embedded in the PDF is a subset rather than the entire font, Illustrator may break the text up so that it can fill in the characters missing from the subset.
There's little you can do to stop this from happening.
But you can correct it in Illustrator by selecting the text strings with the Direct Selection or Selection Tool, copying them, then starting a new point or area text object and pasting. The pasted text will be one string rather than pieces.