Export: Copy text from PDF without line breaks
There are some PDFs out in the wild where every line of text seems to be hardcoded, so when I copy a text block everything comes with it: line breaks and even "-" separators.
My question is: how do I create PDFs in InDesign where this behaviour doesn't happen?
Does somebody know more about this?
More posts by @Welton168
3 Comments
That's because PDFs can be generated in many ways by a number of programs and online apps. Each of these treats lines of text differently, so you can never tell how the text is actually stored until you try to copy and paste it from the PDF back into InDesign.
InDesign-exported PDFs, however, will generally keep the spaces at the end of each line, so you don't have to worry about a paragraph return being inserted after each line. To make 100% sure, check the Create Tagged PDF checkbox when you export a PDF from InDesign. Personally, I always check this box and include it in any presets I am using. More details about this option here.
If you do run into a badly exported PDF and need to clean up the trailing paragraph returns after each line of text, the quickest option is Find/Change. Type ^p in the Find what field and put a single space in the Change to field. Select either Story or Selection below, depending on your situation, and this should clean up your text. Note that this also joins genuine paragraphs together, so run it on one paragraph at a time if you need to keep the paragraph breaks.
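If the text is going somewhere other than InDesign, the same cleanup can be done with a small script. The following is only a minimal sketch, not an InDesign feature: the function name `unwrap_pdf_text` is made up for illustration, and it assumes that a blank line marks a real paragraph boundary and that a hyphen directly before a line break is a hyphenation mark rather than part of a compound word.

```python
import re


def unwrap_pdf_text(text: str) -> str:
    """Join hard-wrapped lines copied from a PDF into flowing paragraphs.

    Assumptions (not guaranteed for every PDF):
    - a blank line separates real paragraphs;
    - "-" followed by a line break is a hyphenation mark, not a compound.
    """
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    cleaned = []
    # Split on blank lines, which are treated as real paragraph boundaries.
    for para in re.split(r"\n\s*\n", text):
        # Rejoin words split across lines: "hard-\ncoded" -> "hardcoded".
        para = re.sub(r"(\w)-\n(\w)", r"\1\2", para)
        # Turn every remaining line break into a single space.
        para = re.sub(r"[ \t]*\n[ \t]*", " ", para).strip()
        if para:
            cleaned.append(para)

    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "every line seems to be hard-\ncoded so when I copy\na text block"
    print(unwrap_pdf_text(sample))
```

The hyphen rule is the risky part: a real compound such as "well-known" that happens to break at the hyphen will lose it, so the result is worth a quick proofread.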
One way that works is to export the PDF as HTML from Acrobat Pro, open that file in your web browser, and then copy the text from there.
Unlike exporting as plain text, the HTML export usually doesn't break lines.
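If there is a lot of text to pull out, the copy step from the browser can also be scripted. This is a rough sketch using only Python's standard `html.parser` module. It assumes the exported HTML wraps each paragraph in a `<p>` element, which I have not verified for every Acrobat version. The names `ParagraphExtractor` and `html_paragraphs` are made up for this example.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text of each <p> element as one unbroken string."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None while we are outside a <p> element

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            # Collapse line breaks and runs of whitespace into single spaces.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None

    def handle_data(self, data):
        # Keep text only while inside a <p>; inline tags like <b> pass through.
        if self._buffer is not None:
            self._buffer.append(data)


def html_paragraphs(html: str) -> list:
    """Return the paragraphs found in an HTML string, one string each."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


if __name__ == "__main__":
    page = "<p>First line\nwrapped in <b>source</b>.</p><p>Second.</p>"
    for paragraph in html_paragraphs(page):
        print(paragraph)
```

Each returned string is one paragraph with no internal line breaks, ready to paste.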
To my knowledge, you can't prevent this from InDesign; it seems to be a behavior that comes from the PDF or the PDF software. It's possible that any publishing software that uses text frames/boxes will create that kind of text in a PDF.
It's because this is how PDFs recognise text: every line in fact becomes a paragraph (hence the return at the end of it). There is no way around it; you have to change it globally in the document after copying, using the Find/Replace option and hidden characters.