
Export: Copy text from pdf without line breaks

@Welton168

Posted in: #AdobeIndesign #Export #Pdf

There are some PDFs out in the wild where every line of text seems to be hardcoded, so when I copy a text block everything comes with it: line breaks and even "-" separators.

My question is: how do I create PDFs in InDesign where this behaviour doesn't happen?

Does somebody know more about this?


3 Comments


@Ravi4787994

That's because PDFs can be generated in many ways by many different applications and online tools. Each of these treats lines of text differently, so you can never tell how the text is actually stored until you try to copy and paste it from the PDF back into InDesign.

InDesign-exported PDFs, however, will generally keep the spaces at the end of each line, so you don't have to worry about a paragraph return being inserted after each line. To make 100% sure, check the Create Tagged PDF checkbox when you export a PDF from InDesign. Personally, I always check this box and include it in any presets I am using. More details about this option here.

If you do run into a badly-exported PDF and need to clean up the trailing paragraph returns after each line of text, the quickest option is Find/Replace. Type ^p in the Find what field and put a blank space in the Change to field. Select either Story or Selection below depending on your situation and this should clean up your text.
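If the text ends up somewhere other than InDesign, the same cleanup can be scripted. Below is a minimal Python sketch, not an InDesign feature: the function name `unwrap_pdf_text` is made up for illustration, and it assumes that a blank line marks a real paragraph boundary and that a hyphen directly before a line break is a soft hyphen that should be dropped.

```python
import re


def unwrap_pdf_text(text):
    """Join hard-wrapped lines copied from a PDF into flowing paragraphs."""
    # Normalise Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Assumption: blank lines separate real paragraphs.
    paragraphs = re.split(r"\n\s*\n", text)

    cleaned = []
    for para in paragraphs:
        # Rejoin words split by a hyphen at the end of a line.
        # Assumption: such hyphens are soft; a genuine compound such as
        # "well-\nknown" would lose its hyphen here.
        para = re.sub(r"(\w)-\n(\w)", r"\1\2", para)
        # Turn every remaining line break into a single space.
        para = re.sub(r"\s*\n\s*", " ", para)
        cleaned.append(para.strip())

    # Drop empty paragraphs and separate the rest with one blank line.
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    sample = "every line seems to be hard-\ncoded so copying\nbrings the breaks too.\n\nNext paragraph\nhere."
    print(unwrap_pdf_text(sample))
```

Treat the hyphen rule as a starting point and check the result by eye, since the script cannot tell a soft hyphen from a real one.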


@Rambettina927

One way that works is to export the PDF as HTML from Acrobat Pro, open that file in your web browser and then copy the text from there.

Unlike exporting to plain text, the HTML export usually doesn't break lines.

To my knowledge, you can't prevent this from InDesign; it seems to be a behavior that comes from the PDF or the PDF software. It's possible that any publishing software that uses "text frames/boxes" will create that kind of text in a PDF.
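If there is a lot of text to copy, the HTML route from the start of this comment could be scripted instead of going through the browser. This is only a rough Python sketch using the standard library: it assumes the exported HTML wraps each paragraph in a `<p>` element, which may not hold for every export, and the names `ParagraphText` and `html_paragraphs` are made up for illustration.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text of each <p> element as one unbroken string."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buf = None  # None while we are outside a <p> element

    def _flush(self):
        if self._buf is not None:
            # str.split() with no argument collapses all whitespace,
            # including line breaks present in the HTML source.
            self.paragraphs.append(" ".join("".join(self._buf).split()))
            self._buf = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # tolerate a <p> that was never closed
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)


def html_paragraphs(html):
    """Return the non-empty paragraph texts found in an HTML string."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    parser._flush()  # keep text from a trailing, unclosed <p>
    return [p for p in parser.paragraphs if p]


if __name__ == "__main__":
    demo = "<body><p>First line\nsecond <b>bold</b> line.</p><p>Next</p></body>"
    print(html_paragraphs(demo))
```

Text outside `<p>` elements (headings, list items) is ignored in this sketch, so it would need extending for documents that rely on those.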


@Si6392903

It's because this is how PDFs recognise text: every line in fact becomes a paragraph (hence the return at the end of it). There is no way round it; you have to change it globally in the document after copying, using the Find/Replace option and hidden characters.
