
How to change PDF text encoding? (ANSI to Unicode)

@Jessie844

Posted in: #Pdf #Text

I have a problem with a PDF I am trying to copy text from. I need to insert the text into an HTML page, but when I copy it, some of the letters (the ones with diacritics, like Ț or Ș) are left out, so the words containing them are no longer correct.

I found out that this is because the PDF uses ANSI font encoding while the browser uses Unicode. How can I convert the ANSI encoding in the PDF to Unicode?


1 Comment


@Ann6370331

If the problem is indeed what you describe, Notepad++ should do what you want, and it's free. Create a new document in Notepad++, make sure 'Encode in ANSI' is selected in the Encoding menu, paste the text there, then choose 'Convert to UTF-8 without BOM' in the Encoding menu.
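If you would rather script it, the same idea can be sketched in Python. This is only a rough sketch under assumptions: it guesses that the text was really in the Central European code page (cp1250) but got read as Western European (cp1252), and the names `reinterpret` and `to_romanian` are made up for illustration. Your PDF may use a different pair of encodings.

```python
# Assumption: the bytes were written in cp1250 (Central European "ANSI")
# but were read as cp1252 (Western European "ANSI"). Both are guesses.

def reinterpret(text: str, read_as: str = "cp1252", actual: str = "cp1250") -> str:
    """Undo a wrong decode: recover the original bytes, then decode them correctly."""
    return text.encode(read_as).decode(actual)

# Legacy code pages such as cp1250 only contain the cedilla letters (Ş ş Ţ ţ).
# Romanian text normally uses the comma-below letters (Ș ș Ț ț), so map them over.
CEDILLA_TO_COMMA = str.maketrans(
    "\u015e\u015f\u0162\u0163",  # Ş ş Ţ ţ
    "\u0218\u0219\u021a\u021b",  # Ș ș Ț ț
)

def to_romanian(text: str) -> str:
    """Repair the mis-decoded text and switch to comma-below diacritics."""
    return reinterpret(text).translate(CEDILLA_TO_COMMA)

garbled = "\u00fear\u00e3"              # what the copied word looks like: "þarã"
fixed = to_romanian(garbled)            # "țară"
utf8_bytes = fixed.encode("utf-8")      # bytes ready to save in a UTF-8 HTML page
print(fixed)
```

If `encode` or `decode` raises an error, the assumed code pages are probably wrong for your text. Note that this only helps when the characters come out garbled; if the copy drops them completely, there is nothing left to convert.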

You can also try Decoder, a free online tool for fixing encoding problems. It's in Russian, but usage is pretty straightforward: paste the mangled text into the text box and hit the button that says "Расшифровать" ("Decode").
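I don't know how Decoder works internally, but when you are not sure which encodings are involved, a brute-force version of the same trick is easy to sketch in Python. The candidate list and the `guess_repairs` name are my own assumptions, not anything taken from that tool; add or remove code pages as needed and pick the result that reads correctly.

```python
from itertools import permutations

# Hypothetical shortlist of code pages to try; extend it for other languages.
CANDIDATES = ["cp1250", "cp1251", "cp1252", "iso8859_2", "iso8859_16"]

def guess_repairs(text: str, codecs=CANDIDATES) -> dict:
    """Try every (read_as, actual) pair and collect the results that decode cleanly."""
    results = {}
    for read_as, actual in permutations(codecs, 2):
        try:
            repaired = text.encode(read_as).decode(actual)
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # this pair cannot have produced the text
        if repaired != text:
            results[(read_as, actual)] = repaired
    return results

# Print every candidate so a human can choose the one that looks right.
for (read_as, actual), repaired in guess_repairs("\u00fear\u00e3").items():
    print(f"{read_as} -> {actual}: {repaired}")
```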
