Converting PDF Files with Embedded Text

Converted document contains strange characters

Written by PDF2XL Support
Updated over a week ago

Some PDF documents use an embedded font set. When PDF2XL converts text that uses embedded characters, the preview and the converted output can look like gibberish. Sometimes only some of the characters are replaced.

There are two ways to determine whether or not a font is embedded:

Method 1

  1. Open the PDF with Acrobat Reader.

  2. Go to File > Properties and select the “Fonts” tab. 

  3. Look for the term “(Embedded Subset)”, which can indicate that there are embedded fonts.

*Note that embedded fonts are not always indicated here, so this is not a foolproof method.
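The same check can be roughly approximated in a script. The sketch below is not part of PDF2XL or Acrobat; it assumes you have already obtained the font names stored in the PDF by some other means, and it relies only on the common PDF convention that a subset font's name starts with six uppercase letters and a plus sign (for example "ABCDEF+Arial"). The function names are hypothetical.

```python
import re

# Convention for subset fonts in PDF files: a tag of six uppercase letters,
# a plus sign, then the original font name, e.g. "ABCDEF+Arial".
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+.+")


def is_subset_font_name(font_name: str) -> bool:
    """Return True if the font name carries a subset tag.

    A leading "/" is ignored, since PDF name objects are often
    written as "/ABCDEF+Arial".
    """
    return bool(SUBSET_TAG.match(font_name.lstrip("/")))


def subset_fonts(font_names):
    """Filter a list of font names down to those that look like subsets."""
    return [name for name in font_names if is_subset_font_name(name)]


if __name__ == "__main__":
    # Hypothetical font names, for illustration only.
    names = ["Helvetica", "QWERTY+Calibri", "/XYZABC+TimesNewRoman"]
    print(subset_fonts(names))
```

As with the Fonts tab, this is only an indicator: a font can be embedded without carrying a subset tag.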

Method 2

  1. Open the PDF with Acrobat Reader.

  2. Try to copy the text and paste it into another application such as Notepad.

  3. If the pasted text does not match what Acrobat displays, the cause is an embedded font.

  4. If the text was copied correctly from Acrobat, this is a PDF2XL bug (please email us at [email protected]).
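If you need to check many pasted samples, the comparison in step 3 can be partly automated. The sketch below is a heuristic and an assumption, not something PDF2XL provides: it flags text that contains a large share of control, private-use, unassigned, or replacement characters, which is one common sign of a failed copy. The function names and the 0.2 threshold are hypothetical choices.

```python
import unicodedata


def suspicious_ratio(text: str) -> float:
    """Share of non-whitespace characters that look like a failed copy."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    bad = 0
    for c in chars:
        category = unicodedata.category(c)
        # Cc = control, Co = private use, Cn = unassigned;
        # U+FFFD is the Unicode replacement character.
        if category in ("Cc", "Co", "Cn") or c == "\ufffd":
            bad += 1
    return bad / len(chars)


def looks_garbled(text: str, threshold: float = 0.2) -> bool:
    """Return True if the suspicious share exceeds the (arbitrary) threshold."""
    return suspicious_ratio(text) > threshold


if __name__ == "__main__":
    print(looks_garbled("Invoice total 42"))
    print(looks_garbled("\ue001\ue002\ue003 \ufffd"))
```

This heuristic cannot catch every case. Text that comes out as ordinary but wrong letters will pass, so a visual comparison against the PDF remains the reliable test.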

Embedded fonts are more common in documents written in languages such as Hebrew, Arabic, and Japanese, but they also appear when non-standard fonts are used.

To convert a document with embedded fonts, you will need the OCR capability available in the PDF2XL Business or Enterprise plans, or in PDF2XL Pro.

*There is no guarantee that OCR will convert text in embedded fonts correctly every time.
