Sanskrit lines are not shown legibly

The PDF that I tried to read renders the Sanskrit verses unintelligibly.

Like this:

Screenshot_2023-02-27_23-41-57

However, that should not occur.
Are there fonts I can install to make them legible?

There is an AUR package that may be helpful, if you have not tried it yet.

paru -Si sanskrit-fonts
Repository      : aur
Name            : sanskrit-fonts
Version         : r33.15a2dd4-1
Description     : Various unicode fonts for Sanskrit (Scripts: Devanagari, Kannada, Tamil, Telugu, Malayalam, Oriya). Fonts include: Siddhanta, Chandas,
                  Uttara, Sanskrit2003, Noto Sans Devanagari.
URL             : http://github.com/indic-transliteration/sanskrit-fonts
AUR URL         : https://aur.archlinux.org/packages/sanskrit-fonts

Didn’t work out. Still showing me illegible lines.

Too bad, sorry I was not able to be more helpful.

Try ttf-indic-otf

Edit:

Just a sanity check: you need to close and reopen that specific PDF for the newly installed fonts to be picked up.

Also, now that I think of it, aren’t PDFs supposed to be readable without requiring specific fonts? I’m not super sure about this, could someone confirm?


Depends. A PDF can embed specific fonts, but some rely on system fonts, or even use SVG / images instead of font glyphs.


It is very likely that it’s a broken PDF. Does it display properly on other systems, like on your phone?


Not a problem.

It’s still illegible.

I suggest you first check which fonts the PDF is actually using. That can tell you what is missing on your PC.

The poppler package provides a tool, pdffonts, which lists the fonts used in a given PDF document.

Example:

# pdffonts Teledat_USB_2ab.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
Tele-Grotesk-Halb                    Type 1C           Custom           yes no  yes    378  0
MEBCMG+Tele-Antiqua                  Type 1C           Custom           yes yes yes    381  0
Tele-Grotesk-Fett                    Type 1C           Custom           yes no  yes    399  0
Tele-Grotesk-Norm                    Type 1C           Custom           yes no  yes    397  0
TeleTasten                           Type 1C           WinAnsi          yes no  no     401  0
Helvetica-Bold                       Type 1            WinAnsi          no  no  no     259  0
HelveticaNeue-Light                  Type 1            WinAnsi          no  no  no     263  0

The column “emb” tells you whether a font is embedded in the PDF (yes) or has to be loaded from your system (no).
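If the list is long, a small script can pick out the rows where “emb” is “no”. This is only a rough sketch: the function name non_embedded_fonts is made up for illustration, and it assumes the column layout shown above and font names without spaces. It works on text you have already captured from pdffonts, so it does not call poppler itself.

```python
def non_embedded_fonts(pdffonts_output):
    """Return names of fonts whose "emb" column is "no" (in order, no duplicates).

    Assumes the pdffonts layout shown in this thread and font names
    that contain no spaces (an assumption, not a guarantee).
    """
    names = []
    for line in pdffonts_output.splitlines():
        fields = line.split()
        # The type column may contain spaces ("Type 1C", "CID TrueType"),
        # so count from the end: emb, sub, uni, object number, generation.
        if len(fields) < 8 or fields[-5] not in ("yes", "no"):
            continue  # blank line, header row, or dashed separator
        if fields[-5] == "no" and fields[0] not in names:
            names.append(fields[0])
    return names


# A few rows copied from the example output above.
SAMPLE = """\
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
Tele-Grotesk-Halb                    Type 1C           Custom           yes no  yes    378  0
TeleTasten                           Type 1C           WinAnsi          yes no  no     401  0
Helvetica-Bold                       Type 1            WinAnsi          no  no  no     259  0
HelveticaNeue-Light                  Type 1            WinAnsi          no  no  no     263  0
"""

for font in non_embedded_fonts(SAMPLE):
    print(font)
```

For the sample rows this prints Helvetica-Bold and HelveticaNeue-Light, i.e. the two fonts the viewer has to find on the system.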


As @Kresimir said, the PDF you are using is definitely broken or there is some problem with it.

The image you provided is showing the following text:

image

You can clearly see that it is impossible to tell what is written there. There might be a mismatch in the character encoding supported by the reader and writer applications.

IMHO, there are two possibilities

  1. The font is embedded in the PDF. Then it should just work.
  2. The font is not embedded. Then you need the font installed on your system.

As @mbod showed, pdffonts will tell you which one applies.

You’re leaving out the most likely possibility:

The PDF file is broken and will never show correctly.


This could be, of course. Would be possibility 3.

I guess pdffonts would complain if the PDF were broken, but pdfinfo might be the best tool for checking.

Downloading some of the required fonts didn’t make the PDF more legible. Seems the PDF is broken.

Did you try running pdfinfo and pdffonts?


Yes

Do you want to share what the outcome was?

Output of pdfinfo:

Title:           1-Inner Title.pmd
Author:          TTD Publications
Creator:         PageMaker 7.0
Producer:        Adobe Acrobat 8.0
CreationDate:    Sat Jul 21 15:56:53 2018 IST
ModDate:         Sat Jul 21 15:56:53 2018 IST
Custom Metadata: no
Metadata Stream: yes
Tagged:          no
UserProperties:  no
Suspects:        no
Form:            none
JavaScript:      no
Pages:           533
Encrypted:       no
Page size:       612 x 792 pts (letter)
Page rot:        0
File size:       1754696 bytes
Optimized:       yes
PDF version:     1.6

Output of pdffonts:

name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
GKIMPN+NeoNatrajMedium               Type 1C           Custom           yes yes yes   2403  0
GKINDN+NeoNatrajBold                 Type 1C           Custom           yes yes yes   2404  0
GKINGN+NeoGaneshBold                 Type 1C           Custom           yes yes yes   2414  0
GKININ+NeoYogeshMedium               Type 1C           Custom           yes yes yes   2415  0
GKINLN+NeoNatrajLight                Type 1C           Custom           yes yes yes   2416  0
GKINMO+BookAntiqua-Bold              TrueType          WinAnsi          yes yes no    1105  0
GKINOA+BookAntiqua-Italic            TrueType          WinAnsi          yes yes no    1104  0
GKINOC+Calibri                       TrueType          WinAnsi          yes yes no    1106  0
GKINPD+BookAntiqua                   TrueType          WinAnsi          yes yes no    1107  0
TimesNewRomanPS-BoldMT               TrueType          WinAnsi          no  no  no    1109  0
TimesNewRomanPSMT                    TrueType          WinAnsi          no  no  no    1110  0
TimesNewRomanPS-BoldItalicMT         TrueType          WinAnsi          no  no  no    1111  0
TimesNewRomanPSMT                    TrueType          WinAnsi          no  no  no    1117  0
TimesNewRomanPS-BoldMT               TrueType          WinAnsi          no  no  no    1118  0
ArialUnicodeMS                       TrueType          WinAnsi          no  no  no    1114  0
Calibri                              TrueType          WinAnsi          no  no  no    1120  0
BookAntiqua-Bold                     TrueType          WinAnsi          no  no  no    1122  0
TimesNewRomanPS-BoldItalicMT         TrueType          WinAnsi          no  no  no    1123  0
DV-TTYogeshBold                      TrueType          WinAnsi          no  no  no    1130  0
DV-TTYogeshNormal                    TrueType          WinAnsi          no  no  no    1128  0
TimesNewRomanPS-ItalicMT             TrueType          WinAnsi          no  no  no    1131  0
FNTSBS+Wingdings-Regular             CID TrueType      Identity-H       yes yes yes   1135  0
ArialMT                              TrueType          WinAnsi          no  no  no    1134  0
Mangal                               TrueType          WinAnsi          no  no  no    1136  0
Gautami                              TrueType          WinAnsi          no  no  no    1137  0
FNTSBS+Mangal                        CID TrueType      Identity-H       yes yes yes   1140  0
DV-TTYogeshBold                      TrueType          WinAnsi          no  no  no    1147  0
TL-TTHemalatha-Bold                  TrueType          WinAnsi          no  no  no    1148  0
TimesNewRomanPSMT                    TrueType          WinAnsi          no  no  no    1142  0
DV-TTYogeshNormal                    TrueType          WinAnsi          no  no  no    1144  0
TL-TTHemalatha-Normal                TrueType          WinAnsi          no  no  no    1145  0
DV-TTYogeshBold                      TrueType          WinAnsi          no  no  no    1151  0
DV-TTYogeshNormal                    TrueType          WinAnsi          no  no  no    1152  0
DV1-TTYogeshNormal                   TrueType          WinAnsi          no  no  no    1154  0
TimesNewRomanPSMT                    TrueType          WinAnsi          no  no  no    1158  0

I removed many of the fonts I had previously installed specifically for this file, because they did not make it any more legible than it was before. So, those fonts won’t be displayed here.

So the output of both commands suggests that the PDF is not corrupted in any way.

My understanding: all fonts that are shown as not embedded must be available on your system for the PDF document to be displayed correctly. If such a font is not available on your system, the PDF viewer tries to substitute a suitable font (whatever “suitable” means). This could lead to rubbish here, as the text is in देवनागरी (Devanagari).
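To get a feeling for what such a substitution might look like, one could ask fontconfig which installed font it would pick for each missing name. The sketch below is an approximation only: it assumes fc-match is on the PATH, treats the PostScript names from pdffonts as family names, and the helper names (escape_family, fontconfig_match, substitution_report) are invented for this post. The viewer’s own lookup may well differ from what fc-match reports.

```python
import subprocess


def escape_family(name):
    """Escape characters that are special in a fontconfig pattern string.

    In fontconfig patterns a bare '-' starts the size part, so a name
    like "DV-TTYogeshBold" has to be written with a backslash before it.
    """
    for ch in ("\\", "-", ":", ","):  # backslash first, so escapes are not doubled
        name = name.replace(ch, "\\" + ch)
    return name


def fontconfig_match(name):
    """Return fc-match's answer for `name` (assumes fc-match is installed)."""
    result = subprocess.run(
        ["fc-match", escape_family(name)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def substitution_report(font_names, matcher=fontconfig_match):
    """Map each distinct font name to whatever `matcher` says would be used."""
    report = {}
    for name in font_names:
        if name not in report:  # pdffonts often lists the same font several times
            report[name] = matcher(name)
    return report
```

Calling substitution_report(["DV-TTYogeshNormal", "Mangal", "Gautami"]) and printing the result would show one line of fc-match output per font. If the answer for a Devanagari font is a plain Latin font, that would fit the garbage seen in the screenshot.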
