OCR to Add Text to File

[limo@limo ~]$ pacman -Qo /usr/lib/python3.10/site-packages/jinja2
/usr/lib/python3.10/site-packages/jinja2/ is owned by python-jinja 1:3.1.2-2
[limo@limo ~]$ 

OK, so…that is all fine. You have jinja2 installed through a proper package.

Don’t worry, nothing bad will happen: jinja2 doesn’t access the system (Jinja is a simple, Pythonic template language written in Python). But yes, you shouldn’t use sudo with pip; @dalto and @2000 are right about that.

Well, in this strange situation having something installed is a great achievement :rofl:
But I still don’t understand this:

==> ERROR: A failure occurred in build().
    Aborting...
 -> error making: python-coloredlogs
[limo@limo Downloads]$ 

No, that is not the issue. As soon as something tries to pull in python-jinja as a dependency pacman will throw errors.

That is the issue with using sudo pip on an Arch-based system. pacman expects to manage all the system files. If you manually install things, it will cause chaos.

This is a common issue people have with their systems where they can no longer update or install software because they used sudo pip.
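For anyone who wants to check whether sudo pip left files behind, here is a rough sketch. It relies on pacman -Qo (the same query used at the top of this thread) failing for a path that no package owns. The function name list_unowned is made up for illustration, and it simply does nothing on a system without pacman.

```shell
# Print every top-level entry in the given directory that no pacman package owns.
# list_unowned is a hypothetical helper name, not part of pacman.
list_unowned() {
  dir=$1
  # Skip quietly on systems that do not have pacman at all.
  command -v pacman >/dev/null 2>&1 || { echo "pacman not found; skipping" >&2; return 0; }
  for entry in "$dir"/*; do
    [ -e "$entry" ] || continue
    # pacman -Qo exits non-zero when no installed package owns the path.
    pacman -Qo "$entry" >/dev/null 2>&1 || printf '%s\n' "$entry"
  done
}
```

With the path from this thread it would be called as list_unowned /usr/lib/python3.10/site-packages. Expect some noise in the output (cache directories, for example), so treat it as a list of things to look at rather than a list of things to delete.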

Did you want to install ocrmypdf with an Arabic font?
There are often problems that have to do with the system font.

Is there a way to revert this?
Would it be better to reinstall?
Or is my system still OK as it is?

You are OK.

I am not sure why you are getting that error though.

Does this give a clue?

==> Sources are ready.
==> Making package: python-coloredlogs 15.0.1-3 (Wed Jul 20 18:24:58 2022)
==> Checking runtime dependencies...
==> Checking buildtime dependencies...
==> WARNING: Using existing $srcdir/ tree
==> Starting build()...
running build
running build_py
creating build
creating build/lib
creating build/lib/coloredlogs
copying coloredlogs/cli.py -> build/lib/coloredlogs
copying coloredlogs/syslog.py -> build/lib/coloredlogs

The
WARNING: Using existing $srcdir/ tree

You can remove the installed package with:

pip uninstall jinja2

You may need to remove the jinja2 folders manually.

Edit: ask @dalto before you do that! :smile:

Using pip without sudo is not an issue.

However, I don’t think it will help in this situation.

[limo@limo ~]$ pip install coloredlogs
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: coloredlogs in ./.local/lib/python3.10/site-packages (15.0.1)
Requirement already satisfied: humanfriendly>=9.1 in /usr/lib/python3.10/site-packages (from coloredlogs) (10.0)

Defaulting to user installation because normal site-packages is not writeable :thinking:

Why remove it? Isn’t it needed?

Like I said, using pip won’t help in this situation.

I tried

[limo@limo ~]$ pip install coloredlogs
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: coloredlogs in ./.local/lib/python3.10/site-packages (15.0.1)
Requirement already satisfied: humanfriendly>=9.1 in /usr/lib/python3.10/site-packages (from coloredlogs) (10.0)
[limo@limo ~]$ pip install ocrmypdf
Defaulting to user installation because normal site-packages is not writeable
Collecting ocrmypdf
  Downloading ocrmypdf-13.6.1-py37-none-any.whl (127 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 127.7/127.7 kB 638.9 kB/s eta 0:00:00
Requirement already satisfied: Pillow>=8.2.0 in ./.local/lib/python3.10/site-packages (from ocrmypdf) (9.2.0)
Requirement already satisfied: coloredlogs>=14.0 in ./.local/lib/python3.10/site-packages (from ocrmypdf) (15.0.1)
Requirement already satisfied: img2pdf>=0.3.0 in /usr/lib/python3.10/site-packages (from ocrmypdf) (0.4.4)
Requirement already satisfied: packaging>=20 in /usr/lib/python3.10/site-packages (from ocrmypdf) (21.3)
Requirement already satisfied: pdfminer.six!=20200720,>=20191110 in ./.local/lib/python3.10/site-packages (from ocrmypdf) (20220524)
Requirement already satisfied: pikepdf!=5.0.0,>=4.0.0 in /usr/lib/python3.10/site-packages (from ocrmypdf) (5.3.1)
Requirement already satisfied: pluggy>=0.13.0 in /usr/lib/python3.10/site-packages (from ocrmypdf) (1.0.0)
Requirement already satisfied: reportlab>=3.5.66 in /usr/lib/python3.10/site-packages (from ocrmypdf) (3.6.10)
Requirement already satisfied: tqdm>=4 in ./.local/lib/python3.10/site-packages (from ocrmypdf) (4.64.0)
Requirement already satisfied: humanfriendly>=9.1 in /usr/lib/python3.10/site-packages (from coloredlogs>=14.0->ocrmypdf) (10.0)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/lib/python3.10/site-packages (from packaging>=20->ocrmypdf) (3.0.9)
Requirement already satisfied: charset-normalizer>=2.0.0 in ./.local/lib/python3.10/site-packages (from pdfminer.six!=20200720,>=20191110->ocrmypdf) (2.1.0)
Requirement already satisfied: cryptography>=36.0.0 in /usr/lib/python3.10/site-packages (from pdfminer.six!=20200720,>=20191110->ocrmypdf) (37.0.4)
Requirement already satisfied: deprecation in /usr/lib/python3.10/site-packages (from pikepdf!=5.0.0,>=4.0.0->ocrmypdf) (2.1.0)
Requirement already satisfied: lxml>=4.0 in /usr/lib/python3.10/site-packages (from pikepdf!=5.0.0,>=4.0.0->ocrmypdf) (4.9.1)
Requirement already satisfied: cffi>=1.12 in /usr/lib/python3.10/site-packages (from cryptography>=36.0.0->pdfminer.six!=20200720,>=20191110->ocrmypdf) (1.15.1)
Requirement already satisfied: pycparser in /usr/lib/python3.10/site-packages (from cffi>=1.12->cryptography>=36.0.0->pdfminer.six!=20200720,>=20191110->ocrmypdf) (2.21)
Installing collected packages: ocrmypdf
  WARNING: The script ocrmypdf is installed in '/home/limo/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed ocrmypdf-13.6.1
[limo@limo ~]$ 

It says “Successfully installed ocrmypdf-13.6.1”.
But could fixing this warning be the solution?
“WARNING: The script ocrmypdf is installed in ‘/home/limo/.local/bin’ which is not on PATH. Consider adding this directory to PATH”

I have “ocrmypdf” in “/home/limo/.local/bin/”, but even from inside that folder it is not found:

[limo@limo bin]$ ocrmypdf "/home/limo/Downloads/filename.pdf"
bash: ocrmypdf: command not found
[limo@limo bin]$ 

Yes, that should work.
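As for why it fails even from inside that folder: bash looks up a bare command name in the PATH directories only, not in the current directory. A small demonstration with a throwaway script (demo-tool and the temporary directory are made up for this example; nothing here touches ocrmypdf):

```shell
# Create a throwaway directory with a small executable in it.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo-tool"\n' > "$demo_dir/demo-tool"
chmod +x "$demo_dir/demo-tool"

# Run the rest in a subshell so the caller's working directory is untouched.
(
  cd "$demo_dir" || exit 1
  # A bare name is looked up in PATH only, so on a typical setup this
  # fails even though the file sits in the current directory.
  demo-tool 2>/dev/null || echo "demo-tool: not found via PATH"
  # An explicit path bypasses the PATH lookup.
  ./demo-tool
)
```

So ./ocrmypdf from inside /home/limo/.local/bin, or the full path from anywhere, should also run it. Adding the directory to PATH is the more convenient fix.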

Hopefully!
Sorry, what command do I use to add it to PATH? (sorry for being that…)

If you are using bash as your shell, edit ~/.bashrc and add this:

export PATH=$PATH:/home/limo/.local/bin/

Then type exec bash or source ~/.bashrc
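A slightly more defensive variant, in case ~/.bashrc gets sourced more than once: only append the directory when it is not already in PATH. This is just a sketch; add_to_path is a made-up helper name, and $HOME/.local/bin is assumed to be the same directory as /home/limo/.local/bin above.

```shell
# Append a directory to PATH unless it is already one of the entries.
# add_to_path is a hypothetical helper name used for illustration.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                         # already present, nothing to do
    *) PATH="$PATH:$1"; export PATH ;;   # otherwise append it
  esac
}

add_to_path "$HOME/.local/bin"
```

Wrapping PATH in colons on both sides lets the pattern match the first and last entries the same way as the ones in the middle.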

What I meant was: in case you want to undo it.

Wonderful!
It worked!
I could scan and export to another file and select text from it.

The only strange thing now is that when I copy the added “text”, it comes out as English letters, not Arabic!
Even though I installed

[limo@limo ~]$ yay -Syyu tesseract-data-eng
[limo@limo ~]$ yay -Syyu tesseract-data-ara

UPDATE:
Command not found again, though it worked after I ran

export PATH=$PATH:/home/limo/.local/bin/

Then it worked again after I ran the command again!
How do I make this permanent?
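An export typed at the prompt only lasts for that terminal session, which would explain why it keeps disappearing. To persist it, the line has to live in a startup file such as ~/.bashrc (assuming bash is the shell). A minimal sketch that appends the line once and never duplicates it; persist_line and its defaults are made up for illustration:

```shell
# Append a line to a shell startup file unless that exact line is already there.
# persist_line is a hypothetical helper; both arguments are optional.
persist_line() {
  default_line='export PATH="$PATH:$HOME/.local/bin"'
  rc_file=${1:-"$HOME/.bashrc"}
  line=${2:-$default_line}
  touch "$rc_file"
  # -x matches whole lines only, -F treats the text literally (no regex).
  grep -qxF -- "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}

# Usage (not run here): persist_line, then open a new terminal
# or run: source ~/.bashrc
```

Editing ~/.bashrc by hand achieves the same thing; the function only guards against adding the line twice.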

UPDATE:
Arabic letters worked OK. I searched and found I have to add “-l ara” to the command.

ocrmypdf -l ara source.pdf target.pdf
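If the output still comes out in the wrong script, it may be worth confirming that tesseract actually sees the language data. A small sketch, assuming tesseract is installed and that its --list-langs output has one language code per line; lang_listed is a made-up helper name:

```shell
# Succeed if the language code in $1 is one of the lines read from stdin.
# lang_listed is a hypothetical helper used for illustration.
lang_listed() {
  grep -qx -- "$1"
}

# Only query tesseract when it is actually installed.
if command -v tesseract >/dev/null 2>&1; then
  for code in ara eng; do
    # 2>&1 because some tesseract versions print the list on stderr.
    if tesseract --list-langs 2>&1 | lang_listed "$code"; then
      echo "$code: available"
    else
      echo "$code: missing"
    fi
  done
fi

# With both present, the two languages can be combined in one run:
#   ocrmypdf -l ara+eng source.pdf target.pdf
```

The combined form is useful for documents that mix Arabic and English text.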

But although it displays correctly (“orange” shows as “orange”), the text is reversed when I copy it (e.g. “orange” becomes “egnaro”, in the Arabic alphabet of course).

UPDATE: I opened the same file in the Chromium browser and it reads properly. So it seems the problem is with Okular.

UPDATE:
Searching for an Arabic word inside the file worked as expected. This confirms the problem is with Okular.
I also tried an English PDF and it worked OK.