Pacman wants to update tesseract-OCR-AI trained library - is this new?

Hi Everyone,

I am doing my daily pacman -Syu and I am getting an update I do not recognize. I searched my system for it and came up with nothing, no results. My search may have been insufficient, so here I am asking you guys: is this cool to install, and what is it?

It looks like an OCR AI library, and if so, what's it for? OCR can be used to take screenshots and write text files of what I am doing on my PC, just like Winsucks Recall.

Look, if I wanted this type of “backup” I’d stream all my PC sessions to Twitch, or maybe over to my 30 TB of HDD space. ffs. Who needs it right in the OS, where it can be used by anyone who can hack into the OS? Or worse, it’s being installed with that intent in mind.

I did not see anything in the new announcements about it.

 multilib is up to date
:: Starting full system upgrade...
resolving dependencies...
:: There are 128 providers available for tessdata:
:: Repository extra
   1) tesseract-data-afr  2) tesseract-data-amh  3) tesseract-data-ara  4) tesseract-data-asm
   5) tesseract-data-aze  6) tesseract-data-aze_cyrl  7) tesseract-data-bel  8) tesseract-data-ben
   9) tesseract-data-bod  10) tesseract-data-bos  11) tesseract-data-bre  12) tesseract-data-bul
   13) tesseract-data-cat  14) tesseract-data-ceb  15) tesseract-data-ces  16) tesseract-data-chi_sim
   17) tesseract-data-chi_sim_vert  18) tesseract-data-chi_tra  19) tesseract-data-chi_tra_vert
   20) tesseract-data-chr  21) tesseract-data-cos  22) tesseract-data-cym  23) tesseract-data-dan
   24) tesseract-data-dan_frak  25) tesseract-data-deu  26) tesseract-data-deu_frak  27) tesseract-data-div
   28) tesseract-data-dzo  29) tesseract-data-ell  30) tesseract-data-eng  31) tesseract-data-enm
   32) tesseract-data-epo  33) tesseract-data-equ  34) tesseract-data-est  35) tesseract-data-eus
   36) tesseract-data-fao  37) tesseract-data-fas  38) tesseract-data-fil  39) tesseract-data-fin
   40) tesseract-data-fra  41) tesseract-data-frk  42) tesseract-data-frm  43) tesseract-data-fry
   44) tesseract-data-gla  45) tesseract-data-gle  46) tesseract-data-glg  47) tesseract-data-grc
   48) tesseract-data-guj  49) tesseract-data-hat  50) tesseract-data-heb  51) tesseract-data-hin
   52) tesseract-data-hrv  53) tesseract-data-hun  54) tesseract-data-hye  55) tesseract-data-iku
   56) tesseract-data-ind  57) tesseract-data-isl  58) tesseract-data-ita  59) tesseract-data-ita_old
   60) tesseract-data-jav  61) tesseract-data-jpn  62) tesseract-data-jpn_vert  63) tesseract-data-kan
   64) tesseract-data-kat  65) tesseract-data-kat_old  66) tesseract-data-kaz  67) tesseract-data-khm
   68) tesseract-data-kir  69) tesseract-data-kmr  70) tesseract-data-kor  71) tesseract-data-kor_vert
   72) tesseract-data-lao  73) tesseract-data-lat  74) tesseract-data-lav  75) tesseract-data-lit
   76) tesseract-data-ltz  77) tesseract-data-mal  78) tesseract-data-mar  79) tesseract-data-mkd
   80) tesseract-data-mlt  81) tesseract-data-mon  82) tesseract-data-mri  83) tesseract-data-msa
   84) tesseract-data-mya  85) tesseract-data-nep  86) tesseract-data-nld  87) tesseract-data-nor
   88) tesseract-data-oci  89) tesseract-data-ori  90) tesseract-data-pan  91) tesseract-data-pol
   92) tesseract-data-por  93) tesseract-data-pus  94) tesseract-data-que  95) tesseract-data-ron
   96) tesseract-data-rus  97) tesseract-data-san  98) tesseract-data-sin  99) tesseract-data-slk
   100) tesseract-data-slk_frak  101) tesseract-data-slv  102) tesseract-data-snd  103) tesseract-data-spa
   104) tesseract-data-spa_old  105) tesseract-data-sqi  106) tesseract-data-srp  107) tesseract-data-srp_latn
   108) tesseract-data-sun  109) tesseract-data-swa  110) tesseract-data-swe  111) tesseract-data-syr
   112) tesseract-data-tam  113) tesseract-data-tat  114) tesseract-data-tel  115) tesseract-data-tgk
   116) tesseract-data-tgl  117) tesseract-data-tha  118) tesseract-data-tir  119) tesseract-data-ton
   120) tesseract-data-tur  121) tesseract-data-uig  122) tesseract-data-ukr  123) tesseract-data-urd
   124) tesseract-data-uzb  125) tesseract-data-uzb_cyrl  126) tesseract-data-vie  127) tesseract-data-yid
   128) tesseract-data-yor

Enter a number (default=1): 

It’s required by spectacle. I have no idea whether it’s new or not.

It’s not an AI library, it’s an old, established image-to-text conversion package.

It also doesn’t take screenshots. Spectacle, the KDE Plasma screenshot app, has an option to extract text from images, which uses tesseract as its backend.

Afaik tesseract is technically still optional, but of course leaving it out will break that feature.
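
For anyone curious what "image to text" means in practice, here is a minimal sketch of running tesseract by hand from Python. It is only an illustration and not how Spectacle itself calls it (I have not checked Spectacle's code). It assumes the `tesseract` command and the `tesseract-data-eng` language pack are installed; `screenshot.png` and both function names are made up for the example.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(image: Path, lang: str = "eng") -> list:
    """Build the tesseract command line for one image.

    Passing "stdout" as the output base makes tesseract print the
    recognised text instead of writing it to a .txt file.
    """
    return ["tesseract", str(image), "stdout", "-l", lang]


def extract_text(image: Path, lang: str = "eng") -> str:
    """Run tesseract on an existing image file and return the text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed")
    result = subprocess.run(
        build_ocr_command(image, lang),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file name; point this at a real image to try it.
    print(build_ocr_command(Path("screenshot.png")))
```

The point is that tesseract only reads an image file you hand it and prints text. Nothing in this sketch captures the screen or runs in the background.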

@MyNameIsRichard → OK cool. TYVM for your response!

My paranoia rules me when it comes to privacy on my PC.

For me there IS no other way.

I use Spectacle though, and so far no weird pings in my sniffed packets,
except when Arch calls home.

wtf is that anyways?
Is it like MS’s ping to ensure PC online?
From network manager or something?
The packet itself seems innocent enough.
I’ve been thinking about putting my own PING point online so I can have MY OS ping my own online ping point and forgo the whole Arch calling home thing.

CAN I configure that Arch call home to ping anything I want?
If not then WHY not?

I can start another thread on this one if necessary, Richard?

TY so much for your response to the tesseract thing!

One more thing: there are 128 choices for installing tesseract… is it just language-based? Because I do not see an EN or ENG listed.

oops: found it: tesseract-data-eng

If I’ve understood you correctly, it’s how network manager knows whether you’re connected to the internet or not.

I would guess number 30

See 4.4 Checking connectivity

I can disable it.
That’s good as I do not believe I need it.

I have four internet-connected devices at my fingertips at any time, so I can always check online status easily.

Do you know if anything else might depend on this connectivity check, or is disabling it an OK thing to do?

It seems to be OK to disable according to the small blurb I am reading on it but I am not sure, being fairly new to Linux.

Depending on how network manager reacts, it may say that you have limited connectivity. That’s about the only thing I can think of.

I just uninstalled Spectacle (paranoid or not… :wink:) because I installed xfce4-screenshooter a long time ago. It can take screenshots easily on KDE too.

For anyone interested in the future: I did put the file in place to disable NetworkManager's online check. An hour of bandwhich monitoring says it is no longer pinging archlinux.org, and NetworkManager's system tray icon seems fine, with no limited-connectivity notification since. The alternative, of course, was to allow the check through my VPN protection, but there is no need for that in my setup.

I got notifications about every 15-30 minutes (I never really timed them) the whole time I was connected to my VPN service, until I put the disable file in place.

As Schlaefer so nicely pointed out above, directions for doing it are located here: https://wiki.archlinux.org/title/NetworkManager#Checking_connectivity
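
For reference, a rough sketch of what that file looks like, written as a small Python helper that prints the text. This is from memory of the wiki section linked above, so treat the details as assumptions and check them there: the drop-in goes under `/etc/NetworkManager/conf.d/` (the wiki uses a name like `20-connectivity.conf`, if I recall correctly), and the `[connectivity]` section takes `enabled`, `uri` and `interval` keys. The helper name and the example.net URL are made up.

```python
import configparser
import io
from typing import Optional


def render_connectivity_conf(enabled: bool,
                             uri: Optional[str] = None,
                             interval: Optional[int] = None) -> str:
    """Return the text of a NetworkManager [connectivity] drop-in file."""
    # interpolation=None so a '%' in a URL is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser["connectivity"] = {"enabled": "true" if enabled else "false"}
    if uri is not None:
        # Only relevant when the check stays enabled: point it at your own server.
        parser["connectivity"]["uri"] = uri
    if interval is not None:
        # Seconds between checks.
        parser["connectivity"]["interval"] = str(interval)
    buf = io.StringIO()
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # What I ended up with: the check switched off entirely.
    print(render_connectivity_conf(enabled=False))
    # The "ping my own server" idea, with a placeholder URL.
    print(render_connectivity_conf(True, uri="http://ping.example.net/check.txt"))
```

The script only prints the text. You still have to save it into the drop-in directory as root and restart NetworkManager. If you point `uri` at your own server, as far as I understand it has to return the reply NetworkManager expects; see the `connectivity` section of `man NetworkManager.conf`.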