Issues with eos-mirrors.pacnew and eos-rankmirrors

Hi all,

Yesterday I got an update for eos mirrors and I’m having some trouble wrapping my head around and melding /etc/pacman.d/endeavouros-mirrorlist.pacnew.

Note that this is not the first time I have run across the issue described below (I ignored it the first couple of times as a minor annoyance), but I’m hoping that with some help it will be the last.

Possibly relevant information:
My normal /etc/pacman.d/mirrorlist is automatically generated regularly through a reflector timer.
All relevant reflector configuration can be found here if needed.

Problematic excerpts of output:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# EndeavourOS mirrorlist, ranked by eos-rankmirrors at 19/07/2023 03:27:03 πμ.
# Preferred mirrors:  .gr .de .dk .nl .pl .no .fi
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# user added mirrors >>>
Server = rank
https://fosszone.csd.auth.gr/endeavouros/repo/$repo/$arch
Server = ###
[endeavouros]
/etc/pacman.d/endeavouros-mirrorlist
manually.
https://ca.gate.endeavouros.com/endeavouros/repo/$repo/$arch
https://mirrors.tuna.tsinghua.edu.cn/endeavouros/repo/$repo/$arch
https://mirrors.jlu.edu.cn/endeavouros/repo/$repo/$arch
https://mirror.alpix.eu/endeavouros/repo/$repo/$arch
https://de.freedif.org/EndeavourOS/repo/$repo/$arch
https://mirror.moson.org/endeavouros/repo/$repo/$arch
https://fosszone.csd.auth.gr/endeavouros/repo/$repo/$arch
https://endeavour.remi.lu/repo/$repo/$arch
https://mirror.albony.xyz/endeavouros/repo/$repo/$arch
https://md.mirrors.hacktegic.com/endeavouros/repo/$repo/$arch
https://mirror.jingk.ai/endeavouros/repo/$repo/$arch
https://mirror.freedif.org/EndeavourOS/repo/$repo/$arch
https://mirror.funami.tech/endeavouros/repo/$repo/$arch
Sweden
https://ftp.acc.umu.se/mirror/endeavouros/repo/$repo/$arch
https://mirror.archlinux.tw/EndeavourOS/repo/$repo/$arch
https://fastmirror.pp.ua/endeavouros/repo/$repo/$arch
https://mirrors.gigenet.com/endeavouros/repo/$repo/$arch
Server = 
Server = 
Server = 
Server = 
Server = /etc/pacman.conf:
# user added mirrors <<<
Server = https://fosszone.csd.auth.gr/endeavouros/repo/$repo/$arch
Server = https://fastmirror.pp.ua/endeavouros/repo/$repo/$arch
Server = https://md.mirrors.hacktegic.com/endeavouros/repo/$repo/$arch
Server = https://de.freedif.org/EndeavourOS/repo/$repo/$arch
Server = https://mirror.moson.org/endeavouros/repo/$repo/$arch
Server = https://mirror.alpix.eu/endeavouros/repo/$repo/$arch
Server = https://ftp.acc.umu.se/mirror/endeavouros/repo/$repo/$arch
Server = https://endeavour.remi.lu/repo/$repo/$arch
Server = https://ca.gate.endeavouros.com/endeavouros/repo/$repo/$arch
Server = https://mirror.albony.xyz/endeavouros/repo/$repo/$arch
Server = https://mirror.jingk.ai/endeavouros/repo/$repo/$arch
Server = https://mirror.funami.tech/endeavouros/repo/$repo/$arch
Server = https://mirror.freedif.org/EndeavourOS/repo/$repo/$arch
Server = https://mirrors.gigenet.com/endeavouros/repo/$repo/$arch
Server = https://mirror.archlinux.tw/EndeavourOS/repo/$repo/$arch
Server = https://mirrors.jlu.edu.cn/endeavouros/repo/$repo/$arch
Server = https://mirrors.tuna.tsinghua.edu.cn/endeavouros/repo/$repo/$arch
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Issues:

  • The generated mirrorlist format is broken and causes (if accepting the changes of .pacnew during the merge) running update operations such as sudo pacman -Syu to fail.

  • The .pacnew comes already ranked by eos-rankmirrors, which shouldn’t be the case to begin with.

    After:

    1. Reading what appears to be a related announcement post here

    2. Assuring my configuration is correct:

      grep 'EOS_AUTO_MIRROR_RANKING=' /etc/eos-rankmirrors.conf
      >>> EOS_AUTO_MIRROR_RANKING=no
      
  • Even if I had enabled automatic mirror ranking by setting EOS_AUTO_MIRROR_RANKING=yes, I don’t remember setting preferences for eos mirrors
    (notice: # Preferred mirrors: .gr .de .dk .nl .pl .no .fi in the output excerpt above).

    My understanding is that this preference is programmatically “inferred” by my pacman configuration (first post linked in relevant information), which in any case is set to be a mandate rather than a preference.
    So, for some reason, not only does eos generate a ranking I didn’t ask it to, but it includes mirrors I have explicitly chosen to not trust (any server not in the domains specified is unacceptable by my current configuration).

I would highly appreciate any advice on how to resolve those issues.

Thanks in advance! :slight_smile:

The contents between the lines that start with

# user added mirrors

are totally incorrect.

I suspect the reason is in file /etc/eos-rankmirrors.conf. Can you show its contents using this command:

cat /etc/eos-rankmirrors.conf | eos-sendlog

and show the returned address here?

Thank you for the quick reply.

Here is the requested configuration: http://ix.io/4B4u
I do not believe I have manually edited this specific config at all.
It should be “as is” (meaning, as originally distributed and/or generated by any default scripts or tools).

Thanks for the link!

And yes, the problem is this line in /etc/eos-rankmirrors.conf:

ALWAYS_FIRST_EOS_MIRRORS='.gr|.de|.dk|.nl|.pl|.no|.fi'

It confuses the program because those partial words are simply too short. And they probably are not meant to be like that either.

To fix the problem, you need to change the value of this variable. The default is an empty string ('').
If you want to prefer certain mirrors, add one or more unique partial words from a mirror URL, for example: funami, moson, or remi:

ALWAYS_FIRST_EOS_MIRRORS='funami|moson|remi'
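A quick way to see whether a partial word is unique is to count how many mirror lines contain it. The following is only a sketch: count_matches is a made-up helper, and the sample list is a hand-picked subset of the mirrors quoted earlier in the thread rather than the real /etc/pacman.d/endeavouros-mirrorlist.

```shell
#!/bin/sh
# Hypothetical helper (not part of eos-rankmirrors): count how many
# "Server = " lines in a list contain the given word as a literal string.
count_matches() {
    local word="$1" list="$2"
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    printf '%s\n' "$list" | grep -F "Server = " | grep -cF -- "$word" || true
}

# Sample lines (single quotes keep $repo and $arch literal).
mirrors='Server = https://mirror.funami.tech/endeavouros/repo/$repo/$arch
Server = https://mirror.moson.org/endeavouros/repo/$repo/$arch
Server = https://de.freedif.org/EndeavourOS/repo/$repo/$arch
Server = https://mirror.freedif.org/EndeavourOS/repo/$repo/$arch'

# Split the setting on '|' and report the match count for each word.
for w in $(echo 'funami|moson|freedif' | tr '|' ' '); do
    echo "$w: $(count_matches "$w" "$mirrors") match(es)"
done
```

In this sample, funami and moson each match one line, while freedif matches two, so it would not be a unique partial word.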

In addition, I will make the app handle overly short words somewhat better.

I was just typing to correct myself after investigating a bit…

Trying to remember if I manually edited the config in question, it appears it was last edited at the same time as my reflector config (~10 days after the original installation):

stat -c %w / | cut -d ' ' -f 1
>>> 2023-05-24

ls -l /etc/xdg/reflector/reflector.conf
>>> -rw-r--r-- 1 root root 477 Jun  5 22:12 /etc/xdg/reflector/reflector.conf

ls -l /etc/eos-rankmirrors.conf
>>> -rw-r--r-- 1 root root 1872 Jun  5 22:24 /etc/eos-rankmirrors.conf

So I probably did edit it to be somewhat comparable with my reflector configuration.

That said, and given your reply, I don’t see why short words would cause that big of an issue.
Doesn’t the relevant script just use the partials as regex filters for the server urls?

It would make a lot of sense if the resulting output was just missing or having some extra mirrors, but why the broken format as a result of that?

In case you do plan to improve the existing script so that others avoid similar issues in the future, it might be useful to make my actual intention clear. My intention was to use ALWAYS_FIRST_EOS_MIRRORS as a means of achieving what --include (or --exclude) does for reflector. That said, depending on the implementation, maybe I should have escaped the '.' (dots).

I just located the relevant scripts on Github, so I might give this another look tomorrow with a clear head.

Thank you very much for your assistance!
I really appreciate the help, it was driving me crazy :stuck_out_tongue:

eos-rankmirrors uses grep to find those partial words, and of course dot is a special character for grep (which sometimes needs escaping as you said).
For example, you had “.de” as the search “word”, and that matched many lines that included e.g. “endeavouros”. So grep found too many matches, and its output caused the confusion.

Unfortunately the app didn’t restrict the search to URL lines, which also confused the output. This is one of the things I’m going to fix soon.
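The effect can be reproduced with a few mirror lines from the list earlier in the thread. This is only an illustration with an inline sample and made-up variable names, not the code of eos-rankmirrors itself.

```shell
#!/bin/sh
# Sample mirrorlist lines (single quotes keep $repo and $arch literal).
sample='Server = https://de.freedif.org/EndeavourOS/repo/$repo/$arch
Server = https://mirror.moson.org/endeavouros/repo/$repo/$arch
Server = https://fosszone.csd.auth.gr/endeavouros/repo/$repo/$arch'

# As a regex, "." matches any character, so ".de" also matches
# the "nde" in "endeavouros" and the "/de" in "https://de.".
regex_hits=$(printf '%s\n' "$sample" | grep -c '.de' || true)

# With --fixed-strings (-F) the dot is literal, so only a real ".de" matches.
# grep -c exits non-zero when the count is 0, hence the "|| true".
fixed_hits=$(printf '%s\n' "$sample" | grep -cF '.de' || true)

echo "regex matches: $regex_hits, fixed-string matches: $fixed_hits"
# prints: regex matches: 3, fixed-string matches: 0
```

Note that even de.freedif.org is not a literal match for ".de", since its top-level domain is .org.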

Hello manuel,

I gave a look at the eos-rankmirrors script on Github.

First of all, thank you for such a fast patch!

The new version works a bit better, but it is still not perfect.

Here is an excerpt of the newly generated mirrorlist:

# Preferred mirrors:  .org .edu .gr .de .dk .fi .nl .no .pl
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# user added mirrors >>>
Server = https://de.freedif.org/EndeavourOS/repo/$repo/$arch
https://mirror.moson.org/endeavouros/repo/$repo/$arch
https://mirror.freedif.org/EndeavourOS/repo/$repo/$arch
Server = https://mirrors.tuna.tsinghua.edu.cn/endeavouros/repo/$repo/$arch
https://mirrors.jlu.edu.cn/endeavouros/repo/$repo/$arch
Server = https://fosszone.csd.auth.gr/endeavouros/repo/$repo/$arch

There are still some mirrors missing the Server = prefix.
Of course I am fully aware this is still an issue with me using the ALWAYS_FIRST_EOS_MIRRORS variable in a way that was not intended.
I understand I shouldn’t be using it like that as you explained in your earlier posts.

I am investigating why this is happening, and will update if I find the reason and possibly a solution.


In the meantime, there are still some issues with the way that mirror filtering is applied.

There is, for example, a logical error (as far as I understand the script so far; please correct me if I’m wrong) in how readily these cases are accepted, since the script assumes the user input/config is correct:

        case "$m" in
            https://* | http://* | rsync://*)
                mirror="$m"

Say for example I define in /etc/eos-rankmirrors.conf :

ALWAYS_FIRST_EOS_MIRRORS="https://de.freeFif.org/EndeavourOS/repo/$repo/$arch"

with 'F' I have introduced a typo to the original:

"https://de.freedif.org/EndeavourOS/repo/$repo/$arch"

The script will now write this mirror on top (because the defined behavior for "https://*" is to simply set the mirror value to the user defined string, regardless of its content).
Now running pacman -Syyu will throw:

error: failed retrieving file 'endeavouros.db' from de.freeFif.org : Could not resolve host: de.freeFif.org

Even simpler, if I define:

ALWAYS_FIRST_EOS_MIRRORS="https://"

(expecting to simply define a preference for https protocol over http / rsync / ftp / whatever)
The script will now simply write on top:
Server = https://
And pacman -Syyu will again throw:

error: failed retrieving file 'endeavouros.db' from endeavouros.db : Could not resolve host: endeavouros.db
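One possible guard, sketched here under the assumption that a full URL should only be accepted when it appears in the shipped mirrorlist. That assumption may well be wrong, since the feature might be meant to allow custom mirrors that are not in the list. is_known_mirror is a made-up helper and the sample list is inline rather than read from "$latest_ml.orig".

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the URL appears verbatim
# as a complete "Server = <url>" line in the given list.
is_known_mirror() {
    local url="$1" list="$2"
    # -x: match the whole line, -F: treat the pattern as a literal string.
    printf '%s\n' "$list" | grep -qxF -- "Server = $url"
}

# Sample lines (single quotes keep $repo and $arch literal).
known='Server = https://de.freedif.org/EndeavourOS/repo/$repo/$arch
Server = https://mirror.moson.org/endeavouros/repo/$repo/$arch'

for url in 'https://de.freedif.org/EndeavourOS/repo/$repo/$arch' \
           'https://de.freeFif.org/EndeavourOS/repo/$repo/$arch' \
           'https://'; do
    if is_known_mirror "$url" "$known"; then
        echo "accept: $url"
    else
        echo "reject: $url"
    fi
done
```

With this check, both the typo and the bare "https://" would be rejected instead of being written to the top of the mirrorlist.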

Finally, as for me using strings that contain special regex characters and thus causing issues, if I may, I would propose one of two solutions:

  1. Either define (in /etc/eos-rankmirrors.conf) in comments that the strings should be just plain words, then use in eos-rankmirrors grep --fixed-strings (-F)
    # mirror="$(grep "Server = http" "$latest_ml.orig" | grep "$m"  | awk '{print $NF}')"
    # becomes:
    mirror="$(grep "Server = http" "$latest_ml.orig" | grep --fixed-strings "$m"  | awk '{print $NF}')"
    
  2. Or (and this is a highly preferred solution in my opinion, it also resolves the other issues I described above) define in /etc/eos-rankmirrors.conf in comments that the ALWAYS_FIRST_EOS_MIRRORS variable should be regex (simple words are still treated as words by regex anyway, so simple users’ usage should remain valid), remove the looping logic in UserAddedMirrors() completely, and just | grep -E "$ALWAYS_FIRST_EOS_MIRRORS" .
    This would also improve the power and simplicity of applying rules similar to what I wanted. Eg:
    ALWAYS_FIRST_EOS_MIRRORS="https://.*\.(org|edu|com|gr|de|dk|fi|nl|no|pl)/"
    

The latter could also be applied to EOS_IGNORED_MIRRORS.



Edit:

Some code I only tested with my personal setup (I am unaware of the contribution processes and any automated tests that might exist that I could run)

# /etc/eos-rankmirrors.conf
ALWAYS_FIRST_EOS_MIRRORS="https://.*\.(org|edu|com|gr|de|dk|fi|nl|no|pl)/"
# /usr/bin/eos-rankmirrors
UserAddedMirrors() {
    local prefer="$1"    # A regex defining user preferences for mirror filtering

    echo "# user added mirrors >>>"
    if [ -n "$prefer" ]; then
        grep "Server = " "$latest_ml.orig" | grep -E "$prefer"
    fi
    echo "# user added mirrors <<<"
}

# Output excerpt:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# EndeavourOS mirrorlist, ranked by eos-rankmirrors at 21/07/2023 02:40:36 μμ.
# Preferred mirrors:  https://.*\.(org edu com gr de dk fi nl no pl)/
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# user added mirrors >>>
Server = https://ca.gate.endeavouros.com/endeavouros/repo/$repo/$arch
Server = https://de.freedif.org/EndeavourOS/repo/$repo/$arch
Server = https://mirror.moson.org/endeavouros/repo/$repo/$arch
Server = https://fosszone.csd.auth.gr/endeavouros/repo/$repo/$arch
Server = https://md.mirrors.hacktegic.com/endeavouros/repo/$repo/$arch
Server = https://mirror.freedif.org/EndeavourOS/repo/$repo/$arch
Server = https://mirrors.gigenet.com/endeavouros/repo/$repo/$arch
# user added mirrors <<<

Further improvement suggestion:
Apply the grep on the ranked mirrors, so that preference order benefits from ranking.
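A rough sketch of that idea, assuming the ranked list is already available as text. Here it is an inline sample in an invented order, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Pretend this is the output of the ranking step, fastest mirror first.
ranked='Server = https://mirror.alpix.eu/endeavouros/repo/$repo/$arch
Server = https://fosszone.csd.auth.gr/endeavouros/repo/$repo/$arch
Server = https://mirror.moson.org/endeavouros/repo/$repo/$arch
Server = https://ca.gate.endeavouros.com/endeavouros/repo/$repo/$arch'

# Example preference: only hosts under .org or .gr.
prefer='https://.*\.(org|gr)/'

# grep keeps input order, so preferred mirrors stay in ranked order.
preferred=$(printf '%s\n' "$ranked" | grep -E -- "$prefer" || true)
# Everything else follows, also still in ranked order.
others=$(printf '%s\n' "$ranked" | grep -vE -- "$prefer" || true)

echo "# user added mirrors >>>"
[ -n "$preferred" ] && printf '%s\n' "$preferred"
echo "# user added mirrors <<<"
[ -n "$others" ] && printf '%s\n' "$others"
```

In this sample the preferred block lists fosszone.csd.auth.gr before mirror.moson.org, because that is their order in the ranked input.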

Thanks for the feedback and analysis! It is very much appreciated.

I’ll look into your suggestions and in time probably include some of them, while trying to keep the implementation relatively simple.
For example, the user options are meant to be easy to use which more or less excludes requiring regular expressions.
The idea with the partial words is, as you noticed, to use just words instead of regular expressions. The number of mirrors of EndeavourOS is not large, so regex would not give much benefit. I can understand if some people disagree with this.

This is very understandable and I respect the idea of keeping things simple.

I just wish there could be a solution somewhere in-between, that maybe keeps things KISS (the arch way) by default, while still enabling power users.

As you see, I was able to solve my own problem (with your valuable guidance of course, without it I would still be scratching my head about what is going on :stuck_out_tongue: ) and hopefully demonstrate some practical use-cases even given the limited mirrorlist size (a way of filtering/prioritizing protocols and domains). But to go by the simplicity argument, it would make things a lot simpler if power users didn’t have to intervene to manually modify scripts in /usr/bin/* (which I believe is generally discouraged for an abundance of very good reasons anyway) and rather give them power through the config files.

One example of such a solution incorporating both of my previous 2 suggested approaches could be:

# /etc/eos-rankmirrors.conf

# ENABLE_REGEX_FILTERING determines whether the
# provided user preferences for prioritizing/filtering mirrors
# should be treated as a Regular Expression
# Supported values:
#   "yes"     User preferences provided by ALWAYS_FIRST_EOS_MIRRORS will be treated as a single regex
#   "no" (default)      User preferences provided by ALWAYS_FIRST_EOS_MIRRORS will be treated as a list of plain words separated by '|'
#
ENABLE_REGEX_FILTERING="no"
# /usr/bin/eos-rankmirrors
UserAddedMirrors() {
    local prefer="$1"    # User defined mirror filtering preferences
    local regex="$2"     # "yes": treat $prefer as one regex; otherwise as plain words separated by '|'
    local mirror p

    echo "# user added mirrors >>>"
    if [ "$regex" = "yes" ]; then
        grep "Server = " "$latest_ml.orig" | grep -E "$prefer"
    else
        for p in $(echo "$prefer" | tr '|' ' ') ; do
            # Match each plain word literally, so characters like '.' are not treated as regex
            mirror="$(grep "Server = " "$latest_ml.orig" | grep --fixed-strings "$p")"
            if [ -n "$mirror" ] ; then
                notif_echo2 "User added: $mirror"
                # I am unsure why there was a use of awk '{print $NF}' to remove "Server = " then add it again
                # this is just for proof of concept anyway
                echo "$mirror"
            fi
        done
    fi

    echo2 ""
    echo "# user added mirrors <<<"
}

Warning:
This is just untested proof-of-concept code I quickly threw together.
It is not to be used as-is!


In any case, I understand any decision that might be made by the team, and I appreciate the very fact that we are discussing how to improve tools provided exclusively by the efforts of the EndeavourOS development team! :slight_smile:
Thank you for all the time you put into reading all I had to say :sweat_smile:


P.S: I am still getting familiar with how the EOS community works. Due to the many inclusions of technicality and code in this post, I wonder if I should have instead opened a Github issue(?). If that is the case, please advise for future reference.

Thanks! I certainly will examine your examples and use them to improve the implementation.

Thanks for the warning too. I will test it (fortunately there is a debugger, bashdb :wink:) before release.

For me the discussion here is preferable, because more eyes can see it and possibly contribute to the ideas.
And sometimes I may have some delay in noticing there is a message for me at github… sorry about that to all who have experienced such delay.

The actual pull requests are easier to handle at github, but if the new code is small, it is possible to copy/paste here too.

grep "Server = " "$latest_ml.orig" | grep -E "$prefer"

The idea behind the user preferred mirrors is to list the preferred mirrors in a user defined order.

Those two definitions above seem to have a side effect to “sort” the preferred mirrors in a way that was likely not intended.

So far I couldn’t find a (simple) regex that didn’t sort those mirrors. With a more complex regex it can be done, but that defeats much of the nice regex idea…

Any ideas?

Hello manuel,

First of all, thank you for following up and putting the time to update the eos-rankmirrors script.
I noticed you’ve been pushing some updates and I absolutely appreciate it! :slight_smile:

I see what you mean.
In my previous posts I completely missed that the intended behavior was for the defined order to be preserved (see for example my other suggestion of grepping on the ranked list instead, to benefit from server speed ranking in the resulting mirrorlist).

If it is essential that the order must be preserved, some ideas come to mind.

It becomes obvious that the use of the separator | is then “mandatory”, so that each entry is handled in order (in the existing for loop), which in turn preserves the order as intended.

That means that regex can still easily be handled, as long as it is configured in expanded form only (no groupings).

Example:

# This cannot be supported
ALWAYS_FIRST_EOS_MIRRORS="https://.*\.(org|edu|com|gr)/"

# Equivalent expanded form can easily be supported
ALWAYS_FIRST_EOS_MIRRORS="https://.*\.org/|https://.*\.edu/|https://.*\.com/|https://.*\.gr/"
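A minimal sketch of how the expanded form could be processed so that the output follows the order of the patterns rather than the order of the mirrorlist. ordered_matches is a hypothetical function, the sample list is inline, and none of this is taken from the actual eos-rankmirrors code.

```shell
#!/bin/sh
# Hypothetical sketch: print matching mirror lines in the order in which
# the '|'-separated patterns are given, without repeating a line.
ordered_matches() {
    local patterns="$1" list="$2" p
    printf '%s\n' "$patterns" | tr '|' '\n' | while IFS= read -r p; do
        [ -n "$p" ] || continue      # skip empty parts, e.g. from a trailing '|'
        printf '%s\n' "$list" | grep -E -- "$p" || true
    done | awk '!seen[$0]++'         # keep only the first occurrence of each line
}

# Sample lines (single quotes keep $repo and $arch literal).
list='Server = https://mirror.moson.org/endeavouros/repo/$repo/$arch
Server = https://mirror.alpix.eu/endeavouros/repo/$repo/$arch
Server = https://fosszone.csd.auth.gr/endeavouros/repo/$repo/$arch'

# .gr is listed before .org, so the .gr mirror is printed first
# even though the .org mirror comes first in the list.
ordered_matches 'https://.*\.gr/|https://.*\.org/' "$list"
```

A grouped pattern such as (org|gr) cannot be split this way, which is why only the expanded form keeps the user-defined order.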

That said, I’m not sure I see the value in pre-defining order.
To be more specific, let’s say I have a pre-defined selection of 3 mirrors in a specific order.
At that point, doesn’t my mirrorlist become static enough that I don’t benefit from the eos-rankmirrors script at all anyway?

Not really (or not fully :wink:), for a few reasons:

  • mirrors sometimes are temporarily (or permanently) disabled, and then you need (good) fallback mirrors
  • mirrors will be updated at different (essentially, unknown) times, which may change which mirrors are the best for you
  • if you know from experience which mirror(s) are the best for your location, you can override the ranking result, as the ranking tools will provide different results at different times of ranking

And of course, you don’t have to use the “user added mirrors” feature, but simply rely on the ranking tool to do a good job (which it normally should). :sweat_smile:

Edit: anyway, thanks for bringing new ideas to the table, I sure appreciate it a lot. This way the tools will get better along the way.
