Offline Mirror Request

Good afternoon all,

I use an offline mirror for contractual reasons. I just moved from Manjaro to Endeavour. I don’t see any rsync mirrors in the list. In fact, I have been unsuccessful in finding a comprehensive list of mirrors and the protocols they support on the Endeavour website.

And I read on this forum that Arch is dropping support for rsync mirrors - unless I misread something.

So…I obviously need to sync against two mirrors - the arch one, and the endeavour one.

But I don’t know how to do that with rsync and an http/https address…I think it is not possible because rsync wants ssh connections, and most repos are now ftp, http, or https.

So how do I create an offline mirror without having to ftp the entire thing down each day?

Can I use rsync because of how it minimizes bandwidth? Or is there some other tool I am unaware of?

Articles, HowTo’s, and wiki links appreciated.

Sincerely and respectfully,

Dave

This would surprise me - can you link to this discussion?

You should be able to rsync from an Arch mirror, and then use whatever method for the EnOS mirror (there’s not a huge amount in there so you probably don’t need to do anything particularly special).
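
On the SSH worry: rsync also speaks its own daemon protocol through rsync:// URLs, which is what mirrors that offer rsync normally expose, so no SSH account is needed. A minimal sketch, assuming a placeholder mirror URL and destination path (both hypothetical, as is the function name); it only prints the command unless DRY_RUN=0 is set:

```bash
# Sketch: pull an Arch repo tree over the rsync daemon protocol (no SSH involved).
# MIRROR_URL and DEST are placeholders, not real endpoints; substitute a mirror
# that actually offers rsync.
MIRROR_URL="${MIRROR_URL:-rsync://mirror.example.org/archlinux/}"
DEST="${DEST:-/srv/mirror/archlinux/}"

sync_arch_mirror() {
    # -r recurse, -l keep symlinks, -t keep mtimes so unchanged files are skipped,
    # -v verbose, --delete-after drops packages the mirror has removed,
    # --delay-updates puts new files in place at the end of the transfer.
    set -- rsync -rltv --delete-after --delay-updates "$MIRROR_URL" "$DEST"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only show what would run, so nothing touches the network.
        echo "$@"
    else
        "$@"
    fi
}

sync_arch_mirror
```

Because of -t, a second run only transfers files whose size or timestamp changed, which is where the bandwidth saving comes from.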

Looking at your other threads, this may be of interest: https://repo.m2x.dev/ (discussion here: Current, Daily, Weekly, Deferred mirror setup - #7 by ricklinux). If all else fails I can probably enable rsync for that.

In the Arch mirror structure, Tier 1 mirrors sync directly from the master server.

Arch Tier 2 mirrors are free to offer whatever sync methods they want to maintain.

For the EndeavourOS packages - that list is short because EndeavourOS is Arch without actually being it.

If you look at the list of packages which are unique for EndeavourOS - it is pretty short.

For that very short list you could use wget or httrack to keep the list up-to-date.
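
As a rough sketch of the wget route, assuming the mirror serves a browsable directory index over HTTPS (the URL, destination path and function name below are placeholders, not real EndeavourOS addresses). It prints the command by default and only runs it with DRY_RUN=0:

```bash
# Placeholder values; replace with a real mirror URL and your own target folder.
ENOS_URL="${ENOS_URL:-https://mirror.example.org/endeavouros/repo/endeavouros/x86_64/}"
ENOS_DEST="${ENOS_DEST:-/srv/mirror/endeavouros}"

fetch_enos_repo() {
    # --mirror               recursion plus timestamping: unchanged files are not fetched again
    # --no-parent            never climb above the starting directory
    # --no-host-directories  do not create a folder named after the host
    # --directory-prefix     where the downloaded tree is stored
    set -- wget --mirror --no-parent --no-host-directories \
        --directory-prefix="$ENOS_DEST" "$ENOS_URL"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

fetch_enos_repo
```

Timestamping over HTTP depends on the server sending usable Last-Modified headers, so treat the "only changed files" behaviour as something to verify against the mirror you pick.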

Yeah - I agree, made no sense to me either…dropping rsync support in the mirrors would have serious repercussions for bandwidth usage.

I read it in a tech blog outside of the ones I usually follow. And now that I did a web search for it again - I can’t find it.

The one thing I read on the Arch site was that a certain version family had been abandoned:
https://www.archlinux.org/news/rsync-compatibility/

I think the guy I read misinterpreted this post - or I misinterpreted what he said. :flushed:

Well, now that I have embarrassed myself to my new community - I offer my apologies.

I have also decided to not move the developmental workstation off of its distro - so I can keep my existing scripts for the whole “Sneakernet” exercise as is. I don’t need an offline repo for my main workstation, as it is not a part of the secure workstation that was provided to me.

Thank you for your response, I appreciate it.

Dave

In a terminal, in a temporary folder, as a regular user, running

git clone https://github.com/endeavouros-team/repo.git

will download every package in less than a minute. It creates the following directory tree under the folder you ran it from, and the packages are downloaded there:
/repo/endeavouros/x86_64

Pudge

@Pudge

Excellent reference! I have a directory full of “LinuxHowTo’s”. Your git clone cli will be put there…

THANK YOU!

This method was exactly what I was hoping for…rsyncing Arch is easy - I’ve done it before. Now I have the full package, and I appreciate it!

I will probably use the offline repos paradigm on my workstation - because I remembered, my wife is a teacher - now online - and I don’t want to burn up her “wire” by using it while she’s teaching. So I can sync/git the repos offline once a day, and then access them anytime I want to - without interfering with her.

Dave

Is that repo using Git LFS? If not then your repo is going to become very large, very quickly. :sweat_smile:

You’re welcome. Anything for a fellow Coloradan. At least I assumed you are in Colorado when you mentioned the Denver VA hospital. :grey_question: Although I guess a lot of patients come from surrounding states.

Pudge

Not sure what LFS (Large File Storage) has to do with anything, as all the packages are quite small.

Using git clone every time, @dcbdbis would have to move the packages elsewhere and start with a fresh directory tree each time, and it will always download every package.

Depending on how far @dcbdbis wants to go with this, I believe he could set up a pull system which would compare his package versions with the repo versions and only download the updated packages. I don’t think doing a pull from github to his computer requires any authorization. Doing a push from his computer to github would require authorization.
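
One way such a pull setup could look, as a sketch only: keep a persistent clone and let git fetch just the objects that changed since the last run. The function name and the example target directory are made up for illustration; the URL is the one from this thread.

```bash
REPO_URL="https://github.com/endeavouros-team/repo.git"   # URL quoted earlier in the thread

update_clone() {
    # $1 = repository URL (or local path), $2 = directory holding the working copy
    if [ -d "$2/.git" ]; then
        # Existing clone: download only what is new since the last run.
        # --ff-only refuses to merge, so it fails loudly if upstream history was rewritten.
        git -C "$2" pull --ff-only --quiet
    else
        # First run: ordinary clone.
        git clone --quiet "$1" "$2"
    fi
}

# Example call (hypothetical path):
# update_clone "$REPO_URL" "$HOME/mirror-src/endeavouros-repo"
```

Reading a public repository this way needs no authorization. The trade-off is that the local .git folder keeps the full history, so it grows over time just as the upstream repo does.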

Pudge

knuck, knuck, knuck (like Curly of the Three Stooges)

Beat you to it!

Dave

Edit: It’s a small repo and came down the wire in ~ 20 seconds. Then in my scripts, I rsync it with my actual offline repo, then remove the repo from the temp directory. Thus each git-clone - is a “clean” download, and rsync does the rest. (the mirrors are on a dedicated SSD).

Yeah, Colorado for sure… Aurora, Colorado to be specific. We live 10 minutes from the new VA. I volunteered at both the old and the new VA for several years - enough for a 2k hour award. Mainly serving my brothers and sisters in arms in the Infusion Clinic: AKA Chemotherapy.

It’s not the packages themselves that are the issue, it’s how they are stored by Git (i.e. the revision history). If you add a single 5MB package to Git then each and every time you alter the 5MB file the repo will grow by 5MB. After 10 updates the Git repo will be 50MB in size.

Consider how often the package databases (endeavouros.db.tar.gz and endeavouros.files.tar.gz) change. The 13KB .db database has had 70 updates, each adding 13KB to the repo (910KB total so far). The 168KB .files database has added a total of 11.7MB to the repo. Together these have added the equivalent of 6MB every month, so a 72MB increase per year just for the package databases!
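
The totals can be reproduced with a few lines of shell arithmetic. This assumes every stored version costs its full size (no savings from delta compression) and that the .files database has had the same 70 updates as the .db one, which is what the 11.7MB figure implies:

```bash
# Figures quoted in the post above.
updates=70      # number of database updates so far
db_kb=13        # size of endeavouros.db.tar.gz in KB
files_kb=168    # size of endeavouros.files.tar.gz in KB

db_total_kb=$((updates * db_kb))
files_total_kb=$((updates * files_kb))

echo "db history:    ${db_total_kb} KB"        # 910 KB
echo "files history: ${files_total_kb} KB"     # 11760 KB, about 11.7MB
echo "combined:      $((db_total_kb + files_total_kb)) KB"
```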

Add in the other packages and you should see that plain Git is not a great tool for maintaining a package repository.

LFS, on the other hand, stores file data outside of the Git metadata so can be a way to limit this type of “binary bloat”.

I’ll admit I am a newbie when it comes to git, so I will definitely take your advice about LFS. As far as I know my tiny repos may be using LFS. Any way to find out?
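
One quick check, sketched here under the assumption that the repo is already cloned locally: Git LFS marks tracked patterns with "filter=lfs" entries in .gitattributes, so looking for that string gives a first answer. The function name is made up, and it only inspects the top-level .gitattributes, not ones in subfolders.

```bash
uses_lfs() {
    # $1 = path to a checked-out working copy
    # Succeeds when the top-level .gitattributes contains an LFS filter entry.
    [ -f "$1/.gitattributes" ] && grep -q 'filter=lfs' "$1/.gitattributes"
}

# Check the folder given as first argument, or the current folder by default.
if uses_lfs "${1:-.}"; then
    echo "LFS patterns found"
else
    echo "no LFS patterns found"
fi
```

If the git-lfs extension is installed, running git lfs ls-files inside the clone lists the files it actually manages, which is the more direct answer.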

Pudge

I have the offline mirrors working perfectly… in my bash script, I call rsync to do its thing on the arch mirror. Then in that same script, I remove the destination dir for the Endeavour repo from my drive, then git-clone it back into the empty dir. Thus no bloat, clean copy each time (once a day).
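
For anyone wanting to try the same routine, here is a rough sketch of those three steps. It is not the actual script: the rsync URL and both paths are placeholders, the helper names are invented, and --depth 1 is an addition that skips the revision history on each fresh clone. It prints the commands unless DRY_RUN=0 is set.

```bash
ARCH_RSYNC_URL="${ARCH_RSYNC_URL:-rsync://mirror.example.org/archlinux/}"   # placeholder
ARCH_DEST="${ARCH_DEST:-/srv/mirror/archlinux/}"                            # placeholder
ENOS_GIT_URL="https://github.com/endeavouros-team/repo.git"                 # from this thread
ENOS_DEST="${ENOS_DEST:-/srv/mirror/endeavouros-repo}"                      # placeholder

run() {
    # Print the command in dry-run mode (the default here), otherwise execute it.
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

daily_sync() {
    # 1. Incremental sync of the Arch packages.
    run rsync -rltv --delete-after "$ARCH_RSYNC_URL" "$ARCH_DEST" || return 1
    # 2. Drop yesterday's EndeavourOS copy (:? aborts if the variable is empty).
    run rm -rf -- "${ENOS_DEST:?}" || return 1
    # 3. Fresh clone; --depth 1 fetches only the latest commit, not the history.
    run git clone --depth 1 "$ENOS_GIT_URL" "$ENOS_DEST"
}

daily_sync
```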

Thank you all for your helpful suggestions. I do appreciate it. It’s an rsync script I lifted off of the Manjaro site when I was on Manjaro - and just modified it to meet my needs…

Dave
