Manjaro Data Donor

last post was 4 years ago

Welcome back again to the forum :slight_smile:

1 Like

It’s pretty interesting to see how many of you assume that opt-in should be the standard method for telemetry, because the industry standard is the exact opposite.

I’m also a strong opponent of data mining based on personalized data, but that doesn’t seem to be the case with Manjaro here. I generally see it only as a hardware user survey, which other platforms have included as well in order to tailor their own “product” towards their customer base. Even if the “product” is free and essentially open source?

They’re open and transparent about that feature, which is still in development. Let’s be honest, we are here as the community of another distro. Wouldn’t you share your system’s details in the form of an inxi hardware report, which is specific to your installation, if you needed help solving an issue?

If it would help a distro further enhance its features, I wouldn’t have a problem sharing hardware specifications about my system. The question of whether this is good or bad practice essentially doesn’t apply to EndeavourOS. And blaming the Manjaro developers for this … is to a certain degree hypocritical from my point of view, especially since that feature isn’t even deployed yet. It’s up to them and their community how it will turn out in the end.

As long as they’re transparent about it, include it clearly during the install, and ask whether it’s okay with you for the hardware details of your installation to be shared with them anonymously, it’s essentially irrelevant whether the preset is opt-in or opt-out. Transparency and the user’s choice are what matter in the end.

In reality, you barely find any opt-ins. And I’m certain that not all of us are aware of all the opt-outs the usual customer / internet user could file for…

Yes, I would. But that is my choice. I do not want the system to just do it. And this is what opt-out is doing. With opt-out the system is just doing it without asking. I want to be asked.

6 Likes

I wanted to say this. Many want this option because it puts them in control. I don’t think that’s hypocritical. Some prefer no telemetry, some are fine with opt-out, some simply don’t care. All options are valid and they’re ultimately up to a choice.

I do agree that it isn’t really our business regarding what Manjaro does. It’s up to them.

I do not know what will change if anything but it does not seem to send anything I need to be worried about.

mdd
Welcome to MDD - The Manjaro Data Donor
Preparing data submission...

------------------------------------------
        Sending the following data
------------------------------------------
{
    "meta": {
        "version": 1,
        "timestamp": "2024-11-05T15:46:43.235506+00:00",
        "device_id": "c1486abb-04e0-5789-88a4-3324a7e06822",
        "distro_id": "manjaro",
        "release": "24.1.2",
        "inxi": true
    },
    "system": {
        "kernel": "6.11.6-1-MANJARO",
        "form_factor": "desktop",
        "install_date": "2021-07-03T10:46:41+00:00",
        "product_name": "Aspire TC-780",
        "product_family": "Aspire T",
        "sys_vendor": "Acer",
        "board_name": "Aspire TC-780(KBL)"
    },
    "boot": {
        "uefi": true,
        "uptime_seconds": 94421
    },
    "cpu": {
        "arch": "x86_64",
        "model": "Intel Core i5-7400",
        "cores": 4,
        "threads": 4
    },
    "memory": {
        "ram_gb": 23.426639556884766,
        "swap_gb": 3.9999961853027344
    },
    "graphics": {
        "comp": "kwin_wayland",
        "dri": null,
        "gpus": [
            {
                "vendor": "eVga.com.",
                "model": "NVIDIA GP107 [GeForce GTX 1050]",
                "driver": "nvidia"
            }
        ],
        "outputs": [
            {
                "model": null,
                "res": "1920x1080",
                "refresh": 59.96,
                "dpi": null,
                "size": "521x293",
                "mapped": "HDMI-A-1"
            }
        ]
    },
    "audio": {
        "servers": [
            {
                "name": "PipeWire",
                "active": true
            }
        ]
    },
    "disk": {
        "disks": [
            {
                "size_gb": 489.04932403564453,
                "root": {
                    "size_gb": 488.5483317375183,
                    "fstype": "xfs",
                    "crypt": false
                },
                "home": null
            }
        ],
        "windows": false
    },
    "locale": {
        "region": "C",
        "language": "en",
        "timezone": "America/Chicago"
    },
    "package": {
        "last_update": "2024-11-04T18:23:15-06:00",
        "branch": "unstable",
        "pkgs": 1369,
        "foreign_pkgs": 5,
        "pkgs_update_pending": 0,
        "flatpaks": 0,
        "pacman_mirrors": {
            "total": 8,
            "ok": 8,
            "country_config": "United_States"
        }
    },
    "desktop": {
        "cli": "/bin/bash",
        "gui": "KDE Plasma",
        "dm": "SDDM",
        "wm": "kwin_wayland",
        "display": "wayland",
        "display_with": "Xwayland"
    }
}
------------------------------------------

Succesful sent at 2024-11-05 09:46:48
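For anyone who wants to go through a submission like the one above field by field, here is a minimal Python sketch that flattens the JSON into dotted paths. The sample is a shortened copy of the output posted above; `flatten`, `review` and the `REVIEW_FIRST` list are hypothetical names picked for illustration and are not part of mdd.

```python
import json


def flatten(node, prefix=""):
    """Return a dict mapping dotted paths to leaf values of nested JSON."""
    leaves = {}
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            leaves.update(flatten(value, path))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            leaves.update(flatten(value, f"{prefix}[{index}]"))
    else:
        leaves[prefix] = node
    return leaves


# Shortened copy of the submission shown above.
SAMPLE = json.loads("""
{
    "meta": {"device_id": "c1486abb-04e0-5789-88a4-3324a7e06822",
             "distro_id": "manjaro"},
    "system": {"install_date": "2021-07-03T10:46:41+00:00",
               "product_name": "Aspire TC-780"},
    "locale": {"region": "C", "language": "en",
               "timezone": "America/Chicago"},
    "graphics": {"gpus": [{"vendor": "eVga.com.", "driver": "nvidia"}]}
}
""")

# Assumption: fields a reader might want to look at first. Which fields
# count as sensitive is a personal judgment, not something mdd defines.
REVIEW_FIRST = {"meta.device_id", "system.install_date", "locale.timezone"}


def review(payload):
    """Return only the flattened fields listed in REVIEW_FIRST."""
    flat = flatten(payload)
    return {path: value for path, value in flat.items() if path in REVIEW_FIRST}


if __name__ == "__main__":
    for path, value in sorted(flatten(SAMPLE).items()):
        marker = "*" if path in REVIEW_FIRST else " "
        print(f"{marker} {path} = {value!r}")
```

Saving the real output to a file and loading it with `json.load` instead of `SAMPLE` would list every field that gets sent.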

Exactly how I was going to respond. Give me the option, let me say yes or no, and say what is being collected. If it is what is normally collected, I would say yes. The fact is it was snuck in, and if you can’t see the difference I feel sorry for you (not aimed at you @mbod).

3 Likes

It’s more that it was snuck in than that it was introduced. I get it, they need to make a buck somewhere, but be honest if you want my info.

1 Like

Anyway, it is still in development and it seems they didn’t decide yet how to introduce it in the distro (discussion is ongoing). For sure we’ll see how it will evolve…

Yes. Because opt-in metrics are generally considered non-representative data (and very non-marketable), not because the “industry standards” care about privacy or user freedom and choice.

In essence, the “industry” recognizes that given the power of choice, without manipulative dark patterns, users will (generally) not share their data, therefore degrading sample quality.

That you would want to contribute is great.

However, in no way does that mean that others should have their data shared without their permission.

You might look at that data and say, “I don’t consider this data private, I don’t mind if it is shared by default”. However, when I look at that data, I would say, “No way I would want that data shared.”. Both those perspectives are OK. We just have different opinions.

Opt-in supports both of our decisions. Opt-out says the individual’s opinion doesn’t really matter: I (the organization) want the data, and I will get better data with opt-out, so that is what I am going to do.

5 Likes

Said much better than I could have but spot on

I think the way they choose to implement it is significant. If a window pops up in the installer that says “we are going to collect this info unless you uncheck this box” and the box is checked by default, that is technically opt out. Communicating in a forward, transparent way where people can see what is happening and make a choice seems less problematic than if it is just “on”, and the user needs to find out about it on their own, or turning it off is hidden away in a settings menu somewhere.
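To make the difference concrete, here is a small Python sketch of the decision being argued about. `Policy` and `should_send` are made-up names for illustration, not anything from mdd or the Manjaro installer. The two policies only differ when the user never made a choice.

```python
from enum import Enum
from typing import Optional


class Policy(Enum):
    OPT_IN = "opt-in"
    OPT_OUT = "opt-out"


def should_send(policy: Policy, user_choice: Optional[bool]) -> bool:
    """Decide whether data is submitted.

    user_choice is True or False if the user answered, and None if the
    user was never asked or never touched the setting.
    """
    if user_choice is not None:
        # An explicit answer wins under either policy.
        return user_choice
    # No answer: opt-out sends by default, opt-in does not.
    return policy is Policy.OPT_OUT


if __name__ == "__main__":
    for policy in Policy:
        for choice in (True, False, None):
            print(policy.value, choice, should_send(policy, choice))
```

A pre-checked box in the installer is still the opt-out row of this table, but the user at least gets a visible chance to turn `None` into an explicit answer.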

That is not true. The “right to be forgotten” article of the GDPR includes quite a few exceptions to what data needs to be deleted. Here is the article: https://gdpr-info.eu/art-17-gdpr/

The exception relevant to a technical support forum like this would be this one:

Paragraphs 1 and 2 shall not apply to the extent that processing is necessary:

[…]
4. for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) in so far as the right referred to in paragraph 1 is likely to render impossible or seriously impair the achievement of the objectives of that processing

Deleting posts can remove information relevant to the context of a support thread, making the remaining pieces less useful or meaningless. Anonymizing an account removes all personal information that is required by GDPR.

See also this relevant discussion:

The answer in that thread mentions that even if other posts in the forum contain the (not anonymized) nick of an anonymized user, that still does not necessarily break GDPR compliance.

“If other posts contain the nick of the author of an anonymized post, that is considered a journalistic, academic, artistic or literary expression, so Art. 85 GDPR would apply, so the right of erasure does not apply to that.”

The bottom line is deleting user data is not the same thing as deleting content. If content can be anonymized, then it should be able to be preserved. This is obviously beneficial for the usefulness of the forum, because it prevents the site from turning into something like Reddit (where users can easily delete content whenever they feel like it).

Agreed. I should have been more exact with my statement.
In my original statement, “delete” should instead be “at minimum anonymize”.

Practically, unless we are talking about a huge company with big capital and financial interest in the actual content though, I doubt they can actually do anonymization (it’s simply not worth the hassle and resources and liability of doing it correctly), that’s why I mindlessly went with “delete”.

Take for example the wording you rightfully used about the nickname:

For a small business, it simply does not make any sense putting themselves in the position to evaluate what constitutes a “necessity”.

But your corrections are indeed more accurate and valuable, so thank you for taking the time to comment on it. :slightly_smiling_face:

Plus it includes the submitting IP address.
The Manjaro Team claimed that no personal data is used, which would exclude the IP address. I submitted data with the time zone Africa/Niamey in my hardware statistics. Look and see whether you can find a dot for Niger on their map:
https://metrics.manjaro.org/public-dashboards/cb0f690cba304389bf3ed2c254c14c01
Surely not, because my submitting IP is from Germany, and that is where it was counted in these statistics and on the map.

It is about trustworthiness.

In the initially mentioned press article above it says “Manjaro developer Roman Gilg has been working on the Manjaro Data Donor as a better means of user counting than the current ping-based approach with NetworkManager.”
Now, if you analyze a Manjaro system with opensnitch, it is not just NetworkManager: pacman-mirrors also sends something to the Manjaro ping server with every operation (even for pacman-mirrors --help).

In this particular example, “transparency” may look more like a limited hangout.

That has been like that for years now.

And that is exactly the point.
The Manjaro team has been doing the ping count for quite some time already, and they might be wondering why they now get pushback on a systemd service-based data submission.
As the pacman-mirrors pings come from its Python code, the only way to stop them is with opensnitch. Opensnitch is not widespread on Manjaro due to system challenges with the eBPF installation, so users simply don’t know…

I was hoping that Manjaro was finally becoming a decent distro, since it’s been a long time since their last controversy, and then this happens…

2 Likes

They are asking their forum members if it should be opt-in or opt-out.

You need to have at least Trustlevel 1 to be able to vote.

8 Likes

There are many reasons why people have dumped Manjaro in the past, ironically, this is one of the least egregious reasons… I wish them well, they have a habit of being utterly incapable of being anything other than tone deaf to concerns raised.

7 Likes

Anon user here. Thanks bugmenot!

I switched to Arch as a result of this proposal. It only took a couple hours, even with a full LUKS-encrypted disk.

I also took the opportunity to shit up their public data dashboard via their API, which accepted any JSON as long as the “device ID” was properly hashed and truncated. Didn’t even have to use mdd, just curl. But I wrote a little Python script to modify each submission and make sure it showed as a unique category for every metric they chose to display.

AFAIK this is not legally considered hacking - they opened their server to accept data of varying types, yet were disappointed when the data that came in was not what they expected to accept.
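For context on what “accept data of varying types” means in practice, here is a rough Python sketch of the kind of type and value check a receiving server could run on a submission. It is not Manjaro’s code. The field names are taken from the output posted earlier in the thread, while `EXPECTED_TYPES`, `ALLOWED_VALUES` and the allowed values themselves are assumptions made up for illustration.

```python
# Assumed expectations, based only on the field names in the posted output.
EXPECTED_TYPES = {
    ("cpu", "cores"): int,
    ("cpu", "threads"): int,
    ("boot", "uefi"): bool,
    ("desktop", "display"): str,
}

# Assumed allow-lists; a real service would define its own.
ALLOWED_VALUES = {
    ("cpu", "arch"): {"x86_64", "aarch64"},
    ("desktop", "display"): {"wayland", "x11"},
}


def lookup(payload, section, field):
    """Return payload[section][field], or None if the shape is wrong."""
    section_data = payload.get(section) if isinstance(payload, dict) else None
    if not isinstance(section_data, dict):
        return None
    return section_data.get(field)


def validate(payload):
    """Return a list of problems; an empty list means the checks passed."""
    errors = []
    for (section, field), expected in EXPECTED_TYPES.items():
        value = lookup(payload, section, field)
        # bool is a subclass of int in Python, so reject it for int fields.
        is_bool_as_int = expected is int and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_as_int:
            errors.append(f"{section}.{field}: expected {expected.__name__}")
    for (section, field), allowed in ALLOWED_VALUES.items():
        value = lookup(payload, section, field)
        if not isinstance(value, str) or value not in allowed:
            errors.append(f"{section}.{field}: unexpected value")
    return errors


if __name__ == "__main__":
    good = {"cpu": {"arch": "x86_64", "cores": 4, "threads": 4},
            "boot": {"uefi": True},
            "desktop": {"display": "wayland"}}
    print(validate(good))
```

Checks like this only catch values with the wrong shape. They cannot tell a plausible but false submission from a real one.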

Imagine, when you push data collection on users who (largely) chose to use your product to escape data collection… some of them are going to give you false data.

Edit: And no, I’m not the same person they describe as a “headless chicken” in the linked thread… Just some random who heard about it and decided to act on it.

1 Like