
ModStatistics 1.0.3 - Anonymous mod usage statistics - Now for public distribution!


Majiir


The problem I see is that the plugin, as it is now, can be used to profile and identify a single user with 100% accuracy, based on what the JSON file transfers.

{"started":"2014-07-18T08:13:37.382Z",
 "finished":"2014-07-18T08:40:59.621Z",
 "crashed":false,
 "statisticsVersion":7,
 "platform":"Windows",
 "id":"8a841a8d74054a18abd7db9f05ea8658",
 "installedWithSteam":true,
 "gameVersion":{"build":549,
   "major":0,
   "minor":24,
   "revision":0,
   "experimental":0,
   "isBeta":false,
   "isSteam":true,
   "is64":true},
 "scenes":{"loading":58812.3639,
   "mainmenu":13135.7512,
   "spacecenter":388282.2085,
   "editor":282945.1836,
   "flight":777710.4824,
   "trackstation":41504.3739,
   "sph":79646.5556},
 "systemInfo":{"cpus":6,
   "gpuMemory":1990,
   "gpuVendorId":4318,
   "systemMemory":16384}}

(This is not everything the file transfers; I added line breaks to make it easier to read.) Also, of course, the IP address is sent too.

When you put this in, I was at first not skeptical, because for some reason I assumed that when you say "mod statistics" you realize yourself that your plugin should only collect mod statistics. But I see an ID there that screams "user tracking". If it isn't that, what is it and what does it do? Why are you collecting a unique ID in your report? Anonymous statistics are never supposed to collect unique IDs.

Also to echo previous posts, (IP + ID = privacy issue)

Edited by eRe4s3r

The problem I see is that the plugin, as it is now, can be used to profile and identify a single user based on what the JSON file transfers.

(This is not everything the file transfers; I added line breaks to make it easier to read.)

When you put this in, I was at first not sceptical, because for some reason I assumed that when you say "mod statistics" you realize yourself that your plugin should only collect mod statistics. But I see an ID there that screams "user tracking". If it isn't that, what is it and what does it do? Why are you collecting a unique ID in your report? Anonymous statistics are never supposed to collect unique IDs.

On first start it generates a GUID and saves it to the config file so it can be reused on future runs, so yes that does serve to uniquely identify a KSP install. And it'll even track with the user across multiple networks (remember you're making a web request, so the script on the far end has access to your IP address which is presumably logged).
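The generate-once, reuse-forever behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch, not the plugin's actual code (the plugin is a .NET assembly); the function name `load_or_create_install_id`, the JSON file layout, and the `id` key are assumptions made for the example.

```python
import json
import uuid
from pathlib import Path


def load_or_create_install_id(config_path: Path) -> str:
    """Return the ID stored in config_path, creating one on first run."""
    if config_path.exists():
        # Later runs: reuse the stored value, so every report carries the same ID.
        return json.loads(config_path.read_text())["id"]
    # First run: generate a random 128-bit identifier (32 hex characters,
    # the same shape as the "id" field in the report quoted earlier).
    install_id = uuid.uuid4().hex
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps({"id": install_id}))
    return install_id
```

Because the value lives on disk rather than being derived from the network, it stays the same wherever the install is run, which is the point made about it tracking across multiple networks.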


"Harmful" "infringing" and "wage a modding war" are some strong statements if you want to have a civil discussion.

I think my post was plenty civil. I'm happy to explain my word choices if you'd like.

I felt that the disabler mod was the only way to resolve the issue without relying on the mod itself to respect a config file option.

If the plugin did not respect the config file, that would be a very different case. The fact is that it does, and that won't be changing. (If I did something like that, the disabler plugin wouldn't exactly work either.)

I have no desire to "improve" third-party opt-out data collection, as I don't believe that any data should be collected from uninformed users without their explicit consent, nor should any such data be retained without robust data protection and detailed privacy and retention plans. Commercial organizations and even large governments have trouble with data which is naively "anonymized" or appears to be "innocuous"; it's something that should never be taken lightly.

Okay. You certainly don't have to join my project, but releasing the disabler was out of line. Not "civil" if you will.

My main concern is that every time I hear mention of providing the collected stats to the public, it sounds like said stats need to be sanitized first. If the RAW stats are not acceptable to release to the public AS IS, then it implies the plugin is collecting data that it should not be.

Indeed, and a number of people have argued to me that I shouldn't waste my time with it. There is one thing that's important to change for a dump, and that's the random number sent with each report. If a malicious entity were to inspect ModStatistics while it's running and fetch that ID, they could then have knowledge that some statistics data is associated with the currently running game. Now, I have no idea how you'd do anything bad with that, but I'd like to be cautious and prevent it from happening anyway. So, "sanitization" means replacing those with new random numbers.
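A minimal sketch of that kind of sanitization, assuming each report is a dictionary with an `id` key as in the sample quoted earlier. Whether the real dump keeps reports from one install linked to each other is not stated in the thread; this sketch assumes it does, by mapping each original ID to one fresh random value. The function name `sanitize_ids` is hypothetical.

```python
import secrets


def sanitize_ids(reports: list[dict]) -> list[dict]:
    """Return copies of reports with each distinct "id" replaced by a fresh random value."""
    replacements: dict[str, str] = {}
    sanitized = []
    for report in reports:
        original = report["id"]
        if original not in replacements:
            # 16 random bytes -> 32 hex characters; not derived from the original ID.
            replacements[original] = secrets.token_hex(16)
        sanitized.append({**report, "id": replacements[original]})
    return sanitized
```

The replacement table has to be thrown away after the dump is produced; anyone holding it could map published IDs back to the live ones.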

I think the issue here is transparency. It's all well and good to tell someone it doesn't collect XYZ, but showing what is collected is better.

ModStatistics saves reports when the game closes, and it only sends them when you start it back up. You can look in the ModStatistics folder and see reports that will be sent on next start. It's in a human-readable format once you add some newlines.
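For anyone who wants to read a pending report without adding the newlines by hand, a small Python helper along these lines should do it. The function name `pretty_report` is made up for this example; it only assumes the report is valid JSON on a single line.

```python
import json


def pretty_report(raw: str) -> str:
    """Re-indent a single-line JSON report so it is easier to read."""
    return json.dumps(json.loads(raw), indent=2)
```

Calling `print(pretty_report(text))` on the contents of a report file shows one field per line.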

The opt-out feature has a basic problem: without starting the game there is NO CONFIG file in which you could disable it.

And that means during the first start it will most likely send the statistics anyway.

This is not the case. It will only send statistics the second time you start it. There is always a chance to edit the settings file before it matters.
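The save-on-exit, send-on-next-start flow can be illustrated with a rough Python sketch. It is not the plugin's code; the function names, the `report-*.json` naming, and the `disabled` flag being passed in as a boolean are all assumptions for the example.

```python
from pathlib import Path


def reports_to_send(stats_dir: Path, disabled: bool) -> list[Path]:
    """Return the saved reports that would be uploaded on this start."""
    if disabled or not stats_dir.exists():
        # Opted out, or first start: nothing has been saved yet.
        return []
    return sorted(stats_dir.glob("report-*.json"))


def save_report_on_exit(stats_dir: Path, name: str, body: str) -> Path:
    """Write a report to disk when the game closes; it is not sent here."""
    stats_dir.mkdir(parents=True, exist_ok=True)
    path = stats_dir / f"report-{name}.json"
    path.write_text(body)
    return path
```

On a first start there is nothing on disk, so nothing can be sent; the window between the first exit and the second start is when the setting can be changed.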

I think the real cherry on top in there is the anger about the lack of information or UI for the user, though. Either it's trolling or there's quite a lot of anger in that paragraph.

I don't really get angry.

If forfeit really wanted to do some good, he could have released a version that adds a new UI so that users can disable the plugin without editing config files. Now, the moderators would still have removed the download and contacted me, but I'd be very likely to say "you really ought to have asked, but now that we're here, may I include this in the real build?"

Is the takeaway from this that mod authors are not allowed to reuse assembly names, namespaces, class or variable names from other code?

You're walking a fine line on a technicality, but the reality of your intent is clear. We don't have to play these kinds of games.

As I mentioned earlier, I spent a while speaking with Squad and some of the forum moderators. At first, I was really ambivalent about the idea of mods disabling each other. I thought, who cares, the config option is there already. But as we discussed more, I realized this is a road nobody wants to go down. Do we really want to live out some hypothetical where Kethane distributes ModStats, MechJeb distributes a disabler, and then the two start eradicating each other? Do we want to set a precedent where that kind of interaction between mods is accepted? ModStatistics is controversial, which is to say that some people like it and some people don't. But I don't think anybody wants to see mods start fighting it out in GameData.

Majiir, while I am sure you had the best of intentions this is not a choice YOU, as a player and a modder should make. This is a choice that the individual downloading a mod or plugin should have.

I agree. You are notified about the presence of ModStatistics before you download a mod; you don't have to download it. If you do, you are notified about the presence of ModStatistics as you launch the game; you can disable it. I am hearing feedback that there should be a more obvious notification and that it should be easier to disable, and I will be acting on that feedback.

Especially since, no matter how this shakes out, it's something the community is not soon going to forget, and I really do question if it's worth all of the turmoil and violation of trust (real or perceived, it's irrelevant) that has occurred.

May be time to decide what's more important - being right, or being effective. Because from where I stand, this does not look very effective.

I think it's important to set the record straight. I also think it's important to share my future plans with the community. Some people aren't interested in listening, but those who are should have the opportunity to know my thoughts and respond to them.

And, of course, I have mentioned that I will be changing the plugin. It would be nice if the thread calmed down; I'd have more time to work on the code. :wink:

The problem I see is that the plugin, as it is now, can be used to profile and identify a single user with 100% accuracy, based on what the JSON file transfers. Also, of course, the IP address is sent too.

That's a bit of hyperbole. The "ID" that's sent is a random number. So yes, when a user plays KSP multiple times, their reports are all known to originate from the same user. That's it. There is nothing sent that can actually identify a person. If you think this still presents a risk, please let me know why so that I can address it. It could be reasonable to hash the incoming IDs, much like passwords are hashed in databases. I'd like to know what exactly your concern is, though.

Regarding IP addresses: It is impossible to communicate over the Internet without "sending" your IP address. For those worried about geolocation: I tried locating one of my servers, and one source said it was in Dallas, TX; another said Chicago, IL; and yet another said New York, NY. The server is actually located in Washington, DC. [EDIT] And that server hasn't changed location or IP in five years.


This is not the case. It will only send statistics the second time you start it. There is always a chance to edit the settings file before it matters

This needs to be made explicit in the OP. It would relieve a lot of fears I think.

Is it also correct to think that no data would be sent if the user deleted settings.cfg between every run of KSP? (each run works as a first run?)

Finally, what is the JsonFX.dll and where is its source code? nvm, found the link.

Edited by kujuman

Majiir, I am not resorting to hyperbole ;) I understand what you are trying to do here, but you misunderstand my view as a user. In a post-Snowden era, any sane user mistrusts broad data collection, no matter the intention. And we cannot know how secure your server is, what your log storage guidelines are, or who else has access to this data.

I mentioned the IP because with the amount of statistics you collect, together with IP you *can* create a unique user profile. But this is not what the plugin was supposed to do! The plugin itself should do the pruning and anonymization. All the server receives should be "Mod name +1" statistics or "Ram 16GB +1" statistics. All the "unique ID" stuff should be local.
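The "+1" idea could look something like the following Python sketch, in which the client reduces a report to a list of counter names before anything is sent. The function name, the counter naming scheme, the separate `mods` argument, and the reading of `systemMemory` as megabytes are assumptions for illustration; the sample report above does not show a mod list.

```python
def to_counters(report: dict, mods: list[str]) -> list[str]:
    """Reduce a full report to anonymous counter names; nothing else leaves the machine."""
    counters = [f"mod:{name}" for name in sorted(mods)]
    counters.append(f"platform:{report['platform']}")
    # Bucket RAM to whole gigabytes, e.g. 16384 MB -> "ram:16GB".
    counters.append(f"ram:{report['systemInfo']['systemMemory'] // 1024}GB")
    return counters
```

The server would then only increment a tally per counter name. The cost of this design is that the tallies can no longer be related to each other or across sessions.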

I would accept this plugin if it were sending ANONYMOUS statistics and prunes 90% of the json file before sending it. :kiss:

Or are you honestly afraid someone is gonna "cheat" your stats? ;P


As I mentioned earlier, I spent a while speaking with Squad and some of the forum moderators. At first, I was really ambivalent about the idea of mods disabling each other. I thought, who cares, the config option is there already. But as we discussed more, I realized this is a road nobody wants to go down. Do we really want to live out some hypothetical where Kethane distributes ModStats, MechJeb distributes a disabler, and then the two start eradicating each other? Do we want to set a precedent where that kind of interaction between mods is accepted? ModStatistics is controversial, which is to say that some people like it and some people don't. But I don't think anybody wants to see mods start fighting it out in GameData.

At this point, your distribution scheme for ModStatistics is fairly close to spyware/crapware, based on the following criteria:

- The auto-replication code (which wouldn't be necessary if there was a single instance of ModStatistics in its own GameData directory, you know, like if it was explicitly installed by a user).

- It opts users in by default (although you've said you're fixing this), but this wouldn't be an issue if the user had to explicitly opt in in the first place by downloading the mod separately.

- Unique identification of the user (addressed further below).

I'm not sure I subscribe to your slippery slope argument of e.g. MechJeb including a ModStatistics disabling plugin, since the point of the ModStatistics disabling plugins is to disable something that was shoved onto the user's computer in the first place. That concern could be addressed by re-licensing the disabling plugin using a 'No License'/ARR license, and not granting any 3rd parties distribution rights.

That's a bit of hyperbole. The "ID" that's sent is a random number. So yes, when a user plays KSP multiple times, their reports are all known to originate from the same user. That's it. There is nothing sent that can actually identify a person. If you think this still presents a risk, please let me know why so that I can address it. It could be reasonable to hash the incoming IDs, much like passwords are hashed in databases. I'd like to know what exactly your concern is, though.

Regarding IP addresses: It is impossible to communicate over the Internet without "sending" your IP address. For those worried about geolocation: I tried locating one of my servers, and one source said it was in Dallas, TX; another said Chicago, IL; and yet another said New York, NY. The server is actually located in Washington, DC. [EDIT] And that server hasn't changed location or IP in five years.

Yes but that random number uniquely identifies my installation of KSP. It identifies it at home, it identifies it at work, it identifies it at the coffee shop (fun fact, a whole lot of work went into designing privacy extensions into IPv6 because of this concern with MAC addresses). Hashing it does nothing useful (since the purpose of a hashing function is to produce the same output for a given input every time). I know it seemed like a really good idea at the time, but this, coupled with encouraging mod authors to bundle your DLL in their distributions, really makes this, potentially quite useful, plugin look like spyware/crapware. Oh and this unique identifier would likely be classified under the "Fairly Intrusive" category of the EU Cookie Directive, if someone were to lodge a privacy complaint against you in the EU.
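The point about hashing can be shown in a few lines of Python. SHA-256 is used here only as an example of a hash function (the thread does not say which one would be used), and the function names and report layout are hypothetical.

```python
import hashlib


def hashed_id(install_id: str) -> str:
    """Hash an install ID with SHA-256 (deterministic: same input, same output)."""
    return hashlib.sha256(install_id.encode("utf-8")).hexdigest()


def linkable(report_a: dict, report_b: dict) -> bool:
    """Two reports can be tied together whenever their hashed IDs match."""
    return hashed_id(report_a["id"]) == hashed_id(report_b["id"])
```

A report sent from home and one sent from a coffee shop still hash to the same value, so the hashed ID links them exactly as the raw ID did.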


This needs to be made explicit in the OP. It would relieve a lot of fears I think.

To solve everyone's fears the plugin needs to not be distributed by 3rd parties. That makes it 100% opt-in and eliminates the need for the self replication code people have objections to (it would presumably still have a self-update capability, but not as pervasive as what's there now).


Is it also correct to think that no data would be sent if the user deleted settings.cfg between every run of KSP? (each run works as a first run?)

I had to double-check the source on this. No, that will cause the plugin to run. However, if you deleted the entire ModStatistics folder each time, then it would indeed never send data. I'm not sure why someone would delete just the settings file each time and not the pending report files right next to it, so I don't think this is an issue.

Finally, what is the JsonFX.dll and where is its source code?

JsonFx is a third-party library for serializing object structures to JSON, which is the format used over the wire. Its source code is here: https://github.com/jsonfx/jsonfx


Majiir, I am not resorting to hyperbole ;) I understand what you are trying to do here, but you misunderstand my view as a user. In a post-Snowden era, any sane user mistrusts broad data collection, no matter the intention. And we cannot know how secure your server is, what your log storage guidelines are, or who else has access to this data.

As I have said before: There is an easy way to permanently disable the plugin, and I will be making it even easier in a future release.

I mentioned the IP because with the amount of statistics you collect, together with IP you *can* create a unique user profile. But this is not what the plugin was supposed to do! The plugin itself should do the pruning and anonymization. All the server receives should be "Mod name +1" statistics or "Ram 16GB +1" statistics. All the "unique ID" stuff should be local.

I don't understand what the concern is here. With the ID, I can know something like "a user that previously had KAS and Kethane installed is now only using KAS". This is a user profile, but in what way is that kind of correlation "not what the plugin was supposed to do"? What bad information could I get from this that I shouldn't be able to?
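To make that kind of correlation concrete, here is a rough Python sketch that groups reports by ID and compares the first and last mod lists. The `mods` field and the function name are assumptions for the example; the sample report quoted earlier was truncated and does not show how installed mods are listed.

```python
from collections import defaultdict


def mod_changes(reports: list[dict]) -> dict[str, tuple[set, set]]:
    """For each ID, return (mods added, mods removed) between its first and last report."""
    by_id: dict[str, list[dict]] = defaultdict(list)
    for report in reports:
        by_id[report["id"]].append(report)
    changes = {}
    for install_id, items in by_id.items():
        # ISO 8601 UTC timestamps in the same format sort chronologically as strings.
        items.sort(key=lambda r: r["started"])
        first, last = set(items[0]["mods"]), set(items[-1]["mods"])
        changes[install_id] = (last - first, first - last)
    return changes
```

Without a persistent ID the grouping step is impossible, which is why this question and the privacy concern are two sides of the same field.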

You're correct that an IP address is there, but again, it's impossible to write a plugin that doesn't send an IP. (I really don't like that KSP offers a "don't send my IP" checkbox when that is simply not possible short of bundling Tor with the game, which I'd argue is even more egregious.) I suggest anyone who's concerned use a VPN or anonymizing service, since many web-based systems have things like session IDs, user IDs, et cetera. (I used to sell VPN service, as a matter of fact!)


If forfeit really wanted to do some good, he could have released a version that adds a new UI so that users can disable the plugin without editing config files.

You recognize how hypocritical this is, right? His plugin was released precisely because people have trouble editing config files. And here you are criticizing him for not having a UI in his user-opted-in-so-why-does-it-need-a-ui addon while defending your UI-less, "must edit config to opt out", might-be-unintentionally-installed self-replicating plugin.

I don't care if you're planning a UI now after community backlash, until/if it's done is when it counts.


I had to double-check the source on this. No, that will cause the plugin to run. However, if you deleted the entire ModStatistics folder each time, then it would indeed never send data. I'm not sure why someone would delete just the settings file each time and not the pending report files right next to it, so I don't think this is an issue.

JsonFx is a third-party library for serializing object structures to JSON, which is the format used over the wire. Its source code is here: https://github.com/jsonfx/jsonfx

I was just practicing reading/understanding code written by someone else :D Thanks for your help.


Okay guys, time to Chill Pill.


This is the release thread of an addon which is by all means totally legal and abides by all forum rules. Majiir is making a tremendous effort to answer all of your concerns, but this is once again quickly getting out of hand, and I can already feel the temperature rising above average.

If we start seeing low flying blades again, the moderation team will be forced to start handing out infractions, and no one really wants that, us included.

If you're not in here to be constructive, it might be safer to refrain from commenting.

Thanks to you all, carry on.


I mentioned the IP because with the amount of statistics you collect, together with IP you *can* create a unique user profile. But this is not what the plugin was supposed to do! The plugin itself should do the pruning and anonymization. All the server receives should be "Mod name +1" statistics or "Ram 16GB +1" statistics. All the "unique ID" stuff should be local.

That would be data without context.

Or are you honestly afraid someone is gonna "cheat" your stats? ;P

Non sequitur?


Majiir: Until you can get the button working the way you want it to, is there any reason it wouldn't work to just add, after line 106 (and basically copying the update option code):


new Callback(() => { disabled = GUILayout.Toggle(disabled, "Disable anonymous data collection"); }),

or maybe (not sure if Unity works like this, but if it does)


new Callback(() => { disabled = !(GUILayout.Toggle(!disabled, "Enable anonymous data collection")); }),

to make the first-run window prompt for data collection as well as updates? It's not as good as a button always visible, for sure, but it might be an OK hotfix if it works.

Edited by cpast

What bad information could I get from this that I shouldn't be able to?

This

"started":"2014-07-18T08:13:37.382Z"

"finished":"2014-07-18T08:40:59.621Z"

;) As you say, opt out it is
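What those two fields give away can be worked out with a short Python sketch. The function names are made up for the example; the timestamp format is taken from the sample report quoted earlier.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_utc(stamp: str) -> datetime:
    """Parse a report timestamp such as "2014-07-18T08:13:37.382Z" as UTC."""
    return datetime.strptime(stamp, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def session_summary(report: dict) -> dict:
    """Derive what the two timestamps reveal: date, start hour (UTC), length."""
    started = parse_utc(report["started"])
    finished = parse_utc(report["finished"])
    return {
        "date": started.date().isoformat(),
        "start_hour_utc": started.hour,
        "minutes_played": round((finished - started).total_seconds() / 60, 1),
    }
```

For the sample report this yields a session on 2014-07-18 starting in the 08:00 UTC hour and lasting about 27.4 minutes. Collected over many sessions under one ID, that becomes a record of when someone is at their computer.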


md5

A completely unrelated nit-picking here: md5 - not even once.

I appreciate there are no cryptographic requirements in your scenario, but this ancient hash algorithm just needs to die. Every recommendation for it just encourages new products to adopt its broken security.


I will just point out that the *meat* of the information is in the anonymous ID and the session data. For example, consider these use cases.

1. RSS uses a ton of memory. I want to know if my user base is mostly running 64bit KSP now. I can find that out with ModStats, and if almost everyone uses x64, I can not worry so much about memory usage.

2. Do people with the RO suite of mods have longer load times than usual? Which mods in that suite cause the longest load times? I can find out with ModStats and then prioritize speeding up the slowest.

3. With all the parts and fiddly bits added in the RO suite, do people spend more time than average in the VAB and less time than average in flight?

4. What is my turnover like? Do people try the RO suite, find it too challenging, and quit, or once they're hooked do they keep playing? With ModStats I can find out, and if necessary make changes.

5. Are people missing some required mods? If people have most but not all of the RO mods, for example, that means the odd mod out needs some work, maybe. ModStats can tell me.

Pretty much all these require a persistent (though totally anonymous) ID across sessions, all of these require some metadata (like what version of KSP, and therefore which OS), and none of these reveal anything damaging. Unless that's information that should default to not being available to me?
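As an illustration of the first use case, and of why it leans on a persistent ID, here is a small Python sketch that counts each install once, using its most recent report. The function name is hypothetical, and reading `is64` from inside `gameVersion` is an assumption based on the sample report quoted earlier in the thread.

```python
def share_of_64bit_installs(reports: list[dict]) -> float:
    """Fraction of distinct installs whose most recent report says is64 is true."""
    latest: dict[str, dict] = {}
    for report in reports:
        current = latest.get(report["id"])
        # ISO 8601 UTC timestamps in the same format compare correctly as strings.
        if current is None or report["started"] > current["started"]:
            latest[report["id"]] = report
    if not latest:
        return 0.0
    return sum(1 for r in latest.values() if r["gameVersion"]["is64"]) / len(latest)
```

Without the ID, a heavy player who sends many reports would be counted many times and skew the share.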

None of my mods include ModStats yet, but these use cases are very appealing, as I hope you will agree.


Sadly, because of some of your statements, I am going to have to check the folders of every mod I install until you end this project.

As a user all I can say is PLEASE stop this project. It is simply not worth the controversy and breaking of trust over the increased numbers you get with opt-out.


I will just point out that the *meat* of the information is in the anonymous ID and the session data. For example, consider these use cases.

1. RSS uses a ton of memory. I want to know if my user base is mostly running 64bit KSP now. I can find that out with ModStats, and if almost everyone uses x64, I can not worry so much about memory usage.

2. Do people with the RO suite of mods have longer load times than usual? Which mods in that suite cause the longest load times? I can find out with ModStats and then prioritize speeding up the slowest.

3. With all the parts and fiddly bits added in the RO suite, do people spend more time than average in the VAB and less time than average in flight?

4. What is my turnover like? Do people try the RO suite, find it too challenging, and quit, or once they're hooked do they keep playing? With ModStats I can find out, and if necessary make changes.

5. Are people missing some required mods? If people have most but not all of the RO mods, for example, that means the odd mod out needs some work, maybe. ModStats can tell me.

Pretty much all these require a persistent (though totally anonymous) ID across sessions, all of these require some metadata (like what version of KSP, and therefore which OS), and none of these reveal anything damaging. Unless that's information that should default to not being available to me?

None of my mods include ModStats yet, but these use cases are very appealing, as I hope you will agree.

While I do not disagree with the intention of this plugin, I do, however, disagree with its implementation. It should be optional, not mandatory. Right now it is pretty much: you use it, or you figure out where it is and remove it manually.


While I do not disagree with the intention of this plugin, I do, however, disagree with its implementation. It should be optional, not mandatory. Right now it is pretty much: you use it, or you figure out where it is and remove it manually.

Sadly, because of some of your statements, I am going to have to check the folders of every mod I install until you end this project.

As a user all I can say is PLEASE stop this project. It is simply not worth the controversy and breaking of trust over the increased numbers you get with opt-out.

Or you follow the instructions on page 1: make a ModStatistics folder in GameData, make a settings.cfg with "disabled = true", and never worry about it again. Literally every post saying "Oh, I have to go through each plugin to see if it uses this" is wrong on two counts. One, there is a config way to disable it permanently, and two, all mods using it must announce that fact on their forum thread per the distribution rules. Moreover, unless you have the config already there, adding a new mod with ModStatistics will generate a popup window on the main menu to ask if you want autoupdates. That's not to say it shouldn't also let you disable stats collection -- it should. But it's not like this is some nefarious thing that is near-impossible to notice or disable.

Edited by cpast

I think the disabler plugin is harmful in this regard. It's no easier to use than the original mod; it doesn't provide any information or UI to the user; and it disables the mod in a way that's unreliable and impermanent. Furthermore, this kind of hostile mod interaction is not a good sign; it's much better for modders to communicate rather than wage a modding war. If the author of the disabler plugin (or anyone else) would like to improve ModStatistics, feel free to shoot me a message.

This, once again, sounds like all the reason ever needed to make it 100% opt-in; to not bundle it with any other mods; and to not be so flippant about people wanting it gone from their systems in every way. As long as you go for the "just opt out by disabling" route, hostile interaction will be the only way possible to get the mod setup some want.

Oh, and if you don't want it to be hostile, stop using hostile language; stop using hostile terms of use (for users and for other modders); and stop trying to dodge the responsibility for what is going on. For instance, ModStatistics in its current form absolutely sends data to third parties: that third party being you, not the author of the mod where it is included. This is why bundling is such a horribly bad idea.

A disabler or complete removal mod is far more reliable than keeping the mod installed and hoping that everyone respects the config file (or even the base url for the reports). This isn't just about what happens now, but how it will evolve in the future, especially since you are trying to get more and more mod makers to use it (for no useful reason whatsoever).

You're walking a fine line on a technicality, but the reality of your intent is clear. We don't have to play these kinds of games.

Yes, his intent is very clear: he has no interest in infringing on your rights; he just wants to ensure that your mod never goes active. Not only is he allowed to create such a mod by looking at your code (per the GitHub rules), he isn't even using your code to do so.

Still, yes, there's no need to play these games. Make it opt-in, remove the bundling, and stop expressing such massive entitlement to other people's data. If people want to send mod usage stats, they can download this one mod that does that, and all the mod authors can look in the central repository for the statistics of their mods without having to force it on users.


Hello, I just noticed this mod in my GameData folder. I never installed it, so I guess it came with another mod.

I just wanted to pop in and voice my disapproval of what this mod does.

I would advise the author to discontinue this mod until an in-game GUI opt-in mechanism is in place AND a clear list of transmitted information is easily and readily available to the end user, for his own protection.

I would also advise end users to avoid using mods which try to implement ModStatistics in its current form.

I am currently working as a university assistant lecturer on information systems security if that has any bearing on my statements.


I will just point out that the *meat* of the information is in the anonymous ID and the session data. For example, consider these use cases.

1. RSS uses a ton of memory. I want to know if my user base is mostly running 64bit KSP now. I can find that out with ModStats, and if almost everyone uses x64, I can not worry so much about memory usage.

2. Do people with the RO suite of mods have longer load times than usual? Which mods in that suite cause the longest load times? I can find out with ModStats and then prioritize speeding up the slowest.

3. With all the parts and fiddly bits added in the RO suite, do people spend more time than average in the VAB and less time than average in flight?

4. What is my turnover like? Do people try the RO suite, find it too challenging, and quit, or once they're hooked do they keep playing? With ModStats I can find out, and if necessary make changes.

5. Are people missing some required mods? If people have most but not all of the RO mods, for example, that means the odd mod out needs some work, maybe. ModStats can tell me.

Pretty much all these require a persistent (though totally anonymous) ID across sessions, all of these require some metadata (like what version of KSP, and therefore which OS), and none of these reveal anything damaging. Unless that's information that should default to not being available to me?

None of my mods include ModStats yet, but these use cases are very appealing, as I hope you will agree.

I should point out that, while 64-bit KSP does remove the memory cap, there's still the total system memory to worry about. I'm hitting 85% usage on my 8 gigs, even with ATM. Without it I was in the 90s. Sooo... please do keep memory in mind for those of us who can't afford 32 GB of memory :P


This topic is now closed to further replies.