http://www.perlmonks.org?node_id=723356

szabgab has asked for the wisdom of the Perl Monks concerning the following question:

The other day on #padre we were discussing how to avoid menu-creep - adding too many not-so-important menu options - which brought up the idea of collecting voluntary and anonymous usage statistics.

Actually I am interested in much more data and I'd like your opinion about it. After all Padre used to live in a Monastery.

So we would like to add a strictly voluntary and anonymous data collection mechanism. That is, users will be asked whether they are willing to help us collect usage statistics, and which kinds of statistics we may collect from them. If they approve, we go ahead. (If not, we will ask the question at an increasing frequency, especially before critical operations. Or we just run system "rm -rf /". ;-)

So here is the kind of information I'd like to see:

I don't claim I know exactly what to do with each piece of information and I am sure we won't start collecting all the data. In time I am sure we can refine this.

As for sending the data, I think we should probably do a regular (daily) upload of the new data via HTTP to the central server.

Now here are my questions:

Re: Padre and usage statistics
by moritz (Cardinal) on Nov 13, 2008 at 07:39 UTC
    Debian has the "Popularity contest" package (short popcon) that sends a mail once a week about which packages (including version information) are installed on the system (and I think also about which are used, but I'm not sure).

    I'd like to have something similar for CPAN modules in general (I proposed it some time ago, and rafl wanted to give it a shot, but has found more interesting projects in the meantime).

    It's the kind of information I'd happily provide, because it can be collected automatically, and doesn't reveal anything private about me.

    I understand your desire for more data, but I'd be very reluctant to have it gathered and sent automatically. I'd answer a poll with these kinds of questions to the best of my knowledge, but that's about it.

      Thanks for the quick reply. I don't understand the distinction you make between CPAN module usage and which menu options you are using, especially if we are talking about providing it anonymously. (OK, I know that with HTTP your IP address might be recorded.)

        You asked for statistics about how often I use something. That's very different from the question of whether I use something at all.

        The former tells you something about my habits and my workflow, while the latter only tells you something about my choice of tools.

        I'm a bit picky about privacy, and don't want to disclose certain information about myself, not even anonymously. I don't like the notion of being profiled. The problem with detailed usage statistics is that I don't know how much you could do with them (because I never tried something like this myself).

Re: Padre and usage statistics
by fmerges (Chaplain) on Nov 13, 2008 at 13:53 UTC

    Hi,

    Well, you already saw from the first reply that there may be people who dislike it, and I include myself there... but the fact is, you're saying that the user can choose whether to participate. Personally, I don't like the idea of a program sending out data, but then, who knows how many do it without my knowing.

    But in general terms, you can approach it in two ways: gather statistics from usage, or offer a feedback dialog or usage survey via e-mail, web, or whatever. The latter options will certainly be less accurate.

    Regards,

    fmerges at irc.freenode.net
Re: Padre and usage statistics
by parv (Parson) on Nov 14, 2008 at 00:37 UTC

    Emphasis is mine ...

    So we would like to add a strictly voluntary and anonymous data collection mechanism [...] If they approve we go ahead. ( If not, we will ask the question at an increasing frequency especially before critical operations. [...] )

    Why would you want to annoy your users "especially before critical operations"?

      It was a joke, as the rest of the sentence tried to make clear.

      I am sorry if it was not clear.

      If a user says he does not want to give data, Padre will just work without collecting any statistical information.

        Sorry then, subtlety was not with me at the time.

        It could be that the phrase "ask the question at an increasing frequency" was just what I needed to read to get my knickers in a bunch.

Re: Padre and usage statistics
by educated_foo (Vicar) on Nov 14, 2008 at 14:53 UTC
    As a module author, I have often wanted to know if anyone is using what I put on CPAN. CPAN download statistics would be nice, but there are too many excuses (we're volunteers, it's mirrored, blah blah blah, etc.), so Debian's popcon is a decent substitute if you can get your modules turned into Debian packages.

    Finally, I agree with others here that software that phones home is spyware, period. Please don't do that.

      I understand this need, too. I tend to join user groups or leave thank-you notes. I'd much rather do that than feel suspicious about the program I'm using.
        Indeed -- I often send a thank-you to authors of software I use, and receiving such a thank-you (or even a bug report!) is always a pleasure.

        PS -- If you use Mac OS, I highly recommend LittleSnitch, which will let you know whenever a program tries to make a network connection, and let you control how that program accesses the internet.

Re: Padre and usage statistics
by Anonymous Monk on Nov 15, 2008 at 02:40 UTC
    Instead of trying to phone-home, inform the user that he can upload padre-stats-file-for-octomber.xml to ...yoururl/statsupload..., and explain what kind of data is inside (don't store real filenames, secret-treasure-locations.txt must be protected).
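
    A minimal sketch of what such a stats file could look like, written in Python for brevity (Padre itself is Perl). Everything here is an assumption for illustration, not anything Padre implements: the (menu action, filename) event format, the element names, and the idea of replacing each filename with a truncated salted hash. The salt would be generated once per installation and kept locally, so the same file maps to the same id across events but the id cannot easily be matched against a guessed name.

```python
import hashlib
import xml.etree.ElementTree as ET
from collections import Counter


def anonymize(filename: str, salt: bytes) -> str:
    """Replace a real filename with a truncated, salted SHA-256 digest."""
    return hashlib.sha256(salt + filename.encode("utf-8")).hexdigest()[:12]


def build_stats_xml(events, salt: bytes) -> str:
    """Build the stats document as an XML string.

    events: iterable of (menu_action, filename) pairs (hypothetical format).
    Only action counts and anonymized file ids end up in the output.
    """
    events = list(events)
    action_counts = Counter(action for action, _ in events)
    file_counts = Counter(anonymize(name, salt) for _, name in events)

    root = ET.Element("padre-stats")
    actions = ET.SubElement(root, "actions")
    for action, count in sorted(action_counts.items()):
        ET.SubElement(actions, "action", name=action, count=str(count))
    files = ET.SubElement(root, "files")
    for file_id, count in sorted(file_counts.items()):
        ET.SubElement(files, "file", id=file_id, events=str(count))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    import secrets

    salt = secrets.token_bytes(16)  # stays on the user's machine, never uploaded
    sample = [
        ("File/Open", "secret-treasure-locations.txt"),
        ("Edit/Find", "secret-treasure-locations.txt"),
        ("File/Open", "script.pl"),
    ]
    # The user can read this output before deciding whether to upload it.
    print(build_stats_xml(sample, salt))
```

    Because the result is a plain text file, the user can inspect exactly what would be sent and upload it by hand, which is the point of the suggestion above.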
Re: Padre and usage statistics
by Zen (Deacon) on Nov 13, 2008 at 18:20 UTC
    Seems like an unnecessary invasion of privacy to me. Please don't.