http://www.perlmonks.org?node_id=513844

I was re-reading Distribution of * new * Levels and Writeups and was inspired to do some further explorations from the Overall Stats to get a better sense of the distribution between various levels.

In order to avoid skewing the results, I manually corrected for three outlier "monks":

The table below shows some of these corrected stats, including distributions of posts and total XP for each level. (Yes, that's right, Archbishop Merlyn accounts for 1% of the total posts and XP for all of Perlmonks.) There's also an interesting pattern of average XP per post across the levels, with the peak coming in the middle-levels.

Perlmonks Statistical Analysis -- 2005-12-03

Title       Level  Population  Posts per Monk  XP per Post  % of Total Posts  % of Total XP
Archbishop     23           1            5577        10.38                1%             1%
Bishop         22           6            4289        10.74                6%             5%
Chancellor     21           6            3395        10.22                5%             4%
Canon          20           8            1941        12.69                3%             4%
Abbot          19          11            1395        13.53                3%             4%
Monsignor      18          18            1169        11.86                5%             5%
Prior          17          29             679        15.59                4%             6%
Parson         16          37             599        13.28                5%             6%
Vicar          15          49             539        11.40                6%             6%
Priest         14          92             334        14.04                7%             8%
Curate         13         164             225        15.21                8%            11%
Deacon         12          87             181        14.55                3%             4%
Chaplain       11         158             142        14.31                5%             6%
Hermit         10         189             113        13.53                5%             6%
Friar           9         275              85        12.93                5%             6%
Pilgrim         8         375              56        13.06                5%             5%
Monk            7         322              44        11.32                3%             3%
Scribe          6         523              27        11.57                3%             3%
Beadle          5         508              19        10.45                2%             2%
Sexton          4         580              12         9.57                2%             1%
Acolyte         3         967               8         8.30                2%             1%
Novice          2        1858               5         6.75                2%             1%
Initiate        1       30823               1         1.04               10%             1%

I've produced a couple graphics that illustrate some of the relationships.

I draw no conclusions from any of this -- I just like playing with statistics. Among other things, stats like "XP/Post" blend the XP from both writeups and voting, so I admit that referring to that as a "quality" metric is a bit of a misnomer.

Update: As Limbic~Region points out and I mention above, this is not based on per-post statistics. XP/Post is just the ratio of total XP to total number of writeups.
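To make that ratio concrete, here is a minimal Python sketch that rebuilds approximate level totals from three rows of the table (population, posts per monk, XP per post) and then pools them. The class, field, and function names are made up for illustration, and the reconstructed totals are only approximate because the table's figures are already rounded. It is not the Excel calculation itself, just the same arithmetic under those assumptions.

```python
from dataclasses import dataclass


@dataclass
class LevelRow:
    """One row of the stats table (names are illustrative, not from the source)."""
    title: str
    population: int
    posts_per_monk: float  # average writeups per monk at this level
    xp_per_post: float     # total XP / total writeups, not per-node reputation

    @property
    def total_posts(self) -> float:
        return self.population * self.posts_per_monk

    @property
    def total_xp(self) -> float:
        return self.total_posts * self.xp_per_post


def xp_per_post(total_xp: float, total_writeups: float) -> float:
    """Ratio of total XP to total writeups; 0.0 when there are no writeups."""
    if total_writeups == 0:
        return 0.0
    return total_xp / total_writeups


# Three rows copied from the table; every other level is omitted here.
rows = [
    LevelRow("Archbishop", 1, 5577, 10.38),
    LevelRow("Curate", 164, 225, 15.21),
    LevelRow("Novice", 1858, 5, 6.75),
]

# Pooling divides summed XP by summed writeups, so levels with more
# writeups weigh more than in a plain average of the three ratios.
pooled = xp_per_post(
    sum(r.total_xp for r in rows),
    sum(r.total_posts for r in rows),
)
simple_mean = sum(r.xp_per_post for r in rows) / len(rows)

for r in rows:
    print(f"{r.title:<10} posts~{r.total_posts:>7.0f}  xp~{r.total_xp:>8.0f}")
print(f"pooled XP/writeup: {pooled:.2f}  mean of ratios: {simple_mean:.2f}")
```

Because the XP total includes XP from voting and other sources, the pooled figure says nothing about how any single node was voted on.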

-xdg

Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: More PM stats analysis on new levels
by ambrus (Abbot) on Dec 03, 2005 at 21:08 UTC

    Just out of curiosity, have you got the data from jcwren's stats page, or have you used some other data source?

      Yes, it's from there. Then I did my adjustments and calculations in Excel.

      -xdg


        xdg,
        If you obtained the information externally then I have to question your use of "XP per Post". There have been a number of Perl Monks Discussions over the years about having a way to break down where XP came from (logging in, casting all your votes, voting on a node, having one of your nodes voted on, etc). From my limited knowledge of the topic, there is no way to know how much XP a particular node earned you without keeping meticulous track yourself.

        I would only suggest that you change the column heading to "XP / Writeups" as I assume that's what you did.

        Cheers - L~R

Re: More PM stats analysis on new levels
by DrHyde (Prior) on Dec 06, 2005 at 10:12 UTC
    There's also an interesting pattern of average XP per post across the levels, with the peak coming in the middle-levels.
    It would be very interesting to see whether this applies to individuals as well as to Perlmonks as a whole. By which I mean, as someone goes up the first few levels, does their average XP per post increase, level off, and then decrease?

    I speculate that there are two things in play here - first, in the lower levels people can get XP (or at least they used to, I confess I've not bothered reading the details for the new system cos it just doesn't matter to me that much any more) just for logging in. Second, people eventually get over XP-whoring, and are more willing to post stuff that won't get many points, or even get negative points.

    Alternatively, instead of getting over XP whoring, higher level monks' propensity to submit what they know will be low-XP posts may have something to do with them previously not being able to advance any more levels. I wonder if the new system will have any effect on that. I suspect not.

Re: More PM stats analysis on new levels
by demerphq (Chancellor) on Dec 05, 2005 at 11:45 UTC

    I don't know if jcwren has stats on it, but it might be an idea to remove the "zombie" initiates from the stats. If jcwren doesn't have them and you are still interested, let me know and I'll see if I can come up with some for you.

    ---
    $world=~s/war/peace/g

      What do you mean "zombie" initiates? Inactive?

      Another question -- is jcwren getting a direct dump/feed from the database or pulling via the XML feeds? I was considering using the XML feeds to pull down a summary of all the nodes so I could examine reputation, not just XP. For example, reputation from initial posts vs from replies, or in different categories, or the ratio of total reputation to total XP.

      Is that possible -- is reputation available for nodes other than my own? My quick scan of the XML generators didn't reveal it.

      -xdg


        What do you mean "zombie" initiates? Inactive?

        Sorry, I should have been more clear. Zombies are users that never posted, never voted, and never really used their accounts.

        I think we could look into providing you a batch of more specific data. I'd have to think a bit about how to present the info so that it doesn't tell you each node's rep exactly, but does allow you to do your stats. If you can suggest forms of the info that would be sufficiently useful to you but sufficiently anonymous that I can give them to you, I'd be happy to do so.

        ---
        $world=~s/war/peace/g

        Well, I put together the following query for you. I don't think it's exactly what you had in mind, but it's more than nothing. It's a breakdown of posts by type by level of author. Of course it's by level of author _now_, not when originally posted. It does not include reaped nodes.

        And this is the breakdown of the notes by the type of the root node of the thread.

        ---
        $world=~s/war/peace/g

        I also put this one together for you. It's a breakdown of posts by type, level of poster and (bucketized) node reputation.


        -- Count nodes by node type, author's current level, and reputation bucket.
        select t.title typetitle, lb.level, CEIL(n.reputation/10)*10 noderep, count(n.node_id) nodecount
        from node n, node a, user u, node t, level_buckets lb
        where n.author_user = a.node_id                 -- post -> its author's node
        and   n.type_nodetype = t.node_id               -- post -> its node type
        and   a.node_id = u.user_id                     -- author's node -> user record
        and   CEIL(u.experience/10)*10 = lb.experience  -- XP rounded up to a multiple of 10 -> level
        and   n.author_user != 52855                    -- skip one specific author
        and   n.type_nodetype in (31670, 1042, 31663, 1036, 11, 935, 1588, 173295, 121, 120, 23614, 23615, 115,
        956, 389544, 1584, 337433, 1440, 7487, 7488, 1980, 1981, 1748, 1749)
        group by t.title, lb.level, noderep
        order by t.title, lb.level, noderep
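The bucketing in that query is the `CEIL(n.reputation/10)*10` expression: each reputation is rounded up to the next multiple of 10, so the bucket labelled 10 holds reputations 1 through 10 and the bucket labelled 0 holds -9 through 0. A small Python sketch of the same rounding follows. The function names and the sample reputations are invented for illustration, and it uses integer ceiling division in place of the database's CEIL.

```python
from collections import Counter


def rep_bucket(reputation: int, width: int = 10) -> int:
    """Round a reputation up to the next multiple of `width`."""
    # -(-x // w) is ceiling division on integers, so no float rounding is involved.
    return -(-reputation // width) * width


def bucket_counts(reputations):
    """Count how many reputations fall into each bucket."""
    return Counter(rep_bucket(r) for r in reputations)


# Made-up reputations, including a negative one and exact multiples of 10.
sample = [-3, 0, 1, 10, 11, 25, 30]
counts = bucket_counts(sample)

for bucket in sorted(counts):
    print(f"noderep {bucket:>3}: {counts[bucket]} node(s)")
```

Reporting only the bucket label is what keeps exact node reputations out of the result while still showing the shape of the distribution.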

Re: More PM stats analysis on new levels
by McDarren (Abbot) on Dec 05, 2005 at 11:58 UTC
    Perhaps worth noting that the Archbishop population doubled yesterday.

    ++ to Ovid (not that he really needs any ;)

individual reaching a level vs level itself
by TimButterfield (Monk) on Dec 05, 2005 at 17:28 UTC

    I do not know the source of the data or how it is stored. Please keep that in mind as I post this supposition.

    It appears that the stats at any one level include those of all prior levels for the individuals at that level. If the posts that were analyzed are not for a narrow time period, the output is thus for the individual who reached a certain level and not for the level itself. For example, the chart shows 275 friars. If the posts analyzed cover a wide date range, perhaps some of those at higher levels also posted while they were at the friar level. Those XP/posts are attributed to the higher levels, though they were actually made while the individual was at the friar level. If the data were available, it would be interesting to see if a certain level is more active than another. As an individual progresses up through the levels, does the XP/post change for a certain level?

      That would require a timeseries analysis and, as far as I know, the data are not readily available. As you say, this is a static analysis and should be read as such. I.e. across all individuals who are currently at the level of Friar, what is their total number of XP and what is their total number of writeups.

      There is a huge danger in stats like these that people will confuse correlation with causation. These are descriptive statistics only.

      -xdg


Re: More PM stats analysis on new levels
by psychotic (Beadle) on Dec 06, 2005 at 00:59 UTC
    I think another interesting statistic would be XP per level per day. For instance, monks of Title Curate average 20 XP per day. I believe it would be somewhat representative of the participation and commitment each level shows to the PerlMonks monastery.
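One way such a statistic could be computed is sketched below in Python, assuming each monk's total XP and registration date were available (neither appears in the table in the root node). The records, field layout, and function names are hypothetical. It divides each monk's XP by the age of the account in days and then averages those rates within each level.

```python
from collections import defaultdict
from datetime import date


def xp_per_day(total_xp: float, joined: date, as_of: date) -> float:
    """Total XP divided by account age in days (at least one day)."""
    days = max((as_of - joined).days, 1)
    return total_xp / days


def average_by_level(monks, as_of: date) -> dict:
    """Average the per-monk XP/day rates within each level title."""
    rates = defaultdict(list)
    for title, total_xp, joined in monks:
        rates[title].append(xp_per_day(total_xp, joined, as_of))
    return {title: sum(r) / len(r) for title, r in rates.items()}


# Hypothetical records: (level title, total XP, registration date).
monks = [
    ("Curate", 3600, date(2005, 6, 6)),   # 180 days before the snapshot
    ("Curate", 900, date(2005, 9, 4)),    # 90 days before the snapshot
    ("Friar", 450, date(2005, 9, 4)),
]

snapshot = date(2005, 12, 3)  # date of the stats in the root node
averages = average_by_level(monks, snapshot)

for title, rate in averages.items():
    print(f"{title}: {rate:.1f} XP/day")
```

Dividing by account age treats long idle stretches the same as active ones, so a rate like this reflects lifetime pace rather than current participation.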