The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

Operator: Rick Block (talk)

Automatic or Manually Assisted: Automatic, unsupervised

Programming Language(s): Shell/awk w/ pywikipedia

Function Summary: Update Wikipedia:List of administrator hopefuls once a day

Edit period(s) (e.g. Continuous, daily, one time run): Daily

Already has a bot flag (Y/N): Y

Function Details: Similar to the Wikipedia:LA activity update the bot already does, but instead of updating admin activity status it updates the list of admin hopefuls. Because of the data folks would like to see in this list (see Wikipedia talk:List of administrator hopefuls), there's a fair amount of work to do. I'm open to suggestions, but what I have currently coded (as a standalone tool, not connected to the bot) is:

  1. Retrieve all user pages that are members of Category:Wikipedia administrator hopefuls using api.php?action=query&list=categorymembers
  2. Retrieve all RFA page names using a series of api.php?action=query&list=allpages queries
  3. For each user page in the category
    1. Check if the user is already an admin (against a list that has already been retrieved) and, if so, essentially skip this user
    2. Get all previous RFAs for this user by grepping in the list of RFA page names
    3. If the user has made at least 30 edits in the last 3 months based on a previously retrieved set of contributions for this user, include the user in the "active" set (and skip to the next user)
    4. Retrieve the user's last 30 contributions
    5. Retrieve the user's first contribution if we've never retrieved it before
    6. Determine if the user has made at least 30 edits in the last 3 months
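The activity test in steps 3.4 and 3.6 can be sketched in Python as follows. This is only an illustration, not the bot's actual shell/awk code: the function name `is_active` is hypothetical, and "3 months" is assumed here to mean 90 days. It relies on the observation that, given a user's 30 most recent contributions, the user has made at least 30 edits in the window exactly when the oldest of those 30 falls inside it.

```python
from datetime import datetime, timedelta, timezone


def is_active(contrib_times, now, min_edits=30, window_days=90):
    """Return True if at least min_edits contributions fall inside the window.

    contrib_times: contribution timestamps (datetime objects), in any order.
    window_days=90 is an assumed stand-in for "the last 3 months".
    """
    cutoff = now - timedelta(days=window_days)
    # Keep only the newest min_edits timestamps, newest first.
    recent = sorted(contrib_times, reverse=True)[:min_edits]
    # The user is active exactly when there are min_edits of them and the
    # oldest one is still inside the window.
    return len(recent) == min_edits and recent[-1] >= cutoff


now = datetime(2008, 9, 5, tzinfo=timezone.utc)
busy = [now - timedelta(days=d) for d in range(30)]        # one edit a day
quiet = [now - timedelta(days=10 * d) for d in range(30)]  # one edit every 10 days
print(is_active(busy, now), is_active(quiet, now))
```

Because only the 30 newest timestamps matter, one contributions query per user with a limit of 30 is enough to make the decision.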

I've put a throttle on the contribution fetches (but not the RFA fetches) that introduces a 10-second sleep after every 10 users. There are approximately 1000 pages in this category (currently roughly 300 "active" vs. 700 not so active). The net effect is that the tool executes 1000 allpages queries (to get the previous RFAs) and somewhat more than 700 contributions queries every time it runs (and takes about an hour 20 minutes).
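A throttle of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the tool's real code: `process_with_throttle` and the `fetch` callback are hypothetical names, and the sleep function is passed in as a parameter so the pacing can be checked without actually waiting.

```python
import time


def process_with_throttle(users, fetch, batch_size=10, pause_seconds=10,
                          sleep=time.sleep):
    """Call fetch(user) for each user, pausing after every batch_size users.

    users must be a list; fetch is a hypothetical per-user callback that
    would perform the contributions query.
    """
    results = {}
    for count, user in enumerate(users, start=1):
        results[user] = fetch(user)
        # Pause after each full batch, but skip the pause after the last user.
        if count % batch_size == 0 and count < len(users):
            sleep(pause_seconds)
    return results


pauses = []
names = ["User%d" % i for i in range(25)]
fetched = process_with_throttle(names, lambda u: u.lower(), sleep=pauses.append)
print(len(fetched), pauses)
```

Skipping the pause after the final user is a choice made in this sketch; the description above does not say whether the tool does the same.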

Update: The original version looked for previous RFAs user by user. The current version fetches all subpages under Wikipedia:Requests for adminship and uses this local file instead. This change reduces the 1000 allpages queries to 8, fetching the page titles for 500 RFAs at a time. -- Rick Block (talk) 14:37, 5 September 2008 (UTC)
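The batched approach could be sketched roughly as below. Several things here are assumptions rather than details from the request: the helper names `allpages_url` and `rfas_for_user` are hypothetical, the query uses present-day MediaWiki API parameters (`apprefix`, `apnamespace`, `aplimit`, `apcontinue`), which may differ from what was available in 2008, and the title matching assumes repeat RFAs are named with a trailing number (for example "Username 2"). No request is actually sent; the code only builds the query URL and performs the local lookup that replaces the per-user queries.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"
RFA_PREFIX = "Requests for adminship/"


def allpages_url(continue_from=None, limit=500):
    """Build one allpages query URL covering up to `limit` RFA subpages."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": 4,        # the "Wikipedia:" project namespace
        "apprefix": RFA_PREFIX,
        "aplimit": limit,
        "format": "json",
    }
    if continue_from is not None:
        # Continuation value taken from the previous batch's response.
        params["apcontinue"] = continue_from
    return API + "?" + urlencode(params)


def rfas_for_user(all_titles, username):
    """Pick one user's RFA pages out of the locally stored title list."""
    base = "Wikipedia:" + RFA_PREFIX + username
    matches = []
    for title in all_titles:
        if not title.startswith(base):
            continue
        suffix = title[len(base):]
        # Accept "Username" and "Username 2", but not "Username2" or "Usernamexyz".
        if suffix == "" or (suffix.startswith(" ") and suffix[1:].isdigit()):
            matches.append(title)
    return matches


titles = [
    "Wikipedia:Requests for adminship/Example",
    "Wikipedia:Requests for adminship/Example 2",
    "Wikipedia:Requests for adminship/Example2",
    "Wikipedia:Requests for adminship/Examplexyz",
]
print(allpages_url())
print(rfas_for_user(titles, "Example"))
```

The lookup in `rfas_for_user` plays the role of the grep against the local file, so the number of API queries depends on how many RFA pages exist rather than on how many users are in the category.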

Discussion

I don't really think the list needs to be updated daily. BJTalk 05:42, 7 September 2008 (UTC)

Concur with bjweeks. Unless there's a compelling reason to update daily, I think weekly, at a maximum, would suffice. SQLQuery me! 06:27, 7 September 2008 (UTC)
The reason to update daily is that the "not so active" list includes the date of the "most recent" contribution of these users. Updating weekly, this date will be up to a week off. On the other hand, I suspect a user who is not making at least 10 edits a month has effectively a 0% chance of passing an RFA. If the list is only going to be updated weekly, I'd prefer to drop the "latest contribution" date. On yet another hand, are 700 contribution fetches once a day anything more than an infinitesimal amount of load? -- Rick Block (talk) 17:28, 7 September 2008 (UTC)
I don't see why it matters if it's a week off; no one's going to pass RFA if they've only been active for a week in the past 3 months. A note at the top of the list giving the date of the last update should be fine to avoid any possible confusion. Mr.Z-man 17:34, 7 September 2008 (UTC)
The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.