The following is an archived discussion of a featured article nomination. Please do not modify it. Subsequent comments should be made on the article's talk page or in Wikipedia talk:Featured article candidates. No further edits should be made to this page.

The article was promoted by User:SandyGeorgia 00:26, 11 October 2008 [1].


Rosetta@home

Nominator(s): Emw2012 (talk)


I'm nominating this article because I think it fulfills the featured article criteria. I've been working regularly on the article for the better part of three months, and now feel that it does justice to an interesting and important example of distributed computing being used for protein structure prediction. During my time working on the article, it has become listed as a good article and undergone two peer reviews (one before and one after GAN). David Baker, the head scientist on the Rosetta team, has read the article and called it an "outstanding job"; I've incorporated his emailed suggestions. Thanks in advance for comments and suggestions. Emw2012 (talk) 05:39, 28 September 2008 (UTC)

  • Alignment has been fixed and forced sizes have been removed for all images, but I think the image in the Project significance section (especially) and the image in the Volunteer contributions section (currently 180px × 122px and 180px × 83px) are now too small to convey their intended information. According to WP:MoS#Images, there are exceptions to the policy on forced sizes: "Images in which a small region is relevant, but cropping to that region would reduce the coherence of the image" (e.g. the detailed screensaver in 'Project significance') and "Detailed maps, diagrams or charts" (e.g. the bar chart in 'Volunteer contributions'). Considering that, I'd like to restore the previous sizing for those images (300px × 203px and 450px × 207px, respectively), or something very slightly smaller. Please let me know what you think. Emw2012 (talk) 08:41, 28 September 2008 (UTC)
My default is 180 px, and I expect to have to click on images if I want more detail - that's the whole point of thumbs. I'm not going to oppose just on this issue, and for time reasons I'm unlikely to be able to do a full review, so I probably won't support either unless it's still here in two weeks' time. Really just wanted to raise the issue (if it hadn't been FAC I would have just removed the forced image sizes). jimfbleak (talk) 16:34, 28 September 2008 (UTC)
  • Unless anyone objects, I'll keep off forced image sizes per your suggestion. Emw2012 (talk) 22:17, 28 September 2008 (UTC)

Image question - What efforts have been made to get the publishers to release the non-free screenshots on a GFDL licence? Fasach Nua (talk) 13:32, 28 September 2008 (UTC)

  • Originally, all of the images were non-free. I emailed the creator of the Rosetta@home logo; he said something to the effect of "it would be fine to use the image on Wikipedia", but did not respond when I asked him to fill out the standard free license release form. Considering that, I haven't made an effort to get the screensaver freely licensed by the Baker lab. I will email them again later today. The next two images, superpositions of solved and predicted protein structures, were both made by me in PyMOL after a fairly long search for the atomic coordinates of the predicted structures. The bar chart in 'Volunteer contributions' took a while to get appropriately licensed, but now all images on http://boincstats.com are under a free CC license. Emw2012 (talk) 15:40, 28 September 2008 (UTC)
  • For the Rosetta@home screensaver image ([2]), would a free license apply to only that particular screenshot of the screensaver, or all screenshots of that type of Rosetta@home screensaver? Emw2012 (talk) 16:24, 29 September 2008 (UTC)
  • Also, if an image (e.g. the Rosetta@home logo) were not under a free license, would it not be shown alongside the lead if the article were to be made 'Today's featured article' some time in the future? Emw2012 (talk) 02:06, 1 October 2008 (UTC)

Comments

  • I saw that on the link checker as well, but somehow could still access the site. I'm not sure what's going on there -- perhaps I should remove the link and only include the other reference information? Emw2012 (talk) 15:40, 28 September 2008 (UTC)
I had no problem accessing the site. Perhaps link checker is incorrect in this case. —Mattisse (Talk) 15:58, 28 September 2008 (UTC)
I don't blindly trust the link checker; I always try to click through to the article itself. In this case, I'm still getting a 'forbidden' notice; perhaps you both are on an academic network? It's a Wiley Science reference, it appears, so by chance is this a scientific journal accessed through a database? Ealdgyth - Talk 16:05, 28 September 2008 (UTC)
I am not on an academic network. Just an ordinary, commercial IP. —Mattisse (Talk) 16:19, 28 September 2008 (UTC)
I too had no problems with this. Graham Colm Talk 16:28, 28 September 2008 (UTC)
  • "David Baker's Rosetta@home journal archives" is the actual title of that page, but I've added proper author (David Baker) and publisher (University of Washington) information. Emw2012 (talk) 15:40, 28 September 2008 (UTC)
  • All cited forum posts are authored by either project scientists (e.g. principal investigator David Baker; project scientists are listed as such under their username in each post) or, in one case, a moderator of the forum (moderators in this forum are liaisons between project scientists and project volunteers). I'm aware of the WP policy against using forum posts as references, but consider this particular kind of forum posting both verifiable and reliable, given that the posts are made by project scientists or by forum moderators appointed and endorsed by project scientists. I have only used these forum posts in cases where they provided information that is otherwise unavailable, for example in the project website, the scientific literature, or other sources. Emw2012 (talk) 15:40, 28 September 2008 (UTC)
  • I forgot to mention one forum post that was by a regular user, current reference #61 ("Foldit forums: How many users does Foldit have? Etc. (message 2)". Retrieved on 2008-09-27.). Considering that it simply explains how to estimate the number of Foldit users by multiplying the number of users on each page of the list of all users by the number of pages in that list (i.e., 50 users/page * 1189 pages = 59,450 users), I think it is verifiable. Also, since the author is pseudonymous and the site's publisher is uncertain, I've omitted values for those attributes of the cite template. Emw2012 (talk) 22:17, 28 September 2008 (UTC)
Otherwise sources look okay, links check out with the link checker tool. Ealdgyth - Talk 14:12, 28 September 2008 (UTC)
  • I could've sworn I added publisher information for all forum references in a recent edit, but guess I hadn't. That's now done. In keeping with practice in scientific publications and what I see in other featured articles in the sciences, I have omitted listing publishers for journal citations. I have also left references to http://boincstats.com without a publisher, since no such information seems to be available (it is a website made by a single man, who I have listed as the author; rationale for reliability is in a previous comment). If they should be there, please let me know, along with anything else I should add. Emw2012 (talk) 04:03, 29 September 2008 (UTC)
  • Since Rosetta@home is run as a lowest-priority task, it throttles back whenever background processes (e.g., ripping/burning media files, virus scanning, etc.) request resources that Rosetta would otherwise be using -- see the sentence preceding your quotation. In light of that, the most important things would be power consumption and heat production, no? Emw2012 (talk) 02:06, 1 October 2008 (UTC)
  • While a few GPCR proteins have been crystallized, you're right that GPCRs (and membrane proteins in general) are especially difficult to solve in terms of structure. I've added information on what Rosetta@home is doing on this front in the last few sentences of the second paragraph of the 'Project significance' section. Emw2012 (talk) 02:06, 1 October 2008 (UTC)
  • Folding@home is interested in modeling (via molecular dynamics) the trajectories of the backbone and residues as the protein folds to native state. Although better understanding of those trajectories could possibly help structure prediction, Rosetta@home is much less interested in that, and instead focuses on the position of all parts of the protein in its native state. Rosetta's methodology for protein docking prediction is described in the third paragraph of the 'RosettaDock' section. Let me know if and perhaps how I can further clarify this. Emw2012 (talk) 02:06, 1 October 2008 (UTC)
Ok, now I understand. I have some minor concerns about the compliance with WP:RS, but I hope this stimulates the project team to publish a proper description report and help replace the citations to the discussion forum. Shyamal (talk) 03:53, 1 October 2008 (UTC)

Good luck. Shyamal (talk) 13:37, 29 September 2008 (UTC)

  • There isn't much information out there on how much bandwidth Rosetta@home uses per day (or per workunit). I've initiated a conversation at the Rosetta@home forums here: Daily bandwidth usage for Rosetta@home. Unfortunately neither a project scientist nor a moderator has dropped in, so there may be reliability issues. And though possible, it would probably be difficult to verify. Let me know what you think about including information from that Rosetta@home user regarding bandwidth usage. Emw2012 (talk) 02:06, 1 October 2008 (UTC)
  • It doesn't appear to be much, but I don't think a forum passes the WP:RS test. It appears that it takes 1-2 GB per month, which, if you're limited to 200 GB, is kind of significant. I wish there was something more reliable as a source. OrangeMarlin Talk• Contributions 21:07, 3 October 2008 (UTC)
  • I agree that there may be reliability (and verifiability) issues in forum posts by users who are neither project scientists nor moderators. Given the criteria at WP:SELFPUB, however, there may be a case for including the post in question.
Also, I'm not sure how you got to a bandwidth usage of 1-2 GB per month, considering that 1024 MB was the maximum requirement for the most bandwidth-hungry computer being measured (which had eight CPU cores, making it an outlier). The remaining computers being measured (all single core, 2.8-3.0 GHz CPUs) used around 250 MB per month on average, i.e. one 800th of a 200 GB-per-month capacity.
I agree that this is somewhat important information, but a well-vetted source seems simply unavailable. There is other equally important information that is unavailable for this and similar projects: how much extra power per hour is consumed by running the project, how much heat, how much RAM does an average workunit use, etc.? Because of a lack of reliable information, these questions may be beyond our current scope.
Finally, I want to reiterate that all but one other forum post referenced are written by project scientists and moderators, not miscellaneous users. So I think other forum references used hold significantly more weight. Excluding all forum references would seriously deprive the article of non-controversial and in my opinion acceptably-sourced information. Emw2012 (talk) 16:43, 4 October 2008 (UTC)
  • Sorry, bad math on my part. Dammit, I'm a doctor, not a computer scientist. (Can't use that Star Trek reference enough.) 250 MB is less than 0.1% of some of the limits I've read, unless you're using cellular access to the net. Not really worthy of adding to the article. OrangeMarlin Talk• Contributions 21:09, 9 October 2008 (UTC)
  • There are still missing publishers and incomplete information about sources, and no response on whether any potential criticism has been adequately researched and covered, considering the Baker endorsement. SandyGeorgia (Talk) 21:13, 8 October 2008 (UTC)
  • In my previous response to your concern over lack of publisher information, I said: "In keeping with practice in scientific publications and what I see in other featured articles in the sciences, I have omitted listing publishers for journal citations. I have also left references to http://boincstats.com without a publisher, since no such information seems to be available (it is a website made by a single man, who I have listed as the author; rationale for reliability is in a previous comment). If they should be there, please let me know, along with anything else I should add. Emw2012 (talk) 04:03, 29 September 2008 (UTC)". I just reviewed every reference again, and, among all the references that to my understanding need a publisher, found one without a publisher; there was also one without an author and one without an accessdate. Since several websites do not list an author, I have omitted that attribute from the corresponding references, listing only publishers. Let me know whether you think the information on references is now satisfactory; if it isn't, then please let me know which references to add to and what you'd like me to add. I will search around for any potential criticism and incorporate any findings before the end of Friday. Thanks again, Emw2012 (talk) 22:41, 8 October 2008 (UTC)
  • Here is a sample of the work needed, from two sections only; it should be apparent to reviewers when a statement is sourced to an internet forum or a self-published source, so they can evaluate the statements for reliability. Boincstats.com as publisher was missing on several in those sections only, forum sources weren't identified, and there were other misc citation items like missing accessdates. Please complete this work throughout. SandyGeorgia (Talk) 23:10, 8 October 2008 (UTC)
In light of there now being two experienced editors who have suggested removing subheadings in the 'Disease-related research' section, I'll take care of that soon. I will expand the subsections in 'Comparison to similar computing projects' to at least two paragraphs each; I think they can be filled without simply adding fluff. Emw2012 (talk) 02:50, 8 October 2008 (UTC)
  • I was hoping you would also clean up PMIDs from the top of the article, so I won't have to do that work. (Pointing to PMIDs is preferable to pointing to journal abstracts or journal free full text, as the journals sometimes take down abstracts or free full text. Also, it makes the citation method consistent with other bio/med articles, using Diberri's PMID template filler, and avoiding subscription only URLs. We should, however, link to the journal URL when it provides free full text not provided at PubMed Central. See Wikipedia:Wikipedia Signpost/2008-06-30/Dispatches.) SandyGeorgia (Talk) 21:19, 9 October 2008 (UTC)
  • This is turning out to take up quite a bit of my time. I noticed that citations were messy even outside of the section I reviewed. For example, there are a lot of citations that use "et al" using just the main author and not italicizing the et al. At this point, this article should not be promoted to FA until the citations are cleaned up. I'll work on them, but usually with articles I read the abstract or source to see if it confirms the statement. This may take me a long time. I should have looked more carefully. OrangeMarlin Talk• Contributions 21:31, 9 October 2008 (UTC)
  • Thanks for working on those, Orange; I was chipping away at a few of them myself, but it is time consuming. SandyGeorgia (Talk) 21:50, 9 October 2008 (UTC)
The above discussion is preserved as an archive. Please do not modify it. No further edits should be made to this page.