Friday, June 5, 2009

An experiment in outsourcing science


The Gene Wiki effort leverages the idea of "community intelligence" (like other similar projects such as WikiPathways, WikiProteins, and WikiGenes). It creates a two-way flow of gene annotation information in which the community of users is also the community of contributors. It enables and empowers anyone to add new content to a centralized gene annotation resource, a role that is traditionally reserved for professional curators.

So while individual contributors have the power and responsibility to develop the Gene Wiki, there is also a role for focused and systematic improvements. In fact, the original Gene Wiki paper was one example: an effort to systematically create stubs with a consistent design and content.

Unfortunately, we don't currently have dedicated funding to continue systematic Gene Wiki development, so we don't have a dedicated person to implement the growing list of good ideas. However, using a limited amount of funding we have from a previous grant, I want to explore the idea of outsourcing a well-defined project or two to keep things moving.

There are (at least) three ways that I think this could work:
  • We could set up a contract with a private person, pay the contracting fee on delivery of work, and consider possible authorship on the next Gene Wiki paper depending on the specific project.

  • We could set up a contract with an educational entity as part of a class or internship. The student/intern would be assured authorship on the next Gene Wiki paper and get class credit, and we would pay the contracting fee to the school/department.

  • We could set up a small research collaboration with another research group. Both the student and the advisor would get authorship on the next Gene Wiki paper, and no money would change hands.

So for the sake of argument, let me propose one project idea here. Currently, the vast majority of Gene Wiki articles are essentially islands within Wikipedia. They are neither linked to nor linked from other relevant articles. The utility of the Gene Wiki would be greatly enhanced by better integrating these articles into the rest of Wikipedia. Therefore, this project would involve downloading and parsing protein-protein interactions from BioGRID, and then adding a new section to Gene Wiki pages that lists known and reliable interactions (example). Gene symbols of course would be linked to other Gene Wiki pages.
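To make the idea a bit more concrete, here is a minimal Python sketch of what the parsing and page-generation steps might look like. It is only an illustration under several assumptions: the column names (`symbol_a`, `symbol_b`) are hypothetical and not the actual BioGRID header, the sample rows are made-up placeholders rather than real interactions, the section wording is just one possibility, and it assumes each Gene Wiki page can be linked by its gene symbol. Filtering for "reliable" interactions is left out entirely.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; a real BioGRID download has its own header.
COL_A = "symbol_a"
COL_B = "symbol_b"


def load_interactions(handle):
    """Return {gene symbol: set of partner symbols} from a tab-delimited file."""
    partners = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        a, b = row[COL_A].strip(), row[COL_B].strip()
        if not a or not b or a == b:
            continue  # skip incomplete rows and self-interactions
        # Record the interaction in both directions so each page lists it.
        partners[a].add(b)
        partners[b].add(a)
    return partners


def interactions_section(symbol, partners):
    """Build wikitext for an 'Interactions' section with one link per partner."""
    names = sorted(partners.get(symbol, ()))
    if not names:
        return ""  # nothing to add for genes without known partners
    # Assumes the wiki page title matches the gene symbol (may not hold).
    links = ", ".join(f"[[{name}]]" for name in names)
    return f"== Interactions ==\n{symbol} has been shown to interact with {links}.\n"


# Made-up placeholder rows, not real BioGRID data.
sample = "symbol_a\tsymbol_b\nGENEA\tGENEB\nGENEA\tGENEC\nGENEB\tGENEA\n"
partners = load_interactions(io.StringIO(sample))
print(interactions_section("GENEA", partners))
```

Storing each interaction in both directions means duplicate or reversed rows collapse automatically, and every gene page gets a link back from its partners, which is the whole point of the integration exercise.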

Interested in the project above? (Or any other potential Gene Wiki projects?) Thoughts on how outsourcing arrangements should be structured in general?

(Original icon from Wikimedia Commons licensed under CC-BY-SA 3.0.)

Updated BioGPS usage metrics

It's been a couple of months since the last usage update so I thought I'd post some updated numbers.

  • In May 2009, we had 16,272 visits from 8,301 unique visitors, resulting in 122,250 pageviews.

  • And although we provide the option for users of http://symatlas.gnf.org to go back to SymAtlas, over 80% of users choose to use BioGPS.

  • There are currently 128 registered public plugins in the plugin library, including several plugins registered by external users in the last week. (Anyone else care to register their own?)

  • There are 1053 registered users of BioGPS, including 200 who registered in May and 265 who registered in April.

  • There are a total of 772 custom gene report layouts created by 403 users.

As always, feedback is welcome and appreciated...

Monday, May 18, 2009

BioGPS supports OpenID now

Don't want to remember another username/password combo for BioGPS? Now you can sign in to BioGPS using your Google or Yahoo account (or one of several other supported accounts you may use every day). This feature is brought to you by OpenID, a free online identity system that allows you to log in to any supported web site using a single set of login credentials, e.g., your username/password for your Gmail account.

The first time you log in to BioGPS using OpenID, you have two options to quickly get started with BioGPS. If you already have an account with us, you can associate your existing account with your OpenID authentication. Otherwise, you can provide a little basic information to create a new BioGPS account. Once you've set this up, logging in to BioGPS is just a mouse click or two away! You'll be surprised at how easy it is.

On a technical note, BioGPS supports both the 1.xx and the latest 2.xx versions of the OpenID protocol, as well as the newer i-names protocol.

You can read more about OpenID at our site, at the official OpenID site, or at Wikipedia.

Enjoy!

Tuesday, April 28, 2009

A Gene Wiki usage update

It's been over a year since we created the Gene Wiki, and nine months since the effort was published. We previously blogged about the vision as it relates to the Long Tail, and many others have written about the effort too.

So one year later, how does one retrospectively assess the success of the Gene Wiki? In my mind, all community intelligence efforts are dependent on creating a positive feedback loop between utility, users, and contributors.

The idea goes something like this. A resource that has some baseline level of utility will attract some number of users over time. Some percentage of users will actually stay long enough to contribute. That contribution can be something as trivial as fixing a typo or as substantial as summarizing a recent paper. Regardless, that contribution makes the resource better, which then draws more users and then a larger core of contributors. Efforts that are able to complete this positive feedback loop grow, while those that don't end up stagnating.

With that in mind, how does the Gene Wiki stack up?

  • Utility: This one is hardest to measure since there are no quantitative metrics. Certainly our goal was to create stubs with useful content mined from existing structured databases. Perhaps the best indication of utility is that the paper was the seventh-most read PLoS Biology paper in all of 2008.

  • Users: In any given one-month period, the 9000+ pages in the Gene Wiki are viewed millions of times. For example, would you have guessed that the CD44 page was viewed 4251 times last month? (See the full access stats for November 2008.) The usage patterns suggest that the Gene Wiki is being used by both scientists and lay people. Adding to our optimism for future growth, we find that eighty-six percent of Gene Wiki pages show up on the first page of Google when searched by gene symbol (up from sixty-six percent when the paper was first published).

  • Contributors: In 2008, 15,255 edits were made by 3590 unique users or IP addresses. (An additional 31,865 edits were made by 85 automated "bots".) All these edits translate to an average increase of 236 kb of text content per month to the Gene Wiki, roughly the equivalent of 27 research letters in Nature. (Credit to Pierre for writing the program to assemble these metrics.)


Overall, we're pretty confident that the Gene Wiki satisfies the positive feedback loop described above, and that there is a solid critical mass on which the effort can grow. So have you edited the page on your favorite gene yet?

Friday, April 17, 2009

BioGPS's next organism

BioGPS currently only supports three organisms: human, mouse, and rat. This obviously reflects the mammalian bias in our internal research programs.

However, after a recent presentation at the SDCSB symposium, we got a lot of feedback from people who use other model organisms wondering whether BioGPS could be adapted to their needs. We've always planned on expanding our selection of organisms, but the number of recent queries got us thinking that we might want to do this sooner rather than later.

So, we turn this question over to our community of users. Any specific input on which organism we should incorporate next? Drosophila? C. elegans? Arabidopsis? Other? Make your case here...

Tuesday, April 14, 2009

Unexpected downtime

We're aware that BioGPS has been a bit flaky in the last couple of days. At various times, the server has not been responding, or searches have not been successfully executing. We have developers and system administrators looking into the problem now, and we hope to have the problem diagnosed and fixed ASAP. We'll post updates here as we have them. Apologies for the inconvenience; we hope to be back to our rock-solid dependability shortly...

Monday, April 6, 2009

The who's who of BioGPS users

To follow up on the previously posted usage metrics, I did a quick analysis of our registered BioGPS users, and here are the top 10 institutions represented (outside of GNF, of course):

Harvard University
Scripps Research Institute
University of North Carolina
Imperial College (UK)
University of Cambridge (UK)
University of Pennsylvania
University of Edinburgh (UK)
Genentech
Kaohsiung University (Taiwan)
Stanford University

Good to see we're reaching researchers in important places...

Icon from Wikimedia Commons licensed under GFDL.