- Mavens: large public databases with focus on information quantity
- Salesmen: data appeal by simple formats/standards
- Connectors: wiki-based community/knowledge portals
The second point is data appeal through simple formats and standards. Data formats and standards have played an important role in the success and popularity of certain research areas. Masanori notes that:
"In biology, the readability of raw data affects popularity. In fact, metabolism, the primary research topic in metabolomics, is notorious for its incomprehensibility and many researchers stayed away from metabolic networks containing lengthy structural and stoichiometric information. The KEGG database gained popularity for its oversimplified representation of metabolic networks: each metabolite is represented as a node without structure, and each reaction as a binary relationship without stoichiometry. Although its oversimplification resulted in considerable misunderstandings, the KEGG database boosted the graph-oriented analysis of metabolic pathways, and consequently, it awakened the interest of the research community in metabolism. Many popular databases containing gene expression or protein–protein interaction data also use simple notations."
The third point is the use of wikis as a major platform for hosting biological information. As a matter of fact, major biology databases are in the process of moving to wiki-based sites, and the use of wikis for this purpose is gaining momentum. Further, he notes:
"We, as scientists, should pay more attention to the evolution of web information because wiki embodies the quintessence of all sciences: the acquisition of knowledge through open discussion."
Openness is not the only reason to favor wikis over traditional databases. The situation is quite complicated for curated and annotated non-wiki databases, where the evolution of the data remains hard to trace. Was the database updated? If so, why, and was the change discussed in an appropriate forum beforehand? These questions remain a gray area for non-wiki sites. Take the BioModels database, which publishes systems biology models as releases. Each release contains a few new models, but it also includes old models that may or may not have been modified since the previous release. The current BioModels implementation has no clear mechanism for tracking the changes a given model goes through over time. From a user's perspective, tracking revisions and edits is really important.
The other issue is whether curation projects have a backup mechanism in place. I raise this issue because funding sources for biological databases are quite limited, which means that sooner or later some projects will run out of funds (a recent example is the Arabidopsis resource, TAIR). Rather than asking funding agencies for a sustainable funding model for biological databases, I would suggest that project managers be asked to state in their project proposals how they will keep the data stream alive if the funding runs out. I think there are several options for keeping the data alive after a short-lived project ends. The simplest is to dump the whole database into SourceForge or another repository. The best option is to make Wikipedia your new home. In all fairness, I am not against long-term funding of database projects, but it should be a goal-oriented, merit-based decision, and even then very few projects will succeed.
Not everything is well with the wiki option either; for example, there are no incentives for participating in crowdsourcing efforts. But it offers a better chance, and hope.
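To make the release-tracking problem raised above for BioModels a bit more concrete, here is a minimal sketch of how a user might detect which models changed between two releases by comparing content hashes. This is only an illustration under my own assumptions: the model IDs, the placeholder contents, and the `diff_releases` helper are hypothetical, and nothing here reflects how BioModels actually stores or distributes its releases.

```python
import hashlib
from typing import Dict, List


def fingerprint(content: str) -> str:
    """Return a SHA-256 digest of a model's serialized content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def diff_releases(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify model IDs as added, removed, modified, or unchanged.

    Both arguments map a model ID to the full text of that model
    (an assumed in-memory stand-in for the files of one release).
    """
    old_ids, new_ids = set(old), set(new)
    shared = old_ids & new_ids
    modified = [m for m in shared if fingerprint(old[m]) != fingerprint(new[m])]
    unchanged = [m for m in shared if m not in modified]
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        "modified": sorted(modified),
        "unchanged": sorted(unchanged),
    }


# Hypothetical releases: IDs and contents are placeholders, not real models.
release_a = {"MODEL_A": "<sbml>v1</sbml>", "MODEL_B": "<sbml>v1</sbml>"}
release_b = {
    "MODEL_A": "<sbml>v1</sbml>",
    "MODEL_B": "<sbml>v2</sbml>",
    "MODEL_C": "<sbml>v1</sbml>",
}

report = diff_releases(release_a, release_b)
for status, ids in report.items():
    print(status, ids)
```

A hash comparison like this only says that a model changed, not why it changed or whether anyone discussed the change, and it will also flag purely cosmetic differences such as whitespace. That missing context is exactly what a wiki's revision history provides.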