10 Sarcastic Rules on How to Be a Bioinformatician

The advent of fast 3D gaming PCs, the Internet and massive sequencing efforts has attracted hackers and failed wet-lab biologists to the bioinformatics field. What follows is a compendium of 10 “sarcastic” rules that illustrate how a few months at the computer can save a few hours in the library (or on Google) [1].

1. Stay low level at every level. Develop your code by anecdote: avoid planning phases, requirements analysis exercises or any structure to your code. Stay away from object-oriented programming. Build up your own little myriad of helper scripts. Do not document anything, either inside or outside your code. Your coding style should be understood only by you. Make sure your software does not scale. Refuse to model or abstract and always choose the quick and dirty fix.

2. Be open source without being open. Error messages should never be provided. If error messages are provided, they should be utterly cryptic so as to convey as little information as possible to the end user [2]. If you create an application, make it difficult to build. Have plenty of hidden dependencies and bizarre variables. Don’t bother to debug or provide backwards compatibility. Ensure that your code is not portable and works only on outdated operating systems, and assume that only you will use your application. Everyone will understand it.
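
For anyone reckless enough to break rule 2, here is a minimal Python sketch of what an informative error message might look like. The function name `parse_fasta` and the wording of the messages are assumptions made up for illustration; they do not come from any particular tool.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}.

    Hypothetical helper for illustration: every error says which line
    is wrong and what is wrong with it.
    """
    records = {}
    name = None
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            if not name:
                raise ValueError(
                    f"line {line_number}: '>' with no sequence name after it"
                )
            if name in records:
                raise ValueError(
                    f"line {line_number}: duplicate sequence name '{name}'"
                )
            records[name] = ""
        elif name is None:
            raise ValueError(
                f"line {line_number}: sequence data found before any '>' header"
            )
        else:
            records[name] += line  # append to the current record
    return records
```

A user who feeds this a file that starts with raw sequence is told the line number and the cause, which is precisely the kind of help the rule advises against.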

3. Make tools that make no sense to biologists. The less they resemble any intelligible scientific question, the better. If you provide a help document, bombard scientists with abbreviations and provide as much unnecessary technical information as possible. The typical biologist hates mathematics, so use mathematical formulas extensively throughout the documentation. Integrate your workflows with as many irrelevant services as possible, so you’ll have a greater chance of a dead link.

4. Do not provide a graphical user interface; the command line is good. Force your end users to use the command line. It helps if the parameter name does not relate to the intended action. For example, never use -o for specifying an output file; a “k” or “B” creates a much better impression. If you provide a graphical user interface, make sure there is no logic behind it, that it is not intuitive to the user and that it supports as few formats as you can, preferably HTML or text only. Forget HTTP-XML or SOAP. To make sure that the user experience is a nightmare, here are some guiding principles: 1) Provide thousands of menu options and pop-up windows that make no sense. 2) Ask the user to make decisions she can’t make. 3) Change your interface or format whenever you feel like it, despite the fact that many users might depend on it.
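
To see what rule 4 is warning you away from, consider this small sketch using Python's standard `argparse` module. The tool name `seqtool` and the `--input` and `--min-length` options are hypothetical; only the -o convention comes from the rule itself.

```python
import argparse


def build_parser():
    """Build a parser whose flag names say what they do."""
    parser = argparse.ArgumentParser(
        prog="seqtool",  # hypothetical tool name, for illustration only
        description="Filter sequences by minimum length.",
    )
    parser.add_argument("-i", "--input", required=True,
                        help="path to the input file")
    parser.add_argument("-o", "--output", default="filtered.fasta",
                        help="path to the output file (default: %(default)s)")
    parser.add_argument("-m", "--min-length", type=int, default=50,
                        help="drop sequences shorter than this (default: %(default)s)")
    return parser


# Parse an explicit argument list so the example runs anywhere;
# a real script would call parse_args() with no arguments.
args = build_parser().parse_args(["-i", "reads.fasta", "-o", "long_reads.fasta"])
print(args.input, args.output, args.min_length)
```

With long and short names that match the action, plus defaults shown in the generated `--help` text, users can guess the right flag on the first try, which rather spoils the fun.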

5. Make sure the output of your application is unreadable, unparseable and does not comply with any known standard. Just use plain ASCII text, or better still, provide your own format. Do not use ontologies, XML, or any other interchange format. If you use XML, make your data file impossible to validate and do not follow the XML schema. You can also invent a new name for your gene if it doesn’t fit your schema.
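
Should you ever be tempted to ignore rule 5, a rough sketch of parseable output could look like the following. It uses JSON from the Python standard library as one example of an interchange format; the function name `hits_to_json` and the field names (`format_version`, `query`, `subject`, `score`) are invented for illustration and follow no particular standard.

```python
import json


def hits_to_json(hits):
    """Serialize a list of hit dictionaries to a JSON string.

    The envelope carries a version number so that the format can
    change later without silently breaking existing parsers.
    """
    document = {"format_version": 1, "hits": hits}
    return json.dumps(document, indent=2, sort_keys=True)


# Hypothetical example data, not real results.
example_hits = [
    {"query": "seq1", "subject": "geneA", "score": 98.5},
    {"query": "seq1", "subject": "geneB", "score": 71.0},
]
print(hits_to_json(example_hits))
```

Anyone can read this back with `json.loads` in one line, so no custom parser is needed.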

6. Be unreachable and isolated. Configure your contact email to either bounce back or permanently set it to vacation. Miss key meetings or seminars where others are presenting their results. Reinvent the wheel. Do not keep up with the literature on current methods of research.

7. Never maintain your databases, web services or any other information that you provide. Provide unstable data, unstable models and unstable services. Your ultimate goal in data curation should be to propagate as many errors as possible from one database to another, while still making sure they sound realistic. Your curated data should only partially reflect the science of the papers you don’t read. When curating your data, create as many new categories as there are exceptions to your classifications. Forget about the biology and stay well away from convention.

8. Blindly believe in any predictions, P-values or statistics you are given. Select instances for your training set that you know will give you the answer you want. Impose arbitrary cut-offs on rank-ordered result lists: absolute truth above, absolute falsehood below. Never change the parameters of BLAST. If you get a list of hits, only look at the first one [1]. Do not believe in “rubbish in, rubbish out”; you just have to make sure that your rubbish data doesn’t smell.

9. Never share your results and never reuse anyone else’s. Never discuss your results before your submission has been accepted in a lost conference proceeding. Learning from what others have done is a waste of time. Ignore what your colleagues have developed in the last two decades.

10. Make your algorithm or analysis method irreproducible. The less testing you carry out in your experiments, the more revolutionary the results you’ll get. When testing your algorithm, compare it against methods from the previous decade: your performance levels will look much better. Include irrelevant variables in your equations and make them unnecessarily complex, so your reviewers will be very impressed by the complexity and the astonishing predictions you get.

