
Renaming genes so that Microsoft Excel won't misunderstand their names as dates

lpetrich

Scientists rename human genes to stop Microsoft Excel from misreading them as dates - The Verge - "Sometimes it’s easier to rewrite genetics than update Excel"
There are tens of thousands of genes in the human genome: minuscule twists of DNA and RNA that combine to express all of the traits and characteristics that make each of us unique. Each gene is given a name and alphanumeric code, known as a symbol, which scientists use to coordinate research. But over the past year or so, some 27 human genes have been renamed, all because Microsoft Excel kept misreading their symbols as dates.

The problem isn’t as unexpected as it first sounds. Excel is a behemoth in the spreadsheet world and is regularly used by scientists to track their work and even conduct clinical trials. But its default settings were designed with more mundane applications in mind, so when a user inputs a gene’s alphanumeric symbol into a spreadsheet, like MARCH1 — short for “Membrane Associated Ring-CH-Type Finger 1” — Excel converts that into a date: 1-Mar.

...
There’s no easy fix, either. Excel doesn’t offer the option to turn off this auto-formatting, and the only way to avoid it is to change the data type for individual columns. Even then, a scientist might fix their data but export it as a CSV file without saving the formatting. Or, another scientist might load the data without the correct formatting, changing gene symbols back into dates. The end result is that while knowledgeable Excel users can avoid this problem, it’s easy for mistakes to be introduced.
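For anyone who wants to check a gene list before it goes anywhere near a spreadsheet, here is a minimal Python sketch. The pattern is only my rough approximation of what Excel treats as a date (a month name or abbreviation followed by one or two digits), not Excel's actual rules, and the function name `looks_like_excel_date` is made up for illustration.

```python
import re

# Assumed heuristic: a month name or abbreviation followed by 1-2 digits.
# This approximates Excel's date auto-detection; it is not Excel's real logic.
MONTH_LIKE = re.compile(
    r"(?:JAN|FEB|MAR|MARCH|APR|APRIL|MAY|JUN|JUNE|JUL|JULY|AUG|SEP|SEPT|OCT|NOV|DEC)"
    r"[-/ ]?\d{1,2}",
    re.IGNORECASE,
)


def looks_like_excel_date(symbol: str) -> bool:
    """Return True if the whole symbol matches the month-plus-digits heuristic."""
    return MONTH_LIKE.fullmatch(symbol.strip()) is not None


if __name__ == "__main__":
    for symbol in ["MARCH1", "SEPT1", "MARCHF1", "SEPTIN1", "TP53"]:
        print(symbol, looks_like_excel_date(symbol))
```

Under this heuristic the old symbols MARCH1 and SEPT1 are flagged, while the replacement symbols MARCHF1 and SEPTIN1 are not.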

Help has arrived, though, in the form of the scientific body in charge of standardizing the names of genes, the HUGO Gene Nomenclature Committee, or HGNC. This week, the HGNC published new guidelines for gene naming, including for “symbols that affect data handling and retrieval.” From now on, they say, human genes and the proteins they express will be named with one eye on Excel’s auto-formatting. That means the symbol MARCH1 has now become MARCHF1, while SEPT1 has become SEPTIN1, and so on. A record of old symbols and names will be stored by HGNC to avoid confusion in the future.
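Updating an existing gene list to the new symbols could look something like the sketch below. It contains only the two renames quoted above (the full HGNC list is longer), and the names `RENAMED_SYMBOLS` and `update_symbols` are hypothetical, not anything HGNC publishes.

```python
# Only the two renames mentioned in the article; the real list is longer.
RENAMED_SYMBOLS = {
    "MARCH1": "MARCHF1",
    "SEPT1": "SEPTIN1",
}


def update_symbols(symbols):
    """Replace retired symbols with their new names; leave all others as-is."""
    return [RENAMED_SYMBOLS.get(symbol, symbol) for symbol in symbols]


if __name__ == "__main__":
    print(update_symbols(["MARCH1", "TP53", "SEPT1"]))
    # prints ['MARCHF1', 'TP53', 'SEPTIN1']
```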

Sonic Hedgehog, DICER, and the Problem With Naming Genes - Pacific Standard - "Wait, why is there a Pokemon gene?"
Scientists are supposed to be systematic thinkers. But the names assigned to human genes don't follow any system; they are an odd jumble of cryptic abbreviations, forced acronyms, and weird neologisms. Some gene names are informative, like GPD1, a sensible abbreviation for the functional term “glyceraldehyde-3-phosphate dehydrogenase.” Other genes, like p53, have cryptic names that tell you nothing at all about their function. And then there are genes like "sonic hedgehog," which, according to the scientific paper where it was first described, was named "after the Sega computer game cartoon character."
Many fruit-fly genes were discovered by looking for developmental defects.
Genes discovered this way are now commonly named after the effect of the mutation. Like a patient taking a Rorschach test, fruit fly geneticists tend to come up with gene names by free-associating words with the deformities of the mutant embryos. When Nüsslein-Volhard and Wieschaus discovered hedgehog, they also discovered genes they named "patch" and "gooseberry." This fondness for odd gene names can make the molecular biology of fruit flies sound very strange: Hedgehog and patch interact to activate "cubitus interruptus," which in turn controls the expression of the gene "decapentaplegic."
 
I brought up how badly Excel handles dates in an old technology thread. I was criticized for observing that it mangles data, and was "mansplained" some extremely basic, only slightly related information.
I guess I should have just given up, like the body of science involved here has.
 