lpetrich
Contributor
The Scientific Paper Is Obsolete. Here's What's Next. - The Atlantic
Nowadays, we have much more data, and we often use a lot of software to analyze it -- stuff that often does not make the pages of journals. Even what gets into journals often has a lot of jargon and fancy mathematics.
The scientific paper -- the actual form of it -- was one of the enabling inventions of modernity. Before it was developed in the 1600s, results were communicated privately in letters, ephemerally in lectures, or all at once in books. There was no public forum for incremental advances. By making room for reports of single experiments or minor technical advances, journals made the chaos of science accretive. Scientists from that point forward became like the social insects: They made their progress steadily, as a buzzing mass.
The earliest papers were in some ways more readable than papers are today. They were less specialized, more direct, shorter, and far less formal. Calculus had only just been invented. Entire data sets could fit in a table on a single page. What little “computation” contributed to the results was done by hand and could be verified in the same way.
Then the article discussed doing animations and interactive demos, and how these can be useful. These are extensions of pictures and diagrams and charts, all of which have been used for a long time.
Then the article turned to Stephen Wolfram and how he used computer algebra early in his career. This led him to found Wolfram Research, the company that makes the computer-algebra package Mathematica, complete with notebook files. These files display formulas and their results, and one can select some of the formulas and evaluate them. Since then, Mathematica has expanded from computer algebra proper into image processing, audio processing, and serving up large collections of geographical, astronomical, physical, and other data.
He also wrote a book called "A New Kind of Science", in which he described using cellular automata to make a wide variety of patterns -- all done with his Mathematica notebooks.
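To give a feel for the kind of cellular automaton discussed in that book, here is a minimal Python sketch of a one-dimensional, two-state "elementary" automaton. It is only an illustration, not Wolfram's own code: the function names (`step`, `run`, `render`), the wrap-around edges, and the single-cell starting row are my assumptions. The rule number follows the usual 0-255 numbering, with Rule 30 as the example.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton.

    cells: list of 0/1 values; the edges wrap around (an assumption).
    rule:  integer 0..255; bit k gives the new state for the
           neighborhood whose (left, center, right) bits spell k.
    """
    n = len(cells)
    new = []
    for i in range(n):
        left = cells[(i - 1) % n]
        center = cells[i]
        right = cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right
        new.append((rule >> index) & 1)
    return new


def run(rule, width=31, steps=15):
    """Start from a single live cell in the middle and collect every row."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


def render(rows):
    """Draw live cells as '#' and dead cells as '.'."""
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)


if __name__ == "__main__":
    print(render(run(30)))
```

Changing the argument of `run` to another rule number, such as 90 or 110, gives quite different patterns from the same few lines.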
Wolfram’s massive book was panned by academics for being derivative of other work and yet stingy with attribution. “He insinuates that he is largely responsible for basic ideas that have been central dogma in complex systems theory for 20 years,” a fellow researcher told the Times Higher Education in 2002.
More recently, a certain Fernando Pérez found himself in a similar position with lots of cobbled-together software. He and two others liked the programming language Python, and they were developing ways of making it easier to use. He unified their work as IPython, "Interactive Python", and like Mathematica, it uses a notebook structure. However, he made it open-source like Python itself, rather than payware. The notebook now supports other programming languages, and that part of the project has been renamed Jupyter as a result.
It has had a lot of success, with a large number of users -- it's now searched for at least as much as Mathematica is. But building software tools has not been very rewarding in academia.
Number-crunching is getting more respect because it has become more necessary for making big discoveries, such as assembling genome data. Genomes are sequenced by cutting them up into small fragments and then sequencing those fragments, because existing sequencing machines cannot read more than around 500 nucleobase pairs at a time, far fewer than the billions of base pairs in many genomes. Once they are sequenced, these fragments have to be assembled, and that's done by looking for sequence overlaps.
Pérez told me stories of scientists who sacrificed their academic careers to build software, because building software counted for so little in their field: The creator of matplotlib, probably the most widely used tool for generating plots in scientific papers, was a postdoc in neuroscience but had to leave academia for industry. The same thing happened to the creator of NumPy, a now-ubiquitous tool for numerical computing. Pérez himself said, “I did get straight-out blunt comments from many, many colleagues, and from senior people and mentors who said: Stop doing this, you’re wasting your career, you’re wasting your talent.” Unabashedly, he said, they’d tell him to “go back to physics and mathematics and writing papers.”
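As a rough illustration of the fragment assembly by sequence overlaps mentioned earlier, here is a toy greedy assembler in Python. It is a sketch under strong assumptions, not how production assemblers work: the reads are error-free, all come from the same strand, and the genome has no confusing repeats. The function names (`overlap`, `greedy_assemble`), the `min_len` cutoff, and the example reads are all made up for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    # Drop exact duplicates (keeping order) and fragments contained in others.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left overlaps; return the separate pieces
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


if __name__ == "__main__":
    reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]  # hypothetical reads
    print(greedy_assemble(reads))
```

Even this toy version compares every fragment with every other one on each pass, which hints at why assembling billions of base pairs takes serious number-crunching.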
Still, those who stay are making progress. Pérez himself recently got a faculty appointment in the stats department at Berkeley. The day after we spoke, he was slated to teach an upper-division data-science course, built entirely on Python and Jupyter notebooks. “The freshman version of that course had in the fall I think 1,200 students,” he said. “It’s been the fastest-growing course in the history of UC Berkeley. And it’s all based on these open-source tools.”
The "py" in "Jupyter" refers to Python.
At one point, Pérez told me the name Jupyter honored Galileo, perhaps the first modern scientist. The Jupyter logo is an abstracted version of Galileo’s original drawings of the moons of Jupiter. “Galileo couldn’t go anywhere to buy a telescope,” Pérez said. “He had to build his own.”
It remains to be seen how far this will go. I suspect that such interactive notebooks will first be published as "Supplementary Information", alongside the "main" paper. A further problem is that scientific papers have a rather stylized organization: Abstract, Introduction, Materials and Methods, Results, Discussion, Conclusions, and References. So an interactive notebook will likely have to include such content.