Python libraries that changed the era of Big Data
Guido van Rossum
Python is one of the most popular tools used in big data analysis today.
If asked to name the single biggest contributor to this, many people would, perhaps surprisingly, pick
Travis E. Oliphant, who created NumPy, rather than Guido van Rossum, who created Python.
In 1995, the Python community formed a special interest group called matrix-sig.
Its goal was to define an array data type, which Python lacked at the time.
One of its members was the MIT graduate student, Jim Hugunin.
He came up with a C-extension module called Numeric, based on Jim Fulton’s matrix object.
In 1997, Hugunin left to focus on JPython (later renamed Jython), an implementation of Python in Java.
Paul Dubois then took charge of the Numeric project.
By this time, the Python community had carried out a thorough review of the requirements of scientific computing.
In fact, the creator of Python, Guido van Rossum, was also an active member of matrix-sig.
Through the efforts of the Python community, many changes were made to the language, such as the addition of complex numbers.
These efforts focused on creating a clearer, simpler syntax for array manipulation.
For the next five years, Numeric kept improving as it attracted the attention of engineers and scientists,
and additional packages for scientific computing were developed and shared.
Around 2000, as more extension modules appeared, several developments greatly advanced Python’s usefulness for scientific work.
First, Travis Oliphant, Eric Jones, and Pearu Peterson combined their code for scientific computation and created SciPy.
Soon after, Fernando Pérez released the first version of IPython, an interactive shell that later grew to combine code with explanatory text.
It became widely used in the scientific community.
Also, John D. Hunter released the first version of matplotlib, which became the standard 2D plotting library for scientific computation.
That is how all these incredible libraries came about!
These early packages, built on Numeric, were very useful, but Numeric's code was difficult to extend,
which slowed further development. To overcome this difficulty,
Perry Greenfield, Jay Todd Miller, and Richard L. White created a new array package called numarray.
Unfortunately, Numeric and numarray coexisted as incompatible rivals for years, dividing the community.
Luckily, Travis Oliphant solved the problem by rewriting Numeric and merging in the most useful features of numarray.
Python was already loved for its intuitive syntax and rich built-in data structures.
With NumPy overcoming its one major weakness, the slow speed of array manipulation, it finally spread its wings.
Now nothing seemed impossible: Python had gained the computing power required in science and engineering.
After that, the SciPy community grew faster than ever.
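To make the point about array speed concrete, here is a minimal sketch comparing a plain Python loop with the equivalent NumPy expression. The function names and sample data are made up for illustration. The sketch only shows that both approaches give the same result; the NumPy version is usually much faster on large arrays because the loop runs in compiled code, but no timings are claimed here.

```python
import numpy as np


def scale_and_shift_loop(values, scale, shift):
    """Pure-Python version: one interpreted operation per element."""
    return [v * scale + shift for v in values]


def scale_and_shift_numpy(values, scale, shift):
    """NumPy version: the whole array is processed in one expression."""
    return np.asarray(values, dtype=float) * scale + shift


# Hypothetical sample data, chosen only for illustration.
data = list(range(5))

loop_result = scale_and_shift_loop(data, 2.0, 1.0)
numpy_result = scale_and_shift_numpy(data, 2.0, 1.0)

print(loop_result)   # [1.0, 3.0, 5.0, 7.0, 9.0]
print(numpy_result)  # [1. 3. 5. 7. 9.]
```

The NumPy version also reads closer to the underlying formula, which is the kind of clear array syntax the matrix-sig group was aiming for.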
NumPy wasn’t called ‘NumPy’ from the beginning. In late 2005, it was called ‘SciPy Core’ for about six months.
In January 2006, it was renamed ‘NumPy’.