The NumPy (Numerical Python) library first appeared in 2006 and is the preferred Python array implementation when thinking about data science applications. It offers a high-performance, richly functional n-dimensional array type called ndarray, which from this point forward we’ll refer to by its synonym, array. NumPy is one of the many open-source libraries that you can use in your Python programs. Operations on arrays are up to two orders of magnitude faster than those on lists. In a big-data world in which applications may do massive amounts of processing on vast amounts of array-based data, this performance advantage can be critical. According to, over 450 Python libraries depend on NumPy. Many popular data science libraries such as Pandas, SciPy (Scientific Python) and Keras (for deep learning) are built on or depend on NumPy.

In this section of our text, we will explore the array’s basic capabilities. For example, lists can have multiple dimensions, and you generally process multi-dimensional lists with nested loops. A strength of NumPy is array-oriented programming, which uses internal iteration to make array manipulations concise and straightforward, eliminating the kinds of errors that can occur with the external iteration of explicitly programmed loops.