Introduction to NumPy

This chapter introduces NumPy, a powerful Python library for numerical computation. It covers core concepts such as arrays, indexing, operations, and statistical functions vital for data analysis and scientific computing.

Notes on Introduction to NumPy

1. Overview of NumPy

NumPy stands for Numerical Python and is a fundamental package for scientific computing in Python.

  • Provides support for large, multi-dimensional arrays and matrices.
  • Contains a large collection of mathematical functions that operate on these arrays.
  • Offers tools for integrating with C/C++, facilitating high-performance capabilities.
  • Can be installed with the command pip install numpy.

2. Understanding Arrays

An Array is a data structure that can hold a fixed-size sequential collection of elements of the same type.

  • Unlike Python lists, arrays store their elements contiguously in memory, which speeds up access.
  • Characteristics of arrays:
    • All elements are of the same data type.
    • Each element can be accessed via a unique index.
    • Uses zero-based indexing.
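  • Example of an array (written here as a plain Python list to illustrate the idea; a NumPy array is created with np.array()):
array = [10, 20, 30]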

3. The NumPy Array (ndarray)

  • In NumPy, arrays are called ndarray (n-dimensional array).
  • Like lists and matrices, they store numerical data, but they are more efficient in both memory use and speed.
  • Differences between Lists and Arrays:
    1. Arrays require all elements to be of the same data type, while lists do not.
    2. Arrays store data contiguously, whereas lists do not.
    3. Arrays support element-wise operations directly; lists do not.
    4. Lists consume more memory because each element is a separate Python object that carries its own type information.
  • You create an ndarray using:
import numpy as np
array1 = np.array([1, 2, 3])
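
The differences listed above can be seen in a short sketch. The variable names (py_list, mixed, and so on) are illustrative and not part of the original notes.

```python
import numpy as np

py_list = [1, 2, 3]
array1 = np.array([1, 2, 3])

# Multiplying a list repeats it; multiplying an array scales each element.
list_times_two = py_list * 2      # [1, 2, 3, 1, 2, 3]
array_times_two = array1 * 2      # array([2, 4, 6])

# Mixed input is converted to one common data type.
mixed = np.array([1, 2.5, 3])     # every element is stored as a float

print(list_times_two)
print(array_times_two)
print(mixed.dtype)                # float64
```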

4. Array Creation and Attributes

  • Arrays can be created in various ways:
    • From lists or tuples using the np.array() function.
    • np.zeros(shape) creates an array filled with zeros.
    • np.ones(shape) creates an array filled with ones.
    • np.arange(start, stop, step) creates an array of evenly spaced values from start up to, but not including, stop.
  • Useful attributes of ndarray:
    • ndarray.ndim: Returns number of dimensions
    • ndarray.shape: Returns a tuple with the size of each dimension, e.g. (rows, columns) for a 2-D array
    • ndarray.size: Total number of elements
    • ndarray.dtype: Data type of elements
    • ndarray.itemsize: Size in bytes of each element
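
As a minimal sketch, the creation functions and attributes above can be tried on small arrays. The names and sizes are chosen only for illustration.

```python
import numpy as np

from_list = np.array([[1, 2, 3], [4, 5, 6]])  # 2-D array from nested lists
zeros = np.zeros((2, 3))                       # 2 rows, 3 columns of 0.0
ones = np.ones(4)                              # 1-D array of four 1.0 values
evens = np.arange(0, 10, 2)                    # 0, 2, 4, 6, 8 (stop is excluded)

print(from_list.ndim)    # 2
print(from_list.shape)   # (2, 3)
print(from_list.size)    # 6
print(from_list.dtype)   # an integer type; the exact width depends on the platform
print(zeros.dtype)       # float64, the default for np.zeros and np.ones
print(zeros.itemsize)    # 8 bytes per float64 element
```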

5. Indexing and Slicing

Indexing allows you to access individual elements. NumPy supports:

  • 1-D: array[i]
  • 2-D: array[i, j], where i is the row and j is the column.
  • Slicing extracts part of an array using array[start:end] or array[start:end:step]; the element at index end is not included.
  • For 2-D arrays: array[start_row:end_row, start_col:end_col].
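
The indexing and slicing forms above, sketched on a small 1-D and 2-D array. The arrays vector and matrix are made up for this example.

```python
import numpy as np

vector = np.array([10, 20, 30, 40, 50])
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

first = vector[0]             # 10 (indexing starts at zero)
middle = vector[1:4]          # [20, 30, 40] (index 4 is excluded)
every_other = vector[0:5:2]   # [10, 30, 50]

element = matrix[1, 2]        # row 1, column 2 -> 6
block = matrix[0:2, 1:3]      # rows 0-1, columns 1-2 -> [[2, 3], [5, 6]]

print(first, middle, every_other)
print(element)
print(block)
```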

6. Operations on Arrays

  • You can perform various operations on arrays:
    • Arithmetic operations (addition, subtraction, etc.) are applied element-wise.
    • Matrix operations include transposition and matrix multiplication (using @ operator).
    • Sorting: np.sort(array) returns a sorted copy, while array.sort() sorts the array in place.
  • Example Operations:
array1 + array2 # element-wise addition
array1 * array2 # element-wise multiplication
array1 @ array2 # matrix multiplication
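
To make the operations above concrete, here is a sketch that assumes array1 and array2 are 2×2 integer arrays. The values are chosen only for illustration.

```python
import numpy as np

array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])

added = array1 + array2        # [[6, 8], [10, 12]]
multiplied = array1 * array2   # [[5, 12], [21, 32]]
product = array1 @ array2      # [[19, 22], [43, 50]]
transposed = array1.T          # [[1, 3], [2, 4]]

unsorted = np.array([3, 1, 2])
sorted_copy = np.sort(unsorted)  # new array [1, 2, 3]; unsorted is unchanged
unsorted.sort()                  # sorts in place; unsorted is now [1, 2, 3]

print(product)
print(sorted_copy)
```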

7. Concatenation and Reshaping

  • Concatenate arrays using np.concatenate(), or use the stacking functions np.vstack() and np.hstack() to combine them vertically (row-wise) and horizontally (column-wise).
  • Reshape arrays using array.reshape(new_shape) to change dimensions while keeping the same data; the new shape must contain the same total number of elements.
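
A brief sketch of stacking and reshaping, using two small 2×2 arrays invented for the example:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

stacked_rows = np.vstack((a, b))          # shape (4, 2): b placed below a
stacked_cols = np.hstack((a, b))          # shape (2, 4): b placed beside a
joined = np.concatenate((a, b), axis=0)   # same result as np.vstack here

flat = np.arange(6)                       # [0, 1, 2, 3, 4, 5]
grid = flat.reshape((2, 3))               # [[0, 1, 2], [3, 4, 5]]

print(stacked_rows.shape, stacked_cols.shape)
print(grid)
```

Asking for a shape with a different number of elements, such as flat.reshape((4, 2)), raises a ValueError.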

8. Statistical Operations

NumPy offers built-in functions for statistical computations:

  • np.sum(), np.mean(), np.max(), np.min(), np.std() calculate sum, mean, maximum, minimum, and standard deviation respectively.
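  • Statistical operations can be performed along a specified axis of a multi-dimensional array: for a 2-D array, axis=0 computes one result per column and axis=1 computes one result per row.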
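
The functions above, sketched on a small 2-D array. The array name scores and its values are made up for illustration.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

total = np.sum(scores)       # 21, over all elements
average = np.mean(scores)    # 3.5
largest = np.max(scores)     # 6
smallest = np.min(scores)    # 1
spread = np.std(scores)      # population standard deviation by default

column_sums = np.sum(scores, axis=0)   # one value per column: [5, 7, 9]
row_means = np.mean(scores, axis=1)    # one value per row: [2.0, 5.0]

print(total, average, largest, smallest)
print(column_sums, row_means)
```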

9. Loading and Saving Arrays

  • Data from files can be loaded into NumPy arrays using:
    • np.loadtxt() for loading plain text files.
    • np.genfromtxt() for loading data with potential missing values.
  • To save arrays to text files, use np.savetxt().
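
A minimal round-trip sketch, assuming comma-separated text. The file name data.csv and the temporary directory are hypothetical and used only so the example cleans up after itself.

```python
import io
import os
import tempfile

import numpy as np

data = np.array([[1.0, 2.0], [3.0, 4.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.csv")   # hypothetical file name
    np.savetxt(path, data, delimiter=",")      # write the array as text
    loaded = np.loadtxt(path, delimiter=",")   # read it back as floats

# np.genfromtxt fills missing fields with nan instead of raising an error.
text_with_gap = io.StringIO("1,2\n3,\n")
with_gaps = np.genfromtxt(text_with_gap, delimiter=",")

print(loaded)
print(with_gaps)
```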

In summary, NumPy is a core tool for efficient numerical computing with robust support for handling arrays, mathematical functions, and data processing features.

Key terms/Concepts

  1. NumPy is essential for scientific computing in Python.
  2. An Array is a collection of elements of the same type, stored contiguously in memory.
  3. NumPy’s array is called ndarray, allowing operations on n-dimensional data.
  4. Arrays support element-wise operations, enhancing computational efficiency.
  5. Key attributes of NumPy arrays include ndim, shape, size, dtype, and itemsize.
  6. Indexing in NumPy is zero-based, and supports slicing for extracting array parts.
  7. Arithmetic and statistical operations can be performed directly on arrays.
  8. Arrays can be concatenated or reshaped without altering their data.
  9. Data can be loaded from and saved to files using specific NumPy functions.
