Python Packages and Modules

As programs grow, they become harder to maintain and understand. A common practice is therefore to divide a program into smaller, manageable pieces (modules) that can be reused in other programs. Modules can in turn be grouped into packages.

Python Modules

A module is simply a Python file (with a .py extension) that can contain variables, functions, or classes. These can be reused in other Python programs by importing the module with the import statement.

For instance, let’s create a simple module named concentration.py:

def molality(moles_solute, mass_solvent_kg):
    return moles_solute / mass_solvent_kg

This module contains a single function molality, which calculates the molality given the amount of solute (in moles) and the mass of the solvent (in kilograms).

We can then use this function in our programs like this:

import concentration

moles_solute = 1
mass_solvent_kg = 1

print(concentration.molality(moles_solute, mass_solvent_kg))  # Output: 1.0
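The import statement has a few common forms, and they work the same way for any module. As a minimal sketch, the standard library module math stands in here for a module of your own (so the example runs without creating any files):

```python
# Form 1: import the module and access names through it.
import math
print(math.sqrt(16))  # 4.0

# Form 2: import a single name directly into the current namespace.
from math import sqrt
print(sqrt(16))  # 4.0

# Form 3: import the module under a shorter alias.
import math as m
print(m.sqrt(16))  # 4.0
```

All three forms refer to the same function; they differ only in how the name is spelled at the call site.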

Python Packages

When the number of modules grows, it helps to arrange them in a hierarchical structure. A package in Python is a directory that contains modules and, optionally, other packages (sub-packages).

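A regular package is a directory that contains a special file called __init__.py. This file can be empty; its presence marks the directory that contains it as a Python package, so the directory can be imported the same way a module can. (Since Python 3.3, a directory without __init__.py can still be imported as a "namespace package", but including the file remains the usual convention.)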

Consider the following directory structure:

chemicals/
    __init__.py
    concentrations.py
    reactions.py

In this structure, chemicals is a package containing two modules: concentrations and reactions. Assuming concentrations.py defines the molality function shown earlier, the module can be imported and used as follows:

from chemicals import concentrations

moles_solute = 1
mass_solvent_kg = 1

print(concentrations.molality(moles_solute, mass_solvent_kg))  # Output: 1.0
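To try this out without creating files by hand, the layout can be built in a temporary directory. This is only a sketch: the package name chemicals_demo is a hypothetical name chosen to avoid clashing with anything already installed, and the sys.path manipulation is there just to make the temporary directory importable.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk so the example is self-contained.
root = Path(tempfile.mkdtemp())
pkg = root / "chemicals_demo"            # hypothetical package name
pkg.mkdir()
(pkg / "__init__.py").write_text("")     # empty file marks a regular package
(pkg / "concentrations.py").write_text(
    "def molality(moles_solute, mass_solvent_kg):\n"
    "    return moles_solute / mass_solvent_kg\n"
)

# Make the parent directory importable, then import the module from the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
concentrations = importlib.import_module("chemicals_demo.concentrations")

result = concentrations.molality(1, 1)
print(result)  # 1.0
```

In a real project you would normally create the directory and files yourself and use a plain import statement instead of importlib.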

Python has a rich ecosystem of third-party libraries (distributed as packages), such as NumPy, SciPy, and pandas for data analysis, Matplotlib and Seaborn for data visualization, and scikit-learn for machine learning. These libraries are not part of the standard Python installation and must be installed separately. Each is composed of modules that can be imported as the program requires.

As you become more comfortable with Python, you will use packages and modules more often. These powerful tools can greatly reduce the time and effort required to write code, and can make your code more readable and easier to maintain.

Introduction to NumPy

For the moment, let’s just look at one package from the Python ecosystem. NumPy, short for Numerical Python, is a fundamental package for scientific computing in Python. It provides support for arrays (including multi-dimensional arrays), array mathematics, random number generation, and much more. NumPy also allows us to perform data analysis and data manipulation tasks efficiently.

Installing and Importing NumPy

To install NumPy, open your terminal or command prompt and type the following command:

pip install numpy

In a Jupyter notebook, you could also run:

!pip install numpy

The ! before a command (here pip install numpy) tells the notebook to execute that command as if we had typed it in our terminal or command prompt.
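Before installing, it can be useful to check whether NumPy is already available. One possible way, sketched here with the standard library function importlib.util.find_spec, is:

```python
import importlib.util

# find_spec returns None when the named top-level package cannot be found.
spec = importlib.util.find_spec("numpy")
numpy_available = spec is not None

if numpy_available:
    print("NumPy is installed")
else:
    print("NumPy is missing; install it with: pip install numpy")
```

This check only looks the package up; it does not import it.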

Once you’ve installed NumPy, you can import it in your programs using the following command:

import numpy as np

The as np part is used to create an alias for numpy, which allows us to use np instead of numpy when calling NumPy functions. This is not required, but it is a common practice and it makes the code cleaner.

NumPy Arrays

A NumPy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

Here’s how you create a NumPy array:

import numpy as np

# Create a 1D NumPy array from a list
a = np.array([1, 2, 3])

# Create a 2D NumPy array from a list of lists
b = np.array([[1, 2, 3], [4, 5, 6]])

print(a)            # Output: [1 2 3]
print(b)            # Output:
                    # [[1 2 3]
                    #  [4 5 6]]
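The terms used above (number of dimensions, shape, element type, and indexing by a tuple of integers) correspond to attributes and operations you can inspect directly. A short sketch using the same 2D array:

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])

print(b.ndim)    # 2, the number of dimensions
print(b.shape)   # (2, 3), meaning 2 rows and 3 columns
print(b.dtype)   # an integer type; the exact width depends on the platform

# Index with a tuple of nonnegative integers: row 1, column 2.
element = b[1, 2]
print(element)   # 6
```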

Arithmetic operators on NumPy arrays work element-wise, that is, they are applied to each pair of corresponding elements. For example, here’s how you can do array addition:

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = a + b

print(c)  # Output: [5 7 9]
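The same element-wise rule applies to the other arithmetic operators, and a single number is applied to every element. A brief sketch (tolist() is used only to show the results as plain Python lists):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

product = a * b      # multiplies corresponding elements
doubled = a * 2      # the scalar 2 is applied to every element
difference = b - a   # subtracts corresponding elements

print(product.tolist())     # [4, 10, 18]
print(doubled.tolist())     # [2, 4, 6]
print(difference.tolist())  # [3, 3, 3]
```

Note that a * b is not a matrix product; it multiplies the arrays position by position.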

In the context of chemistry, suppose we have an array holding the masses of the two hydrogen atoms in a water molecule and an array holding the mass of the oxygen atom. We can find the mass of the water molecule by summing the hydrogen masses and adding the oxygen mass:

import numpy as np

# Hydrogen's atomic mass
H_mass = np.array([1.007, 1.007])

# Oxygen's atomic mass
O_mass = np.array([15.999])

# Mass of the water molecule
# H_mass.sum() is a single number; adding it to the one-element array O_mass
# gives a one-element array.
H2O_mass = H_mass.sum() + O_mass

print('Masses of water molecules:', H2O_mass)  # Output: Masses of water molecules: [18.013]
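Element-wise arithmetic also means that a function written for single numbers often works on whole arrays unchanged. As an illustration, the molality function from earlier can be applied to several solutions at once; the sample values below are made up for the example.

```python
import numpy as np

def molality(moles_solute, mass_solvent_kg):
    return moles_solute / mass_solvent_kg

# Illustrative values for three solutions (not measured data).
moles = np.array([0.5, 1.0, 2.0])     # mol of solute
mass_kg = np.array([1.0, 2.0, 4.0])   # kg of solvent

result = molality(moles, mass_kg)     # division is applied element by element
print(result.tolist())  # [0.5, 0.5, 0.5]
```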

Using NumPy for Chemical Calculations

With NumPy, we can make calculations for chemical problems simpler and more efficient.

For example, let’s calculate the molar mass of carbon dioxide (CO2). We know that the molar mass of carbon (C) is approximately 12.01 g/mol and the molar mass of oxygen (O) is approximately 16.00 g/mol.

import numpy as np

# Define atomic masses
C_mass = np.array([12.01])  # Carbon
O_mass = np.array([16.00, 16.00])  # Oxygen

# Calculate molar mass of CO2
CO2_mass = C_mass.sum() + O_mass.sum()

print('Molar mass of CO2:', CO2_mass, 'g/mol')  # Output: Molar mass of CO2: 44.01 g/mol
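An alternative way to organize the same calculation, shown here only as a sketch, is to keep one array of atomic masses and one array of atom counts and combine them with np.dot, which multiplies corresponding entries and sums the results. The variable names are chosen for this example.

```python
import numpy as np

# Approximate atomic masses in g/mol, in the order [C, O].
atomic_masses = np.array([12.01, 16.00])

# Number of each atom in CO2, in the same order: 1 carbon, 2 oxygen.
counts = np.array([1, 2])

# 1 * 12.01 + 2 * 16.00
molar_mass = np.dot(counts, atomic_masses)

print(f"Molar mass of CO2: {molar_mass:.2f} g/mol")  # Molar mass of CO2: 44.01 g/mol
```

With this layout, another compound made of the same elements only needs a different counts array.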