Python Packages and Modules
As programs get bigger and bigger, they become more difficult to maintain and comprehend. Therefore, dividing programs into smaller, manageable pieces (modules) and reusing these pieces in different programs is a common practice. Similarly, modules can be further grouped into packages.
Python Modules
A module is simply a Python file (with a .py extension) that can contain variables, functions, or classes. These can be reused in other Python programs by importing the module with the import statement.
For instance, let’s create a simple module named concentration.py:
def molality(moles_solute, mass_solvent_kg):
    return moles_solute / mass_solvent_kg
This module contains a single function, molality, which calculates the molality given the amount of solute (in moles) and the mass of the solvent (in kilograms).
We can then use this function in our programs like this:
import concentration

moles_solute = 1
mass_solvent_kg = 1

print(concentration.molality(moles_solute, mass_solvent_kg))  # Output: 1.0
Python Packages
When the number of modules grows, it becomes necessary to arrange them in a hierarchical structure. A package in Python is simply a directory that contains other directories or modules.
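A regular package is a directory that contains a special file called __init__.py. This file can be empty; its presence tells Python that the directory containing it is a package, so it can be imported the same way a module can be imported. (Since Python 3.3, a directory without __init__.py can also be imported as a namespace package, but including the file remains the usual convention.)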
Consider the following directory structure:
chemicals/
    __init__.py
    concentrations.py
    reactions.py
In this structure, chemicals is a package containing two modules: concentrations and reactions. The concentrations module can be imported as follows:
from chemicals import concentrations

moles_solute = 1
mass_solvent_kg = 1

print(concentrations.molality(moles_solute, mass_solvent_kg))  # Output: 1.0
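To try the package layout without creating files by hand, the sketch below builds a throwaway copy of chemicals/ in a temporary directory and imports from it. It assumes that concentrations.py holds the same molality function as the earlier module example; the temporary location and the sample values are purely illustrative.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the chemicals/ package in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "chemicals"
    package_dir.mkdir()

    # An empty __init__.py marks the directory as a regular package.
    (package_dir / "__init__.py").write_text("")

    # Assumed contents of concentrations.py (same function as the module example).
    (package_dir / "concentrations.py").write_text(
        "def molality(moles_solute, mass_solvent_kg):\n"
        "    return moles_solute / mass_solvent_kg\n"
    )

    # Make the parent directory importable, then import the module from the package.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    from chemicals import concentrations

    result = concentrations.molality(0.5, 0.25)
    print(result)  # Output: 2.0

    # Undo the path change so later imports are unaffected.
    sys.path.remove(tmp)
```

In a real project the package directory would simply sit next to your script (or be installed), so no changes to sys.path would be needed.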
Python has a rich ecosystem of third-party libraries (distributed as packages), such as NumPy, SciPy, and pandas for data analysis, Matplotlib and Seaborn for data visualization, and scikit-learn for machine learning. These libraries are composed of modules that can be imported as the program requires.
As you become more comfortable with Python, you will use packages and modules more often. These powerful tools can greatly reduce the time and effort required to write code, and can make your code more readable and easier to maintain.
Introduction to NumPy
For the moment, let’s just look at one package from the Python ecosystem. NumPy, short for Numerical Python, is a fundamental package for scientific computing in Python. It provides support for arrays (including multi-dimensional arrays), array mathematics, random number generation, and much more. NumPy also allows us to perform data analysis and data manipulation tasks efficiently.
Installing and Importing NumPy
To install NumPy, open your terminal or command prompt and type the following command:
pip install numpy
In a Jupyter notebook, you could also run:
!pip install numpy
The ! before a command (here pip install numpy) tells the notebook to execute that command as if we were in our terminal or command prompt.
Once you’ve installed NumPy, you can import it in your programs using the following command:
import numpy as np
The as np part creates an alias for numpy, which allows us to write np instead of numpy when calling NumPy functions. This is not required, but it is a common practice and it makes the code cleaner.
NumPy Arrays
A NumPy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
Here’s how you create a NumPy array:
import numpy as np

# Create a 1D NumPy array from a list
a = np.array([1, 2, 3])

# Create a 2D NumPy array from a list of lists
b = np.array([[1, 2, 3], [4, 5, 6]])

print(a)  # Output: [1 2 3]
print(b)  # Output: [[1 2 3]
          #          [4 5 6]]
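The number of dimensions, the shape, and tuple indexing described above can be checked directly on the two arrays. This is a small sketch using the standard ndim and shape attributes; the arrays are the same a and b as in the example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1, 2, 3], [4, 5, 6]])

# Number of dimensions and size along each dimension
print(a.ndim, a.shape)  # Output: 1 (3,)
print(b.ndim, b.shape)  # Output: 2 (2, 3)

# Index a 2D array with a tuple: (row 1, column 2)
print(b[1, 2])  # Output: 6

# All elements share one data type (the exact integer type depends on the platform)
print(b.dtype)
```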
Arithmetic operations on NumPy arrays are element-wise; that is, they are applied to corresponding elements of the arrays. For example, here’s how you can do array addition:
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = a + b

print(c)  # Output: [5 7 9]
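The same element-wise behavior applies to the other arithmetic operators. As a rough sketch, the molality formula from the module example can be applied to several solutions at once by dividing one array by another; the sample values below are hypothetical.

```python
import numpy as np

# Hypothetical data for three solutions
moles_solute = np.array([1.0, 0.5, 2.0])     # mol
mass_solvent_kg = np.array([1.0, 0.25, 4.0])  # kg

# Element-wise division: one molality per solution
molalities = moles_solute / mass_solvent_kg
print(molalities)  # values: 1.0, 2.0, 0.5 (mol/kg)

# A scalar is applied to every element, here converting kg to g
mass_solvent_g = mass_solvent_kg * 1000
print(mass_solvent_g)  # values: 1000.0, 250.0, 4000.0 (g)
```

Both input arrays must have compatible shapes (here, the same length) for the element-wise division to work.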
In the context of chemistry, suppose we store the masses of the two hydrogen atoms in one array and the mass of the oxygen atom in another. We can find the mass of a water molecule by summing the hydrogen masses and adding the result to the oxygen array:
import numpy as np

# Atomic mass of each hydrogen atom
H_mass = np.array([1.007, 1.007])

# Atomic mass of the oxygen atom
O_mass = np.array([15.999])

# Mass of a water molecule: the summed hydrogen mass (a scalar) is added to each element of O_mass
H2O_mass = H_mass.sum() + O_mass

print('Masses of water molecules:', H2O_mass)  # Output: Masses of water molecules: [18.013]
Using NumPy for Chemical Calculations
With NumPy, we can make calculations for chemical problems simpler and more efficient.
For example, let’s calculate the molar mass of carbon dioxide (CO2). We know that the molar mass of carbon (C) is approximately 12.01 g/mol and the molar mass of oxygen (O) is approximately 16.00 g/mol.
import numpy as np

# Define atomic masses (g/mol)
C_mass = np.array([12.01])  # Carbon
O_mass = np.array([16.00, 16.00])  # Oxygen

# Calculate molar mass of CO2
CO2_mass = C_mass.sum() + O_mass.sum()

print('Molar mass of CO2:', CO2_mass, 'g/mol')  # Output: Molar mass of CO2: 44.01 g/mol
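The same calculation can be extended to several molecules at once. The sketch below is one possible approach, not the only one: it stores atom counts in a 2D array (one row per molecule) and multiplies it by the array of atomic masses with the @ operator. The names atomic_masses and atom_counts and the choice of molecules are illustrative assumptions; the atomic masses are the approximate values used above.

```python
import numpy as np

# Approximate atomic masses in g/mol, ordered as [C, O]
atomic_masses = np.array([12.01, 16.00])

# Atom counts per molecule, with columns in the same [C, O] order
molecule_names = ["CO2", "CO", "O2"]
atom_counts = np.array([
    [1, 2],  # CO2: 1 carbon, 2 oxygen
    [1, 1],  # CO:  1 carbon, 1 oxygen
    [0, 2],  # O2:  0 carbon, 2 oxygen
])

# Matrix-vector product: each row's counts weighted by the atomic masses
molar_masses = atom_counts @ atomic_masses

for name, mass in zip(molecule_names, molar_masses):
    print(f"{name}: {mass:.2f} g/mol")
# Output:
# CO2: 44.01 g/mol
# CO: 28.01 g/mol
# O2: 32.00 g/mol
```

Keeping the counts in a table like this means adding another molecule only requires one more row, rather than a new set of variables.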