Learn how to extract all the Python library imports from your code files with this simple Python script.
Key Takeaways
- You can extract all the Python library imports from a file using a simple Python script.
- The script uses the built-in pathlib library to recursively search for all Python files in a directory.
- The script uses a regular expression to extract the library names from import statements.
Introduction
As your Python project grows, you may find it difficult to keep track of all the libraries that you’ve imported in your code files. This information can be useful for various reasons, such as checking for unused libraries or generating a list of dependencies.
In this article, we’ll show you how to extract all the Python library imports from your code files with a simple Python script. The script will recursively search for all Python files in a directory and extract the library names from import statements.
Step 1: Define the Helper Functions
Before we start, let’s define two helper functions that we’ll use to extract the imports and search for files:
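from pathlib import Path
import re


def get_imports(file_path):
    """Return the library names imported in the file at file_path."""
    imports = []
    with open(file_path, "r", encoding="utf-8") as file:
        for line in file:
            # Match "import a, b as c" at the start of a line.
            match = re.search(r"^import\s+([^\n#]+)", line)
            if match:
                for name in match.group(1).split(","):
                    # Drop any "as alias" suffix and surrounding whitespace.
                    imports.append(name.split(" as ")[0].strip())
            else:
                # Match "from package.module import name".
                match = re.search(r"^from\s+(\S+)\s+import", line)
                if match:
                    imports.append(match.group(1))
    return imports


def get_files(directory):
    """Return the paths of all .py files under directory, recursively."""
    files = []
    for path in Path(directory).rglob("*.py"):
        if path.is_file():
            files.append(str(path))
    return files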
The get_imports function takes a file path as input and returns a list of library names that are imported in the file.
The function reads the file line by line and uses regular expressions to match import statements. It extracts the library names from the matched statements and returns them as a list.
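To make the matching concrete, here is a minimal sketch that applies simplified patterns, modeled on the ones in get_imports, to a few single lines. The helper name names_from_line and the sample lines are made up for illustration and are not part of the script itself.

```python
import re

# Simplified patterns modeled on the ones used in get_imports.
IMPORT_RE = re.compile(r"^import ([^\n]+)")
FROM_RE = re.compile(r"^from ([^\n]+) import")


def names_from_line(line):
    """Return the library names found in a single line of source code.

    Hypothetical helper for illustration only; it mirrors the per-line
    logic described above.
    """
    match = IMPORT_RE.search(line)
    if match:
        # "import os, sys" -> ["os", "sys"]
        return match.group(1).split(", ")
    match = FROM_RE.search(line)
    if match:
        # "from pathlib import Path" -> ["pathlib"]
        return [match.group(1)]
    return []


sample_lines = [
    "import os, sys\n",
    "from pathlib import Path\n",
    "x = 1\n",
]
for line in sample_lines:
    print(names_from_line(line))
```

Because both patterns are anchored with ^, they only match at the very start of a line. An indented import, for example one inside a function body, is skipped by this approach.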
The get_files function takes a directory path as input and returns a list of all Python files in the directory and its subdirectories. The function uses the built-in pathlib library to recursively search for all files with the .py extension in the directory.
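To see what the recursive search picks up, the following sketch builds a throwaway directory tree and lists the Python files in it. The function name find_python_files and the file names are assumptions chosen for the demonstration. The search itself relies only on Path.rglob from the standard library.

```python
import tempfile
from pathlib import Path


def find_python_files(directory):
    """Return sorted paths of all .py files under directory (recursive)."""
    return sorted(str(p) for p in Path(directory).rglob("*.py") if p.is_file())


# Build a temporary directory tree with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg" / "sub").mkdir(parents=True)
    (root / "main.py").write_text("import os\n")
    (root / "pkg" / "sub" / "util.py").write_text("import sys\n")
    (root / "notes.txt").write_text("not python\n")

    # Show paths relative to the root so the output is easy to read.
    found = [Path(p).relative_to(root).as_posix() for p in find_python_files(root)]
    print(found)
    # ['main.py', 'pkg/sub/util.py']
```

The text file is ignored, and the file two levels down is found. Sorting is optional. It is used here because the order in which rglob yields paths is not guaranteed, and a sorted list keeps the output reproducible.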
Step 2: Search for Files and Extract Imports
Now that we have the helper functions, we can use them to search for Python files and extract their imports:
directory_path = "/path/to/your/directory"

for file_path in get_files(directory_path):
    imports = get_imports(file_path)
    if imports:
        print(f"{file_path}:")
        for import_name in imports:
            print(import_name)
The code above defines the directory_path variable as the path to the directory where your Python files are located. It then uses the get_files function to get a list of all Python files in the directory and its subdirectories. For each file, it uses the get_imports function to extract the library imports and prints them to the console.
The output will look something like this:
/path/to/your/directory/file1.py:
os
sys
/path/to/your/directory/subdirectory/file2.py:
numpy
matplotlib.pyplot
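If the goal is a list of dependencies rather than a per-file report, the collected names can be reduced to unique top-level packages. The sketch below is one possible way to do that. The function name top_level_packages and the sample list are assumptions for illustration.

```python
def top_level_packages(import_names):
    """Return the sorted, de-duplicated top-level package names."""
    packages = set()
    for name in import_names:
        name = name.strip()
        # Skip empty strings and relative imports such as ".utils".
        if not name or name.startswith("."):
            continue
        # "matplotlib.pyplot" -> "matplotlib"
        packages.add(name.split(".")[0])
    return sorted(packages)


# Names as they might be collected across several files (made-up sample).
collected = ["os", "sys", "numpy", "matplotlib.pyplot", "os", ".utils"]
print(top_level_packages(collected))
# ['matplotlib', 'numpy', 'os', 'sys']
```

Keep in mind that an import name is not always the name of the package you install. For example, the yaml module is provided by the PyYAML distribution, so the result is a starting point for a dependency list and not a finished requirements file.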
Conclusion
With the simple Python script we’ve shown you in this article, you can easily extract all the imports from your code files and get a better understanding of your project’s dependencies.