Reading and Updating Excel Files in Python Using Pandas

User Icon By Azam Akram,   Calendar Icon March 8, 2025
Excel Files in Python Using Pandas

Working with Excel files is a common task in data analysis and automation. Whether you are processing financial records, managing inventory, or analyzing survey results, Excel is widely used for storing and organizing structured data. However, manually handling large Excel files can be time-consuming and prone to errors. This is where Python comes into play, offering a powerful and efficient way to automate Excel file manipulation.

Python provides several libraries to work with Excel files, but one of the most popular and powerful options is pandas. Pandas is a high-performance data manipulation and analysis library that makes it easy to read, process, and modify Excel files programmatically. With pandas, you can filter data, clean datasets, create summaries, and even perform complex transformations with just a few lines of code. Its integration with other Python libraries also makes it an essential tool for data science and automation tasks.

In this blog post, we'll explore how to use Python and pandas to read a specific column from an Excel file and update a cell in a given row and column. These fundamental operations are useful for automating data entry, updating reports, and handling real-world datasets efficiently.

If you are looking to work with excel files using openpyxl then I recommend you to read this blog.

Prerequisites

Before start, make sure you have the required dependencies installed:

pip install pandas openpyxl
  • pandas: A powerful library for data manipulation.
  • openpyxl: A library required by pandas to read and write Excel .xlsx files.

1. Reading a Specific Column from an Excel File

We define a function read_nth_column that reads data from a specific column in an Excel file and returns them as a list. First, this function checks if the file exists or not, then reads the nth column. It also removes empty (NaN) values before returning the list.

The function takes following parameters,

file_path: Path to the Excel file.
column_number: The column number (1-based index) to read.

and returns list of values from the specified column.

Here is complete function code,

import os
import pandas as pd

def read_nth_column(file_path: str, column_number: int):
    if not os.path.exists(file_path):
        print("Error: File does not exist.")
        return []
    
    try:
        df = pd.read_excel(file_path, engine='openpyxl')
        col_index = column_number - 1  # Convert to zero-based index

        if col_index < 0 or col_index >= len(df.columns):
            print("Error: Invalid column number.")
            return []

        return df.iloc[:, col_index].dropna().tolist()
    
    except Exception as e:
        print(f"Error reading Excel file: {e}")
        return []

Usage Example:

if __name__ == "__main__":
    file_path = os.path.abspath("Financial Sample.xlsx")
    print("Using file path:", file_path)

    # Read a specific column
    column_number = 2
    values = read_nth_column(file_path, column_number)
    print(f"Column {column_number} values:", values)

2. Updating a Specific Cell in an Excel File

Now, let’s add a function to modify the value in a specific row and column. First, this function checks if the file exists or not, then converts the row and column numbers to zero-based indexing. It validates if the specified cell is within range or not and then updates the cell and saves the modified file.

The function takes following parameters,

file_path: Path to the Excel file.
column_number: The column number (1-based index).
row_number: The row number (1-based index).
new_value: The new value to insert.

and edit the value at specified cell,

def update_cell_value(file_path: str, column_number: int, row_number: int, new_value):
    if not os.path.exists(file_path):
        print("Error: File does not exist.")
        return False
    
    try:
        df = pd.read_excel(file_path, engine='openpyxl')
        col_index = column_number - 1  # Convert to zero-based index
        row_index = row_number - 1  # Convert to zero-based index

        if col_index < 0 or col_index >= len(df.columns):
            print("Error: Invalid column number.")
            return False
        
        if row_index < 0 or row_index >= len(df):
            print("Error: Invalid row number.")
            return False

        df.iloc[row_index, col_index] = new_value  # Update the cell
        df.to_excel(file_path, index=False, engine='openpyxl')  # Save changes

        print(f"Successfully updated cell at row {row_number}, column {column_number} with '{new_value}'")
        return True

    except Exception as e:
        print(f"Error updating Excel file: {e}")
        return False

Usage Example:

if __name__ == "__main__":
    file_path = os.path.abspath("Financial Sample.xlsx")
    print("Using file path:", file_path)

    # Update a specific cell
    row_number = 3
    new_value = "Updated Value"
    update_cell_value(file_path, column_number, row_number, new_value)

With this approach, you can efficiently read and modify Excel files in Python using pandas. This is especially useful for data automation, reports, and data cleaning tasks.