In data processing and office automation scenarios, we frequently handle Excel files exported from databases, filled out by users, or generated by third-party systems. These files often contain significant amounts of "dirty data," with blank rows and blank columns being the most common issues.
These invalid entries not only detract from the visual appeal of reports but can also severely disrupt subsequent data analysis, statistical formula calculations, and the accuracy of system imports. Manually filtering and deleting thousands of rows in Excel is both time-consuming and prone to errors.
This article demonstrates how to automatically detect and delete blank rows and columns in Excel using Java code.
Environment Setup
Before writing code, ensure that the Spire.XLS for Java dependency is included in your project.
If you are using a Maven project, add the following repository and dependency to your pom.xml:
<repositories>
<repository>
<id>com.e-iceblue</id>
<name>e-iceblue</name>
<url>https://repo.e-iceblue.cn/repository/maven-public/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>e-iceblue</groupId>
<artifactId>spire.xls</artifactId>
<version>16.2.6</version>
</dependency>
</dependencies>
For non-Maven projects, you can download the JAR package from the official website and manually add it to your project's classpath.
Understanding Core Logic: Reverse Iteration
Before diving into the code, it is crucial to understand a key technical detail: Reverse Iteration.
When you delete a row or column, the indices of the subsequent rows or columns shift.
The Problem: If you iterate from the first row to the last (1 to N) and delete row 3, the original row 4 shifts up to become the new row 3. If your loop continues to index 4, you effectively skip checking the new row 3 (which was originally row 4).
The Solution: Iterate backwards, starting from the last row (or column) down to the first. This way, even if you delete the current row, the indices of the remaining rows (which are lower than the current index) remain unchanged, ensuring no data is skipped.
Spire.XLS provides a convenient isBlank() method to determine if a row or column is completely empty, significantly simplifying the logic.
Example 1: Deleting Blank Rows in Excel
This example demonstrates how to load an Excel file, iterate through all used rows, detect those containing no data, and delete them.
Code Implementation
import com.spire.xls.ExcelVersion;
import com.spire.xls.Workbook;
import com.spire.xls.Worksheet;
public class DeleteBlankRows {
public static void main(String[] args) {
// 1. Create a Workbook object
Workbook wb = new Workbook();
// 2. Load the sample Excel document
// Ensure "sample.xlsx" exists in your project directory
wb.loadFromFile("sample.xlsx");
// 3. Get the first worksheet
Worksheet sheet = wb.getWorksheets().get(0);
// 4. Iterate backwards through all used rows
// getLastRow() returns the row number of the last used row (including formatting or data)
// We loop from the last row down to row 1
for (int i = sheet.getLastRow(); i >= 1; i--) {
// 5. Check if the current row is blank
// getRows()[i-1] gets the row object (array indices start at 0, so use i-1)
// isBlank() checks if the row contains no data or formatting
if (sheet.getRows()[i - 1].isBlank()) {
// 6. If blank, delete the row
// deleteRow(i) deletes the row at index i
sheet.deleteRow(i);
System.out.println("Deleted blank row: " + i);
}
}
// 7. Save the result document
wb.saveToFile("DeleteBlankRows_Only.xlsx", ExcelVersion.Version2016);
System.out.println("Blank row cleanup complete! File saved.");
}
}
Key Points
sheet.getLastRow(): Retrieves the row number of the last used row in the worksheet. This is far more efficient than iterating through the maximum possible rows (e.g., 1,048,576).sheet.getRows()[i-1]: Note that Java array indices start at0, while Excel row numbers start at1. Therefore, to access rowi, we use indexi-1.isBlank(): This is the core judgment method in Spire.XLS. It checks not only for text or numbers but typically also considers formulas or objects. It returnstrueif the entire row is empty.
Example 2: Deleting Blank Columns in Excel
The logic is very similar to deleting rows, except the target changes from "Rows" to "Columns," and the iteration variable changes from row numbers to column numbers.
Code Implementation
import com.spire.xls.ExcelVersion;
import com.spire.xls.Workbook;
import com.spire.xls.Worksheet;
public class DeleteBlankColumns {
public static void main(String[] args) {
// 1. Create a Workbook object
Workbook wb = new Workbook();
// 2. Load the sample Excel document
wb.loadFromFile("sample.xlsx");
// 3. Get the first worksheet
Worksheet sheet = wb.getWorksheets().get(0);
// 4. Iterate backwards through all used columns
// getLastColumn() returns the column number of the last used column
for (int j = sheet.getLastColumn(); j >= 1; j--) {
// 5. Check if the current column is blank
// getColumns()[j-1] gets the column object
if (sheet.getColumns()[j - 1].isBlank()) {
// 6. If blank, delete the column
sheet.deleteColumn(j);
System.out.println("Deleted blank column: " + getColumnName(j));
}
}
// 7. Save the result document
wb.saveToFile("DeleteBlankColumns_Only.xlsx", ExcelVersion.Version2016);
System.out.println("Blank column cleanup complete! File saved.");
}
// Helper method: Convert column number to Excel column name (e.g., 1 -> A, 2 -> B)
private static String getColumnName(int columnNumber) {
int dividend = columnNumber;
String columnName = "";
while (dividend > 0) {
int modulo = (dividend - 1) % 26;
columnName = (char)('A' + modulo) + columnName;
dividend = (dividend - modulo) / 26;
}
return columnName;
}
}
Key Points
sheet.getLastColumn(): Retrieves the column number of the last used column in the worksheet.sheet.deleteColumn(j): Deletes the specified column; columns to the right automatically shift left to fill the gap.Helper Method: To make log output more user-friendly, a
getColumnNamemethod is added to convert numeric column indices into familiar "A", "B", "C" formats, aiding in debugging.
Common Questions and Considerations
1. Performance Optimization
For extremely large Excel files (hundreds of thousands of rows), Java memory consumption may increase.
Ensure you call
wb.dispose()after operations (if supported by your API version) to assist garbage collection.Avoid frequent I/O operations inside loops. The examples above save the file only once at the end, which is a best practice.
2. Backup Original Files
Automation scripts can be destructive (they directly delete data). Before running such code in a production environment, always backup the original files or save the results to a new filename to prevent accidental loss of important data.
Conclusion
By using Java and Spire.XLS, you can easily automate the deletion of blank rows and columns in Excel. This approach minimizes human error inherent in manual operations and significantly boosts work efficiency.
Whether you are cleaning up database exports or standardizing report formats, these code snippets provide a robust foundation for your data preprocessing tasks.
We hope this article proves helpful for your Java Excel development endeavors!

Comments
Post a Comment