|laziness, impatience, and hubris|
RFC Tutorial - Deleting Excel Rows, Columns and Sheetsby davies (Vicar)
|on Feb 06, 2012 at 21:49 UTC||Need Help??|
This is being written out of laziness. Questions on deleting rows and sheets are possibly the two commonest Excel questions on this site. This tutorial is intended to provide a single point of reference.
This is not intended as an introduction to automating Excel with Perl. There are plenty of other nodes and resources on how to do this. Please post requests for advice on code as new SOPW posts, not as replies to this tutorial.
Deleting Rows and Columns
Everything said about deleting rows applies identically to deleting columns. Change xlUp to xlToLeft and EntireRow to EntireColumn if that is what you want to do.
There are two ways of deleting rows in Excel. The least commonly used way is one I frequently refer to as "Bill Gates Knows Best". You are most unlikely to want to use this method, but if you are unfamiliar with Excel and have recorded a macro using default behaviour, you may go down this route.
This method deletes a row from the chosen column only. In this case, the delete command worked on cell B1 (row 1 column 2 if you're not familiar with the Cells property). As a result, cells were moved up in column B but not in column A, with the result that the 'c' entries are now out of alignment. Compare the following:
In this case, the whole of the original row 1 was deleted, moving up all cells below. This meant that the 'a' entered in the original A1 has been deleted and the 'c's retain their alignment. This is far more likely to be what you want and will be used in the rest of this section.
Most commonly, it is not a known single row that should be deleted but a number of rows based on some condition. This leads to a common problem. As we saw in both the previous examples, when deleting, cells get moved up. Although, in both cases, we deleted cell B1, there is still a cell called B1 in the spreadsheet. Consider the following example, where we delete rows with a blank cell in column B:
It is not unreasonable to expect this code to leave two rows, but it leaves three, apparently failing to delete the original row 3 even though there is a blank cell in column B. This is because of a logical error in the code. The code loops from 1 to 4. When it gets to row 2 it deletes the row. This process moves the original row 3 up to row 2. However, the code does not re-check row 2 but moves on to row 3, the original row 4. So the original row 3 gets moved up to row 2 and is never tested. The following code works backwards and avoids the problem:
In all these examples, I have used Cells to specify the deletion point. Using Range or RangeR1C1 would work just as well. The advantage of Cells over Range is that the columns can be expressed numerically, avoiding complex code to sort out the change from one to two letters (or two to three in Excel 2007 & later).
You cannot delete every sheet in a workbook. Excel will fail silently. If you want to see this, try the following code:
If you try doing this directly from the Excel interface, Excel will give a warning but will not do so under control from Perl. Despite the fact that there was only one sheet in the workbook, Excel still has one sheet after the delete command. It cannot accept a workbook with no sheets.
If you try to work forward from 1 to n when deleting multiple sheets, you will get an error unless you are very lucky. When deleting rows, a new row is always created at the end so it never runs out of rows to delete. This does not apply to sheets. So:
This deletes only one sheet, sheet 2, and crashes with the message Can't call method "Delete" on an undefined value at Z:\Data\Perl\DeleteObj\Sht2.pl line 11.. As with rows, a solution is to work backwards.
This is certainly the approach to use if deleting sheets conditionally. But if you are setting up Excel and want to be sure that the user has only one sheet in the workbook - something I frequently do when answering questions here - another approach is possible:
You may wonder why I use this approach rather than setting SheetsInNewWorkbook as I do in the rest of this example code. The reason is that changes to SheetsInNewWorkbook are saved automatically by Excel when it exits normally. So if a user changes my code and introduces a bug, this may crash before I reset SheetsInNewWorkbook. That instance of Excel will be orphaned and will have to be closed manually, saving the value at the point of the crash. The user will then be stuck with whatever was used in that code until he resets it manually. This can be a pain.