Blogger D

Blogging for fun and no profit.

« Previous

ExcelPackage

19 October 2007

I've been reviewing the ExcelPackage project on CodePlex. It is a useful project but one thing I was wondering is why does it directly interact with the XML?

I think it is a performance and a code maintenance hit to work directly with the XML for the workbook and worksheets. In my oh so humble opinion I think it would be easier to create plain old C# objects to work with the cells, rows, sheets, etc. and then serialize and de-serialize as needed. I say this because in the XML for a worksheet the cells are stored in XML nodes that have the row number and individual cell addresses in them:

PLAIN TEXT
XML:
  1. <sheetdata>
  2. <row r="1" spans="2:13" ht="12" customHeight="1"/>
  3. <row r="2" spans="2:13" ht="17.25" hidden="1" customHeight="1">
  4. <c r="B2" s="3"/>
  5. <c r="C2" s="3"/>
  6. <c r="D2" s="13"/>
  7. <c r="E2" s="13"/>
  8. <c r="F2" s="13"/>
  9. <c r="G2" s="13"/>
  10. <c r="H2" s="13"/>
  11. <c r="I2" s="13"/>
  12. <c r="J2" s="13"/>
  13. <c r="K2" s="13"/>
  14. <c r="L2" s="13"/>
  15. <c r="M2" s="13"/>
  16. </row>
  17. </sheetdata>

So if we work directly with the XML what has to happen when we insert a row? We have to loop through all of the XML nodes and update the row values. Note that I am not even talking about how to deal with merge cells or ranges (which the ExcelPackage does not work with).

I think that creating a Attribute-value system would be a better approach. Rows and Columns could be a linked list, Cells could be in an indexed cache or hashtable. And the collections and individual objects would handle where they were in the domain of the spreadsheet.

Guess I should get busy and prove my point.

Posted in General | Trackback | del.icio.us | Top Of Page

Comments are closed.