Warm tip: This article is reproduced from stackoverflow.com, please click
c# datatable excel openxml worksheet

Empty rows are getting skipped while getting data from Excel to DataTable

发布于 2020-04-08 09:42:47

Task

Import data from excel to DataTable

Problem

Some of Rows that does not contain any data are getting skipped and the very next Row that has data in the row is used as the value of the empty Row

E.g

In Excel Totally in have 37 Rows when i use openxml to convert excel to Datatable it skipped empty rows and read 29 rows only

WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument.WorkbookPart.GetPartById(relationshipId);
Worksheet workSheet = worksheetPart.Worksheet;
SheetData sheetData = workSheet.GetFirstChild<SheetData>();
IEnumerable<Row> rows = sheetData.Descendants<Row>();
foreach (Row row in rows) //this will also include your header row...
{
    DataRow tempRow = dt.NewRow();
    int ko = row.Descendants<Cell>().Count();
    for (int i = 0; i < row.Descendants<Cell>().Count(); i++)
    {
        tempRow[i] = GetCellValue(spreadSheetDocument, row.Descendants<Cell>().ElementAt(i));
    }
    dt.Rows.Add(tempRow);
}
Questioner
rajadurai
Viewed
30
Thomas Barnekow 2020-02-03 23:16

If you look at the Open XML markup of an Excel worksheet, you will see that empty rows don't even exist in the markup. This means that when you are reading your rows in your foreach loop, you will skip those empty, non-existent rows right there.

If you want those empty rows in your DataTable, you will have to read each existing row, keeping track of the last row number you have seen. If there is a gap between the current row number and the last row number you have seen, you need to fill in the gap, adding empty DataRow instances before you add a new DataRow for the current row.

Update 2020-02-03

To find out how you can determine the row numbers, you should look at the Open XML markup of a sample worksheet. For example, the following markup shows a reduced and simplified example worksheet with just the sheetData element and a number of row child elements. You will see that each row element (instance of Row class) has an attribute called r (RowIndex property of Row class), which specifies the row index. In this example, we see rows 2, 3, 5, and 8, so we see that rows 4, 6, and 7 are missing.

<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="2" spans="2:2" x14ac:dyDescent="0.25">
      <c r="B2">
        <v>2</v>
      </c>
    </row>
    <row r="3" spans="2:2" x14ac:dyDescent="0.25">
      <c r="B3">
        <v>3</v>
      </c>
    </row>
    <row r="5" spans="2:2" x14ac:dyDescent="0.25">
      <c r="B5">
        <v>5</v>
      </c>
    </row>
    <row r="8" spans="2:2" x14ac:dyDescent="0.25">
      <c r="B8">
        <v>8</v>
      </c>
    </row>
  </sheetData>
</worksheet>