ava中使用apache事件模型解析excel文件时如何检查字符

ava中使用apache事件模型解析excel文件时如何检查字符

本文介绍了java - 在java中使用apache事件模型解析excel文件时如何检查字符串中的数字包含日期和指数数字的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在解析一个 excel 文件,其中包含许多日期,例如 13-4-2021 和一些 3,7%,2,65% 格式的数字.所以我正在解析该 excel 文件,我正在获取数据字符串将它们写入文本文件中.所以我的问题是我以整数形式获取日期,如 44299,而它实际上是 04/13/2021 在 Excel 表中的格式.另一种情况是我有一些数字与 3,7%,2,65% 之类的百分比类似 3.6999999999999998E-2.So 我可以使用

I am parsing an excel file which contains many dates like 13-4-2021 and some numbers in this 3,7%,2,65% format.So i am parsing that excel file and i am getting the data in a string to write them in a text file.So my problem is that i am getting the date in a whole number like 44299, while it is actually in 04/13/2021 format in the excel sheet.And another case is i have some numbers with percentage like 3,7%,2,65% which are coming like 3.6999999999999998E-2.So i can convert the number to a date using

SimpleDateFormat("MM/dd/yyyy").format(javaDate)

这是我使用的代码

private static class SheetHandler extends DefaultHandler {
    private SharedStringsTable sst;
    private String lastContents;
    private boolean nextIsString;
    private int rowNumber;

    private SheetHandler(SharedStringsTable sst) {
        this.sst = sst;
    }

    public void startElement(String uri, String localName, String name,
            Attributes attributes) throws SAXException {
        try {

            // row => row
            if(name.equals("row")) {
                   if (attributes.getValue("r") != null) {
                    rowNumber = Integer.valueOf(attributes.getValue("r"));
                   } else {
                    rowNumber++;
                   }
                   //System.out.println("row: " + rowNumber);
                  }

        if (rowNumber > 6) {

        // c => cell
        if(name.equals("c")) {
            // Print the cell reference

            //System.out.print(attributes.getValue("r") + " - ");
            // Figure out if the value is an index in the SST
            String cellType = attributes.getValue("t");
            if(cellType != null && cellType.equals("s")) {
                nextIsString = true;
            } else {
                nextIsString = false;
              }

        }
        // Clear contents cache
        lastContents = "";
        }
        }catch(Exception e) {
            e.printStackTrace();
        }
    }

    public void endElement(String uri, String localName, String name)
            throws SAXException {
        // Process the last contents as required.
        // Do now, as characters() may be called more than once

        if (rowNumber > 6) {


        if(nextIsString) {
            int idx = Integer.parseInt(lastContents);

            lastContents = new XSSFRichTextString(sst.getEntryAt(idx)).toString();
            nextIsString = false;
        }
        // v => contents of a cell
        // Output after we've seen the string contents
        if(name.equals("v")) {
           // System.out.println(lastContents);

            if(!lastContents.isEmpty() ) // Here i am putting the values to a list to process

                pickUpExcelValues.add(lastContents);
            }
        }
    }

    public void characters(char[] ch, int start, int length)
            throws SAXException {
        lastContents += new String(ch, start, length);
    }
}

但是我将如何检查字符串包含让我们说 44299 是一个日期?而且我也不知道如何在写入文本文件时将这个 3.6999999999999998E-2 转换为 3.7%.如果有人有任何想法,请帮忙.

But how i will check the string contains lets say 44299 is a date? And also i have no idea how to convert this 3.6999999999999998E-2 to 3,7% while writing to a text file.If anybody have any idea please help.

推荐答案

这个问题需要进一步解释.

This question needs further explanations.

起初与如何使用已回答的 apache 事件用户模型跳过 xlsm 文件中的行.

但是如果你想使用 XSSF 和 SAX(事件 API) 然后需要有关 Office Open XML 中使用的 XML 的基本知识.

But if one wants using the examples from XSSF and SAX (Event API) then one needs basic knowledge about the XML used in the Office Open XML.

ExampleEventUserModel 是一个非常低级的示例,展示了流式传输原理.为了将此扩展到考虑格式需要解析样式表,然后使用 DataFormatter.

And the ExampleEventUserModel is a very low level example showing the streaming principle. For extending this to taking formats into account needs parsing the styles table too and then using DataFormatter.

下面是一个完整的例子,它就是这样做的.但是还有一个更完整的示例,包括支持获取数字格式信息并将其应用于数字单元格(例如格式化日期或百分比).请参阅 XLSX2CSV 示例.

The following is a complete example which is doing exactly this. But there is a fuller example, including support for fetching number formatting information and applying it to numeric cells (eg to format dates or percentages). Please see the XLSX2CSV example in svn.

import java.io.InputStream;
import java.util.Iterator;

import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.SharedStringsTable;
import org.apache.poi.xssf.model.StylesTable;
import org.apache.poi.xssf.usermodel.XSSFRichTextString;
import org.apache.poi.xssf.usermodel.XSSFCellStyle;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.BuiltinFormats;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.util.SAXHelper;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;


public class ExampleEventUserModel {
 public void processOneSheet(String filename) throws Exception {
  OPCPackage pkg = OPCPackage.open(filename);
  XSSFReader r = new XSSFReader( pkg );
  SharedStringsTable sst = r.getSharedStringsTable();

  StylesTable st = r.getStylesTable();
  XMLReader parser = fetchSheetParser(sst, st);

  // To look up the Sheet Name / Sheet Order / rID,
  //  you need to process the core Workbook stream.
  // Normally it's of the form rId# or rSheet#
  InputStream sheet2 = r.getSheet("rId2");
  InputSource sheetSource = new InputSource(sheet2);
  parser.parse(sheetSource);
  sheet2.close();
 }

 public void processAllSheets(String filename) throws Exception {
  OPCPackage pkg = OPCPackage.open(filename);
  XSSFReader r = new XSSFReader( pkg );
  SharedStringsTable sst = r.getSharedStringsTable();

  StylesTable st = r.getStylesTable();
  XMLReader parser = fetchSheetParser(sst, st);

  Iterator<InputStream> sheets = r.getSheetsData();
  while(sheets.hasNext()) {
   System.out.println("Processing new sheet:\n");
   InputStream sheet = sheets.next();
   InputSource sheetSource = new InputSource(sheet);
   parser.parse(sheetSource);
   sheet.close();
   System.out.println("");
  }
 }

 public XMLReader fetchSheetParser(SharedStringsTable sst, StylesTable st) throws SAXException, ParserConfigurationException {
/*
  XMLReader parser =
  XMLReaderFactory.createXMLReader(
       "org.apache.xerces.parsers.SAXParser"
  );
*/
  XMLReader parser = SAXHelper.newXMLReader();
  ContentHandler handler = new SheetHandler(sst, st);
  parser.setContentHandler(handler);
  return parser;
 }


 /**
  * See org.xml.sax.helpers.DefaultHandler javadocs
  */
 private static class SheetHandler extends DefaultHandler {
  private SharedStringsTable sst;
  private StylesTable st;
  private String lastContents;
  private boolean nextIsString;
  private boolean nextIsStyledNumeric;
  private boolean inlineStr;
  private int styleIndex;
  private DataFormatter formatter;

  private int rowNumber;

  private SheetHandler(SharedStringsTable sst, StylesTable st) {
   this.sst = sst;
   this.st = st;
   this.rowNumber = 0;
   this.formatter = new DataFormatter(java.util.Locale.US, true);
   this.styleIndex = 0;
  }

  public void startElement(String uri, String localName, String name,
            Attributes attributes) throws SAXException {

   // row => row
   if(name.equals("row")) {
    if (attributes.getValue("r") != null) {
     rowNumber = Integer.valueOf(attributes.getValue("r"));
    } else {
     rowNumber++;
    }
    System.out.println("row: " + rowNumber);
   }

   if (rowNumber > 6) {

    // c => cell
    if(name.equals("c")) {
     // Print the cell reference
     System.out.print(attributes.getValue("r") + " - ");

     String cellType = attributes.getValue("t");

     // Figure out if the value is an index in the SST
     nextIsString = false;
     if(cellType != null && cellType.equals("s")) {
      nextIsString = true;
     }

     // Figure out if the value is an inline string
     inlineStr = false;
     if(cellType != null && cellType.equals("inlineStr")) {
      inlineStr = true;
     }

     // Figure out if the value is an styled numeric value or date
     nextIsStyledNumeric = false;
     if(cellType != null && cellType.equals("n") || cellType == null) {
      String cellStyle = attributes.getValue("s");
      if (cellStyle != null) {
       styleIndex = Integer.parseInt(cellStyle);
       nextIsStyledNumeric = true;
      }
     }
    }
   }

   // Clear contents cache
   lastContents = "";
  }

  public void endElement(String uri, String localName, String name)
            throws SAXException {
   if (rowNumber > 6) {

    // Process the last contents as required.
    // Do now, as characters() may be called more than once

    // If the value is in the shared string table, get it
    if(nextIsString) {
     int idx = Integer.parseInt(lastContents);
     lastContents = new XSSFRichTextString(sst.getEntryAt(idx)).toString();
     nextIsString = false;
    }

    // v => contents of a cell
    // Output after we've seen the string contents
    if(name.equals("v") || (inlineStr && name.equals("c"))) {
     // If the value is styled numeric, use DataFormatter to formaat it
     if (nextIsStyledNumeric) {
      XSSFCellStyle style = st.getStyleAt(styleIndex);
      int formatIndex = style.getDataFormat();
      String formatString = style.getDataFormatString();
      if (formatString == null) {
       // formatString could not be found, so it must be a builtin format.
       formatString = BuiltinFormats.getBuiltinFormat(formatIndex);
      }
      double value = Double.valueOf(lastContents);
      lastContents = formatter.formatRawCellContents(value, formatIndex, formatString);
      nextIsStyledNumeric = false;
     }
     // Print out the contents
     System.out.println(lastContents);
    }
   }
  }

  public void characters(char[] ch, int start, int length)
            throws SAXException {
   //collect each character part to the content
   lastContents += new String(ch, start, length);
  }
 }

 public static void main(String[] args) throws Exception {
  ExampleEventUserModel example = new ExampleEventUserModel();
  //example.processOneSheet(args[0]);
  example.processAllSheets(args[0]);
 }
}

这篇关于java - 在java中使用apache事件模型解析excel文件时如何检查字符串中的数字包含日期和指数数字的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

07-30 01:58