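Overview
To extract JAR (Java ARchive) files, Tika provides the PackageParser class. This class extracts both content and metadata from a JAR file; it also handles other archive formats such as ZIP and TAR. It is located in the org.apache.tika.parser.pkg package and provides the constructors and methods listed in the tables below.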
Tika PackageParser Constructors
Constructor | Description |
---|---|
public PackageParser() | Creates a new PackageParser instance. |
Tika PackageParser Methods
Method | Description |
---|---|
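public Set<MediaType> getSupportedTypes(ParseContext context) | Returns the set of media types supported by this parser. |
public void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException | Parses a document stream into a sequence of XHTML SAX events. |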
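protected static Metadata handleEntryMetadata(String name, Date createAt, Date modifiedAt, Long size, XHTMLContentHandler xhtml) throws SAXException, IOException, TikaException | Builds the Metadata for a single archive entry from its name, creation date, modification date, and size. |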
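
The arguments passed to handleEntryMetadata (entry name, dates, size) are per-entry facts stored in the archive itself. As a rough illustration of where such values come from, the sketch below lists entry names and sizes using only the JDK's java.util.jar classes. It does not use Tika and is not how PackageParser is implemented; the class JarEntryLister and its methods listEntries and writeSampleJar are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper, not part of Tika: reads per-entry data with the JDK only.
public class JarEntryLister {

    // Returns entry name -> uncompressed size in bytes, in central-directory order.
    public static Map<String, Long> listEntries(Path jarPath) throws IOException {
        Map<String, Long> entries = new LinkedHashMap<>();
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            jar.stream().forEach(e -> entries.put(e.getName(), e.getSize()));
        }
        return entries;
    }

    // Writes a tiny JAR with a single 5-byte entry so the example is self-contained.
    public static void writeSampleJar(Path jarPath) throws IOException {
        try (OutputStream out = Files.newOutputStream(jarPath);
             JarOutputStream jar = new JarOutputStream(out)) {
            jar.putNextEntry(new JarEntry("hello.txt"));
            jar.write("hello".getBytes(StandardCharsets.UTF_8));
            jar.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        Path jarPath = Files.createTempFile("sample", ".jar");
        try {
            writeSampleJar(jarPath);
            for (Map.Entry<String, Long> e : listEntries(jarPath).entrySet()) {
                System.out.println(e.getKey() + ": " + e.getValue() + " bytes");
            }
        } finally {
            Files.deleteIfExists(jarPath);
        }
    }
}
```

Run as is, the program builds a temporary JAR and prints one line for its single entry, hello.txt, with a size of 5 bytes. JarFile is used rather than JarInputStream because it reads the central directory, where entry sizes are always recorded.
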
Tika PackageParser Example
package tikaexample;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.pkg.PackageParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

public class JarFileExample {
    public static void main(String[] args) {
        // Collects the text content produced while parsing.
        BodyContentHandler handler = new BodyContentHandler();
        // Parser for archive files, including JAR files.
        PackageParser parser = new PackageParser();
        Metadata metadata = new Metadata();
        ParseContext pcontext = new ParseContext();
        try (InputStream stream = new FileInputStream(new File(
                "/home/sssit/irfan/tika/example/tikaexample/src/main/java/tikaexample/myjar.jar"))) {
            parser.parse(stream, handler, metadata, pcontext);
            System.out.println("Document Content:" + handler.toString());
            System.out.println("Document Metadata:");
            // Print every metadata field the parser populated.
            for (String name : metadata.names()) {
                System.out.println(name + ": " + metadata.get(name));
            }
        } catch (IOException | SAXException | TikaException e) {
            System.out.println("Exception message: " + e.getMessage());
        }
    }
}
Output
Document Content:
META-INF/MANIFEST.MF
Manifest-Version: 1.0
Created-By: 1.7.0_01 (Oracle Corporation)
Main-Class: First
First.class
public synchronized class First {
void First();
public static void main(String[]);
}
Document Metadata:
Content-Type: application/zip
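
The Content-Type reported above is application/zip, which is consistent with a JAR file being a ZIP archive. As a minimal sketch of that relationship, the code below checks whether a file begins with the ZIP local file header signature (the bytes "PK" followed by 0x03 0x04). Tika's own type detection is more involved than this; the class ZipSignatureCheck and its methods are hypothetical names used only for illustration, and the check does not cover empty or spanned ZIP files, which start with a different signature.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tika: a simple magic-byte check for ZIP/JAR files.
public class ZipSignatureCheck {

    // Local file header signature at the start of a non-empty ZIP file: 'P' 'K' 0x03 0x04.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Returns true if the given bytes begin with the ZIP local file header signature.
    public static boolean startsWithZipSignature(byte[] header) {
        if (header == null || header.length < ZIP_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < ZIP_MAGIC.length; i++) {
            if (header[i] != ZIP_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads the first four bytes of the file and tests them against the signature.
    public static boolean looksLikeZip(Path file) throws IOException {
        byte[] header = new byte[ZIP_MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    break; // file is shorter than the signature
                }
                filled += n;
            }
        }
        return filled == header.length && startsWithZipSignature(header);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java ZipSignatureCheck <file>");
            return;
        }
        System.out.println(looksLikeZip(Paths.get(args[0])));
    }
}
```

Passing the path of a JAR file such as myjar.jar as the single argument prints whether the file starts with that signature.
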