Bug description
A NoSuchMethodError is thrown when reading PDF files with TikaDocumentReader. This issue does not occur in version 1.0.0-M6.
Environment
Spring Boot version: 3.4.5
Spring AI version: 1.0.0-RC1
Java version: 21
- Code
TikaDocumentReader reader = new TikaDocumentReader(resource);
List<Document> documents = reader.read(); // the call that appears in the stack trace
- Output
java.lang.NoSuchMethodError: 'void org.apache.tika.parser.pdf.PDF2XHTML.setIgnoreContentStreamSpaceGlyphs(boolean)'
    at org.apache.tika.parser.pdf.PDFParserConfig.configure(PDFParserConfig.java:229)
    at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:105)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:219)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:204)
    at org.springframework.ai.reader.tika.TikaDocumentReader.get(TikaDocumentReader.java:147)
    at org.springframework.ai.reader.tika.TikaDocumentReader.get(TikaDocumentReader.java:51)
    at org.springframework.ai.document.DocumentReader.read(DocumentReader.java:25)
- Maven
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-vertex-ai-embedding</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-vertex-ai-gemini</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-advisors-vector-store</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-tika-document-reader</artifactId>
</dependency>
Comment From: sunyuhan1998
I tried to reproduce the issue using version 1.0.0-RC1, following the code and Maven dependencies you provided, but I was unable to replicate the problem. Could you provide a minimal demo project that reproduces the issue? Additionally, Spring AI 1.0.0 GA has already been released, so you could also try that version to see whether the issue persists.
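
A `NoSuchMethodError` like the one above often means that two jars on the runtime classpath were built against different versions of each other, so it may help to check where the classes in the stack trace are actually loaded from before building a demo project. The following is only a minimal diagnostic sketch, not part of Spring AI or Tika: `ClasspathProbe`, `locationOf`, and `hasMethod` are hypothetical names, and it assumes (without confirmation from this report) that the missing method might be expected on a superclass that lives in a different jar, which is why it walks the class hierarchy.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical helper for diagnosing classpath version conflicts.
final class ClasspathProbe {

    // Returns the jar or directory a class was loaded from, or a note if that is unknown.
    static String locationOf(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> type = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            return source != null ? source.getLocation().toString() : "(bootstrap or unknown source)";
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    // Returns true if the class or any of its superclasses declares a method with this name.
    static boolean hasMethod(String className, String methodName) {
        try {
            Class<?> type = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            for (Class<?> t = type; t != null; t = t.getSuperclass()) {
                for (Method m : t.getDeclaredMethods()) {
                    if (m.getName().equals(methodName)) {
                        return true;
                    }
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the stack trace in this issue.
        String[] classNames = {
            "org.apache.tika.parser.pdf.PDF2XHTML",
            "org.apache.tika.parser.pdf.PDFParserConfig",
            "org.apache.tika.parser.AutoDetectParser"
        };
        for (String name : classNames) {
            System.out.println(name + " -> " + locationOf(name));
        }
        System.out.println("setIgnoreContentStreamSpaceGlyphs found: "
                + hasMethod("org.apache.tika.parser.pdf.PDF2XHTML", "setIgnoreContentStreamSpaceGlyphs"));
    }
}
```

Run inside the affected application (for example from a test or a `CommandLineRunner`), this prints the jar behind each class, which should show whether the Tika modules come from the same version. Running `mvn dependency:tree -Dincludes=org.apache.tika,org.apache.pdfbox` gives a complementary view of the versions Maven resolved.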