Bug description

Processing a PDF file with TikaDocumentReader throws a java.lang.NoSuchMethodError (see the output below). The issue does not occur with Spring AI 1.0.0-M6.

Environment

Spring Boot version: 3.4.5
Spring AI version: 1.0.0-RC1
Java version: 21


  • Code
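TikaDocumentReader reader = new TikaDocumentReader(resource);
List<Document> documents = reader.read(); // the call that appears at the bottom of the stack trace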
  • Output
java.lang.NoSuchMethodError: 'void org.apache.tika.parser.pdf.PDF2XHTML.setIgnoreContentStreamSpaceGlyphs(boolean)'
        at org.apache.tika.parser.pdf.PDFParserConfig.configure(PDFParserConfig.java:229)
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:105)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:219)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:204)
        at org.springframework.ai.reader.tika.TikaDocumentReader.get(TikaDocumentReader.java:147)
        at org.springframework.ai.reader.tika.TikaDocumentReader.get(TikaDocumentReader.java:51)
        at org.springframework.ai.document.DocumentReader.read(DocumentReader.java:25)
  • Maven
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-model-vertex-ai-embedding</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-model-vertex-ai-gemini</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-advisors-vector-store</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-tika-document-reader</artifactId>
        </dependency>

Comment From: sunyuhan1998

I tried to reproduce the issue on 1.0.0-RC1 with the code and Maven dependencies you provided, but could not replicate it. Could you share a minimal demo project that reproduces the problem? Also, Spring AI 1.0.0 GA has already been released, so it would be worth trying that version to see whether the issue persists.
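
A NoSuchMethodError at runtime generally means a class was compiled against one version of a library but a different version was loaded, so it may help to include in the demo project which JARs the Tika PDF classes actually come from. The following is only a minimal diagnostic sketch, not part of Spring AI or Tika: the class name ClasspathProbe and the helper locate are hypothetical, and the inclusion of PDFBox's PDFTextStripper rests on the assumption that the missing setter is inherited from it.

```java
import java.security.CodeSource;

// Hypothetical helper for this issue: prints where selected classes are loaded from.
public class ClasspathProbe {

    // Returns the JAR or directory a class was loaded from, or a placeholder string.
    static String locate(String className) {
        try {
            // initialize=false: only resolve the class, do not run its static initializers
            Class<?> type = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            return source == null
                    ? "(bootstrap or unknown source)"
                    : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            // The two classes named in the stack trace
            "org.apache.tika.parser.pdf.PDFParserConfig",
            "org.apache.tika.parser.pdf.PDF2XHTML",
            // Assumption: the missing method may be inherited from this PDFBox class
            "org.apache.pdfbox.text.PDFTextStripper"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running this inside the affected application (for example from a test or a CommandLineRunner) and comparing the printed JAR versions with the output of `mvn dependency:tree -Dincludes=org.apache.tika,org.apache.pdfbox` should show whether another dependency is pulling in a mismatched Tika or PDFBox version.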