I'm using Spring Boot 3.4.5 with Spring AI 1.0.0. For one particular PDF document, ParagraphPdfDocumentReader throws the following exception:

java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: Index out of bounds: -1
    at org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader.getTextBetweenParagraphs(ParagraphPdfDocumentReader.java:248)
    at org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader.toDocument(ParagraphPdfDocumentReader.java:161)
    at org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader.get(ParagraphPdfDocumentReader.java:147)
    at org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader.get(ParagraphPdfDocumentReader.java:50)
    at org.springframework.ai.document.DocumentReader.read(DocumentReader.java:25)
    at org.pdfsam.spec.agent.service.DefaultPdfLoader.loadPdf(DefaultPdfLoader.java:59)
    at org.pdfsam.spec.agent.service.DefaultPdfLoader.loadPdfWithOutlineFrom(DefaultPdfLoader.java:54)
    at org.pdfsam.spec.agent.service.DefaultLoadService.loadPDFFilesWithOutline(DefaultLoadService.java:82)
    at org.pdfsam.spec.agent.service.DefaultLoadService.loadUnprocessed(DefaultLoadService.java:61)
    at org.pdfsam.spec.agent.ETLApplication.lambda$commandLineRunner$0(ETLApplication.java:43)
    at org.springframework.boot.SpringApplication.lambda$callRunner$5(SpringApplication.java:789)
    at org.springframework.util.function.ThrowingConsumer$1.acceptWithException(ThrowingConsumer.java:82)
    at org.springframework.util.function.ThrowingConsumer.accept(ThrowingConsumer.java:60)
    at org.springframework.util.function.ThrowingConsumer$1.accept(ThrowingConsumer.java:86)
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:797)
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:788)
    at org.springframework.boot.SpringApplication.lambda$callRunners$3(SpringApplication.java:773)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:186)
    at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:571)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:560)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:153)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:176)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:636)
    at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:773)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:325)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1362)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1351)
    at org.pdfsam.spec.agent.ETLApplication.main(ETLApplication.java:34)
Caused by: java.lang.IndexOutOfBoundsException: Index out of bounds: -1
    at org.apache.pdfbox.pdmodel.PDPageTree.get(PDPageTree.java:299)
    at org.apache.pdfbox.pdmodel.PDPageTree.get(PDPageTree.java:263)
    at org.apache.pdfbox.pdmodel.PDDocument.getPage(PDDocument.java:1220)
    at org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader.getTextBetweenParagraphs(ParagraphPdfDocumentReader.java:196)
    ... 29 common frames omitted

The cause is an outline item that has no page destination (its dictionary contains neither a Dest nor an A entry). Its start page ends up as -1, and the item is printed as: Bla [-1,17], children = 0, pos = 0
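
To illustrate the kind of guard that would avoid the exception, here is a minimal sketch that drops outline entries whose page range cannot be resolved before any page lookup happens. The names `OutlineGuard`, `OutlineEntry`, and `resolvable` are hypothetical and only stand in for whatever the reader uses internally; this is not the Spring AI code, just one possible way to handle a start page of -1.

```java
import java.util.List;

public class OutlineGuard {

    // Hypothetical stand-in for an outline entry; NOT the Spring AI type.
    // startPage is -1 when the outline item has no page destination.
    public record OutlineEntry(String title, int startPage, int endPage) {

        // An entry is usable only if both page indexes are valid and ordered.
        public boolean hasResolvablePages() {
            return startPage >= 0 && endPage >= startPage;
        }
    }

    // Keeps only the entries whose pages can safely be looked up.
    public static List<OutlineEntry> resolvable(List<OutlineEntry> entries) {
        return entries.stream()
                .filter(OutlineEntry::hasResolvablePages)
                .toList();
    }

    public static void main(String[] args) {
        List<OutlineEntry> entries = List.of(
                new OutlineEntry("Intro", 0, 3),
                new OutlineEntry("Bla", -1, 17),   // no Dest and no A entry
                new OutlineEntry("Appendix", 18, 20));

        // "Bla" is skipped instead of triggering a lookup of page -1.
        resolvable(entries).forEach(e -> System.out.println(e.title()));
    }
}
```

Skipping the entry is only one option; attaching its text to the previous paragraph or falling back to the parent's page would be alternatives, depending on what the reader is expected to return.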

I cannot share the PDF file, but I can probably create a sample one that reproduces the problem if needed.

Comment From: WOONBE

I'd like to contribute a fix for this issue!