Unable to extract document parts from PPTX

Solis Murillo, Marianela
2024-02-12T15:26:41.6+00:00

I'm trying to parse a PPTX file using Apache Tika, but I get this error:

    Exception in thread "main" org.apache.tika.exception.TikaException: No part found for relationship id=rId6 - container=OPCPackage{packageAccess=READ, relationships=4 relationship(s) = [/_rels/.rels,sourcePart=null,/_rels/.rels], packageProperties=Name: /docProps/core.xml - Content Type: application/vnd.openxmlformats-package.core-properties+xml, isDirty=false} - relationshipType=http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject - source=/ppt/slides/slide7.xml - target=/ppt/slides/NULL,targetMode=INTERNAL
        at org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator.getMainDocumentParts(XSLFPowerPointExtractorDecorator.java:302)
        at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedParts(AbstractOOXMLExtractor.java:210)

The PPTX file opens fine in PowerPoint, but Tika fails when it tries to parse the file and extract its parts.

Why would a relationship of type oleObject have target=NULL? Is there a way to fix the 'target=NULL' relationships in the file?

Thanks in advance!
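
To narrow this down, the affected relationships can be listed without going through Tika. Below is a minimal sketch using only the JDK. It assumes the PPTX is an ordinary ZIP-based package and that the broken relationships appear in the `.rels` parts with a `Target` attribute of `NULL`. That is a guess based on `target=/ppt/slides/NULL` in the error message, not something confirmed for this file. The class name `NullTargetFinder` and the method `findNullTargets` are hypothetical names chosen for illustration.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: scans every .rels part in a PPTX for Target="NULL".
public class NullTargetFinder {

    /** Returns one line per relationship whose Target attribute is "NULL". */
    public static List<String> findNullTargets(String pptxPath) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        List<String> hits = new ArrayList<>();
        try (ZipFile zip = new ZipFile(pptxPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // Relationship parts live in _rels folders and end with .rels
                if (!entry.getName().endsWith(".rels")) {
                    continue;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    Document doc = builder.parse(in);
                    // "*" matches the element in whatever namespace it is declared
                    NodeList rels = doc.getElementsByTagNameNS("*", "Relationship");
                    for (int i = 0; i < rels.getLength(); i++) {
                        Element rel = (Element) rels.item(i);
                        if ("NULL".equalsIgnoreCase(rel.getAttribute("Target"))) {
                            hits.add(entry.getName() + " "
                                    + rel.getAttribute("Id") + " "
                                    + rel.getAttribute("Type"));
                        }
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java NullTargetFinder path/to/file.pptx
        for (String hit : findNullTargets(args[0])) {
            System.out.println(hit);
        }
    }
}
```

Each output line names the `.rels` part, the relationship id, and the relationship type, which should show whether `rId6` in the relationships part for `slide7.xml` is the only affected entry or one of several. The sketch only reports; it does not modify the file.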
