I often receive Microsoft Excel files that have documents created by
other Microsoft applications embedded within them. E.g., at the top of
a worksheet I may see something like =EMBED("Visio.Drawing.11","").
Sometimes I want to extract the embedded file. With a Microsoft Excel .xlsm
file that is easy to do, because XLSM is a zipped, XML-based file format.
To extract embedded documents, such as Visio drawings or PowerPoint
presentations, I make a copy of the .xlsm file, then change the copy's
extension from .xlsm to .zip. I can then extract the contents of the zip
file. The directory that holds the extracted files will contain an xl
directory, and within that directory there is an embeddings directory that
holds the embedded files, such as the Visio drawings in the example below.
$ ls ~/Documents/Work/CRQ/843940/unzipped
[Content_Types].xml   customXml             xl
_rels                 docProps
$ ls ~/Documents/Work/CRQ/843940/unzipped/xl
_rels             comments19.xml    comments9.xml
calcChain.xml     comments2.xml     ctrlProps
charts            comments20.xml    drawings
comments1.xml     comments21.xml    embeddings
comments10.xml    comments22.xml    media
comments11.xml    comments23.xml    printerSettings
comments12.xml    comments24.xml    sharedStrings.xml
comments13.xml    comments3.xml     styles.xml
comments14.xml    comments4.xml     theme
comments15.xml    comments5.xml     vbaProject.bin
comments16.xml    comments6.xml     workbook.xml
comments17.xml    comments7.xml     worksheets
comments18.xml    comments8.xml
$ ls ~/Documents/Work/CRQ/843940/unzipped/xl/embeddings
Microsoft_Visio_2003-2010_Drawing111.vsd
Microsoft_Visio_2003-2010_Drawing222.vsd
Microsoft_Visio_2003-2010_Drawing333.vsd
Microsoft_Visio_2003-2010_Drawing444.vsd
oleObject1.bin
oleObject2.bin
oleObject3.bin
oleObject4.bin
$
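If you would rather not copy, rename, and unzip the workbook by hand, the same steps can be sketched in a few lines of Python. This is only a minimal sketch, not a tested tool: it assumes the workbook is a valid zip archive that stores its embedded files under xl/embeddings/, as in the listing above. The function name extract_embeddings and the output directory are my own hypothetical choices.

```python
import zipfile
from pathlib import Path


def extract_embeddings(workbook_path, out_dir):
    """Copy every file under xl/embeddings/ in the workbook into out_dir.

    Hypothetical helper: assumes the workbook is a zip archive laid out
    like the .xlsm example above.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    # An .xlsm file is a zip archive, so zipfile can read it without
    # the extension being changed to .zip first.
    with zipfile.ZipFile(workbook_path) as archive:
        for name in archive.namelist():
            # Skip everything outside xl/embeddings/ and directory entries.
            if not name.startswith("xl/embeddings/") or name.endswith("/"):
                continue
            # Use only the base name so nothing is written outside out_dir.
            target = out_dir / Path(name).name
            target.write_bytes(archive.read(name))
            extracted.append(target)
    return sorted(extracted)
```

Calling extract_embeddings("workbook.xlsm", "embedded"), where both names are placeholders, would leave the original workbook untouched and return the paths of the copied files.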
I can then open the extracted documents in the application used to create them. In the case of Visio drawings, since Microsoft doesn't provide a Visio viewer program for Mac OS X systems, I use VSD Viewer Pro to view the files when I extract them on my MacBook Pro laptop.
With PowerPoint slides embedded in the .xlsm file, I may see
=EMBED("PowerPoint.Slide.8","") in the function field at the top of a
worksheet. But when I rename the .xlsm file to a .zip file and extract its
contents, the embeddings directory may hold only .bin files, as in the
following listing:
$ ls ~/Documents/Work/CRQ/833131/unzipped/xl/embeddings
oleObject1.bin oleObject2.bin oleObject3.bin oleObject4.bin
$
Since I know that there was an embedded PowerPoint slide in the Excel workbook, but can't open the .bin files in PowerPoint directly, I rename them to .ppt and then open them with PowerPoint. In the example above, I renamed all four .bin files to have a .ppt extension. I was able to open the first three in PowerPoint and copy the text I wanted from the slides into another application; I couldn't copy the text from the embedded slides inside Excel. For the fourth one, I saw the message "PowerPoint cannot open the type of file represented by oleObject4.ppt." But the other files contained the information I wanted. The "ole" in the file names stands for "Object Linking and Embedding", a technology created by Microsoft that allows embedding and linking to documents and other objects.
You can use the file command on a Mac OS X system to identify the type of a file. E.g., after I made the embeddings directory extracted from another .xlsm file my current working directory, I saw the following output from the command:
$ file *
oleObject1.bin: CDF V2 Document, Little Endian, Os: Windows, Version 6.1, Code page: 1252, Title: PowerPoint Presentation, Author: Tracy Wilhelm, Last Saved By: Bigelow, Andrew L. (ACCI-760.0)[ABCS], Revision Number: 7, Name of Creating Application: Microsoft Office PowerPoint, Total Editing Time: 01:31:38, Create Time/Date: Mon Jan 7 00:03:32 2013, Last Saved Time/Date: Wed Apr 12 18:42:09 2017, Number of Words: 35
oleObject2.bin: CDF V2 Document, Little Endian, Os: Windows, Version 6.1, Code page: 1252, Title: PowerPoint Presentation, Author: Tracy Wilhelm, Last Saved By: Bigelow, Andrew L. (ACCI-760.0)[ABCS], Revision Number: 4, Name of Creating Application: Microsoft Office PowerPoint, Total Editing Time: 01:11:34, Create Time/Date: Mon Jan 7 00:04:06 2013, Last Saved Time/Date: Wed Apr 12 18:42:09 2017, Number of Words: 13
oleObject3.bin: CDF V2 Document, Little Endian, Os: Windows, Version 6.1, Code page: 1252, Title: PowerPoint Presentation, Author: Tracy Wilhelm, Last Saved By: Clark, Jeff N. (NSFC-IS40)[ABCS], Revision Number: 8, Name of Creating Application: Microsoft Office PowerPoint, Total Editing Time: 01:15:21, Create Time/Date: Mon Jan 7 00:03:32 2013, Last Saved Time/Date: Wed Oct 7 22:48:15 2015, Number of Words: 162
oleObject4.bin: CDF V2 Document, No summary info
$ file --mime *
oleObject1.bin: application/vnd.ms-office; charset=binary
oleObject2.bin: application/vnd.ms-office; charset=binary
oleObject3.bin: application/vnd.ms-office; charset=binary
oleObject4.bin: CDF V2 Document, No summary info; charset=binary
$
The file output identifies the first three files as PowerPoint presentations, so I know I can give them a .ppt rather than a .bin filename extension and open them with PowerPoint.
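The check-then-rename step could also be scripted. The Python sketch below is a rough stand-in for the file command, not an equivalent of it. It treats a .bin file as a PowerPoint candidate if it starts with the OLE compound file signature and contains the stream name "PowerPoint Document" in UTF-16LE; scanning for that stream name is my own heuristic and may miss or misjudge some files. The function names are hypothetical, and the script writes a .ppt copy rather than renaming the original.

```python
import shutil
from pathlib import Path

# Signature found at the start of OLE compound ("CDF V2") files.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Assumption: binary PowerPoint files carry a stream with this name,
# stored as UTF-16LE in the compound file's directory.
PPT_STREAM = "PowerPoint Document".encode("utf-16-le")


def looks_like_ppt(path):
    """Heuristic check only; it does not parse the compound file."""
    data = Path(path).read_bytes()
    return data.startswith(OLE_MAGIC) and PPT_STREAM in data


def copy_ppt_candidates(embeddings_dir):
    """Write a .ppt copy next to each oleObject*.bin that passes the check."""
    copies = []
    for bin_path in sorted(Path(embeddings_dir).glob("oleObject*.bin")):
        if looks_like_ppt(bin_path):
            ppt_path = bin_path.with_suffix(".ppt")
            shutil.copyfile(bin_path, ppt_path)  # the original .bin is kept
            copies.append(ppt_path)
    return copies
```

Copying instead of renaming keeps the extracted .bin files intact, so a file that PowerPoint still refuses to open can be examined another way.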