MoonPoint Support Logo

 

Shop Amazon Warehouse Deals - Deep Discounts on Open-box and Used ProductsAmazon Warehouse Deals



Advanced Search
July
Sun Mon Tue Wed Thu Fri Sat
 
17 18 19 20
21 22 23 24 25 26 27
28 29 30 31      
2024
Months
JulAug Sep
Oct Nov Dec


Mon, Feb 19, 2018 11:24 pm

xlrd and hidden worksheets

I use the xlrd module in Python scripts to extract data from Excel workbooks. You can use the Python xlrd module to list the worksheets in a workbook and you can use the xlrd.sheet "visibility" value to determine whether a sheet is hidden and, if it is hidden, whether a user can unhide the sheet. The value should be either 0, 1, or 2 with the numbers having the following meaning:

Visibility of the sheet:

0 = visible
1 = hidden (can be unhidden by user -- Format -> Sheet -> Unhide)
2 = "very hidden" (can be unhidden only by VBA macro)

[ More Info ]

[/languages/python/excel] permanent link

Wed, Feb 14, 2018 9:31 pm

Extracting embedded Microsoft Office files from an Excel spreadsheet

I work with Excel workbooks on my MacBook Pro laptop that have embedded PowerPoint slides on some worksheets. The workbooks, which I need to review, are created by others. When I review them, I extract information from the Excel workbooks to an SQLite database with Python and also have begun extracting information embedded by Object Linking and Embedding (OLE) into files as noted in Extracting the contents of a directory in a zipfile using Python. Some of the embedded files are PowerPoint files, but when they are extracted they have a .bin extension, which I can't open in PowerPoint without changing the filename extension from .bin to .ppt. To automate the renaming process, I created a Python script, extract_embedded.py that will extract the embedded information to files in an "embedded" directory beneath the current working directory and then rename any .bin files that are PowerPoint files to have a .ppt extension. The script is shown below.

[ More Info ]

[/languages/python/excel] permanent link

Sat, Feb 10, 2018 10:55 pm

Extracting the contents of a directory in a zipfile using Python

A Microsoft Excel file with an .xlsx or .xlsm filename extension is an Office Open XML (OpenXML) zipped, XML-based file. The OpenXML format was developed by Microsoft for spreadsheets, charts, presentations and word processing documents. If you change the file extension to .zip by renaming the file, you can extract the contents of the zip file as you would with any other zip file - see Zipping and unzipping Excel xlsx files. Excel workbooks can contain other documents embedded within them using Object Linking and Embedding (OLE) technology - see Using olefile to obtain metadata from an OLE CDF V2 file. I often need to extract an embedded PowerPoint slide or Visio diagram from Excel .xlsm files, so I've been renaming the files to zip files and unzipping them as I would other zip files, but, since I want to automate the process and extract just specific embedded files for further processing within a Python script, I created the script below to extract the embedded files, which are contained within a xl/embeddings subdirectory within the .xlsm zip files. The script uses the zipfile module to deal with the zip files. Python's OS module is used to check for the existence of the destination directory and create it, if it doesn't yet exist.

#!/usr/bin/python

import os, zipfile

dirToExtract = "xl/embeddings/"
destinationDir = "embedded"
infile = raw_input("Enter zipfile: ")
archive = zipfile.ZipFile(infile)

if not os.path.exists(destinationDir):
    os.makedirs(destinationDir)

for file in archive.namelist():
    if file.startswith(dirToExtract):
        archive.extract(file, destinationDir)

The script prompts for the file to be unzipped and then extracts just the "xl/embeddings" folder and the files contained within it to a new directory it will create within the current working directory. The new directory will be named "embedded". After extracting the contents of the "xl/embeddings" directory to the newly created "embedded" folder, I had the the files below in the case of the particular .xlsm file I used for this example.

[ More Info ]

[/languages/python/excel] permanent link

Fri, Feb 09, 2018 10:04 pm

Using the Python xlrd module to list the worksheets in a workbook

To view the list of sheets in an Excel spreadsheet, I can use the xlrd module within the Python script below to obtain the list of worksheets within the workbook.

#!/usr/bin/python

import xlrd as xl

file_name = raw_input("File: ")
workbook = xl.open_workbook(file_name)
print workbook.sheet_names()

If I use the script to display the list of worksheets in a workbook named report.xlsx that has three sheets named alpha, beta, and gamma, I would see the following output:

$ ./sheetlist.py
File: report.xlsx
[u'alpha', u'beta', u'gamma']
$

[ More Info ]

[/languages/python/excel] permanent link

Tue, Jan 30, 2018 11:22 pm

Using Python to extract data from a spreadsheet and put it in a database

I need to review spreadsheets related to work requests on a daily basis. The Excel workbooks have multiple worksheets within them. I have been manually copying data from specific cells in one of the worksheets and pasting it into a SQLite database. I use DB Browser for SQLite to paste the information into the database. I wanted to automate the process of extracting the data from the .xlsm or .xlsx workbook files and inserting it into the SQLite database using a Python script. So I initially created a script that I could run from a Teminal window on my MacBook Pro laptop, as noted in Extracting data from cells in an Excel workbook with Python, that would just print the information to the Terminal window's command-line interface. I had another script to import data from a text file into an SQLite database with Python, so I combined code from that script with the one to read data from the spreadsheets to add the data to a table in the database.

[ More Info ]

[/languages/python/excel] permanent link

Fri, Jan 26, 2018 10:51 pm

Extracting data from cells in an Excel workbook with Python

I've been using xlrd to extract a column from an Excel spreadsheet with a Python script on my MacBook Pro laptop. With xlrd, you can read data from cells in a workbook by specifying a specific sheet and the cells from which you wish to extract the data in spreadsheet programs, such as Microsoft Excel, Apache OpenOffice Calc, or LibreOffice Cale.

With spreadsheet applications you normally have rows sequenced by numbers and columns sequenced by alphabetical characters. With xlrd, though, both rows and columns are specified by numbers with the first row starting at 0 and the first column starting at 0 as well.

[ More Info ]

[/languages/python/excel] permanent link

Fri, Oct 21, 2016 11:29 pm

Python script to extract a column from an Excel spreadsheet

I use the Python xlrd module to extract a column from an Excel spreadsheet. I've been using a Python script, extract-addresses.py to pull the data from a set column in an .xls spreadsheet, Directory.xls, and store it in a text file. I decided to make the script more generic, so that I can extract the data from a spreadsheet I specify when I run the script rather than a set spreadsheet and to have the specific column and output file name be variable as well. The new script is extract-column.py. The Python xlrd module will need to be installed for the script to work; xlrd can be downloaded from one of the links below:

Python Software Foundation
MoonPoint Support (version 1.0.0)

If no arguments are provided on the command line, the script will prompt for the name of the spreadsheet file from which the data should be extracted, the column number containing the data to be extracted (A is treated as column 1, B as column 2, etc.), and the name of the output file to hold the extracted data.

[ More Info ]

[/languages/python/excel] permanent link

Valid HTML 4.01 Transitional

Privacy Policy   Contact

Blosxom logo