XML vulnerabilities and Excel files
If your code ingests .xlsx files from sources you do not fully trust, be aware that .xlsx files are made up of XML and, as such, are susceptible to the vulnerabilities of XML.
xlrd uses ElementTree to parse XML but, as you’ll find if you look into it, there are many different ElementTree implementations. A good summary of the vulnerabilities you should worry about can be found here: XML vulnerabilities.
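To make the risk concrete, here is a minimal sketch of entity expansion using only the standard library's xml.etree.ElementTree. The helper name build_nested_entity_doc and the chosen sizes are illustrative assumptions, and the example is deliberately tiny so that it is harmless to run. It only shows that a short document can expand into a much longer string once parsed.

```python
import xml.etree.ElementTree as ET


def build_nested_entity_doc(levels, fanout):
    """Build a small XML document whose entities reference each other.

    Hypothetical helper for illustration: entity e0 is a single character,
    and each later entity repeats the previous one `fanout` times.
    """
    declarations = ['<!ENTITY e0 "x">']
    for level in range(1, levels + 1):
        references = "&e{};".format(level - 1) * fanout
        declarations.append('<!ENTITY e{} "{}">'.format(level, references))
    return "<!DOCTYPE r [{}]><r>&e{};</r>".format("".join(declarations), levels)


# Kept intentionally small: 4 levels with a fanout of 8 expand to 8**4 characters.
doc = build_nested_entity_doc(levels=4, fanout=8)
root = ET.fromstring(doc)
expanded_length = len(root.text)

print("document length:", len(doc))
print("expanded text length:", expanded_length)
```

With these settings the parsed text is 8**4 = 4096 characters long, far longer than the document that produced it. Larger values for levels and fanout are what turn this pattern into a denial-of-service attack, so do not increase them casually.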
For clarity, xlrd will try to import ElementTree from the following sources. The list is in priority order, with earlier entries preferred over later ones:
To guard against these problems, you should consider the defusedxml project, which can be used as follows:
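```python
import defusedxml
from defusedxml.common import EntitiesForbidden
from xlrd import open_workbook

# Patch the standard library XML parsers so that entity declarations are rejected.
defusedxml.defuse_stdlib()


def secure_open_workbook(**kwargs):
    # Open the workbook as usual, translating defusedxml's rejection into a ValueError.
    try:
        return open_workbook(**kwargs)
    except EntitiesForbidden:
        raise ValueError('Please use a xlsx file without XEE')
```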