././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/LICENSE.txt0000644000000000000000000000204500000000000015015 0ustar0000000000000000MIT License Copyright (c) 2015 nolze Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/README.md0000644000000000000000000001462700000000000014462 0ustar0000000000000000
# msoffcrypto-tool

[![PyPI](https://img.shields.io/pypi/v/msoffcrypto-tool.svg)](https://pypi.org/project/msoffcrypto-tool/)
[![PyPI downloads](https://img.shields.io/pypi/dm/msoffcrypto-tool.svg)](https://pypistats.org/packages/msoffcrypto-tool)
[![Build Status](https://travis-ci.com/nolze/msoffcrypto-tool.svg?branch=master)](https://travis-ci.com/nolze/msoffcrypto-tool)
[![Coverage Status](https://codecov.io/gh/nolze/msoffcrypto-tool/branch/master/graph/badge.svg)](https://codecov.io/gh/nolze/msoffcrypto-tool)
[![Documentation Status](https://readthedocs.org/projects/msoffcrypto-tool/badge/?version=latest)](http://msoffcrypto-tool.readthedocs.io/en/latest/?badge=latest)

msoffcrypto-tool (formerly ms-offcrypto-tool) is a Python tool and library for decrypting encrypted MS Office files with a password, an intermediate key, or the private key that generated its escrow key.

## Contents

* [Install](#install)
* [Examples](#examples)
* [Supported encryption methods](#supported-encryption-methods)
* [Tests](#tests)
* [Todo](#todo)
* [Resources](#resources)
* [Use cases and mentions](#use-cases-and-mentions)
* [Contributors](#contributors)

## Install

```
pip install msoffcrypto-tool
```

## Examples

### As CLI tool (with password)

```
msoffcrypto-tool encrypted.docx decrypted.docx -p Passw0rd
```

You are prompted for the password if you omit the password argument value:

```bash
$ msoffcrypto-tool encrypted.docx decrypted.docx -p
Password:
```

Test if the file is encrypted or not (exit code 0 or 1 is returned):

```
msoffcrypto-tool document.doc --test -v
```

### As library

The library functions support passwords and additional key types.

Basic usage:

```python
import msoffcrypto

encrypted = open("encrypted.docx", "rb")
file = msoffcrypto.OfficeFile(encrypted)

file.load_key(password="Passw0rd")  # Use password

with open("decrypted.docx", "wb") as f:
    file.decrypt(f)

encrypted.close()
```

Basic usage (in-memory):

```python
import msoffcrypto
import io
import pandas as pd

decrypted = io.BytesIO()

with open("encrypted.xlsx", "rb") as f:
    file = msoffcrypto.OfficeFile(f)
    file.load_key(password="Passw0rd")  # Use password
    file.decrypt(decrypted)

df = pd.read_excel(decrypted)
print(df)
```

Advanced usage:

```python
import binascii

# Verify password before decryption (default: False)
# The ECMA-376 Agile/Standard crypto system allows one to know whether the supplied password is correct before actually decrypting the file
# Currently, the verify_password option is only meaningful for ECMA-376 Agile/Standard Encryption
file.load_key(password="Passw0rd", verify_password=True)

# Use private key
file.load_key(private_key=open("priv.pem", "rb"))

# Use intermediate key (secretKey)
file.load_key(secret_key=binascii.unhexlify("AE8C36E68B4BB9EA46E5544A5FDB6693875B2FDE1507CBC65C8BCF99E25C2562"))

# Check the HMAC of the data payload before decryption (default: False)
# Currently, the verify_integrity option is only meaningful for ECMA-376 Agile Encryption
file.decrypt(open("decrypted.docx", "wb"), verify_integrity=True)
```

## Supported encryption methods

### MS-OFFCRYPTO specs

* [x] ECMA-376 (Agile Encryption/Standard Encryption)
  * [x] MS-DOCX (OOXML) (Word 2007-2016)
  * [x] MS-XLSX (OOXML) (Excel 2007-2016)
  * [x] MS-PPTX (OOXML) (PowerPoint 2007-2016)
* [x] Office Binary Document RC4 CryptoAPI
  * [x] MS-DOC (Word 2002, 2003, 2004)
  * [x] MS-XLS (Excel 2002, 2003, 2004) (experimental)
  * [x] MS-PPT (PowerPoint 2002, 2003, 2004) (partial, experimental)
* [x] Office Binary Document RC4
  * [x] MS-DOC (Word 97, 98, 2000)
  * [x] MS-XLS (Excel 97, 98, 2000) (experimental)
* [ ] ECMA-376 (Extensible Encryption)
* [ ] XOR Obfuscation

### Other

* [ ] Word 95 Encryption (Word 95 and prior)
* [ ] Excel 95 Encryption (Excel 95 and prior)
* [ ] PowerPoint 95 Encryption (PowerPoint 95 and prior)

PRs are welcome!

## Tests

With [coverage](https://github.com/nedbat/coveragepy) and [pytest](https://pytest.org/):

```
poetry install
poetry run coverage run -m pytest -v
```

## Todo

* [x] Add tests
* [x] Support decryption with passwords
* [x] Support older encryption schemes
* [x] Add function-level tests
* [x] Add API documents
* [x] Publish to PyPI
* [x] Add decryption tests for various file formats
* [x] Integrate with more comprehensive projects handling MS Office files (such as [oletools](https://github.com/decalage2/oletools/)?)
if possible
* [x] Add the password prompt mode for CLI
* [x] Improve error types (v4.12.0)
* [ ] Redesign APIs (v5.0.0)
* [ ] Introduce something like `ctypes.Structure`
* [ ] Support encryption
* [ ] Isolate parser

## Resources

* "Backdooring MS Office documents with secret master keys"
* Technical Documents
* [MS-OFFCRYPTO] Agile Encryption
* LibreOffice/core
* LibreOffice/mso-dumper
* wvDecrypt
* Microsoft Office password protection - Wikipedia
* office2john.py

## Alternatives

* herumi/msoffice
* DocRecrypt
* Apache POI - the Java API for Microsoft Documents

## Use cases and mentions

### General

* (kudos to maintainers!)
*

### Malware/maldoc analysis

*
*

### CTF

*
*

### In other languages

*
*

## Contributors

*

././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/__init__.py0000644000000000000000000000553300000000000017663 0ustar0000000000000000
import olefile
import zipfile

import pkg_resources

from msoffcrypto import exceptions

__version__ = pkg_resources.get_distribution("msoffcrypto-tool").version


def OfficeFile(file):
    """Return an office file object based on the format of given file.

    Args:
        file (:obj:`_io.BufferedReader`): Input file.

    Returns:
        BaseOfficeFile object.

    Examples:
        >>> with open("tests/inputs/example_password.docx", "rb") as f:
        ...     officefile = OfficeFile(f)
        ...     officefile.keyTypes
        ('password', 'private_key', 'secret_key')

        >>> with open("tests/inputs/example_password.docx", "rb") as f:
        ...     officefile = OfficeFile(f)
        ...     officefile.load_key(password="Password1234_", verify_password=True)

        >>> with open("README.md", "rb") as f:
        ...     officefile = OfficeFile(f)
        Traceback (most recent call last):
            ...
        msoffcrypto.exceptions.FileFormatError: ...

        >>> with open("tests/inputs/example_password.docx", "rb") as f:
        ...     officefile = OfficeFile(f)
        ...     officefile.load_key(password="0000", verify_password=True)
        Traceback (most recent call last):
            ...
        msoffcrypto.exceptions.InvalidKeyError: ...

    The given file handle is not closed; its file position will most likely change.
    """
    file.seek(0)  # required by isOleFile
    if olefile.isOleFile(file):
        ole = olefile.OleFileIO(file)
    elif zipfile.is_zipfile(file):  # Heuristic
        from msoffcrypto.format.ooxml import OOXMLFile

        return OOXMLFile(file)
    else:
        raise exceptions.FileFormatError("Unsupported file format")

    # TODO: Make format specifiable by option in case of obstruction
    # Try this first; see https://github.com/nolze/msoffcrypto-tool/issues/17
    if ole.exists("EncryptionInfo"):
        from msoffcrypto.format.ooxml import OOXMLFile

        return OOXMLFile(file)
    # MS-DOC: The WordDocument stream MUST be present in the file.
    # https://msdn.microsoft.com/en-us/library/dd926131(v=office.12).aspx
    elif ole.exists("wordDocument"):
        from msoffcrypto.format.doc97 import Doc97File

        return Doc97File(file)
    # MS-XLS: A file MUST contain exactly one Workbook Stream, ...
    # https://msdn.microsoft.com/en-us/library/dd911009(v=office.12).aspx
    elif ole.exists("Workbook"):
        from msoffcrypto.format.xls97 import Xls97File

        return Xls97File(file)
    # MS-PPT: A required stream whose name MUST be "PowerPoint Document".
    # https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/1fc22d56-28f9-4818-bd45-67c2bf721ccf
    elif ole.exists("PowerPoint Document"):
        from msoffcrypto.format.ppt97 import Ppt97File

        return Ppt97File(file)
    else:
        raise exceptions.FileFormatError("Unrecognized file format")
././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/__main__.py0000644000000000000000000000464500000000000017647 0ustar0000000000000000
from __future__ import print_function

import logging, sys
import argparse
import getpass

import olefile

from msoffcrypto import __version__
from msoffcrypto import OfficeFile
from msoffcrypto import exceptions

logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())


def ifWIN32SetBinary(io):
    if sys.platform == "win32":
        import msvcrt, os

        msvcrt.setmode(io.fileno(), os.O_BINARY)


def is_encrypted(file):
    r"""
    Test if the file is encrypted.

        >>> f = open("tests/inputs/plain.doc", "rb")
        >>> is_encrypted(f)
        False
    """
    # TODO: Validate file
    if not olefile.isOleFile(file):
        return False

    file = OfficeFile(file)

    return file.is_encrypted()


parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group(required=True)
group.add_argument("-p", "--password", nargs="?", const="", dest="password", help="password text")
group.add_argument("-t", "--test", dest="test_encrypted", action="store_true", help="test if the file is encrypted")
parser.add_argument("-v", dest="verbose", action="store_true", help="print verbose information")
parser.add_argument("infile", nargs="?", type=argparse.FileType("rb"), help="input file")
parser.add_argument("outfile", nargs="?", type=argparse.FileType("wb"), help="output file (if blank, stdout is used)")


def main():
    args = parser.parse_args()

    if args.verbose:
        logger.removeHandler(logging.NullHandler())
        logging.basicConfig(level=logging.DEBUG, format="%(message)s")
        logger.debug("Version: {}".format(__version__))

    if args.test_encrypted:
        if not is_encrypted(args.infile):
            print("{}: not encrypted".format(args.infile.name), file=sys.stderr)
            sys.exit(1)
        else:
            logger.debug("{}: encrypted".format(args.infile.name))
        return

    if not olefile.isOleFile(args.infile):
        raise exceptions.FileFormatError("Not OLE file")

    file = OfficeFile(args.infile)

    if args.password:
        file.load_key(password=args.password)
    else:
        password = getpass.getpass()
        file.load_key(password=password)

    if args.outfile is None:
        ifWIN32SetBinary(sys.stdout)
        if hasattr(sys.stdout, "buffer"):  # Python 3: write bytes through the underlying binary buffer
            args.outfile = sys.stdout.buffer
        else:  # Python 2: sys.stdout accepts bytes directly
            args.outfile = sys.stdout

    file.decrypt(args.outfile)


if __name__ == "__main__":
    main()
././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/exceptions.py0000644000000000000000000000073100000000000020300 0ustar0000000000000000
class FileFormatError(Exception):
    """Raised when the format of given file is unsupported or unrecognized.
    """

    pass


class ParseError(Exception):
    """Raised when the file cannot be parsed correctly.
    """

    pass


class DecryptionError(Exception):
    """Raised when the file cannot be decrypted.
    """

    pass


class InvalidKeyError(DecryptionError):
    """Raised when the given password or key is incorrect or cannot be verified.
""" pass ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/format/__init__.py0000644000000000000000000000000000000000000021133 0ustar0000000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/format/base.py0000644000000000000000000000061600000000000020323 0ustar0000000000000000import abc # For 2 and 3 compatibility # https://stackoverflow.com/questions/35673474/ ABC = abc.ABCMeta("ABC", (object,), {"__slots__": ()}) class BaseOfficeFile(ABC): def __init__(self): pass @abc.abstractmethod def load_key(self): pass @abc.abstractmethod def decrypt(self): pass @abc.abstractmethod def is_encrypted(self): pass ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/format/common.py0000644000000000000000000000320500000000000020676 0ustar0000000000000000import logging from struct import unpack logger = logging.getLogger(__name__) logger.addHandler(logging.NullHandler()) # https://msdn.microsoft.com/en-us/library/dd926359(v=office.12).aspx def _parse_encryptionheader(blob): (flags,) = unpack(">> blob = io.BytesIO(b'\xec\xa5\xc1\x00G\x00\t\x04\x00\x00\x00\x13\xbf\x004\x00\ ... \x00\x00\x00\x10\x00\x00\x00\x00\x00\x04\x00\x00\x16\x04\x00\x00') >>> fibbase = _parseFibBase(blob) >>> hex(fibbase.wIdent) '0xa5ec' >>> hex(fibbase.nFib) '0xc1' >>> hex(fibbase.fExtChar) '0x1' """ getBit = lambda bits, i: (bits & (1 << i)) >> i getBitSlice = lambda bits, i, w: (bits & (2 ** w - 1 << i)) >> i # https://msdn.microsoft.com/en-us/library/dd944620(v=office.12).aspx (buf,) = unpack_from(">> with open("tests/inputs/rc4cryptoapi_password.doc", "rb") as f: ... officefile = Doc97File(f) ... 
officefile.load_key(password="Password1234_") >>> with open("tests/inputs/rc4cryptoapi_password.doc", "rb") as f: ... officefile = Doc97File(f) ... officefile.load_key(password="0000") Traceback (most recent call last): ... msoffcrypto.exceptions.InvalidKeyError: ... """ def __init__(self, file): self.file = file ole = olefile.OleFileIO(file) # do not close this, would close file self.ole = ole self.format = "doc97" self.keyTypes = ["password"] self.key = None self.salt = None # https://msdn.microsoft.com/en-us/library/dd944620(v=office.12).aspx with ole.openstream("wordDocument") as stream: fib = _parseFib(stream) # https://msdn.microsoft.com/en-us/library/dd923367(v=office.12).aspx tablename = "1Table" if fib.base.fWhichTblStm == 1 else "0Table" Info = namedtuple("Info", ["fib", "tablename"]) self.info = Info( fib=fib, tablename=tablename, ) def load_key(self, password=None): fib = self.info.fib logger.debug("fEncrypted: {}, fObfuscation: {}".format(fib.base.fEncrypted, fib.base.fObfuscation)) if fib.base.fEncrypted == 1: if fib.base.fObfuscation == 1: # Using XOR obfuscation xor_obf_password_verifier = fib.base.IKey logger.debug(hex(xor_obf_password_verifier)) else: # elif fib.base.fObfuscation == 0: encryptionHeader_size = fib.base.IKey logger.debug("encryptionHeader_size: {}".format(hex(encryptionHeader_size))) with self.ole.openstream(self.info.tablename) as table: encryptionHeader = table # TODO why create a 2nd reference to same stream? 
encryptionVersionInfo = table.read(4) vMajor, vMinor = unpack(">> f = open("tests/inputs/plain.doc", "rb") >>> file = Doc97File(f) >>> file.is_encrypted() False >>> f = open("tests/inputs/rc4cryptoapi_password.doc", "rb") >>> file = Doc97File(f) >>> file.is_encrypted() True """ return True if self.info.fib.base.fEncrypted == 1 else False ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/format/ooxml.py0000644000000000000000000002232200000000000020545 0ustar0000000000000000import logging import base64, io from struct import unpack from xml.dom.minidom import parseString import zipfile import olefile from msoffcrypto import exceptions from msoffcrypto.format import base from msoffcrypto.format.common import _parse_encryptionheader, _parse_encryptionverifier from msoffcrypto.method.ecma376_agile import ECMA376Agile from msoffcrypto.method.ecma376_standard import ECMA376Standard logger = logging.getLogger(__name__) logger.addHandler(logging.NullHandler()) def _parseinfo_standard(ole): (headerFlags,) = unpack(">> with open("tests/inputs/example_password.docx", "rb") as f: ... officefile = OOXMLFile(f) ... officefile.load_key(password="Password1234_", verify_password=True) >>> with open("tests/inputs/example_password.docx", "rb") as f: ... officefile = OOXMLFile(f) ... officefile.load_key(password="0000", verify_password=True) Traceback (most recent call last): ... msoffcrypto.exceptions.InvalidKeyError: ... """ def __init__(self, file): self.format = "ooxml" file.seek(0) # TODO: Investigate the effect (required for olefile.isOleFile) # olefile cannot process non password protected ooxml files. # TODO: this code is duplicate of OfficeFile(). Merge? 
if olefile.isOleFile(file): ole = olefile.OleFileIO(file) self.file = ole with self.file.openstream("EncryptionInfo") as stream: self.type, self.info = _parseinfo(stream) logger.debug("OOXMLFile.type: {}".format(self.type)) self.secret_key = None if self.type == "agile": # TODO: Support aliases? self.keyTypes = ("password", "private_key", "secret_key") elif self.type == "standard": self.keyTypes = ("password", "secret_key") elif self.type == "extensible": pass elif zipfile.is_zipfile(file): raise exceptions.FileFormatError("Unencrypted document or unsupported file format") else: raise exceptions.FileFormatError("Unsupported file format") def load_key(self, password=None, private_key=None, secret_key=None, verify_password=False): if password: if self.type == "agile": self.secret_key = ECMA376Agile.makekey_from_password( password, self.info["passwordSalt"], self.info["passwordHashAlgorithm"], self.info["encryptedKeyValue"], self.info["spinValue"], self.info["passwordKeyBits"], ) if verify_password: verified = ECMA376Agile.verify_password( password, self.info["passwordSalt"], self.info["passwordHashAlgorithm"], self.info["encryptedVerifierHashInput"], self.info["encryptedVerifierHashValue"], self.info["spinValue"], self.info["passwordKeyBits"], ) if not verified: raise exceptions.InvalidKeyError("Key verification failed") elif self.type == "standard": self.secret_key = ECMA376Standard.makekey_from_password( password, self.info["header"]["algId"], self.info["header"]["algIdHash"], self.info["header"]["providerType"], self.info["header"]["keySize"], self.info["verifier"]["saltSize"], self.info["verifier"]["salt"], ) if verify_password: verified = ECMA376Standard.verifykey( self.secret_key, self.info["verifier"]["encryptedVerifier"], self.info["verifier"]["encryptedVerifierHash"] ) if not verified: raise exceptions.InvalidKeyError("Key verification failed") elif self.type == "extensible": pass elif private_key: if self.type == "agile": self.secret_key = 
ECMA376Agile.makekey_from_privkey(private_key, self.info["encryptedKeyValue"]) else: raise exceptions.DecryptionError("Unsupported key type for the encryption method") elif secret_key: self.secret_key = secret_key else: raise exceptions.DecryptionError("No key specified") def decrypt(self, ofile, verify_integrity=False): if self.type == "agile": with self.file.openstream("EncryptedPackage") as stream: if verify_integrity: verified = ECMA376Agile.verify_integrity( self.secret_key, self.info["keyDataSalt"], self.info["keyDataHashAlgorithm"], self.info["keyDataBlockSize"], self.info["encryptedHmacKey"], self.info["encryptedHmacValue"], stream, ) if not verified: raise exceptions.InvalidKeyError("Payload integrity verification failed") obuf = ECMA376Agile.decrypt(self.secret_key, self.info["keyDataSalt"], self.info["keyDataHashAlgorithm"], stream) ofile.write(obuf) elif self.type == "standard": with self.file.openstream("EncryptedPackage") as stream: obuf = ECMA376Standard.decrypt(self.secret_key, stream) ofile.write(obuf) else: raise exceptions.DecryptionError("Unsupported encryption method") # If the file is successfully decrypted, there must be a valid OOXML file, i.e. 
a valid zip file if not zipfile.is_zipfile(io.BytesIO(obuf)): raise exceptions.InvalidKeyError("The file could not be decrypted with this password") def is_encrypted(self): # Heuristic if isinstance(self.file, olefile.OleFileIO): return True else: return False ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/format/ppt97.py0000644000000000000000000006740100000000000020401 0ustar0000000000000000import logging, io, shutil, tempfile from struct import pack, unpack from collections import namedtuple import olefile from msoffcrypto import exceptions from msoffcrypto.format import base from msoffcrypto.format.common import _parse_encryptionheader, _parse_encryptionverifier from msoffcrypto.method.rc4_cryptoapi import DocumentRC4CryptoAPI logger = logging.getLogger(__name__) logger.addHandler(logging.NullHandler()) RecordHeader = namedtuple( "RecordHeader", [ "recVer", "recInstance", "recType", "recLen", ], ) def _parseRecordHeader(blob): # RecordHeader: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/df201194-0cd0-4dfb-bf10-eea353d8eabc getBitSlice = lambda bits, i, w: (bits & (2 ** w - 1 << i)) >> i blob.seek(0) (buf,) = unpack("> i (buf,) = unpack(" 0: persistdirectoryatom = persistdirectoryatom_stack.pop() for entry in persistdirectoryatom.rgPersistDirEntry: # logger.debug("persistId: %d" % entry.persistId) for i, offset in enumerate(entry.rgPersistOffset): persistobjectdirectory[entry.persistId + i] = offset return persistobjectdirectory def _parse_header_RC4CryptoAPI(encryptionInfo): flags = encryptionInfo.read(4) (headerSize,) = unpack(">> with open("tests/inputs/rc4cryptoapi_password.ppt", "rb") as f: ... officefile = Ppt97File(f) ... officefile.load_key(password="Password1234_") >>> with open("tests/inputs/rc4cryptoapi_password.ppt", "rb") as f: ... officefile = Ppt97File(f) ... 
officefile.load_key(password="0000") Traceback (most recent call last): ... msoffcrypto.exceptions.InvalidKeyError: ... """ def __init__(self, file): self.file = file ole = olefile.OleFileIO(file) # do not close this, would close file self.ole = ole self.format = "ppt97" self.keyTypes = ["password"] self.key = None self.salt = None # streams closed in destructor: currentuser = ole.openstream("Current User") powerpointdocument = ole.openstream("PowerPoint Document") Data = namedtuple("Data", ["currentuser", "powerpointdocument"]) self.data = Data( currentuser=currentuser, powerpointdocument=powerpointdocument, ) def __del__(self): """Destructor, closes opened streams.""" if hasattr(self, "data") and self.data: if self.data.currentuser: self.data.currentuser.close() if self.data.powerpointdocument: self.data.powerpointdocument.close() def load_key(self, password=None): persistobjectdirectory = construct_persistobjectdirectory(self.data) logger.debug("[*] persistobjectdirectory: {}".format(persistobjectdirectory)) self.data.currentuser.seek(0) currentuser = _parseCurrentUser(self.data.currentuser) logger.debug("[*] currentuser: {}".format(currentuser)) self.data.powerpointdocument.seek(currentuser.currentuseratom.offsetToCurrentEdit) usereditatom = _parseUserEditAtom(self.data.powerpointdocument) logger.debug("[*] usereditatom: {}".format(usereditatom)) # cf. 
Part 2 in https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/1fc22d56-28f9-4818-bd45-67c2bf721ccf cryptsession10container_offset = persistobjectdirectory[usereditatom.encryptSessionPersistIdRef] logger.debug("[*] cryptsession10container_offset: {}".format(cryptsession10container_offset)) self.data.powerpointdocument.seek(cryptsession10container_offset) cryptsession10container = _parseCryptSession10Container(self.data.powerpointdocument) logger.debug("[*] cryptsession10container: {}".format(cryptsession10container)) encryptionInfo = io.BytesIO(cryptsession10container.data) encryptionVersionInfo = encryptionInfo.read(4) vMajor, vMinor = unpack(" be an encrypted document. headerToken=0xE391C05F, offsetToCurrentEdit=cuatom.offsetToCurrentEdit, lenUserName=cuatom.lenUserName, docFileVersion=cuatom.docFileVersion, majorVersion=cuatom.majorVersion, minorVersion=cuatom.minorVersion, unused=cuatom.unused, ansiUserName=cuatom.ansiUserName, relVersion=cuatom.relVersion, unicodeUserName=cuatom.unicodeUserName, ) ) buf = _packCurrentUser(currentuser_new) buf.seek(0) currentuser_buf = buf # List of encrypted parts: https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-ppt/b0963334-4408-4621-879a-ef9c54551fd8 # PowerPoint Document Stream self.data.powerpointdocument.seek(0) powerpointdocument_size = len(self.data.powerpointdocument.read()) logger.debug("[*] powerpointdocument_size: {}".format(powerpointdocument_size)) self.data.powerpointdocument.seek(0) dec_bytearray = bytearray(self.data.powerpointdocument.read()) # UserEditAtom self.data.powerpointdocument.seek(currentuser.currentuseratom.offsetToCurrentEdit) # currentuseratom_raw = self.data.powerpointdocument.read(40) self.data.powerpointdocument.seek(currentuser.currentuseratom.offsetToCurrentEdit) usereditatom = _parseUserEditAtom(self.data.powerpointdocument) # logger.debug(usereditatom) # logger.debug(["offsetToCurrentEdit", currentuser.currentuseratom.offsetToCurrentEdit]) rh_new = 
RecordHeader( recVer=usereditatom.rh.recVer, recInstance=usereditatom.rh.recInstance, recType=usereditatom.rh.recType, recLen=usereditatom.rh.recLen - 4, # Omit encryptSessionPersistIdRef field ) # logger.debug([_packRecordHeader(usereditatom.rh).read(), _packRecordHeader(rh_new).read()]) usereditatom_new = UserEditAtom( rh=rh_new, lastSlideIdRef=usereditatom.lastSlideIdRef, version=usereditatom.version, minorVersion=usereditatom.minorVersion, majorVersion=usereditatom.majorVersion, offsetLastEdit=usereditatom.offsetLastEdit, offsetPersistDirectory=usereditatom.offsetPersistDirectory, docPersistIdRef=usereditatom.docPersistIdRef, persistIdSeed=usereditatom.persistIdSeed, lastView=usereditatom.lastView, unused=usereditatom.unused, encryptSessionPersistIdRef=0x00000000, # Clear ) # logger.debug(currentuseratom_raw) # logger.debug(_packUserEditAtom(usereditatom).read()) # logger.debug(_packUserEditAtom(usereditatom_new).read()) buf = _packUserEditAtom(usereditatom_new) buf.seek(0) buf_bytes = bytearray(buf.read()) offset = currentuser.currentuseratom.offsetToCurrentEdit dec_bytearray[offset : offset + len(buf_bytes)] = buf_bytes # PersistDirectoryAtom self.data.powerpointdocument.seek(currentuser.currentuseratom.offsetToCurrentEdit) usereditatom = _parseUserEditAtom(self.data.powerpointdocument) # logger.debug(usereditatom) self.data.powerpointdocument.seek(usereditatom.offsetPersistDirectory) persistdirectoryatom = _parsePersistDirectoryAtom(self.data.powerpointdocument) # logger.debug(persistdirectoryatom) persistdirectoryatom_new = PersistDirectoryAtom( rh=persistdirectoryatom.rh, rgPersistDirEntry=[ PersistDirectoryEntry( persistId=persistdirectoryatom.rgPersistDirEntry[0].persistId, # Omit CryptSession10Container cPersist=persistdirectoryatom.rgPersistDirEntry[0].cPersist - 1, rgPersistOffset=persistdirectoryatom.rgPersistDirEntry[0].rgPersistOffset, ), ], ) self.data.powerpointdocument.seek(usereditatom.offsetPersistDirectory) buf = 
_packPersistDirectoryAtom(persistdirectoryatom_new) buf_bytes = bytearray(buf.read()) offset = usereditatom.offsetPersistDirectory dec_bytearray[offset : offset + len(buf_bytes)] = buf_bytes # Persist Objects self.data.powerpointdocument.seek(0) persistobjectdirectory = construct_persistobjectdirectory(self.data) directory_items = list(persistobjectdirectory.items()) for i, (persistId, offset) in enumerate(directory_items): self.data.powerpointdocument.seek(offset) buf = self.data.powerpointdocument.read(8) rh = _parseRecordHeader(io.BytesIO(buf)) logger.debug("[*] rh: {}".format(rh)) # CryptSession10Container if rh.recType == 0x2F14: logger.debug("[*] CryptSession10Container found") # Remove encryption, pad by zero to preserve stream size dec_bytearray[offset : offset + (8 + rh.recLen)] = b"\x00" * (8 + rh.recLen) continue # The UserEditAtom record (section 2.3.3) and the PersistDirectoryAtom record (section 2.3.4) MUST NOT be encrypted. if rh.recType in [0x0FF5, 0x1772]: logger.debug("[*] UserEditAtom/PersistDirectoryAtom found") continue # TODO: Fix here recLen = directory_items[i + 1][1] - offset - 8 logger.debug("[*] recLen: {}".format(recLen)) self.data.powerpointdocument.seek(offset) enc_buf = io.BytesIO(self.data.powerpointdocument.read(8 + recLen)) blocksize = self.keySize * ((8 + recLen) // self.keySize + 1) # Undocumented dec = DocumentRC4CryptoAPI.decrypt(self.key, self.salt, self.keySize, enc_buf, blocksize=blocksize, block=persistId) dec_bytes = bytearray(dec.read()) dec_bytearray[offset : offset + len(dec_bytes)] = dec_bytes # To BytesIO dec_buf = io.BytesIO(dec_bytearray) dec_buf.seek(0) for i, (persistId, offset) in enumerate(directory_items): dec_buf.seek(offset) buf = dec_buf.read(8) rh = _parseRecordHeader(io.BytesIO(buf)) logger.debug("[*] rh: {}".format(rh)) dec_buf.seek(0) logger.debug("[*] powerpointdocument_size={}, len(dec_buf.read())={}".format(powerpointdocument_size, len(dec_buf.read()))) dec_buf.seek(0) powerpointdocument_dec_buf = 
dec_buf # TODO: Pictures Stream # TODO: Encrypted Summary Info Stream with tempfile.TemporaryFile() as _ofile: self.file.seek(0) shutil.copyfileobj(self.file, _ofile) outole = olefile.OleFileIO(_ofile, write_mode=True) outole.write_stream("Current User", currentuser_buf.read()) outole.write_stream("PowerPoint Document", powerpointdocument_dec_buf.read()) # Finalize _ofile.seek(0) shutil.copyfileobj(_ofile, ofile) return def is_encrypted(self): r""" Test if the file is encrypted. >>> f = open("tests/inputs/plain.ppt", "rb") >>> file = Ppt97File(f) >>> file.is_encrypted() False >>> f = open("tests/inputs/rc4cryptoapi_password.ppt", "rb") >>> file = Ppt97File(f) >>> file.is_encrypted() True """ self.data.currentuser.seek(0) currentuser = _parseCurrentUser(self.data.currentuser) logger.debug("[*] currentuser: {}".format(currentuser)) self.data.powerpointdocument.seek(currentuser.currentuseratom.offsetToCurrentEdit) usereditatom = _parseUserEditAtom(self.data.powerpointdocument) logger.debug("[*] usereditatom: {}".format(usereditatom)) if usereditatom.rh.recLen == 0x00000020: # Cf. 
_parseUserEditAtom return True else: return False ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/format/xls97.py0000644000000000000000000004415100000000000020401 0ustar0000000000000000import logging, io, shutil, tempfile from struct import pack, unpack from collections import namedtuple import olefile from msoffcrypto import exceptions from msoffcrypto.format import base from msoffcrypto.format.common import _parse_encryptionheader, _parse_encryptionverifier from msoffcrypto.method.rc4 import DocumentRC4 from msoffcrypto.method.rc4_cryptoapi import DocumentRC4CryptoAPI logger = logging.getLogger(__name__) logger.addHandler(logging.NullHandler()) recordNameNum = { "Formula": 6, "EOF": 10, "CalcCount": 12, "CalcMode": 13, "CalcPrecision": 14, "CalcRefMode": 15, "CalcDelta": 16, "CalcIter": 17, "Protect": 18, "Password": 19, "Header": 20, "Footer": 21, "ExternSheet": 23, "Lbl": 24, "WinProtect": 25, "VerticalPageBreaks": 26, "HorizontalPageBreaks": 27, "Note": 28, "Selection": 29, "Date1904": 34, "ExternName": 35, "LeftMargin": 38, "RightMargin": 39, "TopMargin": 40, "BottomMargin": 41, "PrintRowCol": 42, "PrintGrid": 43, "FilePass": 47, "Font": 49, "PrintSize": 51, "Continue": 60, "Window1": 61, "Backup": 64, "Pane": 65, "CodePage": 66, "Pls": 77, "DCon": 80, "DConRef": 81, "DConName": 82, "DefColWidth": 85, "XCT": 89, "CRN": 90, "FileSharing": 91, "WriteAccess": 92, "Obj": 93, "Uncalced": 94, "CalcSaveRecalc": 95, "Template": 96, "Intl": 97, "ObjProtect": 99, "ColInfo": 125, "Guts": 128, "WsBool": 129, "GridSet": 130, "HCenter": 131, "VCenter": 132, "BoundSheet8": 133, "WriteProtect": 134, "Country": 140, "HideObj": 141, "Sort": 144, "Palette": 146, "Sync": 151, "LPr": 152, "DxGCol": 153, "FnGroupName": 154, "FilterMode": 155, "BuiltInFnGroupCount": 156, "AutoFilterInfo": 157, "AutoFilter": 158, "Scl": 160, "Setup": 161, "ScenMan": 174, "SCENARIO": 175, "SxView": 
176, "Sxvd": 177, "SXVI": 178, "SxIvd": 180, "SXLI": 181, "SXPI": 182, "DocRoute": 184, "RecipName": 185, "MulRk": 189, "MulBlank": 190, "Mms": 193, "SXDI": 197, "SXDB": 198, "SXFDB": 199, "SXDBB": 200, "SXNum": 201, "SxBool": 202, "SxErr": 203, "SXInt": 204, "SXString": 205, "SXDtr": 206, "SxNil": 207, "SXTbl": 208, "SXTBRGIITM": 209, "SxTbpg": 210, "ObProj": 211, "SXStreamID": 213, "DBCell": 215, "SXRng": 216, "SxIsxoper": 217, "BookBool": 218, "DbOrParamQry": 220, "ScenarioProtect": 221, "OleObjectSize": 222, "XF": 224, "InterfaceHdr": 225, "InterfaceEnd": 226, "SXVS": 227, "MergeCells": 229, "BkHim": 233, "MsoDrawingGroup": 235, "MsoDrawing": 236, "MsoDrawingSelection": 237, "PhoneticInfo": 239, "SxRule": 240, "SXEx": 241, "SxFilt": 242, "SxDXF": 244, "SxItm": 245, "SxName": 246, "SxSelect": 247, "SXPair": 248, "SxFmla": 249, "SxFormat": 251, "SST": 252, "LabelSst": 253, "ExtSST": 255, "SXVDEx": 256, "SXFormula": 259, "SXDBEx": 290, "RRDInsDel": 311, "RRDHead": 312, "RRDChgCell": 315, "RRTabId": 317, "RRDRenSheet": 318, "RRSort": 319, "RRDMove": 320, "RRFormat": 330, "RRAutoFmt": 331, "RRInsertSh": 333, "RRDMoveBegin": 334, "RRDMoveEnd": 335, "RRDInsDelBegin": 336, "RRDInsDelEnd": 337, "RRDConflict": 338, "RRDDefName": 339, "RRDRstEtxp": 340, "LRng": 351, "UsesELFs": 352, "DSF": 353, "CUsr": 401, "CbUsr": 402, "UsrInfo": 403, "UsrExcl": 404, "FileLock": 405, "RRDInfo": 406, "BCUsrs": 407, "UsrChk": 408, "UserBView": 425, "UserSViewBegin": 426, "UserSViewBegin_Chart": 426, "UserSViewEnd": 427, "RRDUserView": 428, "Qsi": 429, "SupBook": 430, "Prot4Rev": 431, "CondFmt": 432, "CF": 433, "DVal": 434, "DConBin": 437, "TxO": 438, "RefreshAll": 439, "HLink": 440, "Lel": 441, "CodeName": 442, "SXFDBType": 443, "Prot4RevPass": 444, "ObNoMacros": 445, "Dv": 446, "Excel9File": 448, "RecalcId": 449, "EntExU2": 450, "Dimensions": 512, "Blank": 513, "Number": 515, "Label": 516, "BoolErr": 517, "String": 519, "Row": 520, "Index": 523, "Array": 545, "DefaultRowHeight": 549, 
"Table": 566, "Window2": 574, "RK": 638, "Style": 659, "BigName": 1048, "Format": 1054, "ContinueBigName": 1084, "ShrFmla": 1212, "HLinkTooltip": 2048, "WebPub": 2049, "QsiSXTag": 2050, "DBQueryExt": 2051, "ExtString": 2052, "TxtQry": 2053, "Qsir": 2054, "Qsif": 2055, "RRDTQSIF": 2056, "BOF": 2057, "OleDbConn": 2058, "WOpt": 2059, "SXViewEx": 2060, "SXTH": 2061, "SXPIEx": 2062, "SXVDTEx": 2063, "SXViewEx9": 2064, "ContinueFrt": 2066, "RealTimeData": 2067, "ChartFrtInfo": 2128, "FrtWrapper": 2129, "StartBlock": 2130, "EndBlock": 2131, "StartObject": 2132, "EndObject": 2133, "CatLab": 2134, "YMult": 2135, "SXViewLink": 2136, "PivotChartBits": 2137, "FrtFontList": 2138, "SheetExt": 2146, "BookExt": 2147, "SXAddl": 2148, "CrErr": 2149, "HFPicture": 2150, "FeatHdr": 2151, "Feat": 2152, "DataLabExt": 2154, "DataLabExtContents": 2155, "CellWatch": 2156, "FeatHdr11": 2161, "Feature11": 2162, "DropDownObjIds": 2164, "ContinueFrt11": 2165, "DConn": 2166, "List12": 2167, "Feature12": 2168, "CondFmt12": 2169, "CF12": 2170, "CFEx": 2171, "XFCRC": 2172, "XFExt": 2173, "AutoFilter12": 2174, "ContinueFrt12": 2175, "MDTInfo": 2180, "MDXStr": 2181, "MDXTuple": 2182, "MDXSet": 2183, "MDXProp": 2184, "MDXKPI": 2185, "MDB": 2186, "PLV": 2187, "Compat12": 2188, "DXF": 2189, "TableStyles": 2190, "TableStyle": 2191, "TableStyleElement": 2192, "StyleExt": 2194, "NamePublish": 2195, "NameCmt": 2196, "SortData": 2197, "Theme": 2198, "GUIDTypeLib": 2199, "FnGrp12": 2200, "NameFnGrp12": 2201, "MTRSettings": 2202, "CompressPictures": 2203, "HeaderFooter": 2204, "CrtLayout12": 2205, "CrtMlFrt": 2206, "CrtMlFrtContinue": 2207, "ForceFullCalculation": 2211, "ShapePropsStream": 2212, "TextPropsStream": 2213, "RichTextStream": 2214, "CrtLayout12A": 2215, "Units": 4097, "Chart": 4098, "Series": 4099, "DataFormat": 4102, "LineFormat": 4103, "MarkerFormat": 4105, "AreaFormat": 4106, "PieFormat": 4107, "AttachedLabel": 4108, "SeriesText": 4109, "ChartFormat": 4116, "Legend": 4117, "SeriesList": 4118, 
"Bar": 4119, "Line": 4120, "Pie": 4121, "Area": 4122, "Scatter": 4123, "CrtLine": 4124, "Axis": 4125, "Tick": 4126, "ValueRange": 4127, "CatSerRange": 4128, "AxisLine": 4129, "CrtLink": 4130, "DefaultText": 4132, "Text": 4133, "FontX": 4134, "ObjectLink": 4135, "Frame": 4146, "Begin": 4147, "End": 4148, "PlotArea": 4149, "Chart3d": 4154, "PicF": 4156, "DropBar": 4157, "Radar": 4158, "Surf": 4159, "RadarArea": 4160, "AxisParent": 4161, "LegendException": 4163, "ShtProps": 4164, "SerToCrt": 4165, "AxesUsed": 4166, "SBaseRef": 4168, "SerParent": 4170, "SerAuxTrend": 4171, "IFmtRecord": 4174, "Pos": 4175, "AlRuns": 4176, "BRAI": 4177, "SerAuxErrBar": 4187, "ClrtClient": 4188, "SerFmt": 4189, "Chart3DBarShape": 4191, "Fbi": 4192, "BopPop": 4193, "AxcExt": 4194, "Dat": 4195, "PlotGrowth": 4196, "SIIndex": 4197, "GelFrame": 4198, "BopPopCustom": 4199, "Fbi2": 4200, } def _parse_header_RC4(encryptionInfo): # RC4: https://msdn.microsoft.com/en-us/library/dd908560(v=office.12).aspx salt = encryptionInfo.read(16) encryptedVerifier = encryptionInfo.read(16) encryptedVerifierHash = encryptionInfo.read(16) info = { "salt": salt, "encryptedVerifier": encryptedVerifier, "encryptedVerifierHash": encryptedVerifierHash, } return info def _parse_header_RC4CryptoAPI(encryptionInfo): flags = encryptionInfo.read(4) (headerSize,) = unpack(">> with open("tests/inputs/rc4cryptoapi_password.xls", "rb") as f: ... officefile = Xls97File(f) ... officefile.load_key(password="Password1234_") >>> with open("tests/inputs/rc4cryptoapi_password.xls", "rb") as f: ... officefile = Xls97File(f) ... officefile.load_key(password="0000") Traceback (most recent call last): ... msoffcrypto.exceptions.InvalidKeyError: ... 
""" def __init__(self, file): self.file = file ole = olefile.OleFileIO(file) # do not close this, would close file self.ole = ole self.format = "xls97" self.keyTypes = ["password"] self.key = None self.salt = None workbook = ole.openstream("Workbook") # closed in destructor Data = namedtuple("Data", ["workbook"]) self.data = Data( workbook=workbook, ) def __del__(self): """Destructor, closes opened stream.""" if hasattr(self, "data") and self.data and self.data.workbook: self.data.workbook.close() def load_key(self, password=None): self.data.workbook.seek(0) workbook = _BIFFStream(self.data.workbook) # workbook stream consists of records, each of which begins with its ID number. # Record IDs (in decimal) are listed here: https://msdn.microsoft.com/en-us/library/dd945945(v=office.12).aspx # workbook stream's structure is WORKBOOK = BOF WORKBOOKCONTENT and so forth # as in https://msdn.microsoft.com/en-us/library/dd952177(v=office.12).aspx # A record begins with its length (in bytes). (num,) = unpack(">> f = open("tests/inputs/plain.xls", "rb") >>> file = Xls97File(f) >>> file.is_encrypted() False >>> f = open("tests/inputs/rc4cryptoapi_password.xls", "rb") >>> file = Xls97File(f) >>> file.is_encrypted() True """ # Utilising the method above, check for encryption type. 
self.data.workbook.seek(0) workbook = _BIFFStream(self.data.workbook) (num,) = unpack(" spincount-1; hash = sha512(iterator + hash) for i in range(0, spinValue, 1): h = hashCalc(pack(">> key = b'@ f\t\xd9\xfa\xad\xf2K\x07j\xeb\xf2\xc45\xb7B\x92\xc8\xb8\xa7\xaa\x81\xbcg\x9b\xe8\x97\x11\xb0*\xc2' >>> keyDataSalt = b'\x8f\xc7x"+P\x8d\xdcL\xe6\x8c\xdd\x15<\x16\xb4' >>> hashAlgorithm = 'SHA512' """ SEGMENT_LENGTH = 4096 hashCalc = _get_hash_func(hashAlgorithm) obuf = io.BytesIO() totalSize = unpack(">> password = 'Password1234_' >>> saltValue = b'\xcb\xca\x1c\x99\x93C\xfb\xad\x92\x07V4\x15\x004\xb0' >>> hashAlgorithm = 'SHA512' >>> encryptedVerifierHashInput = b'9\xee\xa5N&\xe5\x14y\x8c(K\xc7qM8\xac' >>> encryptedVerifierHashValue = b'\x147mm\x81s4\xe6\xb0\xffO\xd8"\x1a|g\x8e]\x8axN\x8f\x99\x9fL\x18\x890\xc3jK)\xc5\xb33`' + \ ... b'[\\\xd4\x03\xb0P\x03\xad\xcf\x18\xcc\xa8\xcb\xab\x8d\xeb\xe3s\xc6V\x04\xa0\xbe\xcf\xae\\\n\xd0' >>> spinValue = 100000 >>> keyBits = 256 >>> ECMA376Agile.verify_password(password, saltValue, hashAlgorithm, encryptedVerifierHashInput, encryptedVerifierHashValue, spinValue, keyBits) True """ # NOTE: See https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-offcrypto/a57cb947-554f-4e5e-b150-3f2978225e92 block1 = bytearray([0xFE, 0xA7, 0xD2, 0x76, 0x3B, 0x4B, 0x9E, 0x79]) block2 = bytearray([0xD7, 0xAA, 0x0F, 0x6D, 0x30, 0x61, 0x34, 0x4E]) h = ECMA376Agile._derive_iterated_hash_from_password(password, saltValue, hashAlgorithm, spinValue) key1 = ECMA376Agile._derive_encryption_key(h.digest(), block1, hashAlgorithm, keyBits) key2 = ECMA376Agile._derive_encryption_key(h.digest(), block2, hashAlgorithm, keyBits) hash_input = _decrypt_aes_cbc(encryptedVerifierHashInput, key1, saltValue) hashCalc = _get_hash_func(hashAlgorithm) actual_hash = hashCalc(hash_input) actual_hash = actual_hash.digest() expected_hash = _decrypt_aes_cbc(encryptedVerifierHashValue, key2, saltValue) return actual_hash == expected_hash @staticmethod def 
verify_integrity(secretKey, keyDataSalt, keyDataHashAlgorithm, keyDataBlockSize, encryptedHmacKey, encryptedHmacValue, stream): r""" Return True if the HMAC of the data payload is valid. """ # NOTE: See https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-offcrypto/63d9c262-82b9-4fa3-a06d-d087b93e3b00 block4 = bytearray([0x5F, 0xB2, 0xAD, 0x01, 0x0C, 0xB9, 0xE1, 0xF6]) block5 = bytearray([0xA0, 0x67, 0x7F, 0x02, 0xB2, 0x2C, 0x84, 0x33]) hashCalc = _get_hash_func(keyDataHashAlgorithm) iv1 = hashCalc(keyDataSalt + block4).digest() iv1 = iv1[:keyDataBlockSize] iv2 = hashCalc(keyDataSalt + block5).digest() iv2 = iv2[:keyDataBlockSize] hmacKey = _decrypt_aes_cbc(encryptedHmacKey, secretKey, iv1) hmacValue = _decrypt_aes_cbc(encryptedHmacValue, secretKey, iv2) msg_hmac = hmac.new(hmacKey, stream.read(), hashCalc) actualHmac = msg_hmac.digest() stream.seek(0) return hmacValue == actualHmac @staticmethod def makekey_from_privkey(privkey, encryptedKeyValue): privkey = serialization.load_pem_private_key(privkey.read(), password=None, backend=default_backend()) skey = privkey.decrypt(encryptedKeyValue, padding.PKCS1v15()) return skey @staticmethod def makekey_from_password(password, saltValue, hashAlgorithm, encryptedKeyValue, spinValue, keyBits): r""" Generate intermediate key from given password. 
>>> password = 'Password1234_' >>> saltValue = b'Lr]E\xdca\x0f\x93\x94\x12\xa0M\xa7\x91\x04f' >>> hashAlgorithm = 'SHA512' >>> encryptedKeyValue = b"\xa1l\xd5\x16Zz\xb9\xd2q\x11>\xd3\x86\xa7\x8c\xf4\x96\x92\xe8\xe5'\xb0\xc5\xfc\x00U\xed\x08\x0b|\xb9K" >>> spinValue = 100000 >>> keyBits = 256 >>> expected = b'@ f\t\xd9\xfa\xad\xf2K\x07j\xeb\xf2\xc45\xb7B\x92\xc8\xb8\xa7\xaa\x81\xbcg\x9b\xe8\x97\x11\xb0*\xc2' >>> ECMA376Agile.makekey_from_password(password, saltValue, hashAlgorithm, encryptedKeyValue, spinValue, keyBits) == expected True """ block3 = bytearray([0x14, 0x6E, 0x0B, 0xE7, 0xAB, 0xAC, 0xD0, 0xD6]) h = ECMA376Agile._derive_iterated_hash_from_password(password, saltValue, hashAlgorithm, spinValue) encryption_key = ECMA376Agile._derive_encryption_key(h.digest(), block3, hashAlgorithm, keyBits) skey = _decrypt_aes_cbc(encryptedKeyValue, encryption_key, saltValue) return skey ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/method/ecma376_extensible.py0000644000000000000000000000007600000000000022770 0ustar0000000000000000class ECMA376Extensible: def __init__(self): pass ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/method/ecma376_standard.py0000644000000000000000000000752200000000000022431 0ustar0000000000000000import logging import io from hashlib import sha1 from struct import pack, unpack from cryptography.hazmat.backends import default_backend from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes logger = logging.getLogger(__name__) logger.addHandler(logging.NullHandler()) class ECMA376Standard: def __init__(self): pass @staticmethod def decrypt(key, ibuf): r""" Return decrypted data. 
""" obuf = io.BytesIO() totalSize = unpack(">> key = b'@\xb1:q\xf9\x0b\x96n7T\x08\xf2\xd1\x81\xa1\xaa' >>> encryptedVerifier = b'Qos.\x96o\xac\x17\xb1\xc5\xd7\xd8\xcc6\xc9(' >>> encryptedVerifierHash = b'+ah\xda\xbe)\x11\xad+\xd3|\x17Ft\\\x14\xd3\xcf\x1b\xb1@\xa4\x8fNo=#\x88\x08r\xb1j' >>> ECMA376Standard.verifykey(key, encryptedVerifier, encryptedVerifierHash) True """ # TODO: For consistency with Agile, rename method to verify_password or the like logger.debug([key, encryptedVerifier, encryptedVerifierHash]) # https://msdn.microsoft.com/en-us/library/dd926426(v=office.12).aspx aes = Cipher(algorithms.AES(key), modes.ECB(), backend=default_backend()) decryptor = aes.decryptor() verifier = decryptor.update(encryptedVerifier) expected_hash = sha1(verifier).digest() decryptor = aes.decryptor() verifierHash = decryptor.update(encryptedVerifierHash)[: sha1().digest_size] return expected_hash == verifierHash @staticmethod def makekey_from_password(password, algId, algIdHash, providerType, keySize, saltSize, salt): r""" Generate intermediate key from given password. >>> password = 'Password1234_' >>> algId = 0x660e >>> algIdHash = 0x8004 >>> providerType = 0x18 >>> keySize = 128 >>> saltSize = 16 >>> salt = b'\xe8\x82fI\x0c[\xd1\xee\xbd+C\x94\xe3\xf80\xef' >>> expected = b'@\xb1:q\xf9\x0b\x96n7T\x08\xf2\xd1\x81\xa1\xaa' >>> ECMA376Standard.makekey_from_password(password, algId, algIdHash, providerType, keySize, saltSize, salt) == expected True """ logger.debug([password, hex(algId), hex(algIdHash), hex(providerType), keySize, saltSize, salt]) xor_bytes = lambda a, b: bytearray([p ^ q for p, q in zip(bytearray(a), bytearray(b))]) # bytearray() for Python 2 compat. 
# https://msdn.microsoft.com/en-us/library/dd925430(v=office.12).aspx ITER_COUNT = 50000 password = password.encode("UTF-16LE") h = sha1(salt + password).digest() for i in range(ITER_COUNT): ibytes = pack(">> password = 'password1' >>> salt = b'\xe8w,\x1d\x91\xc5j7\x96Ga\xb2\x80\x182\x17' >>> block = 0 >>> expected = b' \xbf2\xdd\xf5@\x85\x8cQ7D\xaf\x0f$\xe0<' >>> _makekey(password, salt, block) == expected True """ # https://msdn.microsoft.com/en-us/library/dd920360(v=office.12).aspx password = password.encode("UTF-16LE") h0 = md5(password).digest() truncatedHash = h0[:5] intermediateBuffer = (truncatedHash + salt) * 16 h1 = md5(intermediateBuffer).digest() truncatedHash = h1[:5] blockbytes = pack(">> password = 'password1' >>> salt = b'\xe8w,\x1d\x91\xc5j7\x96Ga\xb2\x80\x182\x17' >>> encryptedVerifier = b'\xc9\xe9\x97\xd4T\x97=1\x0b\xb1\xbap\x14&\x83~' >>> encryptedVerifierHash = b'\xb1\xde\x17\x8f\x07\xe9\x89\xc4M\xae^L\xf9j\xc4\x07' >>> DocumentRC4.verifypw(password, salt, encryptedVerifier, encryptedVerifierHash) True """ # https://msdn.microsoft.com/en-us/library/dd952648(v=office.12).aspx block = 0 key = _makekey(password, salt, block) cipher = Cipher(algorithms.ARC4(key), mode=None, backend=default_backend()) decryptor = cipher.decryptor() verifier = decryptor.update(encryptedVerifier) verifierHash = decryptor.update(encryptedVerifierHash) hash = md5(verifier).digest() logging.debug([verifierHash, hash]) return hash == verifierHash @staticmethod def decrypt(password, salt, ibuf, blocksize=0x200): r""" Return decrypted data. 
""" obuf = io.BytesIO() block = 0 key = _makekey(password, salt, block) for c, buf in enumerate(iter(functools.partial(ibuf.read, blocksize), b"")): cipher = Cipher(algorithms.ARC4(key), mode=None, backend=default_backend()) decryptor = cipher.decryptor() dec = decryptor.update(buf) + decryptor.finalize() obuf.write(dec) # From wvDecrypt: # at this stage we need to rekey the rc4 algorithm # Dieter Spaar figured out # this rekeying, big kudos to him block += 1 key = _makekey(password, salt, block) obuf.seek(0) return obuf ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/method/rc4_cryptoapi.py0000644000000000000000000000470400000000000022165 0ustar0000000000000000import functools, io, logging from hashlib import sha1 from struct import pack from cryptography.hazmat.backends import default_backend from cryptography.hazmat.primitives.ciphers import Cipher, algorithms logger = logging.getLogger(__name__) logger.addHandler(logging.NullHandler()) def _makekey(password, salt, keyLength, block, algIdHash=0x00008004): r""" Return an intermediate key. 
""" # https://msdn.microsoft.com/en-us/library/dd920677(v=office.12).aspx password = password.encode("UTF-16LE") h0 = sha1(salt + password).digest() blockbytes = pack(" figured out # this rekeying, big kudos to him block += 1 key = _makekey(password, salt, keySize, block) obuf.seek(0) return obuf ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/msoffcrypto/method/xor_obfuscation.py0000644000000000000000000000000000000000000022570 0ustar0000000000000000././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690887.7988966 msoffcrypto-tool-5.0.0/pyproject.toml0000644000000000000000000000176300000000000016114 0ustar0000000000000000[tool.poetry] name = "msoffcrypto-tool" version = "5.0.0" description = "Python tool and library for decrypting MS Office files with passwords or other keys" license = "MIT" homepage = "https://github.com/nolze/msoffcrypto-tool" authors = ["nolze "] readme = "README.md" packages = [{ include = "msoffcrypto" }] [tool.poetry.dependencies] python = "^3.6" cryptography = ">=2.3" olefile = ">=0.45" [tool.poetry.dev-dependencies] pytest = { version = ">=6.2.1", python = "^3.6" } black = { version = "^20.8b1", python = "^3.6" } coverage = { extras = ["toml"], version = "^5.3.1" } [tool.poetry.scripts] msoffcrypto-tool = 'msoffcrypto.__main__:main' [tool.black] line-length = 140 exclude = '/(\.git|\.pytest_cache|\.venv|\.vscode|dist|docs)/' [tool.pytest.ini_options] addopts = "-ra -q --doctest-modules" testpaths = ["msoffcrypto", "tests"] [tool.coverage.run] omit = [".venv/*", "tests/*"] [build-system] requires = ["poetry_core>=1.0.0"] build-backend = "poetry.core.masonry.api" ././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690938.2696033 msoffcrypto-tool-5.0.0/setup.py0000644000000000000000000001672600000000000014717 0ustar0000000000000000# -*- coding: utf-8 -*- from 
setuptools import setup packages = \ ['msoffcrypto', 'msoffcrypto.format', 'msoffcrypto.method'] package_data = \ {'': ['*']} install_requires = \ ['cryptography>=2.3', 'olefile>=0.45'] entry_points = \ {'console_scripts': ['msoffcrypto-tool = msoffcrypto.__main__:main']} setup_kwargs = { 'name': 'msoffcrypto-tool', 'version': '5.0.0', 'description': 'Python tool and library for decrypting MS Office files with passwords or other keys', 'long_description': '# msoffcrypto-tool\n\n[![PyPI](https://img.shields.io/pypi/v/msoffcrypto-tool.svg)](https://pypi.org/project/msoffcrypto-tool/)\n[![PyPI downloads](https://img.shields.io/pypi/dm/msoffcrypto-tool.svg)](https://pypistats.org/packages/msoffcrypto-tool)\n[![Build Status](https://travis-ci.com/nolze/msoffcrypto-tool.svg?branch=master)](https://travis-ci.com/nolze/msoffcrypto-tool)\n[![Coverage Status](https://codecov.io/gh/nolze/msoffcrypto-tool/branch/master/graph/badge.svg)](https://codecov.io/gh/nolze/msoffcrypto-tool)\n[![Documentation Status](https://readthedocs.org/projects/msoffcrypto-tool/badge/?version=latest)](http://msoffcrypto-tool.readthedocs.io/en/latest/?badge=latest)\n\nmsoffcrypto-tool (formerly ms-offcrypto-tool) is Python tool and library for decrypting encrypted MS Office files with password, intermediate key, or private key which generated its escrow key.\n\n## Contents\n\n* [Install](#install)\n* [Examples](#examples)\n* [Supported encryption methods](#supported-encryption-methods)\n* [Tests](#tests)\n* [Todo](#todo)\n* [Resources](#resources)\n* [Use cases and mentions](#use-cases-and-mentions)\n* [Contributors](#contributors)\n\n## Install\n\n```\npip install msoffcrypto-tool\n```\n\n## Examples\n\n### As CLI tool (with password)\n\n```\nmsoffcrypto-tool encrypted.docx decrypted.docx -p Passw0rd\n```\n\nPassword is prompted if you omit the password argument value:\n\n```bash\n$ msoffcrypto-tool encrypted.docx decrypted.docx -p\nPassword:\n```\n\nTest if the file is encrypted or not (exit code 
0 or 1 is returned):\n\n```\nmsoffcrypto-tool document.doc --test -v\n```\n\n### As library\n\nPassword and more key types are supported with library functions.\n\nBasic usage:\n\n```python\nimport msoffcrypto\n\nencrypted = open("encrypted.docx", "rb")\nfile = msoffcrypto.OfficeFile(encrypted)\n\nfile.load_key(password="Passw0rd") # Use password\n\nwith open("decrypted.docx", "wb") as f:\n file.decrypt(f)\n\nencrypted.close()\n```\n\nBasic usage (in-memory):\n\n```python\nimport msoffcrypto\nimport io\nimport pandas as pd\n\ndecrypted = io.BytesIO()\n\nwith open("encrypted.xlsx", "rb") as f:\n file = msoffcrypto.OfficeFile(f)\n file.load_key(password="Passw0rd") # Use password\n file.decrypt(decrypted)\n\ndf = pd.read_excel(decrypted)\nprint(df)\n```\n\nAdvanced usage:\n\n```python\n# Verify password before decryption (default: False)\n# The ECMA-376 Agile/Standard crypto system allows one to know whether the supplied password is correct before actually decrypting the file\n# Currently, the verify_password option is only meaningful for ECMA-376 Agile/Standard Encryption\nfile.load_key(password="Passw0rd", verify_password=True)\n\n# Use private key\nfile.load_key(private_key=open("priv.pem", "rb"))\n\n# Use intermediate key (secretKey)\nfile.load_key(secret_key=binascii.unhexlify("AE8C36E68B4BB9EA46E5544A5FDB6693875B2FDE1507CBC65C8BCF99E25C2562"))\n\n# Check the HMAC of the data payload before decryption (default: False)\n# Currently, the verify_integrity option is only meaningful for ECMA-376 Agile Encryption\nfile.decrypt(open("decrypted.docx", "wb"), verify_integrity=True)\n```\n\n## Supported encryption methods\n\n### MS-OFFCRYPTO specs\n\n* [x] ECMA-376 (Agile Encryption/Standard Encryption)\n * [x] MS-DOCX (OOXML) (Word 2007-2016)\n * [x] MS-XLSX (OOXML) (Excel 2007-2016)\n * [x] MS-PPTX (OOXML) (PowerPoint 2007-2016)\n* [x] Office Binary Document RC4 CryptoAPI\n * [x] MS-DOC (Word 2002, 2003, 2004)\n * [x] MS-XLS (Excel 2002, 2003, 2004) (experimental)\n * 
[x] MS-PPT (PowerPoint 2002, 2003, 2004) (partial, experimental)\n* [x] Office Binary Document RC4\n * [x] MS-DOC (Word 97, 98, 2000)\n * [x] MS-XLS (Excel 97, 98, 2000) (experimental)\n* [ ] ECMA-376 (Extensible Encryption)\n* [ ] XOR Obfuscation\n\n### Other\n\n* [ ] Word 95 Encryption (Word 95 and prior)\n* [ ] Excel 95 Encryption (Excel 95 and prior)\n* [ ] PowerPoint 95 Encryption (PowerPoint 95 and prior)\n\nPRs are welcome!\n\n## Tests\n\nWith [coverage](https://github.com/nedbat/coveragepy) and [pytest](https://pytest.org/):\n\n```\npoetry install\npoetry run coverage run -m pytest -v\n```\n\n## Todo\n\n* [x] Add tests\n* [x] Support decryption with passwords\n* [x] Support older encryption schemes\n* [x] Add function-level tests\n* [x] Add API documents\n* [x] Publish to PyPI\n* [x] Add decryption tests for various file formats\n* [x] Integrate with more comprehensive projects handling MS Office files (such as [oletools](https://github.com/decalage2/oletools/)?) if possible\n* [x] Add the password prompt mode for CLI\n* [x] Improve error types (v4.12.0)\n* [ ] Redesign APIs (v5.0.0)\n* [ ] Introduce something like `ctypes.Structure`\n* [ ] Support encryption\n* [ ] Isolate parser\n\n## Resources\n\n* "Backdooring MS Office documents with secret master keys" \n* Technical Documents \n * [MS-OFFCRYPTO] Agile Encryption \n* LibreOffice/core \n* LibreOffice/mso-dumper \n* wvDecrypt \n* Microsoft Office password protection - Wikipedia \n* office2john.py \n\n## Alternatives\n\n* herumi/msoffice \n* DocRecrypt \n* Apache POI - the Java API for Microsoft Documents \n\n## Use cases and mentions\n\n### General\n\n* (kudos to maintainers!)\n* \n\n### Malware/maldoc analysis\n\n* \n* \n\n### CTF\n\n* \n* \n\n### In other languages\n\n* \n* \n\n## Contributors\n\n* \n', 'author': 'nolze', 'author_email': 'nolze@archlinux.us', 'maintainer': None, 'maintainer_email': None, 'url': 'https://github.com/nolze/msoffcrypto-tool', 'packages': packages, 'package_data': 
package_data,
    'install_requires': install_requires,
    'entry_points': entry_points,
    'python_requires': '>=3.6,<4.0',
}


setup(**setup_kwargs)
././@PaxHeader0000000000000000000000000000003400000000000011452 xustar000000000000000028 mtime=1642690938.2704792 msoffcrypto-tool-5.0.0/PKG-INFO0000644000000000000000000001621100000000000014267 0ustar0000000000000000
Metadata-Version: 2.1
Name: msoffcrypto-tool
Version: 5.0.0
Summary: Python tool and library for decrypting MS Office files with passwords or other keys
Home-page: https://github.com/nolze/msoffcrypto-tool
License: MIT
Author: nolze
Author-email: nolze@archlinux.us
Requires-Python: >=3.6,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: cryptography (>=2.3)
Requires-Dist: olefile (>=0.45)
Description-Content-Type: text/markdown

# msoffcrypto-tool

[![PyPI](https://img.shields.io/pypi/v/msoffcrypto-tool.svg)](https://pypi.org/project/msoffcrypto-tool/)
[![PyPI downloads](https://img.shields.io/pypi/dm/msoffcrypto-tool.svg)](https://pypistats.org/packages/msoffcrypto-tool)
[![Build Status](https://travis-ci.com/nolze/msoffcrypto-tool.svg?branch=master)](https://travis-ci.com/nolze/msoffcrypto-tool)
[![Coverage Status](https://codecov.io/gh/nolze/msoffcrypto-tool/branch/master/graph/badge.svg)](https://codecov.io/gh/nolze/msoffcrypto-tool)
[![Documentation Status](https://readthedocs.org/projects/msoffcrypto-tool/badge/?version=latest)](http://msoffcrypto-tool.readthedocs.io/en/latest/?badge=latest)

msoffcrypto-tool (formerly ms-offcrypto-tool) is a Python tool and library for decrypting encrypted MS Office files with a password, an intermediate key, or the private key that generated the escrow key.

## Contents

* [Install](#install)
* [Examples](#examples)
* [Supported encryption methods](#supported-encryption-methods)
* [Tests](#tests)
* [Todo](#todo)
* [Resources](#resources)
* [Use cases and mentions](#use-cases-and-mentions)
* [Contributors](#contributors)

## Install

```
pip install msoffcrypto-tool
```

## Examples

### As CLI tool (with password)

```
msoffcrypto-tool encrypted.docx decrypted.docx -p Passw0rd
```

Password is prompted if you omit the password argument value:

```bash
$ msoffcrypto-tool encrypted.docx decrypted.docx -p
Password:
```

Test if the file is encrypted or not (exit code 0 or 1 is returned):

```
msoffcrypto-tool document.doc --test -v
```

### As library

Password and more key types are supported with library functions.

Basic usage:

```python
import msoffcrypto

encrypted = open("encrypted.docx", "rb")
file = msoffcrypto.OfficeFile(encrypted)

file.load_key(password="Passw0rd")  # Use password

with open("decrypted.docx", "wb") as f:
    file.decrypt(f)

encrypted.close()
```

Basic usage (in-memory):

```python
import msoffcrypto
import io
import pandas as pd

decrypted = io.BytesIO()

with open("encrypted.xlsx", "rb") as f:
    file = msoffcrypto.OfficeFile(f)
    file.load_key(password="Passw0rd")  # Use password
    file.decrypt(decrypted)

df = pd.read_excel(decrypted)
print(df)
```

Advanced usage:

```python
# Verify password before decryption (default: False)
# The ECMA-376 Agile/Standard crypto system allows one to know whether the supplied password is correct before actually decrypting the file
# Currently, the verify_password option is only meaningful for ECMA-376 Agile/Standard Encryption
file.load_key(password="Passw0rd", verify_password=True)

# Use private key
file.load_key(private_key=open("priv.pem", "rb"))

# Use intermediate key (secretKey)
file.load_key(secret_key=binascii.unhexlify("AE8C36E68B4BB9EA46E5544A5FDB6693875B2FDE1507CBC65C8BCF99E25C2562"))

# Check the HMAC of the data payload before decryption (default: False)
# Currently, the verify_integrity option is only meaningful for ECMA-376 Agile Encryption
file.decrypt(open("decrypted.docx", "wb"), verify_integrity=True)
```

## Supported encryption methods

### MS-OFFCRYPTO specs

* [x] ECMA-376 (Agile Encryption/Standard Encryption)
  * [x] MS-DOCX (OOXML) (Word 2007-2016)
  * [x] MS-XLSX (OOXML) (Excel 2007-2016)
  * [x] MS-PPTX (OOXML) (PowerPoint 2007-2016)
* [x] Office Binary Document RC4 CryptoAPI
  * [x] MS-DOC (Word 2002, 2003, 2004)
  * [x] MS-XLS (Excel 2002, 2003, 2004) (experimental)
  * [x] MS-PPT (PowerPoint 2002, 2003, 2004) (partial, experimental)
* [x] Office Binary Document RC4
  * [x] MS-DOC (Word 97, 98, 2000)
  * [x] MS-XLS (Excel 97, 98, 2000) (experimental)
* [ ] ECMA-376 (Extensible Encryption)
* [ ] XOR Obfuscation

### Other

* [ ] Word 95 Encryption (Word 95 and prior)
* [ ] Excel 95 Encryption (Excel 95 and prior)
* [ ] PowerPoint 95 Encryption (PowerPoint 95 and prior)

PRs are welcome!

## Tests

With [coverage](https://github.com/nedbat/coveragepy) and [pytest](https://pytest.org/):

```
poetry install
poetry run coverage run -m pytest -v
```

## Todo

* [x] Add tests
* [x] Support decryption with passwords
* [x] Support older encryption schemes
* [x] Add function-level tests
* [x] Add API documents
* [x] Publish to PyPI
* [x] Add decryption tests for various file formats
* [x] Integrate with more comprehensive projects handling MS Office files (such as [oletools](https://github.com/decalage2/oletools/)?) if possible
* [x] Add the password prompt mode for CLI
* [x] Improve error types (v4.12.0)
* [ ] Redesign APIs (v5.0.0)
* [ ] Introduce something like `ctypes.Structure`
* [ ] Support encryption
* [ ] Isolate parser

## Resources

* "Backdooring MS Office documents with secret master keys"
* Technical Documents
  * [MS-OFFCRYPTO] Agile Encryption
* LibreOffice/core
* LibreOffice/mso-dumper
* wvDecrypt
* Microsoft Office password protection - Wikipedia
* office2john.py

## Alternatives

* herumi/msoffice
* DocRecrypt
* Apache POI - the Java API for Microsoft Documents

## Use cases and mentions

### General

* (kudos to maintainers!)
*

### Malware/maldoc analysis

*
*

### CTF

*
*

### In other languages

*
*

## Contributors

*