Improving pyinstxtractor.py: a tool for decompiling exe files generated by PyInstaller

news/2024/5/19 20:57:52 Tags: python, pyinstaller, decompilation, pyinstxtractor, uncompyle6

How it came about

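I used the pyinstxtractor.py circulating online to extract an exe file generated by PyInstaller, but uncompyle6 could not decompile the extracted pyc files and reported an error.
Comparing the original pyc file with the extracted one showed the following:
Figure: comparing the pyc files in Notepad++
The file contents are identical, but the header of the extracted file differs from that of the original pyc file. (Note: in the figure above, the byte e3 marks the start of the pyc content; everything before it is the file header.)
Figure: the extracted file
Comparing the files in the PYZ_00…pyz_extracted folder next showed that their headers differ as well,
which indicates a bug in the pyinstxtractor.py found online.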
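
To make the header mismatch concrete, here is a minimal sketch that is not part of the original tool. It builds two pyc byte strings in memory, one with a valid header and one with a zeroed header standing in for a badly extracted file, then checks that only the headers differ. The 16-byte header size is an assumption that holds for Python 3.7 and later; `split_pyc` and the sample data are hypothetical names introduced for illustration.

```python
import importlib.util
import marshal

HEADER_SIZE = 16  # assumption: Python 3.7+ layout (magic, flags, mtime, source size)


def split_pyc(raw, header_size=HEADER_SIZE):
    """Split raw pyc bytes into (header, body)."""
    return raw[:header_size], raw[header_size:]


# The body of a pyc file is a marshalled code object.
code = compile("print('hello')", "<demo>", "exec")
body = marshal.dumps(code)
# Its first byte is the marshal type code 'c' (0x63), often with the 0x80
# flag set, which would give the e3 byte mentioned in the text.

# A well-formed pyc for the running interpreter.
original = importlib.util.MAGIC_NUMBER + b"\x00" * 12 + body
# Hypothetical stand-in for a file extracted with a wrong header.
extracted = b"\x00" * HEADER_SIZE + body

orig_header, orig_body = split_pyc(original)
extr_header, extr_body = split_pyc(extracted)

headers_differ = orig_header != extr_header
bodies_match = orig_body == extr_body
print("headers differ:", headers_differ)
print("bodies match:", bodies_match)

# Repairing the file only requires swapping in a correct header.
repaired = orig_header + extr_body
```

When the bodies match and only the headers differ, as in the comparison described above, replacing the header is enough to make the file loadable again.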
After careful analysis, I rewrote pyinstxtractor.py; the code follows.
(Note the modified parts and the comments.)

Source code

# coding:utf-8
# Adapted from the pyinstxtractor.py available online
r"""
PyInstaller Extractor v2.1 (Supports pyinstaller 3.3+, 3.2, 3.1, 3.0, 2.1, 2.0)
Author : Extreme Coders
E-mail : extremecoders(at)hotmail(dot)com
Web    : https://0xec.blogspot.com
Date   : 29-November-2017
Url    : https://sourceforge.net/projects/pyinstallerextractor/

For any suggestions, leave a comment on
https://forum.tuts4you.com/topic/34455-pyinstaller-extractor/

This script extracts a pyinstaller generated executable file.
Pyinstaller installation is not needed. The script has it all.

For best results, it is recommended to run this script in the
same version of python as was used to create the executable.
This is just to prevent unmarshalling errors(if any) while
extracting the PYZ archive.

Usage : Just copy this script to the directory where your exe resides
        and run the script with the exe file name as a parameter

C:\path\to\exe\>python pyinstxtractor.py <filename>
$ /path/to/exe/python pyinstxtractor.py <filename>

Licensed under GNU General Public License (GPL) v3.
You are free to modify this source.

CHANGELOG
================================================

Version 1.1 (Jan 28, 2014)
-------------------------------------------------
- First Release
- Supports only pyinstaller 2.0

Version 1.2 (Sept 12, 2015)
-------------------------------------------------
- Added support for pyinstaller 2.1 and 3.0 dev
- Cleaned up code
- Script is now more verbose
- Executable extracted within a dedicated sub-directory

(Support for pyinstaller 3.0 dev is experimental)

Version 1.3 (Dec 12, 2015)
-------------------------------------------------
- Added support for pyinstaller 3.0 final
- Script is compatible with both python 2.x & 3.x (Thanks to Moritz Kroll @ Avira Operations GmbH & Co. KG)

Version 1.4 (Jan 19, 2016)
-------------------------------------------------
- Fixed a bug when writing pyc files >= version 3.3 (Thanks to Daniello Alto: https://github.com/Djamana)

Version 1.5 (March 1, 2016)
-------------------------------------------------
- Added support for pyinstaller 3.1 (Thanks to Berwyn Hoyt for reporting)

Version 1.6 (Sept 5, 2016)
-------------------------------------------------
- Added support for pyinstaller 3.2
- Extractor will use a random name while extracting unnamed files.
- For encrypted pyz archives it will dump the contents as is. Previously, the tool would fail.

Version 1.7 (March 13, 2017)
-------------------------------------------------
- Made the script compatible with python 2.6 (Thanks to Ross for reporting)

Version 1.8 (April 28, 2017)
-------------------------------------------------
- Support for sub-directories in .pyz files (Thanks to Moritz Kroll @ Avira Operations GmbH & Co. KG)

Version 1.9 (November 29, 2017)
-------------------------------------------------
- Added support for pyinstaller 3.3
- Display the scripts which are run at entry (Thanks to Michael Gillespie @ malwarehunterteam for the feature request)

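***** Version 2.0 (2020-12-13) *****
- Fixed a bug when extracting pyc files.
***** Version 2.1.1 (2021-2-23) *****
- Fixed a bug when extracting pyc files from the PYZ; compatible with almost all Python 3 versions; pyz files can be extracted directly.
***** Version 2.2 (2022-7-25) *****
- Compatible with Python 3.10.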
"""

from __future__ import print_function
import os
import struct
import marshal
import zlib
import sys
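try:
    import imp
except ImportError:
    # imp was removed in Python 3.12; provide the one function this script uses
    import importlib.util
    class imp(object):
        get_magic = staticmethod(lambda: importlib.util.MAGIC_NUMBER)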
import types
from uuid import uuid4 as uniquename
# Newly added code
try:
    from xdis.magics import magics
except ImportError:
    print("Error: the xdis module must be installed with pip.")

__version__='2.2'

class CTOCEntry:
    def __init__(self, position, cmprsdDataSize, uncmprsdDataSize, cmprsFlag, typeCmprsData, name):
        self.position = position
        self.cmprsdDataSize = cmprsdDataSize
        self.uncmprsdDataSize = uncmprsdDataSize
        self.cmprsFlag = cmprsFlag
        self.typeCmprsData = typeCmprsData
        self.name = name


class PyInstArchive:
    PYINST20_COOKIE_SIZE = 24           # For pyinstaller 2.0
    PYINST21_COOKIE_SIZE = 24 + 64      # For pyinstaller 2.1+
    MAGIC = b'MEI\014\013\012\013\016'  # Magic number which identifies pyinstaller

    def __init__(self, path):
        self.filePath = path


    def open(self):
        try:
            self.fPtr = open(self.filePath, 'rb')
            self.fileSize = os.stat(self.filePath).st_size
        except:
            print('[*] Error: Could not open {0}'.format(self.filePath))
            return False
        return True


    def close(self):
        try:
            self.fPtr.close()
        except:
            pass


    def checkFile(self):
        print('[*] Processing {0}'.format(self.filePath))
        # Check if it is a 2.0 archive
        self.fPtr.seek(self.fileSize - self.PYINST20_COOKIE_SIZE, os.SEEK_SET)
        magicFromFile = self.fPtr.read(len(self.MAGIC))

        if magicFromFile == self.MAGIC:
            self.pyinstVer = 20     # pyinstaller 2.0
            print('[*] Pyinstaller version: 2.0')
            return True

        # Check for pyinstaller 2.1+ before bailing out
        self.fPtr.seek(self.fileSize - self.PYINST21_COOKIE_SIZE, os.SEEK_SET)
        magicFromFile = self.fPtr.read(len(self.MAGIC))

        if magicFromFile == self.MAGIC:
            print('[*] Pyinstaller version: 2.1+')
            self.pyinstVer = 21     # pyinstaller 2.1+
            return True

        print('[*] Error : Unsupported pyinstaller version or not a pyinstaller archive')
        return False


    def getCArchiveInfo(self):
        try:
            if self.pyinstVer == 20:
                self.fPtr.seek(self.fileSize - self.PYINST20_COOKIE_SIZE, os.SEEK_SET)

                # Read CArchive cookie
                (magic, lengthofPackage, toc, tocLen, self.pyver) = \
                struct.unpack('!8siiii', self.fPtr.read(self.PYINST20_COOKIE_SIZE))

            elif self.pyinstVer == 21:
                self.fPtr.seek(self.fileSize - self.PYINST21_COOKIE_SIZE, os.SEEK_SET)

                # Read CArchive cookie
                (magic, lengthofPackage, toc, tocLen, self.pyver, pylibname) = \
                struct.unpack('!8siiii64s', self.fPtr.read(self.PYINST21_COOKIE_SIZE))

        except:
            print('[*] Error : The file is not a pyinstaller archive')
            return False

        print('[*] Python version: {0}'.format(self.pyver))

        # Overlay is the data appended at the end of the PE
        self.overlaySize = lengthofPackage
        self.overlayPos = self.fileSize - self.overlaySize
        self.tableOfContentsPos = self.overlayPos + toc
        self.tableOfContentsSize = tocLen

        print('[*] Length of package: {0} bytes'.format(self.overlaySize))
        return True


    def parseTOC(self):
        # Go to the table of contents
        self.fPtr.seek(self.tableOfContentsPos, os.SEEK_SET)

        self.tocList = []
        parsedLen = 0

        # Parse table of contents
        while parsedLen < self.tableOfContentsSize:
            (entrySize, ) = struct.unpack('!i', self.fPtr.read(4))
            nameLen = struct.calcsize('!iiiiBc')

            (entryPos, cmprsdDataSize, uncmprsdDataSize, cmprsFlag, typeCmprsData, name) = \
            struct.unpack( \
                '!iiiBc{0}s'.format(entrySize - nameLen), \
                self.fPtr.read(entrySize - 4))

            name = name.decode('utf-8').rstrip('\0')
            if len(name) == 0:
                name = str(uniquename())
                print('[!] Warning: Found an unamed file in CArchive. Using random name {0}'.format(name))

            self.tocList.append( \
                                CTOCEntry(                      \
                                    self.overlayPos + entryPos, \
                                    cmprsdDataSize,             \
                                    uncmprsdDataSize,           \
                                    cmprsFlag,                  \
                                    typeCmprsData,              \
                                    name                        \
                                ))

            parsedLen += entrySize
        print('[*] Found {0} files in CArchive'.format(len(self.tocList)))



    def extractFiles(self):
        print('[*] Beginning extraction...please standby')
        extractionDir = os.path.join(os.getcwd(), os.path.basename(self.filePath) + '_extracted')

        if not os.path.exists(extractionDir):
            os.mkdir(extractionDir)

        os.chdir(extractionDir)
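        # Newly added code: build the pyc file header (magic number plus padding).
        # pyver is major*10+minor in older PyInstaller cookies (e.g. 37)
        # and major*100+minor in newer ones (e.g. 310 for Python 3.10).
        if self.pyver >= 100:
            pymaj, pymin = divmod(self.pyver, 100)
        else:
            pymaj, pymin = divmod(self.pyver, 10)
        magic = magics["%d.%d" % (pymaj, pymin)]
        if (pymaj, pymin) >= (3, 7):    # improved in version 2.2.1
            pycheader = magic + b'\x00' * 12    # flags, timestamp, source size
        elif (pymaj, pymin) >= (3, 3):
            pycheader = magic + b'\x00' * 8     # timestamp, source size
        else:
            pycheader = magic + b'\x00' * 4     # timestamp only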

        for entry in self.tocList:
            basePath = os.path.dirname(entry.name)
            if basePath != '':
                # Check if path exists, create if not
                if not os.path.exists(basePath):
                    os.makedirs(basePath)

            self.fPtr.seek(entry.position, os.SEEK_SET)
            data = self.fPtr.read(entry.cmprsdDataSize)

            if entry.cmprsFlag == 1:
                data = zlib.decompress(data)
                # Malware may tamper with the uncompressed size
                # Comment out the assertion in such a case
                assert len(data) == entry.uncmprsdDataSize # Sanity Check

            # Write every entry to disk; script entries get a pyc header prepended
            with open(entry.name, 'wb') as f:
                if entry.typeCmprsData == b's':
                    print('[+] Possible entry point: {0}'.format(entry.name))
                    f.write(pycheader+data)
                else:
                    f.write(data)

            if entry.typeCmprsData == b'z' or entry.typeCmprsData == b'Z':
                self._extractPyz(entry.name)

    # Added in version 2.1
    def _checkPyz(self,name):
        with open(name, 'rb') as f:
            pyzMagic = f.read(4)
            return pyzMagic == b'PYZ\0' # Sanity Check

    def _extractPyz(self, name):
        dirName =  name + '_extracted'
        # Create a directory for the contents of the pyz
        if not os.path.exists(dirName):
            os.mkdir(dirName)

        with open(name, 'rb') as f:
            pyzMagic = f.read(4)
            assert pyzMagic == b'PYZ\0' # Sanity Check

            pycHeader = f.read(4) # Python magic value

            if imp.get_magic() != pycHeader:
                print('[!] Warning: The script is running in a different python version than the one used to build the executable')
                print('    Run this script in Python{0} to prevent extraction errors(if any) during unmarshalling'.format(self.pyver))

            (tocPosition, ) = struct.unpack('!i', f.read(4))
            f.seek(tocPosition, os.SEEK_SET)

            try:
                toc = marshal.load(f)
            except:
                print('[!] Unmarshalling FAILED. Cannot extract {0}. Extracting remaining files.'.format(name))
                return

            print('[*] Found {0} files in PYZ archive'.format(len(toc)))

            # From pyinstaller 3.1+ toc is a list of tuples
            if type(toc) == list:
                toc = dict(toc)

            for key in toc.keys():
                (ispkg, pos, length) = toc[key]
                f.seek(pos, os.SEEK_SET)

                fileName = key
                try:
                    # for Python > 3.3 some keys are bytes object some are str object
                    fileName = key.decode('utf-8')
                except:
                    pass

                # Make sure destination directory exists, ensuring we keep inside dirName
                destName = os.path.join(dirName, fileName.replace("..", "__"))
                destDirName = os.path.dirname(destName)
                if not os.path.exists(destDirName):
                    os.makedirs(destDirName)

                try:
                    data = f.read(length)
                    data = zlib.decompress(data)
                except:
                    print('[!] Error: Failed to decompress {0}, probably encrypted. Extracting as is.'.format(fileName))
                    open(destName + '.pyc.encrypted', 'wb').write(data)
                    continue

                with open(destName + '.pyc', 'wb') as pycFile:
                    pycFile.write(pycHeader)      # Write pyc magic
                    pycFile.write(b'\0' * 4)      # Write timestamp

                    if self.pyver>=37: # improved in version 2.2.1
                        # original code: b'\0' * 4
                        pycFile.write(b'\0' * 8)
                    elif self.pyver>=33:
                        pycFile.write(b'\0' * 4) # Size parameter added in Python 3.3
                    pycFile.write(data)


def main():
    if len(sys.argv) < 2:
        print('[*] Usage: pyinstxtractor.py <filename>')

    else:
        arch = PyInstArchive(sys.argv[1])
        if arch.open():
            if arch.checkFile():
                if arch.getCArchiveInfo():
                    arch.parseTOC()
                    arch.extractFiles()
                    arch.close()
                    print('[*] Successfully extracted pyinstaller archive: {0}'.format(sys.argv[1]))
                    print('')
                    print('''You can now use a python decompiler \
on the pyc files within the extracted directory''')
                    # Added code
                    try:
                        import uncompyle6
                    except ImportError:
                        print("Warning: you may not have a pyc decompiler installed")

                    return
            # Added in version 2.1
            elif arch._checkPyz(sys.argv[1]):
                arch.pyver=100 # default pyver
                arch._extractPyz(sys.argv[1])

            arch.close()


if __name__ == '__main__':
    main()
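
The cookie parsing and header logic in the listing above can be tried in isolation. The following is a minimal sketch, not the author's code: it packs a synthetic PyInstaller 2.1+ cookie with the same `struct` format the script uses, parses it back, decodes the `pyver` field, and picks a pyc header size. The helper names (`parse_cookie`, `decode_pyver`, `pyc_header_size`) and all sample values are hypothetical; the decoding rule (values of 100 or more mean major*100+minor, smaller values mean major*10+minor) is stated as an assumption.

```python
import struct

MAGIC = b'MEI\014\013\012\013\016'   # same magic as in the script above
COOKIE_FMT = '!8siiii64s'            # PyInstaller 2.1+ cookie layout used above


def parse_cookie(cookie):
    """Unpack a CArchive cookie into a dict of its fields."""
    magic, pkg_len, toc_off, toc_len, pyver, pylib = struct.unpack(COOKIE_FMT, cookie)
    if magic != MAGIC:
        raise ValueError('not a PyInstaller cookie')
    return {
        'package_length': pkg_len,
        'toc_offset': toc_off,
        'toc_length': toc_len,
        'pyver': pyver,
        'pylibname': pylib.rstrip(b'\0').decode('utf-8'),
    }


def decode_pyver(pyver):
    """Assumption: >= 100 encodes major*100+minor, otherwise major*10+minor."""
    return divmod(pyver, 100) if pyver >= 100 else divmod(pyver, 10)


def pyc_header_size(major, minor):
    """Header size in bytes, including the 4-byte magic number."""
    if (major, minor) >= (3, 7):
        return 16   # magic, flags, timestamp, source size
    if (major, minor) >= (3, 3):
        return 12   # magic, timestamp, source size
    return 8        # magic, timestamp


# Synthetic cookie with made-up sizes, for demonstration only.
cookie = struct.pack(COOKIE_FMT, MAGIC, 5000, 4000, 900, 310, b'python310.dll')
info = parse_cookie(cookie)
version = decode_pyver(info['pyver'])
print(info['pylibname'], version, pyc_header_size(*version))
```

Working with a `(major, minor)` tuple rather than string slices keeps two-digit minor versions such as 3.10 from being misread.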

Using the uncompyle6 tool

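uncompyle6 is a Python library for decompiling pyc files.
On Windows, press Win+R and type cmd to open the Command Prompt.
First run: pip install uncompyle6
Then run: python -m uncompyle6 filename.pyc. After a short wait, the decompiled output is displayed.
To write the decompiled output to a specific py file instead, use: python -m uncompyle6 filename.pyc > output.py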
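
The same command can also be driven from Python. The sketch below only wraps the `python -m uncompyle6 filename.pyc > output.py` invocation shown above in `subprocess`; the function names are hypothetical, and it assumes uncompyle6 is installed for the interpreter that runs it.

```python
import os
import subprocess
import sys


def output_path_for(pyc_path):
    """Map foo.pyc to foo.py; any other name just gets '.py' appended."""
    root, ext = os.path.splitext(pyc_path)
    return root + '.py' if ext == '.pyc' else pyc_path + '.py'


def build_command(pyc_path):
    """Command line equivalent to: python -m uncompyle6 <pyc_path>."""
    return [sys.executable, '-m', 'uncompyle6', pyc_path]


def decompile(pyc_path):
    """Run uncompyle6 and redirect its output to the matching .py file.

    Returns True when the process exits with status 0.
    """
    with open(output_path_for(pyc_path), 'wb') as out:
        result = subprocess.run(build_command(pyc_path), stdout=out)
    return result.returncode == 0
```

Calling `decompile('filename.pyc')` would then write `filename.py` next to the input file.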

If you would rather skip these tedious steps, I wrote a script that calls uncompyle6 and can be run directly by double-clicking:

import sys,os,traceback
import uncompyle6.bin.uncompile as uncompiler
__version__='2.0.1'

def run_uncompile(filename):
    flag=False # tracks whether any warning or error was written to sys.stderr
    _w=sys.stderr.write
    def w(*arg,**kw):
        nonlocal flag
        flag=True
        _w(*arg,**kw)
    def start_check(): # start monitoring
        sys.stderr.write=w
    def end_check():  # stop monitoring
        sys.stderr.write=_w

    tofilename=filename[:-1]
    if os.path.isfile(tofilename):
        result=input("File %s already exists. Replace it? (y/n) "%tofilename)
        if not result.lower().startswith('y'):return
    try:
        sys.stdout=open(tofilename,"w",encoding="utf-8")
        sys.argv[1]=filename
        start_check()
        uncompiler.main_bin()
    except Exception:
        end_check()
        print("Failed to decompile %s; see %s for the error message"% (filename,tofilename)
              ,file=sys.stderr)
        #traceback.print_exc()
        traceback.print_exc(file=sys.stdout)
    else:
        end_check()
        if not flag:
            print("Decompiled %s successfully"%filename,file=sys.stderr)
        else:
            print("Failed to decompile %s: warnings or errors were reported"%filename,file=sys.stderr)
            print("Press Enter to continue...",end='',file=sys.stderr)
            input()
    finally:
        sys.stdout.close()

if __name__=="__main__":
    try:
        if len(sys.argv)>1:
            files=sys.argv[1:]
            sys.argv[0]=uncompiler.__file__
            sys.argv[1:]=['']
            for file in files:
                if not file.endswith(".pyc"):
                    print("Warning: %s may not be a pyc file"%file,file=sys.stderr)
                run_uncompile(file)
        else:
            file=input("Drag a file into this window and press Enter (or type a file name):\n").strip('"')
            sys.argv[0]=uncompiler.__file__
            sys.argv.append('')
            run_uncompile(file)
    finally:
        sys.stdout=sys.__stdout__

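Closing remarks
I wrote this pyinstxtractor.py in order to crack the Mulan (木兰) programming language …
The effort paid off: with uncompyle6 I successfully recovered the source code (available here: ulang - Gitcode).
That concludes this article; the above is the experience I want to pass on to those who come after.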

