pax_global_header 0000666 0000000 0000000 00000000064 13572204573 0014522 g ustar 00root root 0000000 0000000 52 comment=15bdb4ee3f0443e46d9cc03b6f1268fefc0433cf
pyelftools-0.26/ 0000775 0000000 0000000 00000000000 13572204573 0013651 5 ustar 00root root 0000000 0000000 pyelftools-0.26/.gitignore 0000664 0000000 0000000 00000000114 13572204573 0015635 0 ustar 00root root 0000000 0000000 *.pyc
.coverage
.tox
htmlcov
tags
build
dist
MANIFEST
*.sublime-workspace
pyelftools-0.26/.travis.yml 0000664 0000000 0000000 00000000154 13572204573 0015762 0 ustar 00root root 0000000 0000000 language: python
python:
- "2.7"
- "3.4"
- "3.5"
- "3.6"
- "3.7"
script: python test/all_tests.py
pyelftools-0.26/.vimrc 0000664 0000000 0000000 00000000422 13572204573 0014770 0 ustar 00root root 0000000 0000000 " Force indentation styles for this directory
autocmd FileType python set shiftwidth=4
autocmd FileType python set tabstop=4
autocmd FileType python set softtabstop=4
autocmd FileType c set shiftwidth=2
autocmd FileType c set tabstop=2
autocmd FileType c set softtabstop=2
pyelftools-0.26/CHANGES 0000664 0000000 0000000 00000007541 13572204573 0014653 0 ustar 00root root 0000000 0000000 Changelog
=========
+ Version 0.26 (2019.12.05)
- Call relocation for ARM v3 (#194)
- More complete architecture coverage for ENUM_E_MACHINE (#206)
- Support for .debug_pubtypes and .debug_pubnames sections (#208)
- Support for DWARF v4 location lists (#214)
- Decode strings in dynamic string tables (#217)
- Improve symbol table handling in dynamic segments (#219)
- Improved handling of location information (#225)
- Avoid deprecation warnings in Python 3.7+
- Add DWARF v5 OPs (#240)
- Handle many new translation forms and constants
- Lazy DIE parsing to speed up partial parsing of DWARF info (#249)
+ Version 0.25 (2018.09.01)
- Make parsing of SH_TYPE and PT_TYPE fields dependent on the machine
(e_machine header field), making it possible to support conflicting type
enums between different machines (#71 and #121).
- Add parsing and readelf dumping for .eh_frame (#155)
- Support compressed sections (#152)
- Better support for parsing core dumps (#147)
- More comprehensive handling of ARM relocations (#121)
- Convert all ascii encoding to utf-8 encoding (#182)
- Don't attempt to hex/string dump SHT_NOBITS sections in readelf (#119).
- Test with Python 3.6
- Minor bugfixes (#118)
- Cleanup: Use argparse instead of optparse
- Make readelf comparison tests run in parallel using multiprocessing; cuts
testing time 3-5x
- Improvements in MIPS flags handling (#165)
+ Version 0.24 (2016.08.04)
- Retrieve symbols by name - get_symbol_by_name (#58).
- Symbol/section names are strings internally now, not bytestrings (this may
affect API usage in Python 3) (#76).
- Added DT_MIPS_* constants to ENUM_D_TAG (#79)
- Made dwarf_decode_address example a bit more useful for command-line
invocation.
- More DWARF v4 support w.r.t decoding function ranges; DW_AT_high_pc value
is now either absolute or relative to DW_AT_low_pc, depending on the class
of the form encoded in the file. Also #89.
- Support for SHT_NOTE sections (#109)
- Support for .debug_aranges section (#108)
- Support for zlib-compressed debug sections (#102)
- Support for DWARF v4 line programs (#82)
+ Version 0.23 (2014.11.08)
- Minimal Python 2.x version raised to 2.7
- Basic support for MIPS (contributed by Karl Vogel).
- Support for PT_NOTE segment parsing (contributed by Alex Deymo).
- Support for parsing symbol table in dynamic segment
(contributed by Nam T. Nguyen).
+ Version 0.22 (2014.03.30)
- pyelftools repository moved to https://github.com/eliben/pyelftools
- Support for version sections - contributed by Yann Rouillard.
- Better ARM support (including AArch64) - contributed by Dobromir Stefanov.
- Added some initial support for parsing Solaris OpenCSW ELF files
(contributed by Yann Rouillard).
- Added some initial support for DWARF4 (as generated by gcc 4.8)
and DWARF generated by recent versions of Clang (3.3).
- Added the get_full_path utility method to DIEs that have an associated
file name / path (based on pull request #16 by Shaheed Haque).
- Set up Travis CI integration.
+ Version 0.21 (2013.04.17)
- Added new example: dwarf_decode_address - decode function name and
file & line information from an address.
- Issue #7: parsing incorrect DWARF was made a bit more forgiving for cases
where serialized DIE trees have extra NULLs at the end.
- Very initial support for ARM ELF files (Matthew Fernandez - pull
request #6).
- Support for dumping the dynamic section (Mike Frysinger - pull
request #7).
- Output of scripts/readelf.py now matches that of binutils 2.23.52.
- Added more machine EM_ values to ENUM_E_TYPE.
+ Version 0.20 (2012.01.27)
- Python 3 support
- Fixed some problems with running tests
- Issue #2: made all examples run (and test/run_examples_test.py pass)
on Windows.
+ Version 0.10 - Initial public release (2012.01.06)
pyelftools-0.26/LICENSE 0000664 0000000 0000000 00000002776 13572204573 0014672 0 ustar 00root root 0000000 0000000 pyelftools is in the public domain (see below if you need more details).
pyelftools uses the construct library for structured parsing of a binary
stream. construct is packaged in pyelftools/construct - see its LICENSE
file for the license.
-------------------------------------------------------------------------------
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or
distribute this software, either in source code form or as a compiled
binary, for any purpose, commercial or non-commercial, and by any
means.
In jurisdictions that recognize copyright laws, the author or authors
of this software dedicate any and all copyright interest in the
software to the public domain. We make this dedication for the benefit
of the public at large and to the detriment of our heirs and
successors. We intend this dedication to be an overt act of
relinquishment in perpetuity of all present and future rights to this
software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
For more information, please refer to
pyelftools-0.26/MANIFEST.in 0000664 0000000 0000000 00000000341 13572204573 0015405 0 ustar 00root root 0000000 0000000 recursive-include elftools *.py
recursive-include scripts *.py
recursive-include examples *.py *.elf *.out
recursive-include test *.py *.elf *.arm *.mips *.o
include README.rst
include LICENSE
include CHANGES
include tox.ini
pyelftools-0.26/README.rst 0000664 0000000 0000000 00000004056 13572204573 0015345 0 ustar 00root root 0000000 0000000 Introduction: what is pyelftools?
---------------------------------
**pyelftools** is a pure-Python library for parsing and analyzing ELF files
and DWARF debugging information. See the
`User's guide `_
for more details.
Pre-requisites
--------------
As a user of **pyelftools**, one only needs Python to run. It works with
Python versions 2.7 and 3.x (x >= 4). For hacking on **pyelftools** the
requirements are a bit more strict, please see the
`hacking guide `_.
Installing
----------
**pyelftools** can be installed from PyPI (Python package index)::
> pip install pyelftools
Alternatively, you can download the source distribution for the most recent and
historic versions from the *Downloads* tab on the `pyelftools project page
`_ (by going to *Tags*). Then, you can
install from source, as usual::
> python setup.py install
Since **pyelftools** is a work in progress, it's recommended to have the most
recent version of the code. This can be done by downloading the `master zip
file `_ or just
cloning the Git repository.
Since **pyelftools** has no external dependencies, it's also easy to use it
without installing, by locally adjusting ``PYTHONPATH``.
How to use it?
--------------
**pyelftools** is a regular Python library: you import and invoke it from your
own code. For a detailed usage guide and links to examples, please consult the
`user's guide `_.
License
-------
**pyelftools** is open source software. Its code is in the public domain. See
the ``LICENSE`` file for more details.
CI Status
---------
**pyelftools** has automatic testing enabled through the convenient
`Travis CI project `_. Here is the latest build status:
.. image:: https://travis-ci.org/eliben/pyelftools.png?branch=master
:align: center
:target: https://travis-ci.org/eliben/pyelftools
pyelftools-0.26/TODO 0000775 0000000 0000000 00000001723 13572204573 0014347 0 ustar 00root root 0000000 0000000 New version
-----------
* Update elftools/__init__.py
* Update setup.py
* Update CHANGES
* Tag in git (v0.xx)
construct
---------
construct seems to be maintained again - they also backported my Python 3 fixes.
Theoretically, I can remove construct from pyelftools and use it as a dependency
instead. I don't really have time to play with this now, but may do so in the
future.
Distribution
------------
1. First install Twine (https://packaging.python.org/tutorials/packaging-projects/)
2. python3 -m twine upload dist/*
Credentials for PyPI are stored in ~/.pypirc
Preparing a new release
-----------------------
* Run 'tox' tests (with '-r' to create new venvs)
* Make sure new version was updated everywhere appropriate
* Run ``python setup.py build sdist bdist_wheel`` (no 'upload' yet)
* Untar the created ``dist/pyelftools-x.y.tar.gz`` and make sure
everything looks ok
* Now build with upload to send it to PyPi
* Test with pip install from some new virtualenv
pyelftools-0.26/elftools/ 0000775 0000000 0000000 00000000000 13572204573 0015500 5 ustar 00root root 0000000 0000000 pyelftools-0.26/elftools/__init__.py 0000664 0000000 0000000 00000000413 13572204573 0017607 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
__version__ = '0.26'
pyelftools-0.26/elftools/common/ 0000775 0000000 0000000 00000000000 13572204573 0016770 5 ustar 00root root 0000000 0000000 pyelftools-0.26/elftools/common/__init__.py 0000664 0000000 0000000 00000000000 13572204573 0021067 0 ustar 00root root 0000000 0000000 pyelftools-0.26/elftools/common/construct_utils.py 0000664 0000000 0000000 00000005530 13572204573 0022611 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: common/construct_utils.py
#
# Some complementary construct utilities
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..construct import (
Subconstruct, ConstructError, ArrayError, Adapter, Field, RepeatUntil,
Rename
)
class RepeatUntilExcluding(Subconstruct):
""" A version of construct's RepeatUntil that doesn't include the last
element (which casued the repeat to exit) in the return value.
Only parsing is currently implemented.
P.S. removed some code duplication
"""
__slots__ = ["predicate"]
def __init__(self, predicate, subcon):
Subconstruct.__init__(self, subcon)
self.predicate = predicate
self._clear_flag(self.FLAG_COPY_CONTEXT)
self._set_flag(self.FLAG_DYNAMIC)
def _parse(self, stream, context):
obj = []
try:
context_for_subcon = context
if self.subcon.conflags & self.FLAG_COPY_CONTEXT:
context_for_subcon = context.__copy__()
while True:
subobj = self.subcon._parse(stream, context_for_subcon)
if self.predicate(subobj, context):
break
obj.append(subobj)
except ConstructError as ex:
raise ArrayError("missing terminator", ex)
return obj
def _build(self, obj, stream, context):
raise NotImplementedError('no building')
def _sizeof(self, context):
raise SizeofError("can't calculate size")
def _LEB128_reader():
""" Read LEB128 variable-length data from the stream. The data is terminated
by a byte with 0 in its highest bit.
"""
return RepeatUntil(
lambda obj, ctx: ord(obj) < 0x80,
Field(None, 1))
class _ULEB128Adapter(Adapter):
""" An adapter for ULEB128, given a sequence of bytes in a sub-construct.
"""
def _decode(self, obj, context):
value = 0
for b in reversed(obj):
value = (value << 7) + (ord(b) & 0x7F)
return value
class _SLEB128Adapter(Adapter):
""" An adapter for SLEB128, given a sequence of bytes in a sub-construct.
"""
def _decode(self, obj, context):
value = 0
for b in reversed(obj):
value = (value << 7) + (ord(b) & 0x7F)
if ord(obj[-1]) & 0x40:
# negative -> sign extend
value |= - (1 << (7 * len(obj)))
return value
def ULEB128(name):
""" A construct creator for ULEB128 encoding.
"""
return Rename(name, _ULEB128Adapter(_LEB128_reader()))
def SLEB128(name):
""" A construct creator for SLEB128 encoding.
"""
return Rename(name, _SLEB128Adapter(_LEB128_reader()))
pyelftools-0.26/elftools/common/exceptions.py 0000664 0000000 0000000 00000001011 13572204573 0021514 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: common/exceptions.py
#
# Exception classes for elftools
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
class ELFError(Exception):
pass
class ELFRelocationError(ELFError):
pass
class ELFParseError(ELFError):
pass
class ELFCompressionError(ELFError):
pass
class DWARFError(Exception):
pass
pyelftools-0.26/elftools/common/py3compat.py 0000664 0000000 0000000 00000003737 13572204573 0021273 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: common/py3compat.py
#
# Python 2/3 compatibility code
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
import sys
PY3 = sys.version_info[0] == 3
if PY3:
import io
StringIO = io.StringIO
BytesIO = io.BytesIO
# Functions for acting on bytestrings and strings. In Python 2 and 3,
# strings and bytes are the same and chr/ord can be used to convert between
# numeric byte values and their string pepresentations. In Python 3, bytes
# and strings are different types and bytes hold numeric values when
# iterated over.
def bytes2str(b): return b.decode('latin-1')
def str2bytes(s): return s.encode('latin-1')
def int2byte(i): return bytes((i,))
def byte2int(b): return b
def iterbytes(b):
"""Return an iterator over the elements of a bytes object.
For example, for b'abc' yields b'a', b'b' and then b'c'.
"""
for i in range(len(b)):
yield b[i:i+1]
ifilter = filter
maxint = sys.maxsize
else:
import cStringIO
StringIO = BytesIO = cStringIO.StringIO
def bytes2str(b): return b
def str2bytes(s): return s
int2byte = chr
byte2int = ord
def iterbytes(b):
return iter(b)
from itertools import ifilter
maxint = sys.maxint
def iterkeys(d):
"""Return an iterator over the keys of a dictionary."""
return getattr(d, 'keys' if PY3 else 'iterkeys')()
def itervalues(d):
"""Return an iterator over the values of a dictionary."""
return getattr(d, 'values' if PY3 else 'itervalues')()
def iteritems(d):
"""Return an iterator over the items of a dictionary."""
return getattr(d, 'items' if PY3 else 'iteritems')()
try:
from collections.abc import Mapping # python >= 3.3
except ImportError:
from collections import Mapping # python < 3.3
pyelftools-0.26/elftools/common/utils.py 0000664 0000000 0000000 00000006621 13572204573 0020507 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: common/utils.py
#
# Miscellaneous utilities for elftools
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from contextlib import contextmanager
from .exceptions import ELFParseError, ELFError, DWARFError
from .py3compat import int2byte
from ..construct import ConstructError
def merge_dicts(*dicts):
"Given any number of dicts, merges them into a new one."""
result = {}
for d in dicts:
result.update(d)
return result
def bytelist2string(bytelist):
""" Convert a list of byte values (e.g. [0x10 0x20 0x00]) to a bytes object
(e.g. b'\x10\x20\x00').
"""
return b''.join(int2byte(b) for b in bytelist)
def struct_parse(struct, stream, stream_pos=None):
""" Convenience function for using the given struct to parse a stream.
If stream_pos is provided, the stream is seeked to this position before
the parsing is done. Otherwise, the current position of the stream is
used.
Wraps the error thrown by construct with ELFParseError.
"""
try:
if stream_pos is not None:
stream.seek(stream_pos)
return struct.parse_stream(stream)
except ConstructError as e:
raise ELFParseError(str(e))
def parse_cstring_from_stream(stream, stream_pos=None):
""" Parse a C-string from the given stream. The string is returned without
the terminating \x00 byte. If the terminating byte wasn't found, None
is returned (the stream is exhausted).
If stream_pos is provided, the stream is seeked to this position before
the parsing is done. Otherwise, the current position of the stream is
used.
Note: a bytes object is returned here, because this is what's read from
the binary file.
"""
if stream_pos is not None:
stream.seek(stream_pos)
CHUNKSIZE = 64
chunks = []
found = False
while True:
chunk = stream.read(CHUNKSIZE)
end_index = chunk.find(b'\x00')
if end_index >= 0:
chunks.append(chunk[:end_index])
found = True
break
else:
chunks.append(chunk)
if len(chunk) < CHUNKSIZE:
break
return b''.join(chunks) if found else None
def elf_assert(cond, msg=''):
""" Assert that cond is True, otherwise raise ELFError(msg)
"""
_assert_with_exception(cond, msg, ELFError)
def dwarf_assert(cond, msg=''):
""" Assert that cond is True, otherwise raise DWARFError(msg)
"""
_assert_with_exception(cond, msg, DWARFError)
@contextmanager
def preserve_stream_pos(stream):
""" Usage:
# stream has some position FOO (return value of stream.tell())
with preserve_stream_pos(stream):
# do stuff that manipulates the stream
# stream still has position FOO
"""
saved_pos = stream.tell()
yield
stream.seek(saved_pos)
def roundup(num, bits):
""" Round up a number to nearest multiple of 2^bits. The result is a number
where the least significant bits passed in bits are 0.
"""
return (num - 1 | (1 << bits) - 1) + 1
#------------------------- PRIVATE -------------------------
def _assert_with_exception(cond, msg, exception_type):
if not cond:
raise exception_type(msg)
pyelftools-0.26/elftools/construct/ 0000775 0000000 0000000 00000000000 13572204573 0017524 5 ustar 00root root 0000000 0000000 pyelftools-0.26/elftools/construct/LICENSE 0000664 0000000 0000000 00000002072 13572204573 0020532 0 ustar 00root root 0000000 0000000 Copyright (C) 2009 Tomer Filiba, 2010-2011 Corbin Simpson
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
pyelftools-0.26/elftools/construct/README 0000664 0000000 0000000 00000001276 13572204573 0020412 0 ustar 00root root 0000000 0000000 construct is a Python library for declarative parsing and building of binary
data. This is my fork of construct 2, with some modifications for Python 3
and bug fixes. The construct website is http://construct.readthedocs.org
pyelftools carries construct around because construct has been abandoned for
a long time and didn't get bugfixes; it also didn't work with Python 3.
These days (Feb 2018) construct is maintained again, but its APIs have
underwent extensive changes that would require rewriting all of the
construct-facing code in pyelftools. I'm still evaluating the pros/cons of
this effort. See https://github.com/eliben/pyelftools/issues/180 for details.
LICENSE is the original license.
pyelftools-0.26/elftools/construct/__init__.py 0000664 0000000 0000000 00000011002 13572204573 0021627 0 ustar 00root root 0000000 0000000 """
#### ####
## #### ## ## #### ###### ##### ## ## #### ###### ## ##
## ## ## ### ## ## ## ## ## ## ## ## ## #### ##
## ## ## ###### ### ## ##### ## ## ## ## ##
## ## ## ## ### ## ## ## ## ## ## ## ## ##
#### #### ## ## #### ## ## ## ##### #### ## ######
Parsing made even more fun (and faster too)
Homepage:
http://construct.wikispaces.com (including online tutorial)
Typical usage:
>>> from construct import *
Hands-on example:
>>> from construct import *
>>> s = Struct("foo",
... UBInt8("a"),
... UBInt16("b"),
... )
>>> s.parse("\\x01\\x02\\x03")
Container(a = 1, b = 515)
>>> print s.parse("\\x01\\x02\\x03")
Container:
a = 1
b = 515
>>> s.build(Container(a = 1, b = 0x0203))
"\\x01\\x02\\x03"
"""
from .core import *
from .adapters import *
from .macros import *
from .debug import Probe, Debugger
#===============================================================================
# Metadata
#===============================================================================
__author__ = "tomer filiba (tomerfiliba [at] gmail.com)"
__maintainer__ = "Corbin Simpson "
__version__ = "2.06"
#===============================================================================
# Shorthand expressions
#===============================================================================
Bits = BitField
Byte = UBInt8
Bytes = Field
Const = ConstAdapter
Tunnel = TunnelAdapter
Embed = Embedded
#===============================================================================
# Deprecated names
# Next scheduled name cleanout: 2.1
#===============================================================================
import functools, warnings
def deprecated(f):
@functools.wraps(f)
def wrapper(*args, **kwargs):
warnings.warn(
"This name is deprecated, use %s instead" % f.__name__,
DeprecationWarning, stacklevel=2)
return f(*args, **kwargs)
return wrapper
MetaBytes = deprecated(MetaField)
GreedyRepeater = deprecated(GreedyRange)
OptionalGreedyRepeater = deprecated(OptionalGreedyRange)
Repeater = deprecated(Range)
StrictRepeater = deprecated(Array)
MetaRepeater = deprecated(Array)
OneOfValidator = deprecated(OneOf)
NoneOfValidator = deprecated(NoneOf)
#===============================================================================
# exposed names
#===============================================================================
__all__ = [
'AdaptationError', 'Adapter', 'Alias', 'Aligned', 'AlignedStruct',
'Anchor', 'Array', 'ArrayError', 'BFloat32', 'BFloat64', 'Bit', 'BitField',
'BitIntegerAdapter', 'BitIntegerError', 'BitStruct', 'Bits', 'Bitwise',
'Buffered', 'Byte', 'Bytes', 'CString', 'CStringAdapter', 'Const',
'ConstAdapter', 'ConstError', 'Construct', 'ConstructError', 'Container',
'Debugger', 'Embed', 'Embedded', 'EmbeddedBitStruct', 'Enum', 'ExprAdapter',
'Field', 'FieldError', 'Flag', 'FlagsAdapter', 'FlagsContainer',
'FlagsEnum', 'FormatField', 'GreedyRange', 'GreedyRepeater',
'HexDumpAdapter', 'If', 'IfThenElse', 'IndexingAdapter', 'LFloat32',
'LFloat64', 'LazyBound', 'LengthValueAdapter', 'ListContainer',
'MappingAdapter', 'MappingError', 'MetaArray', 'MetaBytes', 'MetaField',
'MetaRepeater', 'NFloat32', 'NFloat64', 'Nibble', 'NoneOf',
'NoneOfValidator', 'Octet', 'OnDemand', 'OnDemandPointer', 'OneOf',
'OneOfValidator', 'OpenRange', 'Optional', 'OptionalGreedyRange',
'OptionalGreedyRepeater', 'PaddedStringAdapter', 'Padding',
'PaddingAdapter', 'PaddingError', 'PascalString', 'Pass', 'Peek',
'Pointer', 'PrefixedArray', 'Probe', 'Range', 'RangeError', 'Reconfig',
'Rename', 'RepeatUntil', 'Repeater', 'Restream', 'SBInt16', 'SBInt32',
'SBInt64', 'SBInt8', 'SLInt16', 'SLInt32', 'SLInt64', 'SLInt8', 'SNInt16',
'SNInt32', 'SNInt64', 'SNInt8', 'Select', 'SelectError', 'Sequence',
'SizeofError', 'SlicingAdapter', 'StaticField', 'StrictRepeater', 'String',
'StringAdapter', 'Struct', 'Subconstruct', 'Switch', 'SwitchError',
'SymmetricMapping', 'Terminator', 'TerminatorError', 'Tunnel',
'TunnelAdapter', 'UBInt16', 'UBInt32', 'UBInt64', 'UBInt8', 'ULInt16',
'ULInt32', 'ULInt64', 'ULInt8', 'UNInt16', 'UNInt32', 'UNInt64', 'UNInt8',
'Union', 'ValidationError', 'Validator', 'Value', "Magic",
]
pyelftools-0.26/elftools/construct/adapters.py 0000664 0000000 0000000 00000040333 13572204573 0021704 0 ustar 00root root 0000000 0000000 from .core import Adapter, AdaptationError, Pass
from .lib import int_to_bin, bin_to_int, swap_bytes
from .lib import FlagsContainer, HexString
from .lib.py3compat import BytesIO, decodebytes
#===============================================================================
# exceptions
#===============================================================================
class BitIntegerError(AdaptationError):
__slots__ = []
class MappingError(AdaptationError):
__slots__ = []
class ConstError(AdaptationError):
__slots__ = []
class ValidationError(AdaptationError):
__slots__ = []
class PaddingError(AdaptationError):
__slots__ = []
#===============================================================================
# adapters
#===============================================================================
class BitIntegerAdapter(Adapter):
"""
Adapter for bit-integers (converts bitstrings to integers, and vice versa).
See BitField.
Parameters:
* subcon - the subcon to adapt
* width - the size of the subcon, in bits
* swapped - whether to swap byte order (little endian/big endian).
default is False (big endian)
* signed - whether the value is signed (two's complement). the default
is False (unsigned)
* bytesize - number of bits per byte, used for byte-swapping (if swapped).
default is 8.
"""
__slots__ = ["width", "swapped", "signed", "bytesize"]
def __init__(self, subcon, width, swapped = False, signed = False,
bytesize = 8):
Adapter.__init__(self, subcon)
self.width = width
self.swapped = swapped
self.signed = signed
self.bytesize = bytesize
def _encode(self, obj, context):
if obj < 0 and not self.signed:
raise BitIntegerError("object is negative, but field is not signed",
obj)
obj2 = int_to_bin(obj, width = self.width)
if self.swapped:
obj2 = swap_bytes(obj2, bytesize = self.bytesize)
return obj2
def _decode(self, obj, context):
if self.swapped:
obj = swap_bytes(obj, bytesize = self.bytesize)
return bin_to_int(obj, signed = self.signed)
class MappingAdapter(Adapter):
"""
Adapter that maps objects to other objects.
See SymmetricMapping and Enum.
Parameters:
* subcon - the subcon to map
* decoding - the decoding (parsing) mapping (a dict)
* encoding - the encoding (building) mapping (a dict)
* decdefault - the default return value when the object is not found
in the decoding mapping. if no object is given, an exception is raised.
if `Pass` is used, the unmapped object will be passed as-is
* encdefault - the default return value when the object is not found
in the encoding mapping. if no object is given, an exception is raised.
if `Pass` is used, the unmapped object will be passed as-is
"""
__slots__ = ["encoding", "decoding", "encdefault", "decdefault"]
def __init__(self, subcon, decoding, encoding,
decdefault = NotImplemented, encdefault = NotImplemented):
Adapter.__init__(self, subcon)
self.decoding = decoding
self.encoding = encoding
self.decdefault = decdefault
self.encdefault = encdefault
def _encode(self, obj, context):
try:
return self.encoding[obj]
except (KeyError, TypeError):
if self.encdefault is NotImplemented:
raise MappingError("no encoding mapping for %r [%s]" % (
obj, self.subcon.name))
if self.encdefault is Pass:
return obj
return self.encdefault
def _decode(self, obj, context):
try:
return self.decoding[obj]
except (KeyError, TypeError):
if self.decdefault is NotImplemented:
raise MappingError("no decoding mapping for %r [%s]" % (
obj, self.subcon.name))
if self.decdefault is Pass:
return obj
return self.decdefault
class FlagsAdapter(Adapter):
"""
Adapter for flag fields. Each flag is extracted from the number, resulting
in a FlagsContainer object. Not intended for direct usage.
See FlagsEnum.
Parameters
* subcon - the subcon to extract
* flags - a dictionary mapping flag-names to their value
"""
__slots__ = ["flags"]
def __init__(self, subcon, flags):
Adapter.__init__(self, subcon)
self.flags = flags
def _encode(self, obj, context):
flags = 0
for name, value in self.flags.items():
if getattr(obj, name, False):
flags |= value
return flags
def _decode(self, obj, context):
obj2 = FlagsContainer()
for name, value in self.flags.items():
setattr(obj2, name, bool(obj & value))
return obj2
class StringAdapter(Adapter):
"""
Adapter for strings. Converts a sequence of characters into a python
string, and optionally handles character encoding.
See String.
Parameters:
* subcon - the subcon to convert
* encoding - the character encoding name (e.g., "utf8"), or None to
return raw bytes (usually 8-bit ASCII).
"""
__slots__ = ["encoding"]
def __init__(self, subcon, encoding = None):
Adapter.__init__(self, subcon)
self.encoding = encoding
def _encode(self, obj, context):
if self.encoding:
obj = obj.encode(self.encoding)
return obj
def _decode(self, obj, context):
if self.encoding:
obj = obj.decode(self.encoding)
return obj
class PaddedStringAdapter(Adapter):
r"""
Adapter for padded strings.
See String.
Parameters:
* subcon - the subcon to adapt
* padchar - the padding character. default is b"\x00".
* paddir - the direction where padding is placed ("right", "left", or
"center"). the default is "right".
* trimdir - the direction where trimming will take place ("right" or
"left"). the default is "right". trimming is only meaningful for
building, when the given string is too long.
"""
__slots__ = ["padchar", "paddir", "trimdir"]
def __init__(self, subcon, padchar = b"\x00", paddir = "right",
trimdir = "right"):
if paddir not in ("right", "left", "center"):
raise ValueError("paddir must be 'right', 'left' or 'center'",
paddir)
if trimdir not in ("right", "left"):
raise ValueError("trimdir must be 'right' or 'left'", trimdir)
Adapter.__init__(self, subcon)
self.padchar = padchar
self.paddir = paddir
self.trimdir = trimdir
def _decode(self, obj, context):
if self.paddir == "right":
obj = obj.rstrip(self.padchar)
elif self.paddir == "left":
obj = obj.lstrip(self.padchar)
else:
obj = obj.strip(self.padchar)
return obj
def _encode(self, obj, context):
size = self._sizeof(context)
if self.paddir == "right":
obj = obj.ljust(size, self.padchar)
elif self.paddir == "left":
obj = obj.rjust(size, self.padchar)
else:
obj = obj.center(size, self.padchar)
if len(obj) > size:
if self.trimdir == "right":
obj = obj[:size]
else:
obj = obj[-size:]
return obj
class LengthValueAdapter(Adapter):
"""
Adapter for length-value pairs. It extracts only the value from the
pair, and calculates the length based on the value.
See PrefixedArray and PascalString.
Parameters:
* subcon - the subcon returning a length-value pair
"""
__slots__ = []
def _encode(self, obj, context):
return (len(obj), obj)
def _decode(self, obj, context):
return obj[1]
class CStringAdapter(StringAdapter):
r"""
Adapter for C-style strings (strings terminated by a terminator char).
Parameters:
* subcon - the subcon to convert
* terminators - a sequence of terminator chars. default is b"\x00".
* encoding - the character encoding to use (e.g., "utf8"), or None to
return raw-bytes. the terminator characters are not affected by the
encoding.
"""
__slots__ = ["terminators"]
def __init__(self, subcon, terminators = b"\x00", encoding = None):
StringAdapter.__init__(self, subcon, encoding = encoding)
self.terminators = terminators
def _encode(self, obj, context):
return StringAdapter._encode(self, obj, context) + self.terminators[0:1]
def _decode(self, obj, context):
return StringAdapter._decode(self, b''.join(obj[:-1]), context)
class TunnelAdapter(Adapter):
"""
Adapter for tunneling (as in protocol tunneling). A tunnel is construct
nested upon another (layering). For parsing, the lower layer first parses
the data (note: it must return a string!), then the upper layer is called
to parse that data (bottom-up). For building it works in a top-down manner;
first the upper layer builds the data, then the lower layer takes it and
writes it to the stream.
Parameters:
* subcon - the lower layer subcon
* inner_subcon - the upper layer (tunneled/nested) subcon
Example:
# a pascal string containing compressed data (zlib encoding), so first
# the string is read, decompressed, and finally re-parsed as an array
# of UBInt16
TunnelAdapter(
PascalString("data", encoding = "zlib"),
GreedyRange(UBInt16("elements"))
)
"""
__slots__ = ["inner_subcon"]
def __init__(self, subcon, inner_subcon):
Adapter.__init__(self, subcon)
self.inner_subcon = inner_subcon
def _decode(self, obj, context):
return self.inner_subcon._parse(BytesIO(obj), context)
def _encode(self, obj, context):
stream = BytesIO()
self.inner_subcon._build(obj, stream, context)
return stream.getvalue()
class ExprAdapter(Adapter):
"""
A generic adapter that accepts 'encoder' and 'decoder' as parameters. You
can use ExprAdapter instead of writing a full-blown class when only a
simple expression is needed.
Parameters:
* subcon - the subcon to adapt
* encoder - a function that takes (obj, context) and returns an encoded
version of obj
* decoder - a function that takes (obj, context) and returns an decoded
version of obj
Example:
ExprAdapter(UBInt8("foo"),
encoder = lambda obj, ctx: obj / 4,
decoder = lambda obj, ctx: obj * 4,
)
"""
__slots__ = ["_encode", "_decode"]
def __init__(self, subcon, encoder, decoder):
Adapter.__init__(self, subcon)
self._encode = encoder
self._decode = decoder
class HexDumpAdapter(Adapter):
"""
Adapter for hex-dumping strings. It returns a HexString, which is a string
"""
__slots__ = ["linesize"]
def __init__(self, subcon, linesize = 16):
Adapter.__init__(self, subcon)
self.linesize = linesize
def _encode(self, obj, context):
return obj
def _decode(self, obj, context):
return HexString(obj, linesize = self.linesize)
class ConstAdapter(Adapter):
"""
Adapter for enforcing a constant value ("magic numbers"). When decoding,
the return value is checked; when building, the value is substituted in.
Parameters:
* subcon - the subcon to validate
* value - the expected value
Example:
Const(Field("signature", 2), "MZ")
"""
__slots__ = ["value"]
def __init__(self, subcon, value):
Adapter.__init__(self, subcon)
self.value = value
def _encode(self, obj, context):
if obj is None or obj == self.value:
return self.value
else:
raise ConstError("expected %r, found %r" % (self.value, obj))
def _decode(self, obj, context):
if obj != self.value:
raise ConstError("expected %r, found %r" % (self.value, obj))
return obj
class SlicingAdapter(Adapter):
"""
Adapter for slicing a list (getting a slice from that list)
Parameters:
* subcon - the subcon to slice
* start - start index
* stop - stop index (or None for up-to-end)
* step - step (or None for every element)
"""
__slots__ = ["start", "stop", "step"]
def __init__(self, subcon, start, stop = None):
Adapter.__init__(self, subcon)
self.start = start
self.stop = stop
def _encode(self, obj, context):
if self.start is None:
return obj
return [None] * self.start + obj
def _decode(self, obj, context):
return obj[self.start:self.stop]
class IndexingAdapter(Adapter):
"""
Adapter for indexing a list (getting a single item from that list)
Parameters:
* subcon - the subcon to index
* index - the index of the list to get
"""
__slots__ = ["index"]
def __init__(self, subcon, index):
Adapter.__init__(self, subcon)
if type(index) is not int:
raise TypeError("index must be an integer", type(index))
self.index = index
def _encode(self, obj, context):
return [None] * self.index + [obj]
def _decode(self, obj, context):
return obj[self.index]
class PaddingAdapter(Adapter):
r"""
Adapter for padding.
Parameters:
* subcon - the subcon to pad
* pattern - the padding pattern (character as byte). default is b"\x00"
* strict - whether or not to verify, during parsing, that the given
padding matches the padding pattern. default is False (unstrict)
"""
__slots__ = ["pattern", "strict"]
def __init__(self, subcon, pattern = b"\x00", strict = False):
Adapter.__init__(self, subcon)
self.pattern = pattern
self.strict = strict
def _encode(self, obj, context):
return self._sizeof(context) * self.pattern
def _decode(self, obj, context):
if self.strict:
expected = self._sizeof(context) * self.pattern
if obj != expected:
raise PaddingError("expected %r, found %r" % (expected, obj))
return obj
#===============================================================================
# validators
#===============================================================================
class Validator(Adapter):
"""
Abstract class: validates a condition on the encoded/decoded object.
Override _validate(obj, context) in deriving classes.
Parameters:
* subcon - the subcon to validate
"""
__slots__ = []
def _decode(self, obj, context):
if not self._validate(obj, context):
raise ValidationError("invalid object", obj)
return obj
def _encode(self, obj, context):
return self._decode(obj, context)
def _validate(self, obj, context):
raise NotImplementedError()
class OneOf(Validator):
"""
Validates that the object is one of the listed values.
:param ``Construct`` subcon: object to validate
:param iterable valids: a set of valid values
>>> OneOf(UBInt8("foo"), [4,5,6,7]).parse("\\x05")
5
>>> OneOf(UBInt8("foo"), [4,5,6,7]).parse("\\x08")
Traceback (most recent call last):
...
construct.core.ValidationError: ('invalid object', 8)
>>>
>>> OneOf(UBInt8("foo"), [4,5,6,7]).build(5)
'\\x05'
>>> OneOf(UBInt8("foo"), [4,5,6,7]).build(9)
Traceback (most recent call last):
...
construct.core.ValidationError: ('invalid object', 9)
"""
__slots__ = ["valids"]
def __init__(self, subcon, valids):
Validator.__init__(self, subcon)
self.valids = valids
def _validate(self, obj, context):
return obj in self.valids
class NoneOf(Validator):
"""
Validates that the object is none of the listed values.
:param ``Construct`` subcon: object to validate
:param iterable invalids: a set of invalid values
>>> NoneOf(UBInt8("foo"), [4,5,6,7]).parse("\\x08")
8
>>> NoneOf(UBInt8("foo"), [4,5,6,7]).parse("\\x06")
Traceback (most recent call last):
...
construct.core.ValidationError: ('invalid object', 6)
"""
__slots__ = ["invalids"]
def __init__(self, subcon, invalids):
Validator.__init__(self, subcon)
self.invalids = invalids
def _validate(self, obj, context):
return obj not in self.invalids
pyelftools-0.26/elftools/construct/core.py 0000664 0000000 0000000 00000126724 13572204573 0021042 0 ustar 00root root 0000000 0000000 from struct import Struct as Packer
from .lib.py3compat import BytesIO, advance_iterator, bchr
from .lib import Container, ListContainer, LazyContainer
#===============================================================================
# exceptions
#===============================================================================
class ConstructError(Exception):
__slots__ = []
class FieldError(ConstructError):
__slots__ = []
class SizeofError(ConstructError):
__slots__ = []
class AdaptationError(ConstructError):
__slots__ = []
class ArrayError(ConstructError):
__slots__ = []
class RangeError(ConstructError):
__slots__ = []
class SwitchError(ConstructError):
__slots__ = []
class SelectError(ConstructError):
__slots__ = []
class TerminatorError(ConstructError):
__slots__ = []
#===============================================================================
# abstract constructs
#===============================================================================
class Construct(object):
"""
The mother of all constructs.
This object is generally not directly instantiated, and it does not
directly implement parsing and building, so it is largely only of interest
to subclass implementors.
The external user API:
* parse()
* parse_stream()
* build()
* build_stream()
* sizeof()
Subclass authors should not override the external methods. Instead,
another API is available:
* _parse()
* _build()
* _sizeof()
There is also a flag API:
* _set_flag()
* _clear_flag()
* _inherit_flags()
* _is_flag()
And stateful copying:
* __getstate__()
* __setstate__()
Attributes and Inheritance
==========================
All constructs have a name and flags. The name is used for naming struct
members and context dictionaries. Note that the name can either be a
string, or None if the name is not needed. A single underscore ("_") is a
reserved name, and so are names starting with a less-than character ("<").
The name should be descriptive, short, and valid as a Python identifier,
although these rules are not enforced.
The flags specify additional behavioral information about this construct.
Flags are used by enclosing constructs to determine a proper course of
action. Flags are inherited by default, from inner subconstructs to outer
constructs. The enclosing construct may set new flags or clear existing
ones, as necessary.
For example, if FLAG_COPY_CONTEXT is set, repeaters will pass a copy of
the context for each iteration, which is necessary for OnDemand parsing.
"""
FLAG_COPY_CONTEXT = 0x0001
FLAG_DYNAMIC = 0x0002
FLAG_EMBED = 0x0004
FLAG_NESTING = 0x0008
__slots__ = ["name", "conflags"]
def __init__(self, name, flags = 0):
if name is not None:
if type(name) is not str:
raise TypeError("name must be a string or None", name)
if name == "_" or name.startswith("<"):
raise ValueError("reserved name", name)
self.name = name
self.conflags = flags
def __repr__(self):
return "%s(%r)" % (self.__class__.__name__, self.name)
def _set_flag(self, flag):
"""
Set the given flag or flags.
:param int flag: flag to set; may be OR'd combination of flags
"""
self.conflags |= flag
def _clear_flag(self, flag):
"""
Clear the given flag or flags.
:param int flag: flag to clear; may be OR'd combination of flags
"""
self.conflags &= ~flag
def _inherit_flags(self, *subcons):
"""
Pull flags from subconstructs.
"""
for sc in subcons:
self._set_flag(sc.conflags)
def _is_flag(self, flag):
"""
Check whether a given flag is set.
:param int flag: flag to check
"""
return bool(self.conflags & flag)
def __getstate__(self):
"""
Obtain a dictionary representing this construct's state.
"""
attrs = {}
if hasattr(self, "__dict__"):
attrs.update(self.__dict__)
slots = []
c = self.__class__
while c is not None:
if hasattr(c, "__slots__"):
slots.extend(c.__slots__)
c = c.__base__
for name in slots:
if hasattr(self, name):
attrs[name] = getattr(self, name)
return attrs
def __setstate__(self, attrs):
"""
Set this construct's state to a given state.
"""
for name, value in attrs.items():
setattr(self, name, value)
def __copy__(self):
"""returns a copy of this construct"""
self2 = object.__new__(self.__class__)
self2.__setstate__(self.__getstate__())
return self2
def parse(self, data):
"""
Parse an in-memory buffer.
Strings, buffers, memoryviews, and other complete buffers can be
parsed with this method.
"""
return self.parse_stream(BytesIO(data))
def parse_stream(self, stream):
"""
Parse a stream.
Files, pipes, sockets, and other streaming sources of data are handled
by this method.
"""
return self._parse(stream, Container())
def _parse(self, stream, context):
"""
Override me in your subclass.
"""
raise NotImplementedError()
def build(self, obj):
"""
Build an object in memory.
"""
stream = BytesIO()
self.build_stream(obj, stream)
return stream.getvalue()
def build_stream(self, obj, stream):
"""
Build an object directly into a stream.
"""
self._build(obj, stream, Container())
def _build(self, obj, stream, context):
"""
Override me in your subclass.
"""
raise NotImplementedError()
def sizeof(self, context=None):
"""
Calculate the size of this object, optionally using a context.
Some constructs have no fixed size and can only know their size for a
given hunk of data; these constructs will raise an error if they are
not passed a context.
:param ``Container`` context: contextual data
:returns: int of the length of this construct
:raises SizeofError: the size could not be determined
"""
if context is None:
context = Container()
try:
return self._sizeof(context)
except Exception as e:
raise SizeofError(e)
def _sizeof(self, context):
"""
Override me in your subclass.
"""
raise SizeofError("Raw Constructs have no size!")
class Subconstruct(Construct):
"""
Abstract subconstruct (wraps an inner construct, inheriting its
name and flags).
Parameters:
* subcon - the construct to wrap
"""
__slots__ = ["subcon"]
def __init__(self, subcon):
Construct.__init__(self, subcon.name, subcon.conflags)
self.subcon = subcon
def _parse(self, stream, context):
return self.subcon._parse(stream, context)
def _build(self, obj, stream, context):
self.subcon._build(obj, stream, context)
def _sizeof(self, context):
return self.subcon._sizeof(context)
class Adapter(Subconstruct):
"""
Abstract adapter: calls _decode for parsing and _encode for building.
Parameters:
* subcon - the construct to wrap
"""
__slots__ = []
def _parse(self, stream, context):
return self._decode(self.subcon._parse(stream, context), context)
def _build(self, obj, stream, context):
self.subcon._build(self._encode(obj, context), stream, context)
def _decode(self, obj, context):
raise NotImplementedError()
def _encode(self, obj, context):
raise NotImplementedError()
#===============================================================================
# Fields
#===============================================================================
def _read_stream(stream, length):
if length < 0:
raise ValueError("length must be >= 0", length)
data = stream.read(length)
if len(data) != length:
raise FieldError("expected %d, found %d" % (length, len(data)))
return data
def _write_stream(stream, length, data):
if length < 0:
raise ValueError("length must be >= 0", length)
if len(data) != length:
raise FieldError("expected %d, found %d" % (length, len(data)))
stream.write(data)
class StaticField(Construct):
"""
A fixed-size byte field.
:param str name: field name
:param int length: number of bytes in the field
"""
__slots__ = ["length"]
def __init__(self, name, length):
Construct.__init__(self, name)
self.length = length
def _parse(self, stream, context):
return _read_stream(stream, self.length)
def _build(self, obj, stream, context):
_write_stream(stream, self.length, obj)
def _sizeof(self, context):
return self.length
class FormatField(StaticField):
"""
A field that uses ``struct`` to pack and unpack data.
See ``struct`` documentation for instructions on crafting format strings.
:param str name: name of the field
:param str endianness: format endianness string; one of "<", ">", or "="
:param str format: a single format character
"""
__slots__ = ["packer"]
def __init__(self, name, endianity, format):
if endianity not in (">", "<", "="):
raise ValueError("endianity must be be '=', '<', or '>'",
endianity)
if len(format) != 1:
raise ValueError("must specify one and only one format char")
self.packer = Packer(endianity + format)
StaticField.__init__(self, name, self.packer.size)
def __getstate__(self):
attrs = StaticField.__getstate__(self)
attrs["packer"] = attrs["packer"].format
return attrs
def __setstate__(self, attrs):
attrs["packer"] = Packer(attrs["packer"])
return StaticField.__setstate__(attrs)
def _parse(self, stream, context):
try:
return self.packer.unpack(_read_stream(stream, self.length))[0]
except Exception as ex:
raise FieldError(ex)
def _build(self, obj, stream, context):
try:
_write_stream(stream, self.length, self.packer.pack(obj))
except Exception as ex:
raise FieldError(ex)
class MetaField(Construct):
"""
A variable-length field. The length is obtained at runtime from a
function.
:param str name: name of the field
:param callable lengthfunc: callable that takes a context and returns
length as an int
>>> foo = Struct("foo",
... Byte("length"),
... MetaField("data", lambda ctx: ctx["length"])
... )
>>> foo.parse("\\x03ABC")
Container(data = 'ABC', length = 3)
>>> foo.parse("\\x04ABCD")
Container(data = 'ABCD', length = 4)
"""
__slots__ = ["lengthfunc"]
def __init__(self, name, lengthfunc):
Construct.__init__(self, name)
self.lengthfunc = lengthfunc
self._set_flag(self.FLAG_DYNAMIC)
def _parse(self, stream, context):
return _read_stream(stream, self.lengthfunc(context))
def _build(self, obj, stream, context):
_write_stream(stream, self.lengthfunc(context), obj)
def _sizeof(self, context):
return self.lengthfunc(context)
#===============================================================================
# arrays and repeaters
#===============================================================================
class MetaArray(Subconstruct):
"""
An array (repeater) of a meta-count. The array will iterate exactly
`countfunc()` times. Will raise ArrayError if less elements are found.
See also Array, Range and RepeatUntil.
Parameters:
* countfunc - a function that takes the context as a parameter and returns
the number of elements of the array (count)
* subcon - the subcon to repeat `countfunc()` times
Example:
MetaArray(lambda ctx: 5, UBInt8("foo"))
"""
__slots__ = ["countfunc"]
def __init__(self, countfunc, subcon):
Subconstruct.__init__(self, subcon)
self.countfunc = countfunc
self._clear_flag(self.FLAG_COPY_CONTEXT)
self._set_flag(self.FLAG_DYNAMIC)
def _parse(self, stream, context):
obj = ListContainer()
c = 0
count = self.countfunc(context)
try:
if self.subcon.conflags & self.FLAG_COPY_CONTEXT:
while c < count:
obj.append(self.subcon._parse(stream, context.__copy__()))
c += 1
else:
while c < count:
obj.append(self.subcon._parse(stream, context))
c += 1
except ConstructError as ex:
raise ArrayError("expected %d, found %d" % (count, c), ex)
return obj
def _build(self, obj, stream, context):
count = self.countfunc(context)
if len(obj) != count:
raise ArrayError("expected %d, found %d" % (count, len(obj)))
if self.subcon.conflags & self.FLAG_COPY_CONTEXT:
for subobj in obj:
self.subcon._build(subobj, stream, context.__copy__())
else:
for subobj in obj:
self.subcon._build(subobj, stream, context)
def _sizeof(self, context):
return self.subcon._sizeof(context) * self.countfunc(context)
class Range(Subconstruct):
"""
A range-array. The subcon will iterate between `mincount` to `maxcount`
times. If less than `mincount` elements are found, raises RangeError.
See also GreedyRange and OptionalGreedyRange.
The general-case repeater. Repeats the given unit for at least mincount
times, and up to maxcount times. If an exception occurs (EOF, validation
error), the repeater exits. If less than mincount units have been
successfully parsed, a RangeError is raised.
.. note::
This object requires a seekable stream for parsing.
:param int mincount: the minimal count
:param int maxcount: the maximal count
:param Construct subcon: the subcon to repeat
>>> c = Range(3, 7, UBInt8("foo"))
>>> c.parse("\\x01\\x02")
Traceback (most recent call last):
...
construct.core.RangeError: expected 3..7, found 2
>>> c.parse("\\x01\\x02\\x03")
[1, 2, 3]
>>> c.parse("\\x01\\x02\\x03\\x04\\x05\\x06")
[1, 2, 3, 4, 5, 6]
>>> c.parse("\\x01\\x02\\x03\\x04\\x05\\x06\\x07")
[1, 2, 3, 4, 5, 6, 7]
>>> c.parse("\\x01\\x02\\x03\\x04\\x05\\x06\\x07\\x08\\x09")
[1, 2, 3, 4, 5, 6, 7]
>>> c.build([1,2])
Traceback (most recent call last):
...
construct.core.RangeError: expected 3..7, found 2
>>> c.build([1,2,3,4])
'\\x01\\x02\\x03\\x04'
>>> c.build([1,2,3,4,5,6,7,8])
Traceback (most recent call last):
...
construct.core.RangeError: expected 3..7, found 8
"""
__slots__ = ["mincount", "maxcout"]
def __init__(self, mincount, maxcout, subcon):
Subconstruct.__init__(self, subcon)
self.mincount = mincount
self.maxcout = maxcout
self._clear_flag(self.FLAG_COPY_CONTEXT)
self._set_flag(self.FLAG_DYNAMIC)
def _parse(self, stream, context):
obj = ListContainer()
c = 0
try:
if self.subcon.conflags & self.FLAG_COPY_CONTEXT:
while c < self.maxcout:
pos = stream.tell()
obj.append(self.subcon._parse(stream, context.__copy__()))
c += 1
else:
while c < self.maxcout:
pos = stream.tell()
obj.append(self.subcon._parse(stream, context))
c += 1
except ConstructError as ex:
if c < self.mincount:
raise RangeError("expected %d to %d, found %d" %
(self.mincount, self.maxcout, c), ex)
stream.seek(pos)
return obj
def _build(self, obj, stream, context):
if len(obj) < self.mincount or len(obj) > self.maxcout:
raise RangeError("expected %d to %d, found %d" %
(self.mincount, self.maxcout, len(obj)))
cnt = 0
try:
if self.subcon.conflags & self.FLAG_COPY_CONTEXT:
for subobj in obj:
if isinstance(obj, bytes):
subobj = bchr(subobj)
self.subcon._build(subobj, stream, context.__copy__())
cnt += 1
else:
for subobj in obj:
if isinstance(obj, bytes):
subobj = bchr(subobj)
self.subcon._build(subobj, stream, context)
cnt += 1
except ConstructError as ex:
if cnt < self.mincount:
raise RangeError("expected %d to %d, found %d" %
(self.mincount, self.maxcout, len(obj)), ex)
def _sizeof(self, context):
raise SizeofError("can't calculate size")
class RepeatUntil(Subconstruct):
"""
An array that repeats until the predicate indicates it to stop. Note that
the last element (which caused the repeat to exit) is included in the
return value.
Parameters:
* predicate - a predicate function that takes (obj, context) and returns
True if the stop-condition is met, or False to continue.
* subcon - the subcon to repeat.
Example:
# will read chars until b\x00 (inclusive)
RepeatUntil(lambda obj, ctx: obj == b"\x00",
Field("chars", 1)
)
"""
__slots__ = ["predicate"]
def __init__(self, predicate, subcon):
Subconstruct.__init__(self, subcon)
self.predicate = predicate
self._clear_flag(self.FLAG_COPY_CONTEXT)
self._set_flag(self.FLAG_DYNAMIC)
def _parse(self, stream, context):
obj = []
try:
if self.subcon.conflags & self.FLAG_COPY_CONTEXT:
while True:
subobj = self.subcon._parse(stream, context.__copy__())
obj.append(subobj)
if self.predicate(subobj, context):
break
else:
while True:
subobj = self.subcon._parse(stream, context)
obj.append(subobj)
if self.predicate(subobj, context):
break
except ConstructError as ex:
raise ArrayError("missing terminator", ex)
return obj
def _build(self, obj, stream, context):
terminated = False
if self.subcon.conflags & self.FLAG_COPY_CONTEXT:
for subobj in obj:
self.subcon._build(subobj, stream, context.__copy__())
if self.predicate(subobj, context):
terminated = True
break
else:
for subobj in obj:
subobj = bchr(subobj)
self.subcon._build(subobj, stream, context.__copy__())
if self.predicate(subobj, context):
terminated = True
break
if not terminated:
raise ArrayError("missing terminator")
def _sizeof(self, context):
raise SizeofError("can't calculate size")
#===============================================================================
# structures and sequences
#===============================================================================
class Struct(Construct):
"""
A sequence of named constructs, similar to structs in C. The elements are
parsed and built in the order they are defined.
See also Embedded.
Parameters:
* name - the name of the structure
* subcons - a sequence of subconstructs that make up this structure.
* nested - a keyword-only argument that indicates whether this struct
creates a nested context. The default is True. This parameter is
considered "advanced usage", and may be removed in the future.
Example:
Struct("foo",
UBInt8("first_element"),
UBInt16("second_element"),
Padding(2),
UBInt8("third_element"),
)
"""
__slots__ = ["subcons", "nested"]
def __init__(self, name, *subcons, **kw):
self.nested = kw.pop("nested", True)
if kw:
raise TypeError("the only keyword argument accepted is 'nested'", kw)
Construct.__init__(self, name)
self.subcons = subcons
self._inherit_flags(*subcons)
self._clear_flag(self.FLAG_EMBED)
def _parse(self, stream, context):
if "" in context:
obj = context[""]
del context[""]
else:
obj = Container()
if self.nested:
context = Container(_ = context)
for sc in self.subcons:
if sc.conflags & self.FLAG_EMBED:
context[""] = obj
sc._parse(stream, context)
else:
subobj = sc._parse(stream, context)
if sc.name is not None:
obj[sc.name] = subobj
context[sc.name] = subobj
return obj
def _build(self, obj, stream, context):
if "" in context:
del context[""]
elif self.nested:
context = Container(_ = context)
for sc in self.subcons:
if sc.conflags & self.FLAG_EMBED:
context[""] = True
subobj = obj
elif sc.name is None:
subobj = None
else:
subobj = getattr(obj, sc.name)
context[sc.name] = subobj
sc._build(subobj, stream, context)
def _sizeof(self, context):
if self.nested:
context = Container(_ = context)
return sum(sc._sizeof(context) for sc in self.subcons)
class Sequence(Struct):
"""
A sequence of unnamed constructs. The elements are parsed and built in the
order they are defined.
See also Embedded.
Parameters:
* name - the name of the structure
* subcons - a sequence of subconstructs that make up this structure.
* nested - a keyword-only argument that indicates whether this struct
creates a nested context. The default is True. This parameter is
considered "advanced usage", and may be removed in the future.
Example:
Sequence("foo",
UBInt8("first_element"),
UBInt16("second_element"),
Padding(2),
UBInt8("third_element"),
)
"""
__slots__ = []
def _parse(self, stream, context):
if "" in context:
obj = context[""]
del context[""]
else:
obj = ListContainer()
if self.nested:
context = Container(_ = context)
for sc in self.subcons:
if sc.conflags & self.FLAG_EMBED:
context[""] = obj
sc._parse(stream, context)
else:
subobj = sc._parse(stream, context)
if sc.name is not None:
obj.append(subobj)
context[sc.name] = subobj
return obj
def _build(self, obj, stream, context):
if "" in context:
del context[""]
elif self.nested:
context = Container(_ = context)
objiter = iter(obj)
for sc in self.subcons:
if sc.conflags & self.FLAG_EMBED:
context[""] = True
subobj = objiter
elif sc.name is None:
subobj = None
else:
subobj = advance_iterator(objiter)
context[sc.name] = subobj
sc._build(subobj, stream, context)
class Union(Construct):
"""
a set of overlapping fields (like unions in C). when parsing,
all fields read the same data; when building, only the first subcon
(called "master") is used.
Parameters:
* name - the name of the union
* master - the master subcon, i.e., the subcon used for building and
calculating the total size
* subcons - additional subcons
Example:
Union("what_are_four_bytes",
UBInt32("one_dword"),
Struct("two_words", UBInt16("first"), UBInt16("second")),
Struct("four_bytes",
UBInt8("a"),
UBInt8("b"),
UBInt8("c"),
UBInt8("d")
),
)
"""
__slots__ = ["parser", "builder"]
def __init__(self, name, master, *subcons, **kw):
Construct.__init__(self, name)
args = [Peek(sc) for sc in subcons]
args.append(MetaField(None, lambda ctx: master._sizeof(ctx)))
self.parser = Struct(name, Peek(master, perform_build = True), *args)
self.builder = Struct(name, master)
def _parse(self, stream, context):
return self.parser._parse(stream, context)
def _build(self, obj, stream, context):
return self.builder._build(obj, stream, context)
def _sizeof(self, context):
return self.builder._sizeof(context)
#===============================================================================
# conditional
#===============================================================================
class Switch(Construct):
"""
A conditional branch. Switch will choose the case to follow based on
the return value of keyfunc. If no case is matched, and no default value
is given, SwitchError will be raised.
See also Pass.
Parameters:
* name - the name of the construct
* keyfunc - a function that takes the context and returns a key, which
will ne used to choose the relevant case.
* cases - a dictionary mapping keys to constructs. the keys can be any
values that may be returned by keyfunc.
* default - a default value to use when the key is not found in the cases.
if not supplied, an exception will be raised when the key is not found.
You can use the builtin construct Pass for 'do-nothing'.
* include_key - whether or not to include the key in the return value
of parsing. defualt is False.
Example:
Struct("foo",
UBInt8("type"),
Switch("value", lambda ctx: ctx.type, {
1 : UBInt8("spam"),
2 : UBInt16("spam"),
3 : UBInt32("spam"),
4 : UBInt64("spam"),
}
),
)
"""
class NoDefault(Construct):
def _parse(self, stream, context):
raise SwitchError("no default case defined")
def _build(self, obj, stream, context):
raise SwitchError("no default case defined")
def _sizeof(self, context):
raise SwitchError("no default case defined")
NoDefault = NoDefault("No default value specified")
__slots__ = ["subcons", "keyfunc", "cases", "default", "include_key"]
def __init__(self, name, keyfunc, cases, default = NoDefault,
include_key = False):
Construct.__init__(self, name)
self._inherit_flags(*cases.values())
self.keyfunc = keyfunc
self.cases = cases
self.default = default
self.include_key = include_key
self._inherit_flags(*cases.values())
self._set_flag(self.FLAG_DYNAMIC)
def _parse(self, stream, context):
key = self.keyfunc(context)
obj = self.cases.get(key, self.default)._parse(stream, context)
if self.include_key:
return key, obj
else:
return obj
def _build(self, obj, stream, context):
if self.include_key:
key, obj = obj
else:
key = self.keyfunc(context)
case = self.cases.get(key, self.default)
case._build(obj, stream, context)
def _sizeof(self, context):
case = self.cases.get(self.keyfunc(context), self.default)
return case._sizeof(context)
class Select(Construct):
"""
Selects the first matching subconstruct. It will literally try each of
the subconstructs, until one matches.
Notes:
* requires a seekable stream.
Parameters:
* name - the name of the construct
* subcons - the subcons to try (order-sensitive)
* include_name - a keyword only argument, indicating whether to include
the name of the selected subcon in the return value of parsing. default
is false.
Example:
Select("foo",
UBInt64("large"),
UBInt32("medium"),
UBInt16("small"),
UBInt8("tiny"),
)
"""
__slots__ = ["subcons", "include_name"]
def __init__(self, name, *subcons, **kw):
include_name = kw.pop("include_name", False)
if kw:
raise TypeError("the only keyword argument accepted "
"is 'include_name'", kw)
Construct.__init__(self, name)
self.subcons = subcons
self.include_name = include_name
self._inherit_flags(*subcons)
self._set_flag(self.FLAG_DYNAMIC)
def _parse(self, stream, context):
for sc in self.subcons:
pos = stream.tell()
context2 = context.__copy__()
try:
obj = sc._parse(stream, context2)
except ConstructError:
stream.seek(pos)
else:
context.__update__(context2)
if self.include_name:
return sc.name, obj
else:
return obj
raise SelectError("no subconstruct matched")
def _build(self, obj, stream, context):
if self.include_name:
name, obj = obj
for sc in self.subcons:
if sc.name == name:
sc._build(obj, stream, context)
return
else:
for sc in self.subcons:
stream2 = BytesIO()
context2 = context.__copy__()
try:
sc._build(obj, stream2, context2)
except Exception:
pass
else:
context.__update__(context2)
stream.write(stream2.getvalue())
return
raise SelectError("no subconstruct matched", obj)
def _sizeof(self, context):
raise SizeofError("can't calculate size")
#===============================================================================
# stream manipulation
#===============================================================================
class Pointer(Subconstruct):
"""
Changes the stream position to a given offset, where the construction
should take place, and restores the stream position when finished.
See also Anchor, OnDemand and OnDemandPointer.
Notes:
* requires a seekable stream.
Parameters:
* offsetfunc: a function that takes the context and returns an absolute
stream position, where the construction would take place
* subcon - the subcon to use at `offsetfunc()`
Example:
Struct("foo",
UBInt32("spam_pointer"),
Pointer(lambda ctx: ctx.spam_pointer,
Array(5, UBInt8("spam"))
)
)
"""
__slots__ = ["offsetfunc"]
def __init__(self, offsetfunc, subcon):
Subconstruct.__init__(self, subcon)
self.offsetfunc = offsetfunc
def _parse(self, stream, context):
newpos = self.offsetfunc(context)
origpos = stream.tell()
stream.seek(newpos)
obj = self.subcon._parse(stream, context)
stream.seek(origpos)
return obj
def _build(self, obj, stream, context):
newpos = self.offsetfunc(context)
origpos = stream.tell()
stream.seek(newpos)
self.subcon._build(obj, stream, context)
stream.seek(origpos)
def _sizeof(self, context):
return 0
class Peek(Subconstruct):
"""
Peeks at the stream: parses without changing the stream position.
See also Union. If the end of the stream is reached when peeking,
returns None.
Notes:
* requires a seekable stream.
Parameters:
* subcon - the subcon to peek at
* perform_build - whether or not to perform building. by default this
parameter is set to False, meaning building is a no-op.
Example:
Peek(UBInt8("foo"))
"""
__slots__ = ["perform_build"]
def __init__(self, subcon, perform_build = False):
Subconstruct.__init__(self, subcon)
self.perform_build = perform_build
def _parse(self, stream, context):
pos = stream.tell()
try:
return self.subcon._parse(stream, context)
except FieldError:
pass
finally:
stream.seek(pos)
def _build(self, obj, stream, context):
if self.perform_build:
self.subcon._build(obj, stream, context)
def _sizeof(self, context):
return 0
class OnDemand(Subconstruct):
"""
Allows for on-demand (lazy) parsing. When parsing, it will return a
LazyContainer that represents a pointer to the data, but does not actually
parses it from stream until it's "demanded".
By accessing the 'value' property of LazyContainers, you will demand the
data from the stream. The data will be parsed and cached for later use.
You can use the 'has_value' property to know whether the data has already
been demanded.
See also OnDemandPointer.
Notes:
* requires a seekable stream.
Parameters:
* subcon -
* advance_stream - whether or not to advance the stream position. by
default this is True, but if subcon is a pointer, this should be False.
* force_build - whether or not to force build. If set to False, and the
LazyContainer has not been demaned, building is a no-op.
Example:
OnDemand(Array(10000, UBInt8("foo"))
"""
__slots__ = ["advance_stream", "force_build"]
def __init__(self, subcon, advance_stream = True, force_build = True):
Subconstruct.__init__(self, subcon)
self.advance_stream = advance_stream
self.force_build = force_build
def _parse(self, stream, context):
obj = LazyContainer(self.subcon, stream, stream.tell(), context)
if self.advance_stream:
stream.seek(self.subcon._sizeof(context), 1)
return obj
def _build(self, obj, stream, context):
if not isinstance(obj, LazyContainer):
self.subcon._build(obj, stream, context)
elif self.force_build or obj.has_value:
self.subcon._build(obj.value, stream, context)
elif self.advance_stream:
stream.seek(self.subcon._sizeof(context), 1)
class Buffered(Subconstruct):
"""
Creates an in-memory buffered stream, which can undergo encoding and
decoding prior to being passed on to the subconstruct.
See also Bitwise.
Note:
* Do not use pointers inside Buffered
Parameters:
* subcon - the subcon which will operate on the buffer
* encoder - a function that takes a string and returns an encoded
string (used after building)
* decoder - a function that takes a string and returns a decoded
string (used before parsing)
* resizer - a function that takes the size of the subcon and "adjusts"
or "resizes" it according to the encoding/decoding process.
Example:
Buffered(BitField("foo", 16),
encoder = decode_bin,
decoder = encode_bin,
resizer = lambda size: size / 8,
)
"""
__slots__ = ["encoder", "decoder", "resizer"]
def __init__(self, subcon, decoder, encoder, resizer):
Subconstruct.__init__(self, subcon)
self.encoder = encoder
self.decoder = decoder
self.resizer = resizer
def _parse(self, stream, context):
data = _read_stream(stream, self._sizeof(context))
stream2 = BytesIO(self.decoder(data))
return self.subcon._parse(stream2, context)
def _build(self, obj, stream, context):
size = self._sizeof(context)
stream2 = BytesIO()
self.subcon._build(obj, stream2, context)
data = self.encoder(stream2.getvalue())
assert len(data) == size
_write_stream(stream, self._sizeof(context), data)
def _sizeof(self, context):
return self.resizer(self.subcon._sizeof(context))
class Restream(Subconstruct):
"""
Wraps the stream with a read-wrapper (for parsing) or a
write-wrapper (for building). The stream wrapper can buffer the data
internally, reading it from- or writing it to the underlying stream
as needed. For example, BitStreamReader reads whole bytes from the
underlying stream, but returns them as individual bits.
See also Bitwise.
When the parsing or building is done, the stream's close method
will be invoked. It can perform any finalization needed for the stream
wrapper, but it must not close the underlying stream.
Note:
* Do not use pointers inside Restream
Parameters:
* subcon - the subcon
* stream_reader - the read-wrapper
* stream_writer - the write wrapper
* resizer - a function that takes the size of the subcon and "adjusts"
or "resizes" it according to the encoding/decoding process.
Example:
Restream(BitField("foo", 16),
stream_reader = BitStreamReader,
stream_writer = BitStreamWriter,
resizer = lambda size: size / 8,
)
"""
__slots__ = ["stream_reader", "stream_writer", "resizer"]
def __init__(self, subcon, stream_reader, stream_writer, resizer):
Subconstruct.__init__(self, subcon)
self.stream_reader = stream_reader
self.stream_writer = stream_writer
self.resizer = resizer
def _parse(self, stream, context):
stream2 = self.stream_reader(stream)
obj = self.subcon._parse(stream2, context)
stream2.close()
return obj
def _build(self, obj, stream, context):
stream2 = self.stream_writer(stream)
self.subcon._build(obj, stream2, context)
stream2.close()
def _sizeof(self, context):
return self.resizer(self.subcon._sizeof(context))
#===============================================================================
# miscellaneous
#===============================================================================
class Reconfig(Subconstruct):
"""
Reconfigures a subconstruct. Reconfig can be used to change the name and
set and clear flags of the inner subcon.
Parameters:
* name - the new name
* subcon - the subcon to reconfigure
* setflags - the flags to set (default is 0)
* clearflags - the flags to clear (default is 0)
Example:
Reconfig("foo", UBInt8("bar"))
"""
__slots__ = []
def __init__(self, name, subcon, setflags = 0, clearflags = 0):
Construct.__init__(self, name, subcon.conflags)
self.subcon = subcon
self._set_flag(setflags)
self._clear_flag(clearflags)
class Anchor(Construct):
"""
Returns the "anchor" (stream position) at the point where it's inserted.
Useful for adjusting relative offsets to absolute positions, or to measure
sizes of constructs.
absolute pointer = anchor + relative offset
size = anchor_after - anchor_before
See also Pointer.
Notes:
* requires a seekable stream.
Parameters:
* name - the name of the anchor
Example:
Struct("foo",
Anchor("base"),
UBInt8("relative_offset"),
Pointer(lambda ctx: ctx.relative_offset + ctx.base,
UBInt8("data")
)
)
"""
__slots__ = []
def _parse(self, stream, context):
return stream.tell()
def _build(self, obj, stream, context):
context[self.name] = stream.tell()
def _sizeof(self, context):
return 0
class Value(Construct):
"""
A computed value.
Parameters:
* name - the name of the value
* func - a function that takes the context and return the computed value
Example:
Struct("foo",
UBInt8("width"),
UBInt8("height"),
Value("total_pixels", lambda ctx: ctx.width * ctx.height),
)
"""
__slots__ = ["func"]
def __init__(self, name, func):
Construct.__init__(self, name)
self.func = func
self._set_flag(self.FLAG_DYNAMIC)
def _parse(self, stream, context):
return self.func(context)
def _build(self, obj, stream, context):
context[self.name] = self.func(context)
def _sizeof(self, context):
return 0
#class Dynamic(Construct):
# """
# Dynamically creates a construct and uses it for parsing and building.
# This allows you to create change the construction tree on the fly.
# Deprecated.
#
# Parameters:
# * name - the name of the construct
# * factoryfunc - a function that takes the context and returns a new
# construct object which will be used for parsing and building.
#
# Example:
# def factory(ctx):
# if ctx.bar == 8:
# return UBInt8("spam")
# if ctx.bar == 9:
# return String("spam", 9)
#
# Struct("foo",
# UBInt8("bar"),
# Dynamic("spam", factory),
# )
# """
# __slots__ = ["factoryfunc"]
# def __init__(self, name, factoryfunc):
# Construct.__init__(self, name, self.FLAG_COPY_CONTEXT)
# self.factoryfunc = factoryfunc
# self._set_flag(self.FLAG_DYNAMIC)
# def _parse(self, stream, context):
# return self.factoryfunc(context)._parse(stream, context)
# def _build(self, obj, stream, context):
# return self.factoryfunc(context)._build(obj, stream, context)
# def _sizeof(self, context):
# return self.factoryfunc(context)._sizeof(context)
class LazyBound(Construct):
"""
Lazily bound construct, useful for constructs that need to make cyclic
references (linked-lists, expression trees, etc.).
Parameters:
Example:
foo = Struct("foo",
UBInt8("bar"),
LazyBound("next", lambda: foo),
)
"""
__slots__ = ["bindfunc", "bound"]
def __init__(self, name, bindfunc):
Construct.__init__(self, name)
self.bound = None
self.bindfunc = bindfunc
def _parse(self, stream, context):
if self.bound is None:
self.bound = self.bindfunc()
return self.bound._parse(stream, context)
def _build(self, obj, stream, context):
if self.bound is None:
self.bound = self.bindfunc()
self.bound._build(obj, stream, context)
def _sizeof(self, context):
if self.bound is None:
self.bound = self.bindfunc()
return self.bound._sizeof(context)
class Pass(Construct):
"""
A do-nothing construct, useful as the default case for Switch, or
to indicate Enums.
See also Switch and Enum.
Notes:
* this construct is a singleton. do not try to instatiate it, as it
will not work...
Example:
Pass
"""
__slots__ = []
def _parse(self, stream, context):
pass
def _build(self, obj, stream, context):
assert obj is None
def _sizeof(self, context):
return 0
Pass = Pass(None)
class Terminator(Construct):
"""
Asserts the end of the stream has been reached at the point it's placed.
You can use this to ensure no more unparsed data follows.
Notes:
* this construct is only meaningful for parsing. for building, it's
a no-op.
* this construct is a singleton. do not try to instatiate it, as it
will not work...
Example:
Terminator
"""
__slots__ = []
def _parse(self, stream, context):
if stream.read(1):
raise TerminatorError("expected end of stream")
def _build(self, obj, stream, context):
assert obj is None
def _sizeof(self, context):
return 0
Terminator = Terminator(None)
pyelftools-0.26/elftools/construct/debug.py 0000664 0000000 0000000 00000010024 13572204573 0021161 0 ustar 00root root 0000000 0000000 """
Debugging utilities for constructs
"""
from __future__ import print_function
import sys
import traceback
import pdb
import inspect
from .core import Construct, Subconstruct
from .lib import HexString, Container, ListContainer
class Probe(Construct):
"""
A probe: dumps the context, stack frames, and stream content to the screen
to aid the debugging process.
See also Debugger.
Parameters:
* name - the display name
* show_stream - whether or not to show stream contents. default is True.
the stream must be seekable.
* show_context - whether or not to show the context. default is True.
* show_stack - whether or not to show the upper stack frames. default
is True.
* stream_lookahead - the number of bytes to dump when show_stack is set.
default is 100.
Example:
Struct("foo",
UBInt8("a"),
Probe("between a and b"),
UBInt8("b"),
)
"""
__slots__ = [
"printname", "show_stream", "show_context", "show_stack",
"stream_lookahead"
]
counter = 0
def __init__(self, name = None, show_stream = True,
show_context = True, show_stack = True,
stream_lookahead = 100):
Construct.__init__(self, None)
if name is None:
Probe.counter += 1
name = "" % (Probe.counter,)
self.printname = name
self.show_stream = show_stream
self.show_context = show_context
self.show_stack = show_stack
self.stream_lookahead = stream_lookahead
def __repr__(self):
return "%s(%r)" % (self.__class__.__name__, self.printname)
def _parse(self, stream, context):
self.printout(stream, context)
def _build(self, obj, stream, context):
self.printout(stream, context)
def _sizeof(self, context):
return 0
def printout(self, stream, context):
obj = Container()
if self.show_stream:
obj.stream_position = stream.tell()
follows = stream.read(self.stream_lookahead)
if not follows:
obj.following_stream_data = "EOF reached"
else:
stream.seek(-len(follows), 1)
obj.following_stream_data = HexString(follows)
print
if self.show_context:
obj.context = context
if self.show_stack:
obj.stack = ListContainer()
frames = [s[0] for s in inspect.stack()][1:-1]
frames.reverse()
for f in frames:
a = Container()
a.__update__(f.f_locals)
obj.stack.append(a)
print("=" * 80)
print("Probe", self.printname)
print(obj)
print("=" * 80)
class Debugger(Subconstruct):
"""
A pdb-based debugger. When an exception occurs in the subcon, a debugger
will appear and allow you to debug the error (and even fix on-the-fly).
Parameters:
* subcon - the subcon to debug
Example:
Debugger(
Enum(UBInt8("foo"),
a = 1,
b = 2,
c = 3
)
)
"""
__slots__ = ["retval"]
def _parse(self, stream, context):
try:
return self.subcon._parse(stream, context)
except Exception:
self.retval = NotImplemented
self.handle_exc("(you can set the value of 'self.retval', "
"which will be returned)")
if self.retval is NotImplemented:
raise
else:
return self.retval
def _build(self, obj, stream, context):
try:
self.subcon._build(obj, stream, context)
except Exception:
self.handle_exc()
def handle_exc(self, msg = None):
print("=" * 80)
print("Debugging exception of %s:" % (self.subcon,))
print("".join(traceback.format_exception(*sys.exc_info())[1:]))
if msg:
print(msg)
pdb.post_mortem(sys.exc_info()[2])
print("=" * 80)
pyelftools-0.26/elftools/construct/lib/ 0000775 0000000 0000000 00000000000 13572204573 0020272 5 ustar 00root root 0000000 0000000 pyelftools-0.26/elftools/construct/lib/__init__.py 0000664 0000000 0000000 00000000410 13572204573 0022376 0 ustar 00root root 0000000 0000000 from .binary import (
int_to_bin, bin_to_int, swap_bytes, encode_bin, decode_bin)
from .bitstream import BitStreamReader, BitStreamWriter
from .container import (Container, FlagsContainer, ListContainer,
LazyContainer)
from .hex import HexString, hexdump
pyelftools-0.26/elftools/construct/lib/binary.py 0000664 0000000 0000000 00000005617 13572204573 0022141 0 ustar 00root root 0000000 0000000 from .py3compat import int2byte
def int_to_bin(number, width=32):
r"""
Convert an integer into its binary representation in a bytes object.
Width is the amount of bits to generate. If width is larger than the actual
amount of bits required to represent number in binary, sign-extension is
used. If it's smaller, the representation is trimmed to width bits.
Each "bit" is either '\x00' or '\x01'. The MSBit is first.
Examples:
>>> int_to_bin(19, 5)
b'\x01\x00\x00\x01\x01'
>>> int_to_bin(19, 8)
b'\x00\x00\x00\x01\x00\x00\x01\x01'
"""
if number < 0:
number += 1 << width
i = width - 1
bits = bytearray(width)
while number and i >= 0:
bits[i] = number & 1
number >>= 1
i -= 1
return bytes(bits)
_bit_values = {
0: 0,
1: 1,
48: 0, # '0'
49: 1, # '1'
# The following are for Python 2, in which iteration over a bytes object
# yields single-character bytes and not integers.
'\x00': 0,
'\x01': 1,
'0': 0,
'1': 1,
}
def bin_to_int(bits, signed=False):
r"""
Logical opposite of int_to_bin. Both '0' and '\x00' are considered zero,
and both '1' and '\x01' are considered one. Set sign to True to interpret
the number as a 2-s complement signed integer.
"""
number = 0
bias = 0
ptr = 0
if signed and _bit_values[bits[0]] == 1:
bits = bits[1:]
bias = 1 << len(bits)
for b in bits:
number <<= 1
number |= _bit_values[b]
return number - bias
def swap_bytes(bits, bytesize=8):
r"""
Bits is a b'' object containing a binary representation. Assuming each
bytesize bits constitute a bytes, perform a endianness byte swap. Example:
>>> swap_bytes(b'00011011', 2)
b'11100100'
"""
i = 0
l = len(bits)
output = [b""] * ((l // bytesize) + 1)
j = len(output) - 1
while i < l:
output[j] = bits[i : i + bytesize]
i += bytesize
j -= 1
return b"".join(output)
_char_to_bin = {}
_bin_to_char = {}
for i in range(256):
ch = int2byte(i)
bin = int_to_bin(i, 8)
# Populate with for both keys i and ch, to support Python 2 & 3
_char_to_bin[ch] = bin
_char_to_bin[i] = bin
_bin_to_char[bin] = ch
def encode_bin(data):
"""
Create a binary representation of the given b'' object. Assume 8-bit
ASCII. Example:
>>> encode_bin('ab')
b"\x00\x01\x01\x00\x00\x00\x00\x01\x00\x01\x01\x00\x00\x00\x01\x00"
"""
return b"".join(_char_to_bin[ch] for ch in data)
def decode_bin(data):
"""
Locical opposite of decode_bin.
"""
if len(data) & 7:
raise ValueError("Data length must be a multiple of 8")
i = 0
j = 0
l = len(data) // 8
chars = [b""] * l
while j < l:
chars[j] = _bin_to_char[data[i:i+8]]
i += 8
j += 1
return b"".join(chars)
pyelftools-0.26/elftools/construct/lib/bitstream.py 0000664 0000000 0000000 00000003747 13572204573 0022651 0 ustar 00root root 0000000 0000000 from .binary import encode_bin, decode_bin
class BitStreamReader(object):
__slots__ = ["substream", "buffer", "total_size"]
def __init__(self, substream):
self.substream = substream
self.total_size = 0
self.buffer = ""
def close(self):
if self.total_size % 8 != 0:
raise ValueError("total size of read data must be a multiple of 8",
self.total_size)
def tell(self):
return self.substream.tell()
def seek(self, pos, whence = 0):
self.buffer = ""
self.total_size = 0
self.substream.seek(pos, whence)
def read(self, count):
if count < 0:
raise ValueError("count cannot be negative")
l = len(self.buffer)
if count == 0:
data = ""
elif count <= l:
data = self.buffer[:count]
self.buffer = self.buffer[count:]
else:
data = self.buffer
count -= l
bytes = count // 8
if count & 7:
bytes += 1
buf = encode_bin(self.substream.read(bytes))
data += buf[:count]
self.buffer = buf[count:]
self.total_size += len(data)
return data
class BitStreamWriter(object):
__slots__ = ["substream", "buffer", "pos"]
def __init__(self, substream):
self.substream = substream
self.buffer = []
self.pos = 0
def close(self):
self.flush()
def flush(self):
bytes = decode_bin("".join(self.buffer))
self.substream.write(bytes)
self.buffer = []
self.pos = 0
def tell(self):
return self.substream.tell() + self.pos // 8
def seek(self, pos, whence = 0):
self.flush()
self.substream.seek(pos, whence)
def write(self, data):
if not data:
return
if type(data) is not str:
raise TypeError("data must be a string, not %r" % (type(data),))
self.buffer.append(data)
pyelftools-0.26/elftools/construct/lib/container.py 0000664 0000000 0000000 00000007554 13572204573 0022641 0 ustar 00root root 0000000 0000000 """
Various containers.
"""
from pprint import pformat
from .py3compat import MutableMapping
def recursion_lock(retval, lock_name = "__recursion_lock__"):
def decorator(func):
def wrapper(self, *args, **kw):
if getattr(self, lock_name, False):
return retval
setattr(self, lock_name, True)
try:
return func(self, *args, **kw)
finally:
setattr(self, lock_name, False)
wrapper.__name__ = func.__name__
return wrapper
return decorator
class Container(MutableMapping):
"""
A generic container of attributes.
Containers are the common way to express parsed data.
"""
def __init__(self, **kw):
self.__dict__ = kw
# The core dictionary interface.
def __getitem__(self, name):
return self.__dict__[name]
def __delitem__(self, name):
del self.__dict__[name]
def __setitem__(self, name, value):
self.__dict__[name] = value
def keys(self):
return self.__dict__.keys()
def __len__(self):
return len(self.__dict__.keys())
# Extended dictionary interface.
def update(self, other):
self.__dict__.update(other)
__update__ = update
def __contains__(self, value):
return value in self.__dict__
# Rich comparisons.
def __eq__(self, other):
try:
return self.__dict__ == other.__dict__
except AttributeError:
return False
def __ne__(self, other):
return not self == other
# Copy interface.
def copy(self):
return self.__class__(**self.__dict__)
__copy__ = copy
# Iterator interface.
def __iter__(self):
return iter(self.__dict__)
def __repr__(self):
return "%s(%s)" % (self.__class__.__name__, repr(self.__dict__))
def __str__(self):
return "%s(%s)" % (self.__class__.__name__, str(self.__dict__))
class FlagsContainer(Container):
"""
A container providing pretty-printing for flags.
Only set flags are displayed.
"""
@recursion_lock("<...>")
def __str__(self):
d = dict((k, self[k]) for k in self
if self[k] and not k.startswith("_"))
return "%s(%s)" % (self.__class__.__name__, pformat(d))
class ListContainer(list):
"""
A container for lists.
"""
__slots__ = ["__recursion_lock__"]
@recursion_lock("[...]")
def __str__(self):
return pformat(self)
class LazyContainer(object):
__slots__ = ["subcon", "stream", "pos", "context", "_value"]
def __init__(self, subcon, stream, pos, context):
self.subcon = subcon
self.stream = stream
self.pos = pos
self.context = context
self._value = NotImplemented
def __eq__(self, other):
try:
return self._value == other._value
except AttributeError:
return False
def __ne__(self, other):
return not (self == other)
def __str__(self):
return self.__pretty_str__()
def __pretty_str__(self, nesting = 1, indentation = " "):
if self._value is NotImplemented:
text = ""
elif hasattr(self._value, "__pretty_str__"):
text = self._value.__pretty_str__(nesting, indentation)
else:
text = str(self._value)
return "%s: %s" % (self.__class__.__name__, text)
def read(self):
self.stream.seek(self.pos)
return self.subcon._parse(self.stream, self.context)
def dispose(self):
self.subcon = None
self.stream = None
self.context = None
self.pos = None
def _get_value(self):
if self._value is NotImplemented:
self._value = self.read()
return self._value
value = property(_get_value)
has_value = property(lambda self: self._value is not NotImplemented)
pyelftools-0.26/elftools/construct/lib/hex.py 0000664 0000000 0000000 00000002436 13572204573 0021435 0 ustar 00root root 0000000 0000000 from .py3compat import byte2int, int2byte, bytes2str
# Map an integer in the inclusive range 0-255 to its string byte representation
_printable = dict((i, ".") for i in range(256))
_printable.update((i, bytes2str(int2byte(i))) for i in range(32, 128))
def hexdump(data, linesize):
"""
data is a bytes object. The returned result is a string.
"""
prettylines = []
if len(data) < 65536:
fmt = "%%04X %%-%ds %%s"
else:
fmt = "%%08X %%-%ds %%s"
fmt = fmt % (3 * linesize - 1,)
for i in range(0, len(data), linesize):
line = data[i : i + linesize]
hextext = " ".join('%02x' % byte2int(b) for b in line)
rawtext = "".join(_printable[byte2int(b)] for b in line)
prettylines.append(fmt % (i, str(hextext), str(rawtext)))
return prettylines
class HexString(bytes):
"""
Represents bytes that will be hex-dumped to a string when its string
representation is requested.
"""
def __init__(self, data, linesize = 16):
self.linesize = linesize
def __new__(cls, data, *args, **kwargs):
return bytes.__new__(cls, data)
def __str__(self):
if not self:
return "''"
sep = "\n"
return sep + sep.join(
hexdump(self, self.linesize))
pyelftools-0.26/elftools/construct/lib/py3compat.py 0000664 0000000 0000000 00000003052 13572204573 0022563 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# py3compat.py
#
# Some Python2&3 compatibility code
#-------------------------------------------------------------------------------
import sys
PY3 = sys.version_info[0] == 3
try:
from collections.abc import MutableMapping # python >= 3.3
except ImportError:
from collections import MutableMapping # python < 3.3
if PY3:
import io
StringIO = io.StringIO
BytesIO = io.BytesIO
def bchr(i):
""" When iterating over b'...' in Python 2 you get single b'_' chars
and in Python 3 you get integers. Call bchr to always turn this
to single b'_' chars.
"""
return bytes((i,))
def u(s):
return s
def int2byte(i):
return bytes((i,))
def byte2int(b):
return b
def str2bytes(s):
return s.encode("latin-1")
def str2unicode(s):
return s
def bytes2str(b):
return b.decode('latin-1')
def decodebytes(b, encoding):
return bytes(b, encoding)
advance_iterator = next
else:
import cStringIO
StringIO = BytesIO = cStringIO.StringIO
int2byte = chr
byte2int = ord
bchr = lambda i: i
def u(s):
return unicode(s, "unicode_escape")
def str2bytes(s):
return s
def str2unicode(s):
return unicode(s, "unicode_escape")
def bytes2str(b):
return b
def decodebytes(b, encoding):
return b.decode(encoding)
def advance_iterator(it):
return it.next()
pyelftools-0.26/elftools/construct/macros.py 0000664 0000000 0000000 00000051620 13572204573 0021366 0 ustar 00root root 0000000 0000000 from .lib.py3compat import int2byte
from .lib import (BitStreamReader, BitStreamWriter, encode_bin,
decode_bin)
from .core import (Struct, MetaField, StaticField, FormatField,
OnDemand, Pointer, Switch, Value, RepeatUntil, MetaArray, Sequence, Range,
Select, Pass, SizeofError, Buffered, Restream, Reconfig)
from .adapters import (BitIntegerAdapter, PaddingAdapter,
ConstAdapter, CStringAdapter, LengthValueAdapter, IndexingAdapter,
PaddedStringAdapter, FlagsAdapter, StringAdapter, MappingAdapter)
#===============================================================================
# fields
#===============================================================================
def Field(name, length):
"""
A field consisting of a specified number of bytes.
:param str name: the name of the field
:param length: the length of the field. the length can be either an integer
(StaticField), or a function that takes the context as an argument and
returns the length (MetaField)
"""
if callable(length):
return MetaField(name, length)
else:
return StaticField(name, length)
def BitField(name, length, swapped = False, signed = False, bytesize = 8):
"""
BitFields, as the name suggests, are fields that operate on raw, unaligned
bits, and therefore must be enclosed in a BitStruct. Using them is very
similar to all normal fields: they take a name and a length (in bits).
:param str name: name of the field
:param int length: number of bits in the field, or a function that takes
the context as its argument and returns the length
:param bool swapped: whether the value is byte-swapped
:param bool signed: whether the value is signed
:param int bytesize: number of bits per byte, for byte-swapping
>>> foo = BitStruct("foo",
... BitField("a", 3),
... Flag("b"),
... Padding(3),
... Nibble("c"),
... BitField("d", 5),
... )
>>> foo.parse("\\xe1\\x1f")
Container(a = 7, b = False, c = 8, d = 31)
>>> foo = BitStruct("foo",
... BitField("a", 3),
... Flag("b"),
... Padding(3),
... Nibble("c"),
... Struct("bar",
... Nibble("d"),
... Bit("e"),
... )
... )
>>> foo.parse("\\xe1\\x1f")
Container(a = 7, b = False, bar = Container(d = 15, e = 1), c = 8)
"""
return BitIntegerAdapter(Field(name, length),
length,
swapped=swapped,
signed=signed,
bytesize=bytesize
)
def Padding(length, pattern = b"\x00", strict = False):
r"""a padding field (value is discarded)
* length - the length of the field. the length can be either an integer,
or a function that takes the context as an argument and returns the
length
* pattern - the padding pattern (character/byte) to use. default is b"\x00"
* strict - whether or not to raise an exception is the actual padding
pattern mismatches the desired pattern. default is False.
"""
return PaddingAdapter(Field(None, length),
pattern = pattern,
strict = strict,
)
def Flag(name, truth = 1, falsehood = 0, default = False):
"""
A flag.
Flags are usually used to signify a Boolean value, and this construct
maps values onto the ``bool`` type.
.. note:: This construct works with both bit and byte contexts.
.. warning:: Flags default to False, not True. This is different from the
C and Python way of thinking about truth, and may be subject to change
in the future.
:param str name: field name
:param int truth: value of truth (default 1)
:param int falsehood: value of falsehood (default 0)
:param bool default: default value (default False)
"""
return SymmetricMapping(Field(name, 1),
{True : int2byte(truth), False : int2byte(falsehood)},
default = default,
)
#===============================================================================
# field shortcuts
#===============================================================================
def Bit(name):
"""a 1-bit BitField; must be enclosed in a BitStruct"""
return BitField(name, 1)
def Nibble(name):
"""a 4-bit BitField; must be enclosed in a BitStruct"""
return BitField(name, 4)
def Octet(name):
"""an 8-bit BitField; must be enclosed in a BitStruct"""
return BitField(name, 8)
def UBInt8(name):
"""unsigned, big endian 8-bit integer"""
return FormatField(name, ">", "B")
def UBInt16(name):
"""unsigned, big endian 16-bit integer"""
return FormatField(name, ">", "H")
def UBInt32(name):
"""unsigned, big endian 32-bit integer"""
return FormatField(name, ">", "L")
def UBInt64(name):
"""unsigned, big endian 64-bit integer"""
return FormatField(name, ">", "Q")
def SBInt8(name):
"""signed, big endian 8-bit integer"""
return FormatField(name, ">", "b")
def SBInt16(name):
"""signed, big endian 16-bit integer"""
return FormatField(name, ">", "h")
def SBInt32(name):
"""signed, big endian 32-bit integer"""
return FormatField(name, ">", "l")
def SBInt64(name):
"""signed, big endian 64-bit integer"""
return FormatField(name, ">", "q")
def ULInt8(name):
"""unsigned, little endian 8-bit integer"""
return FormatField(name, "<", "B")
def ULInt16(name):
"""unsigned, little endian 16-bit integer"""
return FormatField(name, "<", "H")
def ULInt32(name):
"""unsigned, little endian 32-bit integer"""
return FormatField(name, "<", "L")
def ULInt64(name):
"""unsigned, little endian 64-bit integer"""
return FormatField(name, "<", "Q")
def SLInt8(name):
"""signed, little endian 8-bit integer"""
return FormatField(name, "<", "b")
def SLInt16(name):
"""signed, little endian 16-bit integer"""
return FormatField(name, "<", "h")
def SLInt32(name):
"""signed, little endian 32-bit integer"""
return FormatField(name, "<", "l")
def SLInt64(name):
"""signed, little endian 64-bit integer"""
return FormatField(name, "<", "q")
def UNInt8(name):
"""unsigned, native endianity 8-bit integer"""
return FormatField(name, "=", "B")
def UNInt16(name):
"""unsigned, native endianity 16-bit integer"""
return FormatField(name, "=", "H")
def UNInt32(name):
"""unsigned, native endianity 32-bit integer"""
return FormatField(name, "=", "L")
def UNInt64(name):
"""unsigned, native endianity 64-bit integer"""
return FormatField(name, "=", "Q")
def SNInt8(name):
"""signed, native endianity 8-bit integer"""
return FormatField(name, "=", "b")
def SNInt16(name):
"""signed, native endianity 16-bit integer"""
return FormatField(name, "=", "h")
def SNInt32(name):
"""signed, native endianity 32-bit integer"""
return FormatField(name, "=", "l")
def SNInt64(name):
"""signed, native endianity 64-bit integer"""
return FormatField(name, "=", "q")
def BFloat32(name):
"""big endian, 32-bit IEEE floating point number"""
return FormatField(name, ">", "f")
def LFloat32(name):
"""little endian, 32-bit IEEE floating point number"""
return FormatField(name, "<", "f")
def NFloat32(name):
"""native endianity, 32-bit IEEE floating point number"""
return FormatField(name, "=", "f")
def BFloat64(name):
"""big endian, 64-bit IEEE floating point number"""
return FormatField(name, ">", "d")
def LFloat64(name):
"""little endian, 64-bit IEEE floating point number"""
return FormatField(name, "<", "d")
def NFloat64(name):
"""native endianity, 64-bit IEEE floating point number"""
return FormatField(name, "=", "d")
#===============================================================================
# arrays
#===============================================================================
def Array(count, subcon):
"""
Repeats the given unit a fixed number of times.
:param int count: number of times to repeat
:param ``Construct`` subcon: construct to repeat
>>> c = Array(4, UBInt8("foo"))
>>> c.parse("\\x01\\x02\\x03\\x04")
[1, 2, 3, 4]
>>> c.parse("\\x01\\x02\\x03\\x04\\x05\\x06")
[1, 2, 3, 4]
>>> c.build([5,6,7,8])
'\\x05\\x06\\x07\\x08'
>>> c.build([5,6,7,8,9])
Traceback (most recent call last):
...
construct.core.RangeError: expected 4..4, found 5
"""
if callable(count):
con = MetaArray(count, subcon)
else:
con = MetaArray(lambda ctx: count, subcon)
con._clear_flag(con.FLAG_DYNAMIC)
return con
def PrefixedArray(subcon, length_field = UBInt8("length")):
"""an array prefixed by a length field.
* subcon - the subcon to be repeated
* length_field - a construct returning an integer
"""
return LengthValueAdapter(
Sequence(subcon.name,
length_field,
Array(lambda ctx: ctx[length_field.name], subcon),
nested = False
)
)
def OpenRange(mincount, subcon):
from sys import maxsize
return Range(mincount, maxsize, subcon)
def GreedyRange(subcon):
"""
Repeats the given unit one or more times.
:param ``Construct`` subcon: construct to repeat
>>> from construct import GreedyRange, UBInt8
>>> c = GreedyRange(UBInt8("foo"))
>>> c.parse("\\x01")
[1]
>>> c.parse("\\x01\\x02\\x03")
[1, 2, 3]
>>> c.parse("\\x01\\x02\\x03\\x04\\x05\\x06")
[1, 2, 3, 4, 5, 6]
>>> c.parse("")
Traceback (most recent call last):
...
construct.core.RangeError: expected 1..2147483647, found 0
>>> c.build([1,2])
'\\x01\\x02'
>>> c.build([])
Traceback (most recent call last):
...
construct.core.RangeError: expected 1..2147483647, found 0
"""
return OpenRange(1, subcon)
def OptionalGreedyRange(subcon):
"""
Repeats the given unit zero or more times. This repeater can't
fail, as it accepts lists of any length.
:param ``Construct`` subcon: construct to repeat
>>> from construct import OptionalGreedyRange, UBInt8
>>> c = OptionalGreedyRange(UBInt8("foo"))
>>> c.parse("")
[]
>>> c.parse("\\x01\\x02")
[1, 2]
>>> c.build([])
''
>>> c.build([1,2])
'\\x01\\x02'
"""
return OpenRange(0, subcon)
#===============================================================================
# subconstructs
#===============================================================================
def Optional(subcon):
"""an optional construct. if parsing fails, returns None.
* subcon - the subcon to optionally parse or build
"""
return Select(subcon.name, subcon, Pass)
def Bitwise(subcon):
"""converts the stream to bits, and passes the bitstream to subcon
* subcon - a bitwise construct (usually BitField)
"""
# subcons larger than MAX_BUFFER will be wrapped by Restream instead
# of Buffered. implementation details, don't stick your nose in :)
MAX_BUFFER = 1024 * 8
def resizer(length):
if length & 7:
raise SizeofError("size must be a multiple of 8", length)
return length >> 3
if not subcon._is_flag(subcon.FLAG_DYNAMIC) and subcon.sizeof() < MAX_BUFFER:
con = Buffered(subcon,
encoder = decode_bin,
decoder = encode_bin,
resizer = resizer
)
else:
con = Restream(subcon,
stream_reader = BitStreamReader,
stream_writer = BitStreamWriter,
resizer = resizer)
return con
def Aligned(subcon, modulus = 4, pattern = b"\x00"):
r"""aligns subcon to modulus boundary using padding pattern
* subcon - the subcon to align
* modulus - the modulus boundary (default is 4)
* pattern - the padding pattern (default is \x00)
"""
if modulus < 2:
raise ValueError("modulus must be >= 2", modulus)
def padlength(ctx):
return (modulus - (subcon._sizeof(ctx) % modulus)) % modulus
return SeqOfOne(subcon.name,
subcon,
# ??????
# ??????
# ??????
# ??????
Padding(padlength, pattern = pattern),
nested = False,
)
def SeqOfOne(name, *args, **kw):
"""a sequence of one element. only the first element is meaningful, the
rest are discarded
* name - the name of the sequence
* args - subconstructs
* kw - any keyword arguments to Sequence
"""
return IndexingAdapter(Sequence(name, *args, **kw), index = 0)
def Embedded(subcon):
"""embeds a struct into the enclosing struct.
* subcon - the struct to embed
"""
return Reconfig(subcon.name, subcon, subcon.FLAG_EMBED)
def Rename(newname, subcon):
"""renames an existing construct
* newname - the new name
* subcon - the subcon to rename
"""
return Reconfig(newname, subcon)
def Alias(newname, oldname):
"""creates an alias for an existing element in a struct
* newname - the new name
* oldname - the name of an existing element
"""
return Value(newname, lambda ctx: ctx[oldname])
#===============================================================================
# mapping
#===============================================================================
def SymmetricMapping(subcon, mapping, default = NotImplemented):
"""defines a symmetrical mapping: a->b, b->a.
* subcon - the subcon to map
* mapping - the encoding mapping (a dict); the decoding mapping is
achieved by reversing this mapping
* default - the default value to use when no mapping is found. if no
default value is given, and exception is raised. setting to Pass would
return the value "as is" (unmapped)
"""
reversed_mapping = dict((v, k) for k, v in mapping.items())
return MappingAdapter(subcon,
encoding = mapping,
decoding = reversed_mapping,
encdefault = default,
decdefault = default,
)
def Enum(subcon, **kw):
"""a set of named values mapping.
* subcon - the subcon to map
* kw - keyword arguments which serve as the encoding mapping
* _default_ - an optional, keyword-only argument that specifies the
default value to use when the mapping is undefined. if not given,
and exception is raised when the mapping is undefined. use `Pass` to
pass the unmapped value as-is
"""
return SymmetricMapping(subcon, kw, kw.pop("_default_", NotImplemented))
def FlagsEnum(subcon, **kw):
"""a set of flag values mapping.
* subcon - the subcon to map
* kw - keyword arguments which serve as the encoding mapping
"""
return FlagsAdapter(subcon, kw)
#===============================================================================
# structs
#===============================================================================
def AlignedStruct(name, *subcons, **kw):
"""a struct of aligned fields
* name - the name of the struct
* subcons - the subcons that make up this structure
* kw - keyword arguments to pass to Aligned: 'modulus' and 'pattern'
"""
return Struct(name, *(Aligned(sc, **kw) for sc in subcons))
def BitStruct(name, *subcons):
"""a struct of bitwise fields
* name - the name of the struct
* subcons - the subcons that make up this structure
"""
return Bitwise(Struct(name, *subcons))
def EmbeddedBitStruct(*subcons):
"""an embedded BitStruct. no name is necessary.
* subcons - the subcons that make up this structure
"""
return Bitwise(Embedded(Struct(None, *subcons)))
#===============================================================================
# strings
#===============================================================================
def String(name, length, encoding=None, padchar=None, paddir="right",
trimdir="right"):
"""
A configurable, fixed-length string field.
The padding character must be specified for padding and trimming to work.
:param str name: name
:param int length: length, in bytes
:param str encoding: encoding (e.g. "utf8") or None for no encoding
:param str padchar: optional character to pad out strings
:param str paddir: direction to pad out strings; one of "right", "left",
or "both"
:param str trim: direction to trim strings; one of "right", "left"
>>> from construct import String
>>> String("foo", 5).parse("hello")
'hello'
>>>
>>> String("foo", 12, encoding = "utf8").parse("hello joh\\xd4\\x83n")
u'hello joh\\u0503n'
>>>
>>> foo = String("foo", 10, padchar = "X", paddir = "right")
>>> foo.parse("helloXXXXX")
'hello'
>>> foo.build("hello")
'helloXXXXX'
"""
con = StringAdapter(Field(name, length), encoding=encoding)
if padchar is not None:
con = PaddedStringAdapter(con, padchar=padchar, paddir=paddir,
trimdir=trimdir)
return con
def PascalString(name, length_field=UBInt8("length"), encoding=None):
"""
A length-prefixed string.
``PascalString`` is named after the string types of Pascal, which are
length-prefixed. Lisp strings also follow this convention.
The length field will appear in the same ``Container`` as the
``PascalString``, with the given name.
:param str name: name
:param ``Construct`` length_field: a field which will store the length of
the string
:param str encoding: encoding (e.g. "utf8") or None for no encoding
>>> foo = PascalString("foo")
>>> foo.parse("\\x05hello")
'hello'
>>> foo.build("hello world")
'\\x0bhello world'
>>>
>>> foo = PascalString("foo", length_field = UBInt16("length"))
>>> foo.parse("\\x00\\x05hello")
'hello'
>>> foo.build("hello")
'\\x00\\x05hello'
"""
return StringAdapter(
LengthValueAdapter(
Sequence(name,
length_field,
Field("data", lambda ctx: ctx[length_field.name]),
)
),
encoding=encoding,
)
def CString(name, terminators=b"\x00", encoding=None,
char_field=Field(None, 1)):
"""
A string ending in a terminator.
``CString`` is similar to the strings of C, C++, and other related
programming languages.
By default, the terminator is the NULL byte (b``0x00``).
:param str name: name
:param iterable terminators: sequence of valid terminators, in order of
preference
:param str encoding: encoding (e.g. "utf8") or None for no encoding
:param ``Construct`` char_field: construct representing a single character
>>> foo = CString("foo")
>>> foo.parse(b"hello\\x00")
b'hello'
>>> foo.build(b"hello")
b'hello\\x00'
>>> foo = CString("foo", terminators = b"XYZ")
>>> foo.parse(b"helloX")
b'hello'
>>> foo.parse(b"helloY")
b'hello'
>>> foo.parse(b"helloZ")
b'hello'
>>> foo.build(b"hello")
b'helloX'
"""
return Rename(name,
CStringAdapter(
RepeatUntil(lambda obj, ctx: obj in terminators, char_field),
terminators=terminators,
encoding=encoding,
)
)
#===============================================================================
# conditional
#===============================================================================
def IfThenElse(name, predicate, then_subcon, else_subcon):
"""an if-then-else conditional construct: if the predicate indicates True,
`then_subcon` will be used; otherwise `else_subcon`
* name - the name of the construct
* predicate - a function taking the context as an argument and returning
True or False
* then_subcon - the subcon that will be used if the predicate returns True
* else_subcon - the subcon that will be used if the predicate returns False
"""
return Switch(name, lambda ctx: bool(predicate(ctx)),
{
True : then_subcon,
False : else_subcon,
}
)
def If(predicate, subcon, elsevalue = None):
"""an if-then conditional construct: if the predicate indicates True,
subcon will be used; otherwise, `elsevalue` will be returned instead.
* predicate - a function taking the context as an argument and returning
True or False
* subcon - the subcon that will be used if the predicate returns True
* elsevalue - the value that will be used should the predicate return False.
by default this value is None.
"""
return IfThenElse(subcon.name,
predicate,
subcon,
Value("elsevalue", lambda ctx: elsevalue)
)
#===============================================================================
# misc
#===============================================================================
def OnDemandPointer(offsetfunc, subcon, force_build = True):
"""an on-demand pointer.
* offsetfunc - a function taking the context as an argument and returning
the absolute stream position
* subcon - the subcon that will be parsed from the `offsetfunc()` stream
position on demand
* force_build - see OnDemand. by default True.
"""
return OnDemand(Pointer(offsetfunc, subcon),
advance_stream = False,
force_build = force_build
)
def Magic(data):
return ConstAdapter(Field(None, len(data)), data)
pyelftools-0.26/elftools/dwarf/ 0000775 0000000 0000000 00000000000 13572204573 0016603 5 ustar 00root root 0000000 0000000 pyelftools-0.26/elftools/dwarf/__init__.py 0000664 0000000 0000000 00000000000 13572204573 0020702 0 ustar 00root root 0000000 0000000 pyelftools-0.26/elftools/dwarf/abbrevtable.py 0000664 0000000 0000000 00000004762 13572204573 0021437 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/abbrevtable.py
#
# DWARF abbreviation table
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..common.utils import struct_parse, dwarf_assert
class AbbrevTable(object):
""" Represents a DWARF abbreviation table.
"""
def __init__(self, structs, stream, offset):
""" Create new abbreviation table. Parses the actual table from the
stream and stores it internally.
structs:
A DWARFStructs instance for parsing the data
stream, offset:
The stream and offset into the stream where this abbreviation
table lives.
"""
self.structs = structs
self.stream = stream
self.offset = offset
self._abbrev_map = self._parse_abbrev_table()
def get_abbrev(self, code):
""" Get the AbbrevDecl for a given code. Raise KeyError if no
declaration for this code exists.
"""
return self._abbrev_map[code]
def _parse_abbrev_table(self):
""" Parse the abbrev table from the stream
"""
map = {}
self.stream.seek(self.offset)
while True:
decl_code = struct_parse(
struct=self.structs.Dwarf_uleb128(''),
stream=self.stream)
if decl_code == 0:
break
declaration = struct_parse(
struct=self.structs.Dwarf_abbrev_declaration,
stream=self.stream)
map[decl_code] = AbbrevDecl(decl_code, declaration)
return map
class AbbrevDecl(object):
""" Wraps a parsed abbreviation declaration, exposing its fields with
dict-like access, and adding some convenience methods.
The abbreviation declaration represents an "entry" that points to it.
"""
def __init__(self, code, decl):
self.code = code
self.decl = decl
def has_children(self):
""" Does the entry have children?
"""
return self['children_flag'] == 'DW_CHILDREN_yes'
def iter_attr_specs(self):
""" Iterate over the attribute specifications for the entry. Yield
(name, form) pairs.
"""
for attr_spec in self['attr_spec']:
yield attr_spec.name, attr_spec.form
def __getitem__(self, entry):
return self.decl[entry]
pyelftools-0.26/elftools/dwarf/aranges.py 0000664 0000000 0000000 00000010445 13572204573 0020601 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/aranges.py
#
# DWARF aranges section decoding (.debug_aranges)
#
# Dorothy Chen (dorothchen@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
import os
from collections import namedtuple
from ..common.utils import struct_parse
from bisect import bisect_right
import math
# An entry in the aranges table;
# begin_addr: The beginning address in the CU
# length: The length of the address range in this entry
# info_offset: The CU's offset into .debug_info
# see 6.1.2 in DWARF4 docs for explanation of the remaining fields
ARangeEntry = namedtuple('ARangeEntry',
'begin_addr length info_offset unit_length version address_size segment_size')
class ARanges(object):
""" ARanges table in DWARF
stream, size:
A stream holding the .debug_aranges section, and its size
structs:
A DWARFStructs instance for parsing the data
"""
def __init__(self, stream, size, structs):
self.stream = stream
self.size = size
self.structs = structs
# Get entries of aranges table in the form of ARangeEntry tuples
self.entries = self._get_entries()
# Sort entries by the beginning address
self.entries.sort(key=lambda entry: entry.begin_addr)
# Create list of keys (first addresses) for better searching
self.keys = [entry.begin_addr for entry in self.entries]
def cu_offset_at_addr(self, addr):
""" Given an address, get the offset of the CU it belongs to, where
'offset' refers to the offset in the .debug_info section.
"""
tup = self.entries[bisect_right(self.keys, addr) - 1]
return tup.info_offset
#------ PRIVATE ------#
def _get_entries(self):
""" Populate self.entries with ARangeEntry tuples for each range of addresses
"""
self.stream.seek(0)
entries = []
offset = 0
# one loop == one "set" == one CU
while offset < self.size :
aranges_header = struct_parse(self.structs.Dwarf_aranges_header,
self.stream, offset)
addr_size = self._get_addr_size_struct(aranges_header["address_size"])
# No segmentation
if aranges_header["segment_size"] == 0:
# pad to nearest multiple of tuple size
tuple_size = aranges_header["address_size"] * 2
fp = self.stream.tell()
seek_to = int(math.ceil(fp/float(tuple_size)) * tuple_size)
self.stream.seek(seek_to)
# entries in this set/CU
addr = struct_parse(addr_size('addr'), self.stream)
length = struct_parse(addr_size('length'), self.stream)
while addr != 0 or length != 0:
# 'begin_addr length info_offset version address_size segment_size'
entries.append(
ARangeEntry(begin_addr=addr,
length=length,
info_offset=aranges_header["debug_info_offset"],
unit_length=aranges_header["unit_length"],
version=aranges_header["version"],
address_size=aranges_header["address_size"],
segment_size=aranges_header["segment_size"]))
addr = struct_parse(addr_size('addr'), self.stream)
length = struct_parse(addr_size('length'), self.stream)
# Segmentation exists in executable
elif aranges_header["segment_size"] != 0:
raise NotImplementedError("Segmentation not implemented")
offset = (offset
+ aranges_header.unit_length
+ self.structs.initial_length_field_size())
return entries
def _get_addr_size_struct(self, addr_header_value):
""" Given this set's header value (int) for the address size,
get the Construct representation of that size
"""
if addr_header_value == 4:
return self.structs.Dwarf_uint32
else:
assert addr_header_value == 8
return self.structs.Dwarf_uint64
pyelftools-0.26/elftools/dwarf/callframe.py 0000664 0000000 0000000 00000066347 13572204573 0021123 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/callframe.py
#
# DWARF call frame information
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
import copy
from collections import namedtuple
from ..common.utils import (struct_parse, dwarf_assert, preserve_stream_pos)
from ..common.py3compat import iterbytes, iterkeys
from ..construct import Struct, Switch
from .enums import DW_EH_encoding_flags
from .structs import DWARFStructs
from .constants import *
class CallFrameInfo(object):
""" DWARF CFI (Call Frame Info)
Note that this also supports unwinding information as found in .eh_frame
sections: its format differs slightly from the one in .debug_frame. See
.
stream, size:
A stream holding the .debug_frame section, and the size of the
section in it.
address:
Virtual address for this section. This is used to decode relative
addresses.
base_structs:
The structs to be used as the base for parsing this section.
Eventually, each entry gets its own structs based on the initial
length field it starts with. The address_size, however, is taken
from base_structs. This appears to be a limitation of the DWARFv3
standard, fixed in v4.
A discussion I had on dwarf-discuss confirms this.
So for DWARFv4 we'll take the address size from the CIE header,
but for earlier versions will use the elfclass of the containing
file; more sophisticated methods are used by libdwarf and others,
such as guessing which CU contains which FDEs (based on their
address ranges) and taking the address_size from those CUs.
"""
def __init__(self, stream, size, address, base_structs,
for_eh_frame=False):
self.stream = stream
self.size = size
self.address = address
self.base_structs = base_structs
self.entries = None
# Map between an offset in the stream and the entry object found at this
# offset. Useful for assigning CIE to FDEs according to the CIE_pointer
# header field which contains a stream offset.
self._entry_cache = {}
# The .eh_frame and .debug_frame section use almost the same CFI
# encoding, but there are tiny variations we need to handle during
# parsing.
self.for_eh_frame = for_eh_frame
def get_entries(self):
""" Get a list of entries that constitute this CFI. The list consists
of CIE or FDE objects, in the order of their appearance in the
section.
"""
if self.entries is None:
self.entries = self._parse_entries()
return self.entries
#-------------------------
def _parse_entries(self):
entries = []
offset = 0
while offset < self.size:
entries.append(self._parse_entry_at(offset))
offset = self.stream.tell()
return entries
def _parse_entry_at(self, offset):
""" Parse an entry from self.stream starting with the given offset.
Return the entry object. self.stream will point right after the
entry.
"""
if offset in self._entry_cache:
return self._entry_cache[offset]
entry_length = struct_parse(
self.base_structs.Dwarf_uint32(''), self.stream, offset)
if self.for_eh_frame and entry_length == 0:
return ZERO(offset)
dwarf_format = 64 if entry_length == 0xFFFFFFFF else 32
entry_structs = DWARFStructs(
little_endian=self.base_structs.little_endian,
dwarf_format=dwarf_format,
address_size=self.base_structs.address_size)
# Read the next field to see whether this is a CIE or FDE
CIE_id = struct_parse(
entry_structs.Dwarf_offset(''), self.stream)
if self.for_eh_frame:
is_CIE = CIE_id == 0
else:
is_CIE = (
(dwarf_format == 32 and CIE_id == 0xFFFFFFFF) or
CIE_id == 0xFFFFFFFFFFFFFFFF)
# Parse the header, which goes up to and excluding the sequence of
# instructions.
if is_CIE:
header_struct = (entry_structs.EH_CIE_header
if self.for_eh_frame else
entry_structs.Dwarf_CIE_header)
header = struct_parse(
header_struct, self.stream, offset)
else:
header = self._parse_fde_header(entry_structs, offset)
# If this is DWARF version 4 or later, we can have a more precise
# address size, read from the CIE header.
if not self.for_eh_frame and entry_structs.dwarf_version >= 4:
entry_structs = DWARFStructs(
little_endian=entry_structs.little_endian,
dwarf_format=entry_structs.dwarf_format,
address_size=header.address_size)
# If the augmentation string is not empty, hope to find a length field
# in order to skip the data specified augmentation.
if is_CIE:
aug_bytes, aug_dict = self._parse_cie_augmentation(
header, entry_structs)
else:
cie = self._parse_cie_for_fde(offset, header, entry_structs)
aug_bytes = self._read_augmentation_data(entry_structs)
# For convenience, compute the end offset for this entry
end_offset = (
offset + header.length +
entry_structs.initial_length_field_size())
# At this point self.stream is at the start of the instruction list
# for this entry
instructions = self._parse_instructions(
entry_structs, self.stream.tell(), end_offset)
if is_CIE:
self._entry_cache[offset] = CIE(
header=header, instructions=instructions, offset=offset,
augmentation_dict=aug_dict,
augmentation_bytes=aug_bytes,
structs=entry_structs)
else: # FDE
cie = self._parse_cie_for_fde(offset, header, entry_structs)
self._entry_cache[offset] = FDE(
header=header, instructions=instructions, offset=offset,
augmentation_bytes=aug_bytes,
structs=entry_structs, cie=cie)
return self._entry_cache[offset]
def _parse_instructions(self, structs, offset, end_offset):
""" Parse a list of CFI instructions from self.stream, starting with
the offset and until (not including) end_offset.
Return a list of CallFrameInstruction objects.
"""
instructions = []
while offset < end_offset:
opcode = struct_parse(structs.Dwarf_uint8(''), self.stream, offset)
args = []
primary = opcode & _PRIMARY_MASK
primary_arg = opcode & _PRIMARY_ARG_MASK
if primary == DW_CFA_advance_loc:
args = [primary_arg]
elif primary == DW_CFA_offset:
args = [
primary_arg,
struct_parse(structs.Dwarf_uleb128(''), self.stream)]
elif primary == DW_CFA_restore:
args = [primary_arg]
# primary == 0 and real opcode is extended
elif opcode in (DW_CFA_nop, DW_CFA_remember_state,
DW_CFA_restore_state):
args = []
elif opcode == DW_CFA_set_loc:
args = [
struct_parse(structs.Dwarf_target_addr(''), self.stream)]
elif opcode == DW_CFA_advance_loc1:
args = [struct_parse(structs.Dwarf_uint8(''), self.stream)]
elif opcode == DW_CFA_advance_loc2:
args = [struct_parse(structs.Dwarf_uint16(''), self.stream)]
elif opcode == DW_CFA_advance_loc4:
args = [struct_parse(structs.Dwarf_uint32(''), self.stream)]
elif opcode in (DW_CFA_offset_extended, DW_CFA_register,
DW_CFA_def_cfa, DW_CFA_val_offset):
args = [
struct_parse(structs.Dwarf_uleb128(''), self.stream),
struct_parse(structs.Dwarf_uleb128(''), self.stream)]
elif opcode in (DW_CFA_restore_extended, DW_CFA_undefined,
DW_CFA_same_value, DW_CFA_def_cfa_register,
DW_CFA_def_cfa_offset):
args = [struct_parse(structs.Dwarf_uleb128(''), self.stream)]
elif opcode == DW_CFA_def_cfa_offset_sf:
args = [struct_parse(structs.Dwarf_sleb128(''), self.stream)]
elif opcode == DW_CFA_def_cfa_expression:
args = [struct_parse(
structs.Dwarf_dw_form['DW_FORM_block'], self.stream)]
elif opcode in (DW_CFA_expression, DW_CFA_val_expression):
args = [
struct_parse(structs.Dwarf_uleb128(''), self.stream),
struct_parse(
structs.Dwarf_dw_form['DW_FORM_block'], self.stream)]
elif opcode in (DW_CFA_offset_extended_sf,
DW_CFA_def_cfa_sf, DW_CFA_val_offset_sf):
args = [
struct_parse(structs.Dwarf_uleb128(''), self.stream),
struct_parse(structs.Dwarf_sleb128(''), self.stream)]
else:
dwarf_assert(False, 'Unknown CFI opcode: 0x%x' % opcode)
instructions.append(CallFrameInstruction(opcode=opcode, args=args))
offset = self.stream.tell()
return instructions
def _parse_cie_for_fde(self, fde_offset, fde_header, entry_structs):
""" Parse the CIE that corresponds to an FDE.
"""
# Determine the offset of the CIE that corresponds to this FDE
if self.for_eh_frame:
# CIE_pointer contains the offset for a reverse displacement from
# the section offset of the CIE_pointer field itself (not from the
# FDE header offset).
cie_displacement = fde_header['CIE_pointer']
cie_offset = (fde_offset + entry_structs.dwarf_format // 8
- cie_displacement)
else:
cie_offset = fde_header['CIE_pointer']
# Then read it
with preserve_stream_pos(self.stream):
return self._parse_entry_at(cie_offset)
def _parse_cie_augmentation(self, header, entry_structs):
""" Parse CIE augmentation data from the annotation string in `header`.
Return a tuple that contains 1) the augmentation data as a string
(without the length field) and 2) the augmentation data as a dict.
"""
augmentation = header.get('augmentation')
if not augmentation:
return ('', {})
# Augmentation parsing works in minimal mode here: we need the length
# field to be able to skip unhandled augmentation fields.
assert augmentation.startswith(b'z'), (
'Unhandled augmentation string: {}'.format(repr(augmentation)))
available_fields = {
b'z': entry_structs.Dwarf_uleb128('length'),
b'L': entry_structs.Dwarf_uint8('LSDA_encoding'),
b'R': entry_structs.Dwarf_uint8('FDE_encoding'),
b'S': True,
b'P': Struct(
'personality',
entry_structs.Dwarf_uint8('encoding'),
Switch('function', lambda ctx: ctx.encoding & 0x0f, {
enc: fld_cons('function')
for enc, fld_cons
in self._eh_encoding_to_field(entry_structs).items()})),
}
# Build the Struct we will be using to parse the augmentation data.
# Stop as soon as we are not able to match the augmentation string.
fields = []
aug_dict = {}
for b in iterbytes(augmentation):
try:
fld = available_fields[b]
except KeyError:
break
if fld is True:
aug_dict[fld] = True
else:
fields.append(fld)
# Read the augmentation twice: once with the Struct, once for the raw
# bytes. Read the raw bytes last so we are sure we leave the stream
# pointing right after the augmentation: the Struct may be incomplete
# (missing trailing fields) due to an unknown char: see the KeyError
# above.
offset = self.stream.tell()
struct = Struct('Augmentation_Data', *fields)
aug_dict.update(struct_parse(struct, self.stream, offset))
self.stream.seek(offset)
aug_bytes = self._read_augmentation_data(entry_structs)
return (aug_bytes, aug_dict)
def _read_augmentation_data(self, entry_structs):
""" Read augmentation data.
This assumes that the augmentation string starts with 'z', i.e. that
augmentation data is prefixed by a length field, which is not returned.
"""
if not self.for_eh_frame:
return b''
augmentation_data_length = struct_parse(
Struct('Dummy_Augmentation_Data',
entry_structs.Dwarf_uleb128('length')),
self.stream)['length']
return self.stream.read(augmentation_data_length)
def _parse_fde_header(self, entry_structs, offset):
""" Compute a struct to parse the header of the current FDE.
"""
if not self.for_eh_frame:
return struct_parse(entry_structs.Dwarf_FDE_header, self.stream,
offset)
fields = [entry_structs.Dwarf_initial_length('length'),
entry_structs.Dwarf_offset('CIE_pointer')]
# Parse the couple of header fields that are always here so we can
# fetch the corresponding CIE.
minimal_header = struct_parse(Struct('eh_frame_minimal_header',
*fields), self.stream, offset)
cie = self._parse_cie_for_fde(offset, minimal_header, entry_structs)
initial_location_offset = self.stream.tell()
# Try to parse the initial location. We need the initial location in
# order to create a meaningful FDE, so assume it's there. Omission does
# not seem to happen in practice.
encoding = cie.augmentation_dict['FDE_encoding']
assert encoding != DW_EH_encoding_flags['DW_EH_PE_omit']
basic_encoding = encoding & 0x0f
encoding_modifier = encoding & 0xf0
# Depending on the specified encoding, complete the header Struct
formats = self._eh_encoding_to_field(entry_structs)
fields.append(formats[basic_encoding]('initial_location'))
fields.append(formats[basic_encoding]('address_range'))
result = struct_parse(Struct('Dwarf_FDE_header', *fields),
self.stream, offset)
if encoding_modifier == 0:
pass
elif encoding_modifier == DW_EH_encoding_flags['DW_EH_PE_pcrel']:
# Start address is relative to the address of the
# "initial_location" field.
result['initial_location'] += (
self.address + initial_location_offset)
else:
assert False, 'Unsupported encoding: {:#x}'.format(encoding)
return result
def _eh_encoding_to_field(self, entry_structs):
"""
Return a mapping from basic encodings (DW_EH_encoding_flags) the
corresponding field constructors (for instance
entry_structs.Dwarf_uint32).
"""
return {
DW_EH_encoding_flags['DW_EH_PE_absptr']:
entry_structs.Dwarf_uint32
if entry_structs.dwarf_format == 32 else
entry_structs.Dwarf_uint64,
DW_EH_encoding_flags['DW_EH_PE_uleb128']:
entry_structs.Dwarf_uleb128,
DW_EH_encoding_flags['DW_EH_PE_udata2']:
entry_structs.Dwarf_uint16,
DW_EH_encoding_flags['DW_EH_PE_udata4']:
entry_structs.Dwarf_uint32,
DW_EH_encoding_flags['DW_EH_PE_udata8']:
entry_structs.Dwarf_uint64,
DW_EH_encoding_flags['DW_EH_PE_sleb128']:
entry_structs.Dwarf_sleb128,
DW_EH_encoding_flags['DW_EH_PE_sdata2']:
entry_structs.Dwarf_int16,
DW_EH_encoding_flags['DW_EH_PE_sdata4']:
entry_structs.Dwarf_int32,
DW_EH_encoding_flags['DW_EH_PE_sdata8']:
entry_structs.Dwarf_int64,
}
def instruction_name(opcode):
""" Given an opcode, return the instruction name.
"""
primary = opcode & _PRIMARY_MASK
if primary == 0:
return _OPCODE_NAME_MAP[opcode]
else:
return _OPCODE_NAME_MAP[primary]
class CallFrameInstruction(object):
""" An instruction in the CFI section. opcode is the instruction
opcode, numeric - as it appears in the section. args is a list of
arguments (including arguments embedded in the low bits of some
instructions, when applicable), decoded from the stream.
"""
def __init__(self, opcode, args):
self.opcode = opcode
self.args = args
def __repr__(self):
return '%s (0x%x): %s' % (
instruction_name(self.opcode), self.opcode, self.args)
class CFIEntry(object):
""" A common base class for CFI entries.
Contains a header and a list of instructions (CallFrameInstruction).
offset: the offset of this entry from the beginning of the section
cie: for FDEs, a CIE pointer is required
augmentation_dict: Augmentation data as a parsed struct (dict): see
CallFrameInfo._parse_cie_augmentation and
http://www.airs.com/blog/archives/460.
augmentation_bytes: Augmentation data as a chain of bytes: see
CallFrameInfo._parse_cie_augmentation and
http://www.airs.com/blog/archives/460.
"""
def __init__(self, header, structs, instructions, offset,
augmentation_dict={}, augmentation_bytes=b'', cie=None):
self.header = header
self.structs = structs
self.instructions = instructions
self.offset = offset
self.cie = cie
self._decoded_table = None
self.augmentation_dict = augmentation_dict
self.augmentation_bytes = augmentation_bytes
def get_decoded(self):
""" Decode the CFI contained in this entry and return a
DecodedCallFrameTable object representing it. See the documentation
of that class to understand how to interpret the decoded table.
"""
if self._decoded_table is None:
self._decoded_table = self._decode_CFI_table()
return self._decoded_table
def __getitem__(self, name):
""" Implement dict-like access to header entries
"""
return self.header[name]
def _decode_CFI_table(self):
""" Decode the instructions contained in the given CFI entry and return
a DecodedCallFrameTable.
"""
if isinstance(self, CIE):
# For a CIE, initialize cur_line to an "empty" line
cie = self
cur_line = dict(pc=0, cfa=CFARule(reg=None, offset=0))
reg_order = []
else: # FDE
# For a FDE, we need to decode the attached CIE first, because its
# decoded table is needed. Its "initial instructions" describe a
# line that serves as the base (first) line in the FDE's table.
cie = self.cie
cie_decoded_table = cie.get_decoded()
if len(cie_decoded_table.table) > 0:
last_line_in_CIE = copy.copy(cie_decoded_table.table[-1])
cur_line = copy.copy(last_line_in_CIE)
else:
cur_line = dict(cfa=CFARule(reg=None, offset=0))
cur_line['pc'] = self['initial_location']
reg_order = copy.copy(cie_decoded_table.reg_order)
table = []
# Keeps a stack for the use of DW_CFA_{remember|restore}_state
# instructions.
line_stack = []
def _add_to_order(regnum):
if regnum not in cur_line:
reg_order.append(regnum)
for instr in self.instructions:
# Throughout this loop, cur_line is the current line. Some
# instructions add it to the table, but most instructions just
# update it without adding it to the table.
name = instruction_name(instr.opcode)
if name == 'DW_CFA_set_loc':
table.append(copy.copy(cur_line))
cur_line['pc'] = instr.args[0]
elif name in ( 'DW_CFA_advance_loc1', 'DW_CFA_advance_loc2',
'DW_CFA_advance_loc4', 'DW_CFA_advance_loc'):
table.append(copy.copy(cur_line))
cur_line['pc'] += instr.args[0] * cie['code_alignment_factor']
elif name == 'DW_CFA_def_cfa':
cur_line['cfa'] = CFARule(
reg=instr.args[0],
offset=instr.args[1])
elif name == 'DW_CFA_def_cfa_sf':
cur_line['cfa'] = CFARule(
reg=instr.args[0],
offset=instr.args[1] * cie['code_alignment_factor'])
elif name == 'DW_CFA_def_cfa_register':
cur_line['cfa'] = CFARule(
reg=instr.args[0],
offset=cur_line['cfa'].offset)
elif name == 'DW_CFA_def_cfa_offset':
cur_line['cfa'] = CFARule(
reg=cur_line['cfa'].reg,
offset=instr.args[0])
elif name == 'DW_CFA_def_cfa_expression':
cur_line['cfa'] = CFARule(expr=instr.args[0])
elif name == 'DW_CFA_undefined':
_add_to_order(instr.args[0])
cur_line[instr.args[0]] = RegisterRule(RegisterRule.UNDEFINED)
elif name == 'DW_CFA_same_value':
_add_to_order(instr.args[0])
cur_line[instr.args[0]] = RegisterRule(RegisterRule.SAME_VALUE)
elif name in ( 'DW_CFA_offset', 'DW_CFA_offset_extended',
'DW_CFA_offset_extended_sf'):
_add_to_order(instr.args[0])
cur_line[instr.args[0]] = RegisterRule(
RegisterRule.OFFSET,
instr.args[1] * cie['data_alignment_factor'])
elif name in ('DW_CFA_val_offset', 'DW_CFA_val_offset_sf'):
_add_to_order(instr.args[0])
cur_line[instr.args[0]] = RegisterRule(
RegisterRule.VAL_OFFSET,
instr.args[1] * cie['data_alignment_factor'])
elif name == 'DW_CFA_register':
_add_to_order(instr.args[0])
cur_line[instr.args[0]] = RegisterRule(
RegisterRule.REGISTER,
instr.args[1])
elif name == 'DW_CFA_expression':
_add_to_order(instr.args[0])
cur_line[instr.args[0]] = RegisterRule(
RegisterRule.EXPRESSION,
instr.args[1])
elif name == 'DW_CFA_val_expression':
_add_to_order(instr.args[0])
cur_line[instr.args[0]] = RegisterRule(
RegisterRule.VAL_EXPRESSION,
instr.args[1])
elif name in ('DW_CFA_restore', 'DW_CFA_restore_extended'):
_add_to_order(instr.args[0])
dwarf_assert(
isinstance(self, FDE),
'%s instruction must be in a FDE' % name)
if instr.args[0] in last_line_in_CIE:
cur_line[instr.args[0]] = last_line_in_CIE[instr.args[0]]
else:
cur_line.pop(instr.args[0], None)
elif name == 'DW_CFA_remember_state':
line_stack.append(copy.deepcopy(cur_line))
elif name == 'DW_CFA_restore_state':
pc = cur_line['pc']
cur_line = line_stack.pop()
cur_line['pc'] = pc
# The current line is appended to the table after all instructions
# have ended, if there were instructions.
if cur_line['cfa'].reg is not None or len(cur_line) > 2:
table.append(cur_line)
return DecodedCallFrameTable(table=table, reg_order=reg_order)
# A CIE and FDE have exactly the same functionality, except that a FDE has
# a pointer to its CIE. The functionality was wholly encapsulated in CFIEntry,
# so the CIE and FDE classes exists separately for identification (instead
# of having an explicit "entry_type" field in CFIEntry).
#
class CIE(CFIEntry):
pass
class FDE(CFIEntry):
pass
class ZERO(object):
""" End marker for the sequence of CIE/FDE.
This is specific to `.eh_frame` sections: this kind of entry does not exist
in pure DWARF. `readelf` displays these as "ZERO terminator", hence the
class name.
"""
def __init__(self, offset):
self.offset = offset
class RegisterRule(object):
""" Register rules are used to find registers in call frames. Each rule
consists of a type (enumeration following DWARFv3 section 6.4.1)
and an optional argument to augment the type.
"""
UNDEFINED = 'UNDEFINED'
SAME_VALUE = 'SAME_VALUE'
OFFSET = 'OFFSET'
VAL_OFFSET = 'VAL_OFFSET'
REGISTER = 'REGISTER'
EXPRESSION = 'EXPRESSION'
VAL_EXPRESSION = 'VAL_EXPRESSION'
ARCHITECTURAL = 'ARCHITECTURAL'
def __init__(self, type, arg=None):
self.type = type
self.arg = arg
def __repr__(self):
return 'RegisterRule(%s, %s)' % (self.type, self.arg)
class CFARule(object):
""" A CFA rule is used to compute the CFA for each location. It either
consists of a register+offset, or a DWARF expression.
"""
def __init__(self, reg=None, offset=None, expr=None):
self.reg = reg
self.offset = offset
self.expr = expr
def __repr__(self):
return 'CFARule(reg=%s, offset=%s, expr=%s)' % (
self.reg, self.offset, self.expr)
# Represents the decoded CFI for an entry, which is just a large table,
# according to DWARFv3 section 6.4.1
#
# DecodedCallFrameTable is a simple named tuple to group together the table
# and the register appearance order.
#
# table:
#
# A list of dicts that represent "lines" in the decoded table. Each line has
# some special dict entries: 'pc' for the location/program counter (LOC),
# and 'cfa' for the CFARule to locate the CFA on that line.
# The other entries are keyed by register numbers with RegisterRule values,
# and describe the rules for these registers.
#
# reg_order:
#
# A list of register numbers that are described in the table by the order of
# their appearance.
#
DecodedCallFrameTable = namedtuple(
'DecodedCallFrameTable', 'table reg_order')
#---------------- PRIVATE ----------------#
_PRIMARY_MASK = 0b11000000
_PRIMARY_ARG_MASK = 0b00111111
# This dictionary is filled by automatically scanning the constants module
# for DW_CFA_* instructions, and mapping their values to names. Since all
# names were imported from constants with `import *`, we look in globals()
_OPCODE_NAME_MAP = {}
for name in list(iterkeys(globals())):
if name.startswith('DW_CFA'):
_OPCODE_NAME_MAP[globals()[name]] = name
pyelftools-0.26/elftools/dwarf/compileunit.py 0000664 0000000 0000000 00000015101 13572204573 0021503 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/compileunit.py
#
# DWARF compile unit
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from bisect import bisect_left
from .die import DIE
class CompileUnit(object):
""" A DWARF compilation unit (CU).
A normal compilation unit typically represents the text and data
contributed to an executable by a single relocatable object file.
It may be derived from several source files,
including pre-processed "include files"
Serves as a container and context to DIEs that describe objects and code
belonging to a compilation unit.
CU header entries can be accessed as dict keys from this object, i.e.
cu = CompileUnit(...)
cu['version'] # version field of the CU header
To get the top-level DIE describing the compilation unit, call the
get_top_DIE method.
"""
def __init__(self, header, dwarfinfo, structs, cu_offset, cu_die_offset):
""" header:
CU header for this compile unit
dwarfinfo:
The DWARFInfo context object which created this one
structs:
A DWARFStructs instance suitable for this compile unit
cu_offset:
Offset in the stream to the beginning of this CU (its header)
cu_die_offset:
Offset in the stream of the top DIE of this CU
"""
self.dwarfinfo = dwarfinfo
self.header = header
self.structs = structs
self.cu_offset = cu_offset
self.cu_die_offset = cu_die_offset
# The abbreviation table for this CU. Filled lazily when DIEs are
# requested.
self._abbrev_table = None
# A list of DIEs belonging to this CU.
# This list is lazily constructed as DIEs are iterated over.
self._dielist = []
# A list of file offsets, corresponding (by index) to the DIEs
# in `self._dielist`. This list exists separately from
# `self._dielist` to make it binary searchable, enabling the
# DIE population strategy used in `iter_DIE_children`.
# Like `self._dielist`, this list is lazily constructed
# as DIEs are iterated over.
self._diemap = []
def dwarf_format(self):
""" Get the DWARF format (32 or 64) for this CU
"""
return self.structs.dwarf_format
def get_abbrev_table(self):
""" Get the abbreviation table (AbbrevTable object) for this CU
"""
if self._abbrev_table is None:
self._abbrev_table = self.dwarfinfo.get_abbrev_table(
self['debug_abbrev_offset'])
return self._abbrev_table
def get_top_DIE(self):
""" Get the top DIE (which is either a DW_TAG_compile_unit or
DW_TAG_partial_unit) of this CU
"""
# Note that a top DIE always has minimal offset and is therefore
# at the beginning of our lists, so no bisect is required.
if len(self._diemap) > 0:
return self._dielist[0]
top = DIE(
cu=self,
stream=self.dwarfinfo.debug_info_sec.stream,
offset=self.cu_die_offset)
self._dielist.insert(0, top)
self._diemap.insert(0, self.cu_die_offset)
return top
def iter_DIEs(self):
""" Iterate over all the DIEs in the CU, in order of their appearance.
Note that null DIEs will also be returned.
"""
return self._iter_DIE_subtree(self.get_top_DIE())
def iter_DIE_children(self, die):
""" Given a DIE, yields either its children, without null DIE list
terminator, or nothing, if that DIE has no children.
The null DIE terminator is saved in that DIE when iteration ended.
"""
if not die.has_children:
return
# `cur_offset` tracks the offset past our current DIE as we iterate
# over children, providing the pivot as we bisect `self._diemap`
# and ensuring that we insert our children (and child offsets)
# in the correct order within both `self._dielist` and `self._diemap`.
cur_offset = die.offset + die.size
while True:
i = bisect_left(self._diemap, cur_offset)
# Note that `self._diemap` cannot be empty because a `die`, the argument,
# is already parsed.
if i < len(self._diemap) and cur_offset == self._diemap[i]:
child = self._dielist[i]
else:
child = DIE(
cu=self,
stream=die.stream,
offset=cur_offset)
self._dielist.insert(i, child)
self._diemap.insert(i, cur_offset)
child.set_parent(die)
if child.is_null():
die._terminator = child
return
yield child
if not child.has_children:
cur_offset += child.size
elif "DW_AT_sibling" in child.attributes:
sibling = child.attributes["DW_AT_sibling"]
cur_offset = sibling.value + self.cu_offset
else:
# If no DW_AT_sibling attribute is provided by the producer
# then the whole child subtree must be parsed to find its next
# sibling. There is one zero byte representing null DIE
# terminating children list. It is used to locate child subtree
# bounds.
# If children are not parsed yet, this instruction will manage
# to recursive call of this function which will result in
# setting of `_terminator` attribute of the `child`.
if child._terminator is None:
for _ in self.iter_DIE_children(child):
pass
cur_offset = child._terminator.offset + child._terminator.size
#------ PRIVATE ------#
def __getitem__(self, name):
""" Implement dict-like access to header entries
"""
return self.header[name]
def _iter_DIE_subtree(self, die):
""" Given a DIE, this yields it with its subtree including null DIEs
(child list terminators).
"""
yield die
if die.has_children:
for c in die.iter_children():
for d in self._iter_DIE_subtree(c):
yield d
yield die._terminator
pyelftools-0.26/elftools/dwarf/constants.py 0000664 0000000 0000000 00000010542 13572204573 0021173 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/constants.py
#
# Constants and flags
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
# Inline codes
#
DW_INL_not_inlined = 0
DW_INL_inlined = 1
DW_INL_declared_not_inlined = 2
DW_INL_declared_inlined = 3
# Source languages
#
DW_LANG_C89 = 0x0001
DW_LANG_C = 0x0002
DW_LANG_Ada83 = 0x0003
DW_LANG_C_plus_plus = 0x0004
DW_LANG_Cobol74 = 0x0005
DW_LANG_Cobol85 = 0x0006
DW_LANG_Fortran77 = 0x0007
DW_LANG_Fortran90 = 0x0008
DW_LANG_Pascal83 = 0x0009
DW_LANG_Modula2 = 0x000a
DW_LANG_Java = 0x000b
DW_LANG_C99 = 0x000c
DW_LANG_Ada95 = 0x000d
DW_LANG_Fortran95 = 0x000e
DW_LANG_PLI = 0x000f
DW_LANG_ObjC = 0x0010
DW_LANG_ObjC_plus_plus = 0x0011
DW_LANG_UPC = 0x0012
DW_LANG_D = 0x0013
DW_LANG_Python = 0x0014
DW_LANG_OpenCL = 0x0015
DW_LANG_Go = 0x0016
DW_LANG_Modula3 = 0x0017
DW_LANG_Haskell = 0x0018
DW_LANG_C_plus_plus_03 = 0x0019
DW_LANG_C_plus_plus_11 = 0x001a
DW_LANG_OCaml = 0x001b
DW_LANG_Rust = 0x001c
DW_LANG_C11 = 0x001d
DW_LANG_Swift = 0x001e
DW_LANG_Julia = 0x001f
DW_LANG_Dylan = 0x0020
DW_LANG_C_plus_plus_14 = 0x0021
DW_LANG_Fortran03 = 0x0022
DW_LANG_Fortran08 = 0x0023
DW_LANG_RenderScript = 0x0024
DW_LANG_BLISS = 0x0025
DW_LANG_Mips_Assembler = 0x8001
DW_LANG_Upc = 0x8765
DW_LANG_HP_Bliss = 0x8003
DW_LANG_HP_Basic91 = 0x8004
DW_LANG_HP_Pascal91 = 0x8005
DW_LANG_HP_IMacro = 0x8006
DW_LANG_HP_Assembler = 0x8007
DW_LANG_GOOGLE_RenderScript = 0x8e57
DW_LANG_BORLAND_Delphi = 0xb000
# Encoding
#
DW_ATE_void = 0x0
DW_ATE_address = 0x1
DW_ATE_boolean = 0x2
DW_ATE_complex_float = 0x3
DW_ATE_float = 0x4
DW_ATE_signed = 0x5
DW_ATE_signed_char = 0x6
DW_ATE_unsigned = 0x7
DW_ATE_unsigned_char = 0x8
DW_ATE_imaginary_float = 0x9
DW_ATE_packed_decimal = 0xa
DW_ATE_numeric_string = 0xb
DW_ATE_edited = 0xc
DW_ATE_signed_fixed = 0xd
DW_ATE_unsigned_fixed = 0xe
DW_ATE_decimal_float = 0xf
DW_ATE_UTF = 0x10
DW_ATE_UCS = 0x11
DW_ATE_ASCII = 0x12
DW_ATE_lo_user = 0x80
DW_ATE_hi_user = 0xff
DW_ATE_HP_float80 = 0x80
DW_ATE_HP_complex_float80 = 0x81
DW_ATE_HP_float128 = 0x82
DW_ATE_HP_complex_float128 = 0x83
DW_ATE_HP_floathpintel = 0x84
DW_ATE_HP_imaginary_float80 = 0x85
DW_ATE_HP_imaginary_float128 = 0x86
# Access
#
DW_ACCESS_public = 1
DW_ACCESS_protected = 2
DW_ACCESS_private = 3
# Visibility
#
DW_VIS_local = 1
DW_VIS_exported = 2
DW_VIS_qualified = 3
# Virtuality
#
DW_VIRTUALITY_none = 0
DW_VIRTUALITY_virtual = 1
DW_VIRTUALITY_pure_virtual = 2
# ID case
#
DW_ID_case_sensitive = 0
DW_ID_up_case = 1
DW_ID_down_case = 2
DW_ID_case_insensitive = 3
# Calling convention
#
DW_CC_normal = 0x1
DW_CC_program = 0x2
DW_CC_nocall = 0x3
# Ordering
#
DW_ORD_row_major = 0
DW_ORD_col_major = 1
# Line program opcodes
#
DW_LNS_copy = 0x01
DW_LNS_advance_pc = 0x02
DW_LNS_advance_line = 0x03
DW_LNS_set_file = 0x04
DW_LNS_set_column = 0x05
DW_LNS_negate_stmt = 0x06
DW_LNS_set_basic_block = 0x07
DW_LNS_const_add_pc = 0x08
DW_LNS_fixed_advance_pc = 0x09
DW_LNS_set_prologue_end = 0x0a
DW_LNS_set_epilogue_begin = 0x0b
DW_LNS_set_isa = 0x0c
DW_LNE_end_sequence = 0x01
DW_LNE_set_address = 0x02
DW_LNE_define_file = 0x03
# Call frame instructions
#
# Note that the first 3 instructions have the so-called "primary opcode"
# (as described in DWARFv3 7.23), so only their highest 2 bits take part
# in the opcode decoding. They are kept as constants with the low bits masked
# out, and the callframe module knows how to handle this.
# The other instructions use an "extended opcode" encoded just in the low 6
# bits, with the high 2 bits, so these constants are exactly as they would
# appear in an actual file.
#
DW_CFA_advance_loc = 0b01000000
DW_CFA_offset = 0b10000000
DW_CFA_restore = 0b11000000
DW_CFA_nop = 0x00
DW_CFA_set_loc = 0x01
DW_CFA_advance_loc1 = 0x02
DW_CFA_advance_loc2 = 0x03
DW_CFA_advance_loc4 = 0x04
DW_CFA_offset_extended = 0x05
DW_CFA_restore_extended = 0x06
DW_CFA_undefined = 0x07
DW_CFA_same_value = 0x08
DW_CFA_register = 0x09
DW_CFA_remember_state = 0x0a
DW_CFA_restore_state = 0x0b
DW_CFA_def_cfa = 0x0c
DW_CFA_def_cfa_register = 0x0d
DW_CFA_def_cfa_offset = 0x0e
DW_CFA_def_cfa_expression = 0x0f
DW_CFA_expression = 0x10
DW_CFA_offset_extended_sf = 0x11
DW_CFA_def_cfa_sf = 0x12
DW_CFA_def_cfa_offset_sf = 0x13
DW_CFA_val_offset = 0x14
DW_CFA_val_offset_sf = 0x15
DW_CFA_val_expression = 0x16
pyelftools-0.26/elftools/dwarf/descriptions.py 0000664 0000000 0000000 00000052504 13572204573 0021671 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/descriptions.py
#
# Textual descriptions of the various values and enums of DWARF
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from collections import defaultdict
from .constants import *
from .dwarf_expr import GenericExprVisitor
from .die import DIE
from ..common.utils import preserve_stream_pos, dwarf_assert
from ..common.py3compat import bytes2str
from .callframe import instruction_name, CIE, FDE
def set_global_machine_arch(machine_arch):
global _MACHINE_ARCH
_MACHINE_ARCH = machine_arch
def describe_attr_value(attr, die, section_offset):
""" Given an attribute attr, return the textual representation of its
value, suitable for tools like readelf.
To cover all cases, this function needs some extra arguments:
die: the DIE this attribute was extracted from
section_offset: offset in the stream of the section the DIE belongs to
"""
descr_func = _ATTR_DESCRIPTION_MAP[attr.form]
val_description = descr_func(attr, die, section_offset)
# For some attributes we can display further information
extra_info_func = _EXTRA_INFO_DESCRIPTION_MAP[attr.name]
extra_info = extra_info_func(attr, die, section_offset)
return str(val_description) + '\t' + extra_info
def describe_CFI_instructions(entry):
""" Given a CFI entry (CIE or FDE), return the textual description of its
instructions.
"""
def _assert_FDE_instruction(instr):
dwarf_assert(
isinstance(entry, FDE),
'Unexpected instruction "%s" for a CIE' % instr)
def _full_reg_name(regnum):
regname = describe_reg_name(regnum, _MACHINE_ARCH, False)
if regname:
return 'r%s (%s)' % (regnum, regname)
else:
return 'r%s' % regnum
if isinstance(entry, CIE):
cie = entry
else: # FDE
cie = entry.cie
pc = entry['initial_location']
s = ''
for instr in entry.instructions:
name = instruction_name(instr.opcode)
if name in ('DW_CFA_offset',
'DW_CFA_offset_extended', 'DW_CFA_offset_extended_sf',
'DW_CFA_val_offset', 'DW_CFA_val_offset_sf'):
s += ' %s: %s at cfa%+d\n' % (
name, _full_reg_name(instr.args[0]),
instr.args[1] * cie['data_alignment_factor'])
elif name in ( 'DW_CFA_restore', 'DW_CFA_restore_extended',
'DW_CFA_undefined', 'DW_CFA_same_value',
'DW_CFA_def_cfa_register'):
s += ' %s: %s\n' % (name, _full_reg_name(instr.args[0]))
elif name == 'DW_CFA_register':
s += ' %s: %s in %s' % (
name, _full_reg_name(instr.args[0]),
_full_reg_name(instr.args[1]))
elif name == 'DW_CFA_set_loc':
pc = instr.args[0]
s += ' %s: %08x\n' % (name, pc)
elif name in ( 'DW_CFA_advance_loc1', 'DW_CFA_advance_loc2',
'DW_CFA_advance_loc4', 'DW_CFA_advance_loc'):
_assert_FDE_instruction(instr)
factored_offset = instr.args[0] * cie['code_alignment_factor']
s += ' %s: %s to %08x\n' % (
name, factored_offset, factored_offset + pc)
pc += factored_offset
elif name in ( 'DW_CFA_remember_state', 'DW_CFA_restore_state',
'DW_CFA_nop'):
s += ' %s\n' % name
elif name == 'DW_CFA_def_cfa':
s += ' %s: %s ofs %s\n' % (
name, _full_reg_name(instr.args[0]), instr.args[1])
elif name == 'DW_CFA_def_cfa_sf':
s += ' %s: %s ofs %s\n' % (
name, _full_reg_name(instr.args[0]),
instr.args[1] * cie['data_alignment_factor'])
elif name == 'DW_CFA_def_cfa_offset':
s += ' %s: %s\n' % (name, instr.args[0])
elif name == 'DW_CFA_def_cfa_expression':
expr_dumper = ExprDumper(entry.structs)
expr_dumper.process_expr(instr.args[0])
# readelf output is missing a colon for DW_CFA_def_cfa_expression
s += ' %s (%s)\n' % (name, expr_dumper.get_str())
elif name == 'DW_CFA_expression':
expr_dumper = ExprDumper(entry.structs)
expr_dumper.process_expr(instr.args[1])
s += ' %s: %s (%s)\n' % (
name, _full_reg_name(instr.args[0]), expr_dumper.get_str())
else:
s += ' %s: ?>\n' % name
return s
def describe_CFI_register_rule(rule):
s = _DESCR_CFI_REGISTER_RULE_TYPE[rule.type]
if rule.type in ('OFFSET', 'VAL_OFFSET'):
s += '%+d' % rule.arg
elif rule.type == 'REGISTER':
s += describe_reg_name(rule.arg)
return s
def describe_CFI_CFA_rule(rule):
if rule.expr:
return 'exp'
else:
return '%s%+d' % (describe_reg_name(rule.reg), rule.offset)
def describe_DWARF_expr(expr, structs):
""" Textual description of a DWARF expression encoded in 'expr'.
structs should come from the entity encompassing the expression - it's
needed to be able to parse it correctly.
"""
# Since this function can be called a lot, initializing a fresh new
# ExprDumper per call is expensive. So a rudimentary caching scheme is in
# place to create only one such dumper per instance of structs.
cache_key = id(structs)
if cache_key not in _DWARF_EXPR_DUMPER_CACHE:
_DWARF_EXPR_DUMPER_CACHE[cache_key] = \
ExprDumper(structs)
dwarf_expr_dumper = _DWARF_EXPR_DUMPER_CACHE[cache_key]
dwarf_expr_dumper.clear()
dwarf_expr_dumper.process_expr(expr)
return '(' + dwarf_expr_dumper.get_str() + ')'
def describe_reg_name(regnum, machine_arch=None, default=True):
""" Provide a textual description for a register name, given its serial
number. The number is expected to be valid.
"""
if machine_arch is None:
machine_arch = _MACHINE_ARCH
if machine_arch == 'x86':
return _REG_NAMES_x86[regnum]
elif machine_arch == 'x64':
return _REG_NAMES_x64[regnum]
elif default:
return 'r%s' % regnum
else:
return None
def describe_form_class(form):
"""For a given form name, determine its value class.
For example, given 'DW_FORM_data1' returns 'constant'.
For some forms, like DW_FORM_indirect and DW_FORM_sec_offset, the class is
not hard-coded and extra information is required. For these, None is
returned.
"""
return _FORM_CLASS[form]
#-------------------------------------------------------------------------------
# The machine architecture. Set globally via set_global_machine_arch
#
_MACHINE_ARCH = None
def _describe_attr_ref(attr, die, section_offset):
return '<0x%x>' % (attr.value + die.cu.cu_offset)
def _describe_attr_value_passthrough(attr, die, section_offset):
return attr.value
def _describe_attr_hex(attr, die, section_offset):
return '0x%x' % (attr.value)
def _describe_attr_hex_addr(attr, die, section_offset):
return '<0x%x>' % (attr.value)
def _describe_attr_split_64bit(attr, die, section_offset):
low_word = attr.value & 0xFFFFFFFF
high_word = (attr.value >> 32) & 0xFFFFFFFF
return '0x%x 0x%x' % (low_word, high_word)
def _describe_attr_strp(attr, die, section_offset):
return '(indirect string, offset: 0x%x): %s' % (
attr.raw_value, bytes2str(attr.value))
def _describe_attr_string(attr, die, section_offset):
return bytes2str(attr.value)
def _describe_attr_debool(attr, die, section_offset):
""" To be consistent with readelf, generate 1 for True flags, 0 for False
flags.
"""
return '1' if attr.value else '0'
def _describe_attr_present(attr, die, section_offset):
""" Some forms may simply mean that an attribute is present,
without providing any value.
"""
return '1'
def _describe_attr_block(attr, die, section_offset):
s = '%s byte block: ' % len(attr.value)
s += ' '.join('%x' % item for item in attr.value) + ' '
return s
_ATTR_DESCRIPTION_MAP = defaultdict(
lambda: _describe_attr_value_passthrough, # default_factory
DW_FORM_ref1=_describe_attr_ref,
DW_FORM_ref2=_describe_attr_ref,
DW_FORM_ref4=_describe_attr_ref,
DW_FORM_ref8=_describe_attr_split_64bit,
DW_FORM_ref_udata=_describe_attr_ref,
DW_FORM_ref_addr=_describe_attr_hex_addr,
DW_FORM_data4=_describe_attr_hex,
DW_FORM_data8=_describe_attr_hex,
DW_FORM_addr=_describe_attr_hex,
DW_FORM_sec_offset=_describe_attr_hex,
DW_FORM_flag=_describe_attr_debool,
DW_FORM_data1=_describe_attr_value_passthrough,
DW_FORM_data2=_describe_attr_value_passthrough,
DW_FORM_sdata=_describe_attr_value_passthrough,
DW_FORM_udata=_describe_attr_value_passthrough,
DW_FORM_string=_describe_attr_string,
DW_FORM_strp=_describe_attr_strp,
DW_FORM_block1=_describe_attr_block,
DW_FORM_block2=_describe_attr_block,
DW_FORM_block4=_describe_attr_block,
DW_FORM_block=_describe_attr_block,
DW_FORM_flag_present=_describe_attr_present,
DW_FORM_exprloc=_describe_attr_block,
DW_FORM_ref_sig8=_describe_attr_ref,
)
_FORM_CLASS = dict(
DW_FORM_addr='address',
DW_FORM_block2='block',
DW_FORM_block4='block',
DW_FORM_data2='constant',
DW_FORM_data4='constant',
DW_FORM_data8='constant',
DW_FORM_string='string',
DW_FORM_block='block',
DW_FORM_block1='block',
DW_FORM_data1='constant',
DW_FORM_flag='flag',
DW_FORM_sdata='constant',
DW_FORM_strp='string',
DW_FORM_udata='constant',
DW_FORM_ref_addr='reference',
DW_FORM_ref1='reference',
DW_FORM_ref2='reference',
DW_FORM_ref4='reference',
DW_FORM_ref8='reference',
DW_FORM_ref_udata='reference',
DW_FORM_indirect=None,
DW_FORM_sec_offset=None,
DW_FORM_exprloc='exprloc',
DW_FORM_flag_present='flag',
DW_FORM_ref_sig8='reference',
)
_DESCR_DW_INL = {
DW_INL_not_inlined: '(not inlined)',
DW_INL_inlined: '(inlined)',
DW_INL_declared_not_inlined: '(declared as inline but ignored)',
DW_INL_declared_inlined: '(declared as inline and inlined)',
}
_DESCR_DW_LANG = {
DW_LANG_C89: '(ANSI C)',
DW_LANG_C: '(non-ANSI C)',
DW_LANG_Ada83: '(Ada)',
DW_LANG_C_plus_plus: '(C++)',
DW_LANG_Cobol74: '(Cobol 74)',
DW_LANG_Cobol85: '(Cobol 85)',
DW_LANG_Fortran77: '(FORTRAN 77)',
DW_LANG_Fortran90: '(Fortran 90)',
DW_LANG_Pascal83: '(ANSI Pascal)',
DW_LANG_Modula2: '(Modula 2)',
DW_LANG_Java: '(Java)',
DW_LANG_C99: '(ANSI C99)',
DW_LANG_Ada95: '(ADA 95)',
DW_LANG_Fortran95: '(Fortran 95)',
DW_LANG_PLI: '(PLI)',
DW_LANG_ObjC: '(Objective C)',
DW_LANG_ObjC_plus_plus: '(Objective C++)',
DW_LANG_UPC: '(Unified Parallel C)',
DW_LANG_D: '(D)',
DW_LANG_Python: '(Python)',
DW_LANG_Mips_Assembler: '(MIPS assembler)',
DW_LANG_HP_Bliss: '(HP Bliss)',
DW_LANG_HP_Basic91: '(HP Basic 91)',
DW_LANG_HP_Pascal91: '(HP Pascal 91)',
DW_LANG_HP_IMacro: '(HP IMacro)',
DW_LANG_HP_Assembler: '(HP assembler)',
}
_DESCR_DW_ATE = {
DW_ATE_void: '(void)',
DW_ATE_address: '(machine address)',
DW_ATE_boolean: '(boolean)',
DW_ATE_complex_float: '(complex float)',
DW_ATE_float: '(float)',
DW_ATE_signed: '(signed)',
DW_ATE_signed_char: '(signed char)',
DW_ATE_unsigned: '(unsigned)',
DW_ATE_unsigned_char: '(unsigned char)',
DW_ATE_imaginary_float: '(imaginary float)',
DW_ATE_decimal_float: '(decimal float)',
DW_ATE_packed_decimal: '(packed_decimal)',
DW_ATE_numeric_string: '(numeric_string)',
DW_ATE_edited: '(edited)',
DW_ATE_signed_fixed: '(signed_fixed)',
DW_ATE_unsigned_fixed: '(unsigned_fixed)',
DW_ATE_HP_float80: '(HP_float80)',
DW_ATE_HP_complex_float80: '(HP_complex_float80)',
DW_ATE_HP_float128: '(HP_float128)',
DW_ATE_HP_complex_float128: '(HP_complex_float128)',
DW_ATE_HP_floathpintel: '(HP_floathpintel)',
DW_ATE_HP_imaginary_float80: '(HP_imaginary_float80)',
DW_ATE_HP_imaginary_float128: '(HP_imaginary_float128)',
}
_DESCR_DW_ACCESS = {
DW_ACCESS_public: '(public)',
DW_ACCESS_protected: '(protected)',
DW_ACCESS_private: '(private)',
}
_DESCR_DW_VIS = {
DW_VIS_local: '(local)',
DW_VIS_exported: '(exported)',
DW_VIS_qualified: '(qualified)',
}
_DESCR_DW_VIRTUALITY = {
DW_VIRTUALITY_none: '(none)',
DW_VIRTUALITY_virtual: '(virtual)',
DW_VIRTUALITY_pure_virtual: '(pure virtual)',
}
_DESCR_DW_ID_CASE = {
DW_ID_case_sensitive: '(case_sensitive)',
DW_ID_up_case: '(up_case)',
DW_ID_down_case: '(down_case)',
DW_ID_case_insensitive: '(case_insensitive)',
}
_DESCR_DW_CC = {
DW_CC_normal: '(normal)',
DW_CC_program: '(program)',
DW_CC_nocall: '(nocall)',
}
_DESCR_DW_ORD = {
DW_ORD_row_major: '(row major)',
DW_ORD_col_major: '(column major)',
}
_DESCR_CFI_REGISTER_RULE_TYPE = dict(
UNDEFINED='u',
SAME_VALUE='s',
OFFSET='c',
VAL_OFFSET='v',
REGISTER='',
EXPRESSION='exp',
VAL_EXPRESSION='vexp',
ARCHITECTURAL='a',
)
def _make_extra_mapper(mapping, default, default_interpolate_value=False):
""" Create a mapping function from attribute parameters to an extra
value that should be displayed.
"""
def mapper(attr, die, section_offset):
if default_interpolate_value:
d = default % attr.value
else:
d = default
return mapping.get(attr.value, d)
return mapper
def _make_extra_string(s=''):
""" Create an extra function that just returns a constant string.
"""
def extra(attr, die, section_offset):
return s
return extra
_DWARF_EXPR_DUMPER_CACHE = {}
def _location_list_extra(attr, die, section_offset):
# According to section 2.6 of the DWARF spec v3, class loclistptr means
# a location list, and class block means a location expression.
# DW_FORM_sec_offset is new in DWARFv4 as a section offset.
if attr.form in ('DW_FORM_data4', 'DW_FORM_data8', 'DW_FORM_sec_offset'):
return '(location list)'
else:
return describe_DWARF_expr(attr.value, die.cu.structs)
def _data_member_location_extra(attr, die, section_offset):
# According to section 5.5.6 of the DWARF spec v4, a data member location
# can be an integer offset, or a location description.
#
if attr.form in ('DW_FORM_data1', 'DW_FORM_data2',
'DW_FORM_data4', 'DW_FORM_data8'):
return '' # No extra description needed
elif attr.form == 'DW_FORM_sdata':
return str(attr.value)
else:
return describe_DWARF_expr(attr.value, die.cu.structs)
def _import_extra(attr, die, section_offset):
# For DW_AT_import the value points to a DIE (that can be either in the
# current DIE's CU or in another CU, depending on the FORM). The extra
# information for it is the abbreviation number in this DIE and its tag.
if attr.form == 'DW_FORM_ref_addr':
# Absolute offset value
ref_die_offset = section_offset + attr.value
else:
# Relative offset to the current DIE's CU
ref_die_offset = attr.value + die.cu.cu_offset
# Now find the CU this DIE belongs to (since we have to find its abbrev
# table). This is done by linearly scanning through all CUs, looking for
# one spanning an address space containing the referred DIE's offset.
for cu in die.dwarfinfo.iter_CUs():
if cu['unit_length'] + cu.cu_offset > ref_die_offset >= cu.cu_offset:
# Once we have the CU, we can actually parse this DIE from the
# stream.
with preserve_stream_pos(die.stream):
ref_die = DIE(cu, die.stream, ref_die_offset)
#print '&&& ref_die', ref_die
return '[Abbrev Number: %s (%s)]' % (
ref_die.abbrev_code, ref_die.tag)
return '[unknown]'
_EXTRA_INFO_DESCRIPTION_MAP = defaultdict(
lambda: _make_extra_string(''), # default_factory
DW_AT_inline=_make_extra_mapper(
_DESCR_DW_INL, '(Unknown inline attribute value: %x',
default_interpolate_value=True),
DW_AT_language=_make_extra_mapper(
_DESCR_DW_LANG, '(Unknown: %x)', default_interpolate_value=True),
DW_AT_encoding=_make_extra_mapper(_DESCR_DW_ATE, '(unknown type)'),
DW_AT_accessibility=_make_extra_mapper(
_DESCR_DW_ACCESS, '(unknown accessibility)'),
DW_AT_visibility=_make_extra_mapper(
_DESCR_DW_VIS, '(unknown visibility)'),
DW_AT_virtuality=_make_extra_mapper(
_DESCR_DW_VIRTUALITY, '(unknown virtuality)'),
DW_AT_identifier_case=_make_extra_mapper(
_DESCR_DW_ID_CASE, '(unknown case)'),
DW_AT_calling_convention=_make_extra_mapper(
_DESCR_DW_CC, '(unknown convention)'),
DW_AT_ordering=_make_extra_mapper(
_DESCR_DW_ORD, '(undefined)'),
DW_AT_frame_base=_location_list_extra,
DW_AT_location=_location_list_extra,
DW_AT_string_length=_location_list_extra,
DW_AT_return_addr=_location_list_extra,
DW_AT_data_member_location=_data_member_location_extra,
DW_AT_vtable_elem_location=_location_list_extra,
DW_AT_segment=_location_list_extra,
DW_AT_static_link=_location_list_extra,
DW_AT_use_location=_location_list_extra,
DW_AT_allocated=_location_list_extra,
DW_AT_associated=_location_list_extra,
DW_AT_data_location=_location_list_extra,
DW_AT_stride=_location_list_extra,
DW_AT_import=_import_extra,
DW_AT_GNU_call_site_value=_location_list_extra,
DW_AT_GNU_call_site_data_value=_location_list_extra,
DW_AT_GNU_call_site_target=_location_list_extra,
DW_AT_GNU_call_site_target_clobbered=_location_list_extra,
)
# 8 in a line, for easier counting
_REG_NAMES_x86 = [
'eax', 'ecx', 'edx', 'ebx', 'esp', 'ebp', 'esi', 'edi',
'eip', 'eflags', '', 'st0', 'st1', 'st2', 'st3', 'st4',
'st5', 'st6', 'st7', '', '', 'xmm0', 'xmm1', 'xmm2',
'xmm3', 'xmm4', 'xmm5', 'xmm6', 'xmm7', 'mm0', 'mm1', 'mm2',
'mm3', 'mm4', 'mm5', 'mm6', 'mm7', 'fcw', 'fsw', 'mxcsr',
'es', 'cs', 'ss', 'ds', 'fs', 'gs', '', '', 'tr', 'ldtr'
]
_REG_NAMES_x64 = [
'rax', 'rdx', 'rcx', 'rbx', 'rsi', 'rdi', 'rbp', 'rsp',
'r8', 'r9', 'r10', 'r11', 'r12', 'r13', 'r14', 'r15',
'rip', 'xmm0', 'xmm1', 'xmm2', 'xmm3', 'xmm4', 'xmm5', 'xmm6',
'xmm7', 'xmm8', 'xmm9', 'xmm10', 'xmm11', 'xmm12', 'xmm13', 'xmm14',
'xmm15', 'st0', 'st1', 'st2', 'st3', 'st4', 'st5', 'st6',
'st7', 'mm0', 'mm1', 'mm2', 'mm3', 'mm4', 'mm5', 'mm6',
'mm7', 'rflags', 'es', 'cs', 'ss', 'ds', 'fs', 'gs',
'', '', 'fs.base', 'gs.base', '', '', 'tr', 'ldtr',
'mxcsr', 'fcw', 'fsw'
]
class ExprDumper(GenericExprVisitor):
""" A concrete visitor for DWARF expressions that dumps a textual
representation of the complete expression.
Usage: after creation, call process_expr, and then get_str for a
semicolon-delimited string representation of the decoded expression.
"""
def __init__(self, structs):
super(ExprDumper, self).__init__(structs)
self._init_lookups()
self._str_parts = []
def clear(self):
self._str_parts = []
def get_str(self):
return '; '.join(self._str_parts)
def _init_lookups(self):
self._ops_with_decimal_arg = set([
'DW_OP_const1u', 'DW_OP_const1s', 'DW_OP_const2u', 'DW_OP_const2s',
'DW_OP_const4u', 'DW_OP_const4s', 'DW_OP_constu', 'DW_OP_consts',
'DW_OP_pick', 'DW_OP_plus_uconst', 'DW_OP_bra', 'DW_OP_skip',
'DW_OP_fbreg', 'DW_OP_piece', 'DW_OP_deref_size',
'DW_OP_xderef_size', 'DW_OP_regx',])
for n in range(0, 32):
self._ops_with_decimal_arg.add('DW_OP_breg%s' % n)
self._ops_with_two_decimal_args = set([
'DW_OP_const8u', 'DW_OP_const8s', 'DW_OP_bregx', 'DW_OP_bit_piece'])
self._ops_with_hex_arg = set(
['DW_OP_addr', 'DW_OP_call2', 'DW_OP_call4', 'DW_OP_call_ref'])
def _after_visit(self, opcode, opcode_name, args):
self._str_parts.append(self._dump_to_string(opcode, opcode_name, args))
def _dump_to_string(self, opcode, opcode_name, args):
if len(args) == 0:
if opcode_name.startswith('DW_OP_reg'):
regnum = int(opcode_name[9:])
return '%s (%s)' % (
opcode_name,
describe_reg_name(regnum, _MACHINE_ARCH))
else:
return opcode_name
elif opcode_name in self._ops_with_decimal_arg:
if opcode_name.startswith('DW_OP_breg'):
regnum = int(opcode_name[10:])
return '%s (%s): %s' % (
opcode_name,
describe_reg_name(regnum, _MACHINE_ARCH),
args[0])
elif opcode_name.endswith('regx'):
# applies to both regx and bregx
return '%s: %s (%s)' % (
opcode_name,
args[0],
describe_reg_name(args[0], _MACHINE_ARCH))
else:
return '%s: %s' % (opcode_name, args[0])
elif opcode_name in self._ops_with_hex_arg:
return '%s: %x' % (opcode_name, args[0])
elif opcode_name in self._ops_with_two_decimal_args:
return '%s: %s %s' % (opcode_name, args[0], args[1])
else:
return '' % opcode_name
pyelftools-0.26/elftools/dwarf/die.py 0000775 0000000 0000000 00000016451 13572204573 0017730 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/die.py
#
# DWARF Debugging Information Entry
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from collections import namedtuple, OrderedDict
import os
from ..common.exceptions import DWARFError
from ..common.py3compat import bytes2str, iteritems
from ..common.utils import struct_parse, preserve_stream_pos
from .enums import DW_FORM_raw2name
# AttributeValue - describes an attribute value in the DIE:
#
# name:
# The name (DW_AT_*) of this attribute
#
# form:
# The DW_FORM_* name of this attribute
#
# value:
# The value parsed from the section and translated accordingly to the form
# (e.g. for a DW_FORM_strp it's the actual string taken from the string table)
#
# raw_value:
# Raw value as parsed from the section - used for debugging and presentation
# (e.g. for a DW_FORM_strp it's the raw string offset into the table)
#
# offset:
# Offset of this attribute's value in the stream (absolute offset, relative
# the beginning of the whole stream)
#
AttributeValue = namedtuple(
'AttributeValue', 'name form value raw_value offset')
class DIE(object):
""" A DWARF debugging information entry. On creation, parses itself from
the stream. Each DIE is held by a CU.
Accessible attributes:
tag:
The DIE tag
size:
The size this DIE occupies in the section
offset:
The offset of this DIE in the stream
attributes:
An ordered dictionary mapping attribute names to values. It's
ordered to preserve the order of attributes in the section
has_children:
Specifies whether this DIE has children
abbrev_code:
The abbreviation code pointing to an abbreviation entry (note
that this is for informational pusposes only - this object
interacts with its abbreviation table transparently).
See also the public methods.
"""
def __init__(self, cu, stream, offset):
""" cu:
CompileUnit object this DIE belongs to. Used to obtain context
information (structs, abbrev table, etc.)
stream, offset:
The stream and offset into it where this DIE's data is located
"""
self.cu = cu
self.dwarfinfo = self.cu.dwarfinfo # get DWARFInfo context
self.stream = stream
self.offset = offset
self.attributes = OrderedDict()
self.tag = None
self.has_children = None
self.abbrev_code = None
self.size = 0
# Null DIE terminator. It can be used to obtain offset range occupied
# by this DIE including its whole subtree.
self._terminator = None
self._parent = None
self._parse_DIE()
def is_null(self):
""" Is this a null entry?
"""
return self.tag is None
def get_parent(self):
""" The parent DIE of this DIE. None if the DIE has no parent (i.e. a
top-level DIE).
"""
return self._parent
def get_full_path(self):
""" Return the full path filename for the DIE.
The filename is the join of 'DW_AT_comp_dir' and 'DW_AT_name',
either of which may be missing in practice. Note that its value is
usually a string taken from the .debug_string section and the
returned value will be a string.
"""
comp_dir_attr = self.attributes.get('DW_AT_comp_dir', None)
comp_dir = bytes2str(comp_dir_attr.value) if comp_dir_attr else ''
fname_attr = self.attributes.get('DW_AT_name', None)
fname = bytes2str(fname_attr.value) if fname_attr else ''
return os.path.join(comp_dir, fname)
def iter_children(self):
""" Iterates all children of this DIE
"""
return self.cu.iter_DIE_children(self)
def iter_siblings(self):
""" Yield all siblings of this DIE
"""
if self._parent:
for sibling in self._parent.iter_children():
if sibling is not self:
yield sibling
else:
raise StopIteration()
# The following methods are used while creating the DIE and should not be
# interesting to consumers
#
def set_parent(self, die):
self._parent = die
#------ PRIVATE ------#
def __repr__(self):
s = 'DIE %s, size=%s, has_children=%s\n' % (
self.tag, self.size, self.has_children)
for attrname, attrval in iteritems(self.attributes):
s += ' |%-18s: %s\n' % (attrname, attrval)
return s
def __str__(self):
return self.__repr__()
def _parse_DIE(self):
""" Parses the DIE info from the section, based on the abbreviation
table of the CU
"""
structs = self.cu.structs
# A DIE begins with the abbreviation code. Read it and use it to
# obtain the abbrev declaration for this DIE.
# Note: here and elsewhere, preserve_stream_pos is used on operations
# that manipulate the stream by reading data from it.
self.abbrev_code = struct_parse(
structs.Dwarf_uleb128(''), self.stream, self.offset)
# This may be a null entry
if self.abbrev_code == 0:
self.size = self.stream.tell() - self.offset
return
abbrev_decl = self.cu.get_abbrev_table().get_abbrev(self.abbrev_code)
self.tag = abbrev_decl['tag']
self.has_children = abbrev_decl.has_children()
# Guided by the attributes listed in the abbreviation declaration, parse
# values from the stream.
for name, form in abbrev_decl.iter_attr_specs():
attr_offset = self.stream.tell()
raw_value = struct_parse(structs.Dwarf_dw_form[form], self.stream)
value = self._translate_attr_value(form, raw_value)
self.attributes[name] = AttributeValue(
name=name,
form=form,
value=value,
raw_value=raw_value,
offset=attr_offset)
self.size = self.stream.tell() - self.offset
def _translate_attr_value(self, form, raw_value):
""" Translate a raw attr value according to the form
"""
value = None
if form == 'DW_FORM_strp':
with preserve_stream_pos(self.stream):
value = self.dwarfinfo.get_string_from_table(raw_value)
elif form == 'DW_FORM_flag':
value = not raw_value == 0
elif form == 'DW_FORM_flag_present':
value = True
elif form == 'DW_FORM_indirect':
try:
form = DW_FORM_raw2name[raw_value]
except KeyError as err:
raise DWARFError(
'Found DW_FORM_indirect with unknown raw_value=' +
str(raw_value))
raw_value = struct_parse(
self.cu.structs.Dwarf_dw_form[form], self.stream)
# Let's hope this doesn't get too deep :-)
return self._translate_attr_value(form, raw_value)
else:
value = raw_value
return value
pyelftools-0.26/elftools/dwarf/dwarf_expr.py 0000664 0000000 0000000 00000023777 13572204573 0021336 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/dwarf_expr.py
#
# Decoding DWARF expressions
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..common.py3compat import BytesIO, iteritems
from ..common.utils import struct_parse, bytelist2string
# DWARF expression opcodes. name -> opcode mapping
DW_OP_name2opcode = dict(
DW_OP_addr=0x03,
DW_OP_deref=0x06,
DW_OP_const1u=0x08,
DW_OP_const1s=0x09,
DW_OP_const2u=0x0a,
DW_OP_const2s=0x0b,
DW_OP_const4u=0x0c,
DW_OP_const4s=0x0d,
DW_OP_const8u=0x0e,
DW_OP_const8s=0x0f,
DW_OP_constu=0x10,
DW_OP_consts=0x11,
DW_OP_dup=0x12,
DW_OP_drop=0x13,
DW_OP_over=0x14,
DW_OP_pick=0x15,
DW_OP_swap=0x16,
DW_OP_rot=0x17,
DW_OP_xderef=0x18,
DW_OP_abs=0x19,
DW_OP_and=0x1a,
DW_OP_div=0x1b,
DW_OP_minus=0x1c,
DW_OP_mod=0x1d,
DW_OP_mul=0x1e,
DW_OP_neg=0x1f,
DW_OP_not=0x20,
DW_OP_or=0x21,
DW_OP_plus=0x22,
DW_OP_plus_uconst=0x23,
DW_OP_shl=0x24,
DW_OP_shr=0x25,
DW_OP_shra=0x26,
DW_OP_xor=0x27,
DW_OP_bra=0x28,
DW_OP_eq=0x29,
DW_OP_ge=0x2a,
DW_OP_gt=0x2b,
DW_OP_le=0x2c,
DW_OP_lt=0x2d,
DW_OP_ne=0x2e,
DW_OP_skip=0x2f,
DW_OP_regx=0x90,
DW_OP_fbreg=0x91,
DW_OP_bregx=0x92,
DW_OP_piece=0x93,
DW_OP_deref_size=0x94,
DW_OP_xderef_size=0x95,
DW_OP_nop=0x96,
DW_OP_push_object_address=0x97,
DW_OP_call2=0x98,
DW_OP_call4=0x99,
DW_OP_call_ref=0x9a,
DW_OP_form_tls_address=0x9b,
DW_OP_call_frame_cfa=0x9c,
DW_OP_bit_piece=0x9d,
DW_OP_implicit_value=0x9e,
DW_OP_stack_value=0x9f,
DW_OP_implicit_pointer=0xa0,
DW_OP_addrx=0xa1,
DW_OP_constx=0xa2,
DW_OP_entry_value=0xa3,
DW_OP_const_type=0xa4,
DW_OP_regval_type=0xa5,
DW_OP_deref_type=0xa6,
DW_OP_xderef_type=0xa7,
DW_OP_convert=0xa8,
DW_OP_reinterpret=0xa9,
DW_OP_lo_user=0xe0,
DW_OP_hi_user=0xff,
)
def _generate_dynamic_values(map, prefix, index_start, index_end, value_start):
""" Generate values in a map (dict) dynamically. Each key starts with
a (string) prefix, followed by an index in the inclusive range
[index_start, index_end]. The values start at value_start.
"""
for index in range(index_start, index_end + 1):
name = '%s%s' % (prefix, index)
value = value_start + index - index_start
map[name] = value
_generate_dynamic_values(DW_OP_name2opcode, 'DW_OP_lit', 0, 31, 0x30)
_generate_dynamic_values(DW_OP_name2opcode, 'DW_OP_reg', 0, 31, 0x50)
_generate_dynamic_values(DW_OP_name2opcode, 'DW_OP_breg', 0, 31, 0x70)
# opcode -> name mapping
DW_OP_opcode2name = dict((v, k) for k, v in iteritems(DW_OP_name2opcode))
class GenericExprVisitor(object):
""" A DWARF expression is a sequence of instructions encoded in a block
of bytes. This class decodes the sequence into discrete instructions
with their arguments and allows generic "visiting" to process them.
Usage: subclass this class, and override the needed methods. The
easiest way would be to just override _after_visit, which gets passed
each decoded instruction (with its arguments) in order. Clients of
the visitor then just execute process_expr. The subclass can keep
its own internal information updated in _after_visit and provide
methods to extract it. For a good example of this usage, see the
ExprDumper class in the descriptions module.
A more complex usage could be to override visiting methods for
specific instructions, by placing them into the dispatch table.
"""
def __init__(self, structs):
self.structs = structs
self._init_dispatch_table()
self.stream = None
self._cur_opcode = None
self._cur_opcode_name = None
self._cur_args = []
def process_expr(self, expr):
""" Process (visit) a DWARF expression. expr should be a list of
(integer) byte values.
"""
self.stream = BytesIO(bytelist2string(expr))
while True:
# Get the next opcode from the stream. If nothing is left in the
# stream, we're done.
byte = self.stream.read(1)
if len(byte) == 0:
break
# Decode the opcode and its name
self._cur_opcode = ord(byte)
self._cur_opcode_name = DW_OP_opcode2name.get(
self._cur_opcode, 'OP:0x%x' % self._cur_opcode)
# Will be filled in by visitors
self._cur_args = []
# Dispatch to a visitor function
visitor = self._dispatch_table.get(
self._cur_opcode,
self._default_visitor)
visitor(self._cur_opcode, self._cur_opcode_name)
# Finally call the post-visit function
self._after_visit(
self._cur_opcode, self._cur_opcode_name, self._cur_args)
def _after_visit(self, opcode, opcode_name, args):
pass
def _default_visitor(self, opcode, opcode_name):
pass
def _visit_OP_with_no_args(self, opcode, opcode_name):
self._cur_args = []
def _visit_OP_addr(self, opcode, opcode_name):
self._cur_args = [
struct_parse(self.structs.Dwarf_target_addr(''), self.stream)]
def _make_visitor_arg_struct(self, struct_arg):
""" Create a visitor method for an opcode that that accepts a single
argument, specified by a struct.
"""
def visitor(opcode, opcode_name):
self._cur_args = [struct_parse(struct_arg, self.stream)]
return visitor
def _make_visitor_arg_struct2(self, struct_arg1, struct_arg2):
""" Create a visitor method for an opcode that that accepts two
arguments, specified by structs.
"""
def visitor(opcode, opcode_name):
self._cur_args = [
struct_parse(struct_arg1, self.stream),
struct_parse(struct_arg2, self.stream)]
return visitor
def _init_dispatch_table(self):
self._dispatch_table = {}
def add(opcode_name, func):
self._dispatch_table[DW_OP_name2opcode[opcode_name]] = func
add('DW_OP_addr', self._visit_OP_addr)
add('DW_OP_const1u',
self._make_visitor_arg_struct(self.structs.Dwarf_uint8('')))
add('DW_OP_const1s',
self._make_visitor_arg_struct(self.structs.Dwarf_int8('')))
add('DW_OP_const2u',
self._make_visitor_arg_struct(self.structs.Dwarf_uint16('')))
add('DW_OP_const2s',
self._make_visitor_arg_struct(self.structs.Dwarf_int16('')))
add('DW_OP_const4u',
self._make_visitor_arg_struct(self.structs.Dwarf_uint32('')))
add('DW_OP_const4s',
self._make_visitor_arg_struct(self.structs.Dwarf_int32('')))
add('DW_OP_const8u',
self._make_visitor_arg_struct2(
self.structs.Dwarf_uint32(''),
self.structs.Dwarf_uint32('')))
add('DW_OP_const8s',
self._make_visitor_arg_struct2(
self.structs.Dwarf_int32(''),
self.structs.Dwarf_int32('')))
add('DW_OP_constu',
self._make_visitor_arg_struct(self.structs.Dwarf_uleb128('')))
add('DW_OP_consts',
self._make_visitor_arg_struct(self.structs.Dwarf_sleb128('')))
add('DW_OP_pick',
self._make_visitor_arg_struct(self.structs.Dwarf_uint8('')))
add('DW_OP_plus_uconst',
self._make_visitor_arg_struct(self.structs.Dwarf_uleb128('')))
add('DW_OP_bra',
self._make_visitor_arg_struct(self.structs.Dwarf_int16('')))
add('DW_OP_skip',
self._make_visitor_arg_struct(self.structs.Dwarf_int16('')))
for opname in [ 'DW_OP_deref', 'DW_OP_dup', 'DW_OP_drop', 'DW_OP_over',
'DW_OP_swap', 'DW_OP_swap', 'DW_OP_rot', 'DW_OP_xderef',
'DW_OP_abs', 'DW_OP_and', 'DW_OP_div', 'DW_OP_minus',
'DW_OP_mod', 'DW_OP_mul', 'DW_OP_neg', 'DW_OP_not',
'DW_OP_plus', 'DW_OP_shl', 'DW_OP_shr', 'DW_OP_shra',
'DW_OP_xor', 'DW_OP_eq', 'DW_OP_ge', 'DW_OP_gt',
'DW_OP_le', 'DW_OP_lt', 'DW_OP_ne', 'DW_OP_nop',
'DW_OP_push_object_address', 'DW_OP_form_tls_address',
'DW_OP_call_frame_cfa']:
add(opname, self._visit_OP_with_no_args)
for n in range(0, 32):
add('DW_OP_lit%s' % n, self._visit_OP_with_no_args)
add('DW_OP_reg%s' % n, self._visit_OP_with_no_args)
add('DW_OP_breg%s' % n,
self._make_visitor_arg_struct(self.structs.Dwarf_sleb128('')))
add('DW_OP_fbreg',
self._make_visitor_arg_struct(self.structs.Dwarf_sleb128('')))
add('DW_OP_regx',
self._make_visitor_arg_struct(self.structs.Dwarf_uleb128('')))
add('DW_OP_bregx',
self._make_visitor_arg_struct2(
self.structs.Dwarf_uleb128(''),
self.structs.Dwarf_sleb128('')))
add('DW_OP_piece',
self._make_visitor_arg_struct(self.structs.Dwarf_uleb128('')))
add('DW_OP_bit_piece',
self._make_visitor_arg_struct2(
self.structs.Dwarf_uleb128(''),
self.structs.Dwarf_uleb128('')))
add('DW_OP_deref_size',
self._make_visitor_arg_struct(self.structs.Dwarf_int8('')))
add('DW_OP_xderef_size',
self._make_visitor_arg_struct(self.structs.Dwarf_int8('')))
add('DW_OP_call2',
self._make_visitor_arg_struct(self.structs.Dwarf_uint16('')))
add('DW_OP_call4',
self._make_visitor_arg_struct(self.structs.Dwarf_uint32('')))
add('DW_OP_call_ref',
self._make_visitor_arg_struct(self.structs.Dwarf_offset('')))
pyelftools-0.26/elftools/dwarf/dwarfinfo.py 0000664 0000000 0000000 00000031307 13572204573 0021140 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/dwarfinfo.py
#
# DWARFInfo - Main class for accessing DWARF debug information
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from collections import namedtuple
from ..common.exceptions import DWARFError
from ..common.utils import (struct_parse, dwarf_assert,
parse_cstring_from_stream)
from .structs import DWARFStructs
from .compileunit import CompileUnit
from .abbrevtable import AbbrevTable
from .lineprogram import LineProgram
from .callframe import CallFrameInfo
from .locationlists import LocationLists
from .ranges import RangeLists
from .aranges import ARanges
from .namelut import NameLUT
# Describes a debug section
#
# stream: a stream object containing the data of this section
# name: section name in the container file
# global_offset: the global offset of the section in its container file
# size: the size of the section's data, in bytes
# address: the virtual address for the section's data
#
# 'name' and 'global_offset' are for descriptional purposes only and
# aren't strictly required for the DWARF parsing to work. 'address' is required
# to properly decode the special '.eh_frame' format.
#
DebugSectionDescriptor = namedtuple('DebugSectionDescriptor',
'stream name global_offset size address')
# Some configuration parameters for the DWARF reader. This exists to allow
# DWARFInfo to be independent from any specific file format/container.
#
# little_endian:
# boolean flag specifying whether the data in the file is little endian
#
# machine_arch:
# Machine architecture as a string. For example 'x86' or 'x64'
#
# default_address_size:
# The default address size for the container file (sizeof pointer, in bytes)
#
DwarfConfig = namedtuple('DwarfConfig',
'little_endian machine_arch default_address_size')
class DWARFInfo(object):
""" Acts also as a "context" to other major objects, bridging between
various parts of the debug infromation.
"""
def __init__(self,
config,
debug_info_sec,
debug_aranges_sec,
debug_abbrev_sec,
debug_frame_sec,
eh_frame_sec,
debug_str_sec,
debug_loc_sec,
debug_ranges_sec,
debug_line_sec,
debug_pubtypes_sec,
debug_pubnames_sec):
""" config:
A DwarfConfig object
debug_*_sec:
DebugSectionDescriptor for a section. Pass None for sections
that don't exist. These arguments are best given with
keyword syntax.
"""
self.config = config
self.debug_info_sec = debug_info_sec
self.debug_aranges_sec = debug_aranges_sec
self.debug_abbrev_sec = debug_abbrev_sec
self.debug_frame_sec = debug_frame_sec
self.eh_frame_sec = eh_frame_sec
self.debug_str_sec = debug_str_sec
self.debug_loc_sec = debug_loc_sec
self.debug_ranges_sec = debug_ranges_sec
self.debug_line_sec = debug_line_sec
self.debug_pubtypes_sec = debug_pubtypes_sec
self.debug_pubnames_sec = debug_pubnames_sec
# This is the DWARFStructs the context uses, so it doesn't depend on
# DWARF format and address_size (these are determined per CU) - set them
# to default values.
self.structs = DWARFStructs(
little_endian=self.config.little_endian,
dwarf_format=32,
address_size=self.config.default_address_size)
# Cache for abbrev tables: a dict keyed by offset
self._abbrevtable_cache = {}
@property
def has_debug_info(self):
""" Return whether this contains debug information.
It can be not the case when the ELF only contains .eh_frame, which is
encoded DWARF but not actually for debugging.
"""
return bool(self.debug_info_sec)
def iter_CUs(self):
""" Yield all the compile units (CompileUnit objects) in the debug info
"""
return self._parse_CUs_iter()
def get_abbrev_table(self, offset):
""" Get an AbbrevTable from the given offset in the debug_abbrev
section.
The only verification done on the offset is that it's within the
bounds of the section (if not, an exception is raised).
It is the caller's responsibility to make sure the offset actually
points to a valid abbreviation table.
AbbrevTable objects are cached internally (two calls for the same
offset will return the same object).
"""
dwarf_assert(
offset < self.debug_abbrev_sec.size,
"Offset '0x%x' to abbrev table out of section bounds" % offset)
if offset not in self._abbrevtable_cache:
self._abbrevtable_cache[offset] = AbbrevTable(
structs=self.structs,
stream=self.debug_abbrev_sec.stream,
offset=offset)
return self._abbrevtable_cache[offset]
def get_string_from_table(self, offset):
""" Obtain a string from the string table section, given an offset
relative to the section.
"""
return parse_cstring_from_stream(self.debug_str_sec.stream, offset)
def line_program_for_CU(self, CU):
""" Given a CU object, fetch the line program it points to from the
.debug_line section.
If the CU doesn't point to a line program, return None.
"""
# The line program is pointed to by the DW_AT_stmt_list attribute of
# the top DIE of a CU.
top_DIE = CU.get_top_DIE()
if 'DW_AT_stmt_list' in top_DIE.attributes:
return self._parse_line_program_at_offset(
top_DIE.attributes['DW_AT_stmt_list'].value, CU.structs)
else:
return None
def has_CFI(self):
""" Does this dwarf info have a dwarf_frame CFI section?
"""
return self.debug_frame_sec is not None
def CFI_entries(self):
""" Get a list of dwarf_frame CFI entries from the .debug_frame section.
"""
cfi = CallFrameInfo(
stream=self.debug_frame_sec.stream,
size=self.debug_frame_sec.size,
address=self.debug_frame_sec.address,
base_structs=self.structs)
return cfi.get_entries()
def has_EH_CFI(self):
""" Does this dwarf info have a eh_frame CFI section?
"""
return self.eh_frame_sec is not None
def EH_CFI_entries(self):
""" Get a list of eh_frame CFI entries from the .eh_frame section.
"""
cfi = CallFrameInfo(
stream=self.eh_frame_sec.stream,
size=self.eh_frame_sec.size,
address=self.eh_frame_sec.address,
base_structs=self.structs,
for_eh_frame=True)
return cfi.get_entries()
def get_pubtypes(self):
"""
Returns a NameLUT object that contains information read from the
.debug_pubtypes section in the ELF file.
NameLUT is essentially a dictionary containing the CU/DIE offsets of
each symbol. See the NameLUT doc string for more details.
"""
if self.debug_pubtypes_sec:
return NameLUT(self.debug_pubtypes_sec.stream,
self.debug_pubtypes_sec.size,
self.structs)
else:
return None
def get_pubnames(self):
"""
Returns a NameLUT object that contains information read from the
.debug_pubnames section in the ELF file.
NameLUT is essentially a dictionary containing the CU/DIE offsets of
each symbol. See the NameLUT doc string for more details.
"""
if self.debug_pubnames_sec:
return NameLUT(self.debug_pubnames_sec.stream,
self.debug_pubnames_sec.size,
self.structs)
else:
return None
def get_aranges(self):
""" Get an ARanges object representing the .debug_aranges section of
the DWARF data, or None if the section doesn't exist
"""
if self.debug_aranges_sec:
return ARanges(self.debug_aranges_sec.stream,
self.debug_aranges_sec.size,
self.structs)
else:
return None
def location_lists(self):
""" Get a LocationLists object representing the .debug_loc section of
the DWARF data, or None if this section doesn't exist.
"""
if self.debug_loc_sec:
return LocationLists(self.debug_loc_sec.stream, self.structs)
else:
return None
def range_lists(self):
""" Get a RangeLists object representing the .debug_ranges section of
the DWARF data, or None if this section doesn't exist.
"""
if self.debug_ranges_sec:
return RangeLists(self.debug_ranges_sec.stream, self.structs)
else:
return None
#------ PRIVATE ------#
def _parse_CUs_iter(self):
""" Parse CU entries from debug_info. Yield CUs in order of appearance.
"""
if self.debug_info_sec is None:
return
offset = 0
while offset < self.debug_info_sec.size:
cu = self._parse_CU_at_offset(offset)
# Compute the offset of the next CU in the section. The unit_length
# field of the CU header contains its size not including the length
# field itself.
offset = ( offset +
cu['unit_length'] +
cu.structs.initial_length_field_size())
yield cu
def _parse_CU_at_offset(self, offset):
""" Parse and return a CU at the given offset in the debug_info stream.
"""
# Section 7.4 (32-bit and 64-bit DWARF Formats) of the DWARF spec v3
# states that the first 32-bit word of the CU header determines
# whether the CU is represented with 32-bit or 64-bit DWARF format.
#
# So we peek at the first word in the CU header to determine its
# dwarf format. Based on it, we then create a new DWARFStructs
# instance suitable for this CU and use it to parse the rest.
#
initial_length = struct_parse(
self.structs.Dwarf_uint32(''), self.debug_info_sec.stream, offset)
dwarf_format = 64 if initial_length == 0xFFFFFFFF else 32
# At this point we still haven't read the whole header, so we don't
# know the address_size. Therefore, we're going to create structs
# with a default address_size=4. If, after parsing the header, we
# find out address_size is actually 8, we just create a new structs
# object for this CU.
#
cu_structs = DWARFStructs(
little_endian=self.config.little_endian,
dwarf_format=dwarf_format,
address_size=4)
cu_header = struct_parse(
cu_structs.Dwarf_CU_header, self.debug_info_sec.stream, offset)
if cu_header['address_size'] == 8:
cu_structs = DWARFStructs(
little_endian=self.config.little_endian,
dwarf_format=dwarf_format,
address_size=8)
cu_die_offset = self.debug_info_sec.stream.tell()
dwarf_assert(
self._is_supported_version(cu_header['version']),
"Expected supported DWARF version. Got '%s'" % cu_header['version'])
return CompileUnit(
header=cu_header,
dwarfinfo=self,
structs=cu_structs,
cu_offset=offset,
cu_die_offset=cu_die_offset)
def _is_supported_version(self, version):
""" DWARF version supported by this parser
"""
return 2 <= version <= 4
def _parse_line_program_at_offset(self, debug_line_offset, structs):
""" Given an offset to the .debug_line section, parse the line program
starting at this offset in the section and return it.
structs is the DWARFStructs object used to do this parsing.
"""
lineprog_header = struct_parse(
structs.Dwarf_lineprog_header,
self.debug_line_sec.stream,
debug_line_offset)
# Calculate the offset to the next line program (see DWARF 6.2.4)
end_offset = ( debug_line_offset + lineprog_header['unit_length'] +
structs.initial_length_field_size())
return LineProgram(
header=lineprog_header,
stream=self.debug_line_sec.stream,
structs=structs,
program_start_offset=self.debug_line_sec.stream.tell(),
program_end_offset=end_offset)
pyelftools-0.26/elftools/dwarf/enums.py 0000664 0000000 0000000 00000031705 13572204573 0020312 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/enums.py
#
# Mappings of enum names to values
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..construct import Pass
from ..common.py3compat import iteritems
ENUM_DW_TAG = dict(
DW_TAG_null = 0x00,
DW_TAG_array_type = 0x01,
DW_TAG_class_type = 0x02,
DW_TAG_entry_point = 0x03,
DW_TAG_enumeration_type = 0x04,
DW_TAG_formal_parameter = 0x05,
DW_TAG_imported_declaration = 0x08,
DW_TAG_label = 0x0a,
DW_TAG_lexical_block = 0x0b,
DW_TAG_member = 0x0d,
DW_TAG_pointer_type = 0x0f,
DW_TAG_reference_type = 0x10,
DW_TAG_compile_unit = 0x11,
DW_TAG_string_type = 0x12,
DW_TAG_structure_type = 0x13,
DW_TAG_subroutine_type = 0x15,
DW_TAG_typedef = 0x16,
DW_TAG_union_type = 0x17,
DW_TAG_unspecified_parameters = 0x18,
DW_TAG_variant = 0x19,
DW_TAG_common_block = 0x1a,
DW_TAG_common_inclusion = 0x1b,
DW_TAG_inheritance = 0x1c,
DW_TAG_inlined_subroutine = 0x1d,
DW_TAG_module = 0x1e,
DW_TAG_ptr_to_member_type = 0x1f,
DW_TAG_set_type = 0x20,
DW_TAG_subrange_type = 0x21,
DW_TAG_with_stmt = 0x22,
DW_TAG_access_declaration = 0x23,
DW_TAG_base_type = 0x24,
DW_TAG_catch_block = 0x25,
DW_TAG_const_type = 0x26,
DW_TAG_constant = 0x27,
DW_TAG_enumerator = 0x28,
DW_TAG_file_type = 0x29,
DW_TAG_friend = 0x2a,
DW_TAG_namelist = 0x2b,
DW_TAG_namelist_item = 0x2c,
DW_TAG_namelist_items = 0x2c,
DW_TAG_packed_type = 0x2d,
DW_TAG_subprogram = 0x2e,
# The DWARF standard defines these as _parameter, not _param, but we
# maintain compatibility with readelf.
DW_TAG_template_type_param = 0x2f,
DW_TAG_template_value_param = 0x30,
DW_TAG_thrown_type = 0x31,
DW_TAG_try_block = 0x32,
DW_TAG_variant_part = 0x33,
DW_TAG_variable = 0x34,
DW_TAG_volatile_type = 0x35,
DW_TAG_dwarf_procedure = 0x36,
DW_TAG_restrict_type = 0x37,
DW_TAG_interface_type = 0x38,
DW_TAG_namespace = 0x39,
DW_TAG_imported_module = 0x3a,
DW_TAG_unspecified_type = 0x3b,
DW_TAG_partial_unit = 0x3c,
DW_TAG_imported_unit = 0x3d,
DW_TAG_mutable_type = 0x3e,
DW_TAG_condition = 0x3f,
DW_TAG_shared_type = 0x40,
DW_TAG_type_unit = 0x41,
DW_TAG_rvalue_reference_type = 0x42,
DW_TAG_atomic_type = 0x47,
DW_TAG_lo_user = 0x4080,
DW_TAG_GNU_template_template_param = 0x4106,
DW_TAG_GNU_template_parameter_pack = 0x4107,
DW_TAG_GNU_formal_parameter_pack = 0x4108,
DW_TAG_GNU_call_site = 0x4109,
DW_TAG_GNU_call_site_parameter = 0x410a,
DW_TAG_hi_user = 0xffff,
_default_ = Pass,
)
ENUM_DW_CHILDREN = dict(
DW_CHILDREN_no = 0x00,
DW_CHILDREN_yes = 0x01,
)
ENUM_DW_AT = dict(
DW_AT_null = 0x00,
DW_AT_sibling = 0x01,
DW_AT_location = 0x02,
DW_AT_name = 0x03,
DW_AT_ordering = 0x09,
DW_AT_subscr_data = 0x0a,
DW_AT_byte_size = 0x0b,
DW_AT_bit_offset = 0x0c,
DW_AT_bit_size = 0x0d,
DW_AT_element_list = 0x0f,
DW_AT_stmt_list = 0x10,
DW_AT_low_pc = 0x11,
DW_AT_high_pc = 0x12,
DW_AT_language = 0x13,
DW_AT_member = 0x14,
DW_AT_discr = 0x15,
DW_AT_discr_value = 0x16,
DW_AT_visibility = 0x17,
DW_AT_import = 0x18,
DW_AT_string_length = 0x19,
DW_AT_common_reference = 0x1a,
DW_AT_comp_dir = 0x1b,
DW_AT_const_value = 0x1c,
DW_AT_containing_type = 0x1d,
DW_AT_default_value = 0x1e,
DW_AT_inline = 0x20,
DW_AT_is_optional = 0x21,
DW_AT_lower_bound = 0x22,
DW_AT_producer = 0x25,
DW_AT_prototyped = 0x27,
DW_AT_return_addr = 0x2a,
DW_AT_start_scope = 0x2c,
DW_AT_bit_stride = 0x2e,
DW_AT_stride_size = 0x2e,
DW_AT_upper_bound = 0x2f,
DW_AT_abstract_origin = 0x31,
DW_AT_accessibility = 0x32,
DW_AT_address_class = 0x33,
DW_AT_artificial = 0x34,
DW_AT_base_types = 0x35,
DW_AT_calling_convention = 0x36,
DW_AT_count = 0x37,
DW_AT_data_member_location = 0x38,
DW_AT_decl_column = 0x39,
DW_AT_decl_file = 0x3a,
DW_AT_decl_line = 0x3b,
DW_AT_declaration = 0x3c,
DW_AT_discr_list = 0x3d,
DW_AT_encoding = 0x3e,
DW_AT_external = 0x3f,
DW_AT_frame_base = 0x40,
DW_AT_friend = 0x41,
DW_AT_identifier_case = 0x42,
DW_AT_macro_info = 0x43,
DW_AT_namelist_item = 0x44,
DW_AT_priority = 0x45,
DW_AT_segment = 0x46,
DW_AT_specification = 0x47,
DW_AT_static_link = 0x48,
DW_AT_type = 0x49,
DW_AT_use_location = 0x4a,
DW_AT_variable_parameter = 0x4b,
DW_AT_virtuality = 0x4c,
DW_AT_vtable_elem_location = 0x4d,
DW_AT_allocated = 0x4e,
DW_AT_associated = 0x4f,
DW_AT_data_location = 0x50,
DW_AT_byte_stride = 0x51,
DW_AT_stride = 0x51,
DW_AT_entry_pc = 0x52,
DW_AT_use_UTF8 = 0x53,
DW_AT_extension = 0x54,
DW_AT_ranges = 0x55,
DW_AT_trampoline = 0x56,
DW_AT_call_column = 0x57,
DW_AT_call_file = 0x58,
DW_AT_call_line = 0x59,
DW_AT_description = 0x5a,
DW_AT_binary_scale = 0x5b,
DW_AT_decimal_scale = 0x5c,
DW_AT_small = 0x5d,
DW_AT_decimal_sign = 0x5e,
DW_AT_digit_count = 0x5f,
DW_AT_picture_string = 0x60,
DW_AT_mutable = 0x61,
DW_AT_threads_scaled = 0x62,
DW_AT_explicit = 0x63,
DW_AT_object_pointer = 0x64,
DW_AT_endianity = 0x65,
DW_AT_elemental = 0x66,
DW_AT_pure = 0x67,
DW_AT_recursive = 0x68,
DW_AT_signature = 0x69,
DW_AT_main_subprogram = 0x6a,
DW_AT_data_bit_offset = 0x6b,
DW_AT_const_expr = 0x6c,
DW_AT_enum_class = 0x6d,
DW_AT_linkage_name = 0x6e,
DW_AT_MIPS_fde = 0x2001,
DW_AT_MIPS_loop_begin = 0x2002,
DW_AT_MIPS_tail_loop_begin = 0x2003,
DW_AT_MIPS_epilog_begin = 0x2004,
DW_AT_MIPS_loop_unroll_factor = 0x2005,
DW_AT_MIPS_software_pipeline_depth = 0x2006,
DW_AT_MIPS_linkage_name = 0x2007,
DW_AT_MIPS_stride = 0x2008,
DW_AT_MIPS_abstract_name = 0x2009,
DW_AT_MIPS_clone_origin = 0x200a,
DW_AT_MIPS_has_inlines = 0x200b,
DW_AT_MIPS_stride_byte = 0x200c,
DW_AT_MIPS_stride_elem = 0x200d,
DW_AT_MIPS_ptr_dopetype = 0x200e,
DW_AT_MIPS_allocatable_dopetype = 0x200f,
DW_AT_MIPS_assumed_shape_dopetype = 0x2010,
DW_AT_MIPS_assumed_size = 0x2011,
DW_AT_sf_names = 0x2101,
DW_AT_src_info = 0x2102,
DW_AT_mac_info = 0x2103,
DW_AT_src_coords = 0x2104,
DW_AT_body_begin = 0x2105,
DW_AT_body_end = 0x2106,
DW_AT_GNU_vector = 0x2107,
DW_AT_GNU_template_name = 0x2110,
DW_AT_GNU_odr_signature = 0x210f,
DW_AT_GNU_call_site_value = 0x2111,
DW_AT_GNU_call_site_data_value = 0x2112,
DW_AT_GNU_call_site_target = 0x2113,
DW_AT_GNU_call_site_target_clobbered = 0x2114,
DW_AT_GNU_tail_call = 0x2115,
DW_AT_GNU_all_tail_call_sites = 0x2116,
DW_AT_GNU_all_call_sites = 0x2117,
DW_AT_GNU_all_source_call_sites = 0x2118,
DW_AT_GNU_macros = 0x2119,
DW_AT_GNU_deleted = 0x211a,
DW_AT_LLVM_include_path = 0x3e00,
DW_AT_LLVM_config_macros = 0x3e01,
DW_AT_LLVM_isysroot = 0x3e02,
DW_AT_LLVM_tag_offset = 0x3e03,
DW_AT_APPLE_optimized = 0x3fe1,
DW_AT_APPLE_flags = 0x3fe2,
DW_AT_APPLE_isa = 0x3fe3,
DW_AT_APPLE_block = 0x3fe4,
DW_AT_APPLE_major_runtime_vers = 0x3fe5,
DW_AT_APPLE_runtime_class = 0x3fe6,
DW_AT_APPLE_omit_frame_ptr = 0x3fe7,
DW_AT_APPLE_property_name = 0x3fe8,
DW_AT_APPLE_property_getter = 0x3fe9,
DW_AT_APPLE_property_setter = 0x3fea,
DW_AT_APPLE_property_attribute = 0x3feb,
DW_AT_APPLE_objc_complete_type = 0x3fec,
DW_AT_APPLE_property = 0x3fed,
_default_ = Pass,
)
ENUM_DW_FORM = dict(
DW_FORM_null = 0x00,
DW_FORM_addr = 0x01,
DW_FORM_block2 = 0x03,
DW_FORM_block4 = 0x04,
DW_FORM_data2 = 0x05,
DW_FORM_data4 = 0x06,
DW_FORM_data8 = 0x07,
DW_FORM_string = 0x08,
DW_FORM_block = 0x09,
DW_FORM_block1 = 0x0a,
DW_FORM_data1 = 0x0b,
DW_FORM_flag = 0x0c,
DW_FORM_sdata = 0x0d,
DW_FORM_strp = 0x0e,
DW_FORM_udata = 0x0f,
DW_FORM_ref_addr = 0x10,
DW_FORM_ref1 = 0x11,
DW_FORM_ref2 = 0x12,
DW_FORM_ref4 = 0x13,
DW_FORM_ref8 = 0x14,
DW_FORM_ref_udata = 0x15,
DW_FORM_indirect = 0x16,
DW_FORM_sec_offset = 0x17,
DW_FORM_exprloc = 0x18,
DW_FORM_flag_present = 0x19,
DW_FORM_strx = 0x1a,
DW_FORM_addrx = 0x1b,
DW_FORM_ref_sup4 = 0x1c,
DW_FORM_strp_sup = 0x1d,
DW_FORM_data16 = 0x1e,
DW_FORM_line_strp = 0x1f,
DW_FORM_ref_sig8 = 0x20,
DW_FORM_implicit_const = 0x21,
DW_FORM_loclistx = 0x22,
DW_FORM_rnglistx = 0x23,
DW_FORM_ref_sup8 = 0x24,
DW_FORM_strx1 = 0x25,
DW_FORM_strx2 = 0x26,
DW_FORM_strx3 = 0x27,
DW_FORM_strx4 = 0x28,
DW_FORM_addrx1 = 0x29,
DW_FORM_addrx2 = 0x2a,
DW_FORM_addrx3 = 0x2b,
DW_FORM_addrx4 = 0x2c,
DW_FORM_GNU_addr_index = 0x1f01,
DW_FORM_GNU_str_index = 0x1f02,
DW_FORM_GNU_ref_alt = 0x1f20,
DW_FORM_GNU_strp_alt = 0x1f21,
_default_ = Pass,
)
# Inverse mapping for ENUM_DW_FORM
DW_FORM_raw2name = dict((v, k) for k, v in iteritems(ENUM_DW_FORM))
# See http://www.airs.com/blog/archives/460
DW_EH_encoding_flags = dict(
DW_EH_PE_absptr = 0x00,
DW_EH_PE_uleb128 = 0x01,
DW_EH_PE_udata2 = 0x02,
DW_EH_PE_udata4 = 0x03,
DW_EH_PE_udata8 = 0x04,
DW_EH_PE_signed = 0x08,
DW_EH_PE_sleb128 = 0x09,
DW_EH_PE_sdata2 = 0x0a,
DW_EH_PE_sdata4 = 0x0b,
DW_EH_PE_sdata8 = 0x0c,
DW_EH_PE_pcrel = 0x10,
DW_EH_PE_textrel = 0x20,
DW_EH_PE_datarel = 0x30,
DW_EH_PE_funcrel = 0x40,
DW_EH_PE_aligned = 0x50,
DW_EH_PE_indirect = 0x80,
DW_EH_PE_omit = 0xff,
)
pyelftools-0.26/elftools/dwarf/lineprogram.py 0000664 0000000 0000000 00000026665 13572204573 0021513 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/lineprogram.py
#
# DWARF line number program
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
import os
import copy
from collections import namedtuple
from ..common.utils import struct_parse
from .constants import *
# LineProgramEntry - an entry in the line program.
# A line program is a sequence of encoded entries. Some of these entries add a
# new LineState (mapping between line and address), and some don't.
#
# command:
# The command/opcode - always numeric. For standard commands - it's the opcode
# that can be matched with one of the DW_LNS_* constants. For extended commands
# it's the extended opcode that can be matched with one of the DW_LNE_*
# constants. For special commands, it's the opcode itself.
#
# args:
# A list of decoded arguments of the command.
#
# is_extended:
# Since extended commands are encoded by a zero followed by an extended
# opcode, and these extended opcodes overlap with other opcodes, this
# flag is needed to mark that the command has an extended opcode.
#
# state:
# For commands that add a new state, it's the relevant LineState object.
# For commands that don't add a new state, it's None.
#
LineProgramEntry = namedtuple(
'LineProgramEntry', 'command is_extended args state')
class LineState(object):
""" Represents a line program state (or a "row" in the matrix
describing debug location information for addresses).
The instance variables of this class are the "state machine registers"
described in section 6.2.2 of DWARFv3
"""
def __init__(self, default_is_stmt):
self.address = 0
self.file = 1
self.line = 1
self.column = 0
self.op_index = 0
self.is_stmt = default_is_stmt
self.basic_block = False
self.end_sequence = False
self.prologue_end = False
self.epilogue_begin = False
self.isa = 0
def __repr__(self):
a = ['\n'
class LineProgram(object):
""" Builds a "line table", which is essentially the matrix described
in section 6.2 of DWARFv3. It's a list of LineState objects,
sorted by increasing address, so it can be used to obtain the
state information for each address.
"""
def __init__(self, header, stream, structs,
program_start_offset, program_end_offset):
"""
header:
The header of this line program. Note: LineProgram may modify
its header by appending file entries if DW_LNE_define_file
instructions are encountered.
stream:
The stream this program can be read from.
structs:
A DWARFStructs instance suitable for this line program
program_{start|end}_offset:
Offset in the debug_line section stream where this program
starts (the actual program, after the header), and where it
ends.
The actual range includes start but not end: [start, end - 1]
"""
self.stream = stream
self.header = header
self.structs = structs
self.program_start_offset = program_start_offset
self.program_end_offset = program_end_offset
self._decoded_entries = None
def get_entries(self):
""" Get the decoded entries for this line program. Return a list of
LineProgramEntry objects.
Note that this contains more information than absolutely required
for the line table. The line table can be easily extracted from
the list of entries by looking only at entries with non-None
state. The extra information is mainly for the purposes of display
with readelf and debugging.
"""
if self._decoded_entries is None:
self._decoded_entries = self._decode_line_program()
return self._decoded_entries
#------ PRIVATE ------#
def __getitem__(self, name):
""" Implement dict-like access to header entries
"""
return self.header[name]
def _decode_line_program(self):
entries = []
state = LineState(self.header['default_is_stmt'])
def add_entry_new_state(cmd, args, is_extended=False):
# Add an entry that sets a new state.
# After adding, clear some state registers.
entries.append(LineProgramEntry(
cmd, is_extended, args, copy.copy(state)))
state.basic_block = False
state.prologue_end = False
state.epilogue_begin = False
def add_entry_old_state(cmd, args, is_extended=False):
# Add an entry that doesn't visibly set a new state
entries.append(LineProgramEntry(cmd, is_extended, args, None))
offset = self.program_start_offset
while offset < self.program_end_offset:
opcode = struct_parse(
self.structs.Dwarf_uint8(''),
self.stream,
offset)
# As an exercise in avoiding premature optimization, if...elif
# chains are used here for standard and extended opcodes instead
# of dispatch tables. This keeps the code much cleaner. Besides,
# the majority of instructions in a typical program are special
# opcodes anyway.
if opcode >= self.header['opcode_base']:
# Special opcode (follow the recipe in 6.2.5.1)
maximum_operations_per_instruction = self['maximum_operations_per_instruction']
adjusted_opcode = opcode - self['opcode_base']
operation_advance = adjusted_opcode // self['line_range']
address_addend = (
self['minimum_instruction_length'] *
((state.op_index + operation_advance) //
maximum_operations_per_instruction))
state.address += address_addend
state.op_index = (state.op_index + operation_advance) % maximum_operations_per_instruction
line_addend = self['line_base'] + (adjusted_opcode % self['line_range'])
state.line += line_addend
add_entry_new_state(
opcode, [line_addend, address_addend, state.op_index])
elif opcode == 0:
# Extended opcode: start with a zero byte, followed by
# instruction size and the instruction itself.
inst_len = struct_parse(self.structs.Dwarf_uleb128(''),
self.stream)
ex_opcode = struct_parse(self.structs.Dwarf_uint8(''),
self.stream)
if ex_opcode == DW_LNE_end_sequence:
state.end_sequence = True
add_entry_new_state(ex_opcode, [], is_extended=True)
# reset state
state = LineState(self.header['default_is_stmt'])
elif ex_opcode == DW_LNE_set_address:
operand = struct_parse(self.structs.Dwarf_target_addr(''),
self.stream)
state.address = operand
add_entry_old_state(ex_opcode, [operand], is_extended=True)
elif ex_opcode == DW_LNE_define_file:
operand = struct_parse(
self.structs.Dwarf_lineprog_file_entry, self.stream)
self['file_entry'].append(operand)
add_entry_old_state(ex_opcode, [operand], is_extended=True)
else:
# Unknown, but need to roll forward the stream because the
# length is specified. Seek forward inst_len - 1 because
# we've already read the extended opcode, which takes part
# in the length.
self.stream.seek(inst_len - 1, os.SEEK_CUR)
else: # 0 < opcode < opcode_base
# Standard opcode
if opcode == DW_LNS_copy:
add_entry_new_state(opcode, [])
elif opcode == DW_LNS_advance_pc:
operand = struct_parse(self.structs.Dwarf_uleb128(''),
self.stream)
address_addend = (
operand * self.header['minimum_instruction_length'])
state.address += address_addend
add_entry_old_state(opcode, [address_addend])
elif opcode == DW_LNS_advance_line:
operand = struct_parse(self.structs.Dwarf_sleb128(''),
self.stream)
state.line += operand
elif opcode == DW_LNS_set_file:
operand = struct_parse(self.structs.Dwarf_uleb128(''),
self.stream)
state.file = operand
add_entry_old_state(opcode, [operand])
elif opcode == DW_LNS_set_column:
operand = struct_parse(self.structs.Dwarf_uleb128(''),
self.stream)
state.column = operand
add_entry_old_state(opcode, [operand])
elif opcode == DW_LNS_negate_stmt:
state.is_stmt = not state.is_stmt
add_entry_old_state(opcode, [])
elif opcode == DW_LNS_set_basic_block:
state.basic_block = True
add_entry_old_state(opcode, [])
elif opcode == DW_LNS_const_add_pc:
adjusted_opcode = 255 - self['opcode_base']
address_addend = ((adjusted_opcode // self['line_range']) *
self['minimum_instruction_length'])
state.address += address_addend
add_entry_old_state(opcode, [address_addend])
elif opcode == DW_LNS_fixed_advance_pc:
operand = struct_parse(self.structs.Dwarf_uint16(''),
self.stream)
state.address += operand
add_entry_old_state(opcode, [operand])
elif opcode == DW_LNS_set_prologue_end:
state.prologue_end = True
add_entry_old_state(opcode, [])
elif opcode == DW_LNS_set_epilogue_begin:
state.epilogue_begin = True
add_entry_old_state(opcode, [])
elif opcode == DW_LNS_set_isa:
operand = struct_parse(self.structs.Dwarf_uleb128(''),
self.stream)
state.isa = operand
add_entry_old_state(opcode, [operand])
else:
dwarf_assert(False, 'Invalid standard line program opcode: %s' % (
opcode,))
offset = self.stream.tell()
return entries
pyelftools-0.26/elftools/dwarf/locationlists.py 0000664 0000000 0000000 00000011641 13572204573 0022047 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/locationlists.py
#
# DWARF location lists section decoding (.debug_loc)
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
import os
from collections import namedtuple
from ..common.utils import struct_parse
LocationExpr = namedtuple('LocationExpr', 'loc_expr')
LocationEntry = namedtuple('LocationEntry', 'begin_offset end_offset loc_expr')
BaseAddressEntry = namedtuple('BaseAddressEntry', 'base_address')
class LocationLists(object):
""" A single location list is a Python list consisting of LocationEntry or
BaseAddressEntry objects.
"""
def __init__(self, stream, structs):
self.stream = stream
self.structs = structs
self._max_addr = 2 ** (self.structs.address_size * 8) - 1
def get_location_list_at_offset(self, offset):
""" Get a location list at the given offset in the section.
"""
self.stream.seek(offset, os.SEEK_SET)
return self._parse_location_list_from_stream()
def iter_location_lists(self):
""" Yield all location lists found in the section.
"""
# Just call _parse_location_list_from_stream until the stream ends
self.stream.seek(0, os.SEEK_END)
endpos = self.stream.tell()
self.stream.seek(0, os.SEEK_SET)
while self.stream.tell() < endpos:
yield self._parse_location_list_from_stream()
#------ PRIVATE ------#
def _parse_location_list_from_stream(self):
lst = []
while True:
begin_offset = struct_parse(
self.structs.Dwarf_target_addr(''), self.stream)
end_offset = struct_parse(
self.structs.Dwarf_target_addr(''), self.stream)
if begin_offset == 0 and end_offset == 0:
# End of list - we're done.
break
elif begin_offset == self._max_addr:
# Base address selection entry
lst.append(BaseAddressEntry(base_address=end_offset))
else:
# Location list entry
expr_len = struct_parse(
self.structs.Dwarf_uint16(''), self.stream)
loc_expr = [struct_parse(self.structs.Dwarf_uint8(''),
self.stream)
for i in range(expr_len)]
lst.append(LocationEntry(
begin_offset=begin_offset,
end_offset=end_offset,
loc_expr=loc_expr))
return lst
class LocationParser(object):
""" A parser for location information in DIEs.
Handles both location information contained within the attribute
itself (represented as a LocationExpr object) and references to
location lists in the .debug_loc section (represented as a
list).
"""
def __init__(self, location_lists):
self.location_lists = location_lists
@staticmethod
def attribute_has_location(attr, dwarf_version):
""" Checks if a DIE attribute contains location information.
"""
return (LocationParser._attribute_is_loclistptr_class(attr) and
(LocationParser._attribute_has_loc_expr(attr, dwarf_version) or
LocationParser._attribute_has_loc_list(attr, dwarf_version)))
def parse_from_attribute(self, attr, dwarf_version):
""" Parses a DIE attribute and returns either a LocationExpr or
a list.
"""
if self.attribute_has_location(attr, dwarf_version):
if self._attribute_has_loc_expr(attr, dwarf_version):
return LocationExpr(attr.value)
elif self._attribute_has_loc_list(attr, dwarf_version):
return self.location_lists.get_location_list_at_offset(
attr.value)
else:
raise ValueError("Attribute does not have location information")
#------ PRIVATE ------#
@staticmethod
def _attribute_has_loc_expr(attr, dwarf_version):
return (dwarf_version < 4 and attr.form == 'DW_FORM_block1' or
attr.form == 'DW_FORM_exprloc')
@staticmethod
def _attribute_has_loc_list(attr, dwarf_version):
return ((dwarf_version < 4 and
attr.form in ('DW_FORM_data4', 'DW_FORM_data8')) or
attr.form == 'DW_FORM_sec_offset')
@staticmethod
def _attribute_is_loclistptr_class(attr):
return (attr.name in ( 'DW_AT_location', 'DW_AT_string_length',
'DW_AT_const_value', 'DW_AT_return_addr',
'DW_AT_data_member_location',
'DW_AT_frame_base', 'DW_AT_segment',
'DW_AT_static_link', 'DW_AT_use_location',
'DW_AT_vtable_elem_location'))
pyelftools-0.26/elftools/dwarf/namelut.py 0000775 0000000 0000000 00000016233 13572204573 0020632 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/namelut.py
#
# DWARF pubtypes/pubnames section decoding (.debug_pubtypes, .debug_pubnames)
#
# Vijay Ramasami (rvijayc@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
import os
import collections
from collections import OrderedDict
from ..common.utils import struct_parse
from ..common.py3compat import Mapping
from bisect import bisect_right
import math
from ..construct import CString, Struct, If
NameLUTEntry = collections.namedtuple('NameLUTEntry', 'cu_ofs die_ofs')
class NameLUT(Mapping):
"""
A "Name LUT" holds any of the tables specified by .debug_pubtypes or
.debug_pubnames sections. This is basically a dictionary where the key is
the symbol name (either a public variable, function or a type), and the
value is the tuple (cu_offset, die_offset) corresponding to the variable.
The die_offset is an absolute offset (meaning, it can be used to search the
CU by iterating until a match is obtained).
An ordered dictionary is used to preserve the CU order (i.e, items are
stored on a per-CU basis (as it was originally in the .debug_* section).
Usage:
The NameLUT walks and talks like a dictionary and hence it can be used as
such. Some examples below:
# get the pubnames (a NameLUT from DWARF info).
pubnames = dwarf_info.get_pubnames()
# lookup a variable.
entry1 = pubnames["var_name1"]
entry2 = pubnames.get("var_name2", default=)
print(entry2.cu_ofs)
...
# iterate over items.
for (name, entry) in pubnames.items():
# do stuff with name, entry.cu_ofs, entry.die_ofs
# iterate over items on a per-CU basis.
import itertools
for cu_ofs, item_list in itertools.groupby(pubnames.items(),
key = lambda x: x[1].cu_ofs):
# items are now grouped by cu_ofs.
# item_list is an iterator yeilding NameLUTEntry'ies belonging
# to cu_ofs.
# We can parse the CU at cu_offset and use the parsed CU results
# to parse the pubname DIEs in the CU listed by item_list.
for item in item_list:
# work with item which is part of the CU with cu_ofs.
"""
def __init__(self, stream, size, structs):
self._stream = stream
self._size = size
self._structs = structs
# entries are lazily loaded on demand.
self._entries = None
# CU headers (for readelf).
self._cu_headers = None
def get_entries(self):
"""
Returns the parsed NameLUT entries. The returned object is a dictionary
with the symbol name as the key and NameLUTEntry(cu_ofs, die_ofs) as
the value.
This is useful when dealing with very large ELF files with millions of
entries. The returned entries can be pickled to a file and restored by
calling set_entries on subsequent loads.
"""
if self._entries is None:
self._entries, self._cu_headers = self._get_entries()
return self._entries
def set_entries(self, entries, cu_headers):
"""
Set the NameLUT entries from an external source. The input is a
dictionary with the symbol name as the key and NameLUTEntry(cu_ofs,
die_ofs) as the value.
This option is useful when dealing with very large ELF files with
millions of entries. The entries can be parsed once and pickled to a
file and can be restored via this function on subsequent loads.
"""
self._entries = entries
self._cu_headers = cu_headers
def __len__(self):
"""
Returns the number of entries in the NameLUT.
"""
if self._entries is None:
self._entries, self._cu_headers = self._get_entries()
return len(self._entries)
def __getitem__(self, name):
"""
Returns a namedtuple - NameLUTEntry(cu_ofs, die_ofs) - that corresponds
to the given symbol name.
"""
if self._entries is None:
self._entries, self._cu_headers = self._get_entries()
return self._entries.get(name)
def __iter__(self):
"""
Returns an iterator to the NameLUT dictionary.
"""
if self._entries is None:
self._entries, self._cu_headers = self._get_entries()
return iter(self._entries)
def items(self):
"""
Returns the NameLUT dictionary items.
"""
if self._entries is None:
self._entries, self._cu_headers = self._get_entries()
return self._entries.items()
def get(self, name, default=None):
"""
Returns NameLUTEntry(cu_ofs, die_ofs) for the provided symbol name or
None if the symbol does not exist in the corresponding section.
"""
if self._entries is None:
self._entries, self._cu_headers = self._get_entries()
return self._entries.get(name, default)
def get_cu_headers(self):
"""
Returns all CU headers. Mainly required for readelf.
"""
if self._cu_headers is None:
self._entries, self._cu_headers = self._get_entries()
return self._cu_headers
def _get_entries(self):
"""
Parse the (name, cu_ofs, die_ofs) information from this section and
store as a dictionary.
"""
self._stream.seek(0)
entries = OrderedDict()
cu_headers = []
offset = 0
# According to 6.1.1. of DWARFv4, each set of names is terminated by
# an offset field containing zero (and no following string). Because
# of sequential parsing, every next entry may be that terminator.
# So, field "name" is conditional.
entry_struct = Struct("Dwarf_offset_name_pair",
self._structs.Dwarf_offset('die_ofs'),
If(lambda ctx: ctx['die_ofs'], CString('name')))
# each run of this loop will fetch one CU worth of entries.
while offset < self._size:
# read the header for this CU.
namelut_hdr = struct_parse(self._structs.Dwarf_nameLUT_header,
self._stream, offset)
cu_headers.append(namelut_hdr)
# compute the next offset.
offset = (offset + namelut_hdr.unit_length +
self._structs.initial_length_field_size())
# before inner loop, latch data that will be used in the inner
# loop to avoid attribute access and other computation.
hdr_cu_ofs = namelut_hdr.debug_info_offset
# while die_ofs of the entry is non-zero (which indicates the end) ...
while True:
entry = struct_parse(entry_struct, self._stream)
# if it is zero, this is the terminating record.
if entry.die_ofs == 0:
break
# add this entry to the look-up dictionary.
entries[entry.name.decode('utf-8')] = NameLUTEntry(
cu_ofs = hdr_cu_ofs,
die_ofs = hdr_cu_ofs + entry.die_ofs)
# return the entries parsed so far.
return (entries, cu_headers)
pyelftools-0.26/elftools/dwarf/ranges.py 0000664 0000000 0000000 00000004331 13572204573 0020435 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/ranges.py
#
# DWARF ranges section decoding (.debug_ranges)
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
import os
from collections import namedtuple
from ..common.utils import struct_parse
RangeEntry = namedtuple('RangeEntry', 'begin_offset end_offset')
BaseAddressEntry = namedtuple('BaseAddressEntry', 'base_address')
class RangeLists(object):
""" A single range list is a Python list consisting of RangeEntry or
BaseAddressEntry objects.
"""
def __init__(self, stream, structs):
self.stream = stream
self.structs = structs
self._max_addr = 2 ** (self.structs.address_size * 8) - 1
def get_range_list_at_offset(self, offset):
""" Get a range list at the given offset in the section.
"""
self.stream.seek(offset, os.SEEK_SET)
return self._parse_range_list_from_stream()
def iter_range_lists(self):
""" Yield all range lists found in the section.
"""
# Just call _parse_range_list_from_stream until the stream ends
self.stream.seek(0, os.SEEK_END)
endpos = self.stream.tell()
self.stream.seek(0, os.SEEK_SET)
while self.stream.tell() < endpos:
yield self._parse_range_list_from_stream()
#------ PRIVATE ------#
def _parse_range_list_from_stream(self):
lst = []
while True:
begin_offset = struct_parse(
self.structs.Dwarf_target_addr(''), self.stream)
end_offset = struct_parse(
self.structs.Dwarf_target_addr(''), self.stream)
if begin_offset == 0 and end_offset == 0:
# End of list - we're done.
break
elif begin_offset == self._max_addr:
# Base address selection entry
lst.append(BaseAddressEntry(base_address=end_offset))
else:
# Range entry
lst.append(RangeEntry(
begin_offset=begin_offset,
end_offset=end_offset))
return lst
pyelftools-0.26/elftools/dwarf/structs.py 0000664 0000000 0000000 00000030747 13572204573 0020677 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: dwarf/structs.py
#
# Encapsulation of Construct structs for parsing DWARF, adjusted for correct
# endianness and word-size.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..construct import (
UBInt8, UBInt16, UBInt32, UBInt64, ULInt8, ULInt16, ULInt32, ULInt64,
SBInt8, SBInt16, SBInt32, SBInt64, SLInt8, SLInt16, SLInt32, SLInt64,
Adapter, Struct, ConstructError, If, Enum, Array, PrefixedArray,
CString, Embed, StaticField
)
from ..common.construct_utils import RepeatUntilExcluding, ULEB128, SLEB128
from .enums import *
class DWARFStructs(object):
""" Exposes Construct structs suitable for parsing information from DWARF
sections. Each compile unit in DWARF info can have its own structs
object. Keep in mind that these structs have to be given a name (by
calling them with a name) before being used for parsing (like other
Construct structs). Those that should be used without a name are marked
by (+).
Accessible attributes (mostly as described in chapter 7 of the DWARF
spec v3):
Dwarf_[u]int{8,16,32,64):
Data chunks of the common sizes
Dwarf_offset:
32-bit or 64-bit word, depending on dwarf_format
Dwarf_length:
32-bit or 64-bit word, depending on dwarf_format
Dwarf_target_addr:
32-bit or 64-bit word, depending on address size
Dwarf_initial_length:
"Initial length field" encoding
section 7.4
Dwarf_{u,s}leb128:
ULEB128 and SLEB128 variable-length encoding
Dwarf_CU_header (+):
Compilation unit header
Dwarf_abbrev_declaration (+):
Abbreviation table declaration - doesn't include the initial
code, only the contents.
Dwarf_dw_form (+):
A dictionary mapping 'DW_FORM_*' keys into construct Structs
that parse such forms. These Structs have already been given
dummy names.
Dwarf_lineprog_header (+):
Line program header
Dwarf_lineprog_file_entry (+):
A single file entry in a line program header or instruction
Dwarf_CIE_header (+):
A call-frame CIE
Dwarf_FDE_header (+):
A call-frame FDE
See also the documentation of public methods.
"""
def __init__(self,
little_endian, dwarf_format, address_size, dwarf_version=2):
""" dwarf_version:
Numeric DWARF version
little_endian:
True if the file is little endian, False if big
dwarf_format:
DWARF Format: 32 or 64-bit (see spec section 7.4)
address_size:
Target machine address size, in bytes (4 or 8). (See spec
section 7.5.1)
"""
assert dwarf_format == 32 or dwarf_format == 64
assert address_size == 8 or address_size == 4
self.little_endian = little_endian
self.dwarf_format = dwarf_format
self.address_size = address_size
self.dwarf_version = dwarf_version
self._create_structs()
def initial_length_field_size(self):
""" Size of an initial length field.
"""
return 4 if self.dwarf_format == 32 else 12
def _create_structs(self):
if self.little_endian:
self.Dwarf_uint8 = ULInt8
self.Dwarf_uint16 = ULInt16
self.Dwarf_uint32 = ULInt32
self.Dwarf_uint64 = ULInt64
self.Dwarf_offset = ULInt32 if self.dwarf_format == 32 else ULInt64
self.Dwarf_length = ULInt32 if self.dwarf_format == 32 else ULInt64
self.Dwarf_target_addr = (
ULInt32 if self.address_size == 4 else ULInt64)
self.Dwarf_int8 = SLInt8
self.Dwarf_int16 = SLInt16
self.Dwarf_int32 = SLInt32
self.Dwarf_int64 = SLInt64
else:
self.Dwarf_uint8 = UBInt8
self.Dwarf_uint16 = UBInt16
self.Dwarf_uint32 = UBInt32
self.Dwarf_uint64 = UBInt64
self.Dwarf_offset = UBInt32 if self.dwarf_format == 32 else UBInt64
self.Dwarf_length = UBInt32 if self.dwarf_format == 32 else UBInt64
self.Dwarf_target_addr = (
UBInt32 if self.address_size == 4 else UBInt64)
self.Dwarf_int8 = SBInt8
self.Dwarf_int16 = SBInt16
self.Dwarf_int32 = SBInt32
self.Dwarf_int64 = SBInt64
self._create_initial_length()
self._create_leb128()
self._create_cu_header()
self._create_abbrev_declaration()
self._create_dw_form()
self._create_lineprog_header()
self._create_callframe_entry_headers()
self._create_aranges_header()
self._create_nameLUT_header()
def _create_initial_length(self):
def _InitialLength(name):
# Adapts a Struct that parses forward a full initial length field.
# Only if the first word is the continuation value, the second
# word is parsed from the stream.
return _InitialLengthAdapter(
Struct(name,
self.Dwarf_uint32('first'),
If(lambda ctx: ctx.first == 0xFFFFFFFF,
self.Dwarf_uint64('second'),
elsevalue=None)))
self.Dwarf_initial_length = _InitialLength
def _create_leb128(self):
self.Dwarf_uleb128 = ULEB128
self.Dwarf_sleb128 = SLEB128
def _create_cu_header(self):
self.Dwarf_CU_header = Struct('Dwarf_CU_header',
self.Dwarf_initial_length('unit_length'),
self.Dwarf_uint16('version'),
self.Dwarf_offset('debug_abbrev_offset'),
self.Dwarf_uint8('address_size'))
def _create_abbrev_declaration(self):
self.Dwarf_abbrev_declaration = Struct('Dwarf_abbrev_entry',
Enum(self.Dwarf_uleb128('tag'), **ENUM_DW_TAG),
Enum(self.Dwarf_uint8('children_flag'), **ENUM_DW_CHILDREN),
RepeatUntilExcluding(
lambda obj, ctx:
obj.name == 'DW_AT_null' and obj.form == 'DW_FORM_null',
Struct('attr_spec',
Enum(self.Dwarf_uleb128('name'), **ENUM_DW_AT),
Enum(self.Dwarf_uleb128('form'), **ENUM_DW_FORM))))
def _create_dw_form(self):
self.Dwarf_dw_form = dict(
DW_FORM_addr=self.Dwarf_target_addr(''),
DW_FORM_block1=self._make_block_struct(self.Dwarf_uint8),
DW_FORM_block2=self._make_block_struct(self.Dwarf_uint16),
DW_FORM_block4=self._make_block_struct(self.Dwarf_uint32),
DW_FORM_block=self._make_block_struct(self.Dwarf_uleb128),
# All DW_FORM_data forms are assumed to be unsigned
DW_FORM_data1=self.Dwarf_uint8(''),
DW_FORM_data2=self.Dwarf_uint16(''),
DW_FORM_data4=self.Dwarf_uint32(''),
DW_FORM_data8=self.Dwarf_uint64(''),
DW_FORM_sdata=self.Dwarf_sleb128(''),
DW_FORM_udata=self.Dwarf_uleb128(''),
DW_FORM_string=CString(''),
DW_FORM_strp=self.Dwarf_offset(''),
DW_FORM_flag=self.Dwarf_uint8(''),
DW_FORM_ref1=self.Dwarf_uint8(''),
DW_FORM_ref2=self.Dwarf_uint16(''),
DW_FORM_ref4=self.Dwarf_uint32(''),
DW_FORM_ref8=self.Dwarf_uint64(''),
DW_FORM_ref_udata=self.Dwarf_uleb128(''),
DW_FORM_ref_addr=self.Dwarf_offset(''),
DW_FORM_indirect=self.Dwarf_uleb128(''),
# New forms in DWARFv4
DW_FORM_flag_present = StaticField('', 0),
DW_FORM_sec_offset = self.Dwarf_offset(''),
DW_FORM_exprloc = self._make_block_struct(self.Dwarf_uleb128),
DW_FORM_ref_sig8 = self.Dwarf_uint64(''),
DW_FORM_GNU_strp_alt=self.Dwarf_offset(''),
DW_FORM_GNU_ref_alt=self.Dwarf_offset(''),
DW_AT_GNU_all_call_sites=self.Dwarf_uleb128(''),
)
def _create_aranges_header(self):
self.Dwarf_aranges_header = Struct("Dwarf_aranges_header",
self.Dwarf_initial_length('unit_length'),
self.Dwarf_uint16('version'),
self.Dwarf_offset('debug_info_offset'), # a little tbd
self.Dwarf_uint8('address_size'),
self.Dwarf_uint8('segment_size')
)
def _create_nameLUT_header(self):
self.Dwarf_nameLUT_header = Struct("Dwarf_nameLUT_header",
self.Dwarf_initial_length('unit_length'),
self.Dwarf_uint16('version'),
self.Dwarf_offset('debug_info_offset'),
self.Dwarf_length('debug_info_length')
)
def _create_lineprog_header(self):
# A file entry is terminated by a NULL byte, so we don't want to parse
# past it. Therefore an If is used.
self.Dwarf_lineprog_file_entry = Struct('file_entry',
CString('name'),
If(lambda ctx: len(ctx.name) != 0,
Embed(Struct('',
self.Dwarf_uleb128('dir_index'),
self.Dwarf_uleb128('mtime'),
self.Dwarf_uleb128('length')))))
self.Dwarf_lineprog_header = Struct('Dwarf_lineprog_header',
self.Dwarf_initial_length('unit_length'),
self.Dwarf_uint16('version'),
self.Dwarf_offset('header_length'),
self.Dwarf_uint8('minimum_instruction_length'),
If(lambda ctx: ctx['version'] >= 4,
self.Dwarf_uint8("maximum_operations_per_instruction"),
1),
self.Dwarf_uint8('default_is_stmt'),
self.Dwarf_int8('line_base'),
self.Dwarf_uint8('line_range'),
self.Dwarf_uint8('opcode_base'),
Array(lambda ctx: ctx['opcode_base'] - 1,
self.Dwarf_uint8('standard_opcode_lengths')),
RepeatUntilExcluding(
lambda obj, ctx: obj == b'',
CString('include_directory')),
RepeatUntilExcluding(
lambda obj, ctx: len(obj.name) == 0,
self.Dwarf_lineprog_file_entry),
)
def _create_callframe_entry_headers(self):
self.Dwarf_CIE_header = Struct('Dwarf_CIE_header',
self.Dwarf_initial_length('length'),
self.Dwarf_offset('CIE_id'),
self.Dwarf_uint8('version'),
CString('augmentation'),
self.Dwarf_uleb128('code_alignment_factor'),
self.Dwarf_sleb128('data_alignment_factor'),
self.Dwarf_uleb128('return_address_register'))
self.EH_CIE_header = self.Dwarf_CIE_header
# The CIE header was modified in DWARFv4.
if self.dwarf_version == 4:
self.Dwarf_CIE_header = Struct('Dwarf_CIE_header',
self.Dwarf_initial_length('length'),
self.Dwarf_offset('CIE_id'),
self.Dwarf_uint8('version'),
CString('augmentation'),
self.Dwarf_uint8('address_size'),
self.Dwarf_uint8('segment_size'),
self.Dwarf_uleb128('code_alignment_factor'),
self.Dwarf_sleb128('data_alignment_factor'),
self.Dwarf_uleb128('return_address_register'))
self.Dwarf_FDE_header = Struct('Dwarf_FDE_header',
self.Dwarf_initial_length('length'),
self.Dwarf_offset('CIE_pointer'),
self.Dwarf_target_addr('initial_location'),
self.Dwarf_target_addr('address_range'))
def _make_block_struct(self, length_field):
""" Create a struct for DW_FORM_block
"""
return PrefixedArray(
subcon=self.Dwarf_uint8('elem'),
length_field=length_field(''))
class _InitialLengthAdapter(Adapter):
""" A standard Construct adapter that expects a sub-construct
as a struct with one or two values (first, second).
"""
def _decode(self, obj, context):
if obj.first < 0xFFFFFF00:
return obj.first
else:
if obj.first == 0xFFFFFFFF:
return obj.second
else:
raise ConstructError("Failed decoding initial length for %X" % (
obj.first))
pyelftools-0.26/elftools/elf/ 0000775 0000000 0000000 00000000000 13572204573 0016246 5 ustar 00root root 0000000 0000000 pyelftools-0.26/elftools/elf/__init__.py 0000664 0000000 0000000 00000000000 13572204573 0020345 0 ustar 00root root 0000000 0000000 pyelftools-0.26/elftools/elf/constants.py 0000664 0000000 0000000 00000006166 13572204573 0020645 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/constants.py
#
# Constants and flags, placed into classes for namespacing
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
class E_FLAGS(object):
""" Flag values for the e_flags field of the ELF header
"""
EF_ARM_EABIMASK=0xFF000000
EF_ARM_EABI_VER1=0x01000000
EF_ARM_EABI_VER2=0x02000000
EF_ARM_EABI_VER3=0x03000000
EF_ARM_EABI_VER4=0x04000000
EF_ARM_EABI_VER5=0x05000000
EF_ARM_GCCMASK=0x00400FFF
EF_ARM_RELEXEC=0x01
EF_ARM_HASENTRY=0x02
EF_ARM_SYMSARESORTED=0x04
EF_ARM_DYNSYMSUSESEGIDX=0x8
EF_ARM_MAPSYMSFIRST=0x10
EF_ARM_LE8=0x00400000
EF_ARM_BE8=0x00800000
EF_ARM_ABI_FLOAT_SOFT=0x00000200
EF_ARM_ABI_FLOAT_HARD=0x00000400
EF_MIPS_NOREORDER=1
EF_MIPS_PIC=2
EF_MIPS_CPIC=4
EF_MIPS_XGOT=8
EF_MIPS_64BIT_WHIRL=16
EF_MIPS_ABI2=32
EF_MIPS_ABI_ON32=64
EF_MIPS_32BITMODE = 256
EF_MIPS_NAN2008=1024
EF_MIPS_ARCH=0xf0000000
EF_MIPS_ARCH_1=0x00000000
EF_MIPS_ARCH_2=0x10000000
EF_MIPS_ARCH_3=0x20000000
EF_MIPS_ARCH_4=0x30000000
EF_MIPS_ARCH_5=0x40000000
EF_MIPS_ARCH_32=0x50000000
EF_MIPS_ARCH_64=0x60000000
EF_MIPS_ARCH_32R2=0x70000000
EF_MIPS_ARCH_64R2=0x80000000
class E_FLAGS_MASKS(object):
"""Masks to be used for convenience when working with E_FLAGS
This is a simplified approach that is also used by GNU binutils
readelf
"""
EFM_MIPS_ABI = 0x0000F000
EFM_MIPS_ABI_O32 = 0x00001000
EFM_MIPS_ABI_O64 = 0x00002000
EFM_MIPS_ABI_EABI32 = 0x00003000
EFM_MIPS_ABI_EABI64 = 0x00004000
class SHN_INDICES(object):
""" Special section indices
"""
SHN_UNDEF=0
SHN_LORESERVE=0xff00
SHN_LOPROC=0xff00
SHN_HIPROC=0xff1f
SHN_ABS=0xfff1
SHN_COMMON=0xfff2
SHN_HIRESERVE=0xffff
SHN_XINDEX=0xffff
class SH_FLAGS(object):
""" Flag values for the sh_flags field of section headers
"""
SHF_WRITE=0x1
SHF_ALLOC=0x2
SHF_EXECINSTR=0x4
SHF_MERGE=0x10
SHF_STRINGS=0x20
SHF_INFO_LINK=0x40
SHF_LINK_ORDER=0x80
SHF_OS_NONCONFORMING=0x100
SHF_GROUP=0x200
SHF_TLS=0x400
SHF_COMPRESSED=0x800
SHF_MASKOS=0x0ff00000
SHF_EXCLUDE=0x80000000
SHF_MASKPROC=0xf0000000
class P_FLAGS(object):
""" Flag values for the p_flags field of program headers
"""
PF_X=0x1
PF_W=0x2
PF_R=0x4
PF_MASKOS=0x00FF0000
PF_MASKPROC=0xFF000000
# symbol info flags for entries
# in the .SUNW_syminfo section
class SUNW_SYMINFO_FLAGS(object):
""" Flags for the si_flags field of entries
in the .SUNW_syminfo section
"""
SYMINFO_FLG_DIRECT=0x1
SYMINFO_FLG_FILTER=0x2
SYMINFO_FLG_COPY=0x4
SYMINFO_FLG_LAZYLOAD=0x8
SYMINFO_FLG_DIRECTBIND=0x10
SYMINFO_FLG_NOEXTDIRECT=0x20
SYMINFO_FLG_AUXILIARY=0x40
SYMINFO_FLG_INTERPOSE=0x80
SYMINFO_FLG_CAP=0x100
SYMINFO_FLG_DEFERRED=0x200
class VER_FLAGS(object):
VER_FLG_BASE=0x1
VER_FLG_WEAK=0x2
VER_FLG_INFO=0x4
pyelftools-0.26/elftools/elf/descriptions.py 0000664 0000000 0000000 00000052471 13572204573 0021337 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/descriptions.py
#
# Textual descriptions of the various enums and flags of ELF
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from .enums import (
ENUM_D_TAG, ENUM_E_VERSION, ENUM_P_TYPE_BASE, ENUM_SH_TYPE_BASE,
ENUM_RELOC_TYPE_i386, ENUM_RELOC_TYPE_x64,
ENUM_RELOC_TYPE_ARM, ENUM_RELOC_TYPE_AARCH64, ENUM_RELOC_TYPE_MIPS,
ENUM_ATTR_TAG_ARM, ENUM_DT_FLAGS, ENUM_DT_FLAGS_1)
from .constants import P_FLAGS, SH_FLAGS, SUNW_SYMINFO_FLAGS, VER_FLAGS
from ..common.py3compat import iteritems
def describe_ei_class(x):
return _DESCR_EI_CLASS.get(x, _unknown)
def describe_ei_data(x):
return _DESCR_EI_DATA.get(x, _unknown)
def describe_ei_version(x):
s = '%d' % ENUM_E_VERSION[x]
if x == 'EV_CURRENT':
s += ' (current)'
return s
def describe_ei_osabi(x):
return _DESCR_EI_OSABI.get(x, _unknown)
def describe_e_type(x):
return _DESCR_E_TYPE.get(x, _unknown)
def describe_e_machine(x):
return _DESCR_E_MACHINE.get(x, _unknown)
def describe_e_version_numeric(x):
return '0x%x' % ENUM_E_VERSION[x]
def describe_p_type(x):
if x in _DESCR_P_TYPE:
return _DESCR_P_TYPE.get(x)
elif x >= ENUM_P_TYPE_BASE['PT_LOOS'] and x <= ENUM_P_TYPE_BASE['PT_HIOS']:
return 'LOOS+%lx' % (x - ENUM_P_TYPE_BASE['PT_LOOS'])
else:
return _unknown
def describe_p_flags(x):
s = ''
for flag in (P_FLAGS.PF_R, P_FLAGS.PF_W, P_FLAGS.PF_X):
s += _DESCR_P_FLAGS[flag] if (x & flag) else ' '
return s
def describe_sh_type(x):
if x in _DESCR_SH_TYPE:
return _DESCR_SH_TYPE.get(x)
elif (x >= ENUM_SH_TYPE_BASE['SHT_LOOS'] and
x < ENUM_SH_TYPE_BASE['SHT_GNU_versym']):
return 'loos+%lx' % (x - ENUM_SH_TYPE_BASE['SHT_LOOS'])
else:
return _unknown
def describe_sh_flags(x):
s = ''
for flag in (
SH_FLAGS.SHF_WRITE, SH_FLAGS.SHF_ALLOC, SH_FLAGS.SHF_EXECINSTR,
SH_FLAGS.SHF_MERGE, SH_FLAGS.SHF_STRINGS, SH_FLAGS.SHF_INFO_LINK,
SH_FLAGS.SHF_LINK_ORDER, SH_FLAGS.SHF_OS_NONCONFORMING,
SH_FLAGS.SHF_GROUP, SH_FLAGS.SHF_TLS, SH_FLAGS.SHF_EXCLUDE):
s += _DESCR_SH_FLAGS[flag] if (x & flag) else ''
return s
def describe_symbol_type(x):
return _DESCR_ST_INFO_TYPE.get(x, _unknown)
def describe_symbol_bind(x):
return _DESCR_ST_INFO_BIND.get(x, _unknown)
def describe_symbol_visibility(x):
return _DESCR_ST_VISIBILITY.get(x, _unknown)
def describe_symbol_shndx(x):
return _DESCR_ST_SHNDX.get(x, '%3s' % x)
def describe_reloc_type(x, elffile):
arch = elffile.get_machine_arch()
if arch == 'x86':
return _DESCR_RELOC_TYPE_i386.get(x, _unknown)
elif arch == 'x64':
return _DESCR_RELOC_TYPE_x64.get(x, _unknown)
elif arch == 'ARM':
return _DESCR_RELOC_TYPE_ARM.get(x, _unknown)
elif arch == 'AArch64':
return _DESCR_RELOC_TYPE_AARCH64.get(x, _unknown)
elif arch == 'MIPS':
return _DESCR_RELOC_TYPE_MIPS.get(x, _unknown)
else:
return 'unrecognized: %-7x' % (x & 0xFFFFFFFF)
def describe_dyn_tag(x):
return _DESCR_D_TAG.get(x, _unknown)
def describe_dt_flags(x):
return ' '.join(key[3:] for key, val in
sorted(ENUM_DT_FLAGS.items(), key=lambda t: t[1]) if x & val)
def describe_dt_flags_1(x):
return ' '.join(key[5:] for key, val in
sorted(ENUM_DT_FLAGS_1.items(), key=lambda t: t[1]) if x & val)
def describe_syminfo_flags(x):
return ''.join(_DESCR_SYMINFO_FLAGS[flag] for flag in (
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_CAP,
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_DIRECT,
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_FILTER,
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_AUXILIARY,
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_DIRECTBIND,
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_COPY,
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_LAZYLOAD,
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_NOEXTDIRECT,
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_INTERPOSE,
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_DEFERRED) if x & flag)
def describe_symbol_boundto(x):
return _DESCR_SYMINFO_BOUNDTO.get(x, '%3s' % x)
def describe_ver_flags(x):
return ' | '.join(_DESCR_VER_FLAGS[flag] for flag in (
VER_FLAGS.VER_FLG_WEAK,
VER_FLAGS.VER_FLG_BASE,
VER_FLAGS.VER_FLG_INFO) if x & flag)
def describe_note(x):
n_desc = x['n_desc']
desc = ''
if x['n_type'] == 'NT_GNU_ABI_TAG':
desc = '\n OS: %s, ABI: %d.%d.%d' % (
_DESCR_NOTE_ABI_TAG_OS.get(n_desc['abi_os'], _unknown),
n_desc['abi_major'], n_desc['abi_minor'], n_desc['abi_tiny'])
elif x['n_type'] == 'NT_GNU_BUILD_ID':
desc = '\n Build ID: %s' % (n_desc)
else:
desc = '\n description data: {}'.format(' '.join(
'{:02x}'.format(ord(byte)) for byte in n_desc
))
note_type = (x['n_type'] if isinstance(x['n_type'], str)
else 'Unknown note type:')
note_type_desc = ('0x%.8x' % x['n_type']
if isinstance(x['n_type'], int) else
_DESCR_NOTE_N_TYPE.get(x['n_type'], _unknown))
return '%s (%s)%s' % (note_type, note_type_desc, desc)
def describe_attr_tag_arm(tag, val, extra):
idx = ENUM_ATTR_TAG_ARM[tag] - 1
d_entry = _DESCR_ATTR_VAL_ARM[idx]
if d_entry is None:
if tag == 'TAG_COMPATIBILITY':
return (_DESCR_ATTR_TAG_ARM[tag]
+ 'flag = %d, vendor = %s' % (val, extra))
elif tag == 'TAG_ALSO_COMPATIBLE_WITH':
if val.tag == 'TAG_CPU_ARCH':
return _DESCR_ATTR_TAG_ARM[tag] + d_entry[val]
else:
return _DESCR_ATTR_TAG_ARM[tag] + '??? (%d)' % val.tag
elif tag == 'TAG_NODEFAULTS':
return _DESCR_ATTR_TAG_ARM[tag] + 'True'
s = _DESCR_ATTR_TAG_ARM[tag]
s += '"%s"' % val if val else ''
return s
else:
return _DESCR_ATTR_TAG_ARM[tag] + d_entry[val]
#-------------------------------------------------------------------------------
_unknown = ''
_DESCR_EI_CLASS = dict(
ELFCLASSNONE='none',
ELFCLASS32='ELF32',
ELFCLASS64='ELF64',
)
_DESCR_EI_DATA = dict(
ELFDATANONE='none',
ELFDATA2LSB="2's complement, little endian",
ELFDATA2MSB="2's complement, big endian",
)
_DESCR_EI_OSABI = dict(
ELFOSABI_SYSV='UNIX - System V',
ELFOSABI_HPUX='UNIX - HP-UX',
ELFOSABI_NETBSD='UNIX - NetBSD',
ELFOSABI_LINUX='UNIX - Linux',
ELFOSABI_HURD='UNIX - GNU/Hurd',
ELFOSABI_SOLARIS='UNIX - Solaris',
ELFOSABI_AIX='UNIX - AIX',
ELFOSABI_IRIX='UNIX - IRIX',
ELFOSABI_FREEBSD='UNIX - FreeBSD',
ELFOSABI_TRU64='UNIX - TRU64',
ELFOSABI_MODESTO='Novell - Modesto',
ELFOSABI_OPENBSD='UNIX - OpenBSD',
ELFOSABI_OPENVMS='VMS - OpenVMS',
ELFOSABI_NSK='HP - Non-Stop Kernel',
ELFOSABI_AROS='AROS',
ELFOSABI_FENIXOS='Fenix OS',
ELFOSABI_CLOUD='Nuxi - CloudABI',
ELFOSABI_SORTIX='Sortix',
ELFOSABI_ARM_AEABI='ARM - EABI',
ELFOSABI_ARM='ARM - ABI',
ELFOSABI_STANDALONE='Standalone App',
)
_DESCR_E_TYPE = dict(
ET_NONE='NONE (None)',
ET_REL='REL (Relocatable file)',
ET_EXEC='EXEC (Executable file)',
ET_DYN='DYN (Shared object file)',
ET_CORE='CORE (Core file)',
PROC_SPECIFIC='Processor Specific',
)
_DESCR_E_MACHINE = dict(
EM_NONE='None',
EM_M32='WE32100',
EM_SPARC='Sparc',
EM_386='Intel 80386',
EM_68K='MC68000',
EM_88K='MC88000',
EM_860='Intel 80860',
EM_MIPS='MIPS R3000',
EM_S370='IBM System/370',
EM_MIPS_RS4_BE='MIPS 4000 big-endian',
EM_IA_64='Intel IA-64',
EM_X86_64='Advanced Micro Devices X86-64',
EM_AVR='Atmel AVR 8-bit microcontroller',
EM_ARM='ARM',
EM_AARCH64='AArch64',
EM_BLACKFIN='Analog Devices Blackfin',
EM_PPC='PowerPC',
RESERVED='RESERVED',
)
_DESCR_P_TYPE = dict(
PT_NULL='NULL',
PT_LOAD='LOAD',
PT_DYNAMIC='DYNAMIC',
PT_INTERP='INTERP',
PT_NOTE='NOTE',
PT_SHLIB='SHLIB',
PT_PHDR='PHDR',
PT_GNU_EH_FRAME='GNU_EH_FRAME',
PT_GNU_STACK='GNU_STACK',
PT_GNU_RELRO='GNU_RELRO',
PT_ARM_ARCHEXT='ARM_ARCHEXT',
PT_ARM_EXIDX='EXIDX', # binutils calls this EXIDX, not ARM_EXIDX
PT_AARCH64_ARCHEXT='AARCH64_ARCHEXT',
PT_AARCH64_UNWIND='AARCH64_UNWIND',
PT_TLS='TLS',
PT_MIPS_ABIFLAGS='ABIFLAGS'
)
_DESCR_P_FLAGS = {
P_FLAGS.PF_X: 'E',
P_FLAGS.PF_R: 'R',
P_FLAGS.PF_W: 'W',
}
_DESCR_SH_TYPE = dict(
SHT_NULL='NULL',
SHT_PROGBITS='PROGBITS',
SHT_SYMTAB='SYMTAB',
SHT_STRTAB='STRTAB',
SHT_RELA='RELA',
SHT_HASH='HASH',
SHT_DYNAMIC='DYNAMIC',
SHT_NOTE='NOTE',
SHT_NOBITS='NOBITS',
SHT_REL='REL',
SHT_SHLIB='SHLIB',
SHT_DYNSYM='DYNSYM',
SHT_INIT_ARRAY='INIT_ARRAY',
SHT_FINI_ARRAY='FINI_ARRAY',
SHT_PREINIT_ARRAY='PREINIT_ARRAY',
SHT_GNU_ATTRIBUTES='GNU_ATTRIBUTES',
SHT_GNU_HASH='GNU_HASH',
SHT_GROUP='GROUP',
SHT_SYMTAB_SHNDX='SYMTAB SECTION INDICIES',
SHT_GNU_verdef='VERDEF',
SHT_GNU_verneed='VERNEED',
SHT_GNU_versym='VERSYM',
SHT_GNU_LIBLIST='GNU_LIBLIST',
SHT_ARM_EXIDX='ARM_EXIDX',
SHT_ARM_PREEMPTMAP='ARM_PREEMPTMAP',
SHT_ARM_ATTRIBUTES='ARM_ATTRIBUTES',
SHT_ARM_DEBUGOVERLAY='ARM_DEBUGOVERLAY',
SHT_MIPS_LIBLIST='MIPS_LIBLIST',
SHT_MIPS_DEBUG='MIPS_DEBUG',
SHT_MIPS_REGINFO='MIPS_REGINFO',
SHT_MIPS_PACKAGE='MIPS_PACKAGE',
SHT_MIPS_PACKSYM='MIPS_PACKSYM',
SHT_MIPS_RELD='MIPS_RELD',
SHT_MIPS_IFACE='MIPS_IFACE',
SHT_MIPS_CONTENT='MIPS_CONTENT',
SHT_MIPS_OPTIONS='MIPS_OPTIONS',
SHT_MIPS_SHDR='MIPS_SHDR',
SHT_MIPS_FDESC='MIPS_FDESC',
SHT_MIPS_EXTSYM='MIPS_EXTSYM',
SHT_MIPS_DENSE='MIPS_DENSE',
SHT_MIPS_PDESC='MIPS_PDESC',
SHT_MIPS_LOCSYM='MIPS_LOCSYM',
SHT_MIPS_AUXSYM='MIPS_AUXSYM',
SHT_MIPS_OPTSYM='MIPS_OPTSYM',
SHT_MIPS_LOCSTR='MIPS_LOCSTR',
SHT_MIPS_LINE='MIPS_LINE',
SHT_MIPS_RFDESC='MIPS_RFDESC',
SHT_MIPS_DELTASYM='MIPS_DELTASYM',
SHT_MIPS_DELTAINST='MIPS_DELTAINST',
SHT_MIPS_DELTACLASS='MIPS_DELTACLASS',
SHT_MIPS_DWARF='MIPS_DWARF',
SHT_MIPS_DELTADECL='MIPS_DELTADECL',
SHT_MIPS_SYMBOL_LIB='MIPS_SYMBOL_LIB',
SHT_MIPS_EVENTS='MIPS_EVENTS',
SHT_MIPS_TRANSLATE='MIPS_TRANSLATE',
SHT_MIPS_PIXIE='MIPS_PIXIE',
SHT_MIPS_XLATE='MIPS_XLATE',
SHT_MIPS_XLATE_DEBUG='MIPS_XLATE_DEBUG',
SHT_MIPS_WHIRL='MIPS_WHIRL',
SHT_MIPS_EH_REGION='MIPS_EH_REGION',
SHT_MIPS_XLATE_OLD='MIPS_XLATE_OLD',
SHT_MIPS_PDR_EXCEPTION='MIPS_PDR_EXCEPTION',
)
_DESCR_SH_FLAGS = {
SH_FLAGS.SHF_WRITE: 'W',
SH_FLAGS.SHF_ALLOC: 'A',
SH_FLAGS.SHF_EXECINSTR: 'X',
SH_FLAGS.SHF_MERGE: 'M',
SH_FLAGS.SHF_STRINGS: 'S',
SH_FLAGS.SHF_INFO_LINK: 'I',
SH_FLAGS.SHF_LINK_ORDER: 'L',
SH_FLAGS.SHF_OS_NONCONFORMING: 'O',
SH_FLAGS.SHF_GROUP: 'G',
SH_FLAGS.SHF_TLS: 'T',
SH_FLAGS.SHF_EXCLUDE: 'E',
}
_DESCR_ST_INFO_TYPE = dict(
STT_NOTYPE='NOTYPE',
STT_OBJECT='OBJECT',
STT_FUNC='FUNC',
STT_SECTION='SECTION',
STT_FILE='FILE',
STT_COMMON='COMMON',
STT_TLS='TLS',
STT_NUM='NUM',
STT_RELC='RELC',
STT_SRELC='SRELC',
)
_DESCR_ST_INFO_BIND = dict(
STB_LOCAL='LOCAL',
STB_GLOBAL='GLOBAL',
STB_WEAK='WEAK',
)
_DESCR_ST_VISIBILITY = dict(
STV_DEFAULT='DEFAULT',
STV_INTERNAL='INTERNAL',
STV_HIDDEN='HIDDEN',
STV_PROTECTED='PROTECTED',
STV_EXPORTED='EXPORTED',
STV_SINGLETON='SINGLETON',
STV_ELIMINATE='ELIMINATE',
)
_DESCR_ST_SHNDX = dict(
SHN_UNDEF='UND',
SHN_ABS='ABS',
SHN_COMMON='COM',
)
_DESCR_SYMINFO_FLAGS = {
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_DIRECT: 'D',
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_DIRECTBIND: 'B',
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_COPY: 'C',
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_LAZYLOAD: 'L',
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_NOEXTDIRECT: 'N',
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_AUXILIARY: 'A',
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_FILTER: 'F',
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_INTERPOSE: 'I',
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_CAP: 'S',
SUNW_SYMINFO_FLAGS.SYMINFO_FLG_DEFERRED: 'P',
}
_DESCR_SYMINFO_BOUNDTO = dict(
SYMINFO_BT_SELF='',
SYMINFO_BT_PARENT='',
SYMINFO_BT_NONE='',
SYMINFO_BT_EXTERN='',
)
_DESCR_VER_FLAGS = {
0: '',
VER_FLAGS.VER_FLG_BASE: 'BASE',
VER_FLAGS.VER_FLG_WEAK: 'WEAK',
VER_FLAGS.VER_FLG_INFO: 'INFO',
}
# PT_NOTE section types
_DESCR_NOTE_N_TYPE = dict(
NT_GNU_ABI_TAG='ABI version tag',
NT_GNU_HWCAP='DSO-supplied software HWCAP info',
NT_GNU_BUILD_ID='unique build ID bitstring',
NT_GNU_GOLD_VERSION='gold version',
)
# Values in GNU .note.ABI-tag notes (n_type=='NT_GNU_ABI_TAG')
_DESCR_NOTE_ABI_TAG_OS = dict(
ELF_NOTE_OS_LINUX='Linux',
ELF_NOTE_OS_GNU='GNU',
ELF_NOTE_OS_SOLARIS2='Solaris 2',
ELF_NOTE_OS_FREEBSD='FreeBSD',
ELF_NOTE_OS_NETBSD='NetBSD',
ELF_NOTE_OS_SYLLABLE='Syllable',
)
def _reverse_dict(d, low_priority=()):
"""
This is a tiny helper function to "reverse" the keys/values of a dictionary
provided in the first argument, i.e. {k: v} becomes {v: k}.
The second argument (optional) provides primitive control over what to do in
the case of conflicting values - if a value is present in this list, it will
not override any other entries of the same value.
"""
out = {}
for k, v in iteritems(d):
if v in out and k in low_priority:
continue
out[v] = k
return out
_DESCR_RELOC_TYPE_i386 = _reverse_dict(ENUM_RELOC_TYPE_i386)
_DESCR_RELOC_TYPE_x64 = _reverse_dict(ENUM_RELOC_TYPE_x64)
_DESCR_RELOC_TYPE_ARM = _reverse_dict(ENUM_RELOC_TYPE_ARM)
_DESCR_RELOC_TYPE_AARCH64 = _reverse_dict(ENUM_RELOC_TYPE_AARCH64)
_DESCR_RELOC_TYPE_MIPS = _reverse_dict(ENUM_RELOC_TYPE_MIPS)
_low_priority_D_TAG = (
# these are 'meta-tags' marking semantics of numeric ranges of the enum
# they should not override other tags with the same numbers
# see https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-42444.html
'DT_LOOS',
'DT_HIOS',
'DT_LOPROC',
'DT_HIPROC',
'DT_ENCODING',
)
_DESCR_D_TAG = _reverse_dict(ENUM_D_TAG, low_priority=_low_priority_D_TAG)
_DESCR_ATTR_TAG_ARM = dict(
TAG_FILE='File Attributes',
TAG_SECTION='Section Attributes:',
TAG_SYMBOL='Symbol Attributes:',
TAG_CPU_RAW_NAME='Tag_CPU_raw_name: ',
TAG_CPU_NAME='Tag_CPU_name: ',
TAG_CPU_ARCH='Tag_CPU_arch: ',
TAG_CPU_ARCH_PROFILE='Tag_CPU_arch_profile: ',
TAG_ARM_ISA_USE='Tag_ARM_ISA_use: ',
TAG_THUMB_ISA_USE='Tag_Thumb_ISA_use: ',
TAG_FP_ARCH='Tag_FP_arch: ',
TAG_WMMX_ARCH='Tag_WMMX_arch: ',
TAG_ADVANCED_SIMD_ARCH='Tag_Advanced_SIMD_arch: ',
TAG_PCS_CONFIG='Tag_PCS_config: ',
TAG_ABI_PCS_R9_USE='Tag_ABI_PCS_R9_use: ',
TAG_ABI_PCS_RW_DATA='Tag_ABI_PCS_RW_use: ',
TAG_ABI_PCS_RO_DATA='Tag_ABI_PCS_RO_use: ',
TAG_ABI_PCS_GOT_USE='Tag_ABI_PCS_GOT_use: ',
TAG_ABI_PCS_WCHAR_T='Tag_ABI_PCS_wchar_t: ',
TAG_ABI_FP_ROUNDING='Tag_ABI_FP_rounding: ',
TAG_ABI_FP_DENORMAL='Tag_ABI_FP_denormal: ',
TAG_ABI_FP_EXCEPTIONS='Tag_ABI_FP_exceptions: ',
TAG_ABI_FP_USER_EXCEPTIONS='Tag_ABI_FP_user_exceptions: ',
TAG_ABI_FP_NUMBER_MODEL='Tag_ABI_FP_number_model: ',
TAG_ABI_ALIGN_NEEDED='Tag_ABI_align_needed: ',
TAG_ABI_ALIGN_PRESERVED='Tag_ABI_align_preserved: ',
TAG_ABI_ENUM_SIZE='Tag_ABI_enum_size: ',
TAG_ABI_HARDFP_USE='Tag_ABI_HardFP_use: ',
TAG_ABI_VFP_ARGS='Tag_ABI_VFP_args: ',
TAG_ABI_WMMX_ARGS='Tag_ABI_WMMX_args: ',
TAG_ABI_OPTIMIZATION_GOALS='Tag_ABI_optimization_goals: ',
TAG_ABI_FP_OPTIMIZATION_GOALS='Tag_ABI_FP_optimization_goals: ',
TAG_COMPATIBILITY='Tag_compatibility: ',
TAG_CPU_UNALIGNED_ACCESS='Tag_CPU_unaligned_access: ',
TAG_FP_HP_EXTENSION='Tag_FP_HP_extension: ',
TAG_ABI_FP_16BIT_FORMAT='Tag_ABI_FP_16bit_format: ',
TAG_MPEXTENSION_USE='Tag_MPextension_use: ',
TAG_DIV_USE='Tag_DIV_use: ',
TAG_NODEFAULTS='Tag_nodefaults: ',
TAG_ALSO_COMPATIBLE_WITH='Tag_also_compatible_with: ',
TAG_T2EE_USE='Tag_T2EE_use: ',
TAG_CONFORMANCE='Tag_conformance: ',
TAG_VIRTUALIZATION_USE='Tag_Virtualization_use: ',
TAG_MPEXTENSION_USE_OLD='Tag_MPextension_use_old: ',
)
_DESCR_ATTR_VAL_ARM = [
None, #1
None, #2
None, #3
None, #4
None, #5
{ #6 TAG_CPU_ARCH
0 : 'Pre-v4',
1 : 'v4',
2 : 'v4T',
3 : 'v5T',
4 : 'v5TE',
5 : 'v5TEJ',
6 : 'v6',
7 : 'v6KZ',
8 : 'v6T2',
9 : 'v6K',
10: 'v7',
11: 'v6-M',
12: 'v6S-M',
13: 'v7E-M',
14: 'v8',
15: 'v8-R',
16: 'v8-M.baseline',
17: 'v8-M.mainline',
},
{ #7 TAG_CPU_ARCH_PROFILE
0x00: 'None',
0x41: 'Application',
0x52: 'Realtime',
0x4D: 'Microcontroller',
0x53: 'Application or Realtime',
},
{ #8 TAG_ARM_ISA
0: 'No',
1: 'Yes',
},
{ #9 TAG_THUMB_ISA
0: 'No',
1: 'Thumb-1',
2: 'Thumb-2',
3: 'Yes',
},
{ #10 TAG_FP_ARCH
0: 'No',
1: 'VFPv1',
2: 'VFPv2 ',
3: 'VFPv3',
4: 'VFPv3-D16',
5: 'VFPv4',
6: 'VFPv4-D16',
7: 'FP ARM v8',
8: 'FPv5/FP-D16 for ARMv8',
},
{ #11 TAG_WMMX_ARCH
0: 'No',
1: 'WMMXv1',
2: 'WMMXv2',
},
{ #12 TAG_ADVANCED_SIMD_ARCH
0: 'No',
1: 'NEONv1',
2: 'NEONv1 with Fused-MAC',
3: 'NEON for ARMv8',
4: 'NEON for ARMv8.1',
},
{ #13 TAG_PCS_CONFIG
0: 'None',
1: 'Bare platform',
2: 'Linux application',
3: 'Linux DSO',
4: 'PalmOS 2004',
5: 'PalmOS (reserved)',
6: 'SymbianOS 2004',
7: 'SymbianOS (reserved)',
},
{ #14 TAG_ABI_PCS_R9_USE
0: 'v6',
1: 'SB',
2: 'TLS',
3: 'Unused',
},
{ #15 TAG_ABI_PCS_RW_DATA
0: 'Absolute',
1: 'PC-relative',
2: 'SB-relative',
3: 'None',
},
{ #16 TAG_ABI_PCS_RO_DATA
0: 'Absolute',
1: 'PC-relative',
2: 'None',
},
{ #17 TAG_ABI_PCS_GOT_USE
0: 'None',
1: 'direct',
2: 'GOT-indirect',
},
{ #18 TAG_ABI_PCS_WCHAR_T
0: 'None',
1: '??? 1',
2: '2',
3: '??? 3',
4: '4',
},
{ #19 TAG_ABI_FP_ROUNDING
0: 'Unused',
1: 'Needed',
},
{ #20 TAG_ABI_FP_DENORMAL
0: 'Unused',
1: 'Needed',
2: 'Sign only',
},
{ #21 TAG_ABI_FP_EXCEPTIONS
0: 'Unused',
1: 'Needed',
},
{ #22 TAG_ABI_FP_USER_EXCEPTIONS
0: 'Unused',
1: 'Needed',
},
{ #23 TAG_ABI_FP_NUMBER_MODEL
0: 'Unused',
1: 'Finite',
2: 'RTABI',
3: 'IEEE 754',
},
{ #24 TAG_ABI_ALIGN_NEEDED
0: 'None',
1: '8-byte',
2: '4-byte',
3: '??? 3',
},
{ #25 TAG_ABI_ALIGN_PRESERVED
0: 'None',
1: '8-byte, except leaf SP',
2: '8-byte',
3: '??? 3',
},
{ #26 TAG_ABI_ENUM_SIZE
0: 'Unused',
1: 'small',
2: 'int',
3: 'forced to int',
},
{ #27 TAG_ABI_HARDFP_USE
0: 'As Tag_FP_arch',
1: 'SP only',
2: 'Reserved',
3: 'Deprecated',
},
{ #28 TAG_ABI_VFP_ARGS
0: 'AAPCS',
1: 'VFP registers',
2: 'custom',
3: 'compatible',
},
{ #29 TAG_ABI_WMMX_ARGS
0: 'AAPCS',
1: 'WMMX registers',
2: 'custom',
},
{ #30 TAG_ABI_OPTIMIZATION_GOALS
0: 'None',
1: 'Prefer Speed',
2: 'Aggressive Speed',
3: 'Prefer Size',
4: 'Aggressive Size',
5: 'Prefer Debug',
6: 'Aggressive Debug',
},
{ #31 TAG_ABI_FP_OPTIMIZATION_GOALS
0: 'None',
1: 'Prefer Speed',
2: 'Aggressive Speed',
3: 'Prefer Size',
4: 'Aggressive Size',
5: 'Prefer Accuracy',
6: 'Aggressive Accuracy',
},
{ #32 TAG_COMPATIBILITY
0: 'No',
1: 'Yes',
},
None, #33
{ #34 TAG_CPU_UNALIGNED_ACCESS
0: 'None',
1: 'v6',
},
None, #35
{ #36 TAG_FP_HP_EXTENSION
0: 'Not Allowed',
1: 'Allowed',
},
None, #37
{ #38 TAG_ABI_FP_16BIT_FORMAT
0: 'None',
1: 'IEEE 754',
2: 'Alternative Format',
},
None, #39
None, #40
None, #41
{ #42 TAG_MPEXTENSION_USE
0: 'Not Allowed',
1: 'Allowed',
},
None, #43
{ #44 TAG_DIV_USE
0: 'Allowed in Thumb-ISA, v7-R or v7-M',
1: 'Not allowed',
2: 'Allowed in v7-A with integer division extension',
},
None, #45
None, #46
None, #47
None, #48
None, #49
None, #50
None, #51
None, #52
None, #53
None, #54
None, #55
None, #56
None, #57
None, #58
None, #59
None, #60
None, #61
None, #62
None, #63
None, #64
None, #65
{ #66 TAG_FP_HP_EXTENSION
0: 'Not Allowed',
1: 'Allowed',
},
None, #67
{ #68 TAG_VIRTUALIZATION_USE
0: 'Not Allowed',
1: 'TrustZone',
2: 'Virtualization Extensions',
3: 'TrustZone and Virtualization Extensions',
},
None, #69
{ #70 TAG_MPEXTENSION_USE_OLD
0: 'Not Allowed',
1: 'Allowed',
},
]
pyelftools-0.26/elftools/elf/dynamic.py 0000664 0000000 0000000 00000031256 13572204573 0020253 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/dynamic.py
#
# ELF Dynamic Tags
#
# Mike Frysinger (vapier@gentoo.org)
# This code is in the public domain
#-------------------------------------------------------------------------------
import itertools
from collections import defaultdict
from .hash import HashSection, GNUHashSection
from .sections import Section, Symbol
from .enums import ENUM_D_TAG
from .segments import Segment
from .relocation import RelocationTable
from ..common.exceptions import ELFError
from ..common.utils import elf_assert, struct_parse, parse_cstring_from_stream
class _DynamicStringTable(object):
""" Bare string table based on values found via ELF dynamic tags and
loadable segments only. Good enough for get_string() only.
"""
def __init__(self, stream, table_offset):
self._stream = stream
self._table_offset = table_offset
def get_string(self, offset):
""" Get the string stored at the given offset in this string table.
"""
s = parse_cstring_from_stream(self._stream, self._table_offset + offset)
return s.decode('utf-8') if s else ''
class DynamicTag(object):
""" Dynamic Tag object - representing a single dynamic tag entry from a
dynamic section.
Allows dictionary-like access to the dynamic structure. For special
tags (those listed in the _HANDLED_TAGS set below), creates additional
attributes for convenience. For example, .soname will contain the actual
value of DT_SONAME (fetched from the dynamic symbol table).
"""
_HANDLED_TAGS = frozenset(
['DT_NEEDED', 'DT_RPATH', 'DT_RUNPATH', 'DT_SONAME',
'DT_SUNW_FILTER'])
def __init__(self, entry, stringtable):
if stringtable is None:
raise ELFError('Creating DynamicTag without string table')
self.entry = entry
if entry.d_tag in self._HANDLED_TAGS:
setattr(self, entry.d_tag[3:].lower(),
stringtable.get_string(self.entry.d_val))
def __getitem__(self, name):
""" Implement dict-like access to entries
"""
return self.entry[name]
def __repr__(self):
return '' % (self.entry.d_tag, self.entry)
def __str__(self):
if self.entry.d_tag in self._HANDLED_TAGS:
s = '"%s"' % getattr(self, self.entry.d_tag[3:].lower())
else:
s = '%#x' % self.entry.d_ptr
return '' % (self.entry.d_tag, s)
class Dynamic(object):
""" Shared functionality between dynamic sections and segments.
"""
def __init__(self, stream, elffile, stringtable, position):
self.elffile = elffile
self.elfstructs = elffile.structs
self._stream = stream
self._num_tags = -1
self._offset = position
self._tagsize = self.elfstructs.Elf_Dyn.sizeof()
# Do not access this directly yourself; use _get_stringtable() instead.
self._stringtable = stringtable
def get_table_offset(self, tag_name):
""" Return the virtual address and file offset of a dynamic table.
"""
ptr = None
for tag in self._iter_tags(type=tag_name):
ptr = tag['d_ptr']
break
# If we found a virtual address, locate the offset in the file
# by using the program headers.
offset = None
if ptr:
offset = next(self.elffile.address_offsets(ptr), None)
return ptr, offset
def _get_stringtable(self):
""" Return a string table for looking up dynamic tag related strings.
This won't be a "full" string table object, but will at least
support the get_string() function.
"""
if self._stringtable:
return self._stringtable
# If the ELF has stripped its section table (which is unusual, but
# perfectly valid), we need to use the dynamic tags to locate the
# dynamic string table.
_, table_offset = self.get_table_offset('DT_STRTAB')
if table_offset is not None:
self._stringtable = _DynamicStringTable(self._stream, table_offset)
return self._stringtable
# That didn't work for some reason. Let's use the section header
# even though this ELF is super weird.
self._stringtable = self.elffile.get_section_by_name('.dynstr')
return self._stringtable
def _iter_tags(self, type=None):
""" Yield all raw tags (limit to |type| if specified)
"""
for n in itertools.count():
tag = self._get_tag(n)
if type is None or tag['d_tag'] == type:
yield tag
if tag['d_tag'] == 'DT_NULL':
break
def iter_tags(self, type=None):
""" Yield all tags (limit to |type| if specified)
"""
for tag in self._iter_tags(type=type):
yield DynamicTag(tag, self._get_stringtable())
def _get_tag(self, n):
""" Get the raw tag at index #n from the file
"""
offset = self._offset + n * self._tagsize
return struct_parse(
self.elfstructs.Elf_Dyn,
self._stream,
stream_pos=offset)
def get_tag(self, n):
""" Get the tag at index #n from the file (DynamicTag object)
"""
return DynamicTag(self._get_tag(n), self._get_stringtable())
def num_tags(self):
""" Number of dynamic tags in the file
"""
if self._num_tags != -1:
return self._num_tags
for n in itertools.count():
tag = self.get_tag(n)
if tag.entry.d_tag == 'DT_NULL':
self._num_tags = n + 1
return self._num_tags
def get_relocation_tables(self):
""" Load all available relocation tables from DYNAMIC tags.
Returns a dictionary mapping found table types (REL, RELA,
JMPREL) to RelocationTable objects.
"""
result = {}
if list(self.iter_tags('DT_REL')):
result['REL'] = RelocationTable(self.elffile,
self.get_table_offset('DT_REL')[1],
next(self.iter_tags('DT_RELSZ'))['d_val'], False)
relentsz = next(self.iter_tags('DT_RELENT'))['d_val']
elf_assert(result['REL'].entry_size == relentsz,
'Expected DT_RELENT to be %s' % relentsz)
if list(self.iter_tags('DT_RELA')):
result['RELA'] = RelocationTable(self.elffile,
self.get_table_offset('DT_RELA')[1],
next(self.iter_tags('DT_RELASZ'))['d_val'], True)
relentsz = next(self.iter_tags('DT_RELAENT'))['d_val']
elf_assert(result['RELA'].entry_size == relentsz,
'Expected DT_RELAENT to be %s' % relentsz)
if list(self.iter_tags('DT_JMPREL')):
result['JMPREL'] = RelocationTable(self.elffile,
self.get_table_offset('DT_JMPREL')[1],
next(self.iter_tags('DT_PLTRELSZ'))['d_val'],
next(self.iter_tags('DT_PLTREL'))['d_val'] == ENUM_D_TAG['DT_RELA'])
return result
class DynamicSection(Section, Dynamic):
""" ELF dynamic table section. Knows how to process the list of tags.
"""
def __init__(self, header, name, elffile):
Section.__init__(self, header, name, elffile)
stringtable = elffile.get_section(header['sh_link'])
Dynamic.__init__(self, self.stream, self.elffile, stringtable,
self['sh_offset'])
class DynamicSegment(Segment, Dynamic):
""" ELF dynamic table segment. Knows how to process the list of tags.
"""
def __init__(self, header, stream, elffile):
# The string table section to be used to resolve string names in
# the dynamic tag array is the one pointed at by the sh_link field
# of the dynamic section header.
# So we must look for the dynamic section contained in the dynamic
# segment, we do so by searching for the dynamic section whose content
# is located at the same offset as the dynamic segment
stringtable = None
for section in elffile.iter_sections():
if (isinstance(section, DynamicSection) and
section['sh_offset'] == header['p_offset']):
stringtable = elffile.get_section(section['sh_link'])
break
Segment.__init__(self, header, stream)
Dynamic.__init__(self, stream, elffile, stringtable, self['p_offset'])
self._symbol_list = None
self._symbol_name_map = None
def num_symbols(self):
""" Number of symbols in the table recovered from DT_SYMTAB
"""
if self._symbol_list is None:
self._symbol_list = list(self.iter_symbols())
return len(self._symbol_list)
def get_symbol(self, index):
""" Get the symbol at index #index from the table (Symbol object)
"""
if self._symbol_list is None:
self._symbol_list = list(self.iter_symbols())
return self._symbol_list[index]
def get_symbol_by_name(self, name):
""" Get a symbol(s) by name. Return None if no symbol by the given name
exists.
"""
# The first time this method is called, construct a name to number
# mapping
#
if self._symbol_name_map is None:
self._symbol_name_map = defaultdict(list)
for i, sym in enumerate(self.iter_symbols()):
self._symbol_name_map[sym.name].append(i)
symnums = self._symbol_name_map.get(name)
return [self.get_symbol(i) for i in symnums] if symnums else None
def iter_symbols(self):
""" Yield all symbols in this dynamic segment. The symbols are usually
the same as returned by SymbolTableSection.iter_symbols. However,
in stripped binaries, SymbolTableSection might have been removed.
This method reads from the mandatory dynamic tag DT_SYMTAB.
"""
tab_ptr, tab_offset = self.get_table_offset('DT_SYMTAB')
if tab_ptr is None or tab_offset is None:
raise ELFError('Segment does not contain DT_SYMTAB.')
symbol_size = self.elfstructs.Elf_Sym.sizeof()
end_ptr = None
# Check if a DT_GNU_HASH tag exists and recover the number of symbols
# from the corresponding section
_, gnu_hash_offset = self.get_table_offset('DT_GNU_HASH')
if gnu_hash_offset is not None:
hash_section = GNUHashSection(self.stream, gnu_hash_offset,
self.elffile)
end_ptr = tab_ptr + \
hash_section.get_number_of_symbols() * symbol_size
# If DT_GNU_HASH did not exist, maybe we can use DT_HASH
if end_ptr is None:
_, hash_offset = self.get_table_offset('DT_HASH')
if hash_offset is not None:
hash_section = HashSection(self.stream, hash_offset,
self.elffile)
end_ptr = tab_ptr + \
hash_section.get_number_of_symbols() * symbol_size
if end_ptr is None:
# Find closest higher pointer than tab_ptr. We'll use that to mark
# the end of the symbol table.
nearest_ptr = None
for tag in self.iter_tags():
tag_ptr = tag['d_ptr']
if tag['d_tag'] == 'DT_SYMENT':
if symbol_size != tag['d_val']:
# DT_SYMENT is the size of one symbol entry. It must be
# the same as returned by Elf_Sym.sizeof.
raise ELFError('DT_SYMENT (%d) != Elf_Sym (%d).' %
(tag['d_val'], symbol_size))
if (tag_ptr > tab_ptr and
(nearest_ptr is None or nearest_ptr > tag_ptr)):
nearest_ptr = tag_ptr
if nearest_ptr is None:
# Use the end of segment that contains DT_SYMTAB.
for segment in self.elffile.iter_segments():
if (segment['p_vaddr'] <= tab_ptr and
tab_ptr <= (segment['p_vaddr'] + segment['p_filesz'])):
nearest_ptr = segment['p_vaddr'] + segment['p_filesz']
end_ptr = nearest_ptr
if end_ptr is None:
raise ELFError('Cannot determine the end of DT_SYMTAB.')
string_table = self._get_stringtable()
for i in range((end_ptr - tab_ptr) // symbol_size):
symbol = struct_parse(self.elfstructs.Elf_Sym, self._stream,
i * symbol_size + tab_offset)
symbol_name = string_table.get_string(symbol['st_name'])
yield Symbol(symbol, symbol_name)
pyelftools-0.26/elftools/elf/elffile.py 0000664 0000000 0000000 00000065715 13572204573 0020244 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/elffile.py
#
# ELFFile - main class for accessing ELF files
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
import io
import struct
import zlib
try:
import resource
PAGESIZE = resource.getpagesize()
except ImportError:
# Windows system
import mmap
PAGESIZE = mmap.PAGESIZE
from ..common.py3compat import BytesIO
from ..common.exceptions import ELFError
from ..common.utils import struct_parse, elf_assert
from .structs import ELFStructs
from .sections import (
Section, StringTableSection, SymbolTableSection,
SUNWSyminfoTableSection, NullSection, NoteSection,
StabSection, ARMAttributesSection)
from .dynamic import DynamicSection, DynamicSegment
from .relocation import RelocationSection, RelocationHandler
from .gnuversions import (
GNUVerNeedSection, GNUVerDefSection,
GNUVerSymSection)
from .segments import Segment, InterpSegment, NoteSegment
from ..dwarf.dwarfinfo import DWARFInfo, DebugSectionDescriptor, DwarfConfig
class ELFFile(object):
""" Creation: the constructor accepts a stream (file-like object) with the
contents of an ELF file.
Accessible attributes:
stream:
The stream holding the data of the file - must be a binary
stream (bytes, not string).
elfclass:
32 or 64 - specifies the word size of the target machine
little_endian:
boolean - specifies the target machine's endianness
elftype:
string or int, either known value of E_TYPE enum defining ELF
type (e.g. executable, dynamic library or core dump) or integral
unparsed value
header:
the complete ELF file header
e_ident_raw:
the raw e_ident field of the header
"""
def __init__(self, stream):
self.stream = stream
self._identify_file()
self.structs = ELFStructs(
little_endian=self.little_endian,
elfclass=self.elfclass)
self.structs.create_basic_structs()
self.header = self._parse_elf_header()
self.structs.create_advanced_structs(
self['e_type'],
self['e_machine'],
self['e_ident']['EI_OSABI'])
self.stream.seek(0)
self.e_ident_raw = self.stream.read(16)
self._file_stringtable_section = self._get_file_stringtable()
self._section_name_map = None
def num_sections(self):
""" Number of sections in the file
"""
return self['e_shnum']
def get_section(self, n):
""" Get the section at index #n from the file (Section object or a
subclass)
"""
section_header = self._get_section_header(n)
return self._make_section(section_header)
def get_section_by_name(self, name):
""" Get a section from the file, by name. Return None if no such
section exists.
"""
# The first time this method is called, construct a name to number
# mapping
#
if self._section_name_map is None:
self._section_name_map = {}
for i, sec in enumerate(self.iter_sections()):
self._section_name_map[sec.name] = i
secnum = self._section_name_map.get(name, None)
return None if secnum is None else self.get_section(secnum)
def iter_sections(self):
""" Yield all the sections in the file
"""
for i in range(self.num_sections()):
yield self.get_section(i)
def num_segments(self):
""" Number of segments in the file
"""
return self['e_phnum']
def get_segment(self, n):
""" Get the segment at index #n from the file (Segment object)
"""
segment_header = self._get_segment_header(n)
return self._make_segment(segment_header)
def iter_segments(self):
""" Yield all the segments in the file
"""
for i in range(self.num_segments()):
yield self.get_segment(i)
def address_offsets(self, start, size=1):
""" Yield a file offset for each ELF segment containing a memory region.
A memory region is defined by the range [start...start+size). The
offset of the region is yielded.
"""
end = start + size
for seg in self.iter_segments():
# consider LOAD only to prevent same address being yielded twice
if seg['p_type'] != 'PT_LOAD':
continue
if (start >= seg['p_vaddr'] and
end <= seg['p_vaddr'] + seg['p_filesz']):
yield start - seg['p_vaddr'] + seg['p_offset']
def has_dwarf_info(self):
""" Check whether this file appears to have debugging information.
We assume that if it has the .debug_info or .zdebug_info section, it
has all the other required sections as well.
"""
return (self.get_section_by_name('.debug_info') or
self.get_section_by_name('.zdebug_info') or
self.get_section_by_name('.eh_frame'))
def get_dwarf_info(self, relocate_dwarf_sections=True):
""" Return a DWARFInfo object representing the debugging information in
this file.
If relocate_dwarf_sections is True, relocations for DWARF sections
are looked up and applied.
"""
# Expect that has_dwarf_info was called, so at least .debug_info is
# present.
# Sections that aren't found will be passed as None to DWARFInfo.
section_names = ('.debug_info', '.debug_aranges', '.debug_abbrev',
'.debug_str', '.debug_line', '.debug_frame',
'.debug_loc', '.debug_ranges', '.debug_pubtypes',
'.debug_pubnames')
compressed = bool(self.get_section_by_name('.zdebug_info'))
if compressed:
section_names = tuple(map(lambda x: '.z' + x[1:], section_names))
# As it is loaded in the process image, .eh_frame cannot be compressed
section_names += ('.eh_frame', )
(debug_info_sec_name, debug_aranges_sec_name, debug_abbrev_sec_name,
debug_str_sec_name, debug_line_sec_name, debug_frame_sec_name,
debug_loc_sec_name, debug_ranges_sec_name, debug_pubtypes_name,
debug_pubnames_name, eh_frame_sec_name) = section_names
debug_sections = {}
for secname in section_names:
section = self.get_section_by_name(secname)
if section is None:
debug_sections[secname] = None
else:
dwarf_section = self._read_dwarf_section(
section,
relocate_dwarf_sections)
if compressed and secname.startswith('.z'):
dwarf_section = self._decompress_dwarf_section(dwarf_section)
debug_sections[secname] = dwarf_section
return DWARFInfo(
config=DwarfConfig(
little_endian=self.little_endian,
default_address_size=self.elfclass // 8,
machine_arch=self.get_machine_arch()),
debug_info_sec=debug_sections[debug_info_sec_name],
debug_aranges_sec=debug_sections[debug_aranges_sec_name],
debug_abbrev_sec=debug_sections[debug_abbrev_sec_name],
debug_frame_sec=debug_sections[debug_frame_sec_name],
eh_frame_sec=debug_sections[eh_frame_sec_name],
debug_str_sec=debug_sections[debug_str_sec_name],
debug_loc_sec=debug_sections[debug_loc_sec_name],
debug_ranges_sec=debug_sections[debug_ranges_sec_name],
debug_line_sec=debug_sections[debug_line_sec_name],
debug_pubtypes_sec = debug_sections[debug_pubtypes_name],
debug_pubnames_sec = debug_sections[debug_pubnames_name]
)
def get_machine_arch(self):
""" Return the machine architecture, as detected from the ELF header.
"""
architectures = {
'EM_M32' : 'AT&T WE 32100',
'EM_SPARC' : 'SPARC',
'EM_386' : 'x86',
'EM_68K' : 'Motorola 68000',
'EM_88K' : 'Motorola 88000',
'EM_IAMCU' : 'Intel MCU',
'EM_860' : 'Intel 80860',
'EM_MIPS' : 'MIPS',
'EM_S370' : 'IBM System/370',
'EM_MIPS_RS3_LE' : 'MIPS RS3000 Little-endian',
'EM_PARISC' : 'Hewlett-Packard PA-RISC',
'EM_VPP500' : 'Fujitsu VPP500',
'EM_SPARC32PLUS' : 'Enhanced SPARC',
'EM_960' : 'Intel 80960',
'EM_PPC' : 'PowerPC',
'EM_PPC64' : '64-bit PowerPC',
'EM_S390' : 'IBM System/390',
'EM_SPU' : 'IBM SPU/SPC',
'EM_V800' : 'NEC V800',
'EM_FR20' : 'Fujitsu FR20',
'EM_RH32' : 'TRW RH-32',
'EM_RCE' : 'Motorola RCE',
'EM_ARM' : 'ARM',
'EM_ALPHA' : 'Digital Alpha',
'EM_SH' : 'Hitachi SH',
'EM_SPARCV9' : 'SPARC Version 9',
'EM_TRICORE' : 'Siemens TriCore embedded processor',
'EM_ARC' : 'Argonaut RISC Core, Argonaut Technologies Inc.',
'EM_H8_300' : 'Hitachi H8/300',
'EM_H8_300H' : 'Hitachi H8/300H',
'EM_H8S' : 'Hitachi H8S',
'EM_H8_500' : 'Hitachi H8/500',
'EM_IA_64' : 'Intel IA-64',
'EM_MIPS_X' : 'MIPS-X',
'EM_COLDFIRE' : 'Motorola ColdFire',
'EM_68HC12' : 'Motorola M68HC12',
'EM_MMA' : 'Fujitsu MMA',
'EM_PCP' : 'Siemens PCP',
'EM_NCPU' : 'Sony nCPU',
'EM_NDR1' : 'Denso NDR1',
'EM_STARCORE' : 'Motorola Star*Core',
'EM_ME16' : 'Toyota ME16',
'EM_ST100' : 'STMicroelectronics ST100',
'EM_TINYJ' : 'Advanced Logic TinyJ',
'EM_X86_64' : 'x64',
'EM_PDSP' : 'Sony DSP',
'EM_PDP10' : 'Digital Equipment PDP-10',
'EM_PDP11' : 'Digital Equipment PDP-11',
'EM_FX66' : 'Siemens FX66',
'EM_ST9PLUS' : 'STMicroelectronics ST9+ 8/16 bit',
'EM_ST7' : 'STMicroelectronics ST7 8-bit',
'EM_68HC16' : 'Motorola MC68HC16',
'EM_68HC11' : 'Motorola MC68HC11',
'EM_68HC08' : 'Motorola MC68HC08',
'EM_68HC05' : 'Motorola MC68HC05',
'EM_SVX' : 'Silicon Graphics SVx',
'EM_ST19' : 'STMicroelectronics ST19 8-bit',
'EM_VAX' : 'Digital VAX',
'EM_CRIS' : 'Axis Communications 32-bit',
'EM_JAVELIN' : 'Infineon Technologies 32-bit',
'EM_FIREPATH' : 'Element 14 64-bit DSP',
'EM_ZSP' : 'LSI Logic 16-bit DSP',
'EM_MMIX' : 'Donald Knuth\'s educational 64-bit',
'EM_HUANY' : 'Harvard University machine-independent object files',
'EM_PRISM' : 'SiTera Prism',
'EM_AVR' : 'Atmel AVR 8-bit',
'EM_FR30' : 'Fujitsu FR30',
'EM_D10V' : 'Mitsubishi D10V',
'EM_D30V' : 'Mitsubishi D30V',
'EM_V850' : 'NEC v850',
'EM_M32R' : 'Mitsubishi M32R',
'EM_MN10300' : 'Matsushita MN10300',
'EM_MN10200' : 'Matsushita MN10200',
'EM_PJ' : 'picoJava',
'EM_OPENRISC' : 'OpenRISC 32-bit',
'EM_ARC_COMPACT' : 'ARC International ARCompact',
'EM_XTENSA' : 'Tensilica Xtensa',
'EM_VIDEOCORE' : 'Alphamosaic VideoCore',
'EM_TMM_GPP' : 'Thompson Multimedia',
'EM_NS32K' : 'National Semiconductor 32000 series',
'EM_TPC' : 'Tenor Network TPC',
'EM_SNP1K' : 'Trebia SNP 1000',
'EM_ST200' : 'STMicroelectronics ST200',
'EM_IP2K' : 'Ubicom IP2xxx',
'EM_MAX' : 'MAX',
'EM_CR' : 'National Semiconductor CompactRISC',
'EM_F2MC16' : 'Fujitsu F2MC16',
'EM_MSP430' : 'Texas Instruments msp430',
'EM_BLACKFIN' : 'Analog Devices Blackfin',
'EM_SE_C33' : 'Seiko Epson S1C33',
'EM_SEP' : 'Sharp',
'EM_ARCA' : 'Arca RISC',
'EM_UNICORE' : 'PKU-Unity MPRC',
'EM_EXCESS' : 'eXcess',
'EM_DXP' : 'Icera Semiconductor Deep Execution Processor',
'EM_ALTERA_NIOS2' : 'Altera Nios II',
'EM_CRX' : 'National Semiconductor CompactRISC CRX',
'EM_XGATE' : 'Motorola XGATE',
'EM_C166' : 'Infineon C16x/XC16x',
'EM_M16C' : 'Renesas M16C',
'EM_DSPIC30F' : 'Microchip Technology dsPIC30F',
'EM_CE' : 'Freescale Communication Engine RISC core',
'EM_M32C' : 'Renesas M32C',
'EM_TSK3000' : 'Altium TSK3000',
'EM_RS08' : 'Freescale RS08',
'EM_SHARC' : 'Analog Devices SHARC',
'EM_ECOG2' : 'Cyan Technology eCOG2',
'EM_SCORE7' : 'Sunplus S+core7 RISC',
'EM_DSP24' : 'New Japan Radio (NJR) 24-bit DSP',
'EM_VIDEOCORE3' : 'Broadcom VideoCore III',
'EM_LATTICEMICO32' : 'Lattice FPGA RISC',
'EM_SE_C17' : 'Seiko Epson C17',
'EM_TI_C6000' : 'TI TMS320C6000',
'EM_TI_C2000' : 'TI TMS320C2000',
'EM_TI_C5500' : 'TI TMS320C55x',
'EM_TI_ARP32' : 'TI Application Specific RISC, 32bit',
'EM_TI_PRU' : 'TI Programmable Realtime Unit',
'EM_MMDSP_PLUS' : 'STMicroelectronics 64bit VLIW',
'EM_CYPRESS_M8C' : 'Cypress M8C',
'EM_R32C' : 'Renesas R32C',
'EM_TRIMEDIA' : 'NXP Semiconductors TriMedia',
'EM_QDSP6' : 'QUALCOMM DSP6',
'EM_8051' : 'Intel 8051',
'EM_STXP7X' : 'STMicroelectronics STxP7x',
'EM_NDS32' : 'Andes Technology RISC',
'EM_ECOG1' : 'Cyan Technology eCOG1X',
'EM_ECOG1X' : 'Cyan Technology eCOG1X',
'EM_MAXQ30' : 'Dallas Semiconductor MAXQ30',
'EM_XIMO16' : 'New Japan Radio (NJR) 16-bit',
'EM_MANIK' : 'M2000 Reconfigurable RISC',
'EM_CRAYNV2' : 'Cray Inc. NV2',
'EM_RX' : 'Renesas RX',
'EM_METAG' : 'Imagination Technologies META',
'EM_MCST_ELBRUS' : 'MCST Elbrus',
'EM_ECOG16' : 'Cyan Technology eCOG16',
'EM_CR16' : 'National Semiconductor CompactRISC CR16 16-bit',
'EM_ETPU' : 'Freescale',
'EM_SLE9X' : 'Infineon Technologies SLE9X',
'EM_L10M' : 'Intel L10M',
'EM_K10M' : 'Intel K10M',
'EM_AARCH64' : 'AArch64',
'EM_AVR32' : 'Atmel 32-bit',
'EM_STM8' : 'STMicroeletronics STM8 8-bit',
'EM_TILE64' : 'Tilera TILE64',
'EM_TILEPRO' : 'Tilera TILEPro',
'EM_MICROBLAZE' : 'Xilinx MicroBlaze 32-bit RISC',
'EM_CUDA' : 'NVIDIA CUDA',
'EM_TILEGX' : 'Tilera TILE-Gx',
'EM_CLOUDSHIELD' : 'CloudShield',
'EM_COREA_1ST' : 'KIPO-KAIST Core-A 1st generation',
'EM_COREA_2ND' : 'KIPO-KAIST Core-A 2nd generation',
'EM_ARC_COMPACT2' : 'Synopsys ARCompact V2',
'EM_OPEN8' : 'Open8 8-bit RISC',
'EM_RL78' : 'Renesas RL78',
'EM_VIDEOCORE5' : 'Broadcom VideoCore V',
'EM_78KOR' : 'Renesas 78KOR',
'EM_56800EX' : 'Freescale 56800EX',
'EM_BA1' : 'Beyond BA1',
'EM_BA2' : 'Beyond BA2',
'EM_XCORE' : 'XMOS xCORE',
'EM_MCHP_PIC' : 'Microchip 8-bit PIC',
'EM_INTEL205' : 'Reserved by Intel',
'EM_INTEL206' : 'Reserved by Intel',
'EM_INTEL207' : 'Reserved by Intel',
'EM_INTEL208' : 'Reserved by Intel',
'EM_INTEL209' : 'Reserved by Intel',
'EM_KM32' : 'KM211 KM32 32-bit',
'EM_KMX32' : 'KM211 KMX32 32-bit',
'EM_KMX16' : 'KM211 KMX16 16-bit',
'EM_KMX8' : 'KM211 KMX8 8-bit',
'EM_KVARC' : 'KM211 KVARC',
'EM_CDP' : 'Paneve CDP',
'EM_COGE' : 'Cognitive',
'EM_COOL' : 'Bluechip Systems CoolEngine',
'EM_NORC' : 'Nanoradio Optimized RISC',
'EM_CSR_KALIMBA' : 'CSR Kalimba',
'EM_Z80' : 'Zilog Z80',
'EM_VISIUM' : 'VISIUMcore',
'EM_FT32' : 'FTDI Chip FT32 32-bit RISC',
'EM_MOXIE' : 'Moxie',
'EM_AMDGPU' : 'AMD GPU',
'EM_RISCV' : 'RISC-V'
}
return architectures.get(self['e_machine'], '')
#-------------------------------- PRIVATE --------------------------------#
def __getitem__(self, name):
""" Implement dict-like access to header entries
"""
return self.header[name]
def _identify_file(self):
""" Verify the ELF file and identify its class and endianness.
"""
# Note: this code reads the stream directly, without using ELFStructs,
# since we don't yet know its exact format. ELF was designed to be
# read like this - its e_ident field is word-size and endian agnostic.
self.stream.seek(0)
magic = self.stream.read(4)
elf_assert(magic == b'\x7fELF', 'Magic number does not match')
ei_class = self.stream.read(1)
if ei_class == b'\x01':
self.elfclass = 32
elif ei_class == b'\x02':
self.elfclass = 64
else:
raise ELFError('Invalid EI_CLASS %s' % repr(ei_class))
ei_data = self.stream.read(1)
if ei_data == b'\x01':
self.little_endian = True
elif ei_data == b'\x02':
self.little_endian = False
else:
raise ELFError('Invalid EI_DATA %s' % repr(ei_data))
def _section_offset(self, n):
""" Compute the offset of section #n in the file
"""
return self['e_shoff'] + n * self['e_shentsize']
def _segment_offset(self, n):
""" Compute the offset of segment #n in the file
"""
return self['e_phoff'] + n * self['e_phentsize']
def _make_segment(self, segment_header):
""" Create a Segment object of the appropriate type
"""
segtype = segment_header['p_type']
if segtype == 'PT_INTERP':
return InterpSegment(segment_header, self.stream)
elif segtype == 'PT_DYNAMIC':
return DynamicSegment(segment_header, self.stream, self)
elif segtype == 'PT_NOTE':
return NoteSegment(segment_header, self.stream, self)
else:
return Segment(segment_header, self.stream)
def _get_section_header(self, n):
""" Find the header of section #n, parse it and return the struct
"""
return struct_parse(
self.structs.Elf_Shdr,
self.stream,
stream_pos=self._section_offset(n))
def _get_section_name(self, section_header):
""" Given a section header, find this section's name in the file's
string table
"""
name_offset = section_header['sh_name']
return self._file_stringtable_section.get_string(name_offset)
def _make_section(self, section_header):
""" Create a section object of the appropriate type
"""
name = self._get_section_name(section_header)
sectype = section_header['sh_type']
if sectype == 'SHT_STRTAB':
return StringTableSection(section_header, name, self)
elif sectype == 'SHT_NULL':
return NullSection(section_header, name, self)
elif sectype in ('SHT_SYMTAB', 'SHT_DYNSYM', 'SHT_SUNW_LDYNSYM'):
return self._make_symbol_table_section(section_header, name)
elif sectype == 'SHT_SUNW_syminfo':
return self._make_sunwsyminfo_table_section(section_header, name)
elif sectype == 'SHT_GNU_verneed':
return self._make_gnu_verneed_section(section_header, name)
elif sectype == 'SHT_GNU_verdef':
return self._make_gnu_verdef_section(section_header, name)
elif sectype == 'SHT_GNU_versym':
return self._make_gnu_versym_section(section_header, name)
elif sectype in ('SHT_REL', 'SHT_RELA'):
return RelocationSection(section_header, name, self)
elif sectype == 'SHT_DYNAMIC':
return DynamicSection(section_header, name, self)
elif sectype == 'SHT_NOTE':
return NoteSection(section_header, name, self)
elif sectype == 'SHT_PROGBITS' and name == '.stab':
return StabSection(section_header, name, self)
elif sectype == 'SHT_ARM_ATTRIBUTES':
return ARMAttributesSection(section_header, name, self)
else:
return Section(section_header, name, self)
def _make_symbol_table_section(self, section_header, name):
""" Create a SymbolTableSection
"""
linked_strtab_index = section_header['sh_link']
strtab_section = self.get_section(linked_strtab_index)
return SymbolTableSection(
section_header, name,
elffile=self,
stringtable=strtab_section)
def _make_sunwsyminfo_table_section(self, section_header, name):
""" Create a SUNWSyminfoTableSection
"""
linked_strtab_index = section_header['sh_link']
strtab_section = self.get_section(linked_strtab_index)
return SUNWSyminfoTableSection(
section_header, name,
elffile=self,
symboltable=strtab_section)
def _make_gnu_verneed_section(self, section_header, name):
""" Create a GNUVerNeedSection
"""
linked_strtab_index = section_header['sh_link']
strtab_section = self.get_section(linked_strtab_index)
return GNUVerNeedSection(
section_header, name,
elffile=self,
stringtable=strtab_section)
def _make_gnu_verdef_section(self, section_header, name):
""" Create a GNUVerDefSection
"""
linked_strtab_index = section_header['sh_link']
strtab_section = self.get_section(linked_strtab_index)
return GNUVerDefSection(
section_header, name,
elffile=self,
stringtable=strtab_section)
def _make_gnu_versym_section(self, section_header, name):
""" Create a GNUVerSymSection
"""
linked_strtab_index = section_header['sh_link']
strtab_section = self.get_section(linked_strtab_index)
return GNUVerSymSection(
section_header, name,
elffile=self,
symboltable=strtab_section)
def _get_segment_header(self, n):
""" Find the header of segment #n, parse it and return the struct
"""
return struct_parse(
self.structs.Elf_Phdr,
self.stream,
stream_pos=self._segment_offset(n))
def _get_file_stringtable(self):
""" Find the file's string table section
"""
stringtable_section_num = self['e_shstrndx']
return StringTableSection(
header=self._get_section_header(stringtable_section_num),
name='',
elffile=self)
def _parse_elf_header(self):
""" Parses the ELF file header and assigns the result to attributes
of this object.
"""
return struct_parse(self.structs.Elf_Ehdr, self.stream, stream_pos=0)
def _read_dwarf_section(self, section, relocate_dwarf_sections):
""" Read the contents of a DWARF section from the stream and return a
DebugSectionDescriptor. Apply relocations if asked to.
"""
# The section data is read into a new stream, for processing
section_stream = BytesIO()
section_stream.write(section.data())
if relocate_dwarf_sections:
reloc_handler = RelocationHandler(self)
reloc_section = reloc_handler.find_relocations_for_section(section)
if reloc_section is not None:
reloc_handler.apply_section_relocations(
section_stream, reloc_section)
return DebugSectionDescriptor(
stream=section_stream,
name=section.name,
global_offset=section['sh_offset'],
size=section['sh_size'],
address=section['sh_addr'])
@staticmethod
def _decompress_dwarf_section(section):
""" Returns the uncompressed contents of the provided DWARF section.
"""
# TODO: support other compression formats from readelf.c
assert section.size > 12, 'Unsupported compression format.'
section.stream.seek(0)
# According to readelf.c the content should contain "ZLIB"
# followed by the uncompressed section size - 8 bytes in
# big-endian order
compression_type = section.stream.read(4)
assert compression_type == b'ZLIB', \
'Invalid compression type: %r' % (compression_type)
uncompressed_size = struct.unpack('>Q', section.stream.read(8))[0]
decompressor = zlib.decompressobj()
uncompressed_stream = BytesIO()
while True:
chunk = section.stream.read(PAGESIZE)
if not chunk:
break
uncompressed_stream.write(decompressor.decompress(chunk))
uncompressed_stream.write(decompressor.flush())
uncompressed_stream.seek(0, io.SEEK_END)
size = uncompressed_stream.tell()
assert uncompressed_size == size, \
'Wrong uncompressed size: expected %r, but got %r' % (
uncompressed_size, size,
)
return section._replace(stream=uncompressed_stream, size=size)
pyelftools-0.26/elftools/elf/enums.py 0000664 0000000 0000000 00000106134 13572204573 0017754 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/enums.py
#
# Mappings of enum names to values
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..common.utils import merge_dicts
from ..construct import Pass
# e_ident[EI_CLASS] in the ELF header
ENUM_EI_CLASS = dict(
ELFCLASSNONE=0,
ELFCLASS32=1,
ELFCLASS64=2
)
# e_ident[EI_DATA] in the ELF header
ENUM_EI_DATA = dict(
ELFDATANONE=0,
ELFDATA2LSB=1,
ELFDATA2MSB=2
)
# e_version in the ELF header
ENUM_E_VERSION = dict(
EV_NONE=0,
EV_CURRENT=1,
_default_=Pass,
)
# e_ident[EI_OSABI] in the ELF header
ENUM_EI_OSABI = dict(
ELFOSABI_SYSV=0,
ELFOSABI_HPUX=1,
ELFOSABI_NETBSD=2,
ELFOSABI_LINUX=3,
ELFOSABI_HURD=4,
ELFOSABI_SOLARIS=6,
ELFOSABI_AIX=7,
ELFOSABI_IRIX=8,
ELFOSABI_FREEBSD=9,
ELFOSABI_TRU64=10,
ELFOSABI_MODESTO=11,
ELFOSABI_OPENBSD=12,
ELFOSABI_OPENVMS=13,
ELFOSABI_NSK=14,
ELFOSABI_AROS=15,
ELFOSABI_FENIXOS=16,
ELFOSABI_CLOUD=17,
ELFOSABI_SORTIX=53,
ELFOSABI_ARM_AEABI=64,
ELFOSABI_ARM=97,
ELFOSABI_STANDALONE=255,
_default_=Pass,
)
# e_type in the ELF header
ENUM_E_TYPE = dict(
ET_NONE=0,
ET_REL=1,
ET_EXEC=2,
ET_DYN=3,
ET_CORE=4,
ET_LOPROC=0xff00,
ET_HIPROC=0xffff,
_default_=Pass,
)
# e_machine in the ELF header
ENUM_E_MACHINE = dict(
EM_NONE = 0, # No machine
EM_M32 = 1, # AT&T WE 32100
EM_SPARC = 2, # SPARC
EM_386 = 3, # Intel 80386
EM_68K = 4, # Motorola 68000
EM_88K = 5, # Motorola 88000
EM_IAMCU = 6, # Intel MCU
EM_860 = 7, # Intel 80860
EM_MIPS = 8, # MIPS I Architecture
EM_S370 = 9, # IBM System/370 Processor
EM_MIPS_RS3_LE = 10, # MIPS RS3000 Little-endian
EM_PARISC = 15, # Hewlett-Packard PA-RISC
EM_VPP500 = 17, # Fujitsu VPP500
EM_SPARC32PLUS = 18, # Enhanced instruction set SPARC
EM_960 = 19, # Intel 80960
EM_PPC = 20, # PowerPC
EM_PPC64 = 21, # 64-bit PowerPC
EM_S390 = 22, # IBM System/390 Processor
EM_SPU = 23, # IBM SPU/SPC
EM_V800 = 36, # NEC V800
EM_FR20 = 37, # Fujitsu FR20
EM_RH32 = 38, # TRW RH-32
EM_RCE = 39, # Motorola RCE
EM_ARM = 40, # ARM 32-bit architecture (AARCH32)
EM_ALPHA = 41, # Digital Alpha
EM_SH = 42, # Hitachi SH
EM_SPARCV9 = 43, # SPARC Version 9
EM_TRICORE = 44, # Siemens TriCore embedded processor
EM_ARC = 45, # Argonaut RISC Core, Argonaut Technologies Inc.
EM_H8_300 = 46, # Hitachi H8/300
EM_H8_300H = 47, # Hitachi H8/300H
EM_H8S = 48, # Hitachi H8S
EM_H8_500 = 49, # Hitachi H8/500
EM_IA_64 = 50, # Intel IA-64 processor architecture
EM_MIPS_X = 51, # Stanford MIPS-X
EM_COLDFIRE = 52, # Motorola ColdFire
EM_68HC12 = 53, # Motorola M68HC12
EM_MMA = 54, # Fujitsu MMA Multimedia Accelerator
EM_PCP = 55, # Siemens PCP
EM_NCPU = 56, # Sony nCPU embedded RISC processor
EM_NDR1 = 57, # Denso NDR1 microprocessor
EM_STARCORE = 58, # Motorola Star*Core processor
EM_ME16 = 59, # Toyota ME16 processor
EM_ST100 = 60, # STMicroelectronics ST100 processor
EM_TINYJ = 61, # Advanced Logic Corp. TinyJ embedded processor family
EM_X86_64 = 62, # AMD x86-64 architecture
EM_PDSP = 63, # Sony DSP Processor
EM_PDP10 = 64, # Digital Equipment Corp. PDP-10
EM_PDP11 = 65, # Digital Equipment Corp. PDP-11
EM_FX66 = 66, # Siemens FX66 microcontroller
EM_ST9PLUS = 67, # STMicroelectronics ST9+ 8/16 bit microcontroller
EM_ST7 = 68, # STMicroelectronics ST7 8-bit microcontroller
EM_68HC16 = 69, # Motorola MC68HC16 Microcontroller
EM_68HC11 = 70, # Motorola MC68HC11 Microcontroller
EM_68HC08 = 71, # Motorola MC68HC08 Microcontroller
EM_68HC05 = 72, # Motorola MC68HC05 Microcontroller
EM_SVX = 73, # Silicon Graphics SVx
EM_ST19 = 74, # STMicroelectronics ST19 8-bit microcontroller
EM_VAX = 75, # Digital VAX
EM_CRIS = 76, # Axis Communications 32-bit embedded processor
EM_JAVELIN = 77, # Infineon Technologies 32-bit embedded processor
EM_FIREPATH = 78, # Element 14 64-bit DSP Processor
EM_ZSP = 79, # LSI Logic 16-bit DSP Processor
EM_MMIX = 80, # Donald Knuth's educational 64-bit processor
EM_HUANY = 81, # Harvard University machine-independent object files
EM_PRISM = 82, # SiTera Prism
EM_AVR = 83, # Atmel AVR 8-bit microcontroller
EM_FR30 = 84, # Fujitsu FR30
EM_D10V = 85, # Mitsubishi D10V
EM_D30V = 86, # Mitsubishi D30V
EM_V850 = 87, # NEC v850
EM_M32R = 88, # Mitsubishi M32R
EM_MN10300 = 89, # Matsushita MN10300
EM_MN10200 = 90, # Matsushita MN10200
EM_PJ = 91, # picoJava
EM_OPENRISC = 92, # OpenRISC 32-bit embedded processor
EM_ARC_COMPACT = 93, # ARC International ARCompact processor (old spelling/synonym: EM_ARC_A5)
EM_XTENSA = 94, # Tensilica Xtensa Architecture
EM_VIDEOCORE = 95, # Alphamosaic VideoCore processor
EM_TMM_GPP = 96, # Thompson Multimedia General Purpose Processor
EM_NS32K = 97, # National Semiconductor 32000 series
EM_TPC = 98, # Tenor Network TPC processor
EM_SNP1K = 99, # Trebia SNP 1000 processor
EM_ST200 = 100, # STMicroelectronics (www.st.com) ST200 microcontroller
EM_IP2K = 101, # Ubicom IP2xxx microcontroller family
EM_MAX = 102, # MAX Processor
EM_CR = 103, # National Semiconductor CompactRISC microprocessor
EM_F2MC16 = 104, # Fujitsu F2MC16
EM_MSP430 = 105, # Texas Instruments embedded microcontroller msp430
EM_BLACKFIN = 106, # Analog Devices Blackfin (DSP) processor
EM_SE_C33 = 107, # S1C33 Family of Seiko Epson processors
EM_SEP = 108, # Sharp embedded microprocessor
EM_ARCA = 109, # Arca RISC Microprocessor
EM_UNICORE = 110, # Microprocessor series from PKU-Unity Ltd. and MPRC of Peking University
EM_EXCESS = 111, # eXcess: 16/32/64-bit configurable embedded CPU
EM_DXP = 112, # Icera Semiconductor Inc. Deep Execution Processor
EM_ALTERA_NIOS2 = 113, # Altera Nios II soft-core processor
EM_CRX = 114, # National Semiconductor CompactRISC CRX microprocessor
EM_XGATE = 115, # Motorola XGATE embedded processor
EM_C166 = 116, # Infineon C16x/XC16x processor
EM_M16C = 117, # Renesas M16C series microprocessors
EM_DSPIC30F = 118, # Microchip Technology dsPIC30F Digital Signal Controller
EM_CE = 119, # Freescale Communication Engine RISC core
EM_M32C = 120, # Renesas M32C series microprocessors
EM_TSK3000 = 131, # Altium TSK3000 core
EM_RS08 = 132, # Freescale RS08 embedded processor
EM_SHARC = 133, # Analog Devices SHARC family of 32-bit DSP processors
EM_ECOG2 = 134, # Cyan Technology eCOG2 microprocessor
EM_SCORE7 = 135, # Sunplus S+core7 RISC processor
EM_DSP24 = 136, # New Japan Radio (NJR) 24-bit DSP Processor
EM_VIDEOCORE3 = 137, # Broadcom VideoCore III processor
EM_LATTICEMICO32 = 138, # RISC processor for Lattice FPGA architecture
EM_SE_C17 = 139, # Seiko Epson C17 family
EM_TI_C6000 = 140, # The Texas Instruments TMS320C6000 DSP family
EM_TI_C2000 = 141, # The Texas Instruments TMS320C2000 DSP family
EM_TI_C5500 = 142, # The Texas Instruments TMS320C55x DSP family
EM_TI_ARP32 = 143, # Texas Instruments Application Specific RISC Processor, 32bit fetch
EM_TI_PRU = 144, # Texas Instruments Programmable Realtime Unit
EM_MMDSP_PLUS = 160, # STMicroelectronics 64bit VLIW Data Signal Processor
EM_CYPRESS_M8C = 161, # Cypress M8C microprocessor
EM_R32C = 162, # Renesas R32C series microprocessors
EM_TRIMEDIA = 163, # NXP Semiconductors TriMedia architecture family
EM_QDSP6 = 164, # QUALCOMM DSP6 Processor
EM_8051 = 165, # Intel 8051 and variants
EM_STXP7X = 166, # STMicroelectronics STxP7x family of configurable and extensible RISC processors
EM_NDS32 = 167, # Andes Technology compact code size embedded RISC processor family
EM_ECOG1 = 168, # Cyan Technology eCOG1X family
EM_ECOG1X = 168, # Cyan Technology eCOG1X family
EM_MAXQ30 = 169, # Dallas Semiconductor MAXQ30 Core Micro-controllers
EM_XIMO16 = 170, # New Japan Radio (NJR) 16-bit DSP Processor
EM_MANIK = 171, # M2000 Reconfigurable RISC Microprocessor
EM_CRAYNV2 = 172, # Cray Inc. NV2 vector architecture
EM_RX = 173, # Renesas RX family
EM_METAG = 174, # Imagination Technologies META processor architecture
EM_MCST_ELBRUS = 175, # MCST Elbrus general purpose hardware architecture
EM_ECOG16 = 176, # Cyan Technology eCOG16 family
EM_CR16 = 177, # National Semiconductor CompactRISC CR16 16-bit microprocessor
EM_ETPU = 178, # Freescale Extended Time Processing Unit
EM_SLE9X = 179, # Infineon Technologies SLE9X core
EM_L10M = 180, # Intel L10M
EM_K10M = 181, # Intel K10M
EM_AARCH64 = 183, # ARM 64-bit architecture (AARCH64)
EM_AVR32 = 185, # Atmel Corporation 32-bit microprocessor family
EM_STM8 = 186, # STMicroeletronics STM8 8-bit microcontroller
EM_TILE64 = 187, # Tilera TILE64 multicore architecture family
EM_TILEPRO = 188, # Tilera TILEPro multicore architecture family
EM_MICROBLAZE = 189, # Xilinx MicroBlaze 32-bit RISC soft processor core
EM_CUDA = 190, # NVIDIA CUDA architecture
EM_TILEGX = 191, # Tilera TILE-Gx multicore architecture family
EM_CLOUDSHIELD = 192, # CloudShield architecture family
EM_COREA_1ST = 193, # KIPO-KAIST Core-A 1st generation processor family
EM_COREA_2ND = 194, # KIPO-KAIST Core-A 2nd generation processor family
EM_ARC_COMPACT2 = 195, # Synopsys ARCompact V2
EM_OPEN8 = 196, # Open8 8-bit RISC soft processor core
EM_RL78 = 197, # Renesas RL78 family
EM_VIDEOCORE5 = 198, # Broadcom VideoCore V processor
EM_78KOR = 199, # Renesas 78KOR family
EM_56800EX = 200, # Freescale 56800EX Digital Signal Controller (DSC)
EM_BA1 = 201, # Beyond BA1 CPU architecture
EM_BA2 = 202, # Beyond BA2 CPU architecture
EM_XCORE = 203, # XMOS xCORE processor family
EM_MCHP_PIC = 204, # Microchip 8-bit PIC(r) family
EM_INTEL205 = 205, # Reserved by Intel
EM_INTEL206 = 206, # Reserved by Intel
EM_INTEL207 = 207, # Reserved by Intel
EM_INTEL208 = 208, # Reserved by Intel
EM_INTEL209 = 209, # Reserved by Intel
EM_KM32 = 210, # KM211 KM32 32-bit processor
EM_KMX32 = 211, # KM211 KMX32 32-bit processor
EM_KMX16 = 212, # KM211 KMX16 16-bit processor
EM_KMX8 = 213, # KM211 KMX8 8-bit processor
EM_KVARC = 214, # KM211 KVARC processor
EM_CDP = 215, # Paneve CDP architecture family
EM_COGE = 216, # Cognitive Smart Memory Processor
EM_COOL = 217, # Bluechip Systems CoolEngine
EM_NORC = 218, # Nanoradio Optimized RISC
EM_CSR_KALIMBA = 219, # CSR Kalimba architecture family
EM_Z80 = 220, # Zilog Z80
EM_VISIUM = 221, # Controls and Data Services VISIUMcore processor
EM_FT32 = 222, # FTDI Chip FT32 high performance 32-bit RISC architecture
EM_MOXIE = 223, # Moxie processor family
EM_AMDGPU = 224, # AMD GPU architecture
EM_RISCV = 243, # RISC-V
# Reservations
# reserved 11-14 Reserved for future use
# reserved 16 Reserved for future use
# reserved 24-35 Reserved for future use
# reserved 121-130 Reserved for future use
# reserved 145-159 Reserved for future use
# reserved 145-159 Reserved for future use
# reserved 182 Reserved for future Intel use
# reserved 184 Reserved for future ARM use
# unknown/reserve? 225 - 242
_default_=Pass,
)
# sh_type in the section header
#
# This is the "base" dict that doesn't hold processor-specific values; from it
# we later create per-processor dicts that use the LOPROC...HIPROC range to
# define processor-specific values. The proper dict should be used based on the
# machine the ELF header refers to.
ENUM_SH_TYPE_BASE = dict(
SHT_NULL=0,
SHT_PROGBITS=1,
SHT_SYMTAB=2,
SHT_STRTAB=3,
SHT_RELA=4,
SHT_HASH=5,
SHT_DYNAMIC=6,
SHT_NOTE=7,
SHT_NOBITS=8,
SHT_REL=9,
SHT_SHLIB=10,
SHT_DYNSYM=11,
SHT_INIT_ARRAY=14,
SHT_FINI_ARRAY=15,
SHT_PREINIT_ARRAY=16,
SHT_GROUP=17,
SHT_SYMTAB_SHNDX=18,
SHT_NUM=19,
SHT_LOOS=0x60000000,
SHT_GNU_ATTRIBUTES=0x6ffffff5,
SHT_GNU_HASH=0x6ffffff6,
SHT_GNU_LIBLIST=0x6ffffff7,
SHT_GNU_verdef=0x6ffffffd, # also SHT_SUNW_verdef
SHT_GNU_verneed=0x6ffffffe, # also SHT_SUNW_verneed
SHT_GNU_versym=0x6fffffff, # also SHT_SUNW_versym, SHT_HIOS
# These are commented out because they carry no semantic meaning in
# themselves and may be overridden by target-specific enums.
#SHT_LOPROC=0x70000000,
#SHT_HIPROC=0x7fffffff,
SHT_LOUSER=0x80000000,
SHT_HIUSER=0xffffffff,
SHT_SUNW_LDYNSYM=0x6ffffff3,
SHT_SUNW_syminfo=0x6ffffffc,
_default_=Pass,
)
ENUM_SH_TYPE_AMD64 = merge_dicts(
ENUM_SH_TYPE_BASE,
dict(SHT_AMD64_UNWIND=0x70000001))
ENUM_SH_TYPE_ARM = merge_dicts(
ENUM_SH_TYPE_BASE,
dict(
SHT_ARM_EXIDX=0x70000001,
SHT_ARM_PREEMPTMAP=0x70000002,
SHT_ARM_ATTRIBUTES=0x70000003,
SHT_ARM_DEBUGOVERLAY=0x70000004))
ENUM_SH_TYPE_MIPS = merge_dicts(
ENUM_SH_TYPE_BASE,
dict(
SHT_MIPS_LIBLIST=0x70000000,
SHT_MIPS_DEBUG=0x70000005,
SHT_MIPS_REGINFO=0x70000006,
SHT_MIPS_PACKAGE=0x70000007,
SHT_MIPS_PACKSYM=0x70000008,
SHT_MIPS_RELD=0x70000009,
SHT_MIPS_IFACE=0x7000000b,
SHT_MIPS_CONTENT=0x7000000c,
SHT_MIPS_OPTIONS=0x7000000d,
SHT_MIPS_SHDR=0x70000010,
SHT_MIPS_FDESC=0x70000011,
SHT_MIPS_EXTSYM=0x70000012,
SHT_MIPS_DENSE=0x70000013,
SHT_MIPS_PDESC=0x70000014,
SHT_MIPS_LOCSYM=0x70000015,
SHT_MIPS_AUXSYM=0x70000016,
SHT_MIPS_OPTSYM=0x70000017,
SHT_MIPS_LOCSTR=0x70000018,
SHT_MIPS_LINE=0x70000019,
SHT_MIPS_RFDESC=0x7000001a,
SHT_MIPS_DELTASYM=0x7000001b,
SHT_MIPS_DELTAINST=0x7000001c,
SHT_MIPS_DELTACLASS=0x7000001d,
SHT_MIPS_DWARF=0x7000001e,
SHT_MIPS_DELTADECL=0x7000001f,
SHT_MIPS_SYMBOL_LIB=0x70000020,
SHT_MIPS_EVENTS=0x70000021,
SHT_MIPS_TRANSLATE=0x70000022,
SHT_MIPS_PIXIE=0x70000023,
SHT_MIPS_XLATE=0x70000024,
SHT_MIPS_XLATE_DEBUG=0x70000025,
SHT_MIPS_WHIRL=0x70000026,
SHT_MIPS_EH_REGION=0x70000027,
SHT_MIPS_XLATE_OLD=0x70000028,
SHT_MIPS_PDR_EXCEPTION=0x70000029))
ENUM_ELFCOMPRESS_TYPE = dict(
ELFCOMPRESS_ZLIB=1,
ELFCOMPRESS_LOOS=0x60000000,
ELFCOMPRESS_HIOS=0x6fffffff,
ELFCOMPRESS_LOPROC=0x70000000,
ELFCOMPRESS_HIPROC=0x7fffffff,
_default_=Pass,
)
# p_type in the program header
# some values scavenged from the ELF headers in binutils-2.21
#
# Using the same base + per-processor augmentation technique as in sh_type.
ENUM_P_TYPE_BASE = dict(
PT_NULL=0,
PT_LOAD=1,
PT_DYNAMIC=2,
PT_INTERP=3,
PT_NOTE=4,
PT_SHLIB=5,
PT_PHDR=6,
PT_TLS=7,
PT_LOOS=0x60000000,
PT_HIOS=0x6fffffff,
# These are commented out because they carry no semantic meaning in
# themselves and may be overridden by target-specific enums.
#PT_LOPROC=0x70000000,
#PT_HIPROC=0x7fffffff,
PT_GNU_EH_FRAME=0x6474e550,
PT_GNU_STACK=0x6474e551,
PT_GNU_RELRO=0x6474e552,
_default_=Pass,
)
ENUM_P_TYPE_ARM = merge_dicts(
ENUM_P_TYPE_BASE,
dict(
PT_ARM_ARCHEXT=0x70000000,
PT_ARM_EXIDX=0x70000001))
ENUM_P_TYPE_AARCH64 = merge_dicts(
ENUM_P_TYPE_BASE,
dict(
PT_AARCH64_ARCHEXT=0x70000000,
PT_AARCH64_UNWIND=0x70000001))
ENUM_P_TYPE_MIPS = merge_dicts(
ENUM_P_TYPE_BASE,
dict(PT_MIPS_ABIFLAGS=0x70000003))
# st_info bindings in the symbol header
ENUM_ST_INFO_BIND = dict(
STB_LOCAL=0,
STB_GLOBAL=1,
STB_WEAK=2,
STB_NUM=3,
STB_LOOS=10,
STB_HIOS=12,
STB_LOPROC=13,
STB_HIPROC=15,
_default_=Pass,
)
# st_info type in the symbol header
ENUM_ST_INFO_TYPE = dict(
STT_NOTYPE=0,
STT_OBJECT=1,
STT_FUNC=2,
STT_SECTION=3,
STT_FILE=4,
STT_COMMON=5,
STT_TLS=6,
STT_NUM=7,
STT_RELC=8,
STT_SRELC=9,
STT_LOOS=10,
STT_HIOS=12,
STT_LOPROC=13,
STT_HIPROC=15,
_default_=Pass,
)
# visibility from st_other
ENUM_ST_VISIBILITY = dict(
STV_DEFAULT=0,
STV_INTERNAL=1,
STV_HIDDEN=2,
STV_PROTECTED=3,
STV_EXPORTED=4,
STV_SINGLETON=5,
STV_ELIMINATE=6,
_default_=Pass,
)
# st_shndx
ENUM_ST_SHNDX = dict(
SHN_UNDEF=0,
SHN_ABS=0xfff1,
SHN_COMMON=0xfff2,
_default_=Pass,
)
# d_tag
ENUM_D_TAG_COMMON = dict(
DT_NULL=0,
DT_NEEDED=1,
DT_PLTRELSZ=2,
DT_PLTGOT=3,
DT_HASH=4,
DT_STRTAB=5,
DT_SYMTAB=6,
DT_RELA=7,
DT_RELASZ=8,
DT_RELAENT=9,
DT_STRSZ=10,
DT_SYMENT=11,
DT_INIT=12,
DT_FINI=13,
DT_SONAME=14,
DT_RPATH=15,
DT_SYMBOLIC=16,
DT_REL=17,
DT_RELSZ=18,
DT_RELENT=19,
DT_PLTREL=20,
DT_DEBUG=21,
DT_TEXTREL=22,
DT_JMPREL=23,
DT_BIND_NOW=24,
DT_INIT_ARRAY=25,
DT_FINI_ARRAY=26,
DT_INIT_ARRAYSZ=27,
DT_FINI_ARRAYSZ=28,
DT_RUNPATH=29,
DT_FLAGS=30,
DT_ENCODING=32,
DT_PREINIT_ARRAY=32,
DT_PREINIT_ARRAYSZ=33,
DT_NUM=34,
DT_LOOS=0x6000000d,
DT_HIOS=0x6ffff000,
DT_LOPROC=0x70000000,
DT_HIPROC=0x7fffffff,
DT_PROCNUM=0x35,
DT_VALRNGLO=0x6ffffd00,
DT_GNU_PRELINKED=0x6ffffdf5,
DT_GNU_CONFLICTSZ=0x6ffffdf6,
DT_GNU_LIBLISTSZ=0x6ffffdf7,
DT_CHECKSUM=0x6ffffdf8,
DT_PLTPADSZ=0x6ffffdf9,
DT_MOVEENT=0x6ffffdfa,
DT_MOVESZ=0x6ffffdfb,
DT_SYMINSZ=0x6ffffdfe,
DT_SYMINENT=0x6ffffdff,
DT_GNU_HASH=0x6ffffef5,
DT_TLSDESC_PLT=0x6ffffef6,
DT_TLSDESC_GOT=0x6ffffef7,
DT_GNU_CONFLICT=0x6ffffef8,
DT_GNU_LIBLIST=0x6ffffef9,
DT_CONFIG=0x6ffffefa,
DT_DEPAUDIT=0x6ffffefb,
DT_AUDIT=0x6ffffefc,
DT_PLTPAD=0x6ffffefd,
DT_MOVETAB=0x6ffffefe,
DT_SYMINFO=0x6ffffeff,
DT_VERSYM=0x6ffffff0,
DT_RELACOUNT=0x6ffffff9,
DT_RELCOUNT=0x6ffffffa,
DT_FLAGS_1=0x6ffffffb,
DT_VERDEF=0x6ffffffc,
DT_VERDEFNUM=0x6ffffffd,
DT_VERNEED=0x6ffffffe,
DT_VERNEEDNUM=0x6fffffff,
DT_AUXILIARY=0x7ffffffd,
DT_FILTER=0x7fffffff,
_default_=Pass,
)
# Above are the dynamic tags which are valid always.
# Below are the dynamic tags which are only valid in certain contexts.
ENUM_D_TAG_SOLARIS = dict(
DT_SUNW_AUXILIARY=0x6000000d,
DT_SUNW_RTLDINF=0x6000000e,
DT_SUNW_FILTER=0x6000000f,
DT_SUNW_CAP=0x60000010,
DT_SUNW_SYMTAB=0x60000011,
DT_SUNW_SYMSZ=0x60000012,
DT_SUNW_ENCODING=0x60000013,
DT_SUNW_SORTENT=0x60000013,
DT_SUNW_SYMSORT=0x60000014,
DT_SUNW_SYMSORTSZ=0x60000015,
DT_SUNW_TLSSORT=0x60000016,
DT_SUNW_TLSSORTSZ=0x60000017,
DT_SUNW_CAPINFO=0x60000018,
DT_SUNW_STRPAD=0x60000019,
DT_SUNW_CAPCHAIN=0x6000001a,
DT_SUNW_LDMACH=0x6000001b,
DT_SUNW_CAPCHAINENT=0x6000001d,
DT_SUNW_CAPCHAINSZ=0x6000001f,
)
ENUM_D_TAG_MIPS = dict(
DT_MIPS_RLD_VERSION=0x70000001,
DT_MIPS_TIME_STAMP=0x70000002,
DT_MIPS_ICHECKSUM=0x70000003,
DT_MIPS_IVERSION=0x70000004,
DT_MIPS_FLAGS=0x70000005,
DT_MIPS_BASE_ADDRESS=0x70000006,
DT_MIPS_CONFLICT=0x70000008,
DT_MIPS_LIBLIST=0x70000009,
DT_MIPS_LOCAL_GOTNO=0x7000000a,
DT_MIPS_CONFLICTNO=0x7000000b,
DT_MIPS_LIBLISTNO=0x70000010,
DT_MIPS_SYMTABNO=0x70000011,
DT_MIPS_UNREFEXTNO=0x70000012,
DT_MIPS_GOTSYM=0x70000013,
DT_MIPS_HIPAGENO=0x70000014,
DT_MIPS_RLD_MAP=0x70000016,
DT_MIPS_RLD_MAP_REL=0x70000035,
)
# Here is the mapping from e_machine enum to the extra dynamic tags which it
# validates. Solaris is missing from this list because its inclusion is not
# controlled by e_machine but rather e_ident[EI_OSABI].
# TODO: add the rest of the machine-specific dynamic tags, not just mips and
# solaris
ENUMMAP_EXTRA_D_TAG_MACHINE = dict(
EM_MIPS=ENUM_D_TAG_MIPS,
EM_MIPS_RS3_LE=ENUM_D_TAG_MIPS,
)
# Here is the full combined mapping from tag name to value
ENUM_D_TAG = dict(ENUM_D_TAG_COMMON)
ENUM_D_TAG.update(ENUM_D_TAG_SOLARIS)
for k in ENUMMAP_EXTRA_D_TAG_MACHINE:
ENUM_D_TAG.update(ENUMMAP_EXTRA_D_TAG_MACHINE[k])
ENUM_DT_FLAGS = dict(
DF_ORIGIN=0x1,
DF_SYMBOLIC=0x2,
DF_TEXTREL=0x4,
DF_BIND_NOW=0x8,
DF_STATIC_TLS=0x10,
)
ENUM_DT_FLAGS_1 = dict(
DF_1_NOW=0x1,
DF_1_GLOBAL=0x2,
DF_1_GROUP=0x4,
DF_1_NODELETE=0x8,
DF_1_LOADFLTR=0x10,
DF_1_INITFIRST=0x20,
DF_1_NOOPEN=0x40,
DF_1_ORIGIN=0x80,
DF_1_DIRECT=0x100,
DF_1_TRANS=0x200,
DF_1_INTERPOSE=0x400,
DF_1_NODEFLIB=0x800,
DF_1_NODUMP=0x1000,
DF_1_CONFALT=0x2000,
DF_1_ENDFILTEE=0x4000,
DF_1_DISPRELDNE=0x8000,
DF_1_DISPRELPND=0x10000,
DF_1_NODIRECT=0x20000,
DF_1_IGNMULDEF=0x40000,
DF_1_NOKSYMS=0x80000,
DF_1_NOHDR=0x100000,
DF_1_EDITED=0x200000,
DF_1_NORELOC=0x400000,
DF_1_SYMINTPOSE=0x800000,
DF_1_GLOBAUDIT=0x1000000,
DF_1_SINGLETON=0x2000000,
DF_1_STUB=0x4000000,
DF_1_PIE=0x8000000,
)
ENUM_RELOC_TYPE_MIPS = dict(
R_MIPS_NONE=0,
R_MIPS_16=1,
R_MIPS_32=2,
R_MIPS_REL32=3,
R_MIPS_26=4,
R_MIPS_HI16=5,
R_MIPS_LO16=6,
R_MIPS_GPREL16=7,
R_MIPS_LITERAL=8,
R_MIPS_GOT16=9,
R_MIPS_PC16=10,
R_MIPS_CALL16=11,
R_MIPS_GPREL32=12,
R_MIPS_SHIFT5=16,
R_MIPS_SHIFT6=17,
R_MIPS_64=18,
R_MIPS_GOT_DISP=19,
R_MIPS_GOT_PAGE=20,
R_MIPS_GOT_OFST=21,
R_MIPS_GOT_HI16=22,
R_MIPS_GOT_LO16=23,
R_MIPS_SUB=24,
R_MIPS_INSERT_A=25,
R_MIPS_INSERT_B=26,
R_MIPS_DELETE=27,
R_MIPS_HIGHER=28,
R_MIPS_HIGHEST=29,
R_MIPS_CALL_HI16=30,
R_MIPS_CALL_LO16=31,
R_MIPS_SCN_DISP=32,
R_MIPS_REL16=33,
R_MIPS_ADD_IMMEDIATE=34,
R_MIPS_PJUMP=35,
R_MIPS_RELGOT=36,
R_MIPS_JALR=37,
R_MIPS_TLS_DTPMOD32=38,
R_MIPS_TLS_DTPREL32=39,
R_MIPS_TLS_DTPMOD64=40,
R_MIPS_TLS_DTPREL64=41,
R_MIPS_TLS_GD=42,
R_MIPS_TLS_LDM=43,
R_MIPS_TLS_DTPREL_HI16=44,
R_MIPS_TLS_DTPREL_LO16=45,
R_MIPS_TLS_GOTTPREL=46,
R_MIPS_TLS_TPREL32=47,
R_MIPS_TLS_TPREL64=48,
R_MIPS_TLS_TPREL_HI16=49,
R_MIPS_TLS_TPREL_LO16=50,
R_MIPS_GLOB_DAT=51,
R_MIPS_COPY=126,
R_MIPS_JUMP_SLOT=127,
_default_=Pass,
)
ENUM_RELOC_TYPE_i386 = dict(
R_386_NONE=0,
R_386_32=1,
R_386_PC32=2,
R_386_GOT32=3,
R_386_PLT32=4,
R_386_COPY=5,
R_386_GLOB_DAT=6,
R_386_JUMP_SLOT=7,
R_386_RELATIVE=8,
R_386_GOTOFF=9,
R_386_GOTPC=10,
R_386_32PLT=11,
R_386_TLS_TPOFF=14,
R_386_TLS_IE=15,
R_386_TLS_GOTIE=16,
R_386_TLS_LE=17,
R_386_TLS_GD=18,
R_386_TLS_LDM=19,
R_386_16=20,
R_386_PC16=21,
R_386_8=22,
R_386_PC8=23,
R_386_TLS_GD_32=24,
R_386_TLS_GD_PUSH=25,
R_386_TLS_GD_CALL=26,
R_386_TLS_GD_POP=27,
R_386_TLS_LDM_32=28,
R_386_TLS_LDM_PUSH=29,
R_386_TLS_LDM_CALL=30,
R_386_TLS_LDM_POP=31,
R_386_TLS_LDO_32=32,
R_386_TLS_IE_32=33,
R_386_TLS_LE_32=34,
R_386_TLS_DTPMOD32=35,
R_386_TLS_DTPOFF32=36,
R_386_TLS_TPOFF32=37,
R_386_TLS_GOTDESC=39,
R_386_TLS_DESC_CALL=40,
R_386_TLS_DESC=41,
R_386_IRELATIVE=42,
R_386_USED_BY_INTEL_200=200,
R_386_GNU_VTINHERIT=250,
R_386_GNU_VTENTRY=251,
_default_=Pass,
)
ENUM_RELOC_TYPE_x64 = dict(
R_X86_64_NONE=0,
R_X86_64_64=1,
R_X86_64_PC32=2,
R_X86_64_GOT32=3,
R_X86_64_PLT32=4,
R_X86_64_COPY=5,
R_X86_64_GLOB_DAT=6,
R_X86_64_JUMP_SLOT=7,
R_X86_64_RELATIVE=8,
R_X86_64_GOTPCREL=9,
R_X86_64_32=10,
R_X86_64_32S=11,
R_X86_64_16=12,
R_X86_64_PC16=13,
R_X86_64_8=14,
R_X86_64_PC8=15,
R_X86_64_DTPMOD64=16,
R_X86_64_DTPOFF64=17,
R_X86_64_TPOFF64=18,
R_X86_64_TLSGD=19,
R_X86_64_TLSLD=20,
R_X86_64_DTPOFF32=21,
R_X86_64_GOTTPOFF=22,
R_X86_64_TPOFF32=23,
R_X86_64_PC64=24,
R_X86_64_GOTOFF64=25,
R_X86_64_GOTPC32=26,
R_X86_64_GOT64=27,
R_X86_64_GOTPCREL64=28,
R_X86_64_GOTPC64=29,
R_X86_64_GOTPLT64=30,
R_X86_64_PLTOFF64=31,
R_X86_64_GOTPC32_TLSDESC=34,
R_X86_64_TLSDESC_CALL=35,
R_X86_64_TLSDESC=36,
R_X86_64_IRELATIVE=37,
R_X86_64_GNU_VTINHERIT=250,
R_X86_64_GNU_VTENTRY=251,
_default_=Pass,
)
# Sunw Syminfo Bound To special values
ENUM_SUNW_SYMINFO_BOUNDTO = dict(
SYMINFO_BT_SELF=0xffff,
SYMINFO_BT_PARENT=0xfffe,
SYMINFO_BT_NONE=0xfffd,
SYMINFO_BT_EXTERN=0xfffc,
_default_=Pass,
)
# Versym section, version dependency index
ENUM_VERSYM = dict(
VER_NDX_LOCAL=0,
VER_NDX_GLOBAL=1,
VER_NDX_LORESERVE=0xff00,
VER_NDX_ELIMINATE=0xff01,
_default_=Pass,
)
# Sunw Syminfo Bound To special values
ENUM_SUNW_SYMINFO_BOUNDTO = dict(
SYMINFO_BT_SELF=0xffff,
SYMINFO_BT_PARENT=0xfffe,
SYMINFO_BT_NONE=0xfffd,
SYMINFO_BT_EXTERN=0xfffc,
_default_=Pass,
)
# PT_NOTE section types for all ELF types except ET_CORE
ENUM_NOTE_N_TYPE = dict(
NT_GNU_ABI_TAG=1,
NT_GNU_HWCAP=2,
NT_GNU_BUILD_ID=3,
NT_GNU_GOLD_VERSION=4,
_default_=Pass,
)
# PT_NOTE section types for ET_CORE
ENUM_CORE_NOTE_N_TYPE = dict(
NT_PRSTATUS=1,
NT_FPREGSET=2,
NT_PRPSINFO=3,
NT_TASKSTRUCT=4,
NT_AUXV=6,
NT_SIGINFO=0x53494749,
NT_FILE=0x46494c45,
_default_=Pass,
)
# Values in GNU .note.ABI-tag notes (n_type=='NT_GNU_ABI_TAG')
ENUM_NOTE_ABI_TAG_OS = dict(
ELF_NOTE_OS_LINUX=0,
ELF_NOTE_OS_GNU=1,
ELF_NOTE_OS_SOLARIS2=2,
ELF_NOTE_OS_FREEBSD=3,
ELF_NOTE_OS_NETBSD=4,
ELF_NOTE_OS_SYLLABLE=5,
_default_=Pass,
)
ENUM_RELOC_TYPE_ARM = dict(
R_ARM_NONE=0,
R_ARM_PC24=1,
R_ARM_ABS32=2,
R_ARM_REL32=3,
R_ARM_LDR_PC_G0=4,
R_ARM_ABS16=5,
R_ARM_ABS12=6,
R_ARM_THM_ABS5=7,
R_ARM_ABS8=8,
R_ARM_SBREL32=9,
R_ARM_THM_CALL=10,
R_ARM_THM_PC8=11,
R_ARM_BREL_ADJ=12,
R_ARM_SWI24=13,
R_ARM_THM_SWI8=14,
R_ARM_XPC25=15,
R_ARM_THM_XPC22=16,
R_ARM_TLS_DTPMOD32=17,
R_ARM_TLS_DTPOFF32=18,
R_ARM_TLS_TPOFF32=19,
R_ARM_COPY=20,
R_ARM_GLOB_DAT=21,
R_ARM_JUMP_SLOT=22,
R_ARM_RELATIVE=23,
R_ARM_GOTOFF32=24,
R_ARM_BASE_PREL=25,
R_ARM_GOT_BREL=26,
R_ARM_PLT32=27,
R_ARM_CALL=28,
R_ARM_JUMP24=29,
R_ARM_THM_JUMP24=30,
R_ARM_BASE_ABS=31,
R_ARM_ALU_PCREL_7_0=32,
R_ARM_ALU_PCREL_15_8=33,
R_ARM_ALU_PCREL_23_15=34,
R_ARM_LDR_SBREL_11_0_NC=35,
R_ARM_ALU_SBREL_19_12_NC=36,
R_ARM_ALU_SBREL_27_20_CK=37,
R_ARM_TARGET1=38,
R_ARM_SBREL31=39,
R_ARM_V4BX=40,
R_ARM_TARGET2=41,
R_ARM_PREL31=42,
R_ARM_MOVW_ABS_NC=43,
R_ARM_MOVT_ABS=44,
R_ARM_MOVW_PREL_NC=45,
R_ARM_MOVT_PREL=46,
R_ARM_THM_MOVW_ABS_NC=47,
R_ARM_THM_MOVT_ABS=48,
R_ARM_THM_MOVW_PREL_NC=49,
R_ARM_THM_MOVT_PREL=50,
R_ARM_THM_JUMP19=51,
R_ARM_THM_JUMP6=52,
R_ARM_THM_ALU_PREL_11_0=53,
R_ARM_THM_PC12=54,
R_ARM_ABS32_NOI=55,
R_ARM_REL32_NOI=56,
R_ARM_ALU_PC_G0_NC=57,
R_ARM_ALU_PC_G0=58,
R_ARM_ALU_PC_G1_NC=59,
R_ARM_ALU_PC_G1=60,
R_ARM_ALU_PC_G2=61,
R_ARM_LDR_PC_G1=62,
R_ARM_LDR_PC_G2=63,
R_ARM_LDRS_PC_G0=64,
R_ARM_LDRS_PC_G1=65,
R_ARM_LDRS_PC_G2=66,
R_ARM_LDC_PC_G0=67,
R_ARM_LDC_PC_G1=68,
R_ARM_LDC_PC_G2=69,
R_ARM_ALU_SB_G0_NC=70,
R_ARM_ALU_SB_G0=71,
R_ARM_ALU_SB_G1_NC=72,
R_ARM_ALU_SB_G1=73,
R_ARM_ALU_SB_G2=74,
R_ARM_LDR_SB_G0=75,
R_ARM_LDR_SB_G1=76,
R_ARM_LDR_SB_G2=77,
R_ARM_LDRS_SB_G0=78,
R_ARM_LDRS_SB_G1=79,
R_ARM_LDRS_SB_G2=80,
R_ARM_LDC_SB_G0=81,
R_ARM_LDC_SB_G1=82,
R_ARM_LDC_SB_G2=83,
R_ARM_MOVW_BREL_NC=84,
R_ARM_MOVT_BREL=85,
R_ARM_MOVW_BREL=86,
R_ARM_THM_MOVW_BREL_NC=87,
R_ARM_THM_MOVT_BREL=88,
R_ARM_THM_MOVW_BREL=89,
R_ARM_PLT32_ABS=94,
R_ARM_GOT_ABS=95,
R_ARM_GOT_PREL=96,
R_ARM_GOT_BREL12=97,
R_ARM_GOTOFF12=98,
R_ARM_GOTRELAX=99,
R_ARM_GNU_VTENTRY=100,
R_ARM_GNU_VTINHERIT=101,
R_ARM_THM_JUMP11=102,
R_ARM_THM_JUMP8=103,
R_ARM_TLS_GD32=104,
R_ARM_TLS_LDM32=105,
R_ARM_TLS_LDO32=106,
R_ARM_TLS_IE32=107,
R_ARM_TLS_LE32=108,
R_ARM_TLS_LDO12=109,
R_ARM_TLS_LE12=110,
R_ARM_TLS_IE12GP=111,
R_ARM_PRIVATE_0=112,
R_ARM_PRIVATE_1=113,
R_ARM_PRIVATE_2=114,
R_ARM_PRIVATE_3=115,
R_ARM_PRIVATE_4=116,
R_ARM_PRIVATE_5=117,
R_ARM_PRIVATE_6=118,
R_ARM_PRIVATE_7=119,
R_ARM_PRIVATE_8=120,
R_ARM_PRIVATE_9=121,
R_ARM_PRIVATE_10=122,
R_ARM_PRIVATE_11=123,
R_ARM_PRIVATE_12=124,
R_ARM_PRIVATE_13=125,
R_ARM_PRIVATE_14=126,
R_ARM_PRIVATE_15=127,
R_ARM_ME_TOO=128,
R_ARM_THM_TLS_DESCSEQ16=129,
R_ARM_THM_TLS_DESCSEQ32=130,
R_ARM_THM_GOT_BREL12=131,
R_ARM_IRELATIVE=140,
)
ENUM_RELOC_TYPE_AARCH64 = dict(
R_AARCH64_NONE=256,
R_AARCH64_ABS64=257,
R_AARCH64_ABS32=258,
R_AARCH64_ABS16=259,
R_AARCH64_PREL64=260,
R_AARCH64_PREL32=261,
R_AARCH64_PREL16=262,
R_AARCH64_MOVW_UABS_G0=263,
R_AARCH64_MOVW_UABS_G0_NC=264,
R_AARCH64_MOVW_UABS_G1=265,
R_AARCH64_MOVW_UABS_G1_NC=266,
R_AARCH64_MOVW_UABS_G2=267,
R_AARCH64_MOVW_UABS_G2_NC=268,
R_AARCH64_MOVW_UABS_G3=269,
R_AARCH64_MOVW_SABS_G0=270,
R_AARCH64_MOVW_SABS_G1=271,
R_AARCH64_MOVW_SABS_G2=272,
R_AARCH64_LD_PREL_LO19=273,
R_AARCH64_ADR_PREL_LO21=274,
R_AARCH64_ADR_PREL_PG_HI21=275,
R_AARCH64_ADR_PREL_PG_HI21_NC=276,
R_AARCH64_ADD_ABS_LO12_NC=277,
R_AARCH64_LDST8_ABS_LO12_NC=278,
R_AARCH64_TSTBR14=279,
R_AARCH64_CONDBR19=280,
R_AARCH64_JUMP26=282,
R_AARCH64_CALL26=283,
R_AARCH64_LDST16_ABS_LO12_NC=284,
R_AARCH64_LDST32_ABS_LO12_NC=285,
R_AARCH64_LDST64_ABS_LO12_NC=286,
R_AARCH64_MOVW_PREL_G0=287,
R_AARCH64_MOVW_PREL_G0_NC=288,
R_AARCH64_MOVW_PREL_G1=289,
R_AARCH64_MOVW_PREL_G1_NC=290,
R_AARCH64_MOVW_PREL_G2=291,
R_AARCH64_MOVW_PREL_G2_NC=292,
R_AARCH64_MOVW_PREL_G3=293,
R_AARCH64_MOVW_GOTOFF_G0=300,
R_AARCH64_MOVW_GOTOFF_G0_NC=301,
R_AARCH64_MOVW_GOTOFF_G1=302,
R_AARCH64_MOVW_GOTOFF_G1_NC=303,
R_AARCH64_MOVW_GOTOFF_G2=304,
R_AARCH64_MOVW_GOTOFF_G2_NC=305,
R_AARCH64_MOVW_GOTOFF_G3=306,
R_AARCH64_GOTREL64=307,
R_AARCH64_GOTREL32=308,
R_AARCH64_GOT_LD_PREL19=309,
R_AARCH64_LD64_GOTOFF_LO15=310,
R_AARCH64_ADR_GOT_PAGE=311,
R_AARCH64_LD64_GOT_LO12_NC=312,
R_AARCH64_TLSGD_ADR_PREL21=512,
R_AARCH64_TLSGD_ADR_PAGE21=513,
R_AARCH64_TLSGD_ADD_LO12_NC=514,
R_AARCH64_TLSGD_MOVW_G1=515,
R_AARCH64_TLSGD_MOVW_G0_NC=516,
R_AARCH64_TLSLD_ADR_PREL21=517,
R_AARCH64_TLSLD_ADR_PAGE21=518,
R_AARCH64_TLSLD_ADD_LO12_NC=519,
R_AARCH64_TLSLD_MOVW_G1=520,
R_AARCH64_TLSLD_MOVW_G0_NC=521,
R_AARCH64_TLSLD_LD_PREL19=522,
R_AARCH64_TLSLD_MOVW_DTPREL_G2=523,
R_AARCH64_TLSLD_MOVW_DTPREL_G1=524,
R_AARCH64_TLSLD_MOVW_DTPREL_G1_NC=525,
R_AARCH64_TLSLD_MOVW_DTPREL_G0=526,
R_AARCH64_TLSLD_MOVW_DTPREL_G0_NC=527,
R_AARCH64_TLSLD_ADD_DTPREL_HI12=528,
R_AARCH64_TLSLD_ADD_DTPREL_LO12=529,
R_AARCH64_TLSLD_ADD_DTPREL_LO12_NC=530,
R_AARCH64_TLSLD_LDST8_DTPREL_LO12=531,
R_AARCH64_TLSLD_LDST8_DTPREL_LO12_NC=532,
R_AARCH64_TLSLD_LDST16_DTPREL_LO12=533,
R_AARCH64_TLSLD_LDST16_DTPREL_LO12_NC=534,
R_AARCH64_TLSLD_LDST32_DTPREL_LO12=535,
R_AARCH64_TLSLD_LDST32_DTPREL_LO12_NC=536,
R_AARCH64_TLSLD_LDST64_DTPREL_LO12=537,
R_AARCH64_TLSLD_LDST64_DTPREL_LO12_NC=538,
R_AARCH64_TLSIE_MOVW_GOTTPREL_G1=539,
R_AARCH64_TLSIE_MOVW_GOTTPREL_G0_NC=540,
R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21=541,
R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC=542,
R_AARCH64_TLSIE_LD_GOTTPREL_PREL19=543,
R_AARCH64_TLSLE_MOVW_TPREL_G2=544,
R_AARCH64_TLSLE_MOVW_TPREL_G1=545,
R_AARCH64_TLSLE_MOVW_TPREL_G1_NC=546,
R_AARCH64_TLSLE_MOVW_TPREL_G0=547,
R_AARCH64_TLSLE_MOVW_TPREL_G0_NC=548,
R_AARCH64_TLSLE_ADD_TPREL_HI12=549,
R_AARCH64_TLSLE_ADD_TPREL_LO12=550,
R_AARCH64_TLSLE_ADD_TPREL_LO12_NC=551,
R_AARCH64_TLSLE_LDST8_TPREL_LO12=552,
R_AARCH64_TLSLE_LDST8_TPREL_LO12_NC=553,
R_AARCH64_TLSLE_LDST16_TPREL_LO12=554,
R_AARCH64_TLSLE_LDST16_TPREL_LO12_NC=555,
R_AARCH64_TLSLE_LDST32_TPREL_LO12=556,
R_AARCH64_TLSLE_LDST32_TPREL_LO12_NC=557,
R_AARCH64_TLSLE_LDST64_TPREL_LO12=558,
R_AARCH64_TLSLE_LDST64_TPREL_LO12_NC=559,
R_AARCH64_COPY=1024,
R_AARCH64_GLOB_DAT=1025,
R_AARCH64_JUMP_SLOT=1026,
R_AARCH64_RELATIVE=1027,
R_AARCH64_TLS_DTPREL64=1028,
R_AARCH64_TLS_DTPMOD64=1029,
R_AARCH64_TLS_TPREL64=1030,
R_AARCH64_TLS_DTPREL32=1031,
R_AARCH64_TLS_DTPMOD32=1032,
R_AARCH64_TLS_TPREL32=1033,
)
ENUM_ATTR_TAG_ARM = dict(
TAG_FILE=1,
TAG_SECTION=2,
TAG_SYMBOL=3,
TAG_CPU_RAW_NAME=4,
TAG_CPU_NAME=5,
TAG_CPU_ARCH=6,
TAG_CPU_ARCH_PROFILE=7,
TAG_ARM_ISA_USE=8,
TAG_THUMB_ISA_USE=9,
TAG_FP_ARCH=10,
TAG_WMMX_ARCH=11,
TAG_ADVANCED_SIMD_ARCH=12,
TAG_PCS_CONFIG=13,
TAG_ABI_PCS_R9_USE=14,
TAG_ABI_PCS_RW_DATA=15,
TAG_ABI_PCS_RO_DATA=16,
TAG_ABI_PCS_GOT_USE=17,
TAG_ABI_PCS_WCHAR_T=18,
TAG_ABI_FP_ROUNDING=19,
TAG_ABI_FP_DENORMAL=20,
TAG_ABI_FP_EXCEPTIONS=21,
TAG_ABI_FP_USER_EXCEPTIONS=22,
TAG_ABI_FP_NUMBER_MODEL=23,
TAG_ABI_ALIGN_NEEDED=24,
TAG_ABI_ALIGN_PRESERVED=25,
TAG_ABI_ENUM_SIZE=26,
TAG_ABI_HARDFP_USE=27,
TAG_ABI_VFP_ARGS=28,
TAG_ABI_WMMX_ARGS=29,
TAG_ABI_OPTIMIZATION_GOALS=30,
TAG_ABI_FP_OPTIMIZATION_GOALS=31,
TAG_COMPATIBILITY=32,
TAG_CPU_UNALIGNED_ACCESS=34,
TAG_FP_HP_EXTENSION=36,
TAG_ABI_FP_16BIT_FORMAT=38,
TAG_MPEXTENSION_USE=42,
TAG_DIV_USE=44,
TAG_NODEFAULTS=64,
TAG_ALSO_COMPATIBLE_WITH=65,
TAG_T2EE_USE=66,
TAG_CONFORMANCE=67,
TAG_VIRTUALIZATION_USE=68,
TAG_MPEXTENSION_USE_OLD=70,
)
pyelftools-0.26/elftools/elf/gnuversions.py 0000664 0000000 0000000 00000020277 13572204573 0021212 0 ustar 00root root 0000000 0000000 #------------------------------------------------------------------------------
# elftools: elf/gnuversions.py
#
# ELF sections
#
# Yann Rouillard (yann@pleiades.fr.eu.org)
# This code is in the public domain
#------------------------------------------------------------------------------
from ..construct import CString
from ..common.utils import struct_parse, elf_assert
from .sections import Section, Symbol
class Version(object):
""" Version object - representing a version definition or dependency
entry from a "Version Needed" or a "Version Dependency" table section.
This kind of entry contains a pointer to an array of auxiliary entries
that store the information about version names or dependencies.
These entries are not stored in this object and should be accessed
through the appropriate method of a section object which will return
an iterator of VersionAuxiliary objects.
Similarly to Section objects, allows dictionary-like access to
verdef/verneed entry
"""
def __init__(self, entry, name=None):
self.entry = entry
self.name = name
def __getitem__(self, name):
""" Implement dict-like access to entry
"""
return self.entry[name]
class VersionAuxiliary(object):
""" Version Auxiliary object - representing an auxiliary entry of a version
definition or dependency entry
Similarly to Section objects, allows dictionary-like access to the
verdaux/vernaux entry
"""
def __init__(self, entry, name):
self.entry = entry
self.name = name
def __getitem__(self, name):
""" Implement dict-like access to entries
"""
return self.entry[name]
class GNUVersionSection(Section):
""" Common ancestor class for ELF SUNW|GNU Version Needed/Dependency
sections class which contains shareable code
"""
def __init__(self, header, name, elffile, stringtable,
field_prefix, version_struct, version_auxiliaries_struct):
super(GNUVersionSection, self).__init__(header, name, elffile)
self.stringtable = stringtable
self.field_prefix = field_prefix
self.version_struct = version_struct
self.version_auxiliaries_struct = version_auxiliaries_struct
def num_versions(self):
""" Number of version entries in the section
"""
return self['sh_info']
def _field_name(self, name, auxiliary=False):
""" Return the real field's name of version or a version auxiliary
entry
"""
middle = 'a_' if auxiliary else '_'
return self.field_prefix + middle + name
def _iter_version_auxiliaries(self, entry_offset, count):
""" Yield all auxiliary entries of a version entry
"""
name_field = self._field_name('name', auxiliary=True)
next_field = self._field_name('next', auxiliary=True)
for _ in range(count):
entry = struct_parse(
self.version_auxiliaries_struct,
self.stream,
stream_pos=entry_offset)
name = self.stringtable.get_string(entry[name_field])
version_aux = VersionAuxiliary(entry, name)
yield version_aux
entry_offset += entry[next_field]
def iter_versions(self):
""" Yield all the version entries in the section
Each time it returns the main version structure
and an iterator to walk through its auxiliaries entries
"""
aux_field = self._field_name('aux')
count_field = self._field_name('cnt')
next_field = self._field_name('next')
entry_offset = self['sh_offset']
for _ in range(self.num_versions()):
entry = struct_parse(
self.version_struct,
self.stream,
stream_pos=entry_offset)
elf_assert(entry[count_field] > 0,
'Expected number of version auxiliary entries (%s) to be > 0'
'for the following version entry: %s' % (
count_field, str(entry)))
version = Version(entry)
aux_entries_offset = entry_offset + entry[aux_field]
version_auxiliaries_iter = self._iter_version_auxiliaries(
aux_entries_offset, entry[count_field])
yield version, version_auxiliaries_iter
entry_offset += entry[next_field]
class GNUVerNeedSection(GNUVersionSection):
""" ELF SUNW or GNU Version Needed table section.
Has an associated StringTableSection that's passed in the constructor.
"""
def __init__(self, header, name, elffile, stringtable):
super(GNUVerNeedSection, self).__init__(
header, name, elffile, stringtable, 'vn',
elffile.structs.Elf_Verneed, elffile.structs.Elf_Vernaux)
self._has_indexes = None
def has_indexes(self):
""" Return True if at least one version definition entry has an index
that is stored in the vna_other field.
This information is used for symbol versioning
"""
if self._has_indexes is None:
self._has_indexes = False
for _, vernaux_iter in self.iter_versions():
for vernaux in vernaux_iter:
if vernaux['vna_other']:
self._has_indexes = True
break
return self._has_indexes
def iter_versions(self):
for verneed, vernaux in super(GNUVerNeedSection, self).iter_versions():
verneed.name = self.stringtable.get_string(verneed['vn_file'])
yield verneed, vernaux
def get_version(self, index):
""" Get the version information located at index #n in the table
Return boths the verneed structure and the vernaux structure
that contains the name of the version
"""
for verneed, vernaux_iter in self.iter_versions():
for vernaux in vernaux_iter:
if vernaux['vna_other'] == index:
return verneed, vernaux
return None
class GNUVerDefSection(GNUVersionSection):
""" ELF SUNW or GNU Version Definition table section.
Has an associated StringTableSection that's passed in the constructor.
"""
def __init__(self, header, name, elffile, stringtable):
super(GNUVerDefSection, self).__init__(
header, name, elffile, stringtable, 'vd',
elffile.structs.Elf_Verdef, elffile.structs.Elf_Verdaux)
def get_version(self, index):
""" Get the version information located at index #n in the table
Return boths the verdef structure and an iterator to retrieve
both the version names and dependencies in the form of
verdaux entries
"""
for verdef, verdaux_iter in self.iter_versions():
if verdef['vd_ndx'] == index:
return verdef, verdaux_iter
return None
class GNUVerSymSection(Section):
""" ELF SUNW or GNU Versym table section.
Has an associated SymbolTableSection that's passed in the constructor.
"""
def __init__(self, header, name, elffile, symboltable):
super(GNUVerSymSection, self).__init__(header, name, elffile)
self.symboltable = symboltable
def num_symbols(self):
""" Number of symbols in the table
"""
return self['sh_size'] // self['sh_entsize']
def get_symbol(self, n):
""" Get the symbol at index #n from the table (Symbol object)
It begins at 1 and not 0 since the first entry is used to
store the current version of the syminfo table
"""
# Grab the symbol's entry from the stream
entry_offset = self['sh_offset'] + n * self['sh_entsize']
entry = struct_parse(
self.structs.Elf_Versym,
self.stream,
stream_pos=entry_offset)
# Find the symbol name in the associated symbol table
name = self.symboltable.get_symbol(n).name
return Symbol(entry, name)
def iter_symbols(self):
""" Yield all the symbols in the table
"""
for i in range(self.num_symbols()):
yield self.get_symbol(i)
pyelftools-0.26/elftools/elf/hash.py 0000664 0000000 0000000 00000006103 13572204573 0017543 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/hash.py
#
# ELF hash table sections
#
# Andreas Ziegler (andreas.ziegler@fau.de)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..common.utils import struct_parse
class HashSection(object):
""" Minimal part of an ELF hash section to find the number of symbols in the
symbol table - useful for super-stripped binaries without section
headers where only the start of the symbol table is known from the
dynamic segment. The layout and contents are nicely described at
https://flapenguin.me/2017/04/24/elf-lookup-dt-hash/.
"""
def __init__(self, stream, offset, elffile):
self._stream = stream
self._offset = offset
self._elffile = elffile
self.params = struct_parse(self._elffile.structs.Elf_Hash,
self._stream,
self._offset)
def get_number_of_symbols(self):
""" Get the number of symbols from the hash table parameters.
"""
return self.params['nchains']
class GNUHashSection(object):
""" Minimal part of a GNU hash section to find the number of symbols in the
symbol table - useful for super-stripped binaries without section
headers where only the start of the symbol table is known from the
dynamic segment. The layout and contents are nicely described at
https://flapenguin.me/2017/05/10/elf-lookup-dt-gnu-hash/.
"""
def __init__(self, stream, offset, elffile):
self._stream = stream
self._offset = offset
self._elffile = elffile
self.params = struct_parse(self._elffile.structs.Gnu_Hash,
self._stream,
self._offset)
def get_number_of_symbols(self):
""" Get the number of symbols in the hash table by finding the bucket
with the highest symbol index and walking to the end of its chain.
"""
# Element sizes in the hash table
wordsize = self._elffile.structs.Elf_word('').sizeof()
xwordsize = self._elffile.structs.Elf_xword('').sizeof()
# Find highest index in buckets array
max_idx = max(self.params['buckets'])
if max_idx < self.params['symoffset']:
return self.params['symoffset']
# Position the stream at the start of the corresponding chain
chain_pos = self._offset + 4 * wordsize + \
self.params['bloom_size'] * xwordsize + \
self.params['nbuckets'] * wordsize + \
(max_idx - self.params['symoffset']) * wordsize
# Walk the chain to its end (lowest bit is set)
while True:
cur_hash = struct_parse(self._elffile.structs.Elf_word('elem'),
self._stream,
chain_pos)
if cur_hash & 1:
return max_idx + 1
max_idx += 1
chain_pos += wordsize
pyelftools-0.26/elftools/elf/notes.py 0000664 0000000 0000000 00000003776 13572204573 0017765 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/notes.py
#
# ELF notes
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..common.py3compat import bytes2str
from ..common.utils import struct_parse, roundup
from ..construct import CString
def iter_notes(elffile, offset, size):
""" Yield all the notes in a section or segment.
"""
end = offset + size
while offset < end:
note = struct_parse(
elffile.structs.Elf_Nhdr,
elffile.stream,
stream_pos=offset)
note['n_offset'] = offset
offset += elffile.structs.Elf_Nhdr.sizeof()
elffile.stream.seek(offset)
# n_namesz is 4-byte aligned.
disk_namesz = roundup(note['n_namesz'], 2)
note['n_name'] = bytes2str(
CString('').parse(elffile.stream.read(disk_namesz)))
offset += disk_namesz
desc_data = bytes2str(elffile.stream.read(note['n_descsz']))
if note['n_type'] == 'NT_GNU_ABI_TAG':
note['n_desc'] = struct_parse(elffile.structs.Elf_abi,
elffile.stream,
offset)
elif note['n_type'] == 'NT_GNU_BUILD_ID':
note['n_desc'] = ''.join('%.2x' % ord(b) for b in desc_data)
elif note['n_type'] == 'NT_PRPSINFO':
note['n_desc'] = struct_parse(elffile.structs.Elf_Prpsinfo,
elffile.stream,
offset)
elif note['n_type'] == 'NT_FILE':
note['n_desc'] = struct_parse(elffile.structs.Elf_Nt_File,
elffile.stream,
offset)
else:
note['n_desc'] = desc_data
offset += roundup(note['n_descsz'], 2)
note['n_size'] = offset - note['n_offset']
yield note
pyelftools-0.26/elftools/elf/relocation.py 0000664 0000000 0000000 00000026401 13572204573 0020762 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/relocation.py
#
# ELF relocations
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from collections import namedtuple
from ..common.exceptions import ELFRelocationError
from ..common.utils import elf_assert, struct_parse
from .sections import Section
from .enums import (
ENUM_RELOC_TYPE_i386, ENUM_RELOC_TYPE_x64, ENUM_RELOC_TYPE_MIPS, ENUM_RELOC_TYPE_ARM, ENUM_D_TAG)
class Relocation(object):
""" Relocation object - representing a single relocation entry. Allows
dictionary-like access to the entry's fields.
Can be either a REL or RELA relocation.
"""
def __init__(self, entry, elffile):
self.entry = entry
self.elffile = elffile
def is_RELA(self):
""" Is this a RELA relocation? If not, it's REL.
"""
return 'r_addend' in self.entry
def __getitem__(self, name):
""" Dict-like access to entries
"""
return self.entry[name]
def __repr__(self):
return '' % (
'RELA' if self.is_RELA() else 'REL',
self.entry)
def __str__(self):
return self.__repr__()
class RelocationTable(object):
""" Shared functionality between relocation sections and relocation tables
"""
def __init__(self, elffile, offset, size, is_rela):
self._stream = elffile.stream
self._elffile = elffile
self._elfstructs = elffile.structs
self._size = size
self._offset = offset
self._is_rela = is_rela
if is_rela:
self.entry_struct = self._elfstructs.Elf_Rela
else:
self.entry_struct = self._elfstructs.Elf_Rel
self.entry_size = self.entry_struct.sizeof()
def is_RELA(self):
""" Is this a RELA relocation section? If not, it's REL.
"""
return self._is_rela
def num_relocations(self):
""" Number of relocations in the section
"""
return self._size // self.entry_size
def get_relocation(self, n):
""" Get the relocation at index #n from the section (Relocation object)
"""
entry_offset = self._offset + n * self.entry_size
entry = struct_parse(
self.entry_struct,
self._stream,
stream_pos=entry_offset)
return Relocation(entry, self._elffile)
def iter_relocations(self):
""" Yield all the relocations in the section
"""
for i in range(self.num_relocations()):
yield self.get_relocation(i)
class RelocationSection(Section, RelocationTable):
""" ELF relocation section. Serves as a collection of Relocation entries.
"""
def __init__(self, header, name, elffile):
Section.__init__(self, header, name, elffile)
RelocationTable.__init__(self, self.elffile,
self['sh_offset'], self['sh_size'], header['sh_type'] == 'SHT_RELA')
elf_assert(header['sh_type'] in ('SHT_REL', 'SHT_RELA'),
'Unknown relocation type section')
elf_assert(header['sh_entsize'] == self.entry_size,
'Expected sh_entsize of %s section to be %s' % (
header['sh_type'], self.entry_size))
class RelocationHandler(object):
""" Handles the logic of relocations in ELF files.
"""
def __init__(self, elffile):
self.elffile = elffile
def find_relocations_for_section(self, section):
""" Given a section, find the relocation section for it in the ELF
file. Return a RelocationSection object, or None if none was
found.
"""
reloc_section_names = (
'.rel' + section.name,
'.rela' + section.name)
# Find the relocation section aimed at this one. Currently assume
# that either .rel or .rela section exists for this section, but
# not both.
for relsection in self.elffile.iter_sections():
if ( isinstance(relsection, RelocationSection) and
relsection.name in reloc_section_names):
return relsection
return None
def apply_section_relocations(self, stream, reloc_section):
""" Apply all relocations in reloc_section (a RelocationSection object)
to the given stream, that contains the data of the section that is
being relocated. The stream is modified as a result.
"""
# The symbol table associated with this relocation section
symtab = self.elffile.get_section(reloc_section['sh_link'])
for reloc in reloc_section.iter_relocations():
self._do_apply_relocation(stream, reloc, symtab)
def _do_apply_relocation(self, stream, reloc, symtab):
# Preparations for performing the relocation: obtain the value of
# the symbol mentioned in the relocation, as well as the relocation
# recipe which tells us how to actually perform it.
# All peppered with some sanity checking.
if reloc['r_info_sym'] >= symtab.num_symbols():
raise ELFRelocationError(
'Invalid symbol reference in relocation: index %s' % (
reloc['r_info_sym']))
sym_value = symtab.get_symbol(reloc['r_info_sym'])['st_value']
reloc_type = reloc['r_info_type']
recipe = None
if self.elffile.get_machine_arch() == 'x86':
if reloc.is_RELA():
raise ELFRelocationError(
'Unexpected RELA relocation for x86: %s' % reloc)
recipe = self._RELOCATION_RECIPES_X86.get(reloc_type, None)
elif self.elffile.get_machine_arch() == 'x64':
if not reloc.is_RELA():
raise ELFRelocationError(
'Unexpected REL relocation for x64: %s' % reloc)
recipe = self._RELOCATION_RECIPES_X64.get(reloc_type, None)
elif self.elffile.get_machine_arch() == 'MIPS':
if reloc.is_RELA():
raise ELFRelocationError(
'Unexpected RELA relocation for MIPS: %s' % reloc)
recipe = self._RELOCATION_RECIPES_MIPS.get(reloc_type, None)
elif self.elffile.get_machine_arch() == 'ARM':
if reloc.is_RELA():
raise ELFRelocationError(
'Unexpected RELA relocation for ARM: %s' % reloc)
recipe = self._RELOCATION_RECIPES_ARM.get(reloc_type, None)
if recipe is None:
raise ELFRelocationError(
'Unsupported relocation type: %s' % reloc_type)
# So now we have everything we need to actually perform the relocation.
# Let's get to it:
# 0. Find out which struct we're going to be using to read this value
# from the stream and write it back.
if recipe.bytesize == 4:
value_struct = self.elffile.structs.Elf_word('')
elif recipe.bytesize == 8:
value_struct = self.elffile.structs.Elf_word64('')
else:
raise ELFRelocationError('Invalid bytesize %s for relocation' %
recipe.bytesize)
# 1. Read the value from the stream (with correct size and endianness)
original_value = struct_parse(
value_struct,
stream,
stream_pos=reloc['r_offset'])
# 2. Apply the relocation to the value, acting according to the recipe
relocated_value = recipe.calc_func(
value=original_value,
sym_value=sym_value,
offset=reloc['r_offset'],
addend=reloc['r_addend'] if recipe.has_addend else 0)
# 3. Write the relocated value back into the stream
stream.seek(reloc['r_offset'])
# Make sure the relocated value fits back by wrapping it around. This
# looks like a problem, but it seems to be the way this is done in
# binutils too.
relocated_value = relocated_value % (2 ** (recipe.bytesize * 8))
value_struct.build_stream(relocated_value, stream)
# Relocations are represented by "recipes". Each recipe specifies:
# bytesize: The number of bytes to read (and write back) to the section.
# This is the unit of data on which relocation is performed.
# has_addend: Does this relocation have an extra addend?
# calc_func: A function that performs the relocation on an extracted
# value, and returns the updated value.
#
_RELOCATION_RECIPE_TYPE = namedtuple('_RELOCATION_RECIPE_TYPE',
'bytesize has_addend calc_func')
def _reloc_calc_identity(value, sym_value, offset, addend=0):
return value
def _reloc_calc_sym_plus_value(value, sym_value, offset, addend=0):
return sym_value + value
def _reloc_calc_sym_plus_value_pcrel(value, sym_value, offset, addend=0):
return sym_value + value - offset
def _reloc_calc_sym_plus_addend(value, sym_value, offset, addend=0):
return sym_value + addend
def _reloc_calc_sym_plus_addend_pcrel(value, sym_value, offset, addend=0):
return sym_value + addend - offset
def _arm_reloc_calc_sym_plus_value_pcrel(value, sym_value, offset, addend=0):
return sym_value // 4 + value - offset // 4
_RELOCATION_RECIPES_ARM = {
ENUM_RELOC_TYPE_ARM['R_ARM_ABS32']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=False,
calc_func=_reloc_calc_sym_plus_value),
ENUM_RELOC_TYPE_ARM['R_ARM_CALL']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=False,
calc_func=_arm_reloc_calc_sym_plus_value_pcrel),
}
# https://dmz-portal.mips.com/wiki/MIPS_relocation_types
_RELOCATION_RECIPES_MIPS = {
ENUM_RELOC_TYPE_MIPS['R_MIPS_NONE']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=False, calc_func=_reloc_calc_identity),
ENUM_RELOC_TYPE_MIPS['R_MIPS_32']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=False,
calc_func=_reloc_calc_sym_plus_value),
}
_RELOCATION_RECIPES_X86 = {
ENUM_RELOC_TYPE_i386['R_386_NONE']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=False, calc_func=_reloc_calc_identity),
ENUM_RELOC_TYPE_i386['R_386_32']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=False,
calc_func=_reloc_calc_sym_plus_value),
ENUM_RELOC_TYPE_i386['R_386_PC32']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=False,
calc_func=_reloc_calc_sym_plus_value_pcrel),
}
_RELOCATION_RECIPES_X64 = {
ENUM_RELOC_TYPE_x64['R_X86_64_NONE']: _RELOCATION_RECIPE_TYPE(
bytesize=8, has_addend=True, calc_func=_reloc_calc_identity),
ENUM_RELOC_TYPE_x64['R_X86_64_64']: _RELOCATION_RECIPE_TYPE(
bytesize=8, has_addend=True, calc_func=_reloc_calc_sym_plus_addend),
ENUM_RELOC_TYPE_x64['R_X86_64_PC32']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=True,
calc_func=_reloc_calc_sym_plus_addend_pcrel),
ENUM_RELOC_TYPE_x64['R_X86_64_32']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=True, calc_func=_reloc_calc_sym_plus_addend),
ENUM_RELOC_TYPE_x64['R_X86_64_32S']: _RELOCATION_RECIPE_TYPE(
bytesize=4, has_addend=True, calc_func=_reloc_calc_sym_plus_addend),
}
pyelftools-0.26/elftools/elf/sections.py 0000664 0000000 0000000 00000040146 13572204573 0020454 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/sections.py
#
# ELF sections
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..common.exceptions import ELFCompressionError
from ..common.utils import struct_parse, elf_assert, parse_cstring_from_stream
from collections import defaultdict
from .constants import SH_FLAGS
from .notes import iter_notes
import zlib
class Section(object):
""" Base class for ELF sections. Also used for all sections types that have
no special functionality.
Allows dictionary-like access to the section header. For example:
> sec = Section(...)
> sec['sh_type'] # section type
"""
def __init__(self, header, name, elffile):
self.header = header
self.name = name
self.elffile = elffile
self.stream = self.elffile.stream
self.structs = self.elffile.structs
self._compressed = header['sh_flags'] & SH_FLAGS.SHF_COMPRESSED
if self.compressed:
# Read the compression header now to know about the size/alignment
# of the decompressed data.
header = struct_parse(self.structs.Elf_Chdr,
self.stream,
stream_pos=self['sh_offset'])
self._compression_type = header['ch_type']
self._decompressed_size = header['ch_size']
self._decompressed_align = header['ch_addralign']
else:
self._decompressed_size = header['sh_size']
self._decompressed_align = header['sh_addralign']
@property
def compressed(self):
""" Is this section compressed?
"""
return self._compressed
@property
def data_size(self):
""" Return the logical size for this section's data.
This can be different from the .sh_size header field when the section
is compressed.
"""
return self._decompressed_size
@property
def data_alignment(self):
""" Return the logical alignment for this section's data.
This can be different from the .sh_addralign header field when the
section is compressed.
"""
return self._decompressed_align
def data(self):
""" The section data from the file.
Note that data is decompressed if the stored section data is
compressed.
"""
# If this section is compressed, deflate it
if self.compressed:
c_type = self._compression_type
if c_type == 'ELFCOMPRESS_ZLIB':
# Read the data to decompress starting right after the
# compression header until the end of the section.
hdr_size = self.structs.Elf_Chdr.sizeof()
self.stream.seek(self['sh_offset'] + hdr_size)
compressed = self.stream.read(self['sh_size'] - hdr_size)
decomp = zlib.decompressobj()
result = decomp.decompress(compressed, self.data_size)
else:
raise ELFCompressionError(
'Unknown compression type: {:#0x}'.format(c_type)
)
if len(result) != self._decompressed_size:
raise ELFCompressionError(
'Decompressed data is {} bytes long, should be {} bytes'
' long'.format(len(result), self._decompressed_size)
)
else:
self.stream.seek(self['sh_offset'])
result = self.stream.read(self._decompressed_size)
return result
def is_null(self):
""" Is this a null section?
"""
return False
def __getitem__(self, name):
""" Implement dict-like access to header entries
"""
return self.header[name]
def __eq__(self, other):
try:
return self.header == other.header
except AttributeError:
return False
def __hash__(self):
return hash(self.header)
class NullSection(Section):
""" ELF NULL section
"""
def is_null(self):
return True
class StringTableSection(Section):
""" ELF string table section.
"""
def get_string(self, offset):
""" Get the string stored at the given offset in this string table.
"""
table_offset = self['sh_offset']
s = parse_cstring_from_stream(self.stream, table_offset + offset)
return s.decode('utf-8') if s else ''
class SymbolTableSection(Section):
""" ELF symbol table section. Has an associated StringTableSection that's
passed in the constructor.
"""
def __init__(self, header, name, elffile, stringtable):
super(SymbolTableSection, self).__init__(header, name, elffile)
self.stringtable = stringtable
elf_assert(self['sh_entsize'] > 0,
'Expected entry size of section %r to be > 0' % name)
elf_assert(self['sh_size'] % self['sh_entsize'] == 0,
'Expected section size to be a multiple of entry size in section %r' % name)
self._symbol_name_map = None
def num_symbols(self):
""" Number of symbols in the table
"""
return self['sh_size'] // self['sh_entsize']
def get_symbol(self, n):
""" Get the symbol at index #n from the table (Symbol object)
"""
# Grab the symbol's entry from the stream
entry_offset = self['sh_offset'] + n * self['sh_entsize']
entry = struct_parse(
self.structs.Elf_Sym,
self.stream,
stream_pos=entry_offset)
# Find the symbol name in the associated string table
name = self.stringtable.get_string(entry['st_name'])
return Symbol(entry, name)
def get_symbol_by_name(self, name):
""" Get a symbol(s) by name. Return None if no symbol by the given name
exists.
"""
# The first time this method is called, construct a name to number
# mapping
#
if self._symbol_name_map is None:
self._symbol_name_map = defaultdict(list)
for i, sym in enumerate(self.iter_symbols()):
self._symbol_name_map[sym.name].append(i)
symnums = self._symbol_name_map.get(name)
return [self.get_symbol(i) for i in symnums] if symnums else None
def iter_symbols(self):
""" Yield all the symbols in the table
"""
for i in range(self.num_symbols()):
yield self.get_symbol(i)
class Symbol(object):
""" Symbol object - representing a single symbol entry from a symbol table
section.
Similarly to Section objects, allows dictionary-like access to the
symbol entry.
"""
def __init__(self, entry, name):
self.entry = entry
self.name = name
def __getitem__(self, name):
""" Implement dict-like access to entries
"""
return self.entry[name]
class SUNWSyminfoTableSection(Section):
""" ELF .SUNW Syminfo table section.
Has an associated SymbolTableSection that's passed in the constructor.
"""
def __init__(self, header, name, elffile, symboltable):
super(SUNWSyminfoTableSection, self).__init__(header, name, elffile)
self.symboltable = symboltable
def num_symbols(self):
""" Number of symbols in the table
"""
return self['sh_size'] // self['sh_entsize'] - 1
def get_symbol(self, n):
""" Get the symbol at index #n from the table (Symbol object).
It begins at 1 and not 0 since the first entry is used to
store the current version of the syminfo table.
"""
# Grab the symbol's entry from the stream
entry_offset = self['sh_offset'] + n * self['sh_entsize']
entry = struct_parse(
self.structs.Elf_Sunw_Syminfo,
self.stream,
stream_pos=entry_offset)
# Find the symbol name in the associated symbol table
name = self.symboltable.get_symbol(n).name
return Symbol(entry, name)
def iter_symbols(self):
""" Yield all the symbols in the table
"""
for i in range(1, self.num_symbols() + 1):
yield self.get_symbol(i)
class NoteSection(Section):
""" ELF NOTE section. Knows how to parse notes.
"""
def iter_notes(self):
""" Yield all the notes in the section. Each result is a dictionary-
like object with "n_name", "n_type", and "n_desc" fields, amongst
others.
"""
return iter_notes(self.elffile, self['sh_offset'], self['sh_size'])
class StabSection(Section):
""" ELF stab section.
"""
def iter_stabs(self):
""" Yield all stab entries. Result type is ELFStructs.Elf_Stabs.
"""
offset = self['sh_offset']
size = self['sh_size']
end = offset + size
while offset < end:
stabs = struct_parse(
self.structs.Elf_Stabs,
self.elffile.stream,
stream_pos=offset)
stabs['n_offset'] = offset
offset += self.structs.Elf_Stabs.sizeof()
self.stream.seek(offset)
yield stabs
class ARMAttribute(object):
""" ARM attribute object - representing a build attribute of ARM ELF files.
"""
def __init__(self, structs, stream):
self._tag = struct_parse(structs.Elf_Attribute_Tag, stream)
self.extra = None
if self.tag in ('TAG_FILE', 'TAG_SECTION', 'TAG_SYMBOL'):
self.value = struct_parse(structs.Elf_word('value'), stream)
if self.tag != 'TAG_FILE':
self.extra = []
s_number = struct_parse(structs.Elf_uleb128('s_number'), stream)
while s_number != 0:
self.extra.append(s_number)
s_number = struct_parse(structs.Elf_uleb128('s_number'),
stream
)
elif self.tag in ('TAG_CPU_RAW_NAME', 'TAG_CPU_NAME', 'TAG_CONFORMANCE'):
self.value = struct_parse(structs.Elf_ntbs('value',
encoding='utf-8'),
stream)
elif self.tag == 'TAG_COMPATIBILITY':
self.value = struct_parse(structs.Elf_uleb128('value'), stream)
self.extra = struct_parse(structs.Elf_ntbs('vendor_name',
encoding='utf-8'),
stream)
elif self.tag == 'TAG_ALSO_COMPATIBLE_WITH':
self.value = ARMAttribute(structs, stream)
if type(self.value.value) is not str:
nul = struct_parse(structs.Elf_byte('nul'), stream)
elf_assert(nul == 0,
"Invalid terminating byte %r, expecting NUL." % nul)
else:
self.value = struct_parse(structs.Elf_uleb128('value'), stream)
@property
def tag(self):
return self._tag['tag']
def __repr__(self):
s = '' % (self.tag, self.value)
s += ' %s' % self.extra if self.extra is not None else ''
return s
class ARMAttributesSubsubsection(object):
""" Subsubsection of an ELF .ARM.attributes section's subsection.
"""
def __init__(self, stream, structs, offset):
self.stream = stream
self.offset = offset
self.structs = structs
self.header = ARMAttribute(self.structs, self.stream)
self.attr_start = self.stream.tell()
def iter_attributes(self, tag=None):
""" Yield all attributes (limit to |tag| if specified).
"""
for attribute in self._make_attributes():
if tag is None or attribute.tag == tag:
yield attribute
@property
def num_attributes(self):
""" Number of attributes in the subsubsection.
"""
return sum(1 for _ in self.iter_attributes()) + 1
@property
def attributes(self):
""" List of all attributes in the subsubsection.
"""
return [self.header] + list(self.iter_attributes())
def _make_attributes(self):
""" Create all attributes for this subsubsection except the first one
which is the header.
"""
end = self.offset + self.header.value
self.stream.seek(self.attr_start)
while self.stream.tell() != end:
yield ARMAttribute(self.structs, self.stream)
def __repr__(self):
s = ""
return s % (self.header.tag[4:], self.header.value)
class ARMAttributesSubsection(object):
""" Subsection of an ELF .ARM.attributes section.
"""
def __init__(self, stream, structs, offset):
self.stream = stream
self.offset = offset
self.structs = structs
self.header = struct_parse(self.structs.Elf_Attr_Subsection_Header,
self.stream,
self.offset
)
self.subsubsec_start = self.stream.tell()
def iter_subsubsections(self, scope=None):
""" Yield all subsubsections (limit to |scope| if specified).
"""
for subsubsec in self._make_subsubsections():
if scope is None or subsubsec.header.tag == scope:
yield subsubsec
@property
def num_subsubsections(self):
""" Number of subsubsections in the subsection.
"""
return sum(1 for _ in self.iter_subsubsections())
@property
def subsubsections(self):
""" List of all subsubsections in the subsection.
"""
return list(self.iter_subsubsections())
def _make_subsubsections(self):
""" Create all subsubsections for this subsection.
"""
end = self.offset + self['length']
self.stream.seek(self.subsubsec_start)
while self.stream.tell() != end:
subsubsec = ARMAttributesSubsubsection(self.stream,
self.structs,
self.stream.tell())
self.stream.seek(self.subsubsec_start + subsubsec.header.value)
yield subsubsec
def __getitem__(self, name):
""" Implement dict-like access to header entries.
"""
return self.header[name]
def __repr__(self):
s = ""
return s % (self.header['vendor_name'], self.header['length'])
class ARMAttributesSection(Section):
""" ELF .ARM.attributes section.
"""
def __init__(self, header, name, elffile):
super(ARMAttributesSection, self).__init__(header, name, elffile)
fv = struct_parse(self.structs.Elf_byte('format_version'),
self.stream,
self['sh_offset']
)
elf_assert(chr(fv) == 'A',
"Unknown attributes version %s, expecting 'A'." % chr(fv)
)
self.subsec_start = self.stream.tell()
def iter_subsections(self, vendor_name=None):
""" Yield all subsections (limit to |vendor_name| if specified).
"""
for subsec in self._make_subsections():
if vendor_name is None or subsec['vendor_name'] == vendor_name:
yield subsec
@property
def num_subsections(self):
""" Number of subsections in the section.
"""
return sum(1 for _ in self.iter_subsections())
@property
def subsections(self):
""" List of all subsections in the section.
"""
return list(self.iter_subsections())
def _make_subsections(self):
""" Create all subsections for this section.
"""
end = self['sh_offset'] + self.data_size
self.stream.seek(self.subsec_start)
while self.stream.tell() != end:
subsec = ARMAttributesSubsection(self.stream,
self.structs,
self.stream.tell())
self.stream.seek(self.subsec_start + subsec['length'])
yield subsec
pyelftools-0.26/elftools/elf/segments.py 0000664 0000000 0000000 00000010046 13572204573 0020446 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/segments.py
#
# ELF segments
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..construct import CString
from ..common.utils import struct_parse
from .constants import SH_FLAGS
from .notes import iter_notes
class Segment(object):
def __init__(self, header, stream):
self.header = header
self.stream = stream
def data(self):
""" The segment data from the file.
"""
self.stream.seek(self['p_offset'])
return self.stream.read(self['p_filesz'])
def __getitem__(self, name):
""" Implement dict-like access to header entries
"""
return self.header[name]
def section_in_segment(self, section):
""" Is the given section contained in this segment?
Note: this tries to reproduce the intricate rules of the
ELF_SECTION_IN_SEGMENT_STRICT macro of the header
elf/include/internal.h in the source of binutils.
"""
# Only the 'strict' checks from ELF_SECTION_IN_SEGMENT_1 are included
segtype = self['p_type']
sectype = section['sh_type']
secflags = section['sh_flags']
# Only PT_LOAD, PT_GNU_RELR0 and PT_TLS segments can contain SHF_TLS
# sections
if ( secflags & SH_FLAGS.SHF_TLS and
segtype in ('PT_TLS', 'PT_GNU_RELR0', 'PT_LOAD')):
return False
# PT_TLS segment contains only SHF_TLS sections, PT_PHDR no sections
# at all
elif ( (secflags & SH_FLAGS.SHF_TLS) != 0 and
segtype not in ('PT_TLS', 'PT_PHDR')):
return False
# In ELF_SECTION_IN_SEGMENT_STRICT the flag check_vma is on, so if
# this is an alloc section, check whether its VMA is in bounds.
if secflags & SH_FLAGS.SHF_ALLOC:
secaddr = section['sh_addr']
vaddr = self['p_vaddr']
# This checks that the section is wholly contained in the segment.
# The third condition is the 'strict' one - an empty section will
# not match at the very end of the segment (unless the segment is
# also zero size, which is handled by the second condition).
if not (secaddr >= vaddr and
secaddr - vaddr + section['sh_size'] <= self['p_memsz'] and
secaddr - vaddr <= self['p_memsz'] - 1):
return False
# If we've come this far and it's a NOBITS section, it's in the segment
if sectype == 'SHT_NOBITS':
return True
secoffset = section['sh_offset']
poffset = self['p_offset']
# Same logic as with secaddr vs. vaddr checks above, just on offsets in
# the file
return (secoffset >= poffset and
secoffset - poffset + section['sh_size'] <= self['p_filesz'] and
secoffset - poffset <= self['p_filesz'] - 1)
class InterpSegment(Segment):
""" INTERP segment. Knows how to obtain the path to the interpreter used
for this ELF file.
"""
def __init__(self, header, stream):
super(InterpSegment, self).__init__(header, stream)
def get_interp_name(self):
""" Obtain the interpreter path used for this ELF file.
"""
path_offset = self['p_offset']
return struct_parse(
CString('', encoding='utf-8'),
self.stream,
stream_pos=path_offset)
class NoteSegment(Segment):
""" NOTE segment. Knows how to parse notes.
"""
def __init__(self, header, stream, elffile):
super(NoteSegment, self).__init__(header, stream)
self.elffile = elffile
def iter_notes(self):
""" Yield all the notes in the segment. Each result is a dictionary-
like object with "n_name", "n_type", and "n_desc" fields, amongst
others.
"""
return iter_notes(self.elffile, self['p_offset'], self['p_filesz'])
pyelftools-0.26/elftools/elf/structs.py 0000664 0000000 0000000 00000042775 13572204573 0020346 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools: elf/structs.py
#
# Encapsulation of Construct structs for parsing an ELF file, adjusted for
# correct endianness and word-size.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from ..construct import (
UBInt8, UBInt16, UBInt32, UBInt64,
ULInt8, ULInt16, ULInt32, ULInt64,
SBInt32, SLInt32, SBInt64, SLInt64,
Struct, Array, Enum, Padding, BitStruct, BitField, Value, String, CString
)
from ..common.construct_utils import ULEB128
from .enums import *
class ELFStructs(object):
""" Accessible attributes:
Elf_{byte|half|word|word64|addr|offset|sword|xword|xsword}:
Data chunks, as specified by the ELF standard, adjusted for
correct endianness and word-size.
Elf_Ehdr:
ELF file header
Elf_Phdr:
Program header
Elf_Shdr:
Section header
Elf_Sym:
Symbol table entry
Elf_Rel, Elf_Rela:
Entries in relocation sections
"""
def __init__(self, little_endian=True, elfclass=32):
assert elfclass == 32 or elfclass == 64
self.little_endian = little_endian
self.elfclass = elfclass
def create_basic_structs(self):
""" Create word-size related structs and ehdr struct needed for
initial determining of ELF type.
"""
if self.little_endian:
self.Elf_byte = ULInt8
self.Elf_half = ULInt16
self.Elf_word = ULInt32
self.Elf_word64 = ULInt64
self.Elf_addr = ULInt32 if self.elfclass == 32 else ULInt64
self.Elf_offset = self.Elf_addr
self.Elf_sword = SLInt32
self.Elf_xword = ULInt32 if self.elfclass == 32 else ULInt64
self.Elf_sxword = SLInt32 if self.elfclass == 32 else SLInt64
else:
self.Elf_byte = UBInt8
self.Elf_half = UBInt16
self.Elf_word = UBInt32
self.Elf_word64 = UBInt64
self.Elf_addr = UBInt32 if self.elfclass == 32 else UBInt64
self.Elf_offset = self.Elf_addr
self.Elf_sword = SBInt32
self.Elf_xword = UBInt32 if self.elfclass == 32 else UBInt64
self.Elf_sxword = SBInt32 if self.elfclass == 32 else SBInt64
self._create_ehdr()
self._create_leb128()
self._create_ntbs()
def create_advanced_structs(self, e_type=None, e_machine=None, e_ident_osabi=None):
""" Create all ELF structs except the ehdr. They may possibly depend
on provided e_type and/or e_machine parsed from ehdr.
"""
self._create_phdr(e_machine)
self._create_shdr(e_machine)
self._create_chdr()
self._create_sym()
self._create_rel()
self._create_dyn(e_machine, e_ident_osabi)
self._create_sunw_syminfo()
self._create_gnu_verneed()
self._create_gnu_verdef()
self._create_gnu_versym()
self._create_gnu_abi()
self._create_note(e_type)
self._create_stabs()
self._create_arm_attributes()
self._create_elf_hash()
self._create_gnu_hash()
#-------------------------------- PRIVATE --------------------------------#
def _create_ehdr(self):
self.Elf_Ehdr = Struct('Elf_Ehdr',
Struct('e_ident',
Array(4, self.Elf_byte('EI_MAG')),
Enum(self.Elf_byte('EI_CLASS'), **ENUM_EI_CLASS),
Enum(self.Elf_byte('EI_DATA'), **ENUM_EI_DATA),
Enum(self.Elf_byte('EI_VERSION'), **ENUM_E_VERSION),
Enum(self.Elf_byte('EI_OSABI'), **ENUM_EI_OSABI),
self.Elf_byte('EI_ABIVERSION'),
Padding(7)
),
Enum(self.Elf_half('e_type'), **ENUM_E_TYPE),
Enum(self.Elf_half('e_machine'), **ENUM_E_MACHINE),
Enum(self.Elf_word('e_version'), **ENUM_E_VERSION),
self.Elf_addr('e_entry'),
self.Elf_offset('e_phoff'),
self.Elf_offset('e_shoff'),
self.Elf_word('e_flags'),
self.Elf_half('e_ehsize'),
self.Elf_half('e_phentsize'),
self.Elf_half('e_phnum'),
self.Elf_half('e_shentsize'),
self.Elf_half('e_shnum'),
self.Elf_half('e_shstrndx'),
)
def _create_leb128(self):
self.Elf_uleb128 = ULEB128
def _create_ntbs(self):
self.Elf_ntbs = CString
def _create_phdr(self, e_machine=None):
p_type_dict = ENUM_P_TYPE_BASE
if e_machine == 'EM_ARM':
p_type_dict = ENUM_P_TYPE_ARM
elif e_machine == 'EM_AARCH64':
p_type_dict = ENUM_P_TYPE_AARCH64
elif e_machine == 'EM_MIPS':
p_type_dict = ENUM_P_TYPE_MIPS
if self.elfclass == 32:
self.Elf_Phdr = Struct('Elf_Phdr',
Enum(self.Elf_word('p_type'), **p_type_dict),
self.Elf_offset('p_offset'),
self.Elf_addr('p_vaddr'),
self.Elf_addr('p_paddr'),
self.Elf_word('p_filesz'),
self.Elf_word('p_memsz'),
self.Elf_word('p_flags'),
self.Elf_word('p_align'),
)
else: # 64
self.Elf_Phdr = Struct('Elf_Phdr',
Enum(self.Elf_word('p_type'), **p_type_dict),
self.Elf_word('p_flags'),
self.Elf_offset('p_offset'),
self.Elf_addr('p_vaddr'),
self.Elf_addr('p_paddr'),
self.Elf_xword('p_filesz'),
self.Elf_xword('p_memsz'),
self.Elf_xword('p_align'),
)
def _create_shdr(self, e_machine=None):
"""Section header parsing.
Depends on e_machine because of machine-specific values in sh_type.
"""
sh_type_dict = ENUM_SH_TYPE_BASE
if e_machine == 'EM_ARM':
sh_type_dict = ENUM_SH_TYPE_ARM
elif e_machine == 'EM_X86_64':
sh_type_dict = ENUM_SH_TYPE_AMD64
elif e_machine == 'EM_MIPS':
sh_type_dict = ENUM_SH_TYPE_MIPS
self.Elf_Shdr = Struct('Elf_Shdr',
self.Elf_word('sh_name'),
Enum(self.Elf_word('sh_type'), **sh_type_dict),
self.Elf_xword('sh_flags'),
self.Elf_addr('sh_addr'),
self.Elf_offset('sh_offset'),
self.Elf_xword('sh_size'),
self.Elf_word('sh_link'),
self.Elf_word('sh_info'),
self.Elf_xword('sh_addralign'),
self.Elf_xword('sh_entsize'),
)
def _create_chdr(self):
# Structure of compressed sections header. It is documented in Oracle
# "Linker and Libraries Guide", Part IV ELF Application Binary
# Interface, Chapter 13 Object File Format, Section Compression:
# https://docs.oracle.com/cd/E53394_01/html/E54813/section_compression.html
fields = [
Enum(self.Elf_word('ch_type'), **ENUM_ELFCOMPRESS_TYPE),
self.Elf_xword('ch_size'),
self.Elf_xword('ch_addralign'),
]
if self.elfclass == 64:
fields.insert(1, self.Elf_word('ch_reserved'))
self.Elf_Chdr = Struct('Elf_Chdr', *fields)
def _create_rel(self):
# r_info is also taken apart into r_info_sym and r_info_type.
# This is done in Value to avoid endianity issues while parsing.
if self.elfclass == 32:
r_info_sym = Value('r_info_sym',
lambda ctx: (ctx['r_info'] >> 8) & 0xFFFFFF)
r_info_type = Value('r_info_type',
lambda ctx: ctx['r_info'] & 0xFF)
else: # 64
r_info_sym = Value('r_info_sym',
lambda ctx: (ctx['r_info'] >> 32) & 0xFFFFFFFF)
r_info_type = Value('r_info_type',
lambda ctx: ctx['r_info'] & 0xFFFFFFFF)
self.Elf_Rel = Struct('Elf_Rel',
self.Elf_addr('r_offset'),
self.Elf_xword('r_info'),
r_info_sym,
r_info_type,
)
self.Elf_Rela = Struct('Elf_Rela',
self.Elf_addr('r_offset'),
self.Elf_xword('r_info'),
r_info_sym,
r_info_type,
self.Elf_sxword('r_addend'),
)
def _create_dyn(self, e_machine=None, e_ident_osabi=None):
d_tag_dict = dict(ENUM_D_TAG_COMMON)
if e_machine in ENUMMAP_EXTRA_D_TAG_MACHINE:
d_tag_dict.update(ENUMMAP_EXTRA_D_TAG_MACHINE[e_machine])
elif e_ident_osabi == 'ELFOSABI_SOLARIS':
d_tag_dict.update(ENUM_D_TAG_SOLARIS)
self.Elf_Dyn = Struct('Elf_Dyn',
Enum(self.Elf_sxword('d_tag'), **d_tag_dict),
self.Elf_xword('d_val'),
Value('d_ptr', lambda ctx: ctx['d_val']),
)
def _create_sym(self):
# st_info is hierarchical. To access the type, use
# container['st_info']['type']
st_info_struct = BitStruct('st_info',
Enum(BitField('bind', 4), **ENUM_ST_INFO_BIND),
Enum(BitField('type', 4), **ENUM_ST_INFO_TYPE))
# st_other is hierarchical. To access the visibility,
# use container['st_other']['visibility']
st_other_struct = BitStruct('st_other',
Padding(5),
Enum(BitField('visibility', 3), **ENUM_ST_VISIBILITY))
if self.elfclass == 32:
self.Elf_Sym = Struct('Elf_Sym',
self.Elf_word('st_name'),
self.Elf_addr('st_value'),
self.Elf_word('st_size'),
st_info_struct,
st_other_struct,
Enum(self.Elf_half('st_shndx'), **ENUM_ST_SHNDX),
)
else:
self.Elf_Sym = Struct('Elf_Sym',
self.Elf_word('st_name'),
st_info_struct,
st_other_struct,
Enum(self.Elf_half('st_shndx'), **ENUM_ST_SHNDX),
self.Elf_addr('st_value'),
self.Elf_xword('st_size'),
)
def _create_sunw_syminfo(self):
self.Elf_Sunw_Syminfo = Struct('Elf_Sunw_Syminfo',
Enum(self.Elf_half('si_boundto'), **ENUM_SUNW_SYMINFO_BOUNDTO),
self.Elf_half('si_flags'),
)
def _create_gnu_verneed(self):
# Structure of "version needed" entries is documented in
# Oracle "Linker and Libraries Guide", Chapter 13 Object File Format
self.Elf_Verneed = Struct('Elf_Verneed',
self.Elf_half('vn_version'),
self.Elf_half('vn_cnt'),
self.Elf_word('vn_file'),
self.Elf_word('vn_aux'),
self.Elf_word('vn_next'),
)
self.Elf_Vernaux = Struct('Elf_Vernaux',
self.Elf_word('vna_hash'),
self.Elf_half('vna_flags'),
self.Elf_half('vna_other'),
self.Elf_word('vna_name'),
self.Elf_word('vna_next'),
)
def _create_gnu_verdef(self):
# Structure of "version definition" entries are documented in
# Oracle "Linker and Libraries Guide", Chapter 13 Object File Format
self.Elf_Verdef = Struct('Elf_Verdef',
self.Elf_half('vd_version'),
self.Elf_half('vd_flags'),
self.Elf_half('vd_ndx'),
self.Elf_half('vd_cnt'),
self.Elf_word('vd_hash'),
self.Elf_word('vd_aux'),
self.Elf_word('vd_next'),
)
self.Elf_Verdaux = Struct('Elf_Verdaux',
self.Elf_word('vda_name'),
self.Elf_word('vda_next'),
)
def _create_gnu_versym(self):
# Structure of "version symbol" entries are documented in
# Oracle "Linker and Libraries Guide", Chapter 13 Object File Format
self.Elf_Versym = Struct('Elf_Versym',
Enum(self.Elf_half('ndx'), **ENUM_VERSYM),
)
def _create_gnu_abi(self):
# Structure of GNU ABI notes is documented in
# https://code.woboq.org/userspace/glibc/csu/abi-note.S.html
self.Elf_abi = Struct('Elf_abi',
Enum(self.Elf_word('abi_os'), **ENUM_NOTE_ABI_TAG_OS),
self.Elf_word('abi_major'),
self.Elf_word('abi_minor'),
self.Elf_word('abi_tiny'),
)
def _create_note(self, e_type=None):
# Structure of "PT_NOTE" section
self.Elf_Nhdr = Struct('Elf_Nhdr',
self.Elf_word('n_namesz'),
self.Elf_word('n_descsz'),
Enum(self.Elf_word('n_type'),
**(ENUM_NOTE_N_TYPE if e_type != "ET_CORE"
else ENUM_CORE_NOTE_N_TYPE)),
)
# A process psinfo structure according to
# http://elixir.free-electrons.com/linux/v2.6.35/source/include/linux/elfcore.h#L84
if self.elfclass == 32:
self.Elf_Prpsinfo = Struct('Elf_Prpsinfo',
self.Elf_byte('pr_state'),
String('pr_sname', 1),
self.Elf_byte('pr_zomb'),
self.Elf_byte('pr_nice'),
self.Elf_xword('pr_flag'),
self.Elf_half('pr_uid'),
self.Elf_half('pr_gid'),
self.Elf_half('pr_pid'),
self.Elf_half('pr_ppid'),
self.Elf_half('pr_pgrp'),
self.Elf_half('pr_sid'),
String('pr_fname', 16),
String('pr_psargs', 80),
)
else: # 64
self.Elf_Prpsinfo = Struct('Elf_Prpsinfo',
self.Elf_byte('pr_state'),
String('pr_sname', 1),
self.Elf_byte('pr_zomb'),
self.Elf_byte('pr_nice'),
Padding(4),
self.Elf_xword('pr_flag'),
self.Elf_word('pr_uid'),
self.Elf_word('pr_gid'),
self.Elf_word('pr_pid'),
self.Elf_word('pr_ppid'),
self.Elf_word('pr_pgrp'),
self.Elf_word('pr_sid'),
String('pr_fname', 16),
String('pr_psargs', 80),
)
# A PT_NOTE of type NT_FILE matching the definition in
# https://chromium.googlesource.com/
# native_client/nacl-binutils/+/upstream/master/binutils/readelf.c
# Line 15121
self.Elf_Nt_File = Struct('Elf_Nt_File',
self.Elf_xword("num_map_entries"),
self.Elf_xword("page_size"),
Array(lambda ctx: ctx.num_map_entries,
Struct('Elf_Nt_File_Entry',
self.Elf_addr('vm_start'),
self.Elf_addr('vm_end'),
self.Elf_offset('page_offset'))),
Array(lambda ctx: ctx.num_map_entries,
CString('filename')))
def _create_stabs(self):
# Structure of one stabs entry, see binutils/bfd/stabs.c
# Names taken from https://sourceware.org/gdb/current/onlinedocs/stabs.html#Overview
self.Elf_Stabs = Struct('Elf_Stabs',
self.Elf_word('n_strx'),
self.Elf_byte('n_type'),
self.Elf_byte('n_other'),
self.Elf_half('n_desc'),
self.Elf_word('n_value'),
)
def _create_arm_attributes(self):
# Structure of a build attributes subsection header. A subsection is
# either public to all tools that process the ELF file or private to
# the vendor's tools.
self.Elf_Attr_Subsection_Header = Struct('Elf_Attr_Subsection',
self.Elf_word('length'),
self.Elf_ntbs('vendor_name',
encoding='utf-8')
)
# Structure of a build attribute tag.
self.Elf_Attribute_Tag = Struct('Elf_Attribute_Tag',
Enum(self.Elf_uleb128('tag'),
**ENUM_ATTR_TAG_ARM)
)
def _create_elf_hash(self):
# Structure of the old SYSV-style hash table header. It is documented
# in the Oracle "Linker and Libraries Guide", Part IV ELF Application
# Binary Interface, Chapter 14 Object File Format, Section Hash Table
# Section:
# https://docs.oracle.com/cd/E53394_01/html/E54813/chapter6-48031.html
self.Elf_Hash = Struct('Elf_Hash',
self.Elf_word('nbuckets'),
self.Elf_word('nchains'),
Array(lambda ctx: ctx['nbuckets'], self.Elf_word('buckets')),
Array(lambda ctx: ctx['nchains'], self.Elf_word('chains')))
def _create_gnu_hash(self):
# Structure of the GNU-style hash table header. Documentation for this
# table is mostly in the GLIBC source code, a good explanation of the
# format can be found in this blog post:
# https://flapenguin.me/2017/05/10/elf-lookup-dt-gnu-hash/
self.Gnu_Hash = Struct('Gnu_Hash',
self.Elf_word('nbuckets'),
self.Elf_word('symoffset'),
self.Elf_word('bloom_size'),
self.Elf_word('bloom_shift'),
Array(lambda ctx: ctx['bloom_size'], self.Elf_xword('bloom')),
Array(lambda ctx: ctx['nbuckets'], self.Elf_word('buckets')))
pyelftools-0.26/examples/ 0000775 0000000 0000000 00000000000 13572204573 0015467 5 ustar 00root root 0000000 0000000 pyelftools-0.26/examples/dwarf_decode_address.py 0000664 0000000 0000000 00000010603 13572204573 0022154 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: dwarf_decode_address.py
#
# Decode an address in an ELF file to find out which function it belongs to
# and from which filename/line it comes in the original source file.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.common.py3compat import maxint, bytes2str
from elftools.dwarf.descriptions import describe_form_class
from elftools.elf.elffile import ELFFile
def process_file(filename, address):
print('Processing file:', filename)
with open(filename, 'rb') as f:
elffile = ELFFile(f)
if not elffile.has_dwarf_info():
print(' file has no DWARF info')
return
# get_dwarf_info returns a DWARFInfo context object, which is the
# starting point for all DWARF-based processing in pyelftools.
dwarfinfo = elffile.get_dwarf_info()
funcname = decode_funcname(dwarfinfo, address)
file, line = decode_file_line(dwarfinfo, address)
print('Function:', bytes2str(funcname))
print('File:', bytes2str(file))
print('Line:', line)
def decode_funcname(dwarfinfo, address):
# Go over all DIEs in the DWARF information, looking for a subprogram
# entry with an address range that includes the given address. Note that
# this simplifies things by disregarding subprograms that may have
# split address ranges.
for CU in dwarfinfo.iter_CUs():
for DIE in CU.iter_DIEs():
try:
if DIE.tag == 'DW_TAG_subprogram':
lowpc = DIE.attributes['DW_AT_low_pc'].value
# DWARF v4 in section 2.17 describes how to interpret the
# DW_AT_high_pc attribute based on the class of its form.
# For class 'address' it's taken as an absolute address
# (similarly to DW_AT_low_pc); for class 'constant', it's
# an offset from DW_AT_low_pc.
highpc_attr = DIE.attributes['DW_AT_high_pc']
highpc_attr_class = describe_form_class(highpc_attr.form)
if highpc_attr_class == 'address':
highpc = highpc_attr.value
elif highpc_attr_class == 'constant':
highpc = lowpc + highpc_attr.value
else:
print('Error: invalid DW_AT_high_pc class:',
highpc_attr_class)
continue
if lowpc <= address <= highpc:
return DIE.attributes['DW_AT_name'].value
except KeyError:
continue
return None
def decode_file_line(dwarfinfo, address):
# Go over all the line programs in the DWARF information, looking for
# one that describes the given address.
for CU in dwarfinfo.iter_CUs():
# First, look at line programs to find the file/line for the address
lineprog = dwarfinfo.line_program_for_CU(CU)
prevstate = None
for entry in lineprog.get_entries():
# We're interested in those entries where a new state is assigned
if entry.state is None:
continue
if entry.state.end_sequence:
# if the line number sequence ends, clear prevstate.
prevstate = None
continue
# Looking for a range of addresses in two consecutive states that
# contain the required address.
if prevstate and prevstate.address <= address < entry.state.address:
filename = lineprog['file_entry'][prevstate.file - 1].name
line = prevstate.line
return filename, line
prevstate = entry.state
return None, None
if __name__ == '__main__':
if sys.argv[1] == '--test':
process_file(sys.argv[2], 0x400503)
sys.exit(0)
if len(sys.argv) < 3:
print('Expected usage: {0} '.format(sys.argv[0]))
sys.exit(1)
addr = int(sys.argv[1], 0)
process_file(sys.argv[2], addr)
pyelftools-0.26/examples/dwarf_die_tree.py 0000664 0000000 0000000 00000004613 13572204573 0021010 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: dwarf_die_tree.py
#
# In the .debug_info section, Dwarf Information Entries (DIEs) form a tree.
# pyelftools provides easy access to this tree, as demonstrated here.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.elf.elffile import ELFFile
def process_file(filename):
print('Processing file:', filename)
with open(filename, 'rb') as f:
elffile = ELFFile(f)
if not elffile.has_dwarf_info():
print(' file has no DWARF info')
return
# get_dwarf_info returns a DWARFInfo context object, which is the
# starting point for all DWARF-based processing in pyelftools.
dwarfinfo = elffile.get_dwarf_info()
for CU in dwarfinfo.iter_CUs():
# DWARFInfo allows to iterate over the compile units contained in
# the .debug_info section. CU is a CompileUnit object, with some
# computed attributes (such as its offset in the section) and
# a header which conforms to the DWARF standard. The access to
# header elements is, as usual, via item-lookup.
print(' Found a compile unit at offset %s, length %s' % (
CU.cu_offset, CU['unit_length']))
# Start with the top DIE, the root for this CU's DIE tree
top_DIE = CU.get_top_DIE()
print(' Top DIE with tag=%s' % top_DIE.tag)
# We're interested in the filename...
print(' name=%s' % top_DIE.get_full_path())
# Display DIEs recursively starting with top_DIE
die_info_rec(top_DIE)
def die_info_rec(die, indent_level=' '):
""" A recursive function for showing information about a DIE and its
children.
"""
print(indent_level + 'DIE tag=%s' % die.tag)
child_indent = indent_level + ' '
for child in die.iter_children():
die_info_rec(child, child_indent)
if __name__ == '__main__':
if sys.argv[1] == '--test':
for filename in sys.argv[2:]:
process_file(filename)
pyelftools-0.26/examples/dwarf_location_info.py 0000664 0000000 0000000 00000011732 13572204573 0022053 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: dwarf_location_info.py
#
# Examine DIE entries which have either location list values or location
# expression values and decode that information.
#
# Location information can either be completely contained within a DIE
# (using 'DW_FORM_exprloc' in DWARFv4 or 'DW_FORM_block1' in earlier
# versions) or be a reference to a location list contained within
# the .debug_loc section (using 'DW_FORM_sec_offset' in DWARFv4 or
# 'DW_FORM_data4' / 'DW_FORM_data8' in earlier versions).
#
# The LocationParser object parses the DIE attributes and handles both
# formats.
#
# The directory 'test/testfiles_for_location_info' contains test files with
# location information represented in both DWARFv4 and DWARFv2 forms.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.common.py3compat import itervalues
from elftools.elf.elffile import ELFFile
from elftools.dwarf.descriptions import (
describe_DWARF_expr, set_global_machine_arch)
from elftools.dwarf.locationlists import (
LocationEntry, LocationExpr, LocationParser)
def process_file(filename):
print('Processing file:', filename)
with open(filename, 'rb') as f:
elffile = ELFFile(f)
if not elffile.has_dwarf_info():
print(' file has no DWARF info')
return
# get_dwarf_info returns a DWARFInfo context object, which is the
# starting point for all DWARF-based processing in pyelftools.
dwarfinfo = elffile.get_dwarf_info()
# The location lists are extracted by DWARFInfo from the .debug_loc
# section, and returned here as a LocationLists object.
location_lists = dwarfinfo.location_lists()
# This is required for the descriptions module to correctly decode
# register names contained in DWARF expressions.
set_global_machine_arch(elffile.get_machine_arch())
# Create a LocationParser object that parses the DIE attributes and
# creates objects representing the actual location information.
loc_parser = LocationParser(location_lists)
for CU in dwarfinfo.iter_CUs():
# DWARFInfo allows to iterate over the compile units contained in
# the .debug_info section. CU is a CompileUnit object, with some
# computed attributes (such as its offset in the section) and
# a header which conforms to the DWARF standard. The access to
# header elements is, as usual, via item-lookup.
print(' Found a compile unit at offset %s, length %s' % (
CU.cu_offset, CU['unit_length']))
# A CU provides a simple API to iterate over all the DIEs in it.
for DIE in CU.iter_DIEs():
# Go over all attributes of the DIE. Each attribute is an
# AttributeValue object (from elftools.dwarf.die), which we
# can examine.
for attr in itervalues(DIE.attributes):
# Check if this attribute contains location information
if loc_parser.attribute_has_location(attr, CU['version']):
print(' DIE %s. attr %s.' % (DIE.tag, attr.name))
loc = loc_parser.parse_from_attribute(attr,
CU['version'])
# We either get a list (in case the attribute is a
# reference to the .debug_loc section) or a LocationExpr
# object (in case the attribute itself contains location
# information).
if isinstance(loc, LocationExpr):
print(' %s' % (
describe_DWARF_expr(loc.loc_expr,
dwarfinfo.structs)))
elif isinstance(loc, list):
print(show_loclist(loc,
dwarfinfo,
indent=' '))
def show_loclist(loclist, dwarfinfo, indent):
""" Display a location list nicely, decoding the DWARF expressions
contained within.
"""
d = []
for loc_entity in loclist:
if isinstance(loc_entity, LocationEntry):
d.append('%s <<%s>>' % (
loc_entity,
describe_DWARF_expr(loc_entity.loc_expr, dwarfinfo.structs)))
else:
d.append(str(loc_entity))
return '\n'.join(indent + s for s in d)
if __name__ == '__main__':
if sys.argv[1] == '--test':
for filename in sys.argv[2:]:
process_file(filename)
pyelftools-0.26/examples/dwarf_pubnames_types.py 0000664 0000000 0000000 00000011021 13572204573 0022255 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: dwarf_pubnames_types.py
#
# Dump the contents of .debug_pubnames and .debug_pubtypes sections from the
# ELF file.
#
# Note: sample_exe64.elf doesn't have a .debug_pubtypes section.
#
# Vijay Ramasami (rvijayc@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.elf.elffile import ELFFile
from elftools.common.py3compat import bytes2str
def process_file(filename):
print('Processing file:', filename)
with open(filename, 'rb') as f:
elffile = ELFFile(f)
if not elffile.has_dwarf_info():
print(' file has no DWARF info')
return
# get_dwarf_info returns a DWARFInfo context object, which is the
# starting point for all DWARF-based processing in pyelftools.
dwarfinfo = elffile.get_dwarf_info()
# get .debug_pubtypes section.
pubnames = dwarfinfo.get_pubnames()
if pubnames is None:
print('ERROR: No .debug_pubnames section found in ELF.')
else:
print('%d entries found in .debug_pubnames' % len(pubnames))
# try getting information on a global symbol.
print('Trying pubnames example ...')
sym_name = 'main'
try:
entry = pubnames[sym_name]
except KeyError:
print('ERROR: No pubname entry found for ' + sym_name)
else:
print('%s: cu_ofs = %d, die_ofs = %d' %
(sym_name, entry.cu_ofs, entry.die_ofs))
# get the actual CU/DIE that has this information.
print('Fetching the actual die for %s ...' % sym_name)
for cu in dwarfinfo.iter_CUs():
if cu.cu_offset == entry.cu_ofs:
for die in cu.iter_DIEs():
if die.offset == entry.die_ofs:
print('Die Name: %s' %
bytes2str(die.attributes['DW_AT_name'].value))
# dump all entries in .debug_pubnames section.
print('Dumping .debug_pubnames table ...')
print('-' * 66)
print('%50s%8s%8s' % ('Symbol', 'CU_OFS', 'DIE_OFS'))
print('-' * 66)
for (name, entry) in pubnames.items():
print('%50s%8d%8d' % (name, entry.cu_ofs, entry.die_ofs))
print('-' * 66)
# get .debug_pubtypes section.
pubtypes = dwarfinfo.get_pubtypes()
if pubtypes is None:
print('ERROR: No .debug_pubtypes section found in ELF')
else:
print('%d entries found in .debug_pubtypes' % len(pubtypes))
# try getting information on a global type.
sym_name = 'char'
# note: using the .get() API (pubtypes[key] will also work).
entry = pubtypes.get(sym_name)
if entry is None:
print('ERROR: No pubtype entry for %s' % sym_name)
else:
print('%s: cu_ofs %d, die_ofs %d' %
(sym_name, entry.cu_ofs, entry.die_ofs))
# get the actual CU/DIE that has this information.
print('Fetching the actual die for %s ...' % sym_name)
for cu in dwarfinfo.iter_CUs():
if cu.cu_offset == entry.cu_ofs:
for die in cu.iter_DIEs():
if die.offset == entry.die_ofs:
print('Die Name: %s' %
bytes2str(die.attributes['DW_AT_name'].value))
# dump all entries in .debug_pubtypes section.
print('Dumping .debug_pubtypes table ...')
print('-' * 66)
print('%50s%8s%8s' % ('Symbol', 'CU_OFS', 'DIE_OFS'))
print('-' * 66)
for (name, entry) in pubtypes.items():
print('%50s%8d%8d' % (name, entry.cu_ofs, entry.die_ofs))
print('-' * 66)
if __name__ == '__main__':
if sys.argv[1] == '--test':
process_file(sys.argv[2])
sys.exit(0)
if len(sys.argv) < 2:
print('Expected usage: {0} '.format(sys.argv[0]))
sys.exit(1)
process_file(sys.argv[1])
pyelftools-0.26/examples/dwarf_range_lists.py 0000664 0000000 0000000 00000006512 13572204573 0021542 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: dwarf_range_lists.py
#
# Examine DIE entries which have range list values, and decode these range
# lists.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.common.py3compat import itervalues
from elftools.elf.elffile import ELFFile
from elftools.dwarf.descriptions import (
describe_DWARF_expr, set_global_machine_arch)
from elftools.dwarf.ranges import RangeEntry
def process_file(filename):
print('Processing file:', filename)
with open(filename, 'rb') as f:
elffile = ELFFile(f)
if not elffile.has_dwarf_info():
print(' file has no DWARF info')
return
# get_dwarf_info returns a DWARFInfo context object, which is the
# starting point for all DWARF-based processing in pyelftools.
dwarfinfo = elffile.get_dwarf_info()
# The range lists are extracted by DWARFInfo from the .debug_ranges
# section, and returned here as a RangeLists object.
range_lists = dwarfinfo.range_lists()
if range_lists is None:
print(' file has no .debug_ranges section')
return
for CU in dwarfinfo.iter_CUs():
# DWARFInfo allows to iterate over the compile units contained in
# the .debug_info section. CU is a CompileUnit object, with some
# computed attributes (such as its offset in the section) and
# a header which conforms to the DWARF standard. The access to
# header elements is, as usual, via item-lookup.
print(' Found a compile unit at offset %s, length %s' % (
CU.cu_offset, CU['unit_length']))
# A CU provides a simple API to iterate over all the DIEs in it.
for DIE in CU.iter_DIEs():
# Go over all attributes of the DIE. Each attribute is an
# AttributeValue object (from elftools.dwarf.die), which we
# can examine.
for attr in itervalues(DIE.attributes):
if attribute_has_range_list(attr):
# This is a range list. Its value is an offset into
# the .debug_ranges section, so we can use the range
# lists object to decode it.
rangelist = range_lists.get_range_list_at_offset(
attr.value)
print(' DIE %s. attr %s.\n%s' % (
DIE.tag,
attr.name,
rangelist))
def attribute_has_range_list(attr):
""" Only some attributes can have range list values, if they have the
required DW_FORM (rangelistptr "class" in DWARF spec v3)
"""
if attr.name == 'DW_AT_ranges':
if attr.form in ('DW_FORM_data4', 'DW_FORM_data8'):
return True
return False
if __name__ == '__main__':
if sys.argv[1] == '--test':
for filename in sys.argv[2:]:
process_file(filename)
pyelftools-0.26/examples/elf_low_high_api.py 0000664 0000000 0000000 00000006605 13572204573 0021327 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: elf_low_high_api.py
#
# A simple example that shows some usage of the low-level API pyelftools
# provides versus the high-level API while inspecting an ELF file's symbol
# table.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.elf.elffile import ELFFile
from elftools.elf.sections import SymbolTableSection
def process_file(filename):
print('Processing file:', filename)
with open(filename, 'rb') as f:
section_info_lowlevel(f)
f.seek(0)
section_info_highlevel(f)
def section_info_lowlevel(stream):
print('Low level API...')
# We'll still be using the ELFFile context object. It's just too
# convenient to give up, even in the low-level API demonstation :-)
elffile = ELFFile(stream)
# The e_shnum ELF header field says how many sections there are in a file
print(' %s sections' % elffile['e_shnum'])
# Try to find the symbol table
for i in range(elffile['e_shnum']):
section_offset = elffile['e_shoff'] + i * elffile['e_shentsize']
# Parse the section header using structs.Elf_Shdr
stream.seek(section_offset)
section_header = elffile.structs.Elf_Shdr.parse_stream(stream)
if section_header['sh_type'] == 'SHT_SYMTAB':
# Some details about the section. Note that the section name is a
# pointer to the object's string table, so it's only a number
# here. To get to the actual name one would need to parse the string
# table section and extract the name from there (or use the
# high-level API!)
print(' Section name: %s, type: %s' % (
section_header['sh_name'], section_header['sh_type']))
break
else:
print(' No symbol table found. Perhaps this ELF has been stripped?')
def section_info_highlevel(stream):
print('High level API...')
elffile = ELFFile(stream)
# Just use the public methods of ELFFile to get what we need
# Note that section names are strings.
print(' %s sections' % elffile.num_sections())
section = elffile.get_section_by_name('.symtab')
if not section:
print(' No symbol table found. Perhaps this ELF has been stripped?')
return
# A section type is in its header, but the name was decoded and placed in
# a public attribute.
print(' Section name: %s, type: %s' %(
section.name, section['sh_type']))
# But there's more... If this section is a symbol table section (which is
# the case in the sample ELF file that comes with the examples), we can
# get some more information about it.
if isinstance(section, SymbolTableSection):
num_symbols = section.num_symbols()
print(" It's a symbol section with %s symbols" % num_symbols)
print(" The name of the last symbol in the section is: %s" % (
section.get_symbol(num_symbols - 1).name))
if __name__ == '__main__':
if sys.argv[1] == '--test':
for filename in sys.argv[2:]:
process_file(filename)
pyelftools-0.26/examples/elf_notes.py 0000664 0000000 0000000 00000003456 13572204573 0020027 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: elf_notes.py
#
# An example of obtaining note sections from an ELF file and examining
# the notes it contains.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.elf.elffile import ELFFile
from elftools.elf.sections import NoteSection
def process_file(filename):
print('Processing file:', filename)
with open(filename, 'rb') as f:
for sect in ELFFile(f).iter_sections():
if not isinstance(sect, NoteSection):
continue
print(' Note section "%s" at offset 0x%.8x with size %d' % (
sect.name, sect.header['sh_offset'], sect.header['sh_size']))
for note in sect.iter_notes():
print(' Name:', note['n_name'])
print(' Type:', note['n_type'])
desc = note['n_desc']
if note['n_type'] == 'NT_GNU_ABI_TAG':
print(' Desc: %s, ABI: %d.%d.%d' % (
desc['abi_os'],
desc['abi_major'],
desc['abi_minor'],
desc['abi_tiny']))
elif note['n_type'] == 'NT_GNU_BUILD_ID':
print(' Desc:', desc)
else:
print(' Desc:', ''.join('%.2x' % ord(b) for b in desc))
if __name__ == '__main__':
if sys.argv[1] == '--test':
for filename in sys.argv[2:]:
process_file(filename)
pyelftools-0.26/examples/elf_relocations.py 0000664 0000000 0000000 00000003235 13572204573 0021214 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: elf_relocations.py
#
# An example of obtaining a relocation section from an ELF file and examining
# the relocation entries it contains.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.elf.elffile import ELFFile
from elftools.elf.relocation import RelocationSection
def process_file(filename):
print('Processing file:', filename)
with open(filename, 'rb') as f:
elffile = ELFFile(f)
# Read the .rela.dyn section from the file, by explicitly asking
# ELFFile for this section
# The section names are strings
reladyn_name = '.rela.dyn'
reladyn = elffile.get_section_by_name(reladyn_name)
if not isinstance(reladyn, RelocationSection):
print(' The file has no %s section' % reladyn_name)
print(' %s section with %s relocations' % (
reladyn_name, reladyn.num_relocations()))
for reloc in reladyn.iter_relocations():
print(' Relocation (%s)' % 'RELA' if reloc.is_RELA() else 'REL')
# Relocation entry attributes are available through item lookup
print(' offset = %s' % reloc['r_offset'])
if __name__ == '__main__':
if sys.argv[1] == '--test':
for filename in sys.argv[2:]:
process_file(filename)
pyelftools-0.26/examples/elf_show_debug_sections.py 0000664 0000000 0000000 00000001727 13572204573 0022733 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: elf_show_debug_sections.py
#
# Show the names of all .debug_* sections in ELF files.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.elf.elffile import ELFFile
def process_file(filename):
print('In file:', filename)
with open(filename, 'rb') as f:
elffile = ELFFile(f)
for section in elffile.iter_sections():
if section.name.startswith('.debug'):
print(' ' + section.name)
if __name__ == '__main__':
if sys.argv[1] == '--test':
for filename in sys.argv[2:]:
process_file(filename)
pyelftools-0.26/examples/elfclass_address_size.py 0000664 0000000 0000000 00000002537 13572204573 0022403 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: elfclass_address_size.py
#
# This example explores the ELF class (32 or 64-bit) and address size in each
# of the CUs in the DWARF information.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.elf.elffile import ELFFile
def process_file(filename):
with open(filename, 'rb') as f:
elffile = ELFFile(f)
# elfclass is a public attribute of ELFFile, read from its header
print('%s: elfclass is %s' % (filename, elffile.elfclass))
if elffile.has_dwarf_info():
dwarfinfo = elffile.get_dwarf_info()
for CU in dwarfinfo.iter_CUs():
# cu_offset is a public attribute of CU
# address_size is part of the CU header
print(' CU at offset 0x%x. address_size is %s' % (
CU.cu_offset, CU['address_size']))
if __name__ == '__main__':
if sys.argv[1] == '--test':
for filename in sys.argv[2:]:
process_file(filename)
pyelftools-0.26/examples/examine_dwarf_info.py 0000664 0000000 0000000 00000003655 13572204573 0021676 0 ustar 00root root 0000000 0000000 #-------------------------------------------------------------------------------
# elftools example: examine_dwarf_info.py
#
# An example of examining information in the .debug_info section of an ELF file.
#
# Eli Bendersky (eliben@gmail.com)
# This code is in the public domain
#-------------------------------------------------------------------------------
from __future__ import print_function
import sys
# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']
from elftools.elf.elffile import ELFFile
def process_file(filename):
print('Processing file:', filename)
with open(filename, 'rb') as f:
elffile = ELFFile(f)
if not elffile.has_dwarf_info():
print(' file has no DWARF info')
return
# get_dwarf_info returns a DWARFInfo context object, which is the
# starting point for all DWARF-based processing in pyelftools.
dwarfinfo = elffile.get_dwarf_info()
for CU in dwarfinfo.iter_CUs():
# DWARFInfo allows to iterate over the compile units contained in
# the .debug_info section. CU is a CompileUnit object, with some
# computed attributes (such as its offset in the section) and
# a header which conforms to the DWARF standard. The access to
# header elements is, as usual, via item-lookup.
print(' Found a compile unit at offset %s, length %s' % (
CU.cu_offset, CU['unit_length']))
# The first DIE in each compile unit describes it.
top_DIE = CU.get_top_DIE()
print(' Top DIE with tag=%s' % top_DIE.tag)
# We're interested in the filename...
print(' name=%s' % top_DIE.get_full_path())
if __name__ == '__main__':
if sys.argv[1] == '--test':
for filename in sys.argv[2:]:
process_file(filename)
pyelftools-0.26/examples/reference_output/ 0000775 0000000 0000000 00000000000 13572204573 0021045 5 ustar 00root root 0000000 0000000 pyelftools-0.26/examples/reference_output/dwarf_decode_address.out 0000664 0000000 0000000 00000000116 13572204573 0025707 0 ustar 00root root 0000000 0000000 Processing file: ./examples/sample_exe64.elf
Function: main
File: z.c
Line: 3
pyelftools-0.26/examples/reference_output/dwarf_die_tree.out 0000664 0000000 0000000 00000004324 13572204573 0024544 0 ustar 00root root 0000000 0000000 Processing file: ./examples/sample_exe64.elf
Found a compile unit at offset 0, length 115
Top DIE with tag=DW_TAG_compile_unit
name=/usr/src/packages/BUILD/glibc-2.11.1/csu/../sysdeps/x86_64/elf/start.S
DIE tag=DW_TAG_compile_unit
Found a compile unit at offset 119, length 135
Top DIE with tag=DW_TAG_compile_unit
name=/usr/src/packages/BUILD/glibc-2.11.1/csu/init.c
DIE tag=DW_TAG_compile_unit
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_variable
DIE tag=DW_TAG_const_type
Found a compile unit at offset 258, length 156
Top DIE with tag=DW_TAG_compile_unit
name=/tmp/ebenders/z.c
DIE tag=DW_TAG_compile_unit
DIE tag=DW_TAG_subprogram
DIE tag=DW_TAG_formal_parameter
DIE tag=DW_TAG_formal_parameter
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_pointer_type
DIE tag=DW_TAG_pointer_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_variable
Found a compile unit at offset 418, length 300
Top DIE with tag=DW_TAG_compile_unit
name=/usr/src/packages/BUILD/glibc-2.11.1/csu/elf-init.c
DIE tag=DW_TAG_compile_unit
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_typedef
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_subprogram
DIE tag=DW_TAG_subprogram
DIE tag=DW_TAG_formal_parameter
DIE tag=DW_TAG_formal_parameter
DIE tag=DW_TAG_formal_parameter
DIE tag=DW_TAG_variable
DIE tag=DW_TAG_lexical_block
DIE tag=DW_TAG_variable
DIE tag=DW_TAG_pointer_type
DIE tag=DW_TAG_pointer_type
DIE tag=DW_TAG_base_type
DIE tag=DW_TAG_const_type
DIE tag=DW_TAG_array_type
DIE tag=DW_TAG_subrange_type
DIE tag=DW_TAG_subroutine_type
DIE tag=DW_TAG_formal_parameter
DIE tag=DW_TAG_formal_parameter
DIE tag=DW_TAG_formal_parameter
DIE tag=DW_TAG_pointer_type
DIE tag=DW_TAG_variable
DIE tag=DW_TAG_variable
pyelftools-0.26/examples/reference_output/dwarf_location_info.out 0000664 0000000 0000000 00000004004 13572204573 0025602 0 ustar 00root root 0000000 0000000 Processing file: ./examples/sample_exe64.elf
Found a compile unit at offset 0, length 115
Found a compile unit at offset 119, length 135
DIE DW_TAG_variable. attr DW_AT_location.
(DW_OP_addr: 400608)
Found a compile unit at offset 258, length 156
DIE DW_TAG_subprogram. attr DW_AT_frame_base.
LocationEntry(begin_offset=0, end_offset=1, loc_expr=[119, 8]) <<(DW_OP_breg7 (rsp): 8)>>
LocationEntry(begin_offset=1, end_offset=4, loc_expr=[119, 16]) <<(DW_OP_breg7 (rsp): 16)>>
LocationEntry(begin_offset=4, end_offset=43, loc_expr=[118, 16]) <<(DW_OP_breg6 (rbp): 16)>>
DIE DW_TAG_formal_parameter. attr DW_AT_location.
(DW_OP_fbreg: -20)
DIE DW_TAG_formal_parameter. attr DW_AT_location.
(DW_OP_fbreg: -32)
DIE DW_TAG_variable. attr DW_AT_location.
(DW_OP_addr: 601018)
Found a compile unit at offset 418, length 300
DIE DW_TAG_subprogram. attr DW_AT_frame_base.
(DW_OP_breg7 (rsp): 8)
DIE DW_TAG_subprogram. attr DW_AT_frame_base.
LocationEntry(begin_offset=16, end_offset=64, loc_expr=[119, 8]) <<(DW_OP_breg7 (rsp): 8)>>
LocationEntry(begin_offset=64, end_offset=153, loc_expr=[119, 192, 0]) <<(DW_OP_breg7 (rsp): 64)>>
DIE DW_TAG_formal_parameter. attr DW_AT_location.
LocationEntry(begin_offset=16, end_offset=85, loc_expr=[85]) <<(DW_OP_reg5 (rdi))>>
LocationEntry(begin_offset=85, end_offset=143, loc_expr=[94]) <<(DW_OP_reg14 (r14))>>
DIE DW_TAG_formal_parameter. attr DW_AT_location.
LocationEntry(begin_offset=16, end_offset=85, loc_expr=[84]) <<(DW_OP_reg4 (rsi))>>
LocationEntry(begin_offset=85, end_offset=138, loc_expr=[93]) <<(DW_OP_reg13 (r13))>>
DIE DW_TAG_formal_parameter. attr DW_AT_location.
LocationEntry(begin_offset=16, end_offset=85, loc_expr=[81]) <<(DW_OP_reg1 (rdx))>>
LocationEntry(begin_offset=85, end_offset=133, loc_expr=[92]) <<(DW_OP_reg12 (r12))>>
DIE DW_TAG_variable. attr DW_AT_location.
LocationEntry(begin_offset=92, end_offset=123, loc_expr=[83]) <<(DW_OP_reg3 (rbx))>>
pyelftools-0.26/examples/reference_output/dwarf_pubnames_types.out 0000664 0000000 0000000 00000001556 13572204573 0026026 0 ustar 00root root 0000000 0000000 Processing file: ./examples/sample_exe64.elf
5 entries found in .debug_pubnames
Trying pubnames example ...
main: cu_ofs = 258, die_ofs = 303
Fetching the actual die for main ...
Die Name: main
Dumping .debug_pubnames table ...
------------------------------------------------------------------
Symbol CU_OFS DIE_OFS
------------------------------------------------------------------
_IO_stdin_used 119 230
main 258 303
glob 258 395
__libc_csu_fini 418 495
__libc_csu_init 418 523
------------------------------------------------------------------
ERROR: No .debug_pubtypes section found in ELF
pyelftools-0.26/examples/reference_output/dwarf_range_lists.out 0000664 0000000 0000000 00000000721 13572204573 0025273 0 ustar 00root root 0000000 0000000 Processing file: ./examples/sample_exe64.elf
Found a compile unit at offset 0, length 115
Found a compile unit at offset 119, length 135
Found a compile unit at offset 258, length 156
Found a compile unit at offset 418, length 300
DIE DW_TAG_lexical_block. attr DW_AT_ranges.
[RangeEntry(begin_offset=26, end_offset=40), RangeEntry(begin_offset=85, end_offset=118), RangeEntry(begin_offset=73, end_offset=77), RangeEntry(begin_offset=64, end_offset=67)]
pyelftools-0.26/examples/reference_output/elf_low_high_api.out 0000664 0000000 0000000 00000000431 13572204573 0025053 0 ustar 00root root 0000000 0000000 Processing file: ./examples/sample_exe64.elf
Low level API...
42 sections
Section name: 1, type: SHT_SYMTAB
High level API...
42 sections
Section name: .symtab, type: SHT_SYMTAB
It's a symbol section with 80 symbols
The name of the last symbol in the section is: _init
pyelftools-0.26/examples/reference_output/elf_notes.out 0000664 0000000 0000000 00000000723 13572204573 0023556 0 ustar 00root root 0000000 0000000 Processing file: ./examples/sample_exe64.elf
Note section ".note.ABI-tag" at offset 0x00000254 with size 32
Name: GNU
Type: NT_GNU_ABI_TAG
Desc: ELF_NOTE_OS_LINUX, ABI: 2.6.4
Note section ".note.SuSE" at offset 0x00000274 with size 24
Name: SuSE
Type: 1163097427
Desc: 01000a02
Note section ".note.gnu.build-id" at offset 0x0000028c with size 36
Name: GNU
Type: NT_GNU_BUILD_ID
Desc: 8e50cda8e25993499ac4aa2d8deaf58d0949d47d
pyelftools-0.26/examples/reference_output/elf_relocations.out 0000664 0000000 0000000 00000000201 13572204573 0024737 0 ustar 00root root 0000000 0000000 Processing file: ./examples/sample_exe64.elf
.rela.dyn section with 1 relocations
Relocation (RELA)
offset = 6295520
pyelftools-0.26/examples/reference_output/elf_show_debug_sections.out 0000664 0000000 0000000 00000000255 13572204573 0026463 0 ustar 00root root 0000000 0000000 In file: ./examples/sample_exe64.elf
.debug_aranges
.debug_pubnames
.debug_info
.debug_abbrev
.debug_line
.debug_frame
.debug_str
.debug_loc
.debug_ranges
pyelftools-0.26/examples/reference_output/elfclass_address_size.out 0000664 0000000 0000000 00000000311 13572204573 0026124 0 ustar 00root root 0000000 0000000 ./examples/sample_exe64.elf: elfclass is 64
CU at offset 0x0. address_size is 8
CU at offset 0x77. address_size is 8
CU at offset 0x102. address_size is 8
CU at offset 0x1a2. address_size is 8
pyelftools-0.26/examples/reference_output/examine_dwarf_info.out 0000664 0000000 0000000 00000001164 13572204573 0025424 0 ustar 00root root 0000000 0000000 Processing file: ./examples/sample_exe64.elf
Found a compile unit at offset 0, length 115
Top DIE with tag=DW_TAG_compile_unit
name=/usr/src/packages/BUILD/glibc-2.11.1/csu/../sysdeps/x86_64/elf/start.S
Found a compile unit at offset 119, length 135
Top DIE with tag=DW_TAG_compile_unit
name=/usr/src/packages/BUILD/glibc-2.11.1/csu/init.c
Found a compile unit at offset 258, length 156
Top DIE with tag=DW_TAG_compile_unit
name=/tmp/ebenders/z.c
Found a compile unit at offset 418, length 300
Top DIE with tag=DW_TAG_compile_unit
name=/usr/src/packages/BUILD/glibc-2.11.1/csu/elf-init.c
pyelftools-0.26/examples/sample_exe64.elf 0000664 0000000 0000000 00000030055 13572204573 0020456 0 ustar 00root root 0000000 0000000 ELF > @ @ 0 @ 8 @ * ' @ @ @ @ @ 8 8@ 8@ @ @ ` ` @ @` @` T T@ T@ \ \ Ptd @ @ $ $ Qtd Rtd ` ` /lib64/ld-linux-x86-64.so.2 GNU SuSESuSE
GNU PͨYIĪ- I} __gmon_start__ libc.so.6 __libc_start_main GLIBC_2.2.5 ui , ` ` Hc H5 % @ % h 1I^HHPTI @ H0@ H@ HH HtHÐUHSH= uK0` H H-(` HHXH9s% HBH (` H H9rs H[fff. UH=o Ht Ht8` IA@ ÐUH}HuHE EÐfffff. Hl$L|$H- L= Ld$Ll$Lt$H\$H8L)AIHIKHt1@ LLDAHH9rH\$Hl$Ld$Ll$ Lt$(L|$0H8ÐUHSHHH Ht` D HHHuH[ÐHOH ; <